Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)



Scripting News 2001


coroutines for Ruby

developerWorks: Web architecture: Server-side scripting languages

Erik Zoltán (erik@zoltan.org)
Advanced Systems Engineer, EDS
April 2001 

Still can't decide whether to use PHP scripts, Perl CGIs, or Java servlets for your next Web development project? This article will help you decide by providing a side-by-side comparison of the functioning source code of all three languages. The three simple example programs provided take you from the most basic server-side scripts through object orientation to a simple Web storefront presenting product information to a user. Reading this article will give you a good idea of the difference between these three languages, and a better idea of which one is right for you.

This article explains how PHP scripts, Perl CGIs, and Java servlets work; it also covers the issues that separate the three languages. You don't need to know the languages in order to comprehend this piece, but you do need to have a passing familiarity with HTML to make reading it worth your while.

If you have a great deal of experience in any of these languages, you'll notice that this article neither discusses the in-depth details and advanced alternatives that an expert might choose, nor does it introduce many of the technical terms and underlying language features that appear confusing when you are familiarizing yourself with a programming language (of course, this is what makes the language interesting after years of programming with it). This is intentional.

Jerome Mrozak - Subject: I did eval for my Java shop and.. ( Mar 16, 2001 )  -- weak, but still has some interesting observations.

My outfit is creating a B2C web site using Java and JSP. Big, hairy thing.

Anyways, I read that survey that examined productivity, program speed and bugginess of a single project implemented in a variety of "heavy" languages (C, C++, Java) and script languages (Perl, Python, Tcl). For the uninitiated, the survey asserts that script languages could deliver the same quality of product in about half the time because the "heavy" languages were that much wordier to get the job done.

Could my firm benefit from scripting? I decided to see if we could efficiently prototype in a scripting language and then use the model to complete a business-quality site. To that end I studied Python and *finally* completed the Camel book for Perl.

My conclusions were:

**** A survey of scripting programming language implementation options (Note: This is about syntax design, not interpreter building.) A very nice paper that discusses some choices in scripting language design. [Jan 10, 2000] Here are some quick notes:

**** Scripting with C

What are the advantages of scripting with C? This installment of Regular Expressions takes inventory of the principal products that are giving C a new look. (1,300 words)

Lua: an extensible extension language -- alternative to Tcl

There is increasing demand for customizable applications. As applications became more complex, customization with simple parameters became impossible: users now want to make configuration decisions at execution time; users also want to write macros and scripts to increase productivity [1,2,3,4]. In response to these needs, there is an important trend nowadays to split complex systems in two parts: kernel and configuration. The kernel implements the basic classes and objects of the system, and is usually written in a compiled, statically typed language, like C or Modula-2. The configuration part, usually written in an interpreted, flexible language, connects these classes and objects to give the final shape to the application [5].

Configuration languages come in several flavors, ranging from simple languages for selecting preferences, usually implemented as parameter lists in command lines or as variable-value pairs read from configuration files (e.g., MS-Windows' .ini files, X11 resource files), to embedded languages, for extending applications with user defined functions based on primitives provided by the applications. Embedded languages can be quite powerful, being sometimes simplified variants of mainstream programming languages such as Lisp and C. Such configuration languages are also called extension languages, since they allow the extension of the basic kernel semantics with new, user defined capabilities.

What makes extension languages different from stand alone languages is that they only work embedded in a host client, called the host program. Moreover, the host program can usually provide domain-specific extensions to customize the embedded language for its own purposes, typically by providing higher level abstractions. For this, an embedded language has both a syntax for its own programs and an application program interface (API) for communicating with hosts. Unlike simpler configuration languages, which are used to supply parameter values and sequences of actions to hosts, there is a two-way communication between embedded languages and host programs.

It is important to note that the requirements on extension languages are different from those on general purpose programming languages. The main requirements for extension languages are:

This paper describes Lua, an extensible procedural language with powerful data description facilities, designed to be used as a general purpose extension language. Lua arose as the fusion of two descriptive languages, designed for the configuration of two specific applications: one for scientific data entry [6], the other for visualizing lithology profiles obtained from geological probes. When users began to demand increasingly more power in these languages, it became clear that real programming facilities were needed. Instead of upgrading and maintaining two different languages in parallel, the solution adopted was to design a single language that could be used not only for these two applications, but for any other application. Therefore, Lua incorporates facilities common to most procedural programming languages - control structures (whiles, ifs, etc.), assignments, subroutines, and infix operators - but abstracts out facilities specific to any particular domain. In this way, Lua can be used not only as a complete language but also as a language framework.

Lua satisfies the requirements listed above quite well. Its syntax and control structures are quite simple, Pascal-like. Lua is small; the whole library is around six thousand lines of ANSI C, of which almost two thousand are generated by yacc. Finally, Lua is extensible. In its design, the addition of many different features has been replaced by the creation of a few meta mechanisms that allow programmers to implement those features themselves. These meta mechanisms are: dynamic associative arrays, reflexive facilities, and fallbacks.

Dynamic associative arrays directly implement a multitude of data types, like ordinary arrays, records, sets, and bags. They also lever the data description power of the language, by means of constructors.

Reflexive facilities allow the creation of highly polymorphic parts. Persistence and multiple name spaces are examples of features not directly present in Lua, but that can be easily implemented in Lua itself using reflexive facilities.

Finally, although Lua has a fixed syntax, fallbacks can extend the meaning of many syntactical constructions. For instance, fallbacks can be used to implement different kinds of inheritance, a feature not present in Lua.
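
The idea of a fallback, a hook that runs only when the normal lookup fails, can be illustrated outside Lua. The following is a minimal sketch in Python, not Lua code and not the mechanism from the paper: Python's `__getattr__` is used as a rough analogue of an index fallback, and the `Record` class, its `parent` field, and the sample values are all hypothetical names chosen for the illustration.

```python
class Record:
    """A record whose missing fields are looked up in a parent record."""

    def __init__(self, parent=None, **fields):
        self.parent = parent
        self.__dict__.update(fields)

    def __getattr__(self, name):
        # Python calls this only when ordinary attribute lookup fails,
        # which is roughly how a fallback on a missing field behaves.
        parent = self.__dict__.get("parent")
        if parent is None:
            raise AttributeError(name)
        return getattr(parent, name)


# Hypothetical sample data: 'derived' overrides width and inherits color.
base = Record(color="red", width=10)
derived = Record(parent=base, width=25)
print(derived.color, derived.width)
```

Here inheritance is not a built-in feature of `Record`; it falls out of a single hook that delegates failed lookups, which is the kind of meta mechanism the paragraph above describes.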

Perl vs Python

I don't have much against Python. It's a nice language. If I didn't already know Perl, I'd certainly code a lot in Python. The main reason I hardly code in Python is that it doesn't offer me much that Perl doesn't, and it just isn't worthwhile to become as fluent in Python as I currently am in Perl. Two big pluses for Perl compared to Python: better error messages and warnings (personally, I find Python's errors cryptic and less suitable for beginners, although by default it gives a trace where Perl doesn't), and better documentation. Python 1.5.1 doesn't even come with any documentation by default; you have to install that separately. This is especially important since "core Python" is much smaller than "core Perl": even for simple things, you need to pull in a module. And unless you know your modules very well, you need to consult the documentation to find out which module to pull in. (Recently, I wanted to use sleep. It wasn't in os, or sys, or even in posix, but in time (IIRC), which took me half an hour to find out.)

Big wins for Python: a much, much cleaner OO system. While Perl took Python's OO system as a basis, in its implementation it took every wrong turn possible. Also, Python has less of "it is this way because it's the way of the C API" than Perl has. Python is probably easier to learn than Perl, certainly for people without a Unix background. For a coding project that consists of a group of relatively novice programmers from varying backgrounds, I'd prefer Python to Perl. But then, because Python doesn't have something like use strict; (as mentioned by Chip), other options like Ada or Eiffel should be looked into as well.

For individual programmers, I do not think "which is better, Python or Perl?" is a reasonable question. To be a good coder in Perl, you need a specific mindset. Perl is a rich language, full of special cases and shortcuts. That's great for a large group of people, but it isn't good for an even larger group. Many people are much better off with a language that forces them to do things in a specific way and that has stricter rules. For those people, Python is much better. Frankly, I think that many of the people currently struggling with Perl would be much better off coding in Python. The Python community should do more advocacy. (And of a different kind than "Perl sucks")

Successful evaluations 

Scripting's facility for the evaluation of source code texts can be abbreviated as eval, the name several different languages use for the command that performs this action. With eval's power comes the potential for misuse, of course, so we'll warn you here against several common misconceptions about the reliability of scripted applications.

Scripting's payoff
First, a few general remarks. One of the frustrations of writing this column is that there's never enough time or room to report on all of the progress people are making with scripting languages. We have several aging stories that are currently unpublished and will probably never see the light of day because other issues have pushed their way to the front.

This has led us to emphasize stories that illustrate general themes likely to touch (nearly) all the column's readers. This month's focus on eval aims to do much more than simply introduce a particular command or syntax. eval is exciting because it crystallizes important ideas with wide application.

Scripting is so lightweight that even nonprogrammers can do useful work with it. A lovely example of this is FlashPoint Technology's Digita Script system. Digita is a small, simple language for controlling digital cameras with which users can automate the construction of Webpages, annotate stills with text and sound, create instructional sequences, and so on. Digita's most remarkable asset is its ability to put all that power in the hands of camera enthusiasts with no software background. FlashPoint has successfully nurtured a thriving culture of script-swapping photography fans. Among their clever creations: compact scripts to turn cameras into imaging rulers, alarm clocks, tic-tac-toe players, and motion sensors.

Cheating with scripts
Much of scripting's power is a direct consequence of higher-order programming. As Frank Pilhofer, a graduate student specializing in distributed object systems, recently wrote, "A scripting language is defined by the equivalence of code and data, [which makes it possible] to generate and evaluate code at runtime." Languages like C and Java get their abilities from elaborate libraries of carefully-crafted pieces of functionality. Lisp, Forth, and the scripting languages have a looser legacy. Rather than precisely-fitted functions and objects, these languages emphasize domain-specific extensibility. Good style in C emphasizes small executable size and interfaces that correspond at least roughly to underlying hardware. Python programmers, in contrast, get the computer to do more of the work, and are more inclined to search for a solution that matches the expressiveness of human, rather than computing, language.

Configuration constitutes an unfair example of this leverage. Applications often need to maintain configuration information in an external file: editors read key bindings, Web servers read user authentication and filesystem locations, and so on. A C programmer typically starts this task by designing a small external language, something like: 

    # This is a comment.
    ROOTDIR: /usr/somewhere
    MAIN_USER: someone

The C programmer then needs to parse the language, often with the weight of the lex-yacc tool pair.

Users of scripting languages don't work nearly as hard. A Python programmer in the same situation will probably define the configuration file so that it looks like: 

    # This is a comment.
    ROOTDIR = "/usr/somewhere"
    MAIN_USER = "someone"

End users are at least as likely to be able to read and maintain a file written with =. Reading a configuration into an executing application requires a single line: 

    execfile("spec.conf")
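
The `execfile()` built-in shown above belongs to the Python 2 line and was removed in Python 3. As a minimal sketch of the same idea in current Python, assuming the configuration file uses the `NAME = "value"` syntax from the example, the file can be read and passed to `exec()` with an explicit namespace. The `load_config` helper and the temporary-file demo are hypothetical, added only for illustration.

```python
import os
import tempfile


def load_config(path):
    """Run a Python-syntax config file and return the names it defines."""
    namespace = {}
    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    # compile() records the file name so error messages point at the config file.
    exec(compile(source, path, "exec"), namespace)
    # exec() adds a __builtins__ entry to the namespace; leave it out of the result.
    return {key: value for key, value in namespace.items() if not key.startswith("__")}


# Hypothetical demo: write the sample file from the text, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    conf_path = os.path.join(tmp, "spec.conf")
    with open(conf_path, "w", encoding="utf-8") as handle:
        handle.write('# This is a comment.\nROOTDIR = "/usr/somewhere"\nMAIN_USER = "someone"\n')
    config = load_config(conf_path)

print(config)
```

Collecting the settings in a dictionary rather than in the caller's globals keeps the configuration separate from the application's own names. Like `execfile()`, this sketch still runs whatever code the file contains.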

It's silly, of course, to compare languages on the basis of individual facilities. One can imagine C having a built-in, configuration-reading, parsing API that would swing the balance back in its direction for this particular example. The point is that scripting makes this kind of cheating pervasive in application development. An experienced programmer develops a knack for seeing problems as programs that need to be written, then uses a scripting language to automate the production of answers. This is easy to do because scripting languages are built to treat code as data, and vice versa. 

Safety in scripting
Scripting languages are great at configurations of this sort, but many managers don't allow their use because they're regarded as unsafe. There's a reason for this. In the example above, a simple-minded execfile() opens the invoking application to anything a user might edit into "spec.conf". Before long, a mischievous or unlucky end user will type in commands to reformat a hard disk and the execfile() will cheerfully follow orders.

The comparable C codings are rarely at such risk. It's typically hard enough to get lex and yacc to do what they're supposed to do; using them for more general purposes is even trickier. Examples such as this lead some observers to conclude that serious programs should be built in compiled languages so that they can be properly tested. While this belief is widespread, we see it as a grave mistake.

It's a mistake on more levels than this single column can detail. Most superficially, all the leading scripting languages embody a notion of guarded interpretation, so that an evaluation like the execfile() above can be done safely. Among other things, this means that scripted solutions can be proven safer than conventional compiled ones. 
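
One way to get a similar guarantee for a configuration file is sketched below. It is not the guarded-interpretation facility the column refers to but a swapped-in technique: the file is parsed with Python's `ast` module, only statements of the form `NAME = literal` are accepted, and nothing is ever executed. The `parse_config` function is a hypothetical helper, and the sketch assumes the configuration needs nothing beyond literal values.

```python
import ast


def parse_config(source):
    """Accept only statements of the form NAME = <literal>; reject everything else."""
    tree = ast.parse(source, mode="exec")
    settings = {}
    for node in tree.body:
        is_simple_assignment = (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        )
        if not is_simple_assignment:
            raise ValueError(f"line {node.lineno}: only NAME = value is allowed")
        # literal_eval() raises ValueError for anything that is not a plain literal,
        # so function calls and attribute access never run.
        settings[node.targets[0].id] = ast.literal_eval(node.value)
    return settings


good = parse_config('# This is a comment.\nROOTDIR = "/usr/somewhere"\nMAIN_USER = "someone"\n')

# A file that tries to run a command is rejected before anything executes.
try:
    parse_config('import os\nos.system("echo reformat")\n')
    rejected = False
except ValueError:
    rejected = True

print(good, rejected)
```

The end user still edits a file that looks like Python, but the application reads it as data only.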

Strings that are scripts
Related misunderstandings arise with another common idiom of evaluation. A good model for many algorithms is that of an action that is applied to a collection of data. This is often done in C with a switch or, as here, a chain of conditionals:
  

    if (some_condition) {
        f1(data);
    } else if (other_condition) {
        f2(data);
    } ...

In scripting languages, it's natural to refactor such a design. For example, the same logic in Perl might look like this:

    if ($some_condition) {
        $action = "f1";
    } elsif ($other_condition) {
        $action = "f2";
    }
    # Both $action and $data are interpolated into the text that is evaluated.
    eval("$action $data");

C can do some of the same things by using function pointers. However, function pointers make enough C programmers uncomfortable that they're used much less often than the eval() in the Perl example above.

Among the advantages of the Perl-coded alternative is its flexibility. Incremental composition at runtime of a small script, which is itself evaluated, is an amazingly powerful technique. Scripting languages typically have good facilities for string manipulation -- and building up a script as a string exploits those strengths. One example involves sloppy or irregular legacy interfaces. A programmer can often unify or consolidate these interfaces by expressing them as textual strings.

The hazard is the same as that in the previous case. A bare eval has the full power of the Perl interpreter and can wreak quite a bit of havoc if fed malicious data. This contributes to a mistaken belief that errors in scripted applications can only be detected at runtime, and that compiled applications are easier to validate.

The solution to this problem is similar to the solution to the previous one. A real application should use Perl's tainted variables, or a related mechanism, to manage eval()'s operation. More generally, a disciplined programmer can write Perl source that has better provability than the corresponding C code, simply because the Perl source takes advantage of superior extensibility mechanisms.
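
As an illustration of a "related mechanism", here is a minimal sketch in Python rather than Perl, and it does not use taint checking: the action name is looked up in a table of known functions, so the name chosen at runtime can never reach an evaluator. The names `f1` and `f2` follow the earlier example; `ACTIONS` and `run_action` are hypothetical.

```python
def f1(data):
    return f"f1 handled {data}"


def f2(data):
    return f"f2 handled {data}"


# Only the names listed here can ever be invoked.
ACTIONS = {"f1": f1, "f2": f2}


def run_action(name, data):
    """Look the action up by name instead of evaluating a string."""
    try:
        handler = ACTIONS[name]
    except KeyError:
        raise ValueError(f"unknown action: {name!r}") from None
    return handler(data)


result = run_action("f2", "payload")
print(result)
```

The action is still selected by a string computed at runtime, which keeps most of the flexibility of the eval version, while the data is passed as an ordinary argument and is never parsed as code.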

This doesn't happen accidentally. Writing Perl or another scripting language as though it were C leads to severe performance and correctness problems. The worst problem we commonly see with eval is that it's simply not understood. Unsophisticated Tcl programmers often write such baroque combinations as:

    eval exec {myapp "$arg1"}

rather than the simpler:

    exec myapp $arg1

simply because they've mislearned superstitions about using { or " with eval.

Scripting is about getting more done with less. A computer is a good tool for building an expressive text or other data structure. When you then turn around and evaluate that text as a script in its own right, you're taking advantage of one of scripting's most powerful benefits. Learn the eval and associated functions of your favorite scripting languages and you'll multiply your ability to write compact and effective applications.
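
A minimal sketch of that build-then-evaluate cycle, written in Python under the assumption that the generated text is a small function definition: `make_formatter` (a hypothetical helper) assembles the source of a function from a list of field names, evaluates it with `exec()`, and returns the resulting function. The field names are checked first so that only plain identifiers reach the evaluator.

```python
import keyword


def make_formatter(fields):
    """Build the source of a function as text, then evaluate it."""
    # Reject anything that is not a plain identifier before it reaches exec().
    for field in fields:
        if not field.isidentifier() or keyword.iskeyword(field):
            raise ValueError(f"bad field name: {field!r}")
    args = ", ".join(fields)
    parts = ", ".join(f"'{field}=' + str({field})" for field in fields)
    source = f"def formatter({args}):\n    return ' '.join([{parts}])\n"
    namespace = {}
    exec(source, namespace)
    return namespace["formatter"], source


# Hypothetical usage: generate a two-field formatter and call it.
formatter, generated_source = make_formatter(["host", "port"])
line = formatter("example.org", 80)
print(generated_source)
print(line)
```

Printing `generated_source` shows the script the program wrote for itself, which helps when debugging generated code.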

For more on this subject -- and almost everything else having to do with procedural computing languages -- start by reading Structure and Interpretation of Computer Programs (see Resources for more information). 


Copyright © 1996-2008 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Copyrights for original materials belong to their respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last updated: March 15, 2008