Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

Introduction to Perl for Unix System Administrators

(Perl without excessive complexity)

by Dr Nikolai Bezroukov


1.2. The Place of Perl among other scripting languages; good and bad things about Perl

Right now the main Perl competitor is PHP.  

What is good about Perl

I came to Perl with a mainframe PL/1 (and REXX) programming background and was surprised to find that Perl emulates some PL/1 ideas (see below). Probably because of my positive experience with PL/1 (and especially with PL/1 compilers), I like complex, non-orthogonal, expressive languages and, unlike most people from academia, do not share the excessive excitement about orthogonality in programming languages.

IMHO programming languages should be more like a Huffman code: the most frequently used constructs should get very short special forms. That idea is pretty foreign to the lovers of orthogonality in languages. Actually, due to my PL/1 background, I do not regard C as a good multipurpose language. It is too low-level for this, although it is probably the first language that systematically tried to provide a Huffman-code style of design: shortcuts for the most frequently used operations; just note such idioms as i++; i+=2; etc. C's value lies in the fact that it provides a very good abstraction of the instruction set of modern computers. That is why it has managed to dominate system programming for more than 20 years. C, a much smaller language that descends from BCPL and B rather than from PL/1, threw away some important PL/1 features, such as the flexibility of its exception mechanism and the power of its string handling. Even in the field of syntax PL/1 has features that are absent even from modern C dialects, like multilevel closure of blocks with a labeled end statement. In this sense Perl is much closer to PL/1 in philosophy: it is a complex non-orthogonal language with excellent string processing capabilities, and that's why I like it.

But at the same time I think that the Perl designers did not completely understand the PL/1 tradition, which in some areas is richer than what we see in Perl. In some places where I would like to write short expressive idioms Perl still provides unnecessarily verbose solutions. This situation can improve as Perl's library of built-in functions matures.

Like any scripting language, and more than many of them, Perl is a very high level language: it hides the details of particular data structures and underscores the purpose of the algorithms and data structures.

Many people complain about the naming convention in Perl (prefixing certain types of data structures with certain characters). But the naming convention for variables in Perl can be considered a kind of Hungarian notation, and as such it is not that bad even from the puritan view of a classic academic language designer.
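To make the Hungarian-notation comparison concrete, here is a minimal sketch of the three prefixes at work. The sub describe_hosts and its sample data are hypothetical, invented only for this illustration.

```perl
use strict;
use warnings;

# describe_hosts is a hypothetical helper; the host names and ports are made up.
sub describe_hosts {
    my @hosts = ('alpha', 'beta', 'gamma');     # @ : an ordered list (array)
    my %port  = (alpha => 22, beta => 2222);    # % : a lookup table (hash)

    # The prefix follows what the expression yields,
    # not how the variable was declared:
    my $second     = $hosts[1];      # one element of @hosts is a scalar, hence $
    my $alpha_port = $port{alpha};   # one value of %port is a scalar, hence $
    my $how_many   = @hosts;         # an array in scalar context yields its length

    return ($second, $alpha_port, $how_many);
}

print join(' ', describe_hosts()), "\n";        # beta 22 3
```

The prefix tells the reader at a glance whether an expression is one value, a list, or a table, which is roughly the service Hungarian notation is supposed to provide.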

What is important is that Perl tries to move in the right direction, extending existing structures to add more power or flexibility or both. In many areas it succeeded. In some it failed. But all in all Perl is the most successful of the new breed of very high level languages and it has generated a lot of followers. Think about Python and Ruby as two examples.

Also, in judging Perl we need to understand that it belongs to the family of complex non-orthogonal languages, the family that was started by PL/1 and successfully continued by Ada and later C++ (only partially, as C++ did not solve one of the key C language problems: the difficulty of working with strings).

In some areas Perl has gone too far and contains features that overcomplicate the language. One example that is often mentioned is the postfix if statement. It doesn't add any new capability to the language, and it confuses new users. But things are not that simple: for seasoned professionals it provides a convenient notation for an important special case (if and only if it is used for writing exit conditions in loops). So there is a general contradiction between ease of learning and ease of use by professionals. In Perl (in the best Unix tradition), the compromise tends to favor professionals. And while we can criticize the language, we need to understand the nature of this tradeoff: Perl was not designed for casual users.
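As a small illustration of that special case, here is a minimal sketch of the postfix if used only for loop exit and skip conditions. The sub lines_before_marker and its sample input are hypothetical, made up for the example.

```perl
use strict;
use warnings;

# Collect the lines that precede a marker line, ignoring comment lines.
# lines_before_marker is a hypothetical helper used only for illustration.
sub lines_before_marker {
    my ($marker, @lines) = @_;
    my @kept;
    for my $line (@lines) {
        last if $line eq $marker;      # exit condition, stated first
        next if $line =~ /^\s*#/;      # skip condition for comment lines
        push @kept, $line;
    }
    return @kept;
}

my @kept = lines_before_marker('__END__', 'a', '# note', 'b', '__END__', 'z');
print "@kept\n";                        # a b
```

Used this way, the conditions that end or skip an iteration sit on a single line at the top of the loop body; using the postfix form for ordinary branching is where it starts to hurt readability.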

I also like the fact that Perl OO is an afterthought, not a principle as it is in Java. Isn't that the Perl way?

What is problematic

Language faults are the logical continuation of their strong points. JavaScript has a cleaner syntax, is standardized, and is now supported by Microsoft in the form of JScript as a scripting language for ActiveX (actually Perl is also partially supported by Microsoft via its investment in ActiveState). But the Microsoft JavaScript implementation is well integrated into Windows (via the scripting engine).

Python is more academic and supports coroutines. One of the best reasons I can think of for a Perl programmer to learn Python is simply the fact that it enriches your understanding of scripting languages. Over time, exposure to other scripting languages like Tcl and Python should help your everyday work in Perl. After all, how many of us started using closures in Perl without seeing them first in something Lisp-like? My expectation would be that there are good things about Python that are applicable outside Python. Determining what they are and adopting them (perhaps in a more Perlish fashion) should make a Perl programmer better over time.

Of course, there are many things a Perl programmer will find disturbing about Python. (The opposite is also true.)

Now, if Python really was Fred Brooks's silver bullet to slay the software engineering beast, I'd be shouting the word from the roof-tops. But Python does not significantly reduce the complexity of software development. "Cleaner" or more restrictive syntax may aid newbies, but it more likely hinders experienced programmers. Bad habits aside, there are usually more ways to express a solution correctly than a traditionally orthogonal language may lend itself to. B&D style tools like Java, which have an amazingly paranoid class protection system, seem to implicitly doubt the ability of their users. Although there are probably good uses for it, "finalizing" a class to me is the height of arrogance. Better to say "none can improve on this fabled code!".

The other viable competitor among open source products is not Python, as many think; it's PHP. PHP is now running over a million web sites, and with good reason. PHP is open source, it runs equally well on NT and UNIX, and it's well documented. PHP is no doubt stealing market share from Perl, but the OSS media has been unusually quiet about the issue.

The main competitor of Perl is not Java or Python; its two major competitors are JavaScript and PHP.

 

The price that you need to pay

A concept of "cognitive dissonance" exists in the psychological literature. When mastering a new language you first face a level of "cognitive overload" until the quirks of the language become easily handled by your unconscious mind. At that point cognitive dissonance disappears, and all of a sudden the quirky interaction becomes a "standard" way of performing the task. Once learned, the quirks become incorporated into your understanding of the language.

One early sign of the disappearance of cognitive dissonance in learning Perl is when you start to put $ on all scalar variables automatically and overcome the notational difficulty of using two different operators ("==" and eq) for comparison.

Perl uses "==" for numeric comparison and "eq" for string comparison.
This is a constant source of difficult-to-find errors even for experienced programmers,
and probably the most glaring "misfeature" of Perl.
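A minimal sketch of how the two operators disagree on the same operands. The helper subs num_eq and str_eq are hypothetical wrappers written only for this illustration; the numeric warning is switched off inside num_eq so that the silent behavior is visible.

```perl
use strict;
use warnings;

# Hypothetical helpers: each wraps one of the two comparison operators.
sub num_eq {
    my ($left, $right) = @_;
    no warnings 'numeric';          # without this, Perl warns "isn't numeric"
    return $left == $right ? 1 : 0;
}

sub str_eq {
    my ($left, $right) = @_;
    return $left eq $right ? 1 : 0;
}

for my $pair (['abc', 'xyz'], ['1.0', '1'], ['10', '10']) {
    my ($left, $right) = @$pair;
    printf "%-5s vs %-5s  == gives %d, eq gives %d\n",
        $left, $right, num_eq($left, $right), str_eq($left, $right);
}
```

With == both 'abc' and 'xyz' are converted to 0, so they compare as equal; with eq the strings '1.0' and '1' are different even though they denote the same number. Running scripts with use warnings (or perl -w) at least turns the first mistake into a visible "isn't numeric" warning.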

From the point of view of a computer scientist Perl is a rather strange language. It adopts what could be called an "assembler-style approach to operators". Perl is probably the only high-level language in which, as in assembler, the operation defines the type of its operands. The buzzword context is used in Perl literature for this particular feature. You will see the catch phrases scalar context, string context, and numeric context pretty soon. That means that, as in assembler, the accidental use of the wrong comparison operator leads to a subtle error that is difficult to detect. It is a design flaw similar to the controversial decision in C to treat assignments as expressions, which leads to typical and nasty errors like the accidental use of an assignment instead of a comparison in an if statement ( if (a=1)... ) by beginners (and not only beginners ;-).

You need to check your scripts against the most common errors with some kind of lint program (possibly a custom one) and create special checklists for the most common errors. Checklists in the chapters might help you, but it is your own experience and idiosyncrasies that matter most.

The other useful advice for a Perl programmer is the famous KISS principle. Please be wary of using complex constructs: diagnostics in Perl are not impressive and a beginner can easily shoot himself in the foot. A lot of Perl gurus are addicted to complexity, so treat their advice skeptically. Cutting an additional dozen bytes from the code is not always a noble goal :-).

Perl diagnostics

Perl is a difficult language to master and it is oriented more toward professionals than toward novices. Exacerbating all problems with the language are the diagnostics the interpreter generates, which are sometimes cruelly misleading and often uninformative. A complaint that something is wrong on line X can often mean that the error is several lines above or below. In no way should one rely on the line numbers that the Perl interpreter includes in an error message. Consider them a very fuzzy pointer and search the vicinity.

Perl and regular expressions

A related problem is Perl's over-reliance on regular expressions, which is exacerbated by the advocacy of regex-based solutions in almost all O'Reilly books. The latter were until recently the most authoritative source of published information about Perl.

While a simple regular expression is a beautiful thing and can simplify operations on strings considerably, overcomplexity in regular expressions is extremely dangerous: it cannot serve as a basis for serious, professional programming; it is fraught with pitfalls, a big semantic mess that results from regular expressions outgrowing their primary purpose. Diagnostics for errors in regular expressions are even weaker than for the language itself, and here many things just go unnoticed.

Unless you understand the semantics of regular expressions well and are able to debug them, you should limit their complexity to the level at which you feel comfortable and not jump over your head creating a "one expression solves the whole problem" type of mess. Complex regular expressions are a source of subtle bugs that are difficult to find even for people who write regular expression engines for a living. But again, like a sharp blade in skilled hands, a regex is a very powerful tool. Yes, you may cut yourself, but you can do the work quickly if you have the necessary skills. Just don't overdo it in your programs.

This advice can be extended to other features of the language: remember about the KISS principle and try to use a particular feature in the simplest way possible. For example it's perfectly possible to write simple Perl scripts without complex regular expressions or the fancy OO idioms recommended by some Perl gurus.
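As one hedged illustration of this advice, the sketch below parses a "key = value # comment" line with three trivial patterns and a split instead of a single all-in-one expression. The sub parse_setting and the line format are hypothetical, and the sketch assumes that values never contain a # character.

```perl
use strict;
use warnings;

# Parse one "key = value  # comment" line in small, separately checkable steps.
# parse_setting is a hypothetical helper; it assumes values never contain '#'.
sub parse_setting {
    my ($line) = @_;
    $line =~ s/#.*//;                 # 1. drop a trailing comment
    $line =~ s/^\s+|\s+$//g;          # 2. trim surrounding whitespace
    return if $line eq '';            # 3. blank or comment-only line
    my ($key, $value) = split /\s*=\s*/, $line, 2;
    return unless defined $value;     # 4. no "=" sign: not a setting
    return ($key, $value);
}

my ($key, $value) = parse_setting("  max_users = 25   # per host\n");
print "$key -> $value\n";             # max_users -> 25
```

Each step can be tested and debugged on its own, which is exactly what a single clever expression does not allow.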

Remember the KISS principle: it is vital for successful programming in Perl.

PL/1-style conversions on demand between arithmetic and string types, adopted by Perl, lead to problems that are well known to seasoned PL/1 programmers: everything is fine until Perl makes a wrong decision, and then you end up searching for the error for a day or more, which might wipe out any economy achieved by the feature. After a couple of such errors one usually starts thinking about moving to another language, but, like cigarettes, this Camel is addictive and few will ;-).
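A minimal sketch of how such conversions on demand can go quietly wrong. The helper subs add and concat are hypothetical wrappers invented for the illustration; the numeric warning is disabled inside add so that the silent behavior is visible.

```perl
use strict;
use warnings;

# The operator, not the operand, decides between number and string.
# add and concat are hypothetical helpers used only for illustration.
sub add {
    my ($left, $right) = @_;
    no warnings 'numeric';            # hide the "isn't numeric" warning
    return $left + $right;
}

sub concat {
    my ($left, $right) = @_;
    return $left . $right;
}

print add('10', '5'), "\n";           # 15
print concat(10, 5), "\n";            # 105
print add('3 apples', 2), "\n";       # 5: leading digits are used, the rest dropped
print add('abc', 1), "\n";            # 1: a non-numeric string counts as 0

my @sizes      = (9, 10, 100);
my @as_text    = sort @sizes;                 # default sort compares as strings
my @as_numbers = sort { $a <=> $b } @sizes;   # explicit numeric comparison
print "@as_text\n";                   # 10 100 9
print "@as_numbers\n";                # 9 10 100
```

The default sort is probably the most common way to meet this problem in practice: nothing fails, the output merely comes out in the wrong order.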

As I already noted the flaws in languages are logical continuations of their strongest points -- some are inherent to the language and beyond redemption, others could in theory be fixed, say in Perl 6...

Anyway, we must live (and survive) in a far from perfect world. Unlike PL/1, Perl will never die just because simpler languages are available (PL/1 was mainly overrun by C and Pascal). The WWW created a huge niche for complex non-orthogonal languages. Perl provides an excellent level of integration with Unix, especially in its use of classic Unix tools. I'm a hard-core believer that by integrating components from several levels, and especially from the OS utilities level (classic Unix tools), programs get simpler and take less time to create, maintain and modify.

If somebody tries to teach Perl as a first language, I do not envy the students. IMHO Perl should never be taught as a first language. See What's wrong with Perl for more information. I agree with most of its points but disagree with the conclusion and prefer Perl to Python.

In general, what this means is that Perl is a large and complex language, which takes a long time to learn properly. In my opinion, this complexity is unnecessary and a simpler language would have been much better. I think this also means that many non-expert Perl developers write suboptimal code.

Another thing is that I think few Perl developers (percentage-wise) write general and reusable modules, because you need to learn the language well before doing so, something that is relatively hard and takes time. Another thing is that the language itself does not encourage this.

Perl's rise to prominence (as is true of all other popular languages) is to a certain extent accidental and is more a result of being in the right place at the right time. It is not directly connected with the quality of the language. The Web tide raised all boats, and it raised Perl much like it raised HTML, JavaScript, PHP and Java. But after the language became popular this very fact became an extremely important "feature" of the language and should be considered as such. Nothing succeeds like success. The same situation exists with VB: popularity is probably the most important feature of that language.

Situations in which a language is excessively complex become especially annoying when you need to learn and support several such languages: Bash/ksh, Perl, Java, C++, etc. Resistance is futile, and most programmers who need to work in a Unix environment in fact end up having to learn at least a half-dozen half-baked languages. I use the term "half-baked" for a language without a standard; in this sense Perl is half-baked. If you need to work on both Windows and Unix the situation is much worse, although most Unix-based languages have now been ported to Win32. Actually ActiveState's Perl debugger for Win32 is better than any Perl debugger that exists for Unix. The bad thing is that even if you need to work in two environments you still have only one head.

In the current overcomplicated environment the idea of minimizing your language collection is very important. In this sense Perl can be a very reasonable choice: it is easier to learn this one complex language than several simple and less universal Unix utilities that together have the same functionality.

Some folks have argued that you should use the "right language for the job", but with so many of them one cannot afford to divert too many resources into "treading water". Therefore minimizing your "language collection" is very important, and here there is an important question of whether one needs to use Perl or Tcl or Python. It looks like the idea of PL/1 as a universal high level language is now much more realistic than before. One of Perl's advantages is that it's pretty universal and can be used instead of a collection of several other scripting languages. The complexity and fuzziness are definite disadvantages, but power is an important advantage. In any case there is no free lunch. Perl looks ugly and smells bad, but other scripting languages may be even worse.

What we need is to create a decent framework for Perl -- distinct from all this "Perl is similar to the natural languages" blah-blah-blah in which some Perl advocates are engaged. Perl has nothing to do with natural languages, and any specialist who claims otherwise can be considered a pure marketer.

Moreover, while it is a Unix language, Perl breaks with Unix tradition: it is a universal language, written in the language tradition represented by PL/1 and MS Office. And that's why in no case should it be your first language; Python would probably be a better choice. Here is one interesting comment from Linux Today on the Perl.com article "Teaching Perl to First-Time Programmers" that illustrates one typical problem with Perl:

Internet Dog - Subject: The example illustrate the contrary (Nov 19, 1999, 13:57:54 )
The article state:

    "For instance, the automatic conversion between string and numeric types is
    what non-programmers expect"

The automatic conversion "feature" of perl should be considered a flaw. It allows silent errors to be missed. A newbie might expect the conversion to be automatic, but a skilled programmer will appreciate the problems that automatic silent type conversion can cause. The first time this occurs it is a simple matter to explain the difference between types. A good teaching language would flag the addition of unlike types as an error. The example assumes that other languages would interpret the string "3" as the numerical value of 51. This type of error can occur in a weakly typed language like C, but not in strongly typed languages. To compare Perl to C and state it is a good teaching language is to compare it to a straw dog. There are many other languages to consider for the role.

I like the Unix idea of "hyper tools": a collection of small utilities cooperating to achieve tasks. But the problem is that one hundred simple tools, each with a dozen slightly different one-letter options, are not that simple. It is actually a pretty complex "ad hoc" language and far from being an orthogonal metalanguage; paradoxically, the Unix tools collection became the exact opposite of the initial idea. And Perl, as the opposite approach, is still simpler: it is an honest attempt to integrate as much functionality as possible into a single monstrous utility (like MS Office), and this integration makes things more orthogonal. Anyway, if you are a Unix fundamentalist please stop here: you will be much better off using Tcl.

I also reject the assumption that because some people have difficulties with a language, it is a bad language; this is too simplistic. People are very flexible and can adapt to any language, even a horrible one. If, for example, novices have difficulties in mastering the language (and some of them probably will never be able to master it due to lack of IQ, motivation, need or all of those factors), that does not mean that most professional programmers cannot benefit from it.

Shell vs Macro language

A scripting language is an amalgam of two roles: a shell for integrating existing components and a macro language for applications. The Bourne shell was historically the first successful language in the first role; REXX, Tcl and JavaScript were reasonably successful in the second. A bigger problem for Perl is that it is mainly a shell; its presence as a macro language is almost non-existent. There are several reasons for that. One of them is that Perl was never designed to have a clean interface with high level languages (for example C, which is its implementation language).

Regrettably, Perl cannot be used equally well as a shell and as a macro language.
Perl's strengths are on the shell side of scripting; it is much weaker on the macro language side.

All in all, Perl is a good example of the New Jersey "Worse is Better" approach. It works, it is here now, but it is far from being a small, simple and bug-free implementation. However, given its text-oriented approach and vast number of operators, Perl integrates the capabilities of the existing Unix text-oriented filters very well, making many of them obsolete. It is interesting to note that Perl can replace many traditional Unix utilities, which is a good thing because most of them have outlived their usefulness anyway. You can connect small Perl scripts with pipes and accomplish the same task more easily using just one syntax, not ten different syntaxes for ten different utilities (find, grep, awk, sed, cut, etc.).

Unfortunately, you cannot connect internal Perl subroutines via pipes (coroutines are absent from Perl); in this sense Perl is weaker than the ksh93 shell. So the traditional "worse is better" holds because "better is the enemy of done".
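To illustrate the "one syntax instead of ten" point, here is a minimal sketch that does in one small Perl sub roughly what a grep -v '^#' | cut -d: -f7 | sort | uniq -c pipeline would do. The sub count_shells and the passwd-style sample lines are hypothetical, made up for the example.

```perl
use strict;
use warnings;

# Count login shells in passwd-style lines.
# count_shells is a hypothetical helper; the sample records are invented.
sub count_shells {
    my (@lines) = @_;
    my %count;
    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^#/ or $line eq '';     # the "grep -v" part
        my $shell = (split /:/, $line)[6];        # the "cut -d: -f7" part
        $count{$shell}++ if defined $shell;       # the "uniq -c" part
    }
    return %count;
}

my %count = count_shells(
    "root:x:0:0:root:/root:/bin/bash\n",
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n",
    "alice:x:1000:1000::/home/alice:/bin/bash\n",
);
printf "%3d %s\n", $count{$_}, $_ for sort keys %count;   # the "sort" part
```

To use the same sub as a filter in a pipe, it should be enough to call count_shells(<STDIN>) instead of passing literal lines.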

Now let's talk about Perl and performance problems, which will bring us to Perl vs. Java and Perl vs. C/C++. Yes, Perl is slow, but the performance problems attributed to its text orientation are often overblown. I think that expressing things using text primitives and pipes, with a set of filters and a lower level partially written in C, often leads to simpler and more manageable programs than any other notation, including the object-oriented approach. Such programs have acceptable efficiency in most circumstances and can easily be tuned to achieve higher efficiency when necessary. In comparison with OO-style solutions, you trade fashionable notation and buzzwords for a lot more power.

Is Perl Slow?

Like it or not, the vast majority (90%) of software written is not (and need not be) of operating system or compiler caliber in terms of efficiency (and probably robustness, quality, performance, maintainability). A lot of software belongs to the category of one-season software: in many cases its life expectancy is far too short, because needs, requirements, etc. change very quickly, to warrant additional time spent in development. The benefits just don't justify the costs.

Also, in many cases developers are under too much pressure, lack the skills, or simply don't care enough about the quality of their work to do a good enough job with Java, C or C++. So anyone who truly thinks that Perl is "too slow" on modern hardware should understand that any speed comparison is meaningless if the program does not work, and that if it works, any program can be improved quite considerably with profiling and experimentation.

Perl might be slow for some applications, but this is not a terminal disadvantage, and speed can be improved by selective tuning of less than 20% of the code. All in all, it is much less of a disadvantage than one may think.

There are few problems that I've needed to solve in the last few years that I couldn't solve with Perl with reasonable efficiency, and never did I think that the quality or performance suffered. You can get some feeling for the compromises involved from the following questions:

As Donald Knuth observed in his profiling studies, a small part of a program typically accounts for most of its running time; the usual rule of thumb is that more than 80% of the computational time is spent in less than 20% of the code. Let's reverse this fact and ask what percentage of the code has requirements demanding top performance. Probably less than 20%. So I feel that within those limits Perl is a pragmatic approach and as such deserves your attention. For the remaining 20%, C programs should be written and interfaced with the existing Perl code (for example via pipes or sockets).



Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Site uses AdSense so you need to be aware of Google privacy policy. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Last modified: September 05, 2009