Softpanorama

May the source be with you, but remember the KISS principle ;-)


Perl for Unix System Administrators

News

Scripting Languages

eBook: Perl for system admins

Recommended Perl Books Recommended Links Recommended Papers Perl Language Perl Reference
Perl as a command line tool Perl Regular Expressions Overview of Perl regular expressions More Complex Perl Regular Expressions Perl namespaces Perl modules Subroutines and Functions Pipes
Perl Debugging Perl applications Perl One Liners HTML Matching Examples Perl IDE Perl Certification Notes on Perl Hype Perl for Win32
Perl Style Beautifiers and Pretty Printers Perl Xref Perl lint Perl power tools Perl IDE and Programming Environment Perl Error Checklist Perl Warts
Perl POD documentation Larry Wall Quotes Larry Wall Articles and Interviews Tips Perl history and evolution Humor Etc

Introduction

This is a very limited effort to help Unix sysadmins to learn Perl. It is based on my FDU lectures to CS students. We discuss mainly "simple Perl", and the site tries to counter the "excessive complexity" drive that dominates many Perl-related sites and publications.  See also my ebook Introduction to Perl for Unix system administrators (it can probably also be used for preparing for the Certified Internet Web Professional Exam). Perl continues to evolve. Thankfully it evolves slowly, but during the last decade we got state variables (5.10) and a couple of other useful features, along with several others which many would consider redundant or even harmful.

The syntax of Perl is pretty regular and compares favorably with the disaster that is the syntax of the Bourne shell and its derivatives, as well as with the syntax of C and C-derivatives. Larry Wall managed to avoid most classic pitfalls in creating the syntax of the language, pitfalls into which the creators of PHP readily fell ("dangling else" is one example).

System administrators need to deal with many repetitive tasks in a very complex and changing environment, which often includes several different flavors of Linux (RHEL and SUSE) and Unix (Solaris, HP-UX and AIX). Linux has Perl installed by default.  It is also included in all major Unix flavors, but the version installed might be outdated and need upgrading. Oracle installs Perl too, so it is automatically present on all servers with Oracle.  Some other applications install Perl as well.  That means that Perl provides the simplest way to automate the recurring tasks a sysadmin needs to deal with on multiple platforms.

You definitely can greatly simplify your life, as well as improve the "manageability" of a server (or group of servers), with additional Perl scripts -- as long as you still apply the KISS principle, because at some point the task of maintaining the scripts becomes a burden, and unless the scripts are simple the game might not be worth the candle.
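As an illustration of what a "simple script" can look like, here is a minimal sketch of a recurring cleanup task (the directory name and the 30-day retention period are arbitrary assumptions, not a recommendation):

#!/usr/bin/perl
use strict;
use warnings;

my $dir  = '/var/tmp/app_reports';   # assumption: where old reports accumulate
my $days = 30;                       # assumption: retention period in days

opendir my $dh, $dir or die "cannot open $dir: $!";
for my $file (readdir $dh) {
    my $path = "$dir/$file";
    next unless -f $path;            # skip subdirectories, symlinks, etc.
    if ( -M $path > $days ) {        # -M returns the age of the file in days
        print "removing $path\n";
        unlink $path or warn "cannot remove $path: $!";
    }
}
closedir $dh;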

As most sysadmins already know shell, the affinity with shell is one of the major advantages of using Perl as the second scripting language for a Unix sysadmin. No other language comes close in this respect: Perl allows you to reuse most of the key concepts of the shell.

IMHO the main advantage of using a powerful, complex language like Perl is the ability to write simple programs. Perhaps the world has gone overboard on this object-oriented thing. You do not need many of the tricks used in lower-level languages, as Perl itself provides high-level primitives for the task. This page is linked to several sub-pages. The most important among them are:

All languages have quirks, and all inflict a lot of pain before one can adapt to them. Once learned, the quirks become incorporated into your understanding of the language. But there is no royal road to mastering a language. The more different one's background is, the more one needs to suffer. Generally, any user of a new programming language needs to suffer a lot ;-)

When mastering a new language you first face a level of "cognitive overload" until the quirks of the language become easily handled by your unconscious mind. At that point, all of a sudden, the quirky interaction becomes a "standard" way of performing the task. For example, regular expression syntax seems to be a weird base for serious programs, fraught with pitfalls, a big semantic mess as a result of outgrowing its primary purpose. On the other hand, in skilled hands this is a very powerful tool that can serve as a reliable parser for complex data and, in certain cases, as a substitute for string functions such as index and substr.
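For example, a parse that would otherwise take several index/substr calls can be done with one regular expression; this is only a sketch, and the syslog-style line shown is an assumption about the input format:

my $line = 'Nov 18 03:21:07 webhost sshd[2112]: Accepted publickey for deploy';

if ( $line =~ /^(\w{3})\s+(\d+)\s+([\d:]+)\s+(\S+)\s+([^\[:]+)/ ) {
    my ($month, $day, $time, $host, $prog) = ($1, $2, $3, $4, $5);
    print "$host ran $prog at $time\n";    # prints: webhost ran sshd at 03:21:07
}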

There are several notable steps in adaptation to Perl idiosyncrasies for programmers who are used to other languages:

  1. One early sign is when you start to put $ on all scalar variables automatically.  That's easy for people who write their own shell scripts and generally is not a problem for sysadmins. Most mistakes where you omit $ in front of a variable are diagnosed by the interpreter, but some, like $location{city}, are not. The problems arise if, along with Unix shell, you use a third language, for example C.  In that case you make such mistakes despite your experience, and you need conscious effort all the time. This is the case with me.
  2. The next step is to overcome the notational difficulty of having two different sets of comparison operators, one for numbers and the other for strings ("==" for numbers vs. eq for strings). If two variables are involved, the interpreter does not provide any warnings, so you need to be very careful: if you use another language in parallel with Perl, such errors creep into your scripts automatically. If one of the operands of "==" is a string constant, automatic diagnostics can be provided. See the sketch after this list.
  3. Conscious use of "==" as the equality predicate for numbers. If your previous language allowed the use of "=" for assignment, you are in trouble: the pitfall of using "=" for assignment results in the side effect of introducing errors into comparisons in Perl, when you put "=" instead of "==", for example if ($a=1)... instead of if ($a==1)... This problem was understood by the designers of Algol 60, which was designed in the late 1950s; to avoid it they used := for assignment instead of plain =. But the Perl designers, like lemmings, followed the C designers and naturally stepped on this rake again, with predictable results. Actually, the designers of Fortran, PL/1, C (as a derivative of PL/1), C++ and Java all ignored this lesson (the Fortran designers are actually not guilty, as Fortran predates Algol 60). And because C (with its derivatives such as C++ and Java) became the dominant programming language, we have what we have: propagation of this blunder into many top programming languages. Now think a little about the notion of progress in programming language design ;-)  It's sad that a design blunder which designers understood 65 years ago is still present in the most popular languages used today ;-). In all languages that share the lexical structure of C, this blunder remains one of the richest sources of subtle errors for novices, and naturally this list includes Perl. C programmers are typically already trained to be aware of this language pitfall. But in Perl you too can use several protective measures:
    1. Modify syntax highlighting in your editor so that such cases are marked in bold red.
    2. Manually or automatically (a simple regex run in the editor can detect ~99% of cases) reorganize such comparisons, putting the constant on the left side of the comparison, as in if (1==$a)....
    3. Recent versions of the Perl interpreter warn about this case, so check your scripts with the -cw option or, better, use the "use warnings" pragma in all your scripts. It also helps if the IDE can display the results of a syntax check in one of its windows and jump to the line in the code listed in the error or warning (this is a standard feature of all IDEs, and it can actually be done in most editors too).
  4. Learning to find a missing closing "}". This problem is typical for all C-style languages and generally requires a pretty printer to spot.  But the Perl interpreter has a blunder of its own -- it does not take into account the fact that in Perl subroutines can't be nested within blocks, and it does not point to the first subroutine after the unclosed block as the diagnostic point -- it points to the end of the script.  The best approach is to extract the suspicious section into a new file and check it separately, cutting out irrelevant parts until you detect the problem. The longer the program, the more acute this problem becomes. BTW, this problem was solved in PL/1, which was created in the early 1960s: PL/1 has labels for closing statements, as in "mainloop: do ... end mainloop", which close all intermediate constructs automatically. Both C and Perl failed to adopt this innovation.  Neither Perl nor C uses the Pascal concept of "local numeric labels" either -- labels that exist only until they are redefined; see the discussion by Knuth.
  5. Learning to find a missing " (double quote) or ' (single quote).  With a good editor this is not a problem, as syntax highlighting shows you where the problem begins.
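To make the second pitfall concrete, here is a minimal sketch of the string/number comparison trap; the version strings are arbitrary, and this is exactly the case where even "use warnings" stays silent because both values look numeric:

use strict;
use warnings;

my $want = "5.1";
my $have = "5.10";

print "numeric == says: ", ( $have == $want ? "same" : "different" ), "\n";   # same (5.10 == 5.1)
print "string  eq says: ", ( $have eq $want ? "same" : "different" ), "\n";   # different

The numeric comparison silently treats both strings as the number 5.1, which is almost never what was meant when comparing version strings.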

Please note that the syntax of Perl is complex, so the diagnostics from the Perl interpreter are really bad and often point to a spot far below where the error actually occurred.  They are nowhere near the quality of diagnostics that mainframe programmers got from the IBM PL/1 diagnostic compiler, which is also probably 50 years old and ran on machines that are tiny by today's standards, with 256K (kilobytes, not megabytes) of RAM and 7M (megabytes, not gigabytes or terabytes) hard drives.  The only comfort is that other scripting languages are even worse than Perl ;-).

Benefits that Perl brings to system administration 

All in all, Perl is the language that fits most sysadmin needs. It's not fancy and its general usage has been in decline since 2000, but fashion should never be the primary factor in choosing a scripting language. Perl has a stable and well-tested interpreter and is close to shell (to the extent that most concepts of shell can be directly reused). And that's what is important. As on modern servers the Perl interpreter loads in a fraction of a second, Perl also allows you to get rid of most uses of AWK and sed, making your environment more uniform and less complex. This is an important advantage. Among the benefits that Perl brings to system administration are:

In short, it makes sense to learn Perl, as it makes sysadmin life a lot easier -- probably more so than any other tool in the sysadmin arsenal...

Perl is really great for text processing, and in this particular domain it is probably unmatched. For comparison, in Python regular expressions are implemented via a standard library module; they are not even part of the language.
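For instance, the classic awk and sed idioms map directly onto Perl one-liners (the file names below are just placeholders):

# awk '{ print $1, $3 }' access.log          becomes
perl -lane 'print "$F[0] $F[2]"' access.log

# sed 's/foo/bar/g' config.ini > config.new  becomes
perl -pe 's/foo/bar/g' config.ini > config.new

# in-place editing, keeping a backup copy
perl -pi.bak -e 's/foo/bar/g' config.ini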

A warning about relative popularity

As of 2017 Perl no longer belongs to the top 10 programming languages (Scripting languages slip in popularity, except JavaScript and Python, Infoworld, Nov 13, 2017).  It is still more popular than Visual Basic, so there is nothing to worry about, but it is far less popular than Python.  Of course, popularity is not everything. Python and Perl share some characteristics, but don't exactly occupy the same niches. Still, popularity counts for a lot: fashion rules programming, so this is a factor that you need to consciously evaluate and be aware of.

In a large enterprise environment, outside the system administration area, Perl is now almost invisible. Python is gaining ground in research, mostly because universities both in the USA and Europe now teach Python in introductory classes, and engineers come out "knowing some Python". This looks like the "Java success story" of the late 1990s on a new level. Like Perl, Python is now also installed on all Linux distributions by default, and there are several important Linux system programs written in Python (yum, Anaconda, etc.), which implicitly suggests that Python has the Red Hat mark of adoption/approval too (yum was originally written at the Duke University Department of Physics).

So there is now pressure to adopt Python. That's sad, because IMHO Perl is a great scripting language which can be used on many different levels, starting as an AWK/sed replacement tool. Going from Perl to Python for text processing to me feels like leaving a Corvette and driving a station wagon. Python will get you there, but it's not fun and will take more time, although you probably might feel more comfortable inside.

To say nothing about the list of IDEs available. Perl does not even ship with a "standard IDE", although Padre, which is somewhat competitive with Komodo, is available for free; the latest binary distribution suitable for beginners is from 2012, but it is still highly usable. Komodo Edit can serve as a surrogate IDE too.  PyCharm, the most popular Python IDE, can work with Perl and works well.  My feeling is that for Perl to remain competitive, an IDE should be maintained and shipped along with the Perl interpreter (as in the Python and R distributions), maybe at the expense of some esoteric modules included in the standard library.  Also, the number of books devoted to Python published per year and available via Amazon in 2017 is at least an order of magnitude larger than the number of books devoted to Perl (quality issues aside). All this creates real pressure to use Python everywhere, even in system administration.

Here is an insightful post on this topic (Which is better, Perl or Python? Which one is more robust? How do they compare with each other?):

Joe Pepersack, Just Another Perl Hacker Answered May 30 2015

Perl is better. Perl has almost no constraints.  Its philosophy is that there is more than one way to do it (TIMTOWTDI, pronounced Tim Toady). Python artificially restricts what you can do as a programmer.  Its philosophy is that there should be one way to do it.   If you don't agree with Guido's way of doing it, you're sh*t out of luck.

Basically, Python is Perl with training wheels.   Training wheels are a great thing for a beginner, but eventually you should outgrow them.  Yes, riding without training wheels is less safe.   You can wreck and make a bloody mess of yourself.   But you can also do things that you can't do if you have training wheels.   You can go faster and do interesting and useful tricks that aren't possible otherwise. Perl gives you great power, but with great power comes great responsibility.

A big thing that Pythonistas tout as their superiority is that Python forces you to write clean code.   That's true, it does... at the point of a gun, sometimes at the detriment of simplicity or brevity.   Perl merely gives you the tools to write clean code (perltidy, perlcritic, use strict, /x option for commenting regexes) and gently encourages you to use them.

Perl gives you more than enough rope to hang yourself (and not just rope, Perl gives you bungee cords, wire, chain, string, and just about any other thing you can possibly hang yourself with).   This can be a problem.   Python was a reaction to this, and their idea of "solving" the problem was to only give you one piece of rope and make it so short you can't possibly hurt yourself with it.    If you want to tie a bungee cord around your waist and jump off a bridge, Python says "no way, bungee cords aren't allowed".  Perl says "Here you go, hope you know what you are doing... and by the way here are some things that you can optionally use if you want to be safer"

Some clear advantages of Perl:

Advantages of Python

The most common versions of Perl  5 in production

RHEL 6.x now ships with Perl 5.10. Many classic Unixes still ship with Perl 5.8.8. Older Solaris and HP-UX servers might have versions below Perl 5.8.8, but in 2017 that's rare, as most such servers have been decommissioned (the typical lifespan of a server in a corporate environment is 5-7 years).

If you need compatibility with all major flavors of Unix in your scripts, it is a safe bet to write for Perl 5.8.8. Such a decision virtually guarantees compatibility with all enterprise servers, except those that should have been discarded 5 or 10 years ago. In other words, no "state" variables if you want "perfect" compatibility. Not perfect, but acceptable.

If you need only Linux deployment, compatibility can be achieved by using version 5.10, which allows you to use "state" variables. Version 5.10 looks like a more or less safe bet for Linux-only scripts: very few enterprise servers in 2017 are below RHEL 6, and those typically have Perl 5.8.
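As a reminder of what targeting 5.8.8 gives up, here is a minimal sketch of a state variable (5.10 and later); the counter is just an illustration:

use strict;
use warnings;
use feature 'state';      # or: use v5.10;

sub next_id {
    state $counter = 0;   # initialized once; the value persists between calls
    return ++$counter;
}

print next_id(), "\n";    # 1
print next_id(), "\n";    # 2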

Also, too high a version of Perl is not always desirable -- see the note about Perl 5.22 below (the current version of Perl 5 is 5.26). Hopefully the warts added in version 5.22 will be corrected based on feedback. Here is a slide from Oct 3, 2016 by Tom Radcliffe, The Perl Paradox.

The problem of Perl complexity junkies

There is a type of Perl book author who enjoys the fact that Perl is a complex, non-orthogonal language and likes to drive this notion to the extreme. I would call them complexity junkies. Be skeptical and do not take the recommendations of Perl advocates like Randal L. Schwartz or Tom Christiansen for granted :-) Fancy idioms are very bad for novices. Please remember the KISS principle and try to write simple Perl scripts without complex regular expressions and/or fancy idioms. Some Perl gurus' pathological preoccupation with idioms is definitely not healthy and is part of the problem, not part of the solution...

We can define three main types of Perl complexity junkies:

My issue with Perl is when people get overly obfuscated with their code, because they think that fewer characters and a few pointers make the code faster.

Please remember the KISS principle and try to write simple Perl scripts without overly complex regular expressions or fancy idioms.  If you do this, Perl is a great language, unmatched in the sysadmin domain. Simplicity has great merits, even if it goes against the current fashion.

Generally, the problems with OO mentioned above are more fundamental than the trivial "abstraction is the enemy of convenience". It is more that a badly chosen notational abstraction at one level can inhibit innovative notational abstraction at others. In general, OO is similar to the idea of a "compiler-compiler", where you create a new language in such a way that new constructs can be compiled with the existing compiler.  While in some cases useful or even indispensable, there is always a price to pay for such fancy stuff.

Tips

Missing semicolon problem in Perl

Some deficiencies of Perl syntax were directly inherited from C. One of the most notable is the obligatory semicolon after each statement, which leads to a tremendous number of errors. A "soft semicolon" (an implied semicolon at the end of a line if round brackets and similar symmetrical symbols are balanced) would be a better approach, but for some reason it was never implemented (a semicolon is optional only before a closing curly bracket in the Perl interpreter). See also the discussion below.

Avoiding mistyping "=" instead of "==" blunders

One of the most famous C design blunders was the introduction of a tiny lexical difference between assignment and comparison (remember that Algol used := for assignment; PL/1 uses = for both), caused by the design decision to make the language more compact (terminals at that time were not very reliable, and the number of symbols typed mattered greatly). In C, assignment is allowed in an if statement, but no attempt was made to make the language more failsafe by avoiding the possibility of mixing up "=" and "==". In Perl syntax, if ($a = $b) assigns the contents of $b to $a and executes the code that follows if $b is not zero. It is easy to mix things up and write if ($a = $b) instead of if ($a == $b), which is a pretty nasty bug. You can often reverse the sequence and put the constant first, as in

if ( 1==$i ) ...
since
if ( 1=$i ) ...
does not make any sense, such a blunder will be detected at the syntax level.
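Recent versions of the interpreter also warn about the most common form of this typo when warnings are enabled; a minimal sketch (the exact wording of the message may vary between versions):

use warnings;

my $i = 0;
if ($i = 1) {              # typo: assignment instead of comparison
    print "always taken\n";
}
# with warnings enabled, perl complains with something like:
#   Found = in conditional, expected == at example.pl line 4.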

Locating unbalanced "}" errors

Typically this is connected with your recent changes, so you should know where to look. If not, use diff against the last version that compiled OK (I hope you use some kind of source code management system like Subversion or Git; if you don't, it's time to switch).

The optimal way to spot this problem is to use a pretty printer. In the absence of a pretty printer you can insert '}' in a binary-search fashion until you find the spot where it is missing.

You can also extract part of your script and analyze it separately, deleting "balanced" parts one by one.

This error actually discourages writing long Perl scripts, so there is a silver lining in every dark cloud.

You can also use pseudo-comments that signify nesting level zero and check those points with a special program or an editor macro. One can also mark each closing bracket with the name of the construct it closes:

if (... ) { 

} # if 
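A crude but handy trick is to print the running brace depth for each line and look for where it fails to return to zero; here is a minimal sketch (the script name is a placeholder, and the heuristic ignores braces inside strings, comments and regexes, so treat the output only as a hint):

perl -ne '
    $depth += () = /\{/g;                     # count opening braces on the line
    $depth -= () = /\}/g;                     # count closing braces on the line
    printf "%4d  depth=%2d  %s", $., $depth, $_;
' suspect_script.pl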

Problem of an unclosed quote at the end of a string literal ("...")

Use a good editor. Split long literals into one-line literals and concatenate them with the dot operator.

As a historical note, specifying the maximum length of literals is an effective way of catching a missing quote; it was implemented in PL/1 compilers. You can also have an option to limit a literal to a single line. In general, multi-line literals should have distinct lexical markers (like the "here document" construct in shell). Perl provides the opportunity to use the concatenation operator for splitting literals across multiple lines; the pieces are merged at compile time, so there is no performance penalty for this construct. But there is no limit on the number of lines a string literal can occupy, so this does not help much. If such a limit could be communicated via a pragma statement for a particular fragment of text, it would be an effective way to avoid the problem. Usually only a few places in a program use multi-line literals, if any. Editors that use syntax coloring help to detect the unclosed-literal problem, but there are cases when they are useless.
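For example, a long message can be kept within the right margin by splitting it into one-line literals joined with the dot operator; since the operands are constants, the concatenation is folded at compile time:

my $banner = "This is a long diagnostic message that would otherwise "
           . "run far beyond the right margin of the editor window; "
           . "the pieces are glued together at compile time.";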

How to avoid using the wrong comparison operator when comparing two variables

If you are comparing a variable and a constant, the Perl interpreter can help you to detect this error, but if you are comparing two variables you are on your own. And I often use the wrong comparison operator just out of inertia, or after working in C. The most typical error for me is to use == for string comparison.

One way is to comment string comparisons and then match the comments against the comparison operators used, with a simple macro in the editor (you should use a programmable editor, and vim is programmable).

The use of different sets of comparison operators for numbers and strings is probably a blunder in Perl's design (one which Python actually avoided); it was inherited from shell. Programmers who use other languages along with Perl are at a huge disadvantage here, as their other-language experience forces them to make the same errors again and again. Even the shell solution (using different enclosing brackets) would be better; it might well be that in Perl the use of ( ) for arithmetic comparison and ((...)) for string comparison would have been a better deal. Such conventions can still be used as a part of defensive programming, so that you can spot inconsistencies more easily.
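One crude way to automate such a check is to flag every == or != that has a quoted string next to it; a minimal sketch (the file name is a placeholder, and this is only a heuristic that will produce false positives):

perl -ne 'print "$ARGV:$.: $_" if /["\x27]\s*[=!]=/ || /[=!]=\s*["\x27]/' script_to_check.pl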

Perl as a new programming paradigm

Perl + C and, especially, Perl + Unix shell represent a new programming paradigm in which the OS becomes a part of your programming toolkit; it is much more productive for a large class of programs than OO-style development (the OO cult ;-). It becomes especially convenient in a virtual machine environment, where the application typically "owns" the machine. In this case the level of integration of the language and the operating system becomes of paramount importance, and Perl excels in this respect. You can use shell for file manipulation and pipelines, Perl for high-level data structure manipulation, and C when Perl is insufficient or too slow. The latter question is non-trivial for complex programs, and correct detection of bottlenecks needs careful measurement; generally Perl is fast enough for most system programs.
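A minimal sketch of this division of labor (the df invocation and the 90% threshold are illustrative assumptions): the shell command collects the raw data, Perl does the structured part:

#!/usr/bin/perl
use strict;
use warnings;

# let the OS do what it is good at: collecting the raw data
open my $df, '-|', 'df -kP' or die "cannot run df: $!";

my %usage;                       # mount point => percent used
while (<$df>) {
    next if $. == 1;             # skip the header line
    my ($pct, $mount) = (split)[4, 5];
    $pct =~ s/%//;
    $usage{$mount} = $pct;
}
close $df;

# ...and let Perl do the high-level data manipulation
for my $mount (sort { $usage{$b} <=> $usage{$a} } keys %usage) {
    print "WARNING: $mount is at $usage{$mount}%\n" if $usage{$mount} > 90;
}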

The key idea here is that any sufficiently flexible and programmable environment - and Perl is such an environment -- gradually begins to take on characteristics of both language and operating system as it grows. See Stevey's Blog Rants Get Famous By Not Programming for more about this effect.


Unix shell can actually provide a good "programming in the large" framework for a complex programming system, serving as the glue for the components.

From the point of view of typical application-level programming, Perl is a very underappreciated and poorly understood language. Almost nobody is interested in the details of the interpreter, where the debugger is integrated with the language really brilliantly. Also, namespaces and OO constructs in Perl are a very unorthodox and very interesting design.

References are a major Perl innovation

References are a Perl innovation: the classic CS view is that a scripting language should not contain references (OO languages operate with references, but only implicitly). The role of the list construct as the implicit subroutine argument list is also implemented non-trivially (elements are passed "by reference", not "by value"), against CS orthodoxy (which favors "by value" passing of arguments by default). There are many other unique things about the design of Perl. All in all, for a professional like me, who used to write compilers, Perl is one of the few relatively "new" languages that is not boring :-).
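A small illustration of the point about argument passing: the elements of @_ are aliases to the caller's variables, so a subroutine can modify them directly:

use strict;
use warnings;

sub increment { $_[0]++ }   # @_ holds aliases, not copies

my $n = 5;
increment($n);
print "$n\n";               # prints 6 -- the caller's variable was changed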

Perl has a great debugger

Debugger for the language is as important as the language itself. Perl debugger is simply great. See Debugging Perl Scripts

Brilliance of Perl Artistic license

The Perl license is real brilliance -- an incredible feat, from my point of view, taking into account when it was done. It provided peaceful co-existence with the GPL, which is no small feat ;-). Dual licensing was a neat, extremely elegant cultural hack that made Perl acceptable both to businesses and to the FSF.

It's very sad that there is no really good introduction to Perl written from the point of view of a CS professional, despite the 100 or more books published.

Perl warts


A small, crocky feature that sticks out of an otherwise clean design. Something conspicuous for localized ugliness, especially a special-case exception to a general rule. ...

Jargon File's definition of the term "wart"

Language design warts

Perl extended C-style syntax in an innovative way. For example, the if statement always uses a {} block, never an individual statement, and the ; before } is optional. But Perl shares several C-style syntax shortcomings and introduced a couple of its own:

For a language aficionado, Larry Wall made way too many blunders in the design of Perl. This is understandable (he had no computer science background and was a hacker at heart), but sad.

There are also several semantic problems with the language:

Absence of good development environment

The R language has RStudio, which can probably be viewed as the gold standard of the minimal features needed in a scripting language GUI. While RStudio has a weak editor, it has syntax highlighting and integration with the debugger, and as such is adequate for medium-sized scripts.

There is no similarly established, de-facto standard GUI shipped with the Perl interpreter, and it looks like nobody cares. That's a bad design decision, although you can use an orthodox file manager (such as Midnight Commander, or on Windows FAR or Total Commander) as a poor man's IDE. Komodo Edit is a more or less OK editor for Perl and is free, although it is in no way a full IDE.

This is not a show stopper for system administrators, as they can use screen and multiple terminal sessions for running scripts and editing them. Also, mcedit is handy and generally adequate for small scripts, to say nothing of the fact that each sysadmin knows the basic set of vi/vim commands, and many know vim well.

But this is a problem when you try to write Perl scripts of over 1K lines which consist of multiple files. Many features of a modern IDE help to avoid typical errors (for example, identifiers can be picked from a menu by right-clicking, braces are easier to match if the editor provides small, almost invisible vertical rulers, the color of a string helps to detect runaway string constants, etc.).

Currently Komodo IDE and the free Komodo Edit are almost the only viable game in town.

See

for additional discussion.

Lost development priorities

For a mature language, the key area of development is not questionable enhancements but improvement of interpreter diagnostics and efforts to prevent typical errors (which at this point are well known).

Perl version 5.10 was the version in which two very useful enhancements to the language were added:

Still, very little has been done to improve the interpreter so that it helps programmers to avoid the most typical Perl errors. That means that the quality of the editor is of paramount importance for Perl programmers. I would recommend the free Komodo editor. It allows you to see the list of already declared variables in the program and thus avoid the classic "typo in the variable name" type of errors.

Not all enhancements that Perl developers adopted after version 5.10 have practical value. Some, such as the requirement to use a backslash before the repetition-count braces in regular expressions (so that /\d{2}/ in "normal" Perl becomes /\d\{2}/ in version 5.22), are counterproductive. For that reason I do not recommend using version 5.22. You can also use the pragma

use v5.12.0;

to avoid stupid warnings version 5.20 generates.

There are no attempts to standardize Perl and to introduce enhancements via an orderly process negotiated by the major stakeholders, as is done with C or Fortran (every 11 years, which is a very reasonable period that allows current fads to die ;-). At the same time, the quality of diagnostics of typical errors by the Perl interpreter remains weak (it improved with the introduction of strict, though).

Support for a couple of useful pragmas is absent -- for example, the ability to limit the length of string constants to a given length (for example, 120) for certain parts of the script, or something similar, like a "do not cross this line" limitation.

Local labels might help to close multiple levels of nesting (the problem of a missing curly bracket is typical in all C-style languages), as in the hypothetical syntax below:

 1:if( $i==1 ){
     if( $k==0 ){
         if ($m==0 ){
   # the curly bracket below closes all blocks opened since the local label 1
 }:1 

Multiple entry points into subroutines might help to organize namespaces.

Working with namespaces can and should be improved, and the rules for Perl namespaces should be much better documented. Like pointers, namespaces provide a powerful facility for structuring programs, and they can be used with or without the module framework. This is a very nice and very powerful Perl feature that puts Perl in a class of its own for experienced programmers. Please note that modules are not the only game in town. Actually, the way they were constructed has some issues, and the (sometimes stupid) overemphasis on OO only exacerbates those issues. Multiple entry points into subroutines would probably be a more useful and more efficient addition to the language, and one that is very easy to implement. The desire to be like the rest of the pack often backfires... From a software engineering point of view, a scripting language, as a VHL, stands above OO in the pecking order ;-). OO is mainly force-fed to low-level guys who suffer from Java...
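A tiny sketch of what "namespaces without the module framework" means in practice (the package name Counters is of course just an illustration):

use strict;
use warnings;

package Counters;            # a namespace: no Exporter, no separate file

our %hits;                   # this is really %Counters::hits
sub bump { $hits{ $_[0] }++ }

package main;                # back to the default namespace

Counters::bump('login');
Counters::bump('login');
print "$Counters::hits{login}\n";   # prints 2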

Actually, there are certain features that should probably be eliminated from Perl 5. For example, the use of unquoted words as indexes to hashes is definitely a language designer's blunder and should be gone. String functions and array functions should be better unified. An exception mechanism should be introduced. Assignment in if statements should be somehow restricted. Assignment of a constant to a variable in an if statement (and in all conditions) should be flagged as a clear error (as in if ($a=5) ...). I think the latest versions of the Perl interpreter do this already.

Problems with Perl 5.22

Attention: the release contains an obvious newly introduced wart in the regex tokenizer, which now requires a backslash for the repetition-count part of basic regex constructs. For example, /\d{2}/ now needs to be written /\d\{2}/ -- pretty illogical, as the curly brace here is part of the repetition construct attached to \d, not a separate symbol (which of course would need to be escaped).

This looks to me like a typical SNAFU. But the problem is wider and not limited to Perl. There is generally a tendency toward a gradual loss of architectural integrity after the initial author is gone and there is no strong "language standard committee" driving the language's development (as in Fortran, which issues an updated version of the language standard every 11 years).

For some languages, like Python, this is still in the future, but for many older languages it is already a reality and a real danger. Mechanisms for preventing it are not well understood. The same situation happens with OSes like Linux (systemd).

This newly introduced bug (aka feature) also affects regexes that use an opening curly bracket as a delimiter -- a minor but pretty annoying "change we can believe in" ;-). I think this idiosyncrasy will keep this version out of production versions of Linux/Unix for a long, long time (say 10 years), or forever. Imagine the task of modifying somebody else's 30-50K line Perl scripts for these warnings if they heavily use curly braces in regexes or rely on \d{1,3} constructs for parsing IP addresses.
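For instance, scripts that pull IPv4 addresses out of logs are full of fragments like the following (the $line variable is just an illustration), and every such {1,3} count would have to be revisited:

if ( $line =~ /\b(\d{1,3}(?:\.\d{1,3}){3})\b/ ) {
    print "found IP-like string: $1\n";
}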

This looks more and more like an artificially created year 2000 problem for Perl.

Dr. Nikolai Bezroukov



NEWS CONTENTS

Old News ;-)

[Nov 18, 2017] Using the built-in debugger of Perl as REPL by Gabor Szabo

Youtube video, Mainly explain how to use x command in Perl debugger.
Nov 18, 2017 | www.youtube.com

The command line debugger that comes with perl is very powerful.
Not only does it allow us to debug script but it can be used as a REPL - a Read Eval Print Loop to explore the capabilities of the language. There are a few basic examples in this screencast.

http://perlmaven.com/using-the-built-...

To see all the Perl tutorials visit http://perlmaven.com/perl-tutorial

About Perl Programming and Perl programmers.

In this screencast:

perl -d -e 1

p - print scalar
x - print data structure
b subname - set breakpoint

[Nov 17, 2017] perl modules

Nov 17, 2017 | perlmonks.com

Discipulus (Monsignor) on Nov 16, 2017 at 09:04 UTC

Re: perl modules

Hello codestroman and welcome to the monastery and to the wonderful world of Perl!

First of all, please, add <c> code tags </c> around your code and output.

Then be sure to have read the standard documentation: perlmod and perlnewmod

Infact a basic perl module define a package and use Exporter to export functions in the using perl program.

In my homenode i've collected a lot of links on about module creation

L*


Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

thanos1983 (Priest) on Nov 16, 2017 at 09:17 UTC

Re: perl modules

Hello codestroman

Just to add a minor suggestion here, to the full cover reply of fellow monk Discipulus . It will assist you a lot also to read Simple Module Tutorial

Update: Direct answer to your question can be found here How to add a relative directory to @INC with multiple possible solutions. I would strongly recommend to go through all the articles that all monks proposed.

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

hippo (Abbot) on Nov 16, 2017 at 09:21 UTC

Re: perl modules (Can't locate in @INC)
PLEASE HELP!!

This is a monastery - a place of quite contemplation. The louder you shout the less wisdom shall you receive.

The error message Can't locate dog.pm in @INC is pretty explicit. Either your module file is not called dog.pm in which case, change it or else your file dog.pm is not in any of the directories listed in @INC in which case either move it to one of those directories or else change @INC with use lib .

I also see, despite the lack of formatting in your post that your module doesn't use any namespace. You should probably address that. Perhaps a solid read through Simple Module Tutorial would be a good idea?

Anonymous Monk on Nov 16, 2017 at 09:07 UTC

Re: perl modules

use an absolute pathname in use lib

Anonymous Monk on Nov 16, 2017 at 15:16 UTC

Re: perl modules

Welcome to the language ... and, to the Monastery. The "simple module tutorial" listed above is a very good place to start. Like all languages of its kind, Perl looks at runtime for external modules in a prescribed list of places, in a specified order. You can affect this in several ways, as the tutorials describe. Please read them carefully.

In the Perl(-5) language, this list is stored in a pre-defined array variable called @INC and it is populated from a variety of sources: a base-list that is compiled directly into the Perl interpreter, the PERL5LIB environment-variable, use lib statements, and even direct modification of the variable itself. Perl searches this list from beginning to end and processes (only) the first matching file that it finds.

(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Corion (Pope) on Nov 16, 2017 at 15:23 UTC

Re^2: perl modules



(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Please no.

The word "pragma" has a special meaning in Perl, and it is highly confusing to claim that a Perl "keyword" would be a "pragma". use certainly is a keyword and nothing else.

If you mean to say something different, please describe in more words what you want to say.

[Nov 17, 2017] Using the built-in debugger of Perl by Gabor Szabo

Youtube video, 9 min.
Nov 17, 2017 | www.youtube.com

Perl comes with a very powerful built-in command line debugger. In this screencast you can see basics how to use it.

For blog entries and for more screencasts see http://perlmaven.com/

About Perl Programming and Perl programmers.

For the blog entry of this screencast visit
http://perlmaven.com/using-the-built-...

Debugger commands used:
q - quit,
h - help,
p - print,
s - step in,
n - step over,
r - step out,
T - stack trace
l - listing code

The Padre project can be found here: http://padre.perlide.org/

The book mentioned was Pro Perl Debugging: http://www.apress.com/9781590594544

If you are interested an on-site Perl training contact me http://szabgab.com/contact.html

[Nov 17, 2017] Modern Perl by chromatic

Notable quotes:
"... 'shift key br0ken' ..."
"... # appease the Mitchell estate ..."
Nov 17, 2017 | www.amazon.com

Regex Modifiers

Several modifiers change the behavior of the regular expression operators. These modifiers appear at the end of the match, substitution, and qr// operators. For example, here's how to enable case-insensitive matching:

my $pet = 'ELLie';
like $pet, qr/Ellie/,  'Nice puppy!';
like $pet, qr/Ellie/i, 'shift key br0ken';

The first like() will fail because the strings contain different letters. The second like() will pass, because the /i modifier causes the regex to ignore case distinctions. ELLie and Ellie are effectively equivalent in the second regex due to the modifier.

You may also embed regex modifiers within a pattern:

my $find_a_cat = qr/(?<feline>(?i)cat)/;

The (?i) syntax enables case-insensitive matching only for its enclosing group -- in this case, the named capture. You may use multiple modifiers with this form. Disable specific modifiers by preceding them with the minus character (-):

my $find_a_rational = qr/(?<number>(?-i)Rat)/;

... ... ...

The /e modifier lets you write arbitrary code on the right side of a substitution operation. If the match succeeds, the regex engine will use the return value of that code as the substitution value. The earlier global substitution example could be simpler with code like the following:

# appease the Mitchell estate
$sequel =~ s{Scarlett( O'Hara)?}
            {
                'Mauve' . defined $1
                            ? ' Midway'
                            : ''
            }ge;

Each additional occurrence of the /e modifier will cause another evaluation of the result of the expression, though only Perl golfers use anything beyond /ee

[Nov 16, 2017] Namespaces and modules

Nov 16, 2017 | perlmonks.com

on Feb 09, 2015 at 13:21 UTC, kzwix has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, Ô wise monks !

I come to you because of a mystery I'd like to unravel: The module import code doesn't work as I expected. So, as I'm thinking that it probably is a problem with my chair-keyboard interface, rather than with the language, I need your help.

So, there are these modules I have, the first one goes like this:

use utf8;
use Date::Manip;
use LogsMarcoPolo;

package LibOutils;

BEGIN {
    require Exporter;

    # set the version for version checking
    our $VERSION = 1.00;

    # Inherit from Exporter to export functions and variables
    our @ISA = qw(Exporter);

    # Functions and variables which are exported by default
    our @EXPORT = qw(getDateDuJour getHeureActuelle getInfosSemaine
        getTailleRepertoire getInfosPartition getHashInfosContenuRepertoire dormir);

    # Functions and variables which can be optionally exported
    our @EXPORT_OK = qw();
}

# Under this line are definitions of local variables, and the subs.

I also have another module, which goes like that:

use utf8;
use strict;
use warnings;
use Cwd;            # "CORE" module
use Encode;
use LibOutils qw(getHeureActuelle);

package LogsMarcoPolo;

BEGIN {
    require Exporter;

    # set the version for version checking
    our $VERSION = 1.00;

    # Inherit from Exporter to export functions and variables
    our @ISA = qw(Exporter);

    # Functions and variables which are exported by default
    our @EXPORT = qw(setNomProgramme ouvreFichierPourLog assigneFluxPourLog
        pushFlux popFlux init printAndLog);

    # Functions and variables which can be optionally exported
    our @EXPORT_OK = qw();
}

# Here are other definitions of variables and subs, which I removed for the sake of clarity

sub init {
    my ($nomDuProgramme, $pathLogGeneral, $pathLogErreurs) = @_;

    my $date = LibOutils::getDateDuJour();     # the date when init() was called
    my $time = LibOutils::getHeureActuelle();  # the time when init() was called

    $nomProgramme = $nomDuProgramme;

    # Open the stream for STDOUT:
    my $stdout = assigneFluxPourLog(*STDOUT);
    # Add it to the 'OUT' list of streams:
    pushFlux('OUT', $stdout);

    # Open the stream for STDERR:
    my $stderr = assigneFluxPourLog(*STDERR);
    # Add it to the 'ERR' list of streams, and to the 'DUO' list:
    pushFlux('ERR', $stderr);
    pushFlux('DUO', $stderr);

    if (defined $pathLogGeneral) {
        my $plg = $pathLogGeneral;
        $plg =~ s/<DATE>/$date/g;
        $plg =~ s/<TIME>/$time/g;
        my $logG = ouvreFichierPourLog($plg);
        pushFlux('OUT', $logG);
        pushFlux('DUO', $logG);
    }

    if (defined $pathLogErreurs) {
        my $ple = $pathLogErreurs;
        $ple =~ s/<DATE>/$date/g;
        $ple =~ s/<TIME>/$time/g;
        my $logE = ouvreFichierPourLog($ple);
        pushFlux('ERR', $logE);
        pushFlux('DUO', $logE);
    }
}

Now, look at the second module: When, in the "init" sub, I call the getDateDuJour() and getHeureActuelle() functions with an explicit namespace, it works fine.

If I remove the prefix, it doesn't work, even for the function whose name I put in the "qw(...)" chain after the use.

Would a fellow monk know why ?

choroba (Bishop) on Feb 09, 2015 at 13:24 UTC

Re: Namespaces and modules

By putting package after the use clauses, you are importing all the functions to the "main" namespace, not into your package's namespace. Moving the package declaration up should help.

kzwix (Sexton) on Feb 09, 2015 at 13:34 UTC

Re^2: Namespaces and modules



I wonder, could it have something to do with loop-including ?

I mean, package "LibOutils" uses "LogsMarcoPolo" (for its logging system), but "LogsMarcoPolo" uses "LibOutils" for its dates and times.

Could that circular include be the origin of this bug ?

Anonymous Monk on Feb 09, 2015 at 14:18 UTC

Re^3: Namespaces and modules
I wonder, could it have something to do with loop-including ?

Circular dependencies don't automatically cause a problem, it also depends on what the module does in its body (which you haven't shown). If you think there is a problem, a short piece of example code that reproduces the problem would help, see http://sscce.org/

But first, did you try what choroba suggested ?


Anonymous Monk on Feb 09, 2015 at 14:11 UTC

Re: Namespaces and modules
doesn't work as I expected ... it works fine ... it doesn't work

What are the exact error messages? What is the expected behavior vs. the behavior you're getting? See How do I post a question effectively?


[Nov 16, 2017] perl - Passing an inner array to a function - Stack Overflow

Nov 16, 2017 | stackoverflow.com


There are no arrays in your code. And there are no method calls in your code.

Your hash is defined incorrectly. You cannot embed hashes inside other hashes. You need to use hash references. Like this:

my %data = (
    'a' => {
        x => 'Hello',
        y => 'World'
    },
    'b' => {
        x => 'Foo',
        y => 'Bar'
    }
);

Note, I'm using { ... } to define your inner hashes, not ( ... ) .

That still gives us an error though.

Type of arg 1 to main::p must be hash (not hash element) at passhash line 20, near "})"

If that's unclear, we can always try adding use diagnostics to get more details of the error:

(F) This function requires the argument in that position to be of a certain type. Arrays must be @NAME or @{EXPR}. Hashes must be %NAME or %{EXPR}. No implicit dereferencing is allowed--use the {EXPR} forms as an explicit dereference. See perlref.

Parameter type definitions come from prototypes. Your prototype is \% . People often think that means a hash reference. It doesn't. It means, "give me a real hash in this position and I'll take a reference to it and pass that reference to the subroutine".

(See, this is why people say that prototypes shouldn't be used in Perl - they often don't do what you think they do.)

You're not passing a hash. You're passing a hash reference. You can fix it by dereferencing the hash in the subroutine call.

p(%{$data{a}});

But that's a really silly idea. Take a hash reference and turn it into a hash, so that Perl can take its reference to pass it into a subroutine.

What you really want to do is to change the prototype to just $ so the subroutine accepts a hash reference. You can then check that you have a hash reference using ref .

But that's still overkill. People advise against using Perl prototypes for very good reasons. Just remove it


Your definition of the structure is wrong. Inner hashes need to use {} , not () .
my %data = (
    a => {
        x => 'Hello',
        y => 'World'
    },
    b => {
        x => 'Foo',
        y => 'Bar'
    }
);

Also, to get a single hash element, use $data{'a'} (or even $data{a} ), not %data{'a'} .

Moreover, see Why are Perl 5's function prototypes bad? on why not to use prototypes. After correcting the syntax as above, the code works even without the prototype. If you really need the prototype, use % , not \% . But you clearly don't know exactly what purpose prototypes serve, so don't use them.

[Nov 16, 2017] perl get reference to temp list returned by function without making a copy - Stack Overflow

Nov 16, 2017 | stackoverflow.com

newguy, 2 days ago

I have a function in perl that returns a list. It is my understanding that when foo() is assigned to list a copy is made:
sub foo() { return `ping 127.0.0.1` }

my @list = foo();

That @list then needs to be transferred to another list like @oldlist = @list; and another copy is made. So I was thinking can I just make a reference from the returned list like my $listref = \foo(); and then I can assign that reference, but that doesn't work.

The function I'm working with runs a command that returns a pretty big list (the ping command is just for example purposes) and I have call it often so I want to minimize the copies if possible. what is a good way to deal with that?

zdim ,2 days ago

Make an anonymous array reference of the list that is returned
my $listref = [ foo() ];

But, can you not return an arrayref to start with? That is better in general, too.


What you attempted "takes a reference of a list" ... what one cannot do in the literal sense; lists are "elusive" things , while a reference can be taken

By using the backslash operator on a variable, subroutine, or value.

and a "list" isn't either (with a subroutine we need syntax \&sub_name )

However, with the \ operator a reference is taken, either to all elements of the list if in list context

my @ref_of_LIST = \( 1,2,3 );  #-->  @ref_of_LIST: (\1, \2, \3)

or to a scalar if in scalar context, which is what happens in your attempt. Since your sub returns a list of values, they are evaluated by the comma operator and discarded, one by one, until the last one. The reference is then taken of that scalar

my $ref_of_LIST = \( 1,2,3 );  #--> $ref_of_LIST: \3

As it happens, all this applies without parens as well, with \foo() .

newguy ,2 days ago

I don't know how to return an array ref from a command that returns a list. Would it be acceptable to do it as return [`ping 1.2.3.4`]; – newguy 2 days ago

zdim ,2 days ago

@newguy Yes, that would be a fine way to do it. Another is to store the command's return in an array variable (say, @ary ) -- if you need it elsewhere in the sub -- and then return \@ary; – zdim 2 days ago

newguy ,2 days ago

Ok thanks. Wouldn't the @ary way create a copy though – newguy 2 days ago

zdim ,2 days ago

@newguy For one, those elements must be stored somewhere, either anonymously by [ .. ] or associated with a named variable by @ary = .. . I don't know whether yet an extra copy is made in order to construct an array, but I'd expect that it isn't When you return \@ary no new copies are made. I would expect that they are about the same. – zdim 2 days ago

zdim ,2 days ago

@newguy I added an explanation of what happens with \foo() – zdim 2 days ago

[Nov 16, 2017] Perl captured digits from string are always 1

Nov 16, 2017 | stackoverflow.com

The match operator in scalar context evaluates to a boolean that indicates whether the match succeeded or not.

my $success = $user =~ /(\d+)/;

The match operator in list context returns the captured strings (or 1 if there are no captures) on success and an empty list on error.

my ($num) = $user =~ /(\d+)/;

You used the former, but you want the latter. That gives you the following (after a few other small fixes):

sub next_level {
    my ($user) = @_;
    my ($num) = $user =~ /(\d+)\z/;
    $user =~ s/\d+\z//g;
    $user .= ++$num;
    return $user;
}

But that approach is complicated and inefficient. Simpler solution:

sub next_level {
    my ($user) = @_;
    $user =~ s/(\d+)\z/ $1 + 1 /e;
    return $user;
}

[Nov 16, 2017] regex - Use of uninitialized value $a in concatenation (.) or string - Stack Overflow

Nov 16, 2017 | stackoverflow.com

sampath, yesterday

I am trying to remove the old files in a dir if the count is more than 3 over SSH

Kindly suggest how to resolve the issue.

Please refer the code snippet

#!/usr/bin/perl
use strict;
use warnings;

my $HOME="/opt/app/latest";
my $LIBS="${HOME}/libs";
my $LIBS_BACKUP_DIR="${HOME}/libs_backups";
my $a;
my $b;
my $c;
my $d;

my $command =qq(sudo /bin/su - jenkins -c "ssh username\@server 'my $a=ls ${LIBS_BACKUP_DIR} | wc -l;my $b=`$a`;if ($b > 3); { print " Found More than 3 back up files , removing older files..";my $c=ls -tr ${LIBS_BACKUP_DIR} | head -1;my $d=`$c`;print "Old file name $d";}else { print "No of back up files are less then 3 .";} '");

print "$command\n";
system($command);

output:

sudo /bin/su - jenkins -c "ssh username@server 'my ; =ls /opt/app/latest/libs_backups | wc -l;my ; =``;if ( > 3); { print " Found More than 3 back up files , removing older files..";my ; =ls -tr /opt/app/latest/libs_backups | head -1;my ; =``;print "Old file name ";}else { print "No of back up files are less then 3 .";} '" Found: -c: line 0: unexpected EOF while looking for matching `'' Found: -c: line 1: syntax error: unexpected end of file

janh ,yesterday

Are you trying to execute parts of your local perl script in an ssh session on a remote server? That will not work. – janh yesterday

simbabque ,yesterday

Look into Object::Remote. Here is a good talk by the author from the German Perl Workshop 2014. It will essentially let you write Perl code locally, and execute it completely on a remote machine. It doesn't even matter what Perl version you have there. – simbabque yesterday

simbabque ,yesterday

You should also not use $a and $b . They are reserved global variables for sort . – simbabque yesterday

Chris Turner ,yesterday

Why are you sudoing when your command is running on an entirely different server? – Chris Turner yesterday

shawnhcorey ,yesterday

Never put sudo or su in a script. This is security breach. Instead run the script as sudo or su . – shawnhcorey yesterday
If you have three levels of escaping, you're bound to get it wrong if you do it manually. Use String::ShellQuote's shell_quote instead.

Furthermore, avoid generating code. You're bound to get it wrong! Pass the necessary information using arguments, the environment or some other channel of communication instead.

There were numerous errors in the interior Perl script on top of the fact that you tried to execute a Perl script without actually invoking perl !

#!/usr/bin/perl

use strict;
use warnings;

use String::ShellQuote qw( shell_quote );

my $HOME = "/opt/app/latest";
my $LIBS = "$HOME/libs";
my $LIBS_BACKUP_DIR = "$HOME/libs_backups";

my $perl_script = <<'__EOI__';
   use strict;
   use warnings;

   use String::ShellQuote qw( shell_quote );

   my ($LIBS_BACKUP_DIR) = @ARGV;

   my $cmd = shell_quote("ls", "-tr", "--", $LIBS_BACKUP_DIR);
   chomp( my @files =  `$cmd` );
   if (@files > 3) {
      print "Found more than 3 back up files. Removing older files...\n";
      print "$_\n" for @files;
   } else {
      print "Found three or fewer backup files.\n";
   }
__EOI__

my $remote_cmd = shell_quote("perl", "-e", $perl_script, "--", $LIBS_BACKUP_DIR);
my $ssh_cmd = shell_quote("ssh", 'username@server', "--", $remote_cmd);
my $local_cmd = shell_quote("sudo", "su", "-c", $ssh_cmd);
system($local_cmd);

[Nov 15, 2017] Suggestions for working with poor code

Notable quotes:
"... Still looking for time to record time usage ..."
Nov 15, 2017 | perlmonks.com

Suggestions for working with poor code by Ovid (Cardinal)

on May 10, 2001 at 01:34 UTC

I am currently working on adding a fair amount of functionality to a Web site whose programs have been designed very poorly. Amongst other things, taint checking and strict have not been used. Code has been thrown together without regard to side effects, massive Here docs are used to output HTML, etc. Since I am getting a fair amount of experience with these issues, I thought I would offer some of my observations for fellow monks. Some of these pertain to the existing code and concentrates on 'quick fixes'. Some pertains to new code that's added.

Quick (?) Fixes Adding new functionality

Any and all tips that others wish to add are welcome!

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

dws (Chancellor) on May 10, 2001 at 01:51 UTC

Re: Suggestions for working with poor code

Bad formatting can hide a number of sins.

If necessary, the first thing I do when taking on bad code is reformat it. It doesn't matter whether it's Perl, Java, C, or HTML. A surprising number of problems (like mangled boolean conditions in branches and loops) fall right out when the code is tidied up so that you can actually see what it's doing.

Then it's a lot easier to get on with the fixes Ovid suggests.

tinman (Curate) on May 10, 2001 at 02:02 UTC

Re: Suggestions for working with poor code

I've found that taking a deep breath and a step back from the turmoil of badly written code can help immensely.. quite a few instances where you can see places where code can be consolidated into a single reusable library..

With this in mind, trying to understand the basic intent of the code is really important to me.. I write down a small note describing what each section of code tries to do... this allows me to focus on reuse as well as consolidate several segments together..

Related to this: in addition to liberal comments, updating documentation or in some cases, writing some document that describes the structure and function of a code block is very helpful to any person maintaining the code.. you don't have to wonder "what was that guy thinking" or "why did he do *that* ?".. it's all there in a document.. and it also provides a cursory overview of what has been going on without jumping straight into the code (I'm a big fan of the saying that goes "the less time you spend planning, the more time you spend coding" )...
Caveat: Docs that aren't updated are worse than useless, though...

clemburg (Curate) on May 10, 2001 at 12:20 UTC

Re: Suggestions for working with poor code

Track how long it takes you to fix bugs.

I agree enthusiastically. It will be your only argument when somebody comes and asks you where all the hours have gone. For this kind of job (take responsibility for badly written code, fixing bugs, etc.) this is an absolute must.

For these purposes, two little forms (or spreadsheets, or editor modes/templates, or whatever) will be very helpful (pedantically detailed discussion of these can be found in An Introduction to the Personal Software Process , electronic materials are available at The PSP Resource Page , including time tracking tools, emacs modes, forms, etc.):

  • Time recording log
  • Defect recording log

These are the essentials of both (header columns, add date, person, project, client, etc. as you need):

Time recording log:

  • Start Time
  • Stop Time
  • Interruption Time
  • Delta Time
  • Activity Category (coding, testing, reading docs - make up your own)
  • Comments (more detailed description of task)

Defect recording log:

  • Defect ID (e.g., sequential number)
  • Type (one of: documentation, syntax, build/package, assignment, interface, checking, data, function, system, environment - your own are welcome)
  • Inject Phase (when was the defect put into the program - estimate - design, coding, testing, linking, etc.)
  • Remove Phase (when was the defect found - compile time, testing, etc.)
  • Fix Time (how long did it take to fix)
  • Description (description of defect)

Contrary to what you may think, it does *not* take much time to use these forms (or similar means to record the information). But it will give you all the data you need to be sure you did the Right Thing, and the confidence and evidence to convince your boss or client that what you did was worth the time and the money.
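
For illustration only (this sketch is an editorial addition, not from the post; the file name, time format, and category values are invented), a time recording log of that shape can start life as a tab-separated file fed by a tiny script:

#!/usr/bin/perl
# Minimal sketch of a PSP-style time recording log kept as tab-separated text.
use strict;
use warnings;

my $log = 'time_log.tsv';

# Times are given as minutes past midnight to keep the arithmetic trivial.
my ($start, $stop, $interruption, $category, $comment) = @ARGV;
die "usage: $0 start stop interruption category comment\n"
    unless defined $comment;

my $delta = ($stop - $start) - $interruption;   # the Delta Time column

open my $fh, '>>', $log or die "Cannot append to $log: $!";
print {$fh} join("\t", scalar localtime, $start, $stop, $interruption,
                 $delta, $category, $comment), "\n";
close $fh or die "Cannot close $log: $!";

A defect recording log can reuse the same shape, with the columns listed above.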

Christian Lemburg
Brainbench MVP for Perl
http://www.brainbench.com

coreolyn (Parson) on May 11, 2001 at 18:55 UTC

Re: Re: Suggestions for working with poor code



You mean these logs haven't been automated into a CPAN module yet??

coreolyn Still looking for time to record time usage

r.joseph (Hermit) on May 10, 2001 at 04:04 UTC

Re: Suggestions for working with poor code

Wonderful post Ovid - just added to my favs list. As someone who had the great misfortune a while back of inheriting a large, ill-maintained and atrociously coded website, I know what you mean, and this post really highlights some of the main points that go into fixing it.

I also have to agree heartily with the replies, although I would like to add something. I find sometimes that it actually helps, with particularly insubordinate code, to take part of it out of the main file (say, a sub) and put it into another script that has major error-checking, lots of warnings and what not, and then test it from there. Sometimes this will yield a solution very quickly, and other times it has quickly allowed me to see what was wrong and what needed to be recoded.

Just thought I'd offer a quick idea...great job again!

r.
"Violence is a last resort of the incompetent" - Salvor Hardin, Foundation by Issac AsimovW


[Nov 15, 2017] A crucial element in controlling time is controlling the amount of detail needed to gain understanding. It is easy to lose sight of the forest for the trees.

Notable quotes:
"... The Perl Monks website has 83 data tables, two main type hierarchies (nodetypes and perl classes), a core engine of about 12K and about 600 additional code units spread throughout the database. Documentation is scattered and mostly out of date. ..."
"... The initial architecture seems solid but its features have been used inconsistently over time. ..."
Nov 15, 2017 | perlmonks.com

Re^2: Swallowing an elephant in 10 easy steps
by ELISHEVA (Prior) on Aug 13, 2009 at 18:27 UTC

The time drivers are the overall quality of the design, ease of access to code and database schemas, and the size of the system: the number of database tables, the complexity of the type/class system(s), the amount of code, and the number of features in whatever subsystem you explore in step 10. Rather than an average, I'll take the most recent example, Perl Monks.

The Perl Monks website has 83 data tables, two main type hierarchies (nodetypes and perl classes), a core engine of about 12K and about 600 additional code units spread throughout the database. Documentation is scattered and mostly out of date.

The initial architecture seems solid but its features have been used inconsistently over time. Accessing the schema and code samples is slow because there is no tarball to download - it has to be done through the web interface or manually cut and pasted into files off line. The database/class assessment (1-4) took about 16 hours. Steps 5-7 took about 30 hours. Steps 8-10 took about 24 hours. All told that is 70 hours, including writing up documentation and formatting it with HTML.

However, I always like to leave myself some breathing space. If I were contracting to learn a system that size, I'd want 90 hours and an opportunity to reassess time schedules after the initial code walk through was complete. If a system is very poorly designed this process takes somewhat longer.

A crucial element in controlling time is controlling the amount of detail needed to gain understanding. It is easy to lose sight of the forest for the trees. That is why I advise stopping and moving on to the next phase once your categories give a place to most design elements and the categories work together to tell a story. That is also why I recommend backtracking as needed. Sometimes we make mistakes about which details really matter and which can be temporarily blackboxed. Knowing I can backtrack lets me err on the side of black boxing.

The other element affecting time is, of course, the skill of the analyst or developer. I have the advantage that I have worked both at the coding and the architecture level of software. I doubt I could work that fast if I didn't know how to read code fluently and trace the flow of data through code. Having been exposed to many different system designs over the years also helps - architectural strategies leave telltale footprints and experience helps me pick up on those quickly.

However one can also learn these skills by doing. The more you practice scanning, categorizing and tracing through code and data the better you get at it. It will take longer, but the steps are designed to build on themselves and are, in a way, self-teaching. That is why you can't just do the 10 steps in parallel as jdporter jokingly suggests below.

However some theoretical context and a naturally open mind definitely helps: if you think that database tables should always have a one-to-one relationship with classes you will be very very confused by a system where that isn't true. If I had to delegate this work to someone else I probably would work up a set of reading materials on different design strategies that have been used in the past 30 years. Alternatively or in addition, I might pair an analyst with a programmer so that they could learn from each other (with neither having priority!)

Best, beth

Update: expanded description of the PerlMonks system so that it addresses all of the time drivers mentioned in the first paragraph.

Update: fixed miscalculation of time

[Nov 15, 2017] Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph-creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.

Nov 15, 2017 | perlmonks.com

dave0 (Friar) on Apr 15, 2005 at 15:32 UTC

Re: Analyzing large Perl code base.

Having recently done this on a fairly large codebase that grew organically (no design, no refactoring) over the course of four years, I feel your pain.

Writing a testsuite, on any level, is nearly essential for this. If you're rewriting an existing module, you'll need to ensure it's compatible with the old one, and the only sane way to do that is to test. If the old code is monolithic, it might be difficult to test individual units, but don't let that stop you from testing at a higher level.

B::Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph-creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.

Pretty much the same tactic was used on the actual code. Print it out, annotate it away from the computer, and then sit down with the notes to implement the refactoring. If your codebase is huge (mine was about 4-5k lines in several .pl and .pm files, and was still manageable) you might not want to do this, though.
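
For reference, the kind of run described above can be scripted; this is a rough editorial sketch, not dave0's code, and since the B::Xref report format varies between Perl versions the filter only does generic trimming:

#!/usr/bin/perl
# Rough sketch: capture a B::Xref report for one file, drop blank lines,
# and keep only lines mentioning the package you care about.
use strict;
use warnings;

my $file = shift or die "usage: $0 somefile.pl [package]\n";
my $pkg  = shift || 'main';

open my $xref, '-|', $^X, '-MO=Xref', $file
    or die "Cannot run B::Xref on $file: $!";

while (my $line = <$xref>) {
    next if $line =~ /^\s*$/;                    # remove some of the cruft
    print $line if $line =~ /\b\Q$pkg\E\b/;
}

close $xref;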

[Nov 15, 2017] Generating documentation from Perl code

Nov 15, 2017 | perlmonks.com

Re: Strategies for maintenance of horrible code?
by aufflick (Deacon) on Jul 13, 2006 at 00:17 UTC

You might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.

The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also the UML::Sequence sounds intriguing - it purports to generate a sequence diagram by monitoring code execution.

[Nov 15, 2017] Generating documentation from Perl code (not just POD)

Nov 15, 2017 | perlmonks.com

Generating documentation from Perl code (not just POD) by aufflick (Deacon)

on Jul 11, 2006 at 05:15 UTC. aufflick has asked for the wisdom of the Perl Monks concerning the following question:

Ideally a script/module would do that, and also interleave the POD from the file, so any POD directly before the method/sub would be linked to it. Any method/sub without POD would at least be documented by its name.

Of course a major limitation is that (for OO Perl at least), we have no idea what the method arguments are, simply from robotically inspecting the code. Something I always liked in OpenACS is the way that they replace the builtin Tcl proc keyword with a custom ad_proc keyword that works just the same as proc but which takes an optional documentation block that accepts javadoc-like keyword embedding and also a block detailing any arguments and their default values. Because of the tight coupling, the generated documentation is very rich for little developer effort.

Does anyone know of attempts at this sort of thing in Perl, or have any good ideas to offer? Ideally I want to come up with something that will work with existing Perl code, and that any extensions won't break normal Perl compilation (no literate programming preprocessors need apply).

/Mark

planetscape (Chancellor) on Jul 11, 2006 at 06:27 UTC

Re: Generating documentation from Perl code (not just POD)

These links should get you started:


DoxyFilt ( Doxygen for Perl) offsite
Analyzing large Perl code base.
Becoming familiar with a too-big codebase?

HTH,

aufflick (Deacon) on Jul 13, 2006 at 00:34 UTC

Re^2: Generating documentation from Perl code (not just POD)



Wow - Doxygen + Doxyfilt is *exactly* what I was looking for - fantastic!

planetscape (Chancellor) on Jul 13, 2006 at 00:39 UTC

Re^3: Generating documentation from Perl code (not just POD)

Glad to hear it! :-D

If you have any questions about configuring Doxyfile to run Doxygen / DoxyFilt under Cygwin , for instance, please /msg me .

HTH,

BrowserUk (Pope) on Jul 13, 2006 at 02:31 UTC

Re^3: Generating documentation from Perl code (not just POD)
Wow - Doxygen + Doxyfilt is *exactly* what I was looking for - fantastic!

Now, please someone with influence sell the P6 guys on Doxygen. Let's have it built into the language and allow the terminally-ill POD slip away peacefully.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.

philcrow (Priest) on Jul 11, 2006 at 13:41 UTC

Re: Generating documentation from Perl code (not just POD)

I'm interested in this area. Once, in the past, I wrote UML::Sequence which runs perl programs to produce sequence diagrams of what they actually do. This leads me to think that you could write a special driver using the debugger hooks to load the modules from your app, then dump out their inheritance relationships, etc. (based on what is loaded and what those modules have in their @ISA and symbol tables). Maybe that could be incorporated with some good POD parsing, but I'm just rambling now.

Phil
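
A very modest version of that idea (an editorial sketch, not Phil's code) just loads one module and prints its @ISA plus the subs visible in its symbol table:

#!/usr/bin/perl
# Minimal sketch: show a module's parent classes and the subs in its symbol table.
# Usage: perl dump_isa.pl HTTP::Response
use strict;
use warnings;

my $module = shift or die "usage: $0 Module::Name\n";
(my $path = "$module.pm") =~ s{::}{/}g;
require $path;

no strict 'refs';
my @isa = @{ $module . '::ISA' };
print "\@ISA for $module: @isa\n";

print "Subs defined in $module:\n";
for my $name (sort keys %{ $module . '::' }) {
    print "  $name\n" if defined &{ $module . '::' . $name };
}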


[Nov 15, 2017] With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work.

Nov 15, 2017 | perlmonks.com

knobunc (Pilgrim) on May 10, 2001 at 18:19 UTC

Re: Suggestions for working with poor code

Very cool node.

With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work. However, I have a make rule for todo that searches for all of the lines with TODO in them and prints them out. So a usage of a TODO:

if ($whatever) {
    # TODO - Finish code to take over the world
}

Becomes:

To Do List
Dir/file.pl 132: Finish code to take over the world

When run through the following (ugly, suboptimal, but working) code in Tools/todo.sh :

#!/bin/sh

echo 'To Do List'

find . -type f | xargs grep -n TODO | perl -ne \
  '($file, $line, $rest) = split /:/, $_, 3;
   $file =~ s|^./||;
   $rest =~ s|.*?TODO.*?[-\s:]+||;
   $rest =~ s|"[.;,]\s*$||;
   $rest =~ s|\\n||g;
   print "$file $line: \u$rest\n"' \
  | sort | uniq | grep -v '.#' | grep -v Makefile | grep -v CVS

Which I call from my Makefile:

todo:
	Tools/todo.sh

Kinda ugly, but it lets me put the TODO statements where I actually need to do the work.

So I can proof out a block of code by writing narrative comments with TODO at the start of the line (behind comment characters of course).

Then fill in the code later and not worry about missing a piece. Also since the TODOs are where the stuff needs to be filled in, I have lots of context around the issue and don't need to write as much as I would if they were at the top of the file. Plus anyone without something to do in the group can just type make todo and add some code. Finally, it is easier to add a TODO right where you need it, than bop up to the top of the file and then have to find where you were back in the code.

-ben

[Nov 15, 2017] Basic Debugger Commands

Notable quotes:
"... pseudo-signal handlers, ..."
"... programmatic debugger control ..."
Nov 15, 2017 | my.safaribooksonline.com

Debugging is just an extreme case of dynamic analysis. Third-party code can be extremely convoluted (so can your own code, of course, but you don't usually think of it that way because you're familiar with it; you knew it when it was just a subroutine); sometimes you just can't tell how part of the code fits in, or whether it's called at all. The code is laid out in some arrangement that makes no sense; if only you could see where the program would actually go when it was run.

Well, you can, using Perl's built-in debugger. Even though you're not actually trying to find a bug, the code-tracing ability of the debugger is perfect for the job.

This isn't the place for a full treatment of the debugger (you can see more detail in [ SCOTT01 ]), but fortunately you don't need a full treatment; a subset of the commands is enough for what you need to do. (Using the debugger is like getting in a fight; it's usually over very quickly without using many of the fancy moves you trained for.)

You invoke the debugger with the -d command-line flag; either edit the program to add -d to the shebang line, or run the program by invoking Perl explicitly:

% perl -d program argument argument...

Make sure that the perl in your path is the same one in the shebang line of program or you'll go crazy if there are differences between the two perls.

Basic Debugger Commands

Armed with these commands, we can go code spelunking. Suppose you are debugging a program containing the following code fragment:

77 for my $url (@url_queue)
78 {
79 my $res = $ua->request($url);
80 summarize($res->content);
81 }

and you know that whenever the program gets to the URL http://www.perlmedic.com/fnord.html something strange happens in the summarize() subroutine. You'd like to check the HTTP::Response object to see if there were any redirects you didn't know about. You start the program under the debugger and type:

DB<1> b 80 $url =~ /fnord/
DB<2>

The program will run until it has fetched the URL you're interested in, at which point you can examine the response object for any redirects you didn't know about.

Perl 5.8.0 and later will give you a stack trace anyway if you run a program under the debugger and some code triggers a warning. But suppose you are either running under an earlier perl, or you'd really like to have a debugger prompt at the point the warning was about to happen.

You can combine two advanced features of Perl to do this: pseudo-signal handlers, and programmatic debugger control .

A signal handler is a subroutine you can tell Perl to execute whenever your program receives a signal. For instance, when the user interrupts your program by pressing Control-C, that works by sending an INT signal to your program, which interprets it by default as an instruction to stop executing.

There are two pseudo-signals, called __WARN__ and __DIE__ . They aren't real signals, but Perl "generates" them whenever it's told to issue a warning or to die, respectively. You can supply code to be run in those events by inserting a subroutine reference in the %SIG hash (see perlvar ) as follows:

$SIG{__WARN__} = sub { print "Ouch, I'm bad" };

(Try it on some code that generates a warning.)
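
For instance, a tiny self-contained script (an illustration added here, not from the book excerpt) that trips the handler:

#!/usr/bin/perl
# Illustration: a __WARN__ pseudo-signal handler catches an ordinary warning.
use strict;
use warnings;

$SIG{__WARN__} = sub { print "Ouch, I'm bad: $_[0]" };

my $x;
my $y = $x + 1;   # "Use of uninitialized value" warning lands in the handler
print "y is $y\n";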

The next piece of the solution is that the debugger can be controlled from within your program; the variable $single in the special package DB determines what Perl does at each statement: 0 means keep going, and 1 or 2 mean give a user prompt [1]. So setting $DB::single to 1 in a pseudo-signal handler will give us a debugger prompt at just the point we wanted.

[1] The difference between the two values is that a 1 causes the debugger to act as though the last n or s command the user typed was s , whereas a 2 is equivalent to an n . When you type an empty command in the debugger (just hit Return), it repeats whatever the last n or s command was.

Putting the pieces together, you can start running the program under the debugger and give the commands:

DB<1> $SIG{__WARN__} = sub { warn @_; $DB::single = 1 }
DB<2>

Now the program will breakpoint where it was about to issue a warning, and you can issue a T command to see a stack trace, examine data, or do anything else you want [2]. The warning is still printed first.

[2] Under some circumstances, the breakpoint might not occur at the actual place of warning: The current routine might return if the statement triggering the warning is the last one being executed in that routine.

Unfortunately, no __DIE__ pseudo-signal handler will return control to the debugger (evidently death is considered too pressing an engagement to be interrupted). However, you can get a stack trace by calling the confess() function in the Carp module:

DB<1> use Carp
DB<2> $SIG{__DIE__} = sub { confess (@_) }

The output will look something like this:

DB<3>
Insufficient privilege to launch preemptive strike at wargames line
109.
main::__ANON__[(eval 17)[/usr/lib/perl5/5.6.1/
perl5db.pl:1521]:2]('Insufficient privilege to launch preemptive
strike at wargames line 109.^J') called at wargames line 121
main::preemptive('Strike=HASH(0x82069d4)') called at wargames
line 109
main::make_strike('ICBM=HASH(0x820692c)') called at wargames
line 74
main::icbm('Silo_ND') called at wargames line 32
main::wmd('ICBM') called at wargames line 22
main::strike() called at wargames line 11
main::menu() called at wargames line 5
Debugged program terminated. Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.

I've often found it amusing that the debugger refers to the program at this point as "debugged."

[Nov 15, 2017] Strange behaviour of tr function in case the set1 is supplied by a variable

Notable quotes:
"... Characters may be literals or any of the escape sequences accepted in double-quoted strings. But there is no interpolation, so "$" and "@" are treated as literals. ..."
Nov 15, 2017 | perlmonks.com
Nov 16, 2017 at 02:50 UTC

likbez has asked for the wisdom of the Perl Monks concerning the following question:

Looks like in the tr function a scalar variable is accepted as the first argument, but is not compiled properly into a set of characters

use strict;
use warnings;

my $str1 = 'abcde';
my $str2 = 'eda';
my $diff1 = 0;

eval "\$diff1=\$str1=~tr/$str2//";

print "diff1: $diff1\n";

$ perl foo.pl
diff1: 3


This produces in perl 5, version 26:

Test 1: strait set diff1=0, diff2=3
Test 2: complement set diff1=5, diff2=2


Obviously only the second result in both tests is correct. Looks like only an explicitly given first set is correctly compiled. Is this a feature or a bug?

Athanasius (Chancellor) on Nov 16, 2017 at 03:08 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Hello likbez ,

The transliteration operator tr/SEARCHLIST/REPLACEMENTLIST/ does not interpolate its SEARCHLIST , so in your first example the search list is simply the five literal characters '$', 's', 't', 'r', '2'. See Quote and Quote-like Operators .
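
A short side-by-side illustration of that point (added here for clarity, not part of the reply):

#!/usr/bin/perl
# tr/$str2// counts characters from the literal set ($ s t r 2), none of which
# appear in 'abcde'; tr/eda// counts characters from the set (e d a), which match 3.
use strict;
use warnings;

my $str1 = 'abcde';
my $str2 = 'eda';

my $literal_set = ($str1 =~ tr/$str2//);   # 0
my $written_out = ($str1 =~ tr/eda//);     # 3

print "literal set: $literal_set, written-out set: $written_out\n";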

Hope that helps,

Athanasius  < contra mundum Iustus alius egestas vitae, eros Piratica,

roboticus (Chancellor) on Nov 16, 2017 at 03:08 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

likbez :

Feature, per the tr docs

Characters may be literals or any of the escape sequences accepted in double-quoted strings. But there is no interpolation, so "$" and "@" are treated as literals.

A hyphen at the beginning or end, or preceded by a backslash is considered a literal. Escape sequence details are in the table near the beginning of this section.

So if you want to use a string to specify the values in a tr statement, you'll probably have to do it via a string eval:

$ cat foo.pl
use strict;
use warnings;
my $str1 = 'abcde';
my $str2 = 'eda';
my $diff1 = 0;
eval "\$diff1=\$str1=~tr/$str2//";
print "diff1: $diff1\n";
$ perl foo.pl
diff1: 3


... roboticus

When your only tool is a hammer, all problems look like your thumb.

Anonymous Monk on Nov 16, 2017 at 03:09 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Looks like in the tr function a scalar variable is accepted as the first argument, but is not compiled properly into a set of characters

:)

you're guessing how tr/// works, you're guessing it works like s/// or m///, but you can't guess, it doesn't work like that, it doesn't interpolate variables, read perldoc -f tr for the details

likbez !!! on Nov 16, 2017 at 04:41 UTC

Re^2: Strange behaviour of tr function in case the set1 is supplied by a variable
you're guessing how tr/// works, you're guessing it works like s/// or m///, but you can't guess , it doesn't work like that, it doesn't interpolate variables, read perldoc -f tr for the details
Houston, we have a problem ;-)

First of all, that limits the area of applicability of tr.

Second, it's not that I am guessing; I just (wrongly) extrapolated regex behavior onto tr, as people use regexes more often than tr. Funny, but searching my old code and the comments in it, it is clear that I remembered (probably discovered the hard way, not by reading the documentation ;-) this nuance several years ago. Not now. Completely forgotten. Erased from memory. And that tells you something about Perl complexity (actually tr is not that frequently used by most programmers, especially for counting characters).

And that's a real situation that we face with Perl in other areas too (and not only with Perl): Perl exceeds typical human memory capacity to hold the information about the language. That's why we need "crutches" like strict.

You simply can't remember all the nuances of more than a dozen string-related built-in functions, can you? You probably can (and should) for index/rindex and substr , but that's about it.

So there are two problems here:

1. Are /.../ strings uniformly interpreted in the language, or is there a "gotcha" because they are interpreted differently by tr (essentially as single-quoted strings) and by regexes (as double-quoted strings)?

2. If so, what is the quality of warnings about this gotcha? There is no warning issued, even if you use strict and warnings. BTW, it looks like $ can be escaped:

main::(-e:1): 0
DB<5> $_='\$bba\$'
DB<6> tr/\$/?/
DB<7> print $_
\?bba\?


Right now zero warnings are issued with use strict and use warnings enabled. Looks like this idea of using =~ for tr was not so good after all. Regular syntax like tr(set1, set2) would be much better. But it's too late to change, and now we need warnings to be implemented.

likbez !!! on Nov 16, 2017 at 03:10 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

With eval the statement works correctly. So it looks like $ is treated by tr as a regular symbol, and no warnings are issued.

$statement='$diff1=$str1'."=~tr/$str2//;";
eval($statement);
print "With eval: diff1=$diff1\n";

that will produce:

With eval: diff1=3

ww (Archbishop) on Nov 16, 2017 at 03:16 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Same results in AS 5.24 under Win7x64.

Suspected the problem might have arisen from lack of strict and warnings. Wrong, same results, BUT using both remains a generally good idea.

Also wondered if compiling (with qr/.../ ) might change the outcome. Wrong again, albeit with variant (erroneous) output.

Correct me if I'm wrong, guessing that "strait" is a typo or personal shortening of "straight."

Update: Now that I've seen earlier replies... ouch, pounding forehead into brick wall!

[Nov 15, 2017] Preface (Modern Perl 2011-2012)

Nov 15, 2017 | modernperlbooks.com

Modern Perl is one way to describe the way the world's most effective Perl 5 programmers work. They use language idioms. They take advantage of the CPAN. They show good taste and craft to write powerful, maintainable, scalable, concise, and effective code. You can learn these skills too!

Perl first appeared in 1987 as a simple tool for system administration. Though it began by declaring and occupying a comfortable niche between shell scripting and C programming, it has become a powerful, general-purpose language family. Perl 5 has a solid history of pragmatism and a bright future of polish and enhancement. Perl 6 is a reinvention of programming based on the solid principles of Perl, but it's the subject of another book.

Over Perl's long history -- especially the 17 years of Perl 5 -- our understanding of what makes great Perl programs has changed. While you can write productive programs which never take advantage of all the language has to offer, the global Perl community has invented, borrowed, enhanced, and polished ideas and made them available to anyone willing to learn them.

[Nov 14, 2017] Perl Medic Transforming Legacy Code Peter Scott 9780201795264 Amazon.com Books

Nov 14, 2017 | www.amazon.com

By Ricardo Signes on August 5, 2007

great if you inherit wretched old code

When Perl Medic came out, we received a review copy. It was the first review book we got, and I was pretty excited. I took it to (of all places) the gym with me, and read it while I ran. A lot of Perl People were raving about how it was the awesomest book in a long time, and I just couldn't get that excited about it. Despite that, I find myself recommending it to more and more fellow programmers. Here are the things that I find especially useful in the book:

Tests

I am all for testing. I like testing. Testing helps me code better. Testing helps me figure out what badly documented features should do, and helps me notice that my patches are going to break in production. It's just the right thing to do. When reading, I thought the coverage of testing was a bit too long, but that was because I was already at home with it. Really, it's the right length for someone who's not already testing, and Peter Scott should be applauded for writing one of the first Perl books to really explain and encourage testing. Moreover, he stays true to the book's "for maintenance programmers" nature and talks about the troubles with writing tests for code you didn't write, including testing traditionally hard-to-test things like CGI scripts.

perldelta

The book walks through historical versions of perl from version 4 (!) to 5.8.3. The author tells you what changed, and what you should probably do if you upgrade the platform your old Perl code is running on. (Moving to 5.6? Now you can use C<our>. Moving to 5.8? Restricted hashes!)

Cargo Cult Perl

The author devotes almost a whole chapter to pointing out stupid things that people write because they don't really know what they're doing: open without checking the return value, symbolic references, three-part for, and (argh!) return undef.

I hear Dominus is working on a book on this topic. Until then, this should help people write code that will be less of a pain to maintain.
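
For illustration (an editorial sketch, not the book's code; the file name and data are invented), two of those habits and their usual replacements look like this:

#!/usr/bin/perl
# Two of the habits above, with the usual fixes.
use strict;
use warnings;

# Always check open; the file name here is invented:
# open my $fh, '<', 'orders.csv' or die "Cannot open orders.csv: $!";

# 'return undef' yields a one-element list in list context, which can look
# "true-ish" to callers testing @result; a bare 'return' is false either way.
sub lookup {
    my ($key, %table) = @_;
    return unless exists $table{$key};   # not "return undef"
    return $table{$key};
}

my %price = ( apple => 3 );
my @hit   = lookup('apple',  %price);
my @miss  = lookup('banana', %price);
print "hit: @hit, miss has ", scalar @miss, " elements\n";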

*other stuff*

Perl Medic offers a concise explanation of scoping and variable types. It's no Coping with Scoping, but it's quite clear and covers C<our>. While a lot of Perl Medic is really for experienced folks maintaining old code, this section is something every new Perl hacker should read.

There's a nice section on figuring out WTF existing code /does/, suggesting modules to benchmark, profile, deobfuscate, and otherwise dissect the horrible code you're handed.

In the appendices, there's a few pages on "How to Ask Questions that Get Answered." There exist many of these guides, but I don't care. If every technical book spent two or three pages on this, it would be a blessing. Somebody who knows how to ask a question is going to get help, and is going to quickly be received into the community where he asks it.

*the down side*

I guess the biggest down side for me was that I just didn't learn many new things from this book. A number of things were well stated, and I recognized that this book would be useful for many people, but for the most part I didn't have many "A-ha!" moments.

There's a section on "how to use the CPAN," and I felt it was out of place. This might be, though, because I find the idea of Perl without CPAN (for non-tiny projects) to be insane. I guess there must be people who do have to deal with "anti-CPAN policies," though.

This book is definitely a good buy for anybody who's going to be taking over someone else's code, especially code that's old or just lousy. (Even if it's good code, this book can help.) Whenever I leave my job, I will make sure they pick up a copy for the new guy.

[Nov 14, 2017] Perl archeology Need help in refactoring of old Perl code that does not use strict

Nov 14, 2017 | perlmonks.com

likbez has asked for the wisdom of the Perl Monks concerning the following question:

Edit

This is kind of topic that previously was reserved to Cobol and PL/1 forums ;-) but now Perl is almost 30 years old and it looks like the space for Perl archeology is gradually opening ;-).

I got a dozen fairly large scripts (several thousand lines each) written in a (very) early version of Perl 5 (below Perl 5.6). I now need to:

1. Convert them to use the strict pragma. The problem is that all of them share (some heavily, some not) information from the main program to subroutines (and sometimes among subroutines too) via global variables, in addition to (or sometimes instead of) parameters. These scripts mostly do not use my declarations either.

So I need to map variables into local and global namespaces for each subroutine (around 40 per script; each pretty small -- less than a hundred lines) to declare them properly.

As an initial step I just plan to use global variables with namespace qualification, or our lists for each subroutine. Currently I plan to postprocess the output of perl -MO=Xref old_perl_script.pl

and generate such statements. Is there a better way?

2. If possible, I want to split the main namespace into at least two chunks, putting all subroutines into another namespace or module. I actually do not know how to export subroutine names into another namespace (for example main::) when just package statements are used, as in the example below. Modules do some magic via Exporter that I just use but do not fully understand. For example, if we have

#main_script
...
x::a(1,2,3);
...

package x;
sub a {...}
sub b {...}
sub c {...}

package y;
...

How can I access subs a, b, and c from the main:: namespace without qualifying them with the namespace x?

3. Generally this task looks like a case of refactoring. I wonder if any Perl IDE has some of the required capabilities, or whether there are tools that can be helpful.

My time to make the conversion is limited, and using some off-the-shelf tools that speed up the process would be a great help.

Any advice will be greatly appreciated.

AnomalousMonk (Chancellor) on Nov 14, 2017 at 07:20 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict

I'd like to suggest that you also need a

Step 0: Write a test suite that the current code passes for all normal modes of operation and for all failure modes.
With this test suite, you can be reasonably certain that refactored code isn't just going to be spreading the devastation.

Given that you seem to be describing a spaghetti-coded application with communication from function to function via all kinds of secret tunnels and spooky-action-at-a-distance global variables, I'd say you have a job on your hands just with Step 0. But you've already taken a test suite into consideration... Right?
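
Even something as blunt as a golden-output comparison can serve as Step 0. The sketch below is an editorial illustration; the script names and arguments are invented:

#!/usr/bin/perl
# Minimal sketch of a golden-output regression test: the legacy script and the
# refactored one must produce byte-identical output for the same input.
use strict;
use warnings;
use Test::More tests => 1;

my $args = '--site nyc --date 2017-11-14';   # hypothetical arguments
my $old  = `$^X legacy_report.pl $args 2>&1`;
my $new  = `$^X refactored_report.pl $args 2>&1`;

is( $new, $old, 'refactored script produces identical output' );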


Give a man a fish : <%-{-{-{-<

Monk::Thomas (Friar) on Nov 14, 2017 at 12:14 UTC

Re^2: Perl archeology: Need help in refactoring of old Perl code that does not use strict



This is what I would do after 'Step 0':

If the variable does change during the run, then pick a different function first. Once you have got the global state disentangled a bit, it's a lot easier to reason about what this code is doing. Everything that's still using a global needs to be treated with very careful attention.

Corion (Pope) on Nov 14, 2017 at 08:45 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict

In addition to AnomalousMonk s advice of a test suite, I would suggest at the very least to invest the time up front to run automatic regression tests between whatever development version of the program you have and the current "good" (but ugly) version. That way you can easily verify whether your change affected the output and operation of the program. Ideally, the output of your new program and the old program should remain identical while you are cleaning things up.

Note that you can enable strict locally in blocks, so you don't need to make the main program compliant but can start out with subroutines or files and slowly convert them.
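
A small sketch of what that looks like in practice (an editorial example, not Corion's code):

#!/usr/bin/perl
# strict and warnings are lexically scoped, so one subroutine can be tightened
# while the rest of the legacy file stays as it is.
$legacy_total = 0;               # old-style undeclared package variable

sub cleaned_up_sub {
    use strict;                  # in effect from here to the end of this block
    use warnings;
    my ($amount) = @_;           # hypothetical parameter
    return $amount * 2;
}

$legacy_total += cleaned_up_sub(21);
print "$legacy_total\n";         # prints 42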

For your second question, have a look at Exporter . Basically it allows you to im/export subroutine names between packages:

package x;
use Exporter 'import';
our @EXPORT_OK = ('a', 'b', 'c');
#main_script
use x 'a', 'b';   # makes a() and b() available in the main namespace

To find and collect the global variables, maybe it helps you to dump the global namespace before and after your program has run. All these names are good candidates for being at least declared via our to make them visible, and then ideally removed to pass the parameters explicitly instead of implicitly:

      #!perl -w
use strict;

our $already_fixed = 1; # this won't show up

# Put this right before the "uncleaned" part of the script starts
my %initial_variables;
BEGIN {
    %initial_variables = %main::; # make a copy at the start of the program
}
END {
#use Data::Dumper;
#warn Dumper \%initial_variables;
#warn Dumper \%main::;
    # At the end, look what names came newly into being, and tell us about them:
    for my $key (sort keys %main::) {
        if( ! exists $initial_variables{ $key } ) {
            print "Undeclared global variable '$key' found\n";
            
            my $glob = $main::{ $key };
            
            if( defined *{ $glob }{GLOB}) {
                print "used as filehandle *'$key', replace by a lexical filehandle\n";
            };
            if( defined *{ $glob }{CODE}) {
                print "used as subroutine '$key'\n"; # so maybe a false alarm unless you dynamically load code?!
            };
            if( defined *{ $glob }{SCALAR}) {
                print "used as scalar \$'$key', declare as 'our'\n";
            };
            if( defined *{ $glob }{ARRAY}) {
                print "used as array \@'$key', declare as 'our'\n";
            };
            if( defined *{ $glob }{HASH}) {
                print "used as hash \%'$key', declare as 'our'\n";
            };
        };
    };
}
no strict;

$foo = 1;
@bar = (qw(baz bat man));
open LOG, '<', *STDIN;
sub foo_2 {}
 

The above code is a rough cut and for some reason it claims all global names as scalars in addition to their real use, but it should give you a start at generating a list of undeclared names.

Also see Of Symbol Tables and Globs .

Anonymous Monk on Nov 14, 2017 at 08:26 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict (hurry up and wait)

1) ... strict pragma ...My time to make the conversion is limited and using some off the shelf tools that speed up the process would be a great help.

Hurry up and leave it alone :)

use strict; itself confers no benefits; the benefits come from avoidance of the bad practices forbidden by strict :)

That pretty much means convert one script at a time, by hand, after you have come to understand the importance of knowing what you are doing :) Speed kills

2. If possible ... I do not understand ...

That is a hint you shouldn't be refactoring anything programmatically. There are a million nodes on perlmonks, and a reader's digest version might be Modern Perl : a loose description of how experienced and effective Perl 5 programmers work... You can learn this too.

Hurry up and bone up

3. Generally this task looks like a case of refactoring. I wonder, if any Perl IDE has some of required capabilities, or are there tools that can helpful.

I hope you have foot insurance :) happy hunting :)

  • perlcritic
  • PPI / PPIx::XPath , PPIx::EditorTools
  • App::EditorTools - Command line tool for Perl code refactoring
  • Code::CutNPaste - Find Duplicate Perl Code

So enjoy, test first, step0++

[Nov 14, 2017] scoping - What is the difference between my and local in Perl - Stack Overflow

Notable quotes:
"... temporarily changes the value of the variable ..."
"... within the scope ..."
"... Unlike dynamic variables created by the local operator, lexical variables declared with my are totally hidden from the outside world, including any called subroutines. ..."
Nov 14, 2017 | stackoverflow.com


Brian G ,Sep 24, 2008 at 20:12

I am seeing both of them used in this script I am trying to debug and the literature is just not clear. Can someone demystify this for me?

J.J. ,Sep 24, 2008 at 20:24

Dynamic Scoping. It is a neat concept. Many people don't use it, or understand it.

Basically think of my as creating and anchoring a variable to one block of {}, A.K.A. scope.

my $foo if (true); # $foo lives and dies within the if statement.

So a my variable is what you are used to, whereas with dynamic scoping $var can be declared anywhere and used anywhere. So with local you basically suspend the use of that global variable, and use a "local value" to work with it. So local creates a temporary scope for a temporary variable.

$var = 4;
print $var, "\n";
&hello;
print $var, "\n";

# subroutines
sub hello {
     local $var = 10;
     print $var, "\n";
     &gogo; # calling subroutine gogo
     print $var, "\n";
}
sub gogo {
     $var ++;
}

This should print:

4
10
11
4

Brad Gilbert ,Sep 24, 2008 at 20:50

You didn't call the subroutines. – Brad Gilbert Sep 24 '08 at 20:50

brian d foy ,Sep 25, 2008 at 18:23

Don't conditionally declare lexical variables: it has undefined behavior. – brian d foy Sep 25 '08 at 18:23

Jeremy Bourque ,Sep 24, 2008 at 20:26

The short answer is that my marks a variable as private in a lexical scope, and local marks a variable as private in a dynamic scope.

It's easier to understand my , since that creates a local variable in the usual sense. There is a new variable created and it's accessible only within the enclosing lexical block, which is usually marked by curly braces. There are some exceptions to the curly-brace rule, such as:

foreach my $x (@foo) { print "$x\n"; }

But that's just Perl doing what you mean. Normally you have something like this:

sub Foo {
   my $x = shift;

   print "$x\n";
}

In that case, $x is private to the subroutine and its scope is enclosed by the curly braces. The thing to note, and this is the contrast to local , is that the scope of a my variable is defined with respect to your code as it is written in the file. It's a compile-time phenomenon.

To understand local , you need to think in terms of the calling stack of your program as it is running. When a variable is local , it is redefined from the point at which the local statement executes for everything below that on the stack, until you return back up the stack to the caller of the block containing the local .

This can be confusing at first, so consider the following example.

sub foo { print "$x\n"; }
sub bar { local $x; $x = 2; foo(); }

$x = 1;
foo(); # prints '1'
bar(); # prints '2' because $x was localed in bar
foo(); # prints '1' again because local from foo is no longer in effect

When foo is called the first time, it sees the global value of $x which is 1. When bar is called and local $x runs, that redefines the global $x on the stack. Now when foo is called from bar , it sees the new value of 2 for $x . So far that isn't very special, because the same thing would have happened without the call to local . The magic is that when bar returns we exit the dynamic scope created by local $x and the previous global $x comes back into scope. So for the final call of foo , $x is 1.

You will almost always want to use my , since that gives you the local variable you're looking for. Once in a blue moon, local is really handy to do cool things.

Drew Stephens ,Sep 24, 2008 at 22:58

Quoting from Learning Perl :

But local is misnamed, or at least misleadingly named. Our friend Chip Salzenberg says that if he ever gets a chance to go back in a time machine to 1986 and give Larry one piece of advice, he'd tell Larry to call local by the name "save" instead.[14] That's because local actually will save the given global variable's value away, so it will later automatically be restored to the global variable. (That's right: these so-called "local" variables are actually globals!) This save-and-restore mechanism is the same one we've already seen twice now, in the control variable of a foreach loop, and in the @_ array of subroutine parameters.

So, local saves a global variable's current value and then sets it to some form of empty value. You'll often see it used to slurp an entire file, rather than reading just a line:

my $file_content;
{
    local $/;
    open IN, "foo.txt";
    $file_content = <IN>;
}

Calling local $/ sets the input record separator (the value that Perl stops reading a "line" at) to an empty value, causing the diamond operator to read the entire file, so it never hits the input record separator.

Aristotle Pagaltzis ,Sep 25, 2008 at 23:25

I can't believe no one has linked to Mark Jason Dominus' exhaustive treatises on the matter:

dan1111 ,Jan 28, 2013 at 11:21

Word of warning: both of these articles are quite old, and the second one (by the author's own warning) is obsolete. It demonstrates techniques for localization of file handles that have been superseded by lexical file handles in modern versions of Perl. – dan1111 Jan 28 '13 at 11:21

Floegipoky ,Jan 23, 2015 at 16:51

As in Clinton was President (of the US) when the first was written – Floegipoky Jan 23 '15 at 16:51

Steve Jessop ,Sep 24, 2008 at 20:21

http://perldoc.perl.org/perlsub.html#Private-Variables-via-my()

Unlike dynamic variables created by the local operator, lexical variables declared with my are totally hidden from the outside world, including any called subroutines. This is true if it's the same subroutine called from itself or elsewhere--every call gets its own copy.

http://perldoc.perl.org/perlsub.html#Temporary-Values-via-local()

A local modifies its listed variables to be "local" to the enclosing block, eval, or do FILE --and to any subroutine called from within that block. A local just gives temporary values to global (meaning package) variables. It does not create a local variable. This is known as dynamic scoping. Lexical scoping is done with my, which works more like C's auto declarations.

I don't think this is at all unclear, other than to say that by "local to the enclosing block", what it means is that the original value is restored when the block is exited.

dlamblin ,Sep 24, 2008 at 20:14

Well Google really works for you on this one: http://www.perlmonks.org/?node_id=94007

From the link:

Quick summary: 'my' creates a new variable, 'local' temporarily amends the value of a variable.

ie, 'local' temporarily changes the value of the variable , but only within the scope it exists in.

Generally use my, it's faster and doesn't do anything kind of weird.

Kevin Crumley ,Sep 24, 2008 at 20:27

While this may be true, it's basically a side effect of the fact that "local"s are intended to be visible down the callstack, while "my"s are not. And while overriding the value of a global may be the main reason for using "local", there's no reason you can't use "local" to define a new variable. – Kevin Crumley Sep 24 '08 at 20:27

1800 INFORMATION ,Jan 21, 2009 at 10:02

local does not actually define a new variable. For example, try using local to define a variable when option explicit is enabled. You need to use "our" or "my" to define a new global or local variable. "local" is correctly used to give a variable a new value – 1800 INFORMATION Jan 21 '09 at 10:02

1800 INFORMATION ,Jan 29, 2009 at 10:45

Jesus did I really say option explicit to refer to the Perl feature. I meant obviously "use strict". I've obviously not coded in Perl in a while – 1800 INFORMATION Jan 29 '09 at 10:45

catfood ,Sep 24, 2008 at 20:18

From man perlsub :

Unlike dynamic variables created by the local operator, lexical variables declared with my are totally hidden from the outside world, including any called subroutines.

So, oversimplifying, my makes your variable visible only where it's declared. local makes it visible down the call stack too. You will usually want to use my instead of local .

Michael Carman ,Sep 25, 2008 at 2:00

Your confusion is understandable. Lexical scoping is fairly easy to understand but dynamic scoping is an unusual concept. The situation is made worse by the names my and local being somewhat inaccurate (or at least unintuitive) for historical reasons.

my declares a lexical variable -- one that is visible from the point of declaration until the end of the enclosing block (or file). It is completely independent from any other variables with the same name in the rest of the program. It is private to that block.

local , on the other hand, declares a temporary change to the value of a global variable. The change ends at the end of the enclosing scope, but the variable -- being global -- is visible anywhere in the program.

As a rule of thumb, use my to declare your own variables and local to control the impact of changes to Perl's built-in variables.
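
A tiny example of that rule of thumb (added here for illustration, not from the answer), using the output field separator $, :

#!/usr/bin/perl
# my for your own data; local for temporarily changing a built-in ($, here).
use strict;
use warnings;

my @row = ('alpha', 'beta', 'gamma');   # your own data: lexical

{
    local $, = ', ';                    # built-in: restored at the closing brace
    print @row;
    print "\n";
}

print @row;                             # default again: no separator
print "\n";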

For a more thorough description see Mark Jason Dominus' article Coping with Scoping .

skiphoppy ,Sep 25, 2008 at 18:52

local is an older method of localization, from the times when Perl had only dynamic scoping. Lexical scoping is much more natural for the programmer and much safer in many situations. my variables belong to the scope (block, package, or file) in which they are declared.

local variables instead actually belong to a global namespace. If you refer to a variable $x with local, you are actually referring to $main::x, which is a global variable. Contrary to what it's name implies, all local does is push a new value onto a stack of values for $main::x until the end of this block, at which time the old value will be restored. That's a useful feature in and of itself, but it's not a good way to have local variables for a host of reasons (think what happens when you have threads! and think what happens when you call a routine that genuinely wants to use a global that you have localized!). However, it was the only way to have variables that looked like local variables back in the bad old days before Perl 5. We're still stuck with it.

andy ,Sep 24, 2008 at 20:18

"my" variables are visible in the current code block only. "local" variables are also visible where ever they were visible before. For example, if you say "my $x;" and call a sub-function, it cannot see that variable $x. But if you say "local $/;" (to null out the value of the record separator) then you change the way reading from files works in any functions you call.

In practice, you almost always want "my", not "local".

Abhishek Kulkarni ,Apr 10, 2013 at 5:44

Look at the following code and its output to understand the difference.
our $name = "Abhishek";

sub sub1
{
    print "\nName = $name\n";
    local $name = "Abhijeet";

    &sub2;
    &sub3;
}

sub sub2
{
    print "\nName = $name\n";
}

sub sub3
{
    my $name = "Abhinav";
    print "\nName = $name\n";
}


&sub1;

Output is :

Name = Abhishek

Name = Abhijeet

Name = Abhinav

phreakre ,Oct 1, 2008 at 16:01

dinomite's example of using local to redefine the record delimiter is the only use I have run across in a lot of perl programming. I live in a niche perl environment [security programming], but it really is a rarely used scope in my experience.

Saravanarajan ,Aug 6, 2009 at 8:12
&s;

sub s()
{
    local $s="5";
    &b;
    print $s;
}

sub b()
{
    $s++;
}

The above script prints 6.

But if we change local to my it will print 5.

This is the difference. Simple.

,

I think the easiest way to remember it is this way. MY creates a new variable. LOCAL temporarily changes the value of an existing variable.

[Nov 13, 2017] 'our' is not 'my' by Ovid

Notable quotes:
"... I didn't convert them to my variables because these variables were being declared in another package. ..."
"... the use of the variable name ..."
Nov 13, 2017 | perlmonks.com

Aug 16, 2001

Update: At the request of a couple of people, some as replies to this node and one as a private message, I have added a link to this node from the Tutorials section. I have linked it rather than repost it to tutorials as there is no reason for me to get the rep twice. I might add that I would not have done this were it not for the high rep this node has received (thus suggesting to me that monks find this information useful).

Many people misunderstand how our is used and I've often seen code where it's used as a synonym for my . This is not the case and I hope this clears up some of the differences. I'm posting this because I've run across this a few times lately and I thought I should just 'get it out there' for others.

There are basically two ways of declaring variables in Perl: global and lexical. A global variable has a package name prepended to it:

$main::foo;
$CGI::POST_MAX;
@foo::bar;

All packages can access the variable $foo in the main symbol table ( %main:: or the shorthand %:: ) by using $main::foo . Global variables are generally a bad idea, but do crop up in a lot of code.

A lexically scoped variable is declared with my :

my $foo;
my $POST_MAX;
my @bar;

Though they look similar to the package variables above, they are not the same and cannot be accessed outside of the scope in which they were declared.

If you use strict and you try to access a variable that's not previously declared, you'll get an error similar to the following:

Global symbol "$foo" requires explicit package name at C:\test.pl line 2.

Basically, Perl expects that you are trying to use a package variable, but left off the package name. The problem with using package names ( $main::foo ) is that strict ignores these variables and if you type $main::field in one place and $main::feild in another place, strict won't warn you of the typo.

our tries to help out by letting you use package variables without adding the package name. The following two variables are the same:

package foo;
use strict;

our $bar;     # These are the same
$foo::bar;    # These are the same

That eliminates the $main::f(ie|ei)ld problem by allowing you to do this:

our $field;

From there, you just drop package names and the script will run, and later Perl will kill the script if you try to use $feild . Unfortunately, there are a lot of problems with this.

The main problem is that you are now using global variables, which is generally a Bad Thing. Any code, anywhere, can change your global. Large systems with lots of globals typically suffer from this problem and quickly become unmaintainable.

This also leads to very subtle coding errors. Try the following code:

use strict;
for ( 1 .. 3 ) { &doit }

sub doit {
    our $foo;
    print ++$foo . "\n";
}

That will print 1, 2, and 3 on successive lines. Change the our to a my and it prints 1, 1, and 1. Because the my variable goes out of scope at the end of the sub and the our variable doesn't (since it's global), you get wildly different results.

There is one use I have found for our variables. If I am forced to maintain a large code base with lots of global variables, I usually expect to see lots of code problems, including misspellings of those globals that are overlooked due to the hacked nature of that code. For example, one site I worked on had some object-oriented code that stuffs a lot of data into package main (you can keep reading after you stop laughing). I've seen code like this:

$main::sql = $order->lineItemSQL;
$main::dbh->prepare( $main::sql );

I know, it doesn't make a lot of sense (why doesn't it just return the line items instead of the SQL?), but I needed to clean up stuff like that. So, in my code, I put the following:

our ( $sql, $dbh );
# later
$sql = $order->lineItemSQL;
$dbh->prepare( $sql );

I didn't convert them to my variables because these variables were being declared in another package.

This doesn't seem like much benefit aside from eliminating a few bytes (and it certainly wasn't the best solution, but we've all had time constraints...). However, there is a huge benefit. Later in the code, when we encounter $main::slq , it becomes $slq and the program dies a much needed death.

As a side note: the primary difference (that I can see) between our and use vars appears to be that our lexically scopes the use of the variable name (not the variable) and the other does not. The following code snippets should clarify that:

use strict;
{
    use vars qw/ $foo /;
    $foo = "Ovid";
}
print $foo;

That code will work fine. However, change the use vars qw/ $foo /; to our $foo and Perl dies, telling you that $foo requires an explicit package name. However, change the print statement to include the package name and all is well:

use strict;
{
    our $foo;
    $foo = "Ovid";
}
print $main::foo;

This behavior may not seem immediately useful, but you can then use an our declaration at the top of a subroutine, strip off the package names in the sub, and not worry about this conflicting with my variables in the rest of the code.

use strict;
my $foo = 'bar';
&baz;
sub baz {
    our $foo = 7;
    print $foo;
}
print $foo;

Change that code to use " use vars " and you'll see it print 7 twice, stepping on your my variable.

Interesting to note that the only uses I've found for our have been hacking bad code...

Cheers,
Ovid


dragonchild (Archbishop) on Aug 16, 2001 at 23:35 UTC

Re: 'our' is not 'my'

Good points, Ovid . ++! Personally, the only time I use our is in the following:

use 5.6.0;
use strict;
use warnings;

package Foo;
use Exporter;
our @ISA = qw(Exporter);
our @EXPORT_OK = qw(bar baz);
# Code down here, including bar() and baz()

That way, I get to have the warmfuzzy feeling that I'm not using ickypoo globals, but, instead, using package-scoped globals. @ISA needing to be a global is why I even bother with our .


damian1301 (Curate) on Aug 17, 2001 at 03:21 UTC

Re: 'our' is not 'my'

I often see this error when people use the GetOpt modules. Good job Ovid, ++ (maybe could be a tutorial?).


LD2 (Curate) on Aug 17, 2001 at 03:47 UTC

Re: Re: 'our' is not 'my'

I second the suggestion, I think it'd be perfect for the tutorial section! Ovid ++!

ikegami (Pope) on Oct 26, 2007 at 12:02 UTC

Re: 'our' is not 'my'

There is a catch with our that doesn't exist with use vars :

package AA;
$AA::var = __PACKAGE__;
our $var;
print "$var\n";

package BB;
$BB::var = __PACKAGE__;
print "$var\n";   # Prints 'AA'.

Using curlies when using multiple packages in one file avoids the problem.

{
    package AA;
    $AA::var = __PACKAGE__;
    our $var;
    print "$var\n";
}
{
    package BB;
    $BB::var = __PACKAGE__;
    print "$var\n";   # Compile error!
}

Keeping that exception in mind, our is like no strict 'vars'; on a per-var basis.

ruzam (Curate) on Oct 26, 2007 at 15:45 UTC

Re^2: 'our' is not 'my'

In your first example, how is it that 'our $var;' in package AA makes $var available to both package BB and package main? Why does package BB fall back to AA's result without any kind of warning?

ikegami (Pope) on Oct 26, 2007 at 16:09 UTC

Re^3: 'our' is not 'my'

I prefer to think of our as the equivalent of no strict 'vars' for a single variable, but in reality, our creates a lexically-scoped variable aliased to a package variable.

In effect,
our $var;
is the same as
use Data::Alias;
alias my $var = $__PACKAGE__::var;
(Syntax issues aside.)

Unlike blocks and files, packages aren't lexical scopes. Switching packages neither destroys the our variable (because the our variable is lexically scoped) nor changes which package variable the our variable is aliased to (because package doesn't know anything about the our variable).


Tabari (Monk) on Oct 26, 2007 at 11:55 UTC

Re: 'our' is not 'my'

I was cleaning up some old code which did not use strict.
As it used a lot of global vars, I was forced to use our , the way you explained. I didn't know the exact meaning before.
Tabari

Anonymous Monk on Nov 20, 2008 at 19:47 UTC

Re: 'our' is not 'my'

Is slq a typo?
This line:

However, there is a huge benefit. Later in the code, when we encounter $main::slq , it becomes $slq and the program dies a much needed death.

Did you mean sql ?

-J_Tom_Moon_79

StommePoes (Scribe) on Nov 26, 2008 at 09:55 UTC

Re^2: 'our' is not 'my'

As I understand it, it's a deliberate typo to show that when using our , the typo called "slq" will get caught (instead of being treated like some new variable as happens when variables are global).

So, maybe I'm misunderstanding (I don't write speak or read Perl yet, lawlz), but I thought "slq" was the whole point.


[Nov 13, 2017] How to export names from one namespace into another

Nov 13, 2017 | stackoverflow.com

Rancho ,Apr 3, 2014 at 17:13

I have a variable $x which currently has a local scope in A.pm and I want to use the output of $x (which is usually PASSED/FAILED) in an if else statement in B.pm

Something like below

A.pm:

if (condition1) { $x = 'PASSED'; }
if (condition2) { $x = 'FAILED'; }

B.pm:

if ($x=='PASSED') { $y=1; } else { $y=0; }

I tried using require ("A.pm"); in B.pm but it gives me the error "global symbol requires an explicit package name", which means it is not able to read the variable via require. Any inputs would help.

Borodin ,Apr 3, 2014 at 17:27

This sounds like a very strange configuration. Your A.pm has executable code as well as values that you want to access externally. Is that code in subroutines? Are you aware that any code outside a subroutine will be executed the first time the external code requires the file? You need to show us the contents of A.pm or we can't help you much. – Borodin Apr 3 '14 at 17:27

Jonathan Leffler ,Apr 3, 2014 at 17:29

Normally, you'd return $x from a function defined in A and called in B; this is a much cleaner, less pathological way of getting at the information. – Jonathan Leffler Apr 3 '14 at 17:29

Rancho ,Apr 3, 2014 at 17:41

Yes the above if conditions in A.pm are in a subroutine. Is there a way I could read that subroutine outside to extract the value of $x? – Rancho Apr 3 '14 at 17:41

ysth ,Apr 3, 2014 at 18:04

there is a core module named B - avoid using that name even in examples. – ysth Apr 3 '14 at 18:04

David W. ,Apr 3, 2014 at 19:08

I have a variable $x which currently has a local scope in A.pm and I want to use the output of $x (which is usually PASSED/FAILED) in an if else statement in B.pm

We could show you how to do this, but this is a really bad, awful idea.

There's a reason why variables are scoped, and even global variables declared with our and not my are still scoped to a particular package.

Imagine someone modifying one of your packages, and not realizing there's a direct connection to a variable name $x . They could end up making a big mess without even knowing why.

What I would HIGHLY recommend is that you use functions (subroutines) to pass around the value you need:

Local/A.pm
package Local::A;
use strict;
use warnings;
use lib "$ENV{HOME}";    # qw() would not interpolate $ENV{HOME}

use Exporter qw(import);
our @EXPORT_OK = qw(set_condition);

sub set_condition {
    if ( condition1 ) {
        return "PASSED";
    }
    elsif ( condition2 ) {
        return "FAILED";
    }
    else {
        return "Huh?";
    }
}
1;

Here's what I did:

Local/B.pm
package Local::B;
use lib qw($ENV{HOME});

use Local::A qw(set_condition);

my $condition = set_contition();

my $y;
if ( $condition eq 'PASSED' ) {   # Note: Use `eq` and not `==` because THIS IS A STRING!
   $y = 1;
else {
   $y = 0;
}
1;

If all of this looks like mysterious magic, you need to read about Perl modules . This isn't light summer reading. It can be a bit impenetrable, but it's definitely worth the struggle. Or, get Learning Perl and read up on Chapter 11.

Rancho ,Apr 23, 2014 at 16:57

Thanks a lot for the detailed explanation. Appreciate it – Rancho Apr 23 '14 at 16:57

Miller ,Apr 3, 2014 at 17:21

After you require A; , you can then access the variable by giving it an explicit package name like the error message says.

in B.pm:

my $y = $A::x eq 'PASSED' ? 1 : 0;

The variable $x will have to be declared with our instead of my .

Finally, use eq instead of == for doing string comparisons.

Borodin ,Apr 3, 2014 at 17:24

... as long as $x isn't a lexical variable declared with myBorodin Apr 3 '14 at 17:24

[Nov 13, 2017] aristotle73

Nov 13, 2017 | perlmonks.com

Variable Scoping in Perl: the basics

print "$Robert has canned $name's sorry butt\n"; I tried running this in PERL and it yelled at me saying that it didn't like $name::s. I changed this line of code to: print "$Robert has canned $name sorry butt\n"; And it worked fine 0_o An error in the tutorial perhaps?

Aristotle (Chancellor) on Dec 24, 2004 at 01:50 UTC

Re^2: Variable Scoping in Perl: the basics

Try

print "$Robert has canned ${name}'s sorry butt\n"; [download]

The apostrophe is the old-style package separator, still supported, so $name's is indeed equivalent to $name::s . By putting the curlies in there, you tell Perl exactly which part of the string to consider part of the variable name, and which part to consider a literal value.

[Nov 13, 2017] Variable Scope

Nov 13, 2017 | perlmonks.com

dave741 has asked for the wisdom of the Perl Monks concerning the following question (Nov 10, 2017 at 16:52 UTC):

#!/usr/local/bin/perl
use strict;

foreach my $name ('A', 'B') {
    my $res = 'Init' if (0);
    if (defined ($res)) {
        print "$name: res = $res\n";
    }
    else {
        print "$name: res is undef\n";
    }
    $res = 'Post';
}
Result:
A: res is undef
B: res = Post

As $res is under lexical variable scope, shouldn't it disappear at the bottom of the block
and be recreated by the second pass, producing an identical result?
Bug? Feature? Saving CPU?

perl -v
This is perl, v5.10.1 (*) built for x86_64-linux-thread-multi

Thoughts?
Dave

haukex (Monsignor) on Nov 10, 2017 at 16:55 UTC

Re: Variable Scope (updated)

From perlsyn :

NOTE: The behaviour of a my , state , or our modified with a statement modifier conditional or loop construct (for example, my $x if ... ) is undefined . The value of the my variable may be undef , any previously assigned value, or possibly anything else. Don't rely on it. Future versions of perl might do something different from the version of perl you try it out on. Here be dragons.

Update: Heh, Eily and I posted within 4 seconds of one another ;-)

Update 2: Historically, sometimes this "feature/bug" was (ab)used to make variables " static ", just two references of many found with a quick search: Unusual Closure Behaviour , Re: Making a variable in a sub retain its value between calls . The better ways to do this are described in Persistent Private Variables :

BEGIN {
    my $static_val = 0;
    sub gimme_another { return ++$static_val; }
}

# - OR - in Perl >=5.10:

use feature 'state';
sub gimme_another {
    state $static_val = 0;
    return ++$static_val;
}

But nowadays, anywhere you see the pattern, it should be considered a bug, see "Using my() in false conditional" in perldeprecation . On Perl 5.26:

$ perl -e 'my $x if 0'
Deprecated use of my() in false conditional. This will be a fatal error in Perl 5.30 at -e line 1.

Update 3: Apparently, the warning " Deprecated use of my() in false conditional " first showed up in Perl 5.10 and became a default warning in 5.12. Note that your Perl 5.10.1 is now more than eight years old, and you should upgrade. Also, you should generally use warnings; ( Use strict and warnings ).

Eily (Parson) on Nov 10, 2017 at 16:55 UTC

Re: Variable Scope

According to perlsyn :

NOTE: The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ... ) is undefined. The value of the my variable may be undef, any previously assigned value, or possibly anything else. Don't rely on it. Future versions of perl might do something different from the version of perl you try it out on. Here be dragons.
So neither bug nor feature, third option.

AnomalousMonk (Chancellor) on Nov 10, 2017 at 17:07 UTC

Re: Variable Scope

See the state feature from Perl 5.10 onward for similar "static variable" behavior that is well-defined.




[Nov 13, 2017] How to declare perl variable without using my - Stack Overflow

Nov 13, 2017 | stackoverflow.com


DavidO ,May 22, 2013 at 2:04

I'm new to Perl programming. I've noticed that every time I want to declare a new variable, I should use the my keyword before that variable if strict and warnings are on (which I was told to do, for reasons I also do not know).

So how do I declare a variable in Perl without using my and without getting warnings?

My question is: Is it possible to declare a variable without using my and without omitting the use strict; and use warnings; and without getting warnings at all?
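The answers are not preserved here, but one standard approach (a sketch, not from the thread) is to declare a package variable with our , which satisfies strict without my :

use strict;
use warnings;

our $counter = 0;    # a package variable ($main::counter); legal under strict without my
$counter++;
print "$counter\n";  # prints 1, with no warnings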

[Nov 13, 2017] perl - Why use strict and warnings

Nov 13, 2017 | stackoverflow.com


TLP , Nov 5, 2011 at 23:04

It seems to me that many of the questions in the Perl tag could be solved if people would use:
use strict;
use warnings;

I think some people consider these to be akin to training wheels, or unnecessary complications, which is clearly not true, since even very skilled Perl programmers use them.

It seems as though most people who are proficient in Perl always use these two pragmas, whereas those who would benefit most from using them seldom do. So, I thought it would be a good idea to have a question to link to when encouraging people to use strict and warnings .

So, why should a Perl developer use strict and warnings ?

Paul Tyng , Nov 5, 2011 at 23:08

I always wonder for stuff like this why they don't just make it the default and have the dev actually have to actively loosen stuff, where is the use loose;Paul Tyng Nov 5 '11 at 23:08

Daniel Böhmer , Nov 5, 2011 at 23:15

Like many cool and useful things Perl started as a hack, as a tool for the guy who invents it. Later it became more popular and an increasing number of unskilled people started using it. This is when you start thinking something like use strict was a good idea but backwards compatibility has already become a real problem to you:-( – Daniel Böhmer Nov 5 '11 at 23:15

ikegami , Nov 6, 2011 at 0:04

@JB Nizet, @Paul T., Actually, use strict; is on by default when you request the Perl 5.12 (or higher) language. Try perl -e"use v5.012; $x=123;" . no strict; actually turns it off. – ikegami Nov 6 '11 at 0:04

Joel Berger , Nov 6, 2011 at 3:05

Though in the end your point is true, the more times we say it, maybe the more people will hear. There has been some rumbling lately of trying to make more/better/modern Perl tutorials available and certainly strict/warnings will be on the top of each of these. For mine I plan to have s/w on the top of every snippet, just so that all newbies see it every time – Joel Berger Nov 6 '11 at 3:05

TLP , Nov 6, 2011 at 5:04

@JoelBerger No, actually it is nothing like it. Just like I said, it only has similar words in the title. It's for backwards compatibility. is the first sentence in the accepted answer, how do you propose that applies to my question? – TLP Nov 6 '11 at 5:04

ikegami , Nov 6, 2011 at 0:00

For starters, it helps find typos in variable names. Even experienced programmers make such errors. A common case is forgetting to rename an instance of a variable when cleaning up or refactoring code.

The pragmas catch many errors sooner than they would be caught otherwise, which makes it easier to find the root causes of the errors. The root cause might be the need for an error or validation check, and that can happen regardless of programmer skill.

What's good about Perl warnings is that they are rarely spurious, so there's next to no cost to using them.
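A tiny sketch of the kind of typo this catches (the variable names are invented): with strict, compilation stops instead of silently using a brand-new undefined variable.

use strict;
use warnings;

my $total = 42;
print "Total: $tota1\n";   # typo: $tota1 instead of $total
# With strict, compilation aborts with:
#   Global symbol "$tota1" requires explicit package name
# Without strict, the typo would simply be a new, undefined variable.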


Related reading: Why use my ?

ikegami , Nov 6, 2011 at 19:42

@TLP, I'm not about to make a study to quantify how much it helps. It should suffice to say that they help unconditionally. – ikegami Nov 6 '11 at 19:42

Jean , Sep 26, 2013 at 16:11

Why is it made optional then if it has so many benefits ? Why not enable it by default (like someone commented above) ? Is it for compatibility reasons ? – Jean Sep 26 '13 at 16:11

ikegami , Sep 26, 2013 at 16:34

@Jean, backwards compatibility. Note that use strict; is enabled by default if you use version 5.12 or newer of the language ( use 5.012; ). – ikegami Sep 26 '13 at 16:34

user2676847 , Aug 17, 2014 at 8:51

@Jean if you are writing a simple script you really don't want to get alerted by warnings about file handler names or for not declaring the variable before using them :-) – user2676847 Aug 17 '14 at 8:51

user966588, Feb 12, 2013 at 8:31

Apparently use strict should (really, must) be used when you want to force Perl to code properly: it can force variable declarations, make you be explicit about strings and subs (i.e. barewords), and make you use refs with caution. Note: if there are errors, use strict will abort execution.

use warnings; will help you find typing mistakes in a program: a missed semicolon, 'elseif' instead of 'elsif', deprecated syntax or functions, and the like. Note: use warnings will only print warnings and continue execution, i.e. it won't abort.

Anyway, it is better to go into the details, which I specify below.

From perl.com (my favourite):

use strict 'vars';

which means that you must always declare variables before you use them.

If you don't declare it, you will probably get an error message for the undeclared variable:

Global symbol "$variablename" requires explicit package name at scriptname.pl line 3

This warning means Perl is not exactly clear about what the scope of the variable is. So you need to be explicit about your variables, which means either declaring them with my so they are restricted to the current block, or referring to them with their fully qualified name (for example: $main::variablename).

So, a compile-time error is triggered if you attempt to access a variable that hasn't been declared with my , declared as a package variable (with our or use vars ), imported into the package, or written with its fully qualified package name.

use strict 'subs';

Consider two programs

# prog 1
   $a = test_value;
   print "First program: ", $a, "\n";
   sub test_value { return "test passed"; }

 Output: First program: test_value

# prog 2
   sub test_value { return "test passed"; }
   $a = test_value;
   print "Second program: ", $a, "\n";

 Output: Second program: test passed

In both cases we have a test_value() sub and we want to put its result into $a. And yet, when we run the two programs, we get two different results:

In the first program, at the point we get to $a = test_value; , Perl doesn't know of any test_value() sub, and test_value is interpreted as the string 'test_value'. In the second program, the definition of test_value() comes before the $a = test_value; line, so Perl treats test_value as a sub call.

The technical term for isolated words like test_value that might be subs and might be strings depending on context, by the way, is bareword . Perl's handling of barewords can be confusing, and it can cause bugs in a program.

The bug is what we encountered in our first program. Remember that Perl won't look forward to find test_value() , so since it hasn't already seen test_value(), it assumes that you want a string. So if you use strict 'subs'; , this program will die with an error:

Bareword "test_value" not allowed while "strict subs" in use at ./a6-strictsubs.pl line 3.

Solutions to this error would be:
1. Use parentheses to make it clear you're calling a sub, so Perl sees $a = test_value();
2. Declare your sub before you first use it:

use strict;
sub test_value;  # Declares that there's a test_value() coming later ...
my $a = test_value;  # ...so Perl will know this line is okay.
.......
sub test_value { return "test_passed"; }

3. And if you mean to use it as a string, quote it.

So, this stricture makes Perl treat all barewords as syntax errors. A bareword is any bare name or identifier that has no other interpretation forced by context. (Context is often forced by a nearby keyword or token, or by predeclaration of the word in question.) So if you mean to use it as a string, quote it, and if you mean to use it as a function call, predeclare it or use parentheses.

Barewords are dangerous because of this unpredictable behavior. use strict; (or use strict 'subs';) makes them predictable, because barewords that might cause strange behavior in the future will make your program die before they can wreak havoc.

There's one place where it's OK to use barewords even when you've turned on strict subs: when you are assigning hash keys.

$hash{sample} = 6;   # Same as $hash{'sample'} = 6
%other_hash = ( pie => 'apple' );

Barewords in hash keys are always interpreted as strings, so there is no ambiguity.

use strict 'refs';

This generates a run-time error if you use symbolic references, intentionally or otherwise. A value that is not a hard reference is then treated as a symbolic reference . That is, the reference is interpreted as a string representing the name of a global variable.

use strict 'refs';

$ref = \$foo;       # Store "real" (hard) reference.
print $$ref;        # Dereferencing is ok.

$ref = "foo";       # Store name of global (package) variable.
print $$ref;        # WRONG, run-time error under strict refs.

use warnings;

This lexically scoped pragma permits flexible control over Perl's built-in warnings, both those emitted by the compiler as well as those from the run-time system.

From perldiag :

The majority of warning messages from the classifications below (i.e. W, D and S) can be controlled using the warnings pragma.

(W) A warning (optional)
(D) A deprecation (enabled by default)
(S) A severe warning (enabled by default)

I have listed below, by classification, some of the warning messages that occur often. For detailed info on them and on other messages, refer to perldiag .

(W) A warning (optional):

Missing argument in %s
Missing argument to -%c
(Did you mean &%s instead?)
(Did you mean "local" instead of "our"?)
(Did you mean $ or @ instead of %?)
'%s' is not a code reference
length() used on %s
Misplaced _ in number

(D) A deprecation (enabled by default):

defined(@array) is deprecated
defined(%hash) is deprecated
Deprecated use of my() in false conditional
$# is no longer supported

(S) A severe warning (enabled by default)

elseif should be elsif
%s found where operator expected
(Missing operator before %s?)
(Missing semicolon on previous line?)
%s never introduced
Operator or semicolon missing before %s
Precedence problem: open %s should be open(%s)
Prototype mismatch: %s vs %s
Warning: Use of "%s" without parentheses is ambiguous
Can't open %s: %s

toolic , Nov 5, 2011 at 23:50

These two pragmas can automatically identify bugs in your code.

I always use this in my code:

use strict;
use warnings FATAL => 'all';

FATAL makes the code die on warnings, just like strict does.

For additional information, see: Get stricter with use warnings FATAL => 'all';

Also... The strictures, according to Seuss

tchrist , Nov 5, 2011 at 23:52

Actually, you have to delay the FATAL => "all" till runtime, by assigning to $SIG{__WARN__} = sub { croak "fatalized warning @_" }; or else you screw up the compiler trying to tell you what it needs to. – tchrist Nov 5 '11 at 23:52

toolic , Nov 6, 2011 at 0:06

@tchrist: This has always worked for me as-is and as documented. If you have found a case where it doesn't work as documented, please patch the documentation using perlbug . – toolic Nov 6 '11 at 0:06

moodywoody , Nov 5, 2011 at 23:21

There's a good thread on perlmonks about this question.

The basic reason obviously is that strict and warnings massively help you catch mistakes and aid debugging.

Rahul Reddy , Jul 1, 2013 at 7:35

Source :: Different blogs

use will export functions and variable names into the calling package's namespace by calling the module's import() function.

A pragma is a module which influences some aspect of the compile time or run time behavior of Perl. Pragmas give hints to the compiler.

use warnings - Perl complains about variables used only once, improper conversions of strings into numbers, and trying to write to files that are not opened. It happens at compile time. It is used to control warnings.

use strict - declares variable scope. It is used to set some kind of discipline in the script. If barewords are used in the code, they are caught. All variables should be given a scope, like my, our or local.

MikasaAckerman , Jun 29, 2015 at 13:43

The "use strict" directive tells Perl to do extra checking during the compilation of your code. Using this directive will save you time debugging your Perl code because it finds common coding bugs that you might overlook otherwise.

dhana govindarajan , Jun 28, 2016 at 12:24

use strict; use warnings;

strict and warnings are modes for a Perl program. They keep the user from writing code too liberally; with them, the code looks more formal and its coding standard is more effective.

warnings is the same as "-w" on the perl shebang line, so it will show you the warnings generated by the Perl program, displayed in the terminal.

Andreas Ährlund , Apr 6, 2015 at 14:58

Strict and warnings make sure your variables are not global.

It is much neater to be able to have variables unique for individual methods rather than having to keep track of each and every variable name.

$_, or no variable for certain functions, can also be useful to write more compact code quicker.

However, if you do not use strict and warnings, $_ becomes global!

[Nov 13, 2017] Basic debugging checklist by toolic

Feb 23, 2009 | perlmonks.com
A checklist of tips and techniques to get you started.

This list is meant for debugging some of the most common Perl programming problems; it assumes no prior working experience with the Perl debugger ( perldebtut ). Think of it as a First Aid kit, rather than a fully-staffed state-of-the-art operating room.

These tips are meant to act as a guide to help you answer the following questions:

  1. Add the "stricture" pragmas ( Use strict and warnings )
        use strict;
        use warnings;
        use diagnostics;
  2. Display the contents of variables using print or warn
        warn "$var\n";
        print "@things\n";    # array with spaces between elements
  3. Check for unexpected whitespace
     • chomp , then print with delimiters of your choice, such as colons or balanced brackets, for visibility
           chomp $var;
           print ">>>$var<<<\n";
     • Check for unprintable characters by converting them into their ASCII hex codes using ord
           my $copy = $str;
           $copy =~ s/([^\x20-\x7E])/sprintf '\x{%02x}', ord $1/eg;
           print ":$copy:\n";
  4. Dump arrays, hashes and arbitrarily complex data structures. You can get started using the core module Data::Dumper . Should the output prove to be unsuitable to you, other alternatives can be downloaded from CPAN, such as Data::Dump , YAML , or JSON . See also How can I visualize my complex data structure?
        use Data::Dumper;
        print Dumper(\%hash);
        print Dumper($ref);
  5. If you were expecting a reference, make sure it is the right kind (ARRAY, HASH, etc.)
        print ref $ref, "\n";
  6. Check to see if your code is what you thought it was: B::Deparse
        $ perl -MO=Deparse -p program.pl
  7. Check the return ( error ) status of your commands
     • open with $!
           open my $fh, '<', 'foo.txt' or die "can not open foo.txt: $!";
     • system and backticks ( qx ) with $?
           if (system $cmd) { print "Error: $? for command $cmd" }
           else { print "Command $cmd is OK" }
           $out = `$cmd`; print $? if $?;
     • eval with $@
           eval { do_something() }; warn $@ if $@;
  8. Use Carp to display variables with a stack trace of module names and function calls.
        use Carp qw(cluck);
        cluck("var is ($var)");

     Better yet, install and use the Carp::Always CPAN module to make your existing warn / die complain with a stack trace:

        $ perl -MCarp::Always program.pl
  9. Demystify regular expressions by installing and using the CPAN module YAPE::Regex::Explain
        # what the heck does /^\s+$/ mean?
        use YAPE::Regex::Explain;
        print YAPE::Regex::Explain->new('/^\s+$/')->explain();
  10. Neaten up your code by installing and using the CPAN script perltidy . Poor indentation can often obscure problems.
  11. Checklist for debugging when using CPAN modules:
     • Check the Bug List by following the module's "View Bugs" link.
     • Is your installed version the latest version? If not, check the change log by following the "Changes" link. Also follow the "Other Tools" link to "Diff" and "Grep" the release.
     • If a module provides status methods, check them in your code as you would check return status of built-in functions:
           use WWW::Mechanize;
           if ($mech->success()) { ... }
What's next? If you are not already doing so, use an editor that understands Perl syntax (such as vim or emacs), a GUI debugger (such as Devel::ptkdb ) or use a full-blown IDE. Lastly, use a version control system so that you can fearlessly make these temporary hacks to your code without trashing the real thing.

For more relevant discussions, refer to the initial Meditation post: RFC: Basic debugging checklist

Updated: Sep 8, 2009: Added CPAN Diff/Grep tip.
Updated: Jan 11, 2011: Added Carp::Always.

Bloodnok (Vicar) on Feb 23, 2009 at 01:19 UTC

Re: Basic debugging checklist

Damned decent posting :D ... just a couple of suggestions tho'...

Just a thought...

hexcoder (Chaplain) on Jun 21, 2014 at 11:07 UTC

Re: Basic debugging checklist

When debugging warnings from the perl core like Use of uninitialized value ... let the debugger pause right there. Then have a good look at the context that led to this situation and investigate variables and the callstack.

To let the debugger do this automatically I use a debugger customization script:

sub afterinit 
{
    $::SIG{'__WARN__'} = sub {
        my $warning = shift;
        if ( $warning =~ m{\s at \s \S+ \s line \s \d+ \. $}xms ) {
            $DB::single = 1;    # debugger stops here automatically
        }
        warn $warning;
    };
    print "sigwarn handler installed!\n";
    return;
}

Save the content to file .perldb (or perldb.ini on Windows) and place it in the current or in your HOME directory.

The subroutine will be called initially by the debugger and installs a signal handler for all warnings. If the format matches one from the perl core, execution in the debugger is paused by setting $DB::single = 1 .

LanX (Bishop) on Jun 21, 2014 at 12:30 UTC

Re^2: Basic debugging checklist

by LanX (Bishop) on Jun 21, 2014 at 12:30 UTC

hexcoder ++ :)

For further information: afterinit , .perldb and other options are described in:

--> perldebug#Debugger Customization

Anonymous Monk on Oct 03, 2014 at 02:58 UTC

Re: Basic debugging checklist

perl -Mre=debug foo.pl can help you see how perl interprets your regex, much in the way -MO=Deparse does for perl code

Anonymous Monk on Oct 31, 2014 at 07:38 UTC

Re^2: Basic debugging checklist

If you don't quite understand what you're looking at (output of Deparse, Perl syntax), then ppi_dumper can help you look at the right part of the manual. An example:

$ ppi_dumper 2
PPI::Document
  PPI::Statement::Include
    PPI::Token::Word    'use'
    PPI::Token::Whitespace      ' '
    PPI::Token::Word    'constant'
    PPI::Token::Whitespace      ' '
    PPI::Token::Word    'X'
    PPI::Token::Whitespace      ' '
    PPI::Token::Operator        '=>'
    PPI::Token::Whitespace      ' '
    PPI::Token::Number          '1'
    PPI::Token::Operator        '/'
    PPI::Token::Number          '3'
    PPI::Token::Structure       ';'
  PPI::Token::Whitespace        '\n'
  PPI::Token::Whitespace        '\n'

So PPI::Token::Operator tells you "=>" is an operator

If you read perldoc perltoc , then you'll know that operators are documented in perlop

App::PPI::Dumper / ppi_dumper

wxPPIxregexplain.pl

[Nov 08, 2017] In Perl, how can I concisely check if a $variable is defined and contains a non zero length string?

Nov 08, 2017 | stackoverflow.com

Jessica , Sep 26, 2009 at 0:23

I currently use the following Perl to check if a variable is defined and contains text. I have to check defined first to avoid an 'uninitialized value' warning:
if (defined $name && length $name > 0) {
    # do something with $name
}

Is there a better (presumably more concise) way to write this?

brian d foy ,Sep 26, 2009 at 19:27

You often see the check for definedness so you don't have to deal with the warning for using an undef value (and in Perl 5.10 it tells you the offending variable):
 Use of uninitialized value $name in ...

So, to get around this warning, people come up with all sorts of code, and that code starts to look like an important part of the solution rather than the bubble gum and duct tape that it is. Sometimes, it's better to show what you are doing by explicitly turning off the warning that you are trying to avoid:

 {
 no warnings 'uninitialized';

 if( length $name ) {
      ...
      }
 }

In other cases, use some sort of null value instead of the data. With Perl 5.10's defined-or operator , you can give length an explicit empty string (defined, and give back zero length) instead of the variable that will trigger the warning:

 use 5.010;

 if( length( $name // '' ) ) {
      ...
      }

In Perl 5.12, it's a bit easier because length on an undefined value also returns undefined . That might seem like a bit of silliness, but that pleases the mathematician I might have wanted to be. That doesn't issue a warning, which is the reason this question exists.

use 5.012;
use warnings;

my $name;

if( length $name ) { # no warning
    ...
    }

rjbs , Jul 9, 2014 at 17:14

Also, in v5.12 and later, length undef returns undef, instead of warning and returning 0. In boolean context, undef is just as false as 0, so if you're targeting v5.12 or later, you can just write if (length $name) { ... }rjbs Jul 9 '14 at 17:14

Adam Bellaire ,Sep 26, 2009 at 0:44

As mobrule indicates, you could use the following instead for a small savings:
if (defined $name && $name ne '') {
    # do something with $name
}

You could ditch the defined check and get something even shorter, e.g.:

if ($name ne '') {
    # do something with $name
}

But in the case where $name is not defined, although the logic flow will work just as intended, if you are using warnings (and you should be), then you'll get the following admonishment:

Use of uninitialized value in string ne

So, if there's a chance that $name might not be defined, you really do need to check for definedness first and foremost in order to avoid that warning. As Sinan Ünür points out, you can use Scalar::MoreUtils to get code that does exactly that (checks for definedness, then checks for zero length) out of the box, via the empty() method:

use Scalar::MoreUtils qw(empty);
if(not empty($name)) {
    # do something with $name 
}

Sinan Ünür ,Sep 26, 2009 at 10:07

First, since length always returns a non-negative number,
if ( length $name )

and

if ( length $name > 0 )

are equivalent.

If you are OK with replacing an undefined value with an empty string, you can use Perl 5.10's //= operator which assigns the RHS to the LHS unless the LHS is defined:

#!/usr/bin/perl

use feature qw( say );
use strict; use warnings;

my $name;

say 'nonempty' if length($name //= '');
say "'$name'";

Note the absence of warnings about an uninitialized variable as $name is assigned the empty string if it is undefined.

However, if you do not want to depend on 5.10 being installed, use the functions provided by Scalar::MoreUtils . For example, the above can be written as:

#!/usr/bin/perl

use strict; use warnings;

use Scalar::MoreUtils qw( define );

my $name;

print "nonempty\n" if length($name = define $name);
print "'$name'\n";

If you don't want to clobber $name , use default .

DVK , Sep 26, 2009 at 15:19

+1 for "//=" mention (how'd I know that'd be Sinan's answer :) – DVK Sep 26 '09 at 15:19

brian d foy , Sep 26, 2009 at 19:16

I wouldn't use //= in this case since it changes the data as a side effect. Instead, use the slightly shorter length( $name // '' ) . – brian d foy Sep 26 '09 at 19:16

Sinan Ünür , Sep 26, 2009 at 20:15

@brian d'foy I think it depends on what is being done in the function. – Sinan Ünür Sep 26 '09 at 20:15

Chris Lutz , Sep 28, 2009 at 6:11

+1 The // and //= operators are possibly the most useful specialized operators in existence. – Chris Lutz Sep 28 '09 at 6:11

brian d foy , Aug 29, 2015 at 1:44

As @rjbs pointed out in my answer, with v5.12 and later length can now return something that is not a number (but not NaN ;) – brian d foy Aug 29 '15 at 1:44

Gaurav ,Sep 28, 2009 at 6:29

In cases where I don't care whether the variable is undef or equal to '' , I usually summarize it as:
$name = "" unless defined $name;
if($name ne '') {
  # do something with $name
}

Chris Lutz , Sep 28, 2009 at 6:31

In Perl 5.10, this can be shortened to $name //= ""; which is exactly what Sinan posted. – Chris Lutz Sep 28 '09 at 6:31

RET , Sep 28, 2009 at 7:16

And even if you don't have perl 5.10, you can still write $name ||= "";RET Sep 28 '09 at 7:16

brian d foy , Sep 29, 2009 at 4:25

@RET: you can't use the || operator here since it replaces the string '0' with ''. You have to check if it is defined, not true. – brian d foy Sep 29 '09 at 4:25
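A short sketch of the difference brian is pointing out (assuming a Perl with the // operator, 5.10+):

use strict;
use warnings;
use feature 'say';

my $name = '0';                 # defined, but false

my $with_or  = $name || '';     # || tests truth, so '0' is replaced by ''
my $with_dor = $name // '';     # // tests definedness, so '0' survives

say "with ||: '$with_or'";      # with ||: ''
say "with //: '$with_dor'";     # with //: '0'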

Gaurav , Sep 29, 2009 at 4:27

Chris, RET: Yup, I know. I was specifically trying to suggest that if Jessica was not concerned with the difference between undef and "" , she should just change one to the other and use a single test. This won't work in the general case, for which the other solutions posted are way better, but in this specific case leads to neat code. Should I rephrase my answer to make this clearer? – Gaurav Sep 29 '09 at 4:27

mob ,Sep 26, 2009 at 0:29

You could say
 $name ne ""

instead of

 length $name > 0

brian d foy , Sep 26, 2009 at 19:20

This will still give you a warning. The reason people check definedness first is to avoid the 'uninitialized value' warning. – brian d foy Sep 26 '09 at 19:20

daotoad ,Sep 26, 2009 at 1:33

It isn't always possible to do repetitive things in a simple and elegant way.

Just do what you always do when you have common code that gets replicated across many projects:

Search CPAN; someone may already have the code for you. For this issue I found Scalar::MoreUtils .

If you don't find something you like on CPAN, make a module and put the code in a subroutine:

package My::String::Util;
use strict;
use warnings;
require Exporter;
our @ISA = qw( Exporter );
our @EXPORT = ();
our @EXPORT_OK = qw( is_nonempty);

use Carp  qw(croak);

sub is_nonempty ($) {
    croak "is_nonempty() requires an argument" 
        unless @_ == 1;

    no warnings 'uninitialized';

    return( defined $_[0] and length $_[0] != 0 );
}

1;

=head1 BOILERPLATE POD

blah blah blah

=head3 is_nonempty

Returns true if the argument is defined and has non-zero length.    

More boilerplate POD.

=cut

Then in your code call it:

use My::String::Util qw( is_nonempty );

if ( is_nonempty $name ) {
    # do something with $name
}

Or if you object to prototypes and don't object to the extra parens, skip the prototype in the module, and call it like: is_nonempty($name) .

Zoran Simic , Sep 26, 2009 at 4:03

Isn't this like using a hammer to kill a fly? – Zoran Simic Sep 26 '09 at 4:03

Sinan Ünür , Sep 26, 2009 at 7:05

@Zoran No. Factoring code out like this beats having a complicated condition replicated in many different places. That would be like using pinpricks to kill an elephant. @daotoad: I think you should shorten your answer to emphasize the use of Scalar::MoreUtils . – Sinan Ünür Sep 26 '09 at 7:05

Adam Bellaire , Sep 26, 2009 at 11:45

@Zoran: Scalar::MoreUtils is a very lightweight module with no dependencies. Its semantics are also well known. Unless you are allergic to CPAN, there's not much reason to avoid using it. – Adam Bellaire Sep 26 '09 at 11:45

daotoad , Sep 28, 2009 at 7:35

@Chris Lutz, yeah, I shouldn't. But prototypes are semi-broken--there are easy ways to break prototype enforcement. For example, crappy and/or outdated tutorials continue to encourage the use of the & sigil when calling functions. So I tend not to rely on prototypes to do all the work. I suppose I could add "and quit using the & sigil on sub calls unless you really mean it" to the error message. – daotoad Sep 28 '09 at 7:35

brian d foy , Sep 28, 2009 at 18:52

It's easier to think about prototypes as hints to the perl compiler so it knows how to parse something. They aren't there to validate arguments. They may be broken in terms of people's expectations, but so many things are. :) – brian d foy Sep 28 '09 at 18:52

Puriney ,Jan 28, 2014 at 22:02

my %hash ; 
$hash{"what"} = "What"; 
$hash{"how"} = "How"; 
my $word = $hash{"now"}; 
print $word; 
if (! $word) {
    print "Catch Ya\n"; 
}
else {
    print $word ; 
}

Dave Sherohman ,Sep 26, 2009 at 10:04

How about
if (length ($name || '')) {
  # do something with $name
}

This isn't quite equivalent to your original version, as it will also return false if $name is the numeric value 0 or the string '0' , but will behave the same in all other cases.

In perl 5.10 (or later), the appropriate approach would be to use the defined-or operator instead:

use feature ':5.10';
if (length ($name // '')) {
  # do something with $name
}

This will decide what to get the length of based on whether $name is defined, rather than whether it's true, so 0/ '0' will handle those cases correctly, but it requires a more recent version of perl than many people have available.

brian d foy , Sep 26, 2009 at 19:17

Why lead off with a broken solution only to say that it is broken? – brian d foy Sep 26 '09 at 19:17

Dave Sherohman , Sep 26, 2009 at 22:20

Because, as I also mentioned, 5.10 is "a more recent version of perl than many people have available." YMMV, but "this is a 99% solution that I know you can use, but there's a better one that maybe you can use, maybe you can't" seems better to me than "here's the perfect solution, but you probably can't use it, so here's an alternative you can probably get by with as a fallback." – Dave Sherohman Sep 26 '09 at 22:20

brian d foy , Sep 28, 2009 at 6:09

Even with earlier perls you can have a working solution instead of a broken one. – brian d foy Sep 28 '09 at 6:09

Joseph ,Sep 29, 2009 at 13:23

if ($name )
{
    #since undef and '' both evaluate to false 
    #this should work only when string is defined and non-empty...
    #unless you're expecting someting like $name="0" which is false.
    #notice though that $name="00" is not false
}

user180804 , Sep 29, 2009 at 15:20

Unfortunately this will be false when $name = 0; – user180804 Sep 29 '09 at 15:20

Joseph ,Sep 9 , 2012 at 3:39

yep, you're right – Joseph Oct 2 '09 at 9:07

[Nov 08, 2017] Skipping null/empty fields caught by split()

Jun 01, 2019 | bytes.com

sicarie

I am attempting to parse a CSV, but am not allowed to install the CSV parsing module because of "security reasons" (what a joke), so I'm attempting to use 'split' to break up a comma-delimited file.

My issue is that as soon as an "empty" field comes up (two commas in a row), split seems to think the line is done and goes to the next one.

Everything I've read online says that split will return a null field, but I don't know how to get it to go to the next element and not just skip to the next line.

while (<INFILE>) {
    # use 'split' to avoid module-dependent functionality
    # split line on commas, OS info in [3] (4th group, but
    # counting starts first element at 0)
    # line = <textonly>,<text+num>,<ip>,<whatIwant>,
    chomp($_);
    @a_splitLine = split (/,/, $_);
    # move OS info out of string to avoid accidentally
    # parsing over stuff
    $s_info = $a_splitLine[3];
Could anyone see either a better way to accomplish what I'm trying to do, or help get split to capture all the elements?

I was thinking I could run a simple substitution before parsing of a known string (something ridiculous that'll never show up in my data - like &^%$#), then split, and then when printing, if that matches the current item, just print some sort of whitespace, but that doesn't sound like the best method to me - like I'm overcomplicating it.

Jun 19 '09 # 1

RonB
My issue is that as soon as an "empty" field comes up (two commas in a row), split seems to think the line is done and goes to the next one .
No it doesn't. You have a flawed impression of what's happening.

C:\TEMP>type test.pl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $str = 'a,,,b,,,,6,,';
my @fields = split /,/, $str;
print Dumper @fields;
C:\TEMP>test.pl
$VAR1 = 'a';
$VAR2 = '';
$VAR3 = '';
$VAR4 = 'b';
$VAR5 = '';
$VAR6 = '';
$VAR7 = '';
$VAR8 = '6';
C:\TEMP>perldoc -f split
split /PATTERN/,EXPR,LIMIT
split /PATTERN/,EXPR
split /PATTERN/
split   Splits the string EXPR into a list of strings and returns that
        list. By default, empty leading fields are preserved, and empty
        trailing ones are deleted. (If all fields are empty, they are
        considered to be trailing.)
        ...
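As the quoted documentation notes, trailing empty fields are dropped only because the default LIMIT is zero; a negative LIMIT preserves them, which matters for naive CSV splitting. A small sketch (not from the thread):

my $str = 'a,,,b,,,,6,,';
my @all = split /,/, $str, -1;    # LIMIT of -1 preserves trailing empty fields
print scalar(@all), " fields\n";  # 10 fields, the last two being ''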
Jun 19 '09 # 2

sicarie

Interesting, so then how would I access the b or the 6?

#!/bin/perl
use strict;
use warnings;
use Data::Dumper;
my $str = 'a,,,b,,,,6,,';
my @fields = split /,/, $str;
my $n = 0;
print Dumper @fields;
while ($fields[$n]) {
    print "$n: $fields[$n]\n";
    $n++;
}
print "done!\n";
$ ./splitTest.pl
$VAR1 = 'a';
$VAR2 = '';
$VAR3 = '';
$VAR4 = 'b';
$VAR5 = '';
$VAR6 = '';
$VAR7 = '';
$VAR8 = '6';
0: a
done!
In the above, my attempt to print with a while loop stops as soon as the first empty set is reached. I'm guessing I'd have to check each one to see which are valid and which are not, but what am I looking for - null?

Jun 19 '09 # 3

RonB

If you know which field/index you want, then simply print that field.

If you want/need to loop over the array elements, then use a for or foreach loop, not a while loop.

for my $i ( 0..$#fields ) {
    # only print fields that have a value
    print "index $i = '$fields[$i]'\n" if length $fields[$i];
}
Jun 19 '09 # 4

numberwhun

I have to agree with Ron. Since this is a csv file, you should already know which field is what. All you would have to do is reference it by its index. Otherwise, you can use the code above to iterate through each one and pull out the variables with values other than null.

Regards,

Jeff

Jun 20 '09 # 5

sicarie

Cool, thanks. I am really only interested in one of those fields, but then have to make sure once I edit that field, I re-append all the others back on, so I will play around with that.

Thanks again!

[Nov 08, 2017] A tip from Perl Debugged

sanctum.geek.nz

There's an alternative way of option parsing using the core module Getopt::Long:

use Getopt::Long;
use constant WEB => 1;
use constant SQL => 2;
use constant REG => 4;

my %dbg_flag = (
   'WEB' => WEB,
   'SQL' => SQL,
   'REG' => REG);
my %dbg;

GetOptions(\%dbg, "D=s@");
my $DEBUG = 0;                     # declare first; "my ... unless" has undefined behavior
$DEBUG = WEB | REG unless $dbg{D};

$DEBUG |= $dbg_flag{$_}
       || die "Unknown debug flag $_\n"
   foreach @{$dbg{D}};

(We've shown only the option-parsing part.) This has a slightly different interface; instead of separating multiple values with commas, we must repeat the -D flag:

%  whizzbang.pl -D WEB -D REG 
				

(We could also say "-D=WEB -D=REG".)

Getopt::Long permits many more choices than this, of course. If you have a complex program (particularly one worked on by multiple programmers), look to it for support for the kind of debug flag setting interface you'll find useful. Brad Appleton says

Sometimes I will have a debug flag per each important subsystem of a large Perl program encompassing many modules. Each one will have an integer value. Sometimes I will have a debug argument along the lines of

-debug Pkg1=f1,f2,... -debug Pkg2=f3,f4,...

If the value for Pkg1 is an integer, it sets a package-wide debug level/flag. Otherwise, it says which specific functions to debug in the package.
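Here is a minimal sketch (not from the book) of how an interface like "-debug Pkg1=f1,f2 -debug Pkg2=3" could be parsed with Getopt::Long; the hash names are invented:

use strict;
use warnings;
use Getopt::Long;

my %debug_opt;                               # raw "Pkg=value" pairs from the command line
GetOptions( 'debug=s' => \%debug_opt ) or die "Bad options\n";

my ( %dbg_level, %dbg_funcs );               # invented names for the parsed results
for my $pkg ( keys %debug_opt ) {
    my $val = $debug_opt{$pkg};
    if ( $val =~ /^\d+$/ ) {
        $dbg_level{$pkg} = $val;                                 # package-wide debug level
    }
    else {
        $dbg_funcs{$pkg} = { map { $_ => 1 } split /,/, $val };  # specific functions to debug
    }
}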

[Nov 08, 2017] The Perl Debugger Starting

From the book Perl Debugged By: Peter Scott; Ed Wright (2001)
Nov 08, 2017 | www.amazon.com
7.2. Starting

The -d command line option makes your script run under the debugger. You can either add it to the options in the #! line at the beginning of your script, or you can override the options by explicitly running the script through perl. So for example, if wombat.pl currently has the -w option set, you can either change its first line to

#!/usr/bin/perl -wd

or you can type

% perl -wd wombat.pl
                                

to debug it without having to change the script. Unlike some debuggers, with this one you supply arguments to the program on the command line, not as part of a debugger command; for example:

% perl -wd wombat.pl kangaroo platypus wallaby
                                

The debugger will announce itself and provide a prompt:

Loading DB routines from perl5db.pl version 1.07
Emacs support available.

Enter h or `h h' for help.

main::(wombat.pl:1):   my $marsupial = shift;
  DB<1>

From now on we will elide everything before the first prompt (and the code on which it is stopped) when reproducing debugger sessions.

To begin, let's look at some simple commands. The very first of interest, of course, is h for help. The output from this is several screens long, which gives us an opportunity to mention an option we can apply to all commands: put a vertical bar ( | ) before any command and it will run the output through your pager (the program that prints things one screen at a time, waiting for you to tell it when to continue -- more or less more, or less).

7.2.1. Watch the Code Execute: s, n, r

Enter this simple program into debug.pl:

#!/usr/local/bin/perl -w
use strict;

my @parole = qw(Salutations Hello Hey);

print_line(@parole);
print "Done\n";

# Our subroutine accepts an array, then prints the
# value of each element appended to "Perl World."
sub print_line
   {
   my @parole = @_;
   foreach (@parole)
      {
      print "$_ Perl World\n";
      }
   }

Now run it under the debugger and step through the program one statement at a time using the n (next) command:

% perl -dw debug.pl
main::(debug.pl:4):     my @parole = qw(Salutations Hello Hey);
  DB<1> n
main::(debug.pl:6):     &print_line(@parole);
  DB<1> n
Salutations Perl World
Hello Perl World
Hey Perl World
main::(debug.pl:7):     print "Done\n";
  DB<1> n
Done
Debugged program terminated.  Use q to quit or R to restart, use
O inhibit_exit to avoid stopping after program termination, h
q, h R or h O to get additional info.
  DB<1> q
                                        

Before the prompt, the debugger prints the source line(s) containing the statement to be executed in the next step. (If you have more than one executable statement in a line, it prints the line each time you type n until it's done executing all the statements on the line.) Notice the output of our program going to the terminal is intermingled with the debugger text. Notice also when we called print_line(@parole) , we executed all the statements in the subroutine before we got another prompt.

(From now on, we won't reproduce the optimistic Debugged program terminated blurb printed by the debugger.)

Suppose we wanted to step through the code inside subroutines like print_line . That's the reason for s (single step). Let's see how it's used, along with another handy stepping command, r (return):

% perl -d debug.pl
main::(debug.pl:4):     my @parole = qw(Salutations Hello Hey);
  DB<1> n
main::(debug.pl:6):     print_line(@parole);
  DB<1> s
main::print_line(debug.pl:13):     my @parole = @_;
  DB<1> n
main::print_line(debug.pl:14):     foreach (@parole)
main::print_line(debug.pl:15):        {
  DB<1>
main::print_line(debug.pl:16):        print "$_ Perl World\n";
  DB<1> r
Salutations Perl World
Hello Perl World
Hey Perl World
void context return from main::print_line
main::(debug.pl:7):     print "Done\n";
  DB<1> s
Done
Debugged program terminated.

The effect of r is to execute all the code up to the end of the current subroutine. (All these command letters are copied from existing popular Unix command line debuggers and are mnemonic -- next, step, return). In addition, note that just hitting carriage return as a command repeats the last n or s command (and if there hasn't been one yet, it does nothing).

7.2.2. Examining Variables: p, x, V

Stepping through code is dandy, but how do we check our variables' values? Use either p expression to print the result of the expression (which is equivalent to printing to the filehandle $DB::OUT , so expression is put in list context) or x variable , which prints a variable in a pleasantly formatted form, following references. Once again, with the simple program:

% perl -wd debug.pl
main::(debug.pl:4):     my @parole = qw(Salutations Hello Hey);
  DB<1> p @parole

  DB<2> n
main::(debug.pl:6):     print_line(@parole);
  DB<2> p @parole
SalutationsHelloHey
  DB<3> x @parole
0  'Salutations'
1  'Hello'
2  'Hey'

In the first command, we instruct the debugger to print the value of @parole . However, the @parole assignment has yet to execute, so nothing comes out. Step past the assignment and then print the value with p ; we see the current state of the array in a list format. Print the array value with x , and we see the individual elements formatted with array indices (a pretty print).

This might look familiar if you've been playing with the Data::Dumper module we referenced in Chapter 5 . In fact, the output of x is intentionally very similar.

You can see all of the dynamic variables in a given package (default: main:: ) with the V command. This isn't as useful as it sounds because, unlike the x or p commands, it won't show you any lexical variables (which you declared with my ). Yet you want to make as many of your variables as possible lexical ones (see Perl of Wisdom #8). Unfortunately there is no (easy) way to dump out all the lexical variables in a package, so you're reduced to printing the ones you know about.

A common problem is running off the end of the program and getting the Debugged program terminated message. At that point, all your variables have been destroyed. If you want to inspect the state of variables after the last line of your program has executed, add a dummy line (a 1 by itself will work) so that you can set a breakpoint on it.
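
A minimal sketch of that trick (the final computation here is hypothetical, just to give the breakpoint something to inspect):

#!/usr/bin/perl
use strict;
use warnings;

my %seen;
$seen{$_}++ for qw(a b a);   # hypothetical final computation

1;   # dummy statement: set a breakpoint here (b on this line)
     # so you can inspect %seen before the program exits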

Tip

When examining a hash, examine a reference to it instead. This lets the x command see the datatype you're inspecting instead of being handed the list the hash evaluates to, and it can format it more appealingly:

  DB<1> %h = (Craig => 'Stirling', Sharron => 'Macready', Richard => 'Barrett');
  DB<2> x %h
0  'Sharron'
1  'Macready'
2  'Craig'
3  'Stirling'
4  'Richard'
5  'Barrett'
  DB<3> x \%h
0  HASH(0x8330d5c)
   'Craig' => 'Stirling'
   'Richard' => 'Barrett'
   'Sharron' => 'Macready'

Examine references to hashes instead of the hashes themselves in the debugger to get well-formatted output.


7.2.3. Examining Source: l, -, w, .

Sometimes you want more of the context of your program than just the current line. The following commands show you parts of your source code:

l         List successive windows of source code, starting from the current line about to be executed.
l x+y     List y+1 lines of source starting from line x.
l x-y     List source lines x through y.
-         List successive windows of source code before the current line.
w         List a window of lines around the current line.
w line    List a window of lines around line.
.         Reset the pointer for window listings to the current line.

Source lines that are breakable (i.e. can have a breakpoint inserted before them -- see the following section) have a colon after the line number.

7.2.4. Playing in the Sandbox

Since the debugger is a full-fledged Perl environment, you can type in Perl code on the fly to examine its effects under the debugger; [1] some people do this as a way of testing code quickly without having to enter it in a script or type in everything perfectly before hitting end-of-file. (You just saw us do this at the end of section 7.2.2 .)

[1] So, you might wonder, how would you enter Perl code which happened to look like a debugger command (because you'd defined a subroutine l , perhaps)? In versions of Perl prior to 5.6.0, if you entered leading white space before text, the debugger assumed it must be Perl code and not a debugger command -- so you had to be careful not to hit the space bar by accident before typing a debugger command. This is no longer the case as of 5.6.0.

Type perl -de 0 to enter this environment. [2] Let's use this as a sandbox for testing Perl constructs:

[2] There are many expressions other than 0 that would work equally well, of course. Perl just needs something innocuous to run.

% perl -de 0
DB<1> $_ = 'What_do*parens-in=split+do%again?';
  DB<2> @a = split /(\W+)/;
  DB<3> x @a
0  'What_do'
1  '*'
2  'parens'
3  '-'
4  'in'
5  '='
6  'split'
7  '+'
8  'do'
9  '%'
10  'again'
11  '?'

You can even use this feature to change the values of variables in a program you are debugging, which can be a legitimate strategy for seeing how your program behaves under different circumstances. If the way your program constructs the value of some internal variable is complex, and it would require numerous changes to the input to have it form the variable differently, then a good way of playing "What if?" is to stop the program at the right place in the debugger and change the value by hand. How would we stop it? Let's see.

7.2.5. Breakpointing: c, b, L

An important feature of a debugger is the ability to allow your program to continue executing until some condition is met. The most common such condition is the arrival of the debugger at a particular line in your source. You can tell the Perl debugger to run until a particular line number with the c (for continue) command:

main::(debug.pl:4):     my @parole = qw(Salutations Hello Hey);
  DB<1> c 16
main::print_line(debug.pl:16):        print "$_ Perl World\n";
  DB<2>

What the debugger actually did was set a one-time breakpoint at line 16 and then executed your code until it got there. If it had hit another breakpoint earlier, it would have stopped there first.

So what's a breakpoint? It's a marker set by you immediately before a line of code, invisible to anyone but the perl debugger, which causes it to halt when it gets there and return control to you with a debugger prompt. If you have a breakpoint set at a line of code that gets printed out with one of the source examination commands listed earlier, you'll see a b next to it. It's analogous to putting a horse pill in the trail of bread crumbs the mouse follows so the mouse gets indigestion and stops to take a breather (we really have to give up this metaphor soon).

You set a breakpoint with the b command; the most useful forms are b line or b subroutine to set a breakpoint either at a given line number or immediately upon entering a subroutine. To run until the next breakpoint, type c . To delete a breakpoint, use d line to delete the breakpoint at line number line or D to delete all breakpoints.

In certain situations you won't want to break the next time you hit a particular breakpoint, but only when some condition is true, like every hundredth time through a loop. You can follow the line number given to b with an extra argument specifying a condition that must be true before the debugger will stop at the breakpoint. For example,

main::(debug.pl:4):   my @parole = qw(Salutations Hello Hey);
  DB<1> l
4==>    my @parole = qw(Salutations Hello Hey);
5
6:      print_line(@parole);
7:      print "Done\n";
8
9       # Our subroutine which accepts an array, then prints
10      # the value of each element appended to "Perl World."
11      sub print_line
12         {
13:        my @parole = @_;
  DB<1> l
14:        foreach (@parole)
15            {
16:           print "$_ Perl World\n";
17            }
18         }
  DB<1> b 16 /Hey/
  DB<2> c
Salutations Perl World
Hello Perl World
main::print_line(debug.pl:16):        print "$_ Perl World\n";
  DB<2> p
Hey


                                        

Notice that we've demonstrated several things here: the source listing command l , the conditional breakpoint with the criterion that $_ must match /Hey/ , and that $_ is the default variable for the p command (because p just calls print ).

The capability of the debugger to insert code that gets executed in the context of the program being debugged does not exist in compiled languages and is a significant example of the kind of thing that is possible in a language as well designed as Perl.

The command L lists all breakpoints.

7.2.6. Taking Action: a, A

An even more advanced use of the facility to execute arbitrary code in the debugger is the action capability. With the a command (syntax: a line code ), you can specify code to be executed just before a line would be executed. (If a breakpoint is set for that line, the action executes first; then you get the debugger prompt.) The action can be arbitrarily complicated and, unlike this facility in debuggers for compiled languages, lets you reach into the program itself:

main::(debug.pl:4):     my @parole = qw(Salutations Hello Hey);
  DB<1> a 16 s/Hey/Greetings/
  DB<2> c
Salutations Perl World
Hello Perl World
Greetings Perl World
Done
Debugged program terminated.

L also lists any actions you have created. Delete all the installed actions with the A command. This process is commonly used to insert tracing code on the fly. For example, suppose you have a program executing a loop containing way too much code to step through, but you want to monitor the state of certain variables each time it goes around the loop. You might want to confirm what's actually being ordered in a shopping cart application test (looking at just a fragment of an imaginary such application here):

  DB<1> l 79-91
79:     while (my $item = shift @shopping_basket)
80         {
81:        if ($item->in_stock)
82            {
83:           $inventory->remove($item);
84:           $order->add($item);
85            }
86         else
87            {
88:           $order->back_order($item);
89:           $inventory->order($item);
90            }
91         }
  DB<2> a 81 printf "Item: %25s, Cost: %5.2f\n", $item->name, $item->cost
  DB<3> c 92
Item:          Forbidden Planet, Cost: 24.50
Item:      Kentucky Fried Movie, Cost: 29.95
Item:                Eraserhead, Cost: 14.75
main::(cart.pl:92):    $customer->charge($order->total);
  DB<3>


                                        

7.2.7. Watch It: W

Suppose you want to break on a condition that is dictated not by a particular line of code but by a change in a particular variable. This is called a watchpoint, and in Perl you set it with the W command followed by the name of a variable. [3]

[3] In fact, Perl can monitor anything that evaluates to an lvalue, so you can watch just specific array or hash entries, for example.

Let's say you're reading a file of telephone numbers and to whom they belong into a hash, and you want to stop once you've read in the number 555-1212 to inspect the next input line before going on to check other things:

main::(foo:1):  my %phone;
  DB<1> l
                                                1-5
1==>    my %phone;
2:      while (<>) {
3:        my ($k, $v) = split;
4:        $phone{$k} = $v;
5       }
  DB<2> W $phone{'555-1212'}
  DB<3> c
Watchpoint 0:   $phone{'555-1212'} changed:
    old value:  undef
    new value:  'Information'
main::(foo:2):  while (<>) {
  DB<3> n
main::(foo:3):    my ($k, $v) = split;
  DB<3> p
555-1234 Weather

Delete all watchpoints with a blank W command.

7.2.8. Trace: t

The debugger's t command provides a trace mode for those instances that require a complete trace of program execution. Let's run the program with trace mode active:

% perl -wd debug.pl
main::(debug.pl:4):     my @parole = qw(Salutations Hello Hey);
  DB<1> t
Trace = on
  DB<1> n
main::(debug.pl:6):     print_line(@parole);
  DB<1> n
main::print_line(debug.pl:13):     my @parole = @_;
main::print_line(debug.pl:14):     foreach (@parole)
main::print_line(debug.pl:15):        {
main::print_line(debug.pl:16):        print "$_ Perl World\n";
Salutations Perl World
main::print_line(debug.pl:14):     foreach (@parole)
main::print_line(debug.pl:15):        {
main::print_line(debug.pl:16):        print "$_ Perl World\n";
Hello Perl World
main::print_line(debug.pl:14):     foreach (@parole)
main::print_line(debug.pl:15):        {
main::print_line(debug.pl:16):        print "$_ Perl World\n";
Hey Perl World
main::print_line(debug.pl:14):     foreach (@parole)
main::print_line(debug.pl:15):        {
main::(debug.pl:7):     print "Done\n";


                                        

Notice that trace mode causes the debugger to print each statement as it is executed once execution enters the print_line subroutine, so you can follow every pass through the foreach loop.

7.2.9. Programmatic Interaction with the Debugger

You can put code in your program to force a call to the debugger at a particular point. For instance, suppose you're processing a long input file line by line and you want to start tracing when it reaches a particular line. You could set a conditional breakpoint, but you could also extend the semantics of your input by creating "enable debugger" lines. Consider the following code:

while (<INPUT>)
   {
   $DB::trace = 1, next if /debug/;
   $DB::trace = 0, next if /nodebug/;
   # more code
   }

When run under the debugger, this enables tracing when the loop encounters an input line containing "debug" and ceases tracing upon reading one containing " nodebug ". You can even force the debugger to breakpoint by setting the variable $DB::single to 1 , which also happens to provide a way you can debug code in BEGIN blocks (which otherwise are executed before control is given to the debugger).
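
For instance, a minimal sketch of debugging a BEGIN block this way (meant to be run as perl -d script.pl; the variable name is just an illustration):

#!/usr/bin/perl
use strict;
use warnings;

BEGIN {
    # Under "perl -d", assigning 1 to $DB::single makes the debugger
    # stop at the next statement, even though BEGIN blocks normally
    # run before the debugger hands you a prompt.
    $DB::single = 1;
    my $compile_time_setting = 42;   # illustrative value to inspect
}

print "run time\n";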

7.2.10. Optimization

Although the Perl debugger displays lines of code as it runs, it's important to note that these are not what actually executes. Perl internally executes its compiled opcode tree, which doesn't always have a contiguous mapping to the lines of code you typed, due to the processes of compilation and optimization. If you have used interactive debuggers on C code in the past, you may be familiar with this process.

When debugging C programs on VAX/VMS, it was common for me to want to examine an important variable only to get the message that the variable was not in memory and had been "optimized away."


Perl has an optimizer to do as good a job as it can -- in the short amount of time people will wait for compilation -- of taking shortcuts in the code you've given it. For instance, in a process called constant folding, it does things like build a single string in places where you concatenate various constant strings together so that the concatenation operator need not be called at run-time.
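
You can watch constant folding happen with the B::Deparse backend, which prints roughly what perl compiled; the exact output may vary slightly between perl versions, but the folded string should be visible:

% perl -MO=Deparse -e 'print "Hello" . " " . "Perl World" . "\n";'
print "Hello Perl World\n";
-e syntax OK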

The optimization process also means that perl may execute opcodes in an order different from the order of statements in your program, and therefore when the debugger displays the current statement, you may see it jump around oddly. As recently as version 5.004_04 of perl, this could be observed in a program like the following:

1   my @a = qw(one two three);
2   while ($_ = pop @a)
3      {
4      print "$_\n";
5      }
6   1;

See what happens when we step through this, again using perl 5.004_04 or earlier:

main::(while.pl:1):   my @a = qw(one two three);
  DB<1> n
main::(while.pl:6):   1;
  DB<1>
main::(while.pl:4):     print "$_\n";
  DB<1>
three
main::(while.pl:2):   while ($_ = pop @a)
  DB<1>
main::(while.pl:4):     print "$_\n";
  DB<1>
two

In fact, if we set a breakpoint for line 6 and ran to it, we'd get there before the loop executed at all. So it's important to realize that under some circumstances, what the debugger tells you about where you are can be confusing. If this inconveniences you, upgrade.

7.2.11. Another "Gotcha"

If you set a lexical variable as the last statement of a block, there is no way to see what it was set to if the block exits to a scope that doesn't include the lexical. Why would code do that? In a word, closures. For example,

{                  # Start a closure-enclosing block
my $spam_type;     # This lexical will outlive its block
sub type_spam
   {
   # ...
   $spam_type = $spam_types[complex_func()];
   }
}

In this case, either type_spam or some other subroutine in the closure block would have a good reason for seeing the last value of $spam_type . But if you're stepping through in the debugger, you won't see the value it gets set to on the last line because, after the statement executes, the debugger pops out to a scope where $spam_type is not in scope (unless type_spam() was called from within the enclosing block). Unfortunately, in this case, if the result of the function is not used by the caller, you're out of luck.

[Nov 06, 2017] scope - What is the difference between my and our in Perl - Stack Overflow

Notable quotes:
"... use strict "vars" ..."
Nov 06, 2017 | stackoverflow.com

Fran Corpier , May 20, 2009 at 2:22

Great question: How does our differ from my and what does our do?

In Summary:

Available since Perl 5, my is a way to declare non-package variables that are private, new, non-global, and separate from any package -- the variable cannot be accessed in the form $package_name::variable.

On the other hand, our variables are package variables, and thus automatically global; they are definitely not private, not necessarily new, and can be accessed outside the lexical scope of the declaration with the qualified name $package_name::variable.

Declaring a variable with our allows you to predeclare variables in order to use them under use strict without getting typo warnings or compile-time errors. Since Perl 5.6, it has replaced the obsolete use vars , which was only file-scoped, and not lexically scoped as is our .

For example, the formal, qualified name for variable $x inside package main is $main::x . Declaring our $x allows you to use the bare $x variable without penalty (i.e., without a resulting error), in the scope of the declaration, when the script uses use strict or use strict "vars" . The scope might be one, or two, or more packages, or one small block.
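
A minimal sketch of that behavior (the package and variable names are invented for illustration):

use strict;
use warnings;

package Counter;
our $count = 0;          # declares $Counter::count and a lexical alias $count

sub bump { $count++ }    # bare $count is fine under strict, thanks to "our"

package main;
Counter::bump();
print "$Counter::count\n";   # 1 -- the fully qualified name still works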

Nathan Fellman , Aug 23, 2009 at 13:51

So how does our differ from local? Nathan Fellman Aug 23 '09 at 13:51

ikegami , Sep 21, 2011 at 16:57

@Nathan Fellman, local doesn't create variables. It doesn't relate to my and our at all. local temporarily backs up the value of a variable and clears its current value. ikegami Sep 21 '11 at 16:57

ikegami , Nov 20, 2016 at 1:15

our variables are not package variables. They aren't globally-scoped, but lexically-scoped variables just like my variables. You can see that in the following program: package Foo; our $x = 123; package Bar; say $x; . If you want to "declare" a package variable, you need to use use vars qw( $x ); . our $x; declares a lexically-scoped variable that is aliased to the same-named variable in the package in which the our was compiled. ikegami Nov 20 '16 at 1:15
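
That point is easy to verify with a small sketch (say requires perl 5.10 or later):

use strict;
use warnings;
use feature 'say';

package Foo;
our $x = 123;    # creates $Foo::x and a lexical alias $x for the rest of the file

package Bar;
say $x;          # prints 123 -- $x still refers to $Foo::x, not $Bar::x
say $Foo::x;     # 123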

bubaker , May 10, 2009 at 14:00

The PerlMonks and PerlDoc links from cartman and Olafur are a great reference - below is my crack at a summary:

my variables are lexically scoped within a single block defined by {} or within the same file if not in {} s. They are not accessible from packages/subroutines defined outside of the same lexical scope / block.

our variables are scoped within a package/file and accessible from any code that uses or requires that package/file - name conflicts are resolved between packages by prepending the appropriate namespace.

Just to round it out, local variables are "dynamically" scoped, differing from my variables in that they are also accessible from subroutines called within the same block.
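
A small sketch contrasting the three behaviors described above (the subroutine and variable names are invented for illustration):

use strict;
use warnings;

our $x = "global";

sub show { print "x is $x\n" }     # reads the package variable $main::x

sub with_local {
    local $x = "localized";        # temporarily replaces the value of $main::x
    show();                        # prints "x is localized" -- dynamic scope
}

sub with_my {
    my $x = "lexical";             # a brand-new private variable
    show();                        # prints "x is global" -- show() never sees it
}

with_local();
with_my();
show();                            # prints "x is global" -- local value restored

The local value is visible to the subroutine called from within its scope and restored afterwards, while the my variable is invisible outside its own block.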

Georg , Oct 1, 2016 at 6:41

+1 for " my variables are lexically scoped [...] within the same file if not in {} s". That was useful for me, thanks. Georg Oct 1 '16 at 6:41

FMc , Jun 13, 2009 at 16:11

An example:
use strict;

for (1 .. 2){
    # Both variables are lexically scoped to the block.
    our ($o);  # Belongs to 'main' package.
    my  ($m);  # Does not belong to a package.

    # The variables differ with respect to newness.
    $o ++;
    $m ++;
    print __PACKAGE__, " >> o=$o m=$m\n";  # $m is always 1.

    # The package has changed, but we still have direct,
    # unqualified access to both variables, because the
    # lexical scope has not changed.
    package Fubb;
    print __PACKAGE__, " >> o=$o m=$m\n";
}

# The our() and my() variables differ with respect to privacy.
# We can still access the variable declared with our(), provided
# that we fully qualify its name, but the variable declared
# with my() is unavailable.
print __PACKAGE__, " >> main::o=$main::o\n";  # 2
print __PACKAGE__, " >> main::m=$main::m\n";  # Undefined.

# Attempts to access the variables directly won't compile.
# print __PACKAGE__, " >> o=$o\n";
# print __PACKAGE__, " >> m=$m\n";

# Variables declared with use vars() are like those declared
# with our(): belong to a package; not private; and not new.
# However, their scoping is package-based rather than lexical.
for (1 .. 9){
    use vars qw($uv);
    $uv ++;
}

# Even though we are outside the lexical scope where the
# use vars() variable was declared, we have direct access
# because the package has not changed.
print __PACKAGE__, " >> uv=$uv\n";

# And we can access it from another package.
package Bubb;
print __PACKAGE__, " >> main::uv=$main::uv\n";

Nathan Fellman , Jun 13, 2009 at 17:52

Good answer. It's a shame I can't upvote it more than once Nathan Fellman Jun 13 '09 at 17:52

Roland Illig , Nov 20, 2015 at 18:46

Instead of # 5 , the comment should read # 2 . Roland Illig Nov 20 '15 at 18:46

daotoad , May 10, 2009 at 16:37

Coping with Scoping is a good overview of Perl scoping rules. It's old enough that our is not discussed in the body of the text. It is addressed in the Notes section at the end.

The article talks about package variables and dynamic scope and how that differs from lexical variables and lexical scope.

ismail , May 10, 2009 at 10:27

my is used for local variables, whereas our is used for global variables. More reading over Variable Scoping in Perl: the basics .

Chas. Owens , May 11, 2009 at 0:16

Be careful tossing around the words local and global. The proper terms are lexical and package. You can't create true global variables in Perl, but some already exist like $_, and local refers to package variables with localized values (created by local), not to lexical variables (created with my). Chas. Owens May 11 '09 at 0:16

MJD , Oct 7, 2013 at 14:02

${^Potato} is global. It refers to the same variable regardless of where you use it. MJD Oct 7 '13 at 14:02

Xu Ding , Nov 7, 2013 at 15:31

It's an old question, but I have run into some pitfalls with lexical declarations in Perl that tripped me up and that are also related to this question, so I'll just add my summary here:

1. definition or declaration?

local $var = 42; 
print "var: $var\n";

The output is var: 42 . However, we can't tell whether local $var = 42; is a definition or a declaration. What about this:

use strict;
use warnings;

local $var = 42;
print "var: $var\n";

The second program will throw an error:

Global symbol "$var" requires explicit package name.

$var is not defined, which means local $var; is just a declaration! Before using local to declare a variable, make sure it has been defined as a global (package) variable first.

But why this won't fail?

use strict;
use warnings;

local $a = 42;
print "var: $a\n";

The output is: var: 42 .

That's because $a , as well as $b , is a global variable pre-defined in Perl. Remember the sort function?
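
That is, $a and $b are the package variables that sort uses for its comparison block, which is why strict never complains about them -- a quick sketch:

use strict;
use warnings;

my @sorted = sort { $a <=> $b } (10, 2, 33);
print "@sorted\n";    # 2 10 33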

2. lexical or global?

I was a C programmer before I started using Perl, so the concepts of lexical and global variables seemed straightforward to me: they just correspond to auto and external variables in C. But there are small differences:

In C, an external variable is a variable defined outside any function block. On the other hand, an automatic variable is a variable defined inside a function block. Like this:

int global;

int main(void) {
    int local;
}

While in Perl, things are subtle:

sub main {
    $var = 42;
}

&main;

print "var: $var\n";

The output is var: 42 -- $var is a global variable even though it's defined inside a function block! In Perl, any variable is a global (package) variable by default unless it is declared otherwise.

The lesson is to always add use strict; use warnings; at the beginning of a Perl program, which forces the programmer to declare lexical variables explicitly, so that we don't get bitten by this kind of easily overlooked mistake.

ruffin , Feb 10, 2015 at 19:47

More on ["remembering [$a and $b in] sort" here]( stackoverflow.com/a/26128328/1028230 ). Perl never ceases to, um, astound me. ruffin Feb 10 '15 at 19:47

Ólafur Waage , May 10, 2009 at 10:25

The perldoc has a good definition of our.

Unlike my, which both allocates storage for a variable and associates a simple name with that storage for use within the current scope, our associates a simple name with a package variable in the current package, for use within the current scope. In other words, our has the same scoping rules as my, but does not necessarily create a variable.

Misha Gale , Dec 2, 2011 at 15:03

This is only somewhat related to the question, but I've just discovered a (to me) obscure bit of perl syntax that you can use with "our" (package) variables that you can't use with "my" (local) variables.
#!/usr/bin/perl

our $foo = "BAR";

print $foo . "\n";
${"foo"} = "BAZ";
print $foo . "\n";

Output:

BAR
BAZ

This won't work if you change 'our' to 'my'.

Cosmicnet , Oct 21, 2014 at 14:08

Not so. $foo ${foo} ${'foo'} ${"foo"} all work the same for variable assignment or dereferencing. Swapping the our in the above example for my does work. What you probably experienced was trying to dereference $foo as a package variable, such as $main::foo or $::foo which will only work for package globals, such as those defined with our . Cosmicnet Oct 21 '14 at 14:08

Misha Gale , Oct 21, 2014 at 17:50

Just retested using v5.20, and it definitely doesn't give the same output with my (it prints BAR twice.) Misha Gale Oct 21 '14 at 17:50

Cosmicnet , Nov 22, 2014 at 13:44

My test (on Windows): perl -e "my $foo = 'bar'; print $foo; ${foo} = 'baz'; print $foo" output: barbaz. perl -e "my $foo = 'bar'; print $foo; ${"foo"} = 'baz'; print $foo" output: barbaz. perl -e "my $foo = 'bar'; print $foo; ${\"foo\"} = 'baz'; print $foo" output: barbar. So in my testing I'd fallen into the same trap. ${foo} is the same as $foo, the brackets are useful when interpolating. ${"foo"} is actually a lookup into $main::{} , the main symbol table, and as such only contains package-scoped variables. Cosmicnet Nov 22 '14 at 13:44

Cosmicnet , Nov 22, 2014 at 13:57

${"main::foo"}, ${"::foo"}, and $main::foo are the same as ${"foo"}. The shorthand is package sensitive perl -e "package test; our $foo = 'bar'; print $foo; ${\"foo\"} = 'baz'; print $foo" works, as in this context ${"foo"} is now equal to ${"test::foo"}. Of Symbol Tables and Globs has some information on it, as does the Advanced Perl programming book. Sorry for my previous mistake. Cosmicnet Nov 22 '14 at 13:57

Lavi Buchnik , Sep 5, 2014 at 12:09

print "package is: " . __PACKAGE__ . "\n";
our $test = 1;
print "trying to print global var from main package: $test\n";

package Changed;
{
        my $test = 10;
        my $test1 = 11;
        print "trying to print local vars from a closed block: $test, $test1\n";
}

&Check_global;

sub Check_global {
        print "trying to print global var from a function: $test\n";
}
print "package is: " . __PACKAGE__ . "\n";
print "trying to print global var outside the func and from \"Changed\" package:     $test\n";
print "trying to print local var outside the block $test1\n";

Will Output this:

package is: main
trying to print global var from main package: 1
trying to print local vars from a closed block: 10, 11
trying to print global var from a function: 1
package is: Changed
trying to print global var outside the func and from "Changed" package: 1
trying to print local var outside the block

If you add "use strict", you will get this failure when attempting to run the script:

Global symbol "$test1" requires explicit package name at ./check_global.pl line 24.
Execution of ./check_global.pl aborted due to compilation errors.

Okuma.Scott , Sep 5, 2014 at 12:29

Please provide some kind of explanation. Dumping code like this is rarely considered appropriate. Okuma.Scott Sep 5 '14 at 12:29

Lavi Buchnik , Sep 6, 2014 at 20:08

In simple words: our (as the name says) declares a variable so that it can be used from any place in the script (function, block, etc.); every variable that is not declared belongs to the "main" package by default, and an our variable can still be used even after another package is declared later in the script. A "my" variable declared in a block or function can be used in that block/function only. If a "my" variable is declared outside of any block, it can be used anywhere in the script, inside blocks and functions as well, like an "our" variable, but (according to the author) not once the package has changed. Lavi Buchnik Sep 6 '14 at 20:08

Lavi Buchnik , Sep 6, 2014 at 20:13

My script above shows that by default we are in the "main" package; the script then prints an "our" variable from the "main" package (not enclosed in a block), then we declare two "my" variables in a function and print them from that function. Then we print an "our" variable from another function to show it can be used in a function. Then we change the package to "Changed" (not "main" any more) and print the "our" variable again successfully. Finally we try to print a "my" variable outside of its block and fail. The script just shows the difference between "our" and "my" usage. Lavi Buchnik Sep 6 '14 at 20:13

Yugdev , Nov 5, 2015 at 11:08

Just try to use the following program :
#!/usr/local/bin/perl
use feature ':5.10';
#use warnings;
package a;
{
my $b = 100;
our $a = 10;


print "$a \n";
print "$b \n";
}

package b;

#my $b = 200;
#our $a = 20 ;

print "in package b value of  my b $a::b \n";
print "in package b value of our a  $a::a \n";

Nathan Fellman , Nov 5, 2015 at 13:11

yes, but why is that? Nathan Fellman Nov 5 '15 at 13:11

Yugdev , Nov 5, 2015 at 14:03

This explains the difference between my and our. The my variable goes out of scope outside the curly braces and is garbage collected but the our variable still lives. Yugdev Nov 5 '15 at 14:03

xoid , May 16, 2013 at 8:02

#!/usr/bin/perl -l

use strict;

# if string below commented out, prints 'lol' , if the string enabled, prints 'eeeeeeeee'
#my $lol = 'eeeeeeeeeee' ;
# no errors or warnings at any case, despite of 'strict'

our $lol = eval {$lol} || 'lol' ;

print $lol;

Nathan Fellman , May 16, 2013 at 11:07

Can you explain what this code is meant to demonstrate? Why are our and my different? How does this example show it? Nathan Fellman May 16 '13 at 11:07

Evgeniy , Jan 27, 2016 at 4:57

Let us think what an interpreter actually is: it's a piece of code that stores values in memory and lets the instructions in a program that it interprets access those values by their names, which are specified inside these instructions. So, the big job of an interpreter is to shape the rules of how we should use the names in those instructions to access the values that the interpreter stores.

On encountering "my", the interpreter creates a lexical variable: a named value that the interpreter can access only while it executes a block, and only from within that syntactic block. On encountering "our", the interpreter makes a lexical alias of a package variable: it binds a name, which the interpreter is supposed from then on to process as a lexical variable's name, until the block is finished, to the value of the package variable with the same name.

The effect is that you can then pretend that you're using a lexical variable and bypass the rules of 'use strict' on full qualification of package variables. Since the interpreter automatically creates package variables when they are first used, the side effect of using "our" may also be that the interpreter creates a package variable as well. In this case, two things are created: a package variable, which the interpreter can access from everywhere, provided it's properly designated as requested by 'use strict' (prepended with the name of its package and two colons), and its lexical alias.


[Nov 06, 2017] How to create a Perl Module for code reuse

Nov 06, 2017 | perlmaven.com

You may be creating more and more scripts for your systems, which need to use the same functions.

You already mastered the ancient art of copy-paste, but you are not satisfied with the result.

You probably know lots of Perl modules that allow you to use their functions and you also want to create one.

However, you don't know how to create such a module.

The module

package My::Math;
use strict;
use warnings;

use Exporter qw(import);

our @EXPORT_OK = qw(add multiply);

sub add {
    my ($x, $y) = @_;
    return $x + $y;
}

sub multiply {
    my ($x, $y) = @_;
    return $x * $y;
}

1;

Save this in somedir/lib/My/Math.pm (or somedir\lib\My\Math.pm on Windows).

The script
#!/usr/bin/perl
use strict;
use warnings;

use My::Math qw(add);

print add(19, 23);

Save this in somedir/bin/app.pl (or somedir\bin\app.pl on Windows).

Now run perl somedir/bin/app.pl . (or perl somedir\bin\app.pl on Windows).

It is going to print an error like this:

Can't locate My/Math.pm in @INC (@INC contains:
...
...
...
BEGIN failed--compilation aborted at somedir/bin/app.pl line 9.
What is the problem?

In the script we loaded the module with the use keyword. Specifically with the use My::Math qw(add); line. This searches the directories listed in the built-in @INC variable looking for a subdirectory called My and in that subdirectory for a file called Math.pm .

The problem is that your .pm file is not in any of the standard directories of perl: it is not in any of the directories listed in @INC.

You could either move your module, or you could change @INC.

The former can be problematic, especially on systems where there is a strong separation between the system administrator and the users. For example, on Unix and Linux systems only the user "root" (the administrator) has write access to these directories. So in general it is easier and more correct to change @INC.

Change @INC from the command line

Before we try to load the module, we have to make sure the directory of the module is in the @INC array.

Try this:

perl -Isomedir/lib/ somedir/bin/app.pl .

This will print the answer: 42.

In this case, the -I flag of perl helped us add a directory path to @INC.

Change @INC from inside the script

Because we know that the "My" directory that holds our module is in a fixed place relative to the script, we have another possibility for changing the script:

#!/usr/bin/perl
use strict;
use warnings;

use File::Basename qw(dirname);
use Cwd qw(abs_path);
use lib dirname(dirname abs_path $0) . '/lib';

use My::Math qw(add);

print add(19, 23);

and run it again with this command:

perl somedir/bin/app.pl .

Now it works.

Let's explain the change:

How to change @INC to point to a relative directory

This line: use lib dirname(dirname abs_path $0) . '/lib'; adds the relative lib directory to the beginning of @INC

$0 holds the name of the current script. abs_path() of Cwd returns the absolute path to the script.

Given a path to a file or a directory, the call to dirname() of File::Basename returns the path with its last component removed.

In our case $0 contains app.pl

abs_path($0) returns .../somedir/bin/app.pl

dirname(abs_path $0) returns .../somedir/bin

dirname( dirname abs_path $0) returns .../somedir

That's the root directory of our project.

dirname( dirname abs_path $0) . '/lib' then points to .../somedir/lib

So what we have there is basically

use lib '.../somedir/lib';

but without hard-coding the actual location of the whole tree.

The whole task of this call is to add the '.../somedir/lib' to be the first element of @INC.

Once that's done, the subsequent call to use My::Math qw(add); will find the 'My' directory in '.../somedir/lib' and the Math.pm in '.../somedir/lib/My'.

The advantage of this solution is that the user of the script does not have to remember to put the -I... on the command line.

There are other ways to change @INC that you can use in other situations; one common alternative is sketched below.
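
A sketch (not from the article) using the core FindBin module, which exposes the directory of the running script in $FindBin::Bin:

#!/usr/bin/perl
use strict;
use warnings;

use FindBin;
use lib "$FindBin::Bin/../lib";   # .../somedir/bin/../lib is .../somedir/lib

use My::Math qw(add);
print add(19, 23);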

Explaining use

So as I wrote earlier, the use call will look for the My directory and the Math.pm file in it.

The first one it finds will be loaded into memory, and the import function of My::Math will be called with the parameters given after the name of the module. In our case that is import( qw(add) ) , which is just the same as calling import( 'add' ) .
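
In other words, use My::Math qw(add); is roughly equivalent to this compile-time code (as described in perldoc -f use):

BEGIN {
    require My::Math;
    My::Math->import( qw(add) );
}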

The explanation of the script

There is not much left to explain in the script. After the use statement is done calling the import function, we can just call the newly imported add function of the My::Math module. Just as if I declared the function in the same script.

What is more interesting is to see the parts of the module.

The explanation of the module

A module in Perl is a namespace in the file corresponding to that namespace. The package keyword creates the namespace. A module name My::Math maps to the file My/Math.pm. A module name A::B::C maps to the file A/B/C.pm somewhere in the directories listed in @INC.

As you recall, the use My::Math qw(add); statement in the script will load the module and then call the import function. Most people don't want to implement their own import function, so they load the Exporter module and import the 'import' function.

Yes, it is a bit confusing. The important thing to remember is that Exporter gives you the import.

That import function will look at the @EXPORT_OK array in your module and arrange for on-demand importing of the functions listed in this array.

OK, maybe I need to clarify: The module "exports" functions and the script "imports" them.

The last thing I need to mention is the 1; at the end of the module. Basically the use statement is executing the module and it needs to see some kind of a true statement there. It could be anything. Some people put there 42; , others, the really funny ones put "FALSE" there. After all every string with letters in it is considered to be true in Perl . That confuses about everyone. There are even people who put quotes from poems there.

"Famous last words."

That's actually nice, but might still confuse some people at first.

There are also two functions in the module. We decided to export both of them, but the user (the author of the script) wanted to import only one of the subroutines.

Conclusion

Aside from a few lines that I explained above, it is quite simple to create a Perl module. Of course there are other things you might want to learn about modules that will appear in other articles, but there is nothing stopping you now from moving some common functions into a module.

Maybe one more piece of advice on how to name your module:

Naming of modules

It is highly recommended to use a capital letter as the first letter of every part of the module name and lower case for the rest of the letters. It is also recommended to use a namespace several levels deep.

If you work in a company called Abc, I'd recommend preceding all the modules with the Abc:: namespace. If within the company the project is called Xyz, then all its modules should be in Abc::Xyz::.

So if you have a module dealing with configuration you might call the package Abc::Xyz::Config which indicates the file .../projectdir/lib/Abc/Xyz/Config.pm

Please avoid calling it just Config.pm. That will confuse both Perl (that comes with its own Config.pm) and you.

[Nov 06, 2017] In Perl, what is the difference between a .pm (Perl module) and .pl (Perl script) file - Stack Overflow

Nov 06, 2017 | stackoverflow.com

user380979 , Aug 4, 2010 at 5:20

What is the Difference between .pm (Perl module) and .pl (Perl script) file?

Please also tell me why we return 1 from file. If return 2 or anything else, it's not generating any error, so why do we return 1 from Perl module?

Amadan , Aug 4, 2010 at 5:32

1 does not matter. It can be 2 , it can be "foo" , it can be ["a", "list"] . What matters is it's not 0 , or anything else that evaluates as false, or use would fail. – Amadan Aug 4 '10 at 5:32

Marc Lehmann , Oct 16, 2015 at 22:08

.pl is actually a perl library - perl scripts, like C programs or programs written in other languages, do not have an extension, except on operating systems that need one to function, such as Windows. – Marc Lehmann Oct 16 '15 at 22:08

Sinan Ünür , Aug 4, 2010 at 12:41

At the very core, the file extension you use makes no difference as to how perl interprets those files.

However, putting modules in .pm files following a certain directory structure that follows the package name provides a convenience. So, if you have a module Example::Plot::FourD and you put it in a directory Example/Plot/FourD.pm in a path in your @INC , then use and require will do the right thing when given the package name as in use Example::Plot::FourD .

The file must return true as the last statement to indicate successful execution of any initialization code, so it's customary to end such a file with 1; unless you're sure it'll return true otherwise. But it's better just to put the 1; , in case you add more statements.

If EXPR is a bareword, the require assumes a ".pm" extension and replaces "::" with "/" in the filename for you, to make it easy to load standard modules. This form of loading of modules does not risk altering your namespace.

All use does is to figure out the filename from the package name provided, require it in a BEGIN block and invoke import on the package. There is nothing preventing you from not using use but taking those steps manually.

For example, below I put the Example::Plot::FourD package in a file called t.pl , loaded it in a script in file s.pl .

C:\Temp> cat t.pl
package Example::Plot::FourD;

use strict; use warnings;

sub new { bless {} => shift }

sub something { print "something\n" }

"Example::Plot::FourD"

C:\Temp> cat s.pl
#!/usr/bin/perl
use strict; use warnings;

BEGIN {
    require 't.pl';
}

my $p = Example::Plot::FourD->new;
$p->something;


C:\Temp> s
something

This example shows that module files do not have to end in 1 , any true value will do.

Igor Oks , Aug 4, 2010 at 5:25

A .pl is a single script.

In .pm ( Perl Module ) you have functions that you can use from other Perl scripts:

A Perl module is a self-contained piece of Perl code that can be used by a Perl program or by other Perl modules. It is conceptually similar to a C link library, or a C++ class.

Dave Cross , Sep 17, 2010 at 9:37

"A .pl is a single script." Not true. It's only on broken operating systems that you need to identify Perl programs with a .pl extension. And originally .pl indicated a "Perl library" - external subroutines that you loaded with a "require" or "do" command. – Dave Cross Sep 17 '10 at 9:37

[Nov 06, 2017] What is the difference between namespace, package and module in Perl

Nov 06, 2017 | stackoverflow.com
The package directive sets the namespace. As such, the namespace is also called the package.

Perl doesn't have a formal definition of a module. There's a lot of variance, but for a huge majority of modules the following holds: a module is a single .pm file, the file contains a package whose name matches the file's path, and it is loaded with use or require.

It's not uncommon to encounter .pm files with multiple packages. Whether that's a single module, multiple modules or both is up for debate.

Namespace is a general computing term meaning a container for a distinct set of identifiers. The same identifier can appear independently in different namespaces and refer to different objects, and a fully-qualified identifier which unambiguously identifies an object consists of the namespace plus the identifier.

Perl implements namespaces using the package keyword.

A Perl module is a different thing altogether. It is a piece of Perl code that can be incorporated into any program with the use keyword. The filename should end with .pm - for Perl module - and the code it contains should have a package statement using a package name that is equivalent to the file's name, including its path. For instance, a module written in a file called My/Useful/Module.pm should have a package statement like package My::Useful::Module .

What you may have been thinking of is a class which, again, is a general computing term, this time meaning a type of object-oriented data. Perl uses its packages as class names, and an object-oriented module will have a constructor subroutine - usually called new - that will return a reference to data that has been blessed to make it behave in an object-oriented fashion. By no means all Perl modules are object-oriented ones: some can be simple libraries of subroutines.
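
As a rough sketch of what such an object-oriented module looks like (the class name, field and methods here are invented purely for illustration):

package My::Counter;
use strict;
use warnings;

sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;   # bless ties the hash reference to this class
}

sub increment {
    my $self = shift;
    return ++$self->{count};
}

1;

A script could then write: use My::Counter; my $c = My::Counter->new(start => 5); print $c->increment;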

[Nov 06, 2017] Constants and read-only variables in Perl by Gabor Szabo

Nov 06, 2017 | perlmaven.com
Often in programs we would like to have symbols that represent a constant value. Symbols that we can set to a specific values once, and be sure they never change. As with many other problems, there are several ways to solve this in Perl, but in most cases enforcement of "constantness" is not necessary.

In most cases we can just adhere to the established consensus, that variables with all upper-case names should be treated as constants .

Later we'll see a couple of solutions that actually enforce the "constantness" of the variables, but for most purposes, having a variable in upper case is enough.

Treat upper-case variables as constants

We declare and set the values just as we'd do with any other variable in Perl:

use strict;
use warnings;
use 5.010;

my $SPEED_OF_LIGHT = 299_792_458; # m/s

my %DATA = (
    Mercury => [ 0.4,  0.055 ],
    Venus   => [ 0.7,  0.815 ],
    Earth   => [ 1,    1 ],
    Mars    => [ 1.5,  0.107 ],
    Ceres   => [ 2.77, 0.00015 ],
    Jupiter => [ 5.2,  318 ],
    Saturn  => [ 9.5,  95 ],
    Uranus  => [ 19.6, 14 ],
    Neptune => [ 30,   17 ],
    Pluto   => [ 39,   0.00218 ],
    Charon  => [ 39,   0.000254 ],
);

my @PLANETS = sort keys %DATA;

Each planet in the Solar System has two values. The first is their average distance from the Sun and the second is their mass, relative to the Earth.

Once the values are initially set, they should NOT be changed. Nothing enforces it, besides a secret agreement among Perl programmers and Astronomers.
say join ', ', @PLANETS;
say $SPEED_OF_LIGHT;

$SPEED_OF_LIGHT = 300_000_000;
say "The speed of light is now $SPEED_OF_LIGHT";

We can use these "constants" in the same way as we would use any variable. We could even change the values, but it is not recommended.

Besides its simplicity, one of the nice things in this solution is that we can actually compute the values of these constants during run time, as we did with the @PLANETS array.

In many cases this is enough and the cost of creating "real" constants is unnecessary.

Nevertheless, let's see two other solutions:

The Readonly module

The Readonly module from CPAN allow us to designate some of our "variables" to be read-only. Effectively turning them into constants.

use strict;
use warnings;
use 5.010;
use Readonly;

Readonly my $SPEED_OF_LIGHT => 299_792_458; # m/s

Readonly my %DATA => (
    Mercury => [ 0.4,  0.055 ],
    Venus   => [ 0.7,  0.815 ],
    Earth   => [ 1,    1 ],
    Mars    => [ 1.5,  0.107 ],
    Ceres   => [ 2.77, 0.00015 ],
    Jupiter => [ 5.2,  318 ],
    Saturn  => [ 9.5,  95 ],
    Uranus  => [ 19.6, 14 ],
    Neptune => [ 30,   17 ],
    Pluto   => [ 39,   0.00218 ],
    Charon  => [ 39,   0.000254 ],
);

Readonly my @PLANETS => sort keys %DATA;

The declaration of the read-only variables (our constants) is very similar to that of regular variables, except that we precede each declaration with the Readonly keyword, and instead of an assignment we separate the name of the variable and its value with a fat arrow: =>

While the names of the read-only variables can be in any case, it is recommended to only use UPPER-CASE names, to make it easy for the reader of the code to recognize them, even without looking at the declaration.

Readonly allows us to create constants during the run-time as we have done above with the @PLANETS array.

say join ', ', @PLANETS;
say "The speed of light is $SPEED_OF_LIGHT";

$SPEED_OF_LIGHT = 300_000_000;
say "The speed of light is now $SPEED_OF_LIGHT";

If we run the above code, we'll get an exception that says: Modification of a read-only value attempted at ... at the line where we tried to assign the new value to the $SPEED_OF_LIGHT

The same would have happened if we attempted to change one of the internal values such as either of these:

$DATA{Sun} = 'big';
$DATA{Mercury}[0] = 1;

The biggest drawback of Readonly is its relatively slow performance.

Readonly::XS

There is also the Readonly::XS module that can be installed. One does not need to make any changes to the code: once the use Readonly; statement notices that Readonly::XS is also installed, the latter will be used to provide a speed improvement.

The constant pragma

Perl comes with the constant pragma that can create constants.

The constants themselves can only hold scalars or references to complex data structure (arrays and hashes). The names of the constants do not have any sigils in front of them. The names can be any case, but even in the documentation of constant all the examples use upper case, and it is probably better to stick to that style for clarity.

use strict;
use warnings;
use 5.010;

use constant SPEED_OF_LIGHT => 299_792_458; # m/s

use constant DATA => {
    Mercury => [ 0.4,  0.055 ],
    Venus   => [ 0.7,  0.815 ],
    Earth   => [ 1,    1 ],
    Mars    => [ 1.5,  0.107 ],
    Ceres   => [ 2.77, 0.00015 ],
    Jupiter => [ 5.2,  318 ],
    Saturn  => [ 9.5,  95 ],
    Uranus  => [ 19.6, 14 ],
    Neptune => [ 30,   17 ],
    Pluto   => [ 39,   0.00218 ],
    Charon  => [ 39,   0.000254 ],
};

use constant PLANETS => [ sort keys %{ DATA() } ];

Creating a constant with a scalar value, such as the SPEED_OF_LIGHT is easy. We just need to use the constant pragma. We cannot create a constant hash, but we can create a constant reference to an anonymous hash. The difficulty comes when we would like to use it as a real hash. We need to dereference it using the %{ } construct, but in order to make it work we have to put a pair of parentheses after the name DATA .

This might look strange, but the reason is that the constant actually creates functions with the given names, that return the fixed values. In the above case use constant DATA ... created a function called DATA()
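
For example, continuing with the DATA constant above (values follow the planet table as reconstructed earlier; say is available thanks to use 5.010):

my %planets = %{ DATA() };        # dereference the constant hashref into a hash
say $planets{Earth}[1];           # 1    -- mass relative to Earth
say DATA()->{Mars}[0];            # 1.5  -- or go through the reference directly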

We don't have to always use the parentheses. For example we can write:

say SPEED_OF_LIGHT;

and that will work. On the other hand, the following code will print The speed of light is now SPEED_OF_LIGHT . Because these constants don't have sigils, they cannot be interpolated into strings.

  1. say "The speed of light is now SPEED_OF_LIGHT"

If we try to modify the constant:

SPEED_OF_LIGHT = 300_000_000;

we get an exception: Can't modify constant item in scalar assignment at ... , but we can re-declare them:

use constant SPEED_OF_LIGHT => 300_000_000; # m/s
say SPEED_OF_LIGHT;

that will print 300000000. It will give a warning Constant subroutine main::SPEED_OF_LIGHT redefined only if we have use warnings; enabled.

So the constant pragma does not fully protect us from changing the "constant".

Note, fetching the values from a constant that holds a reference to an array also requires the parentheses again, and the de-referencing construct:

say join ', ', @{ PLANETS() };
Other ways to create constants

If the above is not enough, Neil Bowers wrote a review comparing 21 different ways to define constants .

Conclusion

You can probably get away with regular upper-case variables, but if you'd really like to make the variables immutable, use the Readonly module.

Written by Gabor Szabo

[Nov 06, 2017] Perl string concatenation and repetition by Andrew Solomon

Nov 06, 2017 | blog.geekuni.com

October 28 2017

One of the first Perl operators to learn is the "dot" concatenation operator (.) for strings. For example:
my $string = 'foo' . 'bar';
# $string is 'foobar'.

On the other hand, if you have an array of strings @arr, then you can concatenate them by joining them with an empty string in-between:
my $string = join('', @arr);
But what if you just want 10 "foo"s in a line? You might try the Python approach with 'foo' * 10 but Perl with its type conversion on the fly will try to convert 'foo' into a number and say something like:
 Argument "foo" isn't numeric in multiplication (*) at...

Instead you should use the repetition operator (x) which takes a string on the left and a number on the right:
my $string = 'foo' x 10;
and $string is then
foofoofoofoofoofoofoofoofoofoo
Note that even if you have integers on both sides, the 'x' repetition operator will cast the left operand into a string so that:
my $str = 20 x 10;
# $str is "20202020202020202020"

Now this isn't all the repetition operator is good for - it can also be used for repetition of lists. For example:
('x','y','z') x 10

evaluates as:
('x','y','z','x','y','z','x','y', ...)

But be warned : if the left operand is not enclosed in parentheses it is treated as a scalar.
my @arr = ('x', 'y', 'z');
my @bar =  @arr  x 10;

is equivalent to
my @bar = scalar(@arr) x 10;
# @bar is an array of a single integer (3333333333)
while, turning the array into a list of its elements by enclosing it in parentheses:
my @foo = (  (@arr)  x 10 );
# then @foo is ('x','y','z','x','y','z','x','y', ...)

In summary, if you remember that 'x' is different to '*' and lists are treated differently to scalars, it's less likely your code will give you an unpleasant surprise!
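
For instance, a common everyday use of x is drawing a separator line under a report header:

my $title = "Disk usage report";
print $title, "\n", '-' x length($title), "\n";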

Posted by Andrew Solomon


[Nov 06, 2017] Split retaining the separator or parts of it in Perl

Notable quotes:
"... Usually when we use split , we provide a regex. Whatever the regex matches is "thrown away" and the pieces between these matches are returned. In this example we get back the two strings even including the spaces. ..."
"... If however, the regex contains capturing parentheses, then whatever they captured will be also included in the list of returned strings. In this case the string of digits '23' is also included. ..."
Nov 06, 2017 | perlmaven.com
Usually when we use split , we provide a regex. Whatever the regex matches is "thrown away" and the pieces between these matches are returned. In this example we get back the two strings even including the spaces.

examples/split_str.pl

use strict;
use warnings;
use Data::Dumper qw(Dumper);

my $str = "abc 23 def";
my @pieces = split /\d+/, $str;
print Dumper \@pieces;
$VAR1 = [
          'abc ',
          ' def'
        ];

If however, the regex contains capturing parentheses, then whatever they captured will be also included in the list of returned strings. In this case the string of digits '23' is also included.

examples/split_str_retain.pl

use strict;
use warnings;
use Data::Dumper qw(Dumper);

my $str = "abc 23 def";
my @pieces = split /(\d+)/, $str;
print Dumper \@pieces;
$VAR1 = [
          'abc ',
          '23',
          ' def'
        ];
Only what is captured is retained 

Remember: it is not the separator (the whole substring that was matched) that is retained, but only whatever is inside the capturing parentheses.

examples/split_str_multiple.pl

use strict;
use warnings;
use Data::Dumper qw(Dumper);

my $str = "abc 2=3 def ";
my @pieces = split /(\d+)=(\d+)/, $str;
print Dumper \@pieces;
$VAR1 = [
          'abc ',
          '2',
          '3',
          ' def '
        ];

In this example the = sign is not in the resulting list, because it was matched but not captured.

[Oct 31, 2017] Perl references explained by Tom Ryder

Jan 27, 2012 | sanctum.geek.nz

Coming to Perl from PHP can be confusing because of the apparently similar and yet quite different ways the two languages use the $ identifier as a variable prefix. If you're accustomed to PHP, you'll be used to declaring variables of various types, using the $ prefix each time:

<?php
$string = "string";
$integer = 6;
$float = 1.337;
$object = new Object();

So when you begin working in Perl, you're lulled into a false sense of security because it looks like you can do the same thing with any kind of data:

#!/usr/bin/perl
my $string = "string";
my $integer = 6;
my $float = 1.337;
my $object = Object->new();

But then you start dealing with arrays and hashes, and suddenly everything stops working the way you expect. Consider this snippet:

#!/usr/bin/perl
my $array = (1, 2, 3);
print $array. "\n";

That's perfectly valid syntax. However, when you run it, expecting to see all your elements or at least an Array like in PHP, you get the output of "3". Not being able to assign a list to a scalar, Perl just gives you the last item of the list instead. You sit there confused, wondering how on earth Perl could think that's what you meant.

References in PHP

In PHP, every identifier is a reference, or pointer, towards some underlying data. PHP handles the memory management for you. When you declare an object, PHP writes the data into memory, and puts into the variable you define a reference to that data. The variable is not the data itself, it's just a pointer to it. To oversimplify things a bit, what's actually stored in your $variable is an address in memory, and not a sequence of data.

When PHP manages all this for you and you're writing basic programs, you don't really notice, because any time you actually use the value it gets dereferenced implicitly, meaning that PHP will use the data that the variable points to. So when you write something like:

$string = "string";
print $string;

The output you get is "string", and not "0x00abf0a9", which might be the "real" value of $string as an address in memory. In this way, PHP is kind of coddling you a bit. In fact, if you actually want two identifiers to point to the same piece of data rather than making a copy in memory, you have to use a special reference syntax:

$string2 = &$string1;

Perl and C programmers aren't quite as timid about hiding references, because being able to manipulate them a bit more directly turns out to be very useful for writing quick, clean code, in particular for conserving memory and dealing with state intelligently.

References in Perl

In Perl, you have three basic data types: scalars, arrays, and hashes. These all use different identifiers: scalars use $ , arrays use @ , and hashes use % .

my $string = "string";
my $integer = 0;
my $float = 0.0;
my @array = (1,2,3);
my %hash = (name => "Tom Ryder",
            blog => "Arabesque");

So scalars can refer directly to data in the way you're accustomed to in PHP, but they can also be references to any other kind of data. For example, you could write:

my $string = "string";
my $copy = $string;
my $reference = \$string;

The value of both $string and $copy , when printed, will be "string", as you might expect. However, the $reference scalar becomes a reference to the data stored in $string , and when printed out would give something like SCALAR(0x2160718) . Similarly, you can define a scalar as a reference to an array or hash:

my @array = (1,2,3);
my $arrayref = \@array;
my %hash = (name => "Tom Ryder",
            blog => "Arabesque");
my $hashref = \%hash;

There are even shorthands for doing this, if you want to declare a reference and the data it references inline. For array references, you use square brackets, and for hash references, curly brackets:

$arrayref = [1,2,3];
$hashref = {name => "Tom Ryder",
            blog => "Arabesque"};

And if you really do want to operate with the data of the reference, rather than the reference itself, you can explicitly dereference it:

$string = ${$reference};
@array = @{$arrayref};
%hash = %{$hashref};
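
The article stops at dereferencing whole structures. For completeness, here is a small sketch (not from the original post) of reaching individual elements through a reference with the standard arrow syntax:

use strict;
use warnings;

my $arrayref = [1, 2, 3];
my $hashref  = {name => "Tom Ryder", blog => "Arabesque"};

print $arrayref->[0], "\n";       # first array element via the arrow operator
print $hashref->{name}, "\n";     # hash value via the arrow operator
print ${$arrayref}[1], "\n";      # the equivalent explicit dereference
print scalar @{$arrayref}, "\n";  # number of elements behind the reference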

For a much more in-depth discussion of how references work in Perl and their general usefulness, check out perlref in the Perl documentation.

[Oct 19, 2017] How can I translate a shell script to Perl

Oct 19, 2017 | stackoverflow.com

I'll answer seriously. I do not know of any program to translate a shell script into Perl, and I doubt any interpreter module would provide the performance benefits. So I'll give an outline of how I would go about it.

Now, you want to reuse your code as much as possible. In that case, I suggest selecting pieces of that code, write a Perl version of that, and then call the Perl script from the main script. That will enable you to do the conversion in small steps, assert that the converted part is working, and improve gradually your Perl knowledge.

As you can call outside programs from a Perl script, you can even replace some bigger logic with Perl, and call smaller shell scripts (or other commands) from Perl to do something you don't feel comfortable yet to convert. So you'll have a shell script calling a perl script calling another shell script. And, in fact, I did exactly that with my own very first Perl script.

Of course, it's important to select well what to convert. I'll explain, below, how many patterns common in shell scripts are written in Perl, so that you can identify them inside your script, and create replacements by as much cut&paste as possible.

First, both Perl scripts and shell scripts are code plus functions; that is, anything which is not a function declaration is executed in the order it is encountered. You don't need to declare functions before use, though. That means the general layout of the script can be preserved, though the ability to keep things in memory (like a whole file, or a processed form of it) makes it possible to simplify tasks.

A Perl script, in Unix, starts with something like this:

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;
#other libraries

(rest of the code)

The first line, obviously, points to the command to be used to run the script, just like normal shells do. The following two "use" lines make the language more strict, which should decrease the number of bugs you encounter because you don't know the language well (or plain did something wrong). The third "use" line imports the "Dumper" function from the Data::Dumper module. It's useful for debugging purposes. If you want to know the value of an array or hash table, just print Dumper(whatever).

Note also that comments are just like shell's, lines starting with "#".

Now, you call external programs and pipe to or pipe from them. For example:

open THIS, "cat $ARGV[0] |";

That will run cat, passing " $ARGV[0] ", which would be $1 in the shell -- the first argument passed to the script. The result will be piped into your Perl script through the filehandle " THIS ", from which you can read it, as I'll show later.

You can use "|" at the beginning or end of the command string, to indicate the mode "pipe to" or "pipe from", and specify a command to be run. You can also use ">" or ">>" at the beginning, to open a file for writing with or without truncation, "<" to explicitly indicate opening a file for reading (the default), or "+<" and "+>" for read and write. Notice that the latter will truncate the file first.

Another syntax for "open", which avoids problems with files whose names contain special characters, is to pass the opening mode as a second argument:

open THIS, "-|", "cat $ARGV[0]";

This will do the same thing. The mode "-|" stands for "pipe from" and "|-" stands for "pipe to". The rest of the modes can be used as they were ( >, >>, <, +>, +< ). While there is more than this to open, it should suffice for most things.

But you should avoid calling external programs as much as possible. You could open the file directly, by doing open THIS, "$ARGV[0]"; , for example, and have much better performance.
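
As a side note, modern Perl style generally prefers lexical filehandles, the three-argument form of open, and an error check. A minimal sketch of the same "cat the first argument" idea under those conventions (assuming one readable file name is passed on the command line):

use strict;
use warnings;

my $filename = $ARGV[0];
open my $fh, '<', $filename or die "Could not open $filename: $!\n";
while (my $line = <$fh>) {
    print $line;
}
close $fh;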

So, what external programs could you cut out? Well, almost everything. But let's stay with the basics: cat, grep, cut, head, tail, uniq, wc, sort.

CAT

Well, there isn't much to be said about this one. Just remember that, if possible, read the file only once and keep it in memory. If the file is huge you won't do that, of course, but there are almost always ways to avoid reading a file more than once.

Anyway, the basic syntax for cat would be:

my $filename = "whatever";
open FILE, "$filename" or die "Could not open $filename!\n";
while(<FILE>) {
  print $_;
}
close FILE;

This opens a file, prints all its contents (" while(<FILE>) " will loop until EOF, assigning each line to " $_ "), and closes it again.

If I wanted to direct the output to another file, I could do this:

my $filename = "whatever";
my $anotherfile = "another";
open (FILE, "$filename") || die "Could not open $filename!\n";
open OUT, ">", "$anotherfile" or die "Could not open $anotherfile for writing!\n";
while(<FILE>) {
  print OUT $_;
}
close FILE;

This will print the line to the file indicated by " OUT ". You can use STDIN , STDOUT and STDERR in the appropriate places as well, without having to open them first. In fact, " print " defaults to STDOUT , and " die " defaults to " STDERR ".

Notice also the " or die ... " and " || die ... ". The operators or and || mean that the following command is only executed if the first one returns false (which means an empty string, a null reference, 0, and the like). The die command stops the script with an error message.

The main difference between " or " and " || " is priority. If " or " was replaced by " || " in the examples above, it would not work as expected, because the line would be interpreted as:

open FILE, ("$filename" || die "Could not open $filename!\n");

Which is not at all what is expected. As " or " has a lower priority, it works. In the line where " || " is used, the parameters to open are passed between parentheses, making it possible to use " || ".

Also, there is a construct which does pretty much what cat does:

while(<>) {
  print $_;
}

That will print all files in the command line, or anything passed through STDIN.

GREP

So, how would our "grep" script work? I'll assume "grep -E", because that's easier in Perl than simple grep. Anyway:

my $pattern = $ARGV[0];
shift @ARGV;
while(<>) {
        print $_ if /$pattern/o;
}

The "o" passed to $patttern instructs Perl to compile that pattern only once, thus gaining you speed. Not the style "something if cond". It means it will only execute "something" if the condition is true. Finally, " /$pattern/ ", alone, is the same as " $_ =~ m/$pattern/ ", which means compare $_ with the regex pattern indicated. If you want standard grep behavior, ie, just substring matching, you could write:

print $_ if $_ =~ "$pattern";
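
Strictly speaking, matching against "$pattern" still treats its contents as a regular expression. For a truly literal, fgrep-style match, one could instead use index() or escape the metacharacters with \Q...\E; a small sketch (not in the original answer):

use strict;
use warnings;

my $pattern = shift @ARGV;                   # the literal text to look for
while (<>) {
    print $_ if index($_, $pattern) >= 0;    # plain substring search
    # or, equivalently: print $_ if /\Q$pattern\E/;
}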

CUT

Usually, you do better using regex groups to get the exact string you want than using cut -- much as you would with "sed", for instance. Anyway, here are two ways of reproducing cut:

while(<>) {
  my @array = split ",";
  print $array[3], "\n";
}

That will get you the fourth column of every line, using "," as separator. Note @array and $array[3] . The @ sigil means "array" should be treated as an, well, array. It will receive an array composed of each column in the currently processed line. Next, the $ sigil means array[3] is a scalar value. It will return the column you are asking for.

This is not a good implementation, though, as "split" will scan the whole string. I once reduced a process from 30 minutes to 2 seconds just by not using split -- the lines were rather large, though. Anyway, the following has superior performance if the lines are expected to be big and the columns you want come early:

while(<>) {
  my ($column) = /^(?:[^,]*,){3}([^,]*),/;
  print $column, "\n";
}

This leverages regular expressions to get the desired information, and only that.

If you want positional columns, you can use:

while(<>) {
  print substr($_, 5, 10), "\n";
}

Which will print 10 characters starting from the sixth (again, 0 means the first character).

HEAD

This one is pretty simple:

my $printlines = abs(shift);
my $lines = 0;
my $current;
while(<>) {
  if($ARGV ne $current) {
    $lines = 0;
    $current = $ARGV;
  }
  print "$_" if $lines < $printlines;
  $lines++;
}

Things to note here. I use "ne" to compare strings. Now, $ARGV will always point to the current file being read, so I keep track of it to restart my counting once I'm reading a new file. Also note the more traditional syntax for "if", right alongside the post-fixed one.

I also use a simplified syntax to get the number of lines to be printed. When you use "shift" by itself it will assume "shift @ARGV". Also, note that shift, besides modifying @ARGV, will return the element that was shifted out of it.

As with a shell, there is no distinction between a number and a string -- you just use it. Even things like "2"+"2" will work. In fact, Perl is even more lenient, cheerfully treating anything non-number as a 0, so you might want to be careful there.
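
A tiny illustration of that leniency (my example, not the original author's); with warnings enabled, Perl will at least complain about the last line:

use strict;
use warnings;

print "2" + "2", "\n";    # prints 4: the strings are converted to numbers
print "2" . "2", "\n";    # prints 22: the dot operator is string concatenation
print "abc" + 1, "\n";    # prints 1, with an "isn't numeric" warning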

This script is very inefficient, though, as it reads the whole of every file, not only the required lines. Let's improve it, and see a couple of important keywords in the process:

my $printlines = abs(shift);
my @files;
if(scalar(@ARGV) == 0) {
  @files = ("-");
} else {
  @files = @ARGV;
}
for my $file (@files) {
  next unless -f $file && -r $file;
  open FILE, "<", $file or next;
  my $lines = 0;

  while(<FILE>) {
    last if $lines == $printlines;
    print "$_";
    $lines++;
  }

  close FILE;
}

The keywords "next" and "last" are very useful. First, "next" will tell Perl to go back to the loop condition, getting the next element if applicable. Here we use it to skip a file unless it is truly a file (not a directory) and readable. It will also skip if we couldn't open the file even then.

Then "last" is used to immediately jump out of a loop. We use it to stop reading the file once we have reached the required number of lines. It's true we read one line too many, but having "last" in that position shows clearly that the lines after it won't be executed.

There is also "redo", which will go back to the beginning of the loop, but without reevaluating the condition nor getting the next element.

TAIL

I'll do a little trick here.

my $skiplines = abs(shift);
my @lines;
my $current = "";
while(<>) {
  if($ARGV ne $current) {
    print @lines;
    undef @lines;
    $current = $ARGV;
  }
  push @lines, $_;
  shift @lines if $#lines == $skiplines;
}
print @lines;

Ok, I'm combining "push", which appends a value to an array, with "shift", which takes something from the beginning of an array. If you want a stack, you can use push/pop or shift/unshift. Mix them, and you have a queue. I keep my queue at no more than $skiplines elements with $#lines , which gives me the index of the last element in the array. You could also get the number of elements in @lines with scalar(@lines) .
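
A small sketch of the stack and queue idioms mentioned here (my example):

use strict;
use warnings;

my @stack;
push @stack, 1, 2, 3;           # stack: push onto the end...
my $top = pop @stack;           # ...and pop from the end (LIFO) -> 3

my @queue;
push @queue, 'a', 'b', 'c';     # queue: push onto the end...
my $first = shift @queue;       # ...and shift from the front (FIFO) -> 'a'

print "top=$top first=$first last_index=$#queue\n";   # last_index is 1 here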

UNIQ

Now, uniq only eliminates repeated consecutive lines, which should be easy with what you have seen so far. So I'll eliminate all of them:

my $current = "";
my %lines;
while(<>) {
  if($ARGV ne $current) {
    undef %lines;
    $current = $ARGV;
  }
  print $_ unless defined($lines{$_});
  $lines{$_} = "";
}

Now here I'm keeping the whole file in memory, inside %lines . The use of the % sigil indicates this is a hash table. I'm using the lines as keys, and storing nothing as values -- as I have no interest in the values. I check whether the key exists with "defined($lines{$_})", which will test if the value associated with that key is defined or not; the keyword "unless" works just like "if", but with the opposite effect, so it only prints a line if its hash entry is NOT defined.

Note, too, the syntax $lines{$_} = "" as a way to store something in a hash table. Note the use of {} for hash table, as opposed to [] for arrays.
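
The same "have I seen this line before?" check is often written with exists, which tests whether the key is present at all, independent of its value. A minimal sketch (a variation, not the original answer's code):

use strict;
use warnings;

my %seen;
while (<>) {
    print $_ unless exists $seen{$_};   # print only lines not seen before
    $seen{$_} = 1;                      # any true value will do as a marker
}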

WC

This will actually use a lot of stuff we have seen:

my $current;
my %lines;
my %words;
my %chars;
while(<>) {
  $lines{"$ARGV"}++;
  $chars{"$ARGV"} += length($_);
  $words{"$ARGV"} += scalar(grep {$_ ne ""} split /\s/);
}

for my $file (keys %lines) {
  print "$lines{$file} $words{$file} $chars{$file} $file\n";
}

Three new things. Two are the "+=" operator, which should be obvious, and the "for" expression. Basically, a "for" will assign each element of the array to the variable indicated. The "my" is there to declare the variable, though it's unneeded if declared previously. I could have an @array variable inside those parentheses. The "keys %lines" expression will return as an array the keys (the filenames) which exist in the hash table "%lines". The rest should be obvious.

The third thing, which I actually added only when revising the answer, is the "grep". The format here is:

grep { code } array

It will run "code" for each element of the array, passing the element as "$_". Then grep will return all elements for which the code evaluates to "true" (not 0, not "", etc). This avoids counting empty strings resulting from consecutive spaces.

Similar to "grep" there is "map", which I won't demonstrate here. Instead of filtering, it will return an array formed by the results of "code" for each element.

SORT

Finally, sort. This one is easy too:

my @lines;
my $current = "";
while(<>) {
  if($ARGV ne $current) {
    print sort @lines;
    undef @lines;
    $current = $ARGV;
  }
  push @lines, $_;
}
print sort @lines;

Here, "sort" will sort the array. Note that sort can receive a function to define the sorting criteria. For instance, if I wanted to sort numbers I could do this:

my @lines;
my $current = "";
while(<>) {
  if($ARGV ne $current) {
    print sort @lines;
    undef @lines;
    $current = $ARGV;
  }
  push @lines, $_;
}
print sort {$a <=> $b} @lines;

Here " $a " and " $b " receive the elements to be compared. " <=> " returns -1, 0 or 1 depending on whether the number is less than, equal to or greater than the other. For strings, "cmp" does the same thing.

HANDLING FILES, DIRECTORIES & OTHER STUFF

As for the rest, basic mathematical expressions should be easy to understand. You can test certain conditions about files this way:

for my $file (@ARGV) {
  print "$file is a file\n" if -f "$file";
  print "$file is a directory\n" if -d "$file";
  print "I can read $file\n" if -r "$file";
  print "I can write to $file\n" if -w "$file";
}

I'm not trying to be exhaustive here; there are many other such tests. I can also do "glob" patterns, like the shell's "*" and "?", like this:

for my $file (glob("*")) {
  print $file;
  print "*" if -x "$file" && ! -d "$file";
  print "/" if -d "$file";
  print "\t";
}

If you combined that with "chdir", you can emulate "find" as well:

sub list_dir($$) {
  my ($dir, $prefix) = @_;
  my $newprefix = $prefix;
  if ($prefix eq "") {
    $newprefix = $dir;
  } else {
    $newprefix .= "/$dir";
  }
  chdir $dir;
  for my $file (glob("*")) {
    print "$prefix/" if $prefix ne "";
    print "$dir/$file\n";
    list_dir($file, $newprefix) if -d "$file";
  }
  chdir "..";
}

list_dir(".", "");

Here we see, finally, a function. A function is declared with the syntax:

sub name (params) { code }

Strictly speaking, "(params)" is optional. The declared parameter list I used, " ($$) ", means I'm receiving two scalar parameters. I could have " @ " or " % " in there as well. The array " @_ " has all the parameters passed. The line " my ($dir, $prefix) = @_ " is just a simple way of assigning the first two elements of that array to the variables $dir and $prefix .

This function does not return anything (it's a procedure, really), but you can have functions which return values just by adding " return something; " to it, and have it return "something".

The rest of it should be pretty obvious.
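
In practice, recursive traversal like list_dir above is usually delegated to the core File::Find module rather than written by hand. A minimal sketch (not part of the original answer):

use strict;
use warnings;
use File::Find;

# Print every entry under the current directory, marking directories with "/".
find(sub {
    print $File::Find::name;
    print "/" if -d $_;
    print "\n";
}, ".");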

MIXING EVERYTHING

Now I'll present a more involved example. I'll show some bad code to explain what's wrong with it, and then show better code.

For this first example, I have two files: names.txt, with names and phone numbers, and systems.txt, with systems and the name of the person responsible for each. Here they are:

names.txt

John Doe, (555) 1234-4321
Jane Doe, (555) 5555-5555
The Boss, (666) 5555-5555

systems.txt

Sales, Jane Doe
Inventory, John Doe
Payment, That Guy

I want, then, to print the first file, with the system appended to the name of the person, if that person is responsible for that system. The first version might look like this:

#!/usr/bin/perl

use strict;
use warnings;

open FILE, "names.txt";

while(<FILE>) {
  my ($name) = /^([^,]*),/;
  my $system = get_system($name);
  print $_ . ", $system\n";
}

close FILE;

sub get_system($) {
  my ($name) = @_;
  my $system = "";

  open FILE, "systems.txt";

  while(<FILE>) {
    next unless /$name/o;
    ($system) = /([^,]*)/;
  }

  close FILE;

  return $system;
}

This code won't work, though. Perl will complain that the function was used too early for the prototype to be checked, but that's just a warning. It will give an error on line 8 (the first while loop), complaining about a readline on a closed filehandle. What happened here is that " FILE " is global, so the function get_system is changing it. Let's rewrite it, fixing both things:

#!/usr/bin/perl

use strict;
use warnings;

sub get_system($) {
  my ($name) = @_;
  my $system = "";

  open my $filehandle, "systems.txt";

  while(<$filehandle>) {
    next unless /$name/o;
    ($system) = /([^,]*)/;
  }

  close $filehandle;

  return $system;
}

open FILE, "names.txt";

while(<FILE>) {
  my ($name) = /^([^,]*),/;
  my $system = get_system($name);
  print $_ . ", $system\n";
}

close FILE;

This won't give any error or warnings, nor will it work. It returns just the systems, but not the names and phone numbers! What happened? Well, what happened is that we are making a reference to " $_ " after calling get_system , but, by reading the file, get_system is overwriting the value of $_ !

To avoid that, we'll make $_ local inside get_system . This will give it a local scope, and the original value will then be restored once returned from get_system :

#!/usr/bin/perl

use strict;
use warnings;

sub get_system($) {
  my ($name) = @_;
  my $system = "";
  local $_;

  open my $filehandle, "systems.txt";

  while(<$filehandle>) {
    next unless /$name/o;
    ($system) = /([^,]*)/;
  }

  close $filehandle;

  return $system;
}

open FILE, "names.txt";

while(<FILE>) {
  my ($name) = /^([^,]*),/;
  my $system = get_system($name);
  print $_ . ", $system\n";
}

close FILE;

And that still doesn't work! It prints a newline between the name and the system. Well, Perl reads the line including any newline it might have. There is a neat command which will remove newlines from strings, " chomp ", which we'll use to fix this problem. And since not every name has a system, we might, as well, avoid printing the comma when that happens:

#!/usr/bin/perl

use strict;
use warnings;

sub get_system($) {
  my ($name) = @_;
  my $system = "";
  local $_;

  open my $filehandle, "systems.txt";

  while(<$filehandle>) {
    next unless /$name/o;
    ($system) = /([^,]*)/;
  }

  close $filehandle;

  return $system;
}

open FILE, "names.txt";

while(<FILE>) {
  my ($name) = /^([^,]*),/;
  my $system = get_system($name);
  chomp;
  print $_;
  print ", $system" if $system ne "";
  print "\n";
}

close FILE;

That works, but it also happens to be horribly inefficient. We read the whole systems file for every line in the names file. To avoid that, we'll read all data from systems once, and then use that to process names.

Now, sometimes a file is so big you can't read it into memory. When that happens, you should try to read into memory any other file needed to process it, so that you can do everything in a single pass for each file. Anyway, here is the first optimized version of it:

#!/usr/bin/perl

use strict;
use warnings;

our %systems;
open SYSTEMS, "systems.txt";
while(<SYSTEMS>) {
  my ($system, $name) = /([^,]*),(.*)/;
  $systems{$name} = $system;
}
close SYSTEMS;

open NAMES, "names.txt";
while(<NAMES>) {
  my ($name) = /^([^,]*),/;
  chomp;
  print $_;
  print ", $systems{$name}" if defined $systems{$name};
  print "\n";
}
close NAMES;

Unfortunately, it doesn't work. No system ever appears! What has happened? Well, let's look into what " %systems " contains, by using Data::Dumper :

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

our %systems;
open SYSTEMS, "systems.txt";
while(<SYSTEMS>) {
  my ($system, $name) = /([^,]*),(.*)/;
  $systems{$name} = $system;
}
close SYSTEMS;

print Dumper(%systems);

open NAMES, "names.txt";
while(<NAMES>) {
  my ($name) = /^([^,]*),/;
  chomp;
  print $_;
  print ", $systems{$name}" if defined $systems{$name};
  print "\n";
}
close NAMES;

The output will be something like this:

$VAR1 = ' Jane Doe';
$VAR2 = 'Sales';
$VAR3 = ' That Guy';
$VAR4 = 'Payment';
$VAR5 = ' John Doe';
$VAR6 = 'Inventory';
John Doe, (555) 1234-4321
Jane Doe, (555) 5555-5555
The Boss, (666) 5555-5555

Those $VAR1/$VAR2/etc are how Dumper displays a hash table. The odd-numbered ones are the keys, and the succeeding even-numbered ones are the values. Now we can see that each name in %systems has a preceding space! Silly regex mistake; let's fix it:

#!/usr/bin/perl

use strict;
use warnings;

our %systems;
open SYSTEMS, "systems.txt";
while(<SYSTEMS>) {
  my ($system, $name) = /^\s*([^,]*?)\s*,\s*(.*?)\s*$/;
  $systems{$name} = $system;
}
close SYSTEMS;

open NAMES, "names.txt";
while(<NAMES>) {
  my ($name) = /^\s*([^,]*?)\s*,/;
  chomp;
  print $_;
  print ", $systems{$name}" if defined $systems{$name};
  print "\n";
}
close NAMES;

So, here, we are aggressively removing any spaces from the beginning or end of name and system. There are other ways to form that regex, but that's beside the point. There is still one problem with this script, which you'll have seen if your "names.txt" and/or "systems.txt" files have an empty line at the end. The warnings look like this:

Use of uninitialized value in hash element at ./exemplo3e.pl line 10, <SYSTEMS> line 4.
Use of uninitialized value in hash element at ./exemplo3e.pl line 10, <SYSTEMS> line 4.
John Doe, (555) 1234-4321, Inventory
Jane Doe, (555) 5555-5555, Sales
The Boss, (666) 5555-5555
Use of uninitialized value in hash element at ./exemplo3e.pl line 19, <NAMES> line 4.

What happened here is that nothing went into the " $name " variable when the empty line was processed. There are many ways around that, but I chose the following:

#!/usr/bin/perl

use strict;
use warnings;

our %systems;
open SYSTEMS, "systems.txt" or die "Could not open systems.txt!";
while(<SYSTEMS>) {
  my ($system, $name) = /^\s*([^,]+?)\s*,\s*(.+?)\s*$/;
  $systems{$name} = $system if defined $name;
}
close SYSTEMS;

open NAMES, "names.txt" or die "Could not open names.txt!";
while(<NAMES>) {
  my ($name) = /^\s*([^,]+?)\s*,/;
  chomp;
  print $_;
  print ", $systems{$name}" if defined($name) && defined($systems{$name});
  print "\n";
}
close NAMES;

The regular expressions now require at least one character for name and system, and we test to see if " $name " is defined before we use it.

CONCLUSION

Well, then, these are the basic tools to translate a shell script. You can do MUCH more with Perl, but that was not your question, and it wouldn't fit here anyway.

Just as a basic overview of some important topics,

There is a free, good, hands-on, hard & fast book about Perl called Learning Perl The Hard Way . Its style is similar to this very answer. It might be a good place to go from here.

I hope this helped.

DISCLAIMER

I'm NOT trying to teach Perl, and you will need to have at least some reference material. There are guidelines to good Perl habits, such as using " use strict; " and " use warnings; " at the beginning of the script, to make it less lenient of badly written code, or using STDOUT and STDERR on the print lines, to indicate the correct output pipe.

This is stuff I agree with, but I decided it would detract from the basic goal of showing patterns for common shell script utilities.

[Oct 16, 2017] Indenting Here Documents

Oct 16, 2017 | docstore.mik.ua
1.11. Indenting Here Documents

Problem

When using the multiline quoting mechanism called a here document , the text must be flush against the margin, which looks out of place in the code. You would like to indent the here document text in the code, but not have the indentation appear in the final string value.

Solution

Use a s/// substitution to strip out the leading whitespace:

# all in one
($var = <<HERE_TARGET) =~ s/^\s+//gm;
    your text
    goes here
HERE_TARGET

# or with two steps
$var = <<HERE_TARGET;
    your text
    goes here
HERE_TARGET
$var =~ s/^\s+//gm;
Discussion

The substitution is straightforward. It removes leading whitespace from the text of the here document. The /m modifier lets the ^ character match at the start of each line in the string, and the /g modifier makes the pattern matching engine repeat the substitution as often as it can (i.e., for every line in the here document).

($definition = <<'FINIS') =~ s/^\s+//gm;
    The five varieties of camelids
    are the familiar camel, his friends
    the llama and the alpaca, and the
    rather less well-known guanaco
    and vicuña.
FINIS

Be warned: all the patterns in this recipe use \s , which also matches newlines; if you don't want blank lines in the here document swallowed as well, replace \s with [^\S\n] in the patterns.

The substitution makes use of the property that the result of an assignment can be used as the left-hand side of =~ . This lets us do it all in one line, but it only works when you're assigning to a variable. When you're using the here document directly, it would be considered a constant value and you wouldn't be able to modify it. In fact, you can't change a here document's value unless you first put it into a variable.
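
On Perl 5.14 and later, the non-destructive /r modifier offers another way to get the same effect in one expression: it returns the modified copy, so the here document itself never has to be an assignable value. A minimal sketch (not in the original recipe):

# Requires Perl 5.14+ for the /r modifier.
my $var = (<<HERE_TARGET) =~ s/^\s+//gmr;
    your text
    goes here
HERE_TARGET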

Not to worry, though, because there's an easy way around this, particularly if you're going to do this a lot in the program. Just write a subroutine to do it:

sub fix {
    my $string = shift;
    $string =~ s/^\s+//gm;
    return $string;
}

print fix(<<"END");
    My stuff goes here
END

# With function predeclaration, you can omit the parens:
print fix <<"END";
    My stuff goes here
END

As with all here documents, you have to place this here document's target (the token that marks its end, END in this case) flush against the left-hand margin. If you want to have the target indented also, you'll have to put the same amount of whitespace in the quoted string as you use to indent the token.

($quote = <<'    FINIS') =~ s/^\s+//gm;
        ...we will have peace, when you and all your works have
        perished--and the works of your dark master to whom you would
        deliver us. You are a liar, Saruman, and a corrupter of men's
        hearts.  --Theoden in /usr/src/perl/taint.c
    FINIS
$quote =~ s/\s+--/\n--/;      #move attribution to line of its own
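
Perl 5.26 added the <<~ form of here documents, which strips the indentation of the terminator from every line for you, so in newer code much of this dequoting gymnastics is unnecessary. A minimal sketch (not from the recipe):

# Requires Perl 5.26+ for the indented here-document syntax <<~ .
my $quote = <<~'FINIS';
    We will have peace, when you and all your works have perished.
    FINIS
print $quote;   # the four leading spaces are stripped from the quoted line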

If you're doing this to strings that contain code you're building up for an eval , or just text to print out, you might not want to blindly strip off all leading whitespace because that would destroy your indentation. Although eval wouldn't care, your reader might.

Another embellishment is to use a special leading string for code that stands out. For example, here we'll prepend each line with @@@ , properly indented:

if ($REMEMBER_THE_MAIN) {
    $perl_main_C = dequote<<'    MAIN_INTERPRETER_LOOP';
        @@@ int
        @@@ runops() {
        @@@     SAVEI32(runlevel);
        @@@     runlevel++;
        @@@     while ( op = (*op->op_ppaddr)() ) ;
        @@@     TAINT_NOT;
        @@@     return 0;
        @@@ }
    MAIN_INTERPRETER_LOOP
    # add more code here if you want
}

Destroying indentation also gets you in trouble with poets.

sub dequote;
$poem = dequote<<EVER_ON_AND_ON;
       Now far ahead the Road has gone,
          And I must follow, if I can,
       Pursuing it with eager feet,
          Until it joins some larger way
       Where many paths and errands meet.
          And whither then? I cannot say.
                --Bilbo in /usr/src/perl/pp_ctl.c
EVER_ON_AND_ON
print "Here's your poem:\n\n$poem\n";

Here is its sample output:

Here's your poem:

Now far ahead the Road has gone,
   And I must follow, if I can,
Pursuing it with eager feet,
   Until it joins some larger way
Where many paths and errands meet.
   And whither then? I cannot say.
         --Bilbo in /usr/src/perl/pp_ctl.c
The following dequote function handles all of this:

sub dequote {
    local $_ = shift;
    my ($white, $leader);  # common whitespace and common leading string
    if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\1\2?.*\n)+$/) {
        ($white, $leader) = ($2, quotemeta($1));
    } else {
        ($white, $leader) = (/^(\s+)/, '');
    }
    s/^\s*?$leader(?:$white)?//gm;
    return $_;
}

If that pattern makes your eyes glaze over, you could always break it up and add comments by adding /x :

    if (m{
            ^                       # start of line
            \s *                    # 0 or more whitespace chars
            (?:                     # begin first non-remembered grouping
                 (                  #   begin save buffer $1
                    [^\w\s]         #     one byte neither space nor word
                    +               #     1 or more of such
                 )                  #   end save buffer $1
                 ( \s* )            #   put 0 or more white in buffer $2
                 .* \n              #   match through the end of first line
             )                      # end of first grouping
             (?:                    # begin second non-remembered grouping
                \s *                #   0 or more whitespace chars
                \1                  #   whatever string is destined for $1
                \2 ?                #   what'll be in $2, but optionally
                .* \n               #   match through the end of the line
             ) +                    # now repeat that group idea 1 or more
             $                      # until the end of the line
          }x
       )
    {
        ($white, $leader) = ($2, quotemeta($1));
    } else {
        ($white, $leader) = (/^(\s+)/, '');
    }
    s{
         ^                          # start of each line (due to /m)
         \s *                       # any amount of leading whitespace
            ?                       #   but minimally matched
         $leader                    # our quoted, saved per-line leader
         (?:                        # begin unremembered grouping
            $white                  #    the same amount
         ) ?                        # optionalize in case EOL after leader
    }{}xgm;

There, isn't that much easier to read? Well, maybe not; sometimes it doesn't help to pepper your code with insipid comments that mirror the code. This may be one of those cases.

See Also

The "Scalar Value Constructors" section of perldata (1) and the "Here Documents" section of Chapter 2 of Programming Perl ; the s/// operator in perlre (1) and perlop (1), and the "Pattern Matching" section of Chapter 2 of Programming Perl

[Oct 16, 2017] HERE documents

Oct 16, 2017 | www.perlmeme.org


Introduction

If you're tempted to write multi-line output with multiple print() statements, because that's what you're used to in some other language, consider using a HERE-document instead.

Inspired by the here-documents in the Unix command line shells, Perl HERE-documents provide a convenient way to handle the quoting of multi-line values.

So you can replace this:

    print "Welcome to the MCG Carpark.\n";
    print "\n";
    print "There are currently 2,506 parking spaces available.\n";
    print "Please drive up to a booth and collect a ticket.\n";

with this:

    print <<'EOT';
    Welcome to the MCG Carpark.

    There are currently 2,506 parking spaces available.
    Please drive up to a booth and collect a ticket.
    EOT

The EOT in this example is an arbitrary string that you provide to indicate the start and end of the text being quoted. The terminating string must appear on a line by itself.

Quoting conventions.

The usual Perl quoting conventions apply, so if you want to interpolate variables in a here-document, use double quotes around your chosen terminating string:

    print <<"EOT";
    Welcome to the MCG Carpark.

    There are currently $available_places parking spaces available.
    Please drive up to booth and collect a ticket.
    EOT

Note that whilst you can quote your terminator with " or ' , you cannot use the equivalent qq() and q() operators. So this code is invalid:

    # This example will fail
    print <<qq(EOT);
    Welcome to the MCG Carpark.

    There are currently $available_places parking spaces available.
    Please drive up to booth and collect a ticket.
    EOT
The terminating string

Naturally, all of the text you supply to a here-document is quoted by the starting and ending strings. This means that any indentation you provide becomes part of the text that is used. In this example, each line of the output will contain four leading spaces.

    # Let's indent the text to be displayed. The leading spaces will be
    # preserved in the output.
    print <<"EOT";
        Welcome to the MCG Carpark.

        CAR PARK FULL. 
    EOT

The terminating string must appear on a line by itself, and it must have no whitespace before or after it. In this example, the terminating string EOT is preceded by four spaces, so Perl will not find it:

    # Let's indent the following lines. This introduces an error
        print <<"EOT";
        Welcome to the MCG Carpark.

        CAR PARK FULL. 
        EOT
    Can't find string terminator "EOT" anywhere before EOF at ....
Assignment

The here-document mechanism is just a generalized means of quoting text, so you can just as easily use it in an assignment:

    my $message =  <<"EOT";
    Welcome to the MCG Carpark.

    CAR PARK FULL. 
    EOT

    print $message;

And don't let the samples you've seen so far stop you from considering the full range of possibilities. The here-document tag doesn't have to appear at the end of a statement.

Here is an example from CPAN.pm that conditionally assigns some text to $msg .

    $msg = <<EOF unless $configpm =~ /MyConfig/;

    # This is CPAN.pm's systemwide configuration file. This file provides
    # defaults for users, and the values can be changed in a per-user
    # configuration file. The user-config file is being looked for as
    # ~/.cpan/CPAN/MyConfig.pm.

    EOF

And this example from Module::Build::PPMMaker uses a here-document to construct the format string for sprintf() :

    $ppd .= sprintf(<<'EOF', $perl_version, $^O, $self->_varchname($build->config) );
        <PERLCORE VERSION="%s" />
        <OS NAME="%s" />
        <ARCHITECTURE NAME="%s" />
    EOF
See Also
     perldoc -q "HERE documents"
     perldoc perlop (see the <<EOF section).
     Perl for Newbies - Lecture 4

[Oct 14, 2017] On December 18, 2017, Perl turns 30, by Ruth Holloway

Notable quotes:
"... there is more than one way to do it ..."
"... Perl version 5.10 of Perl was released on the 20th anniversary of Perl 1.0: December 18, 2007. Version 5.10 marks the start of the "Modern Perl" movement. ..."
Oct 14, 2017 | opensource.com

Larry Wall released Perl 1.0 to the comp.sources.misc Usenet newsgroup on December 18, 1987. In the nearly 30 years since then, both the language and the community of enthusiasts that sprung up around it have grown and thrived -- and they continue to do so, despite suggestions to the contrary!

Wall's fundamental assertion -- there is more than one way to do it -- continues to resonate with developers. Perl allows programmers to embody the three chief virtues of a programmer: laziness, impatience, and hubris. Perl was originally designed for utility, not beauty. Perl is a programming language for fixing things, for quick hacks, and for making complicated things possible partly through the power of community. This was a conscious decision on Larry Wall's part: In an interview in 1999, he posed the question, "When's the last time you used duct tape on a duct?"

A history lesson

... ... ...

The Perl community

... ... ...

... ... ...

As Perl turns 30, the community that emerged around Larry Wall's solution to sticky system administration problems continues to grow and thrive. New developers enter the community all the time, and substantial new work is being done to modernize the language and keep it useful for solving a new generation of problems. Interested? Find your local Perl Mongers group, or join us online, or attend a Perl Conference near you!

Ruth Holloway - Ruth Holloway has been a system administrator and software developer for a long, long time, getting her professional start on a VAX 11/780, way back when. She spent a lot of her career (so far) serving the technology needs of libraries, and has been a contributor since 2008 to the Koha open source library automation suite. Ruth is currently a Perl Developer at cPanel in Houston, and also serves as chief of staff for an obnoxious cat. In her copious free time, she occasionally reviews old romance...

[Sep 27, 2017] g flag in Perl regular expressions

/g transforms m// and s/// to different commands with different contextual behaviour!
Sep 27, 2017 | www.perlmonks.org

Hello perl-diddler ,

If you look at the documentation for qr// , you'll see that the /g modifier is not supported:

qr/ STRING /msixpodualn
-- perlop#Regexp-Quote-Like-Operators

Which makes sense: qr turns STRING into a regular expression, which may then be used in any number of m{...} and s{...}{...} constructs. The appropriate place to add a /g modifier is at the point of use:

use strict;
use warnings;
use P;

my $re  = qr{ (\w+) }x;
my $dat = "Just another cats meow";
my @matches = $dat =~ /$re/g;
P "#matches=%s, matches=%s", scalar(@matches), \@matches;
exit scalar(@matches);

Output:

12:53 >perl 1645_SoPW.pl
#matches=4, matches=["Just", "another", "cats", "meow"]
12:54 >

Update:

P.S. - I also just noticed that in addition to stripping out the 'g' option, the 'x' option doesn't seem to work in the regex's parens, i.e. - (?x).

I don't understand what you're saying here. Can you give some example code?

Hope that helps,

[Sep 27, 2017] qq qw qr qx

crookedtimber.org

q// is generally the same thing as using single quotes - meaning it doesn't interpolate values inside the delimiters.
qq// is the same as double quoting a string. It interpolates.
qw// returns a list of whitespace-delimited words. @q = qw/this is a test/ is functionally the same as @q = ('this', 'is', 'a', 'test')
qx// is the same thing as using the backtick operators.
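
A small illustration of each (my example; the qx line assumes a Unix-like system where the external date command exists):

use strict;
use warnings;

my $name  = "world";
my $plain = q(hello $name);        # single-quote style: the literal string 'hello $name'
my $warm  = qq(hello $name);       # double-quote style: interpolates to 'hello world'
my @words = qw(this is a test);    # ('this', 'is', 'a', 'test')
my $re    = qr/^hello/;            # compiled regex; later usable as $warm =~ $re
my $today = qx(date);              # captures the output of the external date command
print "$plain\n$warm\n@words\n";
print "matched\n" if $warm =~ $re;
print $today;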

[Sep 18, 2017] The Fall Of Perl, The Webs Most Promising Language by Conor Myhrvold

The author pays outsized attention to superficial things like popularity with particular groups of users. For sysadmins this matters less than the level of integration with the underlying OS and the quality of the debugger.
The real story is that Python has a less steep initial learning curve, and that helped to entrench it in universities. Students brought it to large companies like Red Hat. The rest is history. Google's support was also a positive factor. Python also basked in OO hype. So it is now a more widespread language, much like Microsoft Basic once was. That does not automatically make it a better language in the sysadmin domain.
The phrase "Perl's quirky stylistic conventions, such as using $ in front to declare variables, are in contrast with the other declarative symbol $ for practical programmers today -- the money that goes into the continued development and feature set of Perl's frenemies such as Python and Ruby" smacks of a "syntax junkie" mentality. What is wrong with dereferencing using the $ symbol? Yes, it creates problems if you are simultaneously using other languages like C or Python, but for an experienced programmer this is a minor thing. Yes, Perl has some questionable syntax choices, but so does every other language in existence. While painful, it is the semantics and the "programming environment" that matter most.
My impression is that Perl returned to its roots -- it migrated back to being an excellent sysadmin tool -- as there is strong synergy between Perl and Unix shells. The fact that Perl 5 is reasonably stable is a huge plus in this area.
Notable quotes:
"... By the late 2000s Python was not only the dominant alternative to Perl for many text parsing tasks typically associated with Perl (i.e. regular expressions in the field of bioinformatics ) but it was also the most proclaimed popular language , talked about with elegance and eloquence among my circle of campus friends, who liked being part of an up-and-coming movement. ..."
"... Others point out that Perl is left out of the languages to learn first in an era where Python and Java had grown enormously, and a new entrant from the mid-2000s, Ruby, continues to gain ground by attracting new users in the web application arena (via Rails ), followed by the Django framework in Python (PHP has remained stable as the simplest option as well). ..."
"... In bioinformatics, where Perl's position as the most popular scripting language powered many 1990s breakthroughs like genetic sequencing, Perl has been supplanted by Python and the statistical language R (a variant of S-plus and descendent of S , also developed in the 1980s). ..."
"... By 2013, Python was the language of choice in academia, where I was to return for a year, and whatever it lacked in OOP classes, it made up for in college classes. Python was like Google, who helped spread Python and employed van Rossum for many years. Meanwhile, its adversary Yahoo (largely developed in Perl ) did well, but comparatively fell further behind in defining the future of programming. Python was the favorite and the incumbent; roles had been reversed. ..."
"... from my experience? Perl's eventual problem is that if the Perl community cannot attract beginner users like Python successfully has ..."
"... The fact that you have to import a library, or put up with some extra syntax, is significantly easier than the transactional cost of learning a new language and switching to it. ..."
"... MIT Python replaced Scheme as the first language of instruction for all incoming freshman, in the mid-2000s ..."
Jan 13, 2014 | www.fastcompany.com

And the rise of Python. Does Perl have a future?

I first heard of Perl when I was in middle school in the early 2000s. It was one of the world's most versatile programming languages, dubbed the Swiss army knife of the Internet. But compared to its rival Python, Perl has faded from popularity. What happened to the web's most promising language? Perl's low entry barrier compared to compiled, lower level language alternatives (namely, C) meant that Perl attracted users without a formal CS background (read: script kiddies and beginners who wrote poor code). It also boasted a small group of power users ("hardcore hackers") who could quickly and flexibly write powerful, dense programs that fueled Perl's popularity to a new generation of programmers.

A central repository (the Comprehensive Perl Archive Network, or CPAN ) meant that for every person who wrote code, many more in the Perl community (the Programming Republic of Perl ) could employ it. This, along with the witty evangelism by eclectic creator Larry Wall , whose interest in language ensured that Perl led in text parsing, was a formula for success during a time in which lots of text information was spreading over the Internet.

As the 21st century approached, many pearls of wisdom were wrought to move and analyze information on the web. Perl did have a learning curve -- often meaning that it was the third or fourth language learned by adopters -- but it sat at the top of the stack.

"In the race to the millennium, it looks like C++ will win, Java will place, and Perl will show," Wall said in the third State of Perl address in 1999. "Some of you no doubt will wish we could erase those top two lines, but I don't think you should be unduly concerned. Note that both C++ and Java are systems programming languages. They're the two sports cars out in front of the race. Meanwhile, Perl is the fastest SUV, coming up in front of all the other SUVs. It's the best in its class. Of course, we all know Perl is in a class of its own."

Then came the upset.

The Perl vs. Python Grudge Match

Then Python came along. Compared to Perl's straight-jacketed scripting, Python was a lopsided affair. It even took after its namesake, Monty Python's Flying Circus. Fittingly, most of Wall's early references to Python were lighthearted jokes at its expense. Well, the millennium passed, computers survived Y2K , and my teenage years came and went. I studied math, science, and humanities but kept myself an arm's distance away from typing computer code. My knowledge of Perl remained like the start of a new text file: cursory , followed by a lot of blank space to fill up.

In college, CS friends at Princeton raved about Python as their favorite language (in spite of popular professor Brian Kernighan on campus, who helped popularize C). I thought Python was new, but I later learned it was around when I grew up as well, just not visible on the charts.

By the late 2000s Python was not only the dominant alternative to Perl for many text parsing tasks typically associated with Perl (i.e. regular expressions in the field of bioinformatics ) but it was also the most proclaimed popular language , talked about with elegance and eloquence among my circle of campus friends, who liked being part of an up-and-coming movement.

Side By Side Comparison: Binary Search

Despite Python and Perl's well documented rivalry and design decision differences -- which persist to this day -- they occupy a similar niche in the programming ecosystem. Both are frequently referred to as "scripting languages," even though later versions are retro-fitted with object oriented programming (OOP) capabilities.

Stylistically, Perl and Python have different philosophies. Perl's best-known motto is " There's More Than One Way to Do It ". Python is designed to have one obvious way to do it. Python's construction gave an advantage to beginners: A syntax with more rules and stylistic conventions (for example, requiring whitespace indentations for functions) ensured newcomers would see a more consistent set of programming practices; code that accomplished the same task would look more or less the same. Perl's construction favors experienced programmers: a more compact, less verbose language with built-in shortcuts which made programming for the expert a breeze.

During the dotcom era and the tech recovery of the mid to late 2000s, high-profile websites and companies such as Dropbox (Python) and Amazon and Craigslist (Perl), in addition to some of the world's largest news organizations ( BBC , Perl ), used the languages to accomplish tasks integral to doing business on the Internet. But over the course of the last 15 years, not only has the way companies do business changed and grown, but so have the tools they use, unequally to the detriment of Perl. (A growing trend that was identified in the last comparison of the languages, " A Perl Hacker in the Land of Python ," as well as, from the Python side, a Pythonista's evangelism aggregator , also done in the year 2000.)

Perl's Slow Decline

Today, Perl's growth has stagnated. At the Orlando Perl Workshop in 2013, one of the talks was titled " Perl is not Dead, It is a Dead End ," and claimed that Perl now existed on an island. Once Perl programmers checked out, they always left for good, never to return. Others point out that Perl is left out of the languages to learn first in an era where Python and Java had grown enormously, and a new entrant from the mid-2000s, Ruby, continues to gain ground by attracting new users in the web application arena (via Rails ), followed by the Django framework in Python (PHP has remained stable as the simplest option as well).

In bioinformatics, where Perl's position as the most popular scripting language powered many 1990s breakthroughs like genetic sequencing, Perl has been supplanted by Python and the statistical language R (a variant of S-plus and descendent of S , also developed in the 1980s).

In scientific computing, my present field, Python, not Perl, is the open source overlord, even expanding at Matlab's expense (also a child of the 1980s , and similarly retrofitted with OOP abilities ). And upstart PHP grew in size to the point where it is now arguably the most common language for web development (although its position is dynamic, as Ruby and Python have quelled PHP's dominance and are now entrenched as legitimate alternatives.)

While Perl is not in danger of disappearing altogether, it is in danger of losing cultural relevance , an ironic fate given Wall's love of language. How has Perl become the underdog, and can this trend be reversed? (And, perhaps more importantly, will Perl 6 be released!?)

How I Grew To Love Python

Why Python , and not Perl? Perhaps an illustrative example of what happened to Perl is my own experience with the language. In college, I still stuck to the contained environments of Matlab and Mathematica, but my programming perspective changed dramatically in 2012. I realized lacking knowledge of structured computer code outside the "walled garden" of a desktop application prevented me from fully simulating hypotheses about the natural world, let alone analyzing data sets using the web, which was also becoming an increasingly intellectual and financially lucrative skill set.

One year after college, I resolved to learn a "real" programming language in a serious manner: An all-in immersion taking me over the hump of knowledge so that, even if I took a break, I would still retain enough to pick up where I left off. An older alum from my college who shared similar interests -- and an experienced programmer since the late 1990s -- convinced me of his favorite language to sift and sort through text in just a few lines of code, and "get things done": Perl. Python, he dismissed, was "what academics used to think." I was about to be acquainted formally.

Before making a definitive decision on which language to learn, I took stock of online resources, lurked on PerlMonks , and acquired several used O'Reilly books, the Camel Book and the Llama Book , in addition to other beginner books. Yet once again, Python reared its head , and even Perl forums and sites dedicated to the language were lamenting the digital siege their language was succumbing to . What happened to Perl? I wondered. Ultimately undeterred, I found enough to get started (quality over quantity, I figured!), and began studying the syntax and working through examples.

But it was not to be. In trying to overcome the engineered flexibility of Perl's syntax choices, I hit a wall. I had adopted Perl for text analysis, but upon accepting an engineering graduate program offer, switched to Python to prepare.

By this point, CPAN's enormous advantage had been whittled away by ad hoc, hodgepodge efforts from uncoordinated but overwhelming groups of Pythonistas that now assemble in Meetups , at startups, and on college and corporate campuses to evangelize the Zen of Python . This has created a lot of issues with importing ( pointed out by Wall ), and package download synchronizations to get scientific computing libraries (as I found), but has also resulted in distributions of Python such as Anaconda that incorporate the most important libraries besides the standard library to ease the time tariff on imports.

As if to capitalize on the zeitgeist, technical book publisher O'Reilly ran this ad , inflaming Perl devotees.


By 2013, Python was the language of choice in academia, where I was to return for a year, and whatever it lacked in OOP classes, it made up for in college classes. Python was like Google, who helped spread Python and employed van Rossum for many years. Meanwhile, its adversary Yahoo (largely developed in Perl ) did well, but comparatively fell further behind in defining the future of programming. Python was the favorite and the incumbent; roles had been reversed.

So after six months of Perl-making effort, this straw of reality broke the Perl camel's back and caused a coup that overthrew the programming Republic which had established itself on my laptop. I sheepishly abandoned the llama . Several weeks later, the tantalizing promise of a new MIT edX course teaching general CS principles in Python, in addition to numerous n00b examples , made Perl's syntax all too easy to forget instead of regret.

Measurements of the popularity of programming languages, in addition to friends and fellow programming enthusiasts I have met in the development community in the past year and a half, have confirmed this trend, along with the rise of Ruby in the mid-2000s, which has also eaten away at Perl's ubiquity in stitching together programs written in different languages.

While historically many arguments could explain away any one of these studies -- perhaps Perl programmers do not cheerlead their language as much, since they are too busy productively programming. Job listings or search engine hits could mean that a programming language has many errors and issues with it, or that there is simply a large temporary gap between supply and demand.

The concomitant picture, and one that many in the Perl community now acknowledge, is that Perl is now essentially a second-tier language, one that has its place but will not be among the first several languages known outside the computer science domain, such as Java, C, or now Python.

The Future Of Perl (Yes, It Has One)

I believe Perl has a future, but it could be one for a limited audience. Present-day Perl is more suitable to users who have worked with the language from its early days and are already dressed to impress. Perl's quirky stylistic conventions, such as using $ in front to declare variables, stand in contrast to the other meaning of $ that matters to practical programmers today: the money that goes into the continued development and feature set of Perl's frenemies such as Python and Ruby, and the high activation cost of learning Perl instead of implementing a Python solution. Ironically, much in the same way that Perl jested at other languages, Perl now finds itself at the receiving end.

What's wrong with Perl, from my experience? Perl's eventual problem is that if the Perl community cannot attract beginner users as successfully as Python has, it runs the risk of becoming like Children of Men: dwindling away to a standstill, with vast repositories of hieroglyphic code looming in sections of the Internet and in data center partitions like the halls of the Mines of Moria. (Awe-inspiring and historical? Yes. Lively? No.)

Perl 6 has been in development since 2000. Yet after 14 years it is still not officially done, making it the equivalent of Chinese Democracy for Guns N' Roses. In Larry Wall's words: "We're not trying to make Perl a better language than C++, or Python, or Java, or JavaScript. We're trying to make Perl a better language than Perl. That's all." Perl may be on the same self-inflicted path to perfection as Axl Rose, underestimating not others but itself. "All" might still be too much.

Absent a game-changing Perl release (which could still be "too little, too late"), people who learn to program in Python have no need to switch if Python can fulfill their needs, even if it is widely regarded as second or third best in some areas. The fact that you have to import a library, or put up with some extra syntax, is significantly easier to accept than the transactional cost of learning a new language and switching to it. So over time, Python's audience stays young through the gateway strategy that van Rossum himself pioneered, Computer Programming for Everybody. (This effort has been a complete success. For example, in the mid-2000s MIT replaced Scheme with Python as the first language of instruction for all incoming freshmen.)

Python Plows Forward

Python continues to gain footholds one by one in areas of interest, such as visualization (where Python still lags behind other languages' graphics, like Matlab, Mathematica, or the recent d3.js), website creation (the Django framework is now a mainstream choice), scientific computing (including NumPy/SciPy), parallel programming (mpi4py with CUDA), machine learning, and natural language processing (scikit-learn and NLTK), and the list continues.

While none of these efforts are centrally coordinated by van Rossum himself, a continually expanding user base, and reaching CS students before other languages (such as Java or C) do, increase the odds that practitioners in other disciplines will collaborate to build Python libraries for themselves, in the same open source spirit that made Perl a success in the 1990s.

As for me? I'm open to returning to Perl if it can offer me a significantly different experience from Python (but "being frustrating" doesn't count!). Perhaps Perl 6 will be that release. However, in the interim, I have heeded the advice of many others with a similar dilemma on the web. I'll just wait and C .

[Sep 17, 2017] Function pos() and finding the position at which the pattern matched the string

Notable quotes:
"... Using raw input $string in a regexp will act weird if somebody types in special characters (accidentally or maliciously). Consider using /\Q$string/gi to avoid treating $string as a regexp. ..."
May 16, 2017 | perldoc.perl.org
The position where the previous global match left off can be obtained with the pos() function. For example,
  $x = "cat dog house";   # 3 words
  while ($x =~ /(\w+)/g) {
      print "Word is $1, ends at position ", pos $x, "\n";
  }

prints

  Word is cat, ends at position 3
  Word is dog, ends at position 7
  Word is house, ends at position 13

A failed match or changing the target string resets the position. If you don't want the position reset after failure to match, add the /c modifier, as in /regex/gc .
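For instance, here is a small sketch (mine, not from the perldoc page) showing the difference the /c modifier makes after a failed match:

  use strict;
  use warnings;
  my $x = "cat dog house";

  $x =~ /dog/g;                       # pos($x) is now 7
  $x =~ /zebra/g;                     # failed match without /c resets pos($x)
  print defined pos($x) ? pos($x) : "undef", "\n";   # prints "undef"

  pos($x) = 0;                        # pos is an lvalue, so we can reset it by hand
  $x =~ /dog/g;                       # pos($x) is 7 again
  $x =~ /zebra/gc;                    # failed match with /c leaves pos($x) untouched
  print pos($x), "\n";                # prints 7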

In Perl, how do you find the position of a match in a string, if forced to use a foreach loop? (pos) - Stack Overflow

I have to find all the positions of matching strings within a larger string using a while loop, and as a second method using a foreach loop. I have figured out the while loop method, but I am stuck on a foreach method. Here is the 'while' method:

....

my $sequence = 'AACAAATTGAAACAATAAACAGAAACAAAAATGGATGCGATCAAGAAAAAGATGC'
             . 'AGGCGATGAAAATCGAGAAGGATAACGCTCTCGATCGAGCCGATGCCGCGGAAGA'
             . 'AAAAGTACGTCAAATGACGGAAAAGTTGGAACGAATCGAGGAAGAACTACGTGAT'
             . 'ACCCAGAAAAAGATGATGCNAACTGAAAATGATTTAGATAAAGCACAGGAAGATT'
             . 'TATCTGTTGCAAATACCAACTTGGAAGATAAGGAAAAGAAAGTTCAAGAGGCGGA'
             . 'GGCTGAGGTAGCANCCCTGAATCGTCGTATGACACTTCTGGAAGAGGAATTGGAA'
             . 'CGAGCTGAGGAACGTTTGAAGATTGCAACGGATAAATTGGAAGAAGCAACACATA'
             . 'CAGCTGATGAATCTGAACGTGTTCGCNAGGTTATGGAAA';
my $string = <STDIN>;
chomp $string;

while ($sequence =~ /$string/gi) {
    printf "Sequence found at position: %d\n", pos($sequence) - length($string);
}

Here is my foreach method:


foreach ($sequence =~ /$string/gi)
{
    printf "Sequence found at position: %d\n", pos($sequence) - length($string);
}

Could someone please give me a clue on why it doesn't work the same way? Thanks!

My Output if I input "aaca":


Part 1 using a while loop
Sequence found at position: 0
Sequence found at position: 10
Sequence found at position: 17
Sequence found at position: 23
Sequence found at position: 377
Part 2 using a foreach loop
Sequence found at position: -4
Sequence found at position: -4
Sequence found at position: -4
Sequence found at position: -4
Sequence found at position: -4

asked Jan 31 '11 at 21:38 by user83598
Using raw input $string in a regexp will act weird if somebody types in special characters (accidentally or maliciously). Consider using /\Q$string/gi to avoid treating $string as a regexp. aschepler Jan 31 '11 at 22:15
Answers

Accepted answer (9 votes): Your problem here is context. In the while loop, the condition is in scalar context. In scalar context, the match operator in g mode will sequentially match along the string. Thus checking pos within the loop does what you want.

In the foreach loop, the condition is in list context. In list context, the match operator in g mode will return a list of all matches (and it will calculate all of the matches before the loop body is ever entered). foreach is then loading the matches one by one into $_ for you, but you are never using the variable. pos in the body of the loop is not useful as it contains the result after the matches have ended.

The takeaway here is that if you want pos to work, and you are using the g modifier, you should use the while loop which imposes scalar context and makes the regex iterate across the matches in the string.
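A compact way to see that difference (my illustration, not part of the original answer):

my $x = "cat dog house";

my @words = ($x =~ /(\w+)/g);    # list context: all matches are computed up front,
                                 # and pos($x) is no longer useful afterwards
while ($x =~ /(\w+)/g) {         # scalar context: one match per iteration
    print "$1 ends at ", pos($x), "\n";   # pos($x) is meaningful inside the loop
}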

Sinan inspired me to write a few foreach examples:

  • This one is fairly succinct using split in separator retention mode:
    
    my $pos = 0;
    foreach (split /($string)/i => $sequence) {
        print "Sequence found at position: $pos\n" if lc eq lc $string;
        $pos += length;
    }
    
    
  • A regex equivalent of the split solution:
    
    my $pos = 0;
    foreach ($sequence =~ /(\Q$string\E|(?:(?!\Q$string\E).)+)/gi) {
        print "Sequence found at position: $pos\n" if lc eq lc $string;
        $pos += length;
    }
    
    
  • But this is clearly the best solution for your problem:
    
    {
        package Dumb::Homework;

        sub TIEARRAY {
            bless {
                haystack => $_[1],
                needle   => $_[2],
                size     => 2**31 - 1,
                pos      => [],
            }
        }

        sub FETCH {
            my ($self, $index) = @_;
            my ($pos, $needle) = @$self{qw(pos needle)};

            return $$pos[$index] if $index < @$pos;

            while ($index + 1 >= @$pos) {
                unless ($$self{haystack} =~ /\Q$needle/gi) {
                    $$self{size} = @$pos;
                    last;
                }
                push @$pos, pos($$self{haystack}) - length $needle;
            }
            $$pos[$index]
        }

        sub FETCHSIZE { $_[0]{size} }
    }

    tie my @pos, 'Dumb::Homework' => $sequence, $string;

    print "Sequence found at position: $_\n" foreach @pos;
    # look how clean it is
    
    

    The reason it's the best is that the other two solutions have to process the entire global match first, before you ever see a result. For large inputs (like DNA) that could be a problem. The Dumb::Homework package implements an array that will lazily find the next position each time the foreach iterator asks for it. It will even store the positions so you can get to them again without reprocessing. (In truth it looks one match past the requested match, which allows it to end properly in the foreach, but that is still much better than processing the whole list.)

  • Actually, the best solution is still to not use foreach as it is not the correct tool for the job.
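Putting that advice together with the /\Q...\E/ suggestion from the comment above, the plain while-loop version would look something like this (a sketch, not code quoted from the thread):

while ($sequence =~ /\Q$string\E/gi) {
    printf "Sequence found at position: %d\n", pos($sequence) - length($string);
}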

[Jun 28, 2017] A Short Guide to DBI

Notable quotes:
"... Structured Query Language ..."
"... database handle ..."
"... statement handle ..."
Jun 18, 2017 | www.perl.com
By Mark-Jason Dominus on October 22, 1999 12:00 AM Short guide to DBI (The Perl Database Interface Module) General information about relational databases

Relational databases started to get to be a big deal in the 1970's, and they're still a big deal today, which is a little peculiar, because they're a 1960's technology.

A relational database is a bunch of rectangular tables. Each row of a table is a record about one person or thing; the record contains several pieces of information called fields . Here is an example table:

        LASTNAME   FIRSTNAME   ID   POSTAL_CODE   AGE  SEX
        Gauss      Karl        119  19107         30   M
        Smith      Mark        3    T2V 3V4       53   M
        Noether    Emmy        118  19107         31   F
        Smith      Jeff        28   K2G 5J9       19   M
        Hamilton   William     247  10139         2    M

The names of the fields are LASTNAME , FIRSTNAME , ID , POSTAL_CODE , AGE , and SEX . Each line in the table is a record , or sometimes a row or tuple . For example, the first row of the table represents a 30-year-old male whose name is Karl Gauss, who lives at postal code 19107, and whose ID number is 119.

Sometimes this is a very silly way to store information. When the information naturally has a tabular structure it's fine. When it doesn't, you have to squeeze it into a table, and some of the techniques for doing that are more successful than others. Nevertheless, tables are simple and are easy to understand, and most of the high-performance database systems you can buy today operate under this 1960's model.

About SQL

SQL stands for Structured Query Language . It was invented at IBM in the 1970's. It's a language for describing searches and modifications to a relational database.

SQL was a huge success, probably because it's incredibly simple and anyone can pick it up in ten minutes. As a result, all the important database systems support it in some fashion or another. This includes the big players, like Oracle and Sybase, high-quality free or inexpensive database systems like MySQL, and funny hacks like Perl's DBD::CSV module, which we'll see later.

There are four important things one can do with a table:

SELECT
Find all the records that have a certain property

INSERT
Add new records
DELETE
Remove old records

UPDATE
Modify records that are already there

Those are the four most important SQL commands, also called queries . Suppose that the example table above is named people . Here are examples of each of the four important kinds of queries:

 SELECT firstname FROM people WHERE lastname = 'Smith'

(Locate the first names of all the Smiths.)

 DELETE FROM people WHERE id = 3

(Delete Mark Smith from the table)

 UPDATE people SET age = age+1 WHERE id = 247

(William Hamilton just had a birthday.)

 INSERT INTO people
        VALUES ('Euler', 'Leonhard', 322, '91058', 308, 'M')

(Add Leonhard Euler to the table.)

There are a bunch of other SQL commands for creating and discarding tables, for granting and revoking access permissions, for committing and abandoning transactions, and so forth. But these four are the important ones. Congratulations; you are now a SQL programmer. For the details, go to any reasonable bookstore and pick up a SQL quick reference.

About Databases --

Every database system is a little different. You talk to some databases over the network and make requests of the database engine; other databases you talk to through files or something else.

Typically when you buy a commercial database, you get a library with it. The vendor has written some functions for talking to the database in some language like C, compiled the functions, and the compiled code is the library. You can write a C program that calls the functions in the library when it wants to talk to the database.

There's a saying that any software problem can be solved by adding a layer of indirection. That's what Perl's DBI (`Database Interface') module is all about. It was written by Tim Bunce.

DBI is designed to protect you from the details of the vendor libraries. It has a very simple interface for saying what SQL queries you want to make, and for getting the results back. DBI doesn't know how to talk to any particular database, but it does know how to locate and load in DBD modules. The DBD modules have the vendor libraries in them and know how to talk to the real databases; there is one DBD module for every different database.

When you ask DBI to make a query for you, it sends the query to the appropriate DBD module, which spins around three times or drinks out of its sneaker or whatever is necessary to communicate with the real database. When it gets the results back, it passes them to DBI . Then DBI gives you the results. Since your program only has to deal with DBI , and not with the real database, you don't have to worry about barking like a chicken.

Here's your program talking to the DBI library. You are using two databases at once. One is an Oracle database server on some other machine, and another is a DBD::CSV database that stores the data in a bunch of plain text files on the local disk.

Your program sends a query to DBI , which forwards it to the appropriate DBD module; let's say it's DBD::Oracle . DBD::Oracle knows how to translate what it gets from DBI into the format demanded by the Oracle library, which is built into it. The library forwards the request across the network, gets the results back, and returns them to DBD::Oracle . DBD::Oracle returns the results to DBI as a Perl data structure. Finally, your program can get the results from DBI .

On the other hand, suppose that your program was querying the text files. It would prepare the same sort of query in exactly the same way, and send it to DBI in exactly the same way. DBI would see that you were trying to talk to the DBD::CSV database and forward the request to the DBD::CSV module. The DBD::CSV module has Perl functions in it that tell it how to parse SQL and how to hunt around in the text files to find the information you asked for. It then returns the results to DBI as a Perl data structure. Finally, your program gets the results from DBI in exactly the same way that it would have if you were talking to Oracle instead.

There are two big wins that result from this organization. First, you don't have to worry about the details of hunting around in text files or talking on the network to the Oracle server or dealing with Oracle's library. You just have to know how to talk to DBI .

Second, if you build your program to use Oracle, and then the following week upper management signs a new Strategic Partnership with Sybase, it's easy to convert your code to use Sybase instead of Oracle. You change exactly one line in your program, the line that tells DBI to talk to DBD::Oracle , and have it use DBD::Sybase instead. Or you might build your program to talk to a cheap, crappy database like MS Access, and then next year, when the application is doing well and getting more use than you expected, you can upgrade to a better database without changing any of your code.
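As a rough sketch of that "change one line" idea (the DSN details below are assumptions about a particular setup, not part of the article):

 use DBI;

 # The only line that changes when you switch vendors is the DSN:
 my $dbh = DBI->connect('DBI:Oracle:payroll')         # Oracle, via DBD::Oracle
         or die "Couldn't connect to database: " . DBI->errstr;

 # my $dbh = DBI->connect('DBI:Sybase:payroll')       # Sybase, via DBD::Sybase
 # my $dbh = DBI->connect('DBI:CSV:f_dir=./payroll')  # plain text files, via DBD::CSV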

There are DBD modules for talking to every important kind of SQL database. DBD::Oracle will talk to Oracle, and DBD::Sybase will talk to Sybase. DBD::ODBC will talk to any ODBC database including Microsoft Access. (ODBC is a Microsoft invention that is analogous to DBI itself. There is no DBD module for talking to Access directly.) DBD::CSV allows SQL queries on plain text files. DBD::mysql talks to the excellent MySQL database from TCX DataKonsultAB in Sweden. (MySQL is a tremendous bargain: It's $200 for commercial use, and free for noncommercial use.)

Example of How to Use DBI

Here's a typical program. When you run it, it waits for you to type a last name. Then it searches the database for people with that last name and prints out the full name and ID number for each person it finds. For example:

 Enter name> Noether
                118: Emmy Noether

        Enter name> Smith
                3: Mark Smith
                28: Jeff Smith

        Enter name> Snonkopus
                No names matched `Snonkopus'.
       
        Enter name> ^D

Here is the code:

 use DBI;

        my $dbh = DBI->connect('DBI:Oracle:payroll')
                or die "Couldn't connect to database: " . DBI->errstr;
        my $sth = $dbh->prepare('SELECT * FROM people WHERE lastname = ?')
                or die "Couldn't prepare statement: " . $dbh->errstr;

        print "Enter name> ";
        while ($lastname = <>) {               # Read input from the user
          my @data;
          chomp $lastname;
          $sth->execute($lastname)             # Execute the query
            or die "Couldn't execute statement: " . $sth->errstr;

          # Read the matching records and print them out         
          while (@data = $sth->fetchrow_array()) {
            my $firstname = $data[1];
            my $id = $data[2];
            print "\t$id: $firstname $lastname\n";
          }

          if ($sth->rows == 0) {
            print "No names matched `$lastname'.\n\n";
          }

          $sth->finish;
          print "\n";
          print "Enter name> ";
        }
         
        $dbh->disconnect;

Explanation of the Example --

 use DBI;

This loads in the DBI module. Notice that we don't have to load in any DBD module. DBI will do that for us when it needs to.

 my $dbh = DBI->connect('DBI:Oracle:payroll')
                or die "Couldn't connect to database: " . DBI->errstr;

The connect call tries to connect to a database. The first argument, DBI:Oracle:payroll , tells DBI what kind of database it is connecting to. The Oracle part tells it to load DBD::Oracle and to use that to communicate with the database. If we had to switch to Sybase next week, this is the one line of the program that we would change. We would have to change Oracle to Sybase .

payroll is the name of the database we will be searching. If we were going to supply a username and password to the database, we would do it in the connect call:

 my $dbh = DBI->connect('DBI:Oracle:payroll', 'username', 'password')
                or die "Couldn't connect to database: " . DBI->errstr;

If DBI connects to the database, it returns a database handle object, which we store into $dbh . This object represents the database connection. We can be connected to many databases at once and have many such database connection objects.

If DBI can't connect, it returns an undefined value. In this case, we use die to abort the program with an error message. DBI->errstr returns the reason why we couldn't connect-``Bad password'' for example.

 my $sth = $dbh->prepare('SELECT * FROM people WHERE lastname = ?')
                or die "Couldn't prepare statement: " . $dbh->errstr;

The prepare call prepares a query to be executed by the database. The argument is any SQL at all. On high-end databases, prepare will send the SQL to the database server, which will compile it. If prepare is successful, it returns a statement handle object which represents the statement; otherwise it returns an undefined value and we abort the program. $dbh->errstr will return the reason for failure, which might be ``Syntax error in SQL''. It gets this reason from the actual database, if possible.

The ? in the SQL will be filled in later. Most databases can handle this. For some databases that don't understand the ? , the DBD module will emulate it for you and will pretend that the database understands how to fill values in later, even though it doesn't.

 print "Enter name> ";

Here we just print a prompt for the user.

 while ($lastname = <>) {               # Read input from the user
          ...
        }

This loop will repeat over and over again as long as the user keeps entering last names. It exits at end of input, when the user types ^D (as in the sample session above). The Perl <> symbol means to read from the terminal or from files named on the command line if there were any.

 my @data;

This declares a variable to hold the data that we will get back from the database.

 chomp $lastname;

This trims the newline character off the end of the user's input.

 $sth->execute($lastname)             # Execute the query
            or die "Couldn't execute statement: " . $sth->errstr;

execute executes the statement that we prepared before. The argument $lastname is substituted into the SQL in place of the ? that we saw earlier. execute returns a true value if it succeeds and a false value otherwise, so we abort if for some reason the execution fails.

 while (@data = $sth->fetchrow_array()) {
            ...
           }

fetchrow_array returns one of the selected rows from the database. You get back an array whose elements contain the data from the selected row. In this case, the array you get back has six elements. The first element is the person's last name; the second element is the first name; the third element is the ID, and then the other elements are the postal code, age, and sex.

Each time we call fetchrow_array , we get back a different record from the database. When there are no more matching records, fetchrow_array returns the empty list and the while loop exits.

 my $firstname = $data[1];
             my $id = $data[2];

These lines extract the first name and the ID number from the record data.

 print "\t$id: $firstname $lastname\n";

This prints out the result.

 if ($sth->rows == 0) {
            print "No names matched `$lastname'.\n\n";
          }

The rows method returns the number of rows of the database that were selected. If no rows were selected, then there is nobody in the database with the last name that the user is looking for. In that case, we print out a message. We have to do this after the while loop that fetches whatever rows were available, because with some databases you don't know how many rows there were until after you've gotten them all.
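One portable way around that caveat (my sketch, not from the article) is to count the rows yourself as you fetch them; this works the same way on every database:

 my $count = 0;
 while (my @data = $sth->fetchrow_array()) {
   $count++;
   print "\t$data[2]: $data[1] $lastname\n";
 }
 print "No names matched `$lastname'.\n\n" if $count == 0;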

 $sth->finish;
          print "\n";
          print "Enter name> ";

Once we're done reporting about the result of the query, we print another prompt so that the user can enter another name. finish tells the database that we have finished retrieving all the data for this query and allows it to reinitialize the handle so that we can execute it again for the next query.

 $dbh->disconnect;

When the user has finished querying the database, they type ^D to signal end of input and the main while loop exits. disconnect closes the connection to the database.

Cached Queries

Here's a function which looks up someone in the example table, given their ID number, and returns their age:

 sub age_by_id {
          # Arguments: database handle, person ID number
          my ($dbh, $id) = @_;
          my $sth = $dbh->prepare('SELECT age FROM people WHERE id = ?')
            or die "Couldn't prepare statement: " . $dbh->errstr;

 $sth->execute($id)
            or die "Couldn't execute statement: " . $sth->errstr;

 my ($age) = $sth->fetchrow_array();
          return $age;
        }

It prepares the query, executes it, and retrieves the result.

There's a problem here though. Even though the function works correctly, it's inefficient. Every time it's called, it prepares a new query. Typically, preparing a query is a relatively expensive operation. For example, the database engine may parse and understand the SQL and translate it into an internal format. Since the query is the same every time, it's wasteful to throw away this work when the function returns.

Here's one solution:

 { my $sth;
          sub age_by_id {
            # Arguments: database handle, person ID number
            my ($dbh, $id) = @_;

 if (! defined $sth) {
              $sth = $dbh->prepare('SELECT age FROM people WHERE id = ?')
                or die "Couldn't prepare statement: " . $dbh->errstr;
            }

 $sth->execute($id)
              or die "Couldn't execute statement: " . $sth->errstr;

 my ($age) = $sth->fetchrow_array();
            return $age;
          }
        }

There are two big changes to this function from the previous version. First, the $sth variable has moved outside of the function; this tells Perl that its value should persist even after the function returns. Next time the function is called, $sth will have the same value as before.

Second, the prepare code is in a conditional block. It's only executed if $sth does not yet have a value. The first time the function is called, the prepare code is executed and the statement handle is stored into $sth . This value persists after the function returns, and the next time the function is called, $sth still contains the statement handle and the prepare code is skipped.

Here's another solution:

 sub age_by_id {
          # Arguments: database handle, person ID number
          my ($dbh, $id) = @_;
          my $sth = $dbh->prepare_cached('SELECT age FROM people WHERE id = ?')
            or die "Couldn't prepare statement: " . $dbh->errstr;

 $sth->execute($id)
            or die "Couldn't execute statement: " . $sth->errstr;

 my ($age) = $sth->fetchrow_array();
          return $age;
        }

Here the only change is to replace prepare with prepare_cached . The prepare_cached call is just like prepare , except that it looks to see if the query is the same as last time. If so, it gives you the statement handle that it gave you before.

Transactions

Many databases support transactions . This means that you can make a whole bunch of queries which would modify the databases, but none of the changes are actually made. Then at the end you issue the special SQL query COMMIT , and all the changes are made simultaneously. Alternatively, you can issue the query ROLLBACK , in which case all the queries are thrown away.

As an example of this, consider a function to add a new employee to a database. The database has a table called employees that looks like this:

 FIRSTNAME  LASTNAME   DEPARTMENT_ID
        Gauss      Karl       17
        Smith      Mark       19
        Noether    Emmy       17
        Smith      Jeff       666
        Hamilton   William    17

and a table called departments that looks like this:

 ID   NAME               NUM_MEMBERS
        17   Mathematics        3
        666  Legal              1
        19   Grounds Crew       1

The mathematics department is department #17 and has three members: Karl Gauss, Emmy Noether, and William Hamilton.

Here's our first cut at a function to insert a new employee. It will return true or false depending on whether or not it was successful:

 sub new_employee {
          # Arguments: database handle; first and last names of new employee;
          # department ID number for new employee's work assignment
          my ($dbh, $first, $last, $department) = @_;
          my ($insert_handle, $update_handle);

 my $insert_handle =
            $dbh->prepare_cached('INSERT INTO employees VALUES (?,?,?)');
          my $update_handle =
            $dbh->prepare_cached('UPDATE departments
                                     SET num_members = num_members + 1
                                   WHERE id = ?');

 die "Couldn't prepare queries; aborting"
            unless defined $insert_handle && defined $update_handle;

 $insert_handle->execute($first, $last, $department) or return 0;
          $update_handle->execute($department) or return 0;
          return 1;   # Success
        }

We create two handles, one for an insert query that will insert the new employee's name and department number into the employees table, and an update query that will increment the number of members in the new employee's department in the department table. Then we execute the two queries with the appropriate arguments.

There's a big problem here: Suppose, for some reason, the second query fails. Our function returns a failure code, but it's too late, it has already added the employee to the employees table, and that means that the count in the departments table is wrong. The database now has corrupted data in it.

The solution is to make both updates part of the same transaction. Most databases will do this automatically, but without an explicit instruction about whether or not to commit the changes, some databases will commit the changes when we disconnect from the database, and others will roll them back. We should specify the behavior explicitly.

Typically, no changes will actually be made to the database until we issue a commit . The version of our program with commit looks like this:

 sub new_employee {
          # Arguments: database handle; first and last names of new employee;
          # department ID number for new employee's work assignment
          my ($dbh, $first, $last, $department) = @_;
          my ($insert_handle, $update_handle);

 my $insert_handle =
            $dbh->prepare_cached('INSERT INTO employees VALUES (?,?,?)');
          my $update_handle =
            $dbh->prepare_cached('UPDATE departments
                                     SET num_members = num_members + 1
                                   WHERE id = ?');

 die "Couldn't prepare queries; aborting"
            unless defined $insert_handle && defined $update_handle;

 my $success = 1;
          $success &&= $insert_handle->execute($first, $last, $department);
          $success &&= $update_handle->execute($department);

 my $result = ($success ? $dbh->commit : $dbh->rollback);
          unless ($result) {
            die "Couldn't finish transaction: " . $dbh->errstr
          }
          return $success;
        }

We perform both queries, and record in $success whether they both succeeded. $success will be true if both queries succeeded, false otherwise. If the queries succeeded, we commit the transaction; otherwise, we roll it back, cancelling all our changes.

The problem of concurrent database access is also solved by transactions. Suppose that queries were executed immediately, and that some other program came along and examined the database after our insert but before our update. It would see inconsistent data in the database, even if our update would eventually have succeeded. But with transactions, all the changes happen simultaneously when we do the commit , and the commit is atomic, which means that any other program looking at the database sees either all of the changes or none of them.
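A common way to arrange this in practice is to turn off AutoCommit, let failures raise exceptions (using the RaiseError option described below), and wrap the whole transaction in an eval. This is a sketch, not code from the article: it reuses $insert_handle and $update_handle from the new_employee example above, and $user and $password are placeholders.

 my $dbh = DBI->connect('DBI:Oracle:payroll', $user, $password,
                        { RaiseError => 1, AutoCommit => 0 })
         or die "Couldn't connect to database: " . DBI->errstr;

 eval {
   $insert_handle->execute($first, $last, $department);
   $update_handle->execute($department);
   $dbh->commit;                  # both changes become visible together
 };
 if ($@) {                        # an execute (or the commit) threw an exception
   warn "Transaction aborted: $@";
   $dbh->rollback;
 }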

Miscellaneous --

do

If you're doing an UPDATE , INSERT , or DELETE there is no data that comes back from the database, so there is a short cut. You can say

 $dbh->do('DELETE FROM people WHERE age > 65');

for example, and DBI will prepare the statement, execute it, and finish it. do returns a true value if it succeeded, and a false value if it failed. Actually, if it succeeds it returns the number of affected rows. In the example it would return the number of rows that were actually deleted. ( DBI plays a magic trick so that the value it returns is true even when it is 0. This is bizarre, because 0 is usually false in Perl. But it's convenient because you can use it either as a number or as a true-or-false success code, and it works both ways.)
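For example (my sketch, not from the article), you can capture that return value, and do also accepts placeholder values directly:

 my $deleted = $dbh->do('DELETE FROM people WHERE age > ?', undef, 65)
         or die "Couldn't delete: " . $dbh->errstr;
 print "Deleted $deleted row(s)\n";   # prints "0E0" (zero, but true) if nothing matched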

AutoCommit

If your transactions are simple, you can save yourself the trouble of having to issue a lot of commit s. When you make the connect call, you can specify an AutoCommit option which will perform an automatic commit operation after every successful query. Here's what it looks like:

 my $dbh = DBI->connect('DBI:Oracle:payroll', undef, undef,
                               {AutoCommit => 1},
                              )
                or die "Couldn't connect to database: " . DBI->errstr;

Automatic Error Handling

When you make the connect call, you can specify a RaiseError option that handles errors for you automatically. When an error occurs, DBI will abort your program instead of returning a failure code. If all you want is to abort the program on an error, this can be convenient:

 my $dbh = DBI->connect('DBI:Oracle:payroll', undef, undef,
                               {RaiseError => 1},
                              )
                or die "Couldn't connect to database: " . DBI->errstr;

Don't do This

People are always writing code like this:

 while ($lastname = <>) {
          my $sth = $dbh->prepare("SELECT * FROM people
                                   WHERE lastname = '$lastname'");
          $sth->execute();
          # and so on ...
        }

Here we interpolated the value of $lastname directly into the SQL in the prepare call.

This is a bad thing to do for three reasons.

First, prepare calls can take a long time. The database server has to compile the SQL and figure out how it is going to run the query. If you have many similar queries, that is a waste of time.

Second, it will not work if $lastname contains a name like O'Malley or D'Amico or some other name with an ' . The ' has a special meaning in SQL, and the database will not understand when you ask it to prepare a statement that looks like

 SELECT * FROM people WHERE lastname = 'O'Malley'

It will see that you have three ' s and complain that you don't have a fourth matching ' somewhere else.

Finally, if you're going to be constructing your query based on a user input, as we did in the example program, it's unsafe to simply interpolate the input directly into the query, because the user can construct a strange input in an attempt to trick your program into doing something it didn't expect. For example, suppose the user enters the following bizarre value for $input :

 x' or lastname = lastname or lastname = 'y

Now our query has become something very surprising:

 SELECT * FROM people WHERE lastname = 'x'
         or lastname = lastname or lastname = 'y'

The part of this query that our sneaky user is interested in is the second or clause. This clause selects all the records for which lastname is equal to lastname ; that is, all of them. We thought that the user was only going to be able to see a few records at a time, and now they've found a way to get them all at once. This probably wasn't what we wanted.


People go to all sorts of trouble to get around these problems with interpolation. They write a function that puts the last name in quotes and then backslashes any apostrophes that appear in it. Then it breaks because they forgot to backslash backslashes. Then they make their escape function better. Then their code is a big mess because they are calling the backslashing function every other line. They put a lot of work into the backslashing function, and it was all for nothing, because the whole problem is solved by just putting a ? into the query, like this:

 SELECT * FROM people WHERE lastname = ?

All my examples look like this. It is safer and more convenient and more efficient to do it this way.
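Here is the earlier loop rewritten with a placeholder, as a sketch:

 my $sth = $dbh->prepare('SELECT * FROM people WHERE lastname = ?')
         or die "Couldn't prepare statement: " . $dbh->errstr;

 while ($lastname = <>) {
   chomp $lastname;
   $sth->execute($lastname);     # DBI supplies the quoting; O'Malley works fine
   # and so on ...
 }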

[Jun 28, 2017] Bless My Referents by Damian Conway

September 16, 1999 | www.perl.com
Introduction

Damian Conway is the author of the newly released Object Oriented Perl , the first of a new series of Perl books from Manning.

Object-oriented programming in Perl is easy. Forget the heavy theory and the sesquipedalian jargon: classes in Perl are just regular packages, objects are just variables, methods are just subroutines. The syntax and semantics are a little different from regular Perl, but the basic building blocks are completely familiar.

The one problem most newcomers to object-oriented Perl seem to stumble over is the notion of references and referents, and how the two combine to create objects in Perl. So let's look at how references and referents relate to Perl objects, and see who gets to be blessed and who just gets to point the finger.

Let's start with a short detour down a dark alley...

References and referents

Sometimes it's important to be able to access a variable indirectly: to be able to use it without specifying its name. There are two obvious motivations: the variable you want may not have a name (it may be an anonymous array or hash), or you may only know which variable you want at run-time (so you don't have a name to offer the compiler).

To handle such cases, Perl provides a special scalar datatype called a reference . A reference is like the traditional Zen idea of the "finger pointing at the moon". It's something that identifies a variable, and allows us to locate it. And that's the stumbling block most people need to get over: the finger (reference) isn't the moon (variable); it's merely a means of working out where the moon is.

Making a reference

When you prefix an existing variable or value with the unary \ operator you get a reference to the original variable or value. That original is then known as the referent to which the reference refers.

For example, if $s is a scalar variable, then \$s is a reference to that scalar variable (i.e. a finger pointing at it) and $s is that finger's referent. Likewise, if @a is an array, then \@a is a reference to it.

In Perl, a reference to any kind of variable can be stored in another scalar variable. For example:

$slr_ref = \$s;     # scalar $slr_ref stores a reference to scalar $s
$arr_ref = \@a;     # scalar $arr_ref stores a reference to array @a
$hsh_ref = \%h;     # scalar $hsh_ref stores a reference to hash %h

Figure 1 shows the relationships produced by those assignments.

Note that the references are separate entities from the referents at which they point. The only time that isn't the case is when a variable happens to contain a reference to itself:

$self_ref = \$self_ref;     # $self_ref stores a reference to itself!

That (highly unusual) situation produces an arrangement shown in Figure 2.

Once you have a reference, you can get back to the original thing it refers to (its referent) simply by prefixing the variable containing the reference (optionally in curly braces) with the appropriate variable symbol. Hence to access $s , you could write $$slr_ref or ${$slr_ref} . At first glance, that might look like one too many dollar signs, but it isn't. The $slr_ref tells Perl which variable has the reference; the extra $ tells Perl to follow that reference and treat the referent as a scalar.

Similarly, you could access the array @a as @{$arr_ref} , or the hash %h as %{$hsh_ref} . In each case, the $whatever_ref is the name of the scalar containing the reference, and the leading @ or % indicates what type of variable the referent is. That type is important: if you attempt to prefix a reference with the wrong symbol (for example, @{$slr_ref} or ${$hsh_ref} ), Perl produces a fatal run-time error.

[A series of scalar variables with arrows pointing to other variables]
Figure 1: References and their referents

[A scalar variable with an arrow pointing back to itself]
Figure 2: A reference that is its own referent

The "arrow" operator

Accessing the elements of an array or a hash through a reference can be awkward using the syntax shown above. You end up with a confusing tangle of dollar signs and brackets:

${$arr_ref}[0] = ${$hsh_ref}{"first"};  # i.e. $a[0] = $h{"first"}

So Perl provides a little extra syntax to make life just a little less cluttered:

$arr_ref->[0] = $hsh_ref->{"first"};    # i.e. $a[0] = $h{"first"}

The "arrow" operator ( -> ) takes a reference on its left and either an array index (in square brackets) or a hash key (in curly braces) on its right. It locates the array or hash that the reference refers to, and then accesses the appropriate element of it.

Identifying a referent

Because a scalar variable can store a reference to any kind of data, and because dereferencing a reference with the wrong prefix leads to fatal errors, it's sometimes important to be able to determine what type of referent a specific reference refers to. Perl provides a built-in function called ref that takes a scalar and returns a description of the kind of reference it contains. Table 1 summarizes the string that is returned for each type of reference.

If $slr_ref contains...                   then ref($slr_ref) returns...
undef                                     undef
a reference to a scalar                   "SCALAR"
a reference to an array                   "ARRAY"
a reference to a hash                     "HASH"
a reference to a subroutine               "CODE"
a reference to a filehandle               "IO" or "IO::Handle"
a reference to a typeglob                 "GLOB"
a reference to a precompiled pattern      "Regexp"
a reference to another reference          "REF"


Table 1: What ref returns

As Table 1 indicates, you can create references to many kinds of Perl constructs, apart from variables.

If a reference is used in a context where a string is expected, then the ref function is called automatically to produce the expected string, and a unique hexadecimal value (the internal memory address of the thing being referred to) is appended. That means that printing out a reference:

print $hsh_ref, "\n";
produces something like:

HASH(0x10027588)

since each element of print 's argument list is stringified before printing.

The ref function has a vital additional role in object-oriented Perl, where it can be used to identify the class to which a particular object belongs. More on that in a moment.

References, referents, and objects

References and referents matter because they're both required when you come to build objects in Perl. In fact, Perl objects are just referents (i.e. variables or values) that have a special relationship with a particular package. References come into the picture because Perl objects are always accessed via a reference, using an extension of the "arrow" notation.

But that doesn't mean that Perl's object-oriented features are difficult to use (even if you're still unsure of references and referents). To do real, useful, production-strength, object-oriented programming in Perl you only need to learn about one extra function, one straightforward piece of additional syntax, and three very simple rules. Let's start with the rules...

Rule 1: To create a class, build a package

Perl packages already have a number of class-like features: they collect related code together, they distinguish that code from unrelated code elsewhere, and they provide a separate namespace for the variables and subroutines declared inside them.

In Perl, those features are sufficient to allow a package to act like a class.

Suppose you wanted to build an application to track faults in a system. Here's how to declare a class named "Bug" in Perl:

package Bug;
That's it! In Perl, classes are packages. No magic, no extra syntax, just plain, ordinary packages. Of course, a class like the one declared above isn't very interesting or useful, since its objects will have no attributes or behaviour.

That brings us to the second rule...

Rule 2: To create a method, write a subroutine

In object-oriented theory, methods are just subroutines that are associated with a particular class and exist specifically to operate on objects that are instances of that class. In Perl, a subroutine that is declared in a particular package is already associated with that package. So to write a Perl method, you just write a subroutine within the package that is acting as your class.

For example, here's how to provide an object method to print Bug objects:

package Bug;
sub print_me
{
       # The code needed to print the Bug goes here
}

Again, that's it. The subroutine print_me is now associated with the package Bug, so whenever Bug is used as a class, Perl automatically treats Bug::print_me as a method.

Invoking the Bug::print_me method involves that one extra piece of syntax mentioned above-an extension to the existing Perl "arrow" notation. If you have a reference to an object of class Bug, you can access any method of that object by using a -> symbol, followed by the name of the method.

For example, if the variable $nextbug holds a reference to a Bug object, you could call Bug::print_me on that object by writing:

$nextbug->print_me();
Calling a method through an arrow should be very familiar to any C++ programmers; for the rest of us, it's at least consistent with other Perl usages:
$hsh_ref->{"key"};           
# Access the hash referred to by $hashref
$arr_ref->[$index];          
# Access the array referred to by $arrayref
$sub_ref->(@args);           
# Access the sub referred to by $subref

$obj_ref->method(@args);     
# Access the object referred to by $objref


The only difference with the last case is that the referent (i.e. the object) pointed to by $objref has many ways of being accessed (namely, its various methods). So, when you want to access that object, you have to specify which particular way-which method-should be used. Hence, the method name after the arrow.

When a method like Bug::print_me is called, the argument list that it receives begins with the reference through which it was called, followed by any arguments that were explicitly given to the method. That means that calling Bug::print_me("logfile") is not the same as calling $nextbug->print_me("logfile") . In the first case, print_me is treated as a regular subroutine so the argument list passed to Bug::print_me is equivalent to:

( "logfile" )
In the second case, print_me is treated as a method so the argument list is equivalent to:
( $objref, "logfile" )
Having a reference to the object passed as the first parameter is vital, because it means that the method then has access to the object on which it's supposed to operate. Hence you'll find that most methods in Perl start with something equivalent to this:
package Bug;
sub print_me
{
    my ($self) = shift;

    # The @_ array now stores the arguments passed to &Bug::print_me
    # The rest of &print_me uses the data referred to by $self 
    # and the explicit arguments (still in @_)
}
or, better still:
package Bug;
sub print_me
{
    my ($self, @args) = @_;

    # The @args array now stores the arguments passed to &Bug::print_me
    # The rest of &print_me uses the data referred to by $self
    # and the explicit arguments (now in @args)
}
This second version is better because it provides a lexically scoped copy of the argument list ( @args ). Remember that the @_ array is "magical"-changing any element of it actually changes the caller's version of the corresponding argument. Copying argument values to a lexical array like @args prevents nasty surprises of this kind, as well as improving the internal documentation of the subroutine (especially if a more meaningful name than @args is chosen).
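To see why that copy matters, here is a tiny example of the @_ aliasing at work (mine, not from the article):

sub spoil { $_[0] = "changed" }     # assigns through the alias in @_

my $arg = "original";
spoil($arg);
print $arg, "\n";                   # prints "changed": the caller's variable was modified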

The only remaining question is: how do you create the invoking object in the first place?

Rule 3: To create an object, bless a referent

Unlike other object-oriented languages, Perl doesn't require that an object be a special kind of record-like data structure. In fact, you can use any existing type of Perl variable-a scalar, an array, a hash, etc.-as an object in Perl.

Hence, the issue isn't how to create the object, because you create them exactly like any other Perl variable: declare them with a my , or generate them anonymously with a [ ... ] or { ... } . The real problem is how to tell Perl that such an object belongs to a particular class. That brings us to the one extra built-in Perl function you need to know about. It's called bless , and its only job is to mark a variable as belonging to a particular class.

The bless function takes two arguments: a reference to the variable to be marked, and a string containing the name of the class. It then sets an internal flag on the variable, indicating that it now belongs to the class.

For example, suppose that $nextbug actually stores a reference to an anonymous hash:

$nextbug = {
                id    => "00001",
                type  => "fatal",
                descr => "application does not compile",
           };
To turn that anonymous hash into an object of class Bug you write:
bless $nextbug, "Bug";
And, once again, that's it! The anonymous hash referred to by $nextbug is now marked as being an object of class Bug. Note that the variable $nextbug itself hasn't been altered in any way; only the nameless hash it refers to has been marked. In other words, bless sanctified the referent, not the reference. Figure 3 illustrates where the new class membership flag is set.

You can check that the blessing succeeded by applying the built-in ref function to $nextbug . As explained above, when ref is applied to a reference, it normally returns the type of that reference. Hence, before $nextbug was blessed, ref($nextbug) would have returned the string 'HASH' .

Once an object is blessed, ref returns the name of its class instead. So after the blessing, ref($nextbug) will return 'Bug' . Of course the object itself still is a hash, but now it's a hash that belongs to the Bug class. The various entries of the hash become the attributes of the newly created Bug object.
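For example, a small sketch along the lines of the article's $nextbug:

my $nextbug = { id => "00001", type => "fatal" };

print ref($nextbug), "\n";     # prints "HASH" (just an ordinary hash reference)
bless $nextbug, "Bug";
print ref($nextbug), "\n";     # prints "Bug" (it now reports its class)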

[A picture of an anonymous hash having a flag set within it]
Figure 3: What changes when an object is blessed

Creating a constructor

Given that you're likely to want to create many such Bug objects, it would be convenient to have a subroutine that took care of all the messy, blessy details. You could pass it the necessary information, and it would then wrap it in an anonymous hash, bless the hash, and give you back a reference to the resulting object.

And, of course, you might as well put such a subroutine in the Bug package itself, and call it something that indicates its role. Such a subroutine is known as a constructor, and it generally looks like this:

package Bug;
sub new
{
    my $class = $_[0];
    my $objref = {
                     id    => $_[1],
                     type  => $_[2],
                     descr => $_[3],
                 };
    bless $objref, $class;
    return $objref;
}
Note that the middle bits of the subroutine (in bold) look just like the raw blessing that was handed out to $nextbug in the previous example.

The bless function is set up to make writing constructors like this a little easier. Specifically, it returns the reference that's passed as its first argument (i.e. the reference to whatever referent it just blessed into object-hood). And since Perl subroutines automatically return the value of their last evaluated statement, that means that you could condense the definition of Bug::new to this:

sub Bug::new
{
        bless { id => $_[1], type => $_[2], descr => $_[3] }, $_[0];
}
This version has exactly the same effects: slot the data into an anonymous hash, bless the hash into the class specified by the first argument, and return a reference to the hash.

Regardless of which version you use, now whenever you want to create a new Bug object, you can just call:

$nextbug = Bug::new("Bug", $id, $type, $description);
That's a little redundant, since you have to type "Bug" twice. Fortunately, there's another feature of the "arrow" method-call syntax that solves this problem. If the operand to the left of the arrow is the name of a class (rather than an object reference), then the appropriate method of that class is called. More importantly, if the arrow notation is used, the first argument passed to the method is a string containing the class name. That means that you could rewrite the previous call to Bug::new like this:
$nextbug = Bug->new($id, $type, $description);
There are other benefits to this notation when your class uses inheritance, so you should always call constructors and other class methods this way.

Method enacting

Apart from encapsulating the gory details of object creation within the class itself, using a class method like this to create objects has another big advantage. If you abide by the convention of only ever creating new Bug objects by calling Bug::new , you're guaranteed that all such objects will always be hashes. Of course, there's nothing to prevent us from "manually" blessing arrays, or scalars as Bug objects, but it turns out to make life much easier if you stick to blessing one type of object into each class.

For example, if you can be confident that any Bug object is going to be a blessed hash, you can (finally!) fill in the missing code in the Bug::print_me method:

package Bug;
sub print_me
{
    my ($self) = @_;
    print "ID: $self->{id}\n";
    print "$self->{descr}\n";
    print "(Note: problem is fatal)\n" if $self->{type} eq "fatal";
}
Now, whenever the print_me method is called via a reference to any hash that's been blessed into the Bug class, the $self variable extracts the reference that was passed as the first argument and then the print statements access the various entries of the blessed hash.

Till death us do part...

Objects sometimes require special attention at the other end of their lifespan too. Most object-oriented languages provide the ability to specify a subroutine that is called automatically when an object ceases to exist. Such subroutines are usually called destructors , and are used to undo any side-effects caused by the previous existence of an object. That may include releasing memory, closing file or directory handles, closing database connections, or breaking links to other objects.

In Perl, you can set up a destructor for a class by defining a subroutine named DESTROY in the class's package. Any such subroutine is automatically called on an object of that class, just before that object's memory is reclaimed. Typically, this happens when the last variable holding a reference to the object goes out of scope, or has another value assigned to it.

For example, you could provide a destructor for the Bug class like this:

package Bug;
# other stuff as before

sub DESTROY
{
        my ($self) = @_;
        print "<< Squashed the bug: $self->{id} >>\n\n";
}

Now, every time an object of class Bug is about to cease to exist, that object will automatically have its DESTROY method called, which will print an epitaph for the object. For example, the following code:
package main;
use Bug;

open BUGDATA, "Bug.dat" or die "Couldn't find Bug data";

while (<BUGDATA>)
{
    my @data = split ',', $_;       # extract comma-separated Bug data
    my $bug = Bug->new(@data);      # create a new Bug object
    $bug->print_me();               # print it out
} 

print "(end of list)\n";
prints out something like this:
ID: HW000761
"Cup holder" broken
Note: problem is fatal
<< Squashed the bug HW000761 >>

ID: SW000214
Word processor trashing disk after 20 saves.
<< Squashed the bug SW000214 >> 

ID: OS000633
Can't change background colour (blue) on blue screen of death.
<< Squashed the bug OS000633 >> 

(end of list)
That's because, at the end of each iteration of the while loop, the lexical variable $bug goes out of scope, taking with it the only reference to the Bug object created earlier in the same loop. That object's reference count immediately becomes zero and, because it was blessed, the corresponding DESTROY method (i.e. Bug::DESTROY ) is automatically called on the object.

Where to from here?

Of course, these fundamental techniques only scratch the surface of object-oriented programming in Perl. Simple hash-based classes with methods, constructors, and destructors may be enough to let you solve real problems in Perl, but there's a vast array of powerful and labor-saving techniques you can add to those basic components: autoloaded methods, class methods and class attributes, inheritance and multiple inheritance, polymorphism, multiple dispatch, enforced encapsulation, operator overloading, tied objects, genericity, and persistence.

Perl's standard documentation includes plenty of good material -- perlref, perlreftut, perlobj, perltoot, perltootc, and perlbot -- to get you started. But if you're looking for a comprehensive tutorial on everything you need to know, you may also like to consider my new book, Object Oriented Perl, from which this article has been adapted.

[Jun 28, 2017] Whats Wrong with sort and How to Fix It by Tom Christiansen

Unicode poses some tricky problems... Perl 5.14 introduced the unicode_strings feature.
Aug 31, 2011 | www.perl.com
By now, you may have read Considerations on Using Unicode Properly in Modern Perl Applications . Still think doing things correctly is easy? Tom Christiansen demonstrates that even sorting can be trickier than you think.

NOTE : The following is an excerpt from the draft manuscript of Programming Perl , 4ᵗʰ edition

Calling sort without a comparison function is quite often the wrong thing to do, even on plain text. That's because if you use a bare sort, you can get really strange results. It's not just Perl either: almost all programming languages work this way, even the shell sort command. You might be surprised to find that with this sort of nonsense sort, B comes before a, not after it, and ff comes after zz. There's no end to such silliness, either; see the default sort tables at the end of this article to see what I mean.

There are situations when a bare sort is appropriate, but fewer than you think. One scenario is when every string you're sorting contains nothing but the 26 lowercase (or uppercase, but not both) Latin letters from a-z, without any whitespace or punctuation.

Another occasion when a simple, unadorned sort is appropriate is when you have no other goal but to iterate in an order that is merely repeatable, even if that order should happen to be completely arbitrary. In other words, yes, it's garbage, but it's the same garbage this time as it was last time. That's because the default sort resorts to an unmediated cmp operator, which has the "predictable garbage" characteristics I just mentioned.

The last situation is much less frequent than the first two. It requires that the things you're sorting be special‐purpose, dedicated binary keys whose bit sequences have with excruciating care been arranged to sort in some prescribed fashion. This is also the strategy for any reasonable use of the cmp operator.

So what's wrong with sort anyway?

I know, I know. I can hear everyone saying, "But it's called sort , so how could that ever be wrong?" Sure it's called sort , but you still have to know how to use it to get useful results out. Probably the most surprising thing about sort is that it does not by default do an alphabetic, an alphanumeric, or a numeric sort. What it actually does is something else altogether, and that something else is of surprisingly limited usefulness.
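A small illustration (my own, not from the article) of what that "something else" is: the default sort compares code points, so every uppercase letter sorts in front of every lowercase one.

my @words = qw(banana Cherry apple Zebra);
print join(" ", sort @words), "\n";
# prints: Cherry Zebra apple banana   -- "Zebra" lands before "apple"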

Imagine you have an array of records. It does you virtually no good to write:

@sorted_recs = sort @recs;

Because Perl's cmp operator does only a bit comparison, not an alphabetic one, it does nearly as little good to write your record sort this way:

@srecs = sort {
    $b->{AGE}      <=>  $a->{AGE}
                   ||
    $a->{SURNAME}  cmp  $b->{SURNAME}
} @recs;

The problem is that that cmp for the record's SURNAME field is not an alphabetic comparison. It's merely a code point comparison. That means it works like C's strcmp function or Java's String.compareTo method. Although commonly referred to as a "lexicographic" comparison, this is a gross misnomer: it's about as far away from the way real lexicographers sort dictionary entries as you can get without flipping a coin.

Fortunately, you don't have to come up with your own algorithm for dictionary sorting, because Perl provides a standard class to do this for you: Unicode::Collate . Don't let the name throw you, because while it was first invented for Unicode, it works great on regular ASCII text, too, and does a better job at making lexicographers happy than a plain old sort ever manages.

If you have code that purports to sort text that looks like this:

@sorted_lines = sort @lines;

Then all you have to do to get a dictionary sort is write this instead:

use Unicode::Collate;
@sorted_lines = Unicode::Collate::->new->sort(@lines);

For structured records, like those with ages and surnames in them, you have to be a bit fancier. One way to fix it would be to use the class's own cmp operator instead of the built‐in one.

use Unicode::Collate;
my $collator = Unicode::Collate::->new();
@srecs = sort {
    $b->{AGE}  <=>  $a->{AGE}
          ||
    $collator->cmp( $a->{SURNAME}, $b->{SURNAME} )
} @recs;

However, that makes a fairly expensive method call for every possible comparison. Because Perl's adaptive merge sort algorithm usually runs in O(n log n) time given n items, and because each comparison requires two different computed keys, that can be a lot of duplicate effort. Our sorting class therefore provides a convenient getSortKey method that calculates a special binary key which you can cache and later pass to the normal cmp operator on your own. This trick lets you use cmp yet get a truly alphabetic sort out of it for a change.

Here is a simple but sufficient example of how to do that:

use Unicode::Collate;
my $collator = Unicode::Collate::->new();

# first calculate the magic sort key for each text field, and cache it
for my $rec (@recs) {
    $rec->{SURNAME_key} = $collator->getSortKey( $rec->{SURNAME} );
} 

# now sort the records as before, but for the surname field,
# use the cached sort key instead
@srecs = sort {
    $b->{AGE}          <=>  $a->{AGE}
                      ||
    $a->{SURNAME_key}  cmp  $b->{SURNAME_key}
} @recs;

That's what I meant about very carefully preparing a mediated sort key that contains the precomputed binary key.

English Card Catalogue Sorts

The simple code just demonstrated assumes you want to sort names the same way you do regular text. That isn't a good assumption, however. Many countries, languages, institutions, and sometimes even librarians have their own notions about how a card catalogue or a phonebook ought to be sorted.

For example, in the English language, surnames with Scottish patronymics starting with Mc or Mac, like MacKinley and McKinley , not only count as completely identical synonyms for sorting purposes, they go before any other surname that begins with M, and so precede surnames like Mables or Machado .

Yes, really.

That means that the following names are sorted correctly -- for English:

Lewis, C.S.
McKinley, Bill
MacKinley, Ron
Mables, Martha
Machado, José
Macon, Bacon

Yes, it's true. Check out your local large English‐language bookseller or library -- presuming you can find one. If you do, best make sure to blow the dust off first.
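One hedged way to teach Unicode::Collate at least the synonym half of that rule is its preprocess hook (the rewrite below is my own sketch, not code from the article); making the Mc/Mac names also jump ahead of Mables and Machado would need a more elaborate rewrite.

use Unicode::Collate;

my $collator = Unicode::Collate::->new(
    preprocess => sub {
        my $name = shift;
        $name =~ s/\AMc(?=[A-Z])/Mac/;   # McKinley -> MacKinley
        return $name;
    },
);

# McKinley and MacKinley now compare as identical synonyms
print $collator->eq("McKinley, Bill", "MacKinley, Bill")
    ? "same\n" : "different\n";          # prints "same"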

Sorting Spanish Names

It's a good thing those names follow English rules for sorting names. If this were Spanish, we would have to deal with double‐barrelled surnames, where the patronym sorts before the matronym, which in turn sorts before any given names. That means that if Señor Machado's full name were, like the poet's, Antonio Cipriano José María y Francisco de Santa Ana Machado y Ruiz, then you would have to sort him with the other Machados but then consider Ruiz before Antonio if there were any other Machados. Similarly, the poet Federico del Sagrado Corazón de Jesús García Lorca sorts before the writer Gabriel José de la Concordia García Márquez.

On the other hand, if your records are not full multifield hashes but only simple text strings that don't happen to be surnames, your task is a lot simpler, since now all you have to do is get the cmp operator to behave sensibly. That you can do easily enough this way:

use Unicode::Collate;
@sorted_text = Unicode::Collate::->new->sort(@text);

Sorting Text, Not Binary

Imagine you had this list of German‐language authors:

@germans = qw{
    Böll
    Born
    Böhme
    Bodmer
    Brandis
    Böttcher
    Borchert
    Brant
    Bobrowski
};

If you just sorted them with an unmediated sort operator, you would get this utter nonsense:

Bobrowski
Bodmer
Borchert
Born
Brandis
Brant
Böhme
Böll
Böttcher

Or maybe this equally nonsensical answer:

Bobrowski
Bodmer
Borchert
Born
Böll
Brandis
Brant
Böhme
Böttcher

Or even this still completely nonsensical answer:

Bobrowski
Bodmer
Borchert
Born
Böhme
Böll
Brandis
Brant
Böttcher

The crucial point to all that is that it's text not binary, so not only can you never judge what its bit patterns hold just by eyeballing it, more importantly, it has special rules to make it sort alphabetically (some might say sanely), an ordering no naïve code‐point sort will ever come close to getting right, especially on Unicode.

The correct ordering is:

Bobrowski
Bodmer
Böhme
Böll
Borchert
Born
Böttcher
Brandis
Brant

And that is precisely what

use Unicode::Collate;
@sorted_germans = Unicode::Collate::->new->sort(@german_names);

gives you: a correctly sorted list of those Germans' names.

Sorting German Names

Hold on, though.

Correct in what language? In English, yes, the order given is now correct. But considering that these authors wrote in the German language, it is quite conceivable that you should be following the rules for ordering German names in German , not in English. That produces this ordering:

Bobrowski
Bodmer
Böhme
Böll
Böttcher
Borchert
Born
Brandis
Brant

How come Böttcher now comes before Borchert? Because Böttcher is supposed to be the same as Boettcher. In a German phonebook or other German list of German names, things like ö and oe are considered synonyms, which is not at all how it works in English. To get the German phonebook sort, you merely have to modify your constructor this way:

use Unicode::Collate::Locale;
@sorted_germans = Unicode::Collate::Locale::
                      ->new(locale => "de_phonebook")
                      ->sort(@german_names);

Isn't this fun?

Be glad you're not sorting names. Sorting names is hard.

Default Sort Tables

Here are most of the Latin letters, ordered using the default sort :

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j 
k l m n o p q r s t u v w x y z                     
                                    
        Ā ā Ă ă Ą ą Ć ć Ĉ ĉ Ċ ċ Č č Ď ď Đ đ Ē ē Ĕ ĕ Ė ė Ę ę Ě ě 
Ĝ ĝ Ğ ğ Ġ ġ Ģ ģ Ĥ ĥ Ħ ħ Ĩ ĩ Ī ī Ĭ ĭ Į į İ ı IJ ij Ĵ ĵ Ķ ķ ĸ Ĺ ĺ Ļ ļ Ľ ľ Ŀ 
ŀ Ł ł Ń ń Ņ ņ Ň ň Ŋ ŋ Ō ō Ŏ ŏ Ő ő   Ŕ ŕ Ŗ ŗ Ř ř Ś ś Ŝ ŝ Ş ş   Ţ ţ Ť 
ť Ŧ ŧ Ũ ũ Ū ū Ŭ ŭ Ů ů Ű ű Ų ų Ŵ ŵ Ŷ ŷ  Ź ź Ż ż   ſ ƀ Ɓ Ƃ ƃ Ƈ ƈ Ɖ Ɗ Ƌ 
ƌ ƍ Ǝ Ə Ɛ Ƒ  Ɠ Ɣ ƕ Ɩ Ɨ Ƙ ƙ ƚ ƛ Ɯ Ɲ ƞ Ƥ ƥ Ʀ ƫ Ƭ ƭ Ʈ Ư ư Ʊ Ʋ Ƴ ƴ Ƶ ƶ Ʒ Ƹ 
ƹ ƺ ƾ ƿ DŽ Dž dž LJ Lj lj NJ Nj nj Ǎ ǎ Ǐ ǐ Ǒ ǒ Ǔ ǔ Ǖ ǖ Ǘ ǘ Ǚ ǚ Ǜ ǜ ǝ Ǟ ǟ Ǡ ǡ Ǣ ǣ 
Ǥ ǥ Ǧ ǧ Ǩ ǩ Ǫ ǫ Ǭ ǭ Ǯ ǯ ǰ DZ Dz dz Ǵ ǵ Ƿ Ǹ ǹ Ǻ ǻ Ǽ ǽ Ǿ ǿ Ȁ ȁ Ȃ ȃ Ȅ ȅ Ȇ ȇ Ȉ 
ȉ Ȋ ȋ Ȍ ȍ Ȏ ȏ Ȑ ȑ Ȓ ȓ Ȕ ȕ Ȗ ȗ Ș ș Ț ț Ȝ ȝ Ȟ ȟ Ƞ ȡ Ȥ ȥ Ȧ ȧ Ȩ ȩ Ȫ ȫ Ȭ ȭ Ȯ 
ȯ Ȱ ȱ Ȳ ȳ ȴ ȵ ȶ ȷ Ⱥ Ȼ ȼ Ƚ Ⱦ ɐ ɑ ɒ ɓ ɕ ɖ ɗ ɘ ə ɚ ɛ ɜ ɝ ɞ ɟ ɠ ɡ ɢ ɣ ɤ ɥ ɦ 
ɧ ɨ ɩ ɪ ɫ ɬ ɭ ɮ ɯ ɰ ɱ ɲ ɳ ɴ ɶ ɹ ɺ ɻ ɼ ɽ ɾ ɿ ʀ ʁ ʂ ʃ ʄ ʅ ʆ ʇ ʈ ʉ ʊ ʋ ʌ ʍ 
ʎ ʏ ʐ ʑ ʒ ʓ ʙ ʚ ʛ ʜ ʝ ʞ ʟ ʠ ʣ ʤ ʥ ʦ ʧ ʨ ʩ ʪ ʫ ˡ ˢ ˣ ᴀ ᴁ ᴂ ᴃ ᴄ ᴅ ᴆ ᴇ ᴈ ᴉ 
ᴊ ᴋ ᴌ ᴍ ᴎ ᴏ ᴑ ᴓ ᴔ ᴘ ᴙ ᴚ ᴛ ᴜ ᴝ ᴞ ᴟ ᴠ ᴡ ᴢ ᴣ ᴬ ᴭ ᴮ ᴯ ᴰ ᴱ ᴲ ᴳ ᴴ ᴵ ᴶ ᴷ ᴸ ᴹ ᴺ 
ᴻ ᴼ ᴾ ᴿ ᵀ ᵁ ᵂ ᵃ ᵄ ᵅ ᵆ ᵇ ᵈ ᵉ ᵊ ᵋ ᵌ ᵍ ᵎ ᵏ ᵐ ᵑ ᵒ ᵖ ᵗ ᵘ ᵙ ᵚ ᵛ ᵢ ᵣ ᵤ ᵥ ᵫ ᵬ ᵭ 
ᵮ ᵯ ᵰ ᵱ ᵲ ᵳ ᵴ ᵵ ᵶ Ḁ ḁ Ḃ ḃ Ḅ ḅ Ḇ ḇ Ḉ ḉ Ḋ ḋ Ḍ ḍ Ḏ ḏ Ḑ ḑ Ḓ ḓ Ḕ ḕ Ḗ ḗ Ḙ ḙ Ḛ 
ḛ Ḝ ḝ Ḟ ḟ Ḡ ḡ Ḣ ḣ Ḥ ḥ Ḧ ḧ Ḩ ḩ Ḫ ḫ Ḭ ḭ Ḯ ḯ Ḱ ḱ Ḳ ḳ Ḵ ḵ Ḷ ḷ Ḹ ḹ Ḻ ḻ Ḽ ḽ Ḿ 
ḿ Ṁ ṁ Ṃ ṃ Ṅ ṅ Ṇ ṇ Ṉ ṉ Ṋ ṋ Ṍ ṍ Ṏ ṏ Ṑ ṑ Ṓ ṓ Ṕ ṕ Ṗ ṗ Ṙ ṙ Ṛ ṛ Ṝ ṝ Ṟ ṟ Ṡ ṡ Ṣ 
ṣ Ṥ ṥ Ṧ ṧ Ṩ ṩ Ṫ ṫ Ṭ ṭ Ṯ ṯ Ṱ ṱ Ṳ ṳ Ṵ ṵ Ṷ ṷ Ṹ ṹ Ṻ ṻ Ṽ ṽ Ṿ ṿ Ẁ ẁ Ẃ ẃ Ẅ ẅ Ẇ 
ẇ Ẉ ẉ Ẋ ẋ Ẍ ẍ Ẏ ẏ Ẑ ẑ Ẓ ẓ Ẕ ẕ ẖ ẗ ẘ ẙ ẚ ẛ ẞ ẟ Ạ ạ Ả ả Ấ ấ Ầ ầ Ẩ ẩ Ẫ ẫ Ậ 
ậ Ắ ắ Ằ ằ Ẳ ẳ Ẵ ẵ Ặ ặ Ẹ ẹ Ẻ ẻ Ẽ ẽ Ế ế Ề ề Ể ể Ễ ễ Ệ ệ Ỉ ỉ Ị ị Ọ ọ Ỏ ỏ Ố 
ố Ồ ồ Ổ ổ Ỗ ỗ Ộ ộ Ớ ớ Ờ ờ Ở ở Ỡ ỡ Ợ ợ Ụ ụ Ủ ủ Ứ ứ Ừ ừ Ử ử Ữ ữ Ự ự Ỳ ỳ Ỵ 
ỵ Ỷ ỷ Ỹ ỹ K Å Ⅎ ⅎ Ⅰ Ⅱ Ⅲ Ⅳ Ⅴ Ⅵ Ⅶ Ⅷ Ⅸ Ⅹ Ⅺ Ⅻ Ⅼ Ⅽ Ⅾ Ⅿ ⅰ ⅱ ⅲ ⅳ ⅴ 
ⅵ ⅶ ⅷ ⅸ ⅹ ⅺ ⅻ ⅼ ⅽ ⅾ ⅿ ff fi fl ffi ffl ſt st A B C D E F G H I
J K L M N O P Q R S T U V W X Y Z a b c d e f g h i
j k l m n o p q r s t u v w x y z

As you can see, those letters are scattered all over the place. Sure, it's not completely random, but it's not useful either, because it is full of arbitrary placement that makes no alphabetical sense. That's because it is not an alphabetic sort at all. However, with the special kind of sort I've just shown you above, the ones that call the sort method from the Unicode::Collate class, you do get an alphabetic sort. Using that method, the Latin letters I just showed you now come out in alphabetical order, which is like this:

a a A A  ᵃ ᴬ     ă Ă ắ Ắ ằ Ằ ẵ Ẵ ẳ Ẳ   ấ Ấ ầ Ầ ẫ Ẫ ẩ Ẩ ǎ Ǎ   
Å ǻ Ǻ   ǟ Ǟ   ȧ Ȧ ǡ Ǡ ą Ą ā Ā ả Ả ȁ Ȁ ȃ Ȃ ạ Ạ ặ Ặ ậ Ậ ḁ Ḁ   ᴭ ǽ Ǽ 
ǣ Ǣ ẚ ᴀ Ⱥ ᴁ ᴂ ᵆ ɐ ᵄ ɑ ᵅ ɒ b b B B ᵇ ᴮ ḃ Ḃ ḅ Ḅ ḇ Ḇ ʙ ƀ ᴯ ᴃ ᵬ ɓ Ɓ ƃ Ƃ c 
c ⅽ C C Ⅽ ć Ć ĉ Ĉ č Č ċ Ċ   ḉ Ḉ ᴄ ȼ Ȼ ƈ Ƈ ɕ d d ⅾ D D Ⅾ ᵈ ᴰ ď Ď ḋ 
Ḋ ḑ Ḑ ḍ Ḍ ḓ Ḓ ḏ Ḏ đ Đ   dz ʣ Dz DZ dž Dž DŽ ʥ ʤ ᴅ ᴆ ᵭ ɖ Ɖ ɗ Ɗ ƌ Ƌ ȡ ẟ e e E 
E ᵉ ᴱ     ĕ Ĕ   ế Ế ề Ề ễ Ễ ể Ể ě Ě   ẽ Ẽ ė Ė ȩ Ȩ ḝ Ḝ ę Ę ē Ē ḗ 
Ḗ ḕ Ḕ ẻ Ẻ ȅ Ȅ ȇ Ȇ ẹ Ẹ ệ Ệ ḙ Ḙ ḛ Ḛ ᴇ ǝ Ǝ ᴲ ə Ə ᵊ ɛ Ɛ ᵋ ɘ ɚ ɜ ᴈ ᵌ ɝ ɞ ʚ ɤ 
f f F F ḟ Ḟ ff ffi ffl fi fl ʩ ᵮ  Ƒ ⅎ Ⅎ g g G G ᵍ ᴳ ǵ Ǵ ğ Ğ ĝ Ĝ ǧ Ǧ ġ Ġ ģ 
Ģ ḡ Ḡ ɡ ɢ ǥ Ǥ ɠ Ɠ ʛ ɣ Ɣ h h H H ᴴ ĥ Ĥ ȟ Ȟ ḧ Ḧ ḣ Ḣ ḩ Ḩ ḥ Ḥ ḫ Ḫ ẖ ħ Ħ ʜ 
ƕ ɦ ɧ i i ⅰ I I Ⅰ ᵢ ᴵ     ĭ Ĭ   ǐ Ǐ   ḯ Ḯ ĩ Ĩ İ į Į ī Ī ỉ Ỉ ȉ 
Ȉ ȋ Ȋ ị Ị ḭ Ḭ ⅱ Ⅱ ⅲ Ⅲ ij IJ ⅳ Ⅳ ⅸ Ⅸ ı ɪ ᴉ ᵎ ɨ Ɨ ɩ Ɩ j j J J ᴶ ĵ Ĵ ǰ ȷ ᴊ 
ʝ ɟ ʄ k k K K K ᵏ ᴷ ḱ Ḱ ǩ Ǩ ķ Ķ ḳ Ḳ ḵ Ḵ ᴋ ƙ Ƙ ʞ l l ⅼ L L Ⅼ ˡ ᴸ ĺ Ĺ 
ľ Ľ ļ Ļ ḷ Ḷ ḹ Ḹ ḽ Ḽ ḻ Ḻ ł Ł ŀ Ŀ lj Lj LJ ʪ ʫ ʟ ᴌ ƚ Ƚ ɫ ɬ ɭ ȴ ɮ ƛ ʎ m m ⅿ M 
M Ⅿ ᵐ ᴹ ḿ Ḿ ṁ Ṁ ṃ Ṃ ᴍ ᵯ ɱ n n N N ᴺ ń Ń ǹ Ǹ ň Ň   ṅ Ṅ ņ Ņ ṇ Ṇ ṋ Ṋ ṉ 
Ṉ nj Nj NJ ɴ ᴻ ᴎ ᵰ ɲ Ɲ ƞ Ƞ ɳ ȵ ŋ Ŋ ᵑ o o O O  ᵒ ᴼ     ŏ Ŏ   ố Ố ồ 
Ồ ỗ Ỗ ổ Ổ ǒ Ǒ   ȫ Ȫ ő Ő   ṍ Ṍ ṏ Ṏ ȭ Ȭ ȯ Ȯ ȱ Ȱ   ǿ Ǿ ǫ Ǫ ǭ Ǭ ō Ō ṓ 
Ṓ ṑ Ṑ ỏ Ỏ ȍ Ȍ ȏ Ȏ ớ Ớ ờ Ờ ỡ Ỡ ở Ở ợ Ợ ọ Ọ ộ Ộ   ᴏ ᴑ ɶ ᴔ ᴓ p p P P ᵖ 
ᴾ ṕ Ṕ ṗ Ṗ ᴘ ᵱ ƥ Ƥ q q Q Q ʠ ĸ r r R R ᵣ ᴿ ŕ Ŕ ř Ř ṙ Ṙ ŗ Ŗ ȑ Ȑ ȓ Ȓ ṛ 
Ṛ ṝ Ṝ ṟ Ṟ ʀ Ʀ ᴙ ᵲ ɹ ᴚ ɺ ɻ ɼ ɽ ɾ ᵳ ɿ ʁ s s S S ˢ ś Ś ṥ Ṥ ŝ Ŝ   ṧ Ṧ ṡ 
Ṡ ş Ş ṣ Ṣ ṩ Ṩ ș Ș ſ ẛ  ẞ st ſt ᵴ ʂ ʃ ʅ ʆ t t T T ᵗ ᵀ ť Ť ẗ ṫ Ṫ ţ Ţ ṭ Ṭ 
ț Ț ṱ Ṱ ṯ Ṯ ʨ ƾ ʦ ʧ ᴛ ŧ Ŧ Ⱦ ᵵ ƫ ƭ Ƭ ʈ Ʈ ȶ ʇ u u U U ᵘ ᵤ ᵁ     ŭ Ŭ 
  ǔ Ǔ ů Ů   ǘ Ǘ ǜ Ǜ ǚ Ǚ ǖ Ǖ ű Ű ũ Ũ ṹ Ṹ ų Ų ū Ū ṻ Ṻ ủ Ủ ȕ Ȕ ȗ Ȗ ư Ư 
ứ Ứ ừ Ừ ữ Ữ ử Ử ự Ự ụ Ụ ṳ Ṳ ṷ Ṷ ṵ Ṵ ᴜ ᴝ ᵙ ᴞ ᵫ ʉ ɥ ɯ Ɯ ᵚ ᴟ ɰ ʊ Ʊ v v ⅴ V 
V Ⅴ ᵛ ᵥ ṽ Ṽ ṿ Ṿ ⅵ Ⅵ ⅶ Ⅶ ⅷ Ⅷ ᴠ ʋ Ʋ ʌ w w W W ᵂ ẃ Ẃ ẁ Ẁ ŵ Ŵ ẘ ẅ Ẅ ẇ Ẇ ẉ 
Ẉ ᴡ ʍ x x ⅹ X X Ⅹ ˣ ẍ Ẍ ẋ Ẋ ⅺ Ⅺ ⅻ Ⅻ y y Y Y   ỳ Ỳ ŷ Ŷ ẙ   ỹ Ỹ ẏ 
Ẏ ȳ Ȳ ỷ Ỷ ỵ Ỵ ʏ ƴ Ƴ z z Z Z ź Ź ẑ Ẑ   ż Ż ẓ Ẓ ẕ Ẕ ƍ ᴢ ƶ Ƶ ᵶ ȥ Ȥ ʐ ʑ 
ʒ Ʒ ǯ Ǯ ᴣ ƹ Ƹ ƺ ʓ ȝ Ȝ   ƿ Ƿ

Isn't that much nicer?

Romani Ite Domum

In case you're wondering what that last row of distinctly un‐Roman Latin letters might possibly be, they're called, respectively, ezh ʒ, yogh ȝ, thorn þ, and wynn ƿ. They had to go somewhere, so they ended up getting stuck after z.

Some are still used in certain non‐English (but still Latin) alphabets today, such as Icelandic, and even though you probably won't bump into them in contemporary English texts, you might see some if you're reading the original texts of famous medieval English poems like Beowulf , Sir Gawain and the Green Knight , or Brut .

The last of those, Brut , was written by a fellow named Laȝamon , a name whose third letter is a yogh. Famous though he was, I wouldn't suggest changing your name to Laȝamon in his honor, as I doubt the phone company would be amused.

[Jun 18, 2017] Making Perl Reusable with Modules

Jun 18, 2017 | www.perl.com
By Andy Sylvester on August 7, 2007 12:00 AM
Perl software development can occur at several levels. When first developing the idea for an application, a Perl developer may start with a short program to flesh out the necessary algorithms. After that, the next step might be to create a package to support object-oriented development. The final work is often to create a Perl module for the package to make the logic available to all parts of the application. Andy Sylvester explores this topic with a simple mathematical function. Creating a Perl Subroutine

I am working on ideas for implementing some mathematical concepts for a method of composing music. The ideas come from the work of Joseph Schillinger . At the heart of the method is being able to generate patterns using mathematical operations and using those patterns in music composition. One of the basic operations described by Schillinger is creating a "resultant," or series of numbers, based on two integers (or "generators"). Figure 1 shows a diagram of how to create the resultant of the integers 5 and 3.

creating the resultant of 5 and 3
Figure 1. Creating the resultant of 5 and 3

Figure 1 shows two line patterns with units of 5 and units of 3. The lines continue until both lines come down (or "close") at the same time. The length of each line corresponds to the product of the two generators (5 x 3 = 15). If you draw dotted lines down from where each of the two generator lines change state, you can create a third line that changes state at each of the dotted line points. The lengths of the segments of the third line make up the resultant of the integers 5 and 3 (3, 2, 1, 3, 1, 2, 3).

Schillinger used graph paper to create resultants in his System of Musical Composition . However, another convenient way of creating a resultant is to calculate the modulus of a counter and then calculate a term in the resultant series based on the state of the counter. An algorithm to create the terms in a resultant might resemble:

Read generators from command line
Determine total number of counts for resultant
   (major_generator * minor_generator)
Initialize resultant counter = 0
For MyCounts from 1 to the total number of counts
   Get the modulus of MyCounts to the major and minor generators
   Increment the resultant counter
   If either modulus = 0
     Save the resultant counter to the resultant array
     Re-initialize resultant counter = 0
   End if
End for

From this design, I wrote a short program using the Perl modulus operator ( % ):

#!/usr/bin/perl
#*******************************************************
#
# FILENAME: result01.pl
#
# USAGE: perl result01.pl major_generator minor_generator
#
# DESCRIPTION:
#    This Perl script will generate a Schillinger resultant
#    based on two integers for the major generator and minor
#    generator.
#
#    In normal usage, the user will input the two integers
#    via the command line. The sequence of numbers representing
#    the resultant will be sent to standard output (the console
#    window).
#
# INPUTS:
#    major_generator - First generator for the resultant, input
#                      as the first calling argument on the
#                      command line.
#
#    minor_generator - Second generator for the resultant, input
#                      as the second calling argument on the
#                      command line.
#
# OUTPUTS:
#    resultant - Sequence of numbers written to the console window
#
#**************************************************************

   use strict;
   use warnings;

   my $major_generator = $ARGV[0];
   my $minor_generator = $ARGV[1];

   my $total_counts   = $major_generator * $minor_generator;
   my $result_counter = 0;
   my $major_mod      = 0;
   my $minor_mod      = 0;
   my $i              = 0;
   my $j              = 0;
   my @resultant;

   print "Generator Total = $total_counts\n";

   while ($i < $total_counts) {
       $i++;
       $result_counter++;
       $major_mod = $i % $major_generator;
       $minor_mod = $i % $minor_generator;
       if (($major_mod == 0) || ($minor_mod == 0)) {
          push(@resultant, $result_counter);
          $result_counter = 0;
       }
       print "$i \n";
       print "Modulus of $major_generator is $major_mod \n";
       print "Modulus of $minor_generator is $minor_mod \n";
   }

   print "\n";
   print "The resultant is @resultant \n";

Run the program with 5 and 3 as the inputs ( perl result01.pl 5 3 ):

Generator Total = 15
1
Modulus of 5 is 1
Modulus of 3 is 1
2
Modulus of 5 is 2
Modulus of 3 is 2
3
Modulus of 5 is 3
Modulus of 3 is 0
4
Modulus of 5 is 4
Modulus of 3 is 1
5
Modulus of 5 is 0
Modulus of 3 is 2
6
Modulus of 5 is 1
Modulus of 3 is 0
7
Modulus of 5 is 2
Modulus of 3 is 1
8
Modulus of 5 is 3
Modulus of 3 is 2
9
Modulus of 5 is 4
Modulus of 3 is 0
10
Modulus of 5 is 0
Modulus of 3 is 1
11
Modulus of 5 is 1
Modulus of 3 is 2
12
Modulus of 5 is 2
Modulus of 3 is 0
13
Modulus of 5 is 3
Modulus of 3 is 1
14
Modulus of 5 is 4
Modulus of 3 is 2
15
Modulus of 5 is 0
Modulus of 3 is 0

The resultant is 3 2 1 3 1 2 3

This result matches the resultant terms as shown in the graph in Figure 1, so it looks like the program generates the correct output.

Creating a Perl Package from a Program

With a working program, you can create a Perl package as a step toward being able to reuse code in a larger application. The initial program has two pieces of input data (the major generator and the minor generator). The single output is the list of numbers that make up the resultant. These three pieces of data could be combined in an object. The program could easily become a subroutine to generate the terms in the resultant. This could be a method in the class contained in the package. Creating a class implies adding a constructor method to create a new object. Finally, there should be some methods to get the major generator and minor generator from the object to use in generating the resultant (see the perlboot and perltoot tutorials for background on object-oriented programming in Perl).

From these requirements, the resulting package might be:

#!/usr/bin/perl
#*******************************************************
#
# Filename: result01a.pl
#
# Description:
#    This Perl script creates a class for a Schillinger resultant
#    based on two integers for the major generator and the
#    minor generator.
#
# Class Name: Resultant
#
# Synopsis:
#
# use Resultant;
#
# Class Methods:
#
#   $seq1 = Resultant ->new(5, 3)
#
#      Creates a new object with a major generator of 5 and
#      a minor generator of 3. These parameters need to be
#      initialized when a new object is created, as there
#      are no methods to set these elements within the object.
#
#   $seq1->generate()
#
#      Generates a resultant and saves it in the ResultList array
#
# Object Data Methods:
#
#   $major_generator = $seq1->get_major()
#
#      Returns the major generator
#
#   $minor_generator = $seq1->get_minor()
#
#      Returns the minor generator
#
#
#**************************************************************

{ package Resultant;
  use strict;
  sub new {
    my $class           = shift;
    my $major_generator = shift;
    my $minor_generator = shift;

    my $self = {Major => $major_generator,
                Minor => $minor_generator,
                ResultList => []};

    bless $self, $class;
    return $self;
  }

  sub get_major {
    my $self = shift;
    return $self->{Major};
  }

  sub get_minor {
    my $self = shift;
    return $self->{Minor};
  }

  sub generate {
    my $self         = shift;
    my $total_counts = $self->get_major * $self->get_minor;
    my $i            = 0;
    my $major_mod;
    my $minor_mod;
    my @result;
    my $result_counter = 0;

   while ($i < $total_counts) {
       $i++;
       $result_counter++;
       $major_mod = $i % $self->get_major;
       $minor_mod = $i % $self->get_minor;

       if (($major_mod == 0) || ($minor_mod == 0)) {
          push(@result, $result_counter);
          $result_counter = 0;
       }
   }

   @{$self->{ResultList}} = @result;
  }
}

#
# Test code to check out class methods
#

# Counter declaration
my $j;

# Create new object and initialize major and minor generators
my $seq1 = Resultant->new(5, 3);

# Print major and minor generators
print "The major generator is ", $seq1->get_major(), "\n";
print "The minor generator is ", $seq1->get_minor(), "\n";

# Generate a resultant
$seq1->generate();

# Print the resultant
print "The resultant is ";
foreach $j (@{$seq1->{ResultList}}) {
  print "$j ";
}
print "\n";

Execute the file ( perl result01a.pl ):

The major generator is 5
The minor generator is 3
The resultant is 3 2 1 3 1 2 3

This output text shows the same resultant terms as produced by the first program.

Creating a Perl Module

From a package, you can create a Perl module to make the package fully reusable in an application. Also, you can modify our original test code into a series of module tests to show that the module works the same as the standalone package and the original program.

I like to use the Perl module Module::Starter to create a skeleton module for the package code. To start, install the Module::Starter module and its associated modules from CPAN, using the Perl Package Manager, or some other package manager. To see if you already have the Module::Starter module installed, type perldoc Module::Starter in a terminal window. If the man page does not appear, you probably do not have the module installed.

Select a working directory to create the module directory. This can be the same directory that you have been using to develop your Perl program. Type the following command (though with your own name and email address):

$ module-starter --module=Music::Resultant --author="John Doe" \
      --email=john@johndoe.com

Perl should respond with:

Created starter directories and files

In the working directory, you should see a folder or directory called Music-Resultant . Change your current directory to Music-Resultant , then type the commands:

$ perl Makefile.PL
$ make

These commands will create the full directory structure for the module. Now paste the text from the package into the module template at Music-Resultant/lib/Music/Resultant.pm . Open Resultant.pm in a text editor and paste the subroutines from the package after the lines:

=head1 FUNCTIONS

=head2 function1

=cut

When you paste the package source code, remove the opening brace from the package, so that the first lines appear as:

 package Resultant;
  sub new {
    use strict;
    my $class = shift;

and the last lines of the source appear, without the final closing brace, as:

   @{$self->{ResultList}} = @result;
  }

After making the above changes, save Resultant.pm . This is all that you need to do to create a module for your own use. If you eventually release your module to the Perl community or upload it to CPAN , you should do some more work to prepare the module and its documentation (see the perlmod and perlmodlib documentation for more information).

After modifying Resultant.pm , you need to install the module to make it available for other Perl applications. To avoid configuration issues, install the module in your home directory, separate from your main Perl installation.

  1. In your home directory, create a lib/ directory, then create a perl/ directory within the lib/ directory. The result should resemble:

    /home/myname/lib/perl

  2. Go to your module directory ( Music-Resultant ) and re-run the build process with a directory path to tell Perl where to install the module:

    $ perl Makefile.PL LIB=/home/myname/lib/perl
    $ make install

    Once this is complete, the module will be installed in the directory.
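
Once the module is installed under your home directory, a script (hypothetical, not from the article) only needs a use lib line pointing at the same path you passed to Makefile.PL:

#!/usr/bin/perl
use strict;
use warnings;
use lib '/home/myname/lib/perl';   # the LIB path used above
use Music::Resultant;

my $seq = Resultant->new(5, 3);    # the package inside the module is Resultant
$seq->generate();
print "The resultant is @{ $seq->{ResultList} }\n";   # prints: 3 2 1 3 1 2 3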

The final step in module development is to add tests to the .t file templates created in the module directory. The Perl distribution includes several built-in test modules, such as Test::Simple and Test::More to help test Perl subroutines and modules.

To test the module, open the file Music-Resultant/t/00-load.t . The initial text in this file is:

#!perl -T

use Test::More tests => 1;

BEGIN {
    use_ok( 'Music::Resultant' );
}

diag( "Testing Music::Resultant $Music::Resultant::VERSION, Perl $], $^X" );

You can run this test file from the t/ directory using the command:

perl -I/home/myname/lib/perl -T 00-load.t

The -I switch tells the Perl interpreter to look for the module Resultant.pm in your alternate installation directory. The directory path must immediately follow the -I switch, or Perl may not search your alternate directory for your module. The -T switch is necessary because there is a -T switch in the first line of the test script, which turns on taint checking. (Taint checking only works when enabled at Perl startup; perl will exit with an error if you try to enable it later.) Your results should resemble the following (your Perl version may be different).

1..1
ok 1 - use Music::Resultant;
# Testing Music::Resultant 0.01, Perl 5.008006, perl

The test code from the second listing is easy to convert to the format used by Test::More . Change the number at the end of the tests line from 1 to 4, as you will be adding three more tests to this file. The template file has an initial test to show that the module exists. Next, add tests after the BEGIN block in the file:

# Test 2:
my $seq1 = Resultant->new(5, 3);  # create an object
isa_ok ($seq1, Resultant);        # check object definition

# Test 3: check major generator
my $local_major_generator = $seq1->get_major();
is ($local_major_generator, 5, 'major generator is correct' );

# Test 4: check minor generator
my $local_minor_generator = $seq1->get_minor();
is ($local_minor_generator, 3, 'minor generator is correct' );

To run the tests, retype the earlier command line in the Music-Resultant/ directory:

$ perl -I/home/myname/lib/perl -T t/00-load.t

You should see the results:

1..4
ok 1 - use Music::Resultant;
ok 2 - The object isa Resultant
ok 3 - major generator is correct
ok 4 - minor generator is correct
# Testing Music::Resultant 0.01, Perl 5.008006, perl

These tests create a Resultant object with a major generator of 5 and a minor generator of 3 (Test 2), and check to see that the major generator in the object is correct (Test 3), and that the minor generator is correct (Test 4). They do not cover the resultant terms. One way to check the resultant is to add the test code used in the second listing to the .t file:

# Generate a resultant
$seq1->generate();

# Print the resultant
my $j;
print "The resultant is ";
foreach $j (@{$seq1->{ResultList}}) {
  print "$j ";
}
print "\n";

You should get the following results:

1..4
ok 1 - use Music::Resultant;
ok 2 - The object isa Resultant
ok 3 - major generator is correct
ok 4 - minor generator is correct
The resultant is 3 2 1 3 1 2 3
# Testing Music::Resultant 0.01, Perl 5.008006, perl

That's not valid test output, so it needs a little bit of manipulation. To check the elements of a list using a testing function, install the Test::Differences module and its associated modules from CPAN, using the Perl Package Manager, or some other package manager. To see if you already have the Test::Differences module installed, type perldoc Test::Differences in a terminal window. If the man page does not appear, you probably do not have the module installed.

Once that module is part of your Perl installation, change the number of tests from 4 to 5 on the Test::More statement line and add the following statement after the use Test::More statement:

use Test::Differences;

Finally, replace the code that prints the resultant with:

# Test 5: (uses Test::Differences and associated modules)
$seq1->generate();
my @result   = @{$seq1->{ResultList}};
my @expected = (3, 2, 1, 3, 1, 2, 3);
eq_or_diff \@result, \@expected, "resultant terms are correct";

Now when the test file runs, you can confirm that the resultant is correct:

1..5
ok 1 - use Music::Resultant;
ok 2 - The object isa Resultant
ok 3 - major generator is correct
ok 4 - minor generator is correct
ok 5 - resultant terms are correct
# Testing Music::Resultant 0.01, Perl 5.008006, perl

Summary

There are multiple levels of Perl software development. Once you start to create modules to enable reuse of your Perl code, you will be able to leverage your effort into larger applications. By using Perl testing modules, you can ensure that your code works the way you expect and provide a way to ensure that the modules continue to work as you add more features.

Resources

Here are some other good resources on creating Perl modules:

Here are some good resources for using Perl testing modules like Test::Simple and Test::More :

[May 28, 2017] ELIZA - Wikipedia

Perl CPAN Module Chatbot::Eliza

[May 16, 2017] Perl - regex - Position of first nonmatching character

May 16, 2017 | stackoverflow.com

I think that's exactly what the pos function is for.

NOTE: pos only works if you use the /g flag


my $x = 'abcdefghijklmnopqrstuvwxyz';
my $end = 0;

if ( $x =~ /$ARGV[0]/g )
{
    $end = pos($x);
}

print "End of match is: $end\n";

Gives the following output


[@centos5 ~]$ perl x.pl
End of match is: 0
[@centos5 ~]$ perl x.pl def
End of match is: 6
[@centos5 ~]$ perl x.pl xyz
End of match is: 26
[@centos5 ~]$ perl x.pl aaa
End of match is: 0
[@centos5 ~]$ perl x.pl ghi
End of match is: 9

No, it only works when a match was successful. tripleee Oct 10 '11 at 15:24

Sorry, I misread the question. The actual question is very tricky, especially if the regex is more complicated than just /gho/, especially if it contains [ or (. Should I delete my irrelevant answer? Sodved Oct 10 '11 at 15:27

I liked the possibility to see an example of how pos works, as I didn't know about it before - so now I can understand why it also doesn't apply to the question; so thanks for this answer! :) sdaau Jun 8 '12 at 18:26

[May 16, 2017] Perl - positions of regex match in string - Stack Overflow

May 16, 2017 | stackoverflow.com

if ( my @matches = $input_string =~ /$metadata[$_]{"pattern"}/g )
{
    print $-[1] . "\n";    # this gives me error uninitialized ...
}

print scalar @matches; gives me 4, which is OK, but if I use $-[1] to get the start of the first match, it gives me an error. Where is the problem?

EDIT1: How can I get the positions of each match in the string? If I have the string "ahoj ahoj ahoj" and the regexp /ahoj/g, how can I get the positions of the start and end of each "ahoj" in the string?

What error does it give you? user554546 Feb 22 '13 at 20:29
$-[1] is the position of the 1st subpattern (something in parentheses within the regular expression). You're probably looking for $-[0] , the position of the whole pattern? Scott Lamb Feb 22 '13 at 20:32
scott lamb: no, I was thinking that if I have the string "ahoj ahoj ahoj", then I can get positions 0, 5, 10 etc. in $-[n], if the regex is /ahoj/g Krab Feb 22 '13 at 20:34
The array @- contains the offset of the start of the last successful match (in $-[0] ) and the offset of any captures there may have been in that match (in $-[1] , $-[2] etc.).

There are no captures in your string, so only $-[0] is valid, and (in your case) the last successful match is the fourth one, so it will contain the offset of the fourth instance of the pattern.

The way to get the offsets of individual matches is to write


my @matches;
while ( "ahoj ahoj ahoj" =~ /(ahoj)/g )
{
    push @matches, $1;
    print $-[0], "\n";
}

output


0
5
10

Or if you don't want the individual matched strings, then


my @matches;

push @matches, $-[0] while "ahoj ahoj ahoj" =~ /ahoj/g;

[May 07, 2017] Example Code from Beginning Perl for Bioinformatics

While the examples are genome-sequencing specific, most of the code is a good illustration of string processing in Perl and as such has a wider appeal. See also molecularevolution.org
May 07, 2017 | uwf.edu

This page contains an uncompressed copy of example code from your course text, downloaded on January 15, 2003. Please see the official Beginning Perl for Bioinformatics Website under the heading " Examples and Exercises " for any updates to this code.


General files
Chapter 4 NOTE: Examples 4-5 to 4-7 also require the protein sequence data file: NM_021964fragment.pep.txt - To match the example in your book, save the file out with the name: NM_021964fragment.pep

Chapter 5

NOTE: Example 5-3 also requires the protein sequence data file: NM_021964fragment.pep.txt - To match the example in your book, save the file out with the name: NM_021964fragment.pep
NOTE: Example 5-4, 5-6 and 5-7 also require the DNA file: small.dna.txt - To match the example in your book, save the file out with the name: small.dna

Chapter 6

NOTE: BeginPerlBioinfo.pm may be needed to execute some code examples from this chapter. Place this file in the same directory as your .pl files.

Chapter 7

NOTE: BeginPerlBioinfo.pm may be needed to execute some code examples from this chapter. Place this file in the same directory as your .pl files.

Chapter 8

NOTE: BeginPerlBioinfo.pm may be needed to execute some code examples from this chapter. Place this file in the same directory as your .pl files.

NOTE: Example 8-2,8-3 and 8-4 also require the DNA file: sample.dna.txt - To match the example in your book, save the file out with the name: sample.dna

Chapter 9

NOTE: BeginPerlBioinfo.pm may be needed to execute some code examples from this chapter. Place this file in the same directory as your .pl files.

[May 07, 2017] Why is Perl used so extensively in biology research

Jan 15, 2016 | stackoverflow.com

Lincoln Stein highlighted some of the saving graces of Perl for bioinformatics in his article: How Perl Saved the Human Genome Project .

From his analysis:

I think several factors are responsible:
  1. Perl is remarkably good for slicing, dicing, twisting, wringing, smoothing, summarizing and otherwise mangling text. Although the biological sciences do involve a good deal of numeric analysis now, most of the primary data is still text: clone names, annotations, comments, bibliographic references. Even DNA sequences are textlike. Interconverting incompatible data formats is a matter of text mangling combined with some creative guesswork. Perl's powerful regular expression matching and string manipulation operators simplify this job in a way that isn't equalled by any other modern language.
  2. Perl is forgiving. Biological data is often incomplete, fields can be missing, or a field that is expected to be present once occurs several times (because, for example, an experiment was run in duplicate), or the data was entered by hand and doesn't quite fit the expected format. Perl doesn't particularly mind if a value is empty or contains odd characters. Regular expressions can be written to pick up and correct a variety of common errors in data entry. Of course this flexibility can be also be a curse. I talk more about the problems with Perl below.
  3. Perl is component-oriented. Perl encourages people to write their software in small modules, either using Perl library modules or with the classic Unix tool-oriented approach. External programs can easily be incorporated into a Perl script using a pipe, system call or socket. The dynamic loader introduced with Perl5 allows people to extend the Perl language with C routines or to make entire compiled libraries available for the Perl interpreter. An effort is currently under way to gather all the world's collected wisdom about biological data into a set of modules called "bioPerl" (discussed at length in an article to be published later in the Perl Journal).
  4. Perl is easy to write and fast to develop in. The interpreter doesn't require you to declare all your function prototypes and data types in advance, new variables spring into existence as needed, calls to undefined functions only cause an error when the function is needed. The debugger works well with Emacs and allows a comfortable interactive style of development.
  5. Perl is a good prototyping language. Because Perl is quick and dirty, it often makes sense to prototype new algorithms in Perl before moving them to a fast compiled language. Sometimes it turns out that Perl is fast enough so that the algorithm doesn't have to be ported; more frequently one can write a small core of the algorithm in C, compile it as a dynamically loaded module or external executable, and leave the rest of the application in Perl (for an example of a complex genome mapping application implemented in this way, see http://waldo.wi.mit.edu/ftp/distribution/software/rhmapper/ ).
  6. Perl is a good language for Web CGI scripting, and is growing in importance as more labs turn to the Web for publishing their data.

I use lots of Perl for dealing with qualitative and quantitative data in social science research. In terms of getting things done (largely with text) quickly, finding libraries on CPAN (nice central location), and generally just getting things done quickly, it can't be surpassed.

Perl is also excellent glue, so if you have some instrumental records, and you need to glue them to data analysis routines, then Perl is your language. Perl is very powerful when it comes to dealing with text and it's present in almost every Linux/Unix distribution. In bioinformatics, not only are sequence data very easy to manipulate with Perl, but also most of the bioinformatics algorithms will output some kind of text results.
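
As a hedged illustration of that point (the file name and this particular script are mine, not from the thread), here is the kind of minimal FASTA-reading code that makes sequence data easy to handle in Perl:

#!/usr/bin/perl
use strict;
use warnings;

my %seq;       # header => concatenated sequence
my $current;

open my $fh, '<', 'sample.fasta' or die "Can't open sample.fasta: $!";
while (my $line = <$fh>) {
    chomp $line;
    if ($line =~ /^>(\S+)/) {      # header line, e.g. ">NM_021964"
        $current = $1;
    }
    elsif (defined $current) {
        $seq{$current} .= $line;   # sequence lines belong to the last header
    }
}
close $fh;

for my $id (sort keys %seq) {
    printf "%s: %d characters\n", $id, length $seq{$id};
}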

Then, the biggest bioinformatics centers like the EBI had that great guy, Ewan Birney, who was leading the BioPerl project. That library has lots of parsers for every kind of popular bioinformatics algorithms' results, and for manipulating the different sequence formats used in major sequence databases.

Nowadays, however, Perl is not the only language used by bioinformaticians: along with sequence data, labs produce more and more different kinds of data types and other languages are more often used in those areas.

The R statistics programming language for example, is widely used for statistical analysis of microarray and qPCR data (among others). Again, why are we using it so much? Because it has great libraries for that kind of data (see bioconductor project).

Now when it comes to web development, CGI is not really state of the art today, but people who know Perl may stick to it. In my company though it is no longer used...

I hope this helps.

Bioinformatics deals primarily in text parsing, and Perl is the best programming language for the job, as it is made for string parsing. As the O'Reilly book (Beginning Perl for Bioinformatics) says, "With [Perl]'s highly developed capacity to detect patterns in data, Perl has become one of the most popular languages for biological data analysis." This seems to be a pretty comprehensive response. Perhaps one thing missing, however, is that most biologists (until recently, perhaps) don't have much programming experience at all. The learning curve for Perl is much lower than for compiled languages (like C or Java), and yet Perl still provides a ton of features when it comes to text processing. So what if it takes longer to run? Biologists can definitely handle that. Lab experiments routinely take one hour or more to finish, so waiting a few extra minutes for that data processing to finish isn't going to kill them!

Just note that I am talking here about biologists that program out of necessity. I understand that there are some very skilled programmers and computer scientists out there that use Perl as well, and these comments may not apply to them.

===

People missed out DBI , the Perl abstract database interface that makes it really easy to work with bioinformatic databases.

There is also the one-liner angle. You can write something to reformat data in a single line in Perl and just use the -pe flag to embed that at the command line. Many people using AWK and sed moved to Perl. Even in full programs, file I/O is incredibly easy and quick to write, and text transformation is expressive at a high level compared to any engineering language around. People who use Java or even Python for one-off text transformation are just too lazy to learn another language. Java especially has a high dependence on the JVM implementation and its I/O performance.

At least you know how fast or slow Perl will be everywhere, slightly slower than C I/O. Don't learn grep , cut , sed , or AWK ; just learn Perl as your command line tool, even if you don't produce large programs with it. Regarding CGI, Perl has plenty of better web frameworks such as Catalyst and Mojolicious , but the mindshare definitely came from CGI and bioinformatics being one of the earliest heavy users of the Internet.
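
For instance, two one-liners of the kind described above (the data file names are hypothetical, my own illustration):

# -p wraps the code in a read/print loop: turn a tab-separated file into CSV
perl -pe 's/\t/,/g' records.tsv > records.csv

# -a autosplits each line into @F, replacing cut: print fields 1 and 3
perl -lane 'print join(",", @F[0,2])' records.tsv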

===

Perl is very easy to learn as compared to other languages. It can fully exploit biological data, which is becoming big data. It can manipulate big data and performs well for data manipulation, data curation, and all types of DNA programming; automation of biology has become easy thanks to languages like Perl, Python and Ruby. It is very easy for those who know biology but do not know how to program in other programming languages.

Personally, and I know this will date me, but it's because I learned Perl first. I was being asked to take FASTA files and mix with other FASTA files. Perl was the recommended tool when I asked around.

At the time I'd been through a few computer science classes, but I didn't really know programming all that well.

Perl proved fairly easy to learn. Once I'd gotten regular expressions into my head I was parsing and making new FASTA files within a day.

As has been suggested, I was not a programmer. I was a biochemistry graduate working in a lab, and I'd made the mistake of setting up a Linux server where everyone could see me. This was back in the day when that was an all-day project.

Anyway, Perl became my go-to for anything I needed to do around the lab. It was awesome, easy to use, super flexible, and other Perl guys in other labs were a lot like me.

So, to cut it short, Perl is easy to learn, flexible and forgiving, and it did what I needed.

Once I really got into bioinformatics I picked up R, Python, and even Java. Perl is not that great at helping to create maintainable code, mostly because it is so flexible. Now I just use the language for the job, but Perl is still one of my favorite languages, like a first kiss or something.

To reiterate, most bioinformatics folks learned coding by just kluging stuff together, and most of the time you're just trying to get an answer for the principal investigator (PI), so you can't spend days on code design. Perl is superb at just getting an answer, it probably won't work a second time, and you will not understand anything in your own code if you see it six months later; BUT if you need something now, then it is a good choice even though I mostly use Python now.

I hope that gives you an answer from someone who lived it.

[May 07, 2017] A useful capability of Perl substr function

The Perl substr function can be used as a pseudo-function (an lvalue) on the left side of an assignment. That allows you to insert a substring at an arbitrary point of a string.

For example, the code fragment:

$test_string='<cite>xxx<blockquote>test to show to insert substring into string using substr as pseudo-function</blockquote>';
print "Before: $test_string\n"; 
substr($test_string,length('<cite>xxx'),0)='</cite>';
print "After: $test_string\n"; 
will print
Before: <cite>xxx<blockquote>test to show to insert substring into string using substr as pseudo-function</blockquote>
After:  <cite>xxx</cite><blockquote>test to show to insert substring into string using substr as pseudo-function</blockquote>

Please note that if you used index to find the position of the substring before which you need to insert, you can pass that position to substr directly:

$pos=index($test_string,'<blockquote>');
if( $pos > -1 ){
    substr($test_string,$pos,0)='</cite>';
}
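
As a side note (this variant is mine, not part of the original tip): the four-argument form of substr, available since Perl 5.005, does the same insertion without the lvalue trick; the fourth argument replaces a zero-length slice at the given position.

my $test_string = '<cite>xxx<blockquote>some text</blockquote>';
my $pos = index($test_string, '<blockquote>');
substr($test_string, $pos, 0, '</cite>') if $pos > -1;
print "$test_string\n";
# prints: <cite>xxx</cite><blockquote>some text</blockquote>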

[Mar 20, 2017] Cultured Perl Debugging Perl with ease

Mar 20, 2017 | www.ibm.com

Teodor Zlatanov
Published on November 01, 2000

The Perl debugger comes with its own help ('h' or 'h h', for the long and short help screens, respectively). The perldoc perldebug page (type "perldoc perldebug" at your prompt) has a more complete description of the Perl debugger.
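
For quick reference (this list is a summary of standard debugger commands, not text from the article), the commands used most often at the DB<n> prompt are:

h                 help ('h h' for the condensed summary)
b <line>          set a breakpoint on a line
a <line> <code>   set an action to run each time that line is reached
n                 execute the next statement, stepping over subroutine calls
s                 single-step, descending into subroutine calls
c                 continue until the next breakpoint or the end of the program
x <expr>          evaluate an expression and pretty-print the result
q                 quit the debugger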

So let's start with a buggy program and take a look at how the Perl debugger works. First, it'll attempt to print the first 20 lines in a file.

#!/usr/bin/perl -w

use strict;

foreach (0..20)
{
    my $line = <>;
    print "$_ : $line";
}

When run by itself, buggy.pl fails with the message: "Use of uninitialized value in concatenation (.) at ./buggy.pl line 8, <> line 9." More mysteriously, it prints "9 :" on a line by itself and waits for user input.

Now what does that mean? You may already have spotted the bugbear that came along when we fired up the Perl debugger.

First let's simply make sure the bug is repeatable. We'll set an action on line 8 to print $line where the error occurred, and run the program.

> perl -d ./buggy.pl buggy.pl

Default die handler restored.

Loading DB routines from perl5db.pl version 1.07
Editor support available.

Enter h or `h h' for help, or `man perldebug' for more help.

main::(./buggy.pl:5):   foreach (0..20)
main::(./buggy.pl:6):   {
  DB<1> use Data::Dumper

  DB<1> a 8 print 'The line variable is now ', Dumper $line

The Data::Dumper module loads so that the autoaction can use a nice output format. The autoaction is set to do a print statement every time line 8 is reached. Now let's watch the show.

  DB<2> c
The line variable is now $VAR1 = '#!/usr/bin/perl -w
';
0 : #!/usr/bin/perl -w
The line variable is now $VAR1 = '
';
1 : 
The line variable is now $VAR1 = 'use strict;
';
2 : use strict;
The line variable is now $VAR1 = '
';
3 : 
The line variable is now $VAR1 = 'foreach (0..20)
';
4 : foreach (0..20)
The line variable is now $VAR1 = '{
';
5 : {
The line variable is now $VAR1 = '    my $line = <>;
';
6 :     my $line = <>;
The line variable is now $VAR1 = '    print "$_ : $line";
';
7 :     print "$_ : $line";
The line variable is now $VAR1 = '}
';
8 : }
The line variable is now $VAR1 = undef;
Use of uninitialized value in concatenation (.) at ./buggy.pl line 8, <> line 9.
9 : 

It's clear now that the problem occurred when the line variable was undefined. Furthermore, the program waited for more input. And pressing the Return key eleven more times created the following output:

The line variable is now $VAR1 = '
';
10 : 
The line variable is now $VAR1 = '
';
11 : 
The line variable is now $VAR1 = '
';
12 : 
The line variable is now $VAR1 = '
';
13 : 
The line variable is now $VAR1 = '
';
14 : 
The line variable is now $VAR1 = '
';
15 : 
The line variable is now $VAR1 = '
';
16 : 
The line variable is now $VAR1 = '
';
17 : 
The line variable is now $VAR1 = '
';
18 : 
The line variable is now $VAR1 = '
';
19 : 
The line variable is now $VAR1 = '
';
20 : 
Debugged program terminated.  Use q to quit or R to restart,
  use O inhibit_exit to avoid stopping after program termination,
  h q, h R or h O to get additional info.
  DB<3> 

By now it's obvious that the program is buggy because it unconditionally waits for 20 lines of input, even though there are cases in which the lines will not be there. The fix is to test the $line after reading it from the filehandle:

#!/usr/bin/perl -w

use strict;

foreach (0..20)
{
    my $line = <>;
    last unless defined $line; # exit loop if $line is not defined
    print "$_ : $line";
}

As you see, the fixed program works properly in all cases!

Concluding notes on the Perl debugger

The Emacs editor supports the Perl debugger and makes using it a somewhat better experience. You can read more about the GUD Emacs mode inside Emacs with Info (type M-x info). GUD is a universal debugging mode that works with the Perl debugger (type M-x perldb while editing a Perl program in Emacs).

With a little work, the vi family of editors will also support the Perl debuggers. See the perldoc perldebug page for more information. For other editors, consult each editor's documentation.

The Perl built-in debugger is a powerful tool and can do much more than the simple usage we just looked at. It does, however, require a fair amount of Perl expertise. Which is why we are now going to look at some simpler tools that will better suit beginning and intermediate Perl programmers.

Devel::ptkdb

To use the Devel::ptkdb debugger you first have to download it from CPAN ( see Related topics below) and install it on your system. (Some of you may also need to install the Tk module, also in CPAN.) On a personal note, Devel::ptkdb works best on UNIX systems like Linux. (Although it's not theoretically limited to UNIX-compatible systems, I have never heard of anyone successfully using Devel::ptkdb on Windows. As the old saying goes, anything is possible except skiing through a revolving door.)

If you can't get your system administrator to perform the installation for you (because, for instance, you are the system administrator), you can try doing the following at your prompt (you may need to run this as root):

perl -MCPAN -e'install Tk'
perl -MCPAN -e'install Devel::ptkdb'

After some initial questions, if this is your first time running the CPAN installation routines, you will download and install the appropriate modules automatically.

You can run a program with the ptkdb debugger as follows (using our old buggy.pl example):

perl -d:ptkdb buggy.pl buggy.pl

To read the documentation for the Devel::ptkdb modules, use the command "perldoc Devel::ptkdb". We are using version 1.1071 here. (Although updated versions may come out at any time, they should not look very different from the one we're using.)

A window will come up with the program's source code on the left and a list of watched expressions (initially empty) on the right. Enter the word "$line" in the "Enter Expr:" box. Then click on the "Step Over" button to watch the program execute.

The "Run" button will run the program until it finishes or hits a breakpoint. Clicking on the line number in the source-listing window sets or deletes breakpoints. If you select the "BrkPts" tab on the right, you can edit the list of breakpoints and make them conditional upon a variable or a function. (This is a very easy way to set up conditional breakpoints.)

Ptkdb also has File, Control, Data, Stack, and Bookmarks menus. These menus are all explained in the perldoc documentation. Because it's so easy to use, Ptkdb is an absolute must for beginner and intermediate Perl programmers. It can even be useful for Perl gurus (as long as they don't tell anyone that they're using those new-fangled graphical interfaces).

Writing your own Perl shell

Sometimes using a debugger is overkill. If, for example, you want to test something simple, in isolation from the rest of a large program, a debugger would be too complex for the task. This is where a Perl shell can come in handy.

While other valid approaches to a Perl shell certainly exist, we're going to look at a general solution that works well for most daily work (and I use it all the time). Once you understand the tool, you should feel free to tailor it to your own needs and preferences.

The following code requires the Term::ReadLine module. You can download it from CPAN and install it almost the same way as you did Devel::ptkdb.

#!/usr/bin/perl -w

use Term::ReadLine;
use Data::Dumper;

my $historyfile = $ENV{HOME} . '/.phistory';
my $term = new Term::ReadLine 'Perl Shell';

sub save_list
{
    my $f = shift;
    my $l = shift;
    open F, ">$f";                  # open the history file for writing
    print F "$_\n" foreach @$l;
}

if (open H, $historyfile)
{
    @h = <H>;
    chomp @h;
    close H;
    $h{$_} = 1 foreach @h;
    $term->addhistory($_) foreach keys %h;
}

while ( defined ($_ = $term->readline("My Perl Shell> ")) )
{
    my $res = eval($_);
    warn $@ if $@;
    unless ($@)
    {
        open H, ">>$historyfile";
        print H "$_\n";
        close H;
        print "\n", Data::Dumper->Dump([$res], ['Result']);
    }
    $term->addhistory($_) if /\S/;
}

This Perl shell does several things well, and some things decently.

First of all, it keeps a unique history of commands already entered in your home directory in a file called ".phistory". If you enter a command twice, only one copy will remain (see the function that opens $historyfile and reads history lines from it).

With each entry of a new command, the command list is saved to the .phistory file. So if you enter a command that crashes the shell, the history of your last session is not lost.

The Term::ReadLine module makes it easy to enter commands for execution. Because commands are limited to only one line at a time, it's possible to write good old buggy.pl as:

Perl Shell> use strict
$Result = undef;
Perl Shell> print "$_: " . <> foreach (0..20)
0: ...
1: ...

The problem of course is that the <> input operator ends up eating the shell's own input. So don't use <> or <STDIN> in the Perl shell, because they'll make things more difficult. Try this instead:

Perl Shell> open F, "buggy.pl"
$Result = 1;
Perl Shell> foreach (0..20) { last if eof(F); print "$_: " . <F>; }
0: #!/usr/bin/perl -w
1:
2: use strict;
3:
4: foreach (0..20)
5: {
6:  my $line = <STDIN>;
7:  last unless defined $line; # exit loop if $line is not defined
8:  print "$_ : $line";
9: }
$Result = undef;

As you can see, the shell works for cases where you can easily condense statements into one line. It's also a surprisingly common solution to isolating bugs and provides a great learning environment. Do a little exercise and see if you can write a Perl shell for debugging on your own, and see how much you learn!

Building an arsenal of tools

We've only covered the very basics here of the built-in Perl debugger, Devel::ptkdb, and related tools. There are many more ways to debug Perl. What's important is that you gain an understanding of the debugging process: how a bug is observed, solved, and then fixed. Of course the single most important thing is to make sure you have a comprehensive understanding of your program's requirements.

The Perl built-in debugger is very powerful, but it's not good for beginner or intermediate Perl programmers. (With the exception of Emacs, where it can be a useful tool even for beginners as long as they understand debugging under Emacs.)

The Devel::ptkdb module and debugger are (because of power and usability) by far the best choice for beginning and intermediate programmers. Perl shells, on the other hand, are personalized debugging solutions for isolated problems with small pieces of code.

Every software tester builds his own arsenal of debugging tools, whether it's the Emacs editor with GUD, or a Perl shell, or print statements throughout the code. Hopefully the tools we've looked at here will make your debugging experience a little easier.

[Mar 20, 2017] Cultured Perl One-liners 102

Mar 20, 2017 | www.ibm.com
One-liners 102

More one-line Perl scripts

Teodor Zlatanov
Published on March 12, 2003

This article, as regular readers may have guessed, is the sequel to " One-liners 101 ," which appeared in a previous installment of "Cultured Perl". The earlier article is an absolute requirement for understanding the material here, so please take a look at it before you continue.

The goal of this article, as with its predecessor, is to show legible and reusable code, not necessarily the shortest or most efficient version of a program. With that in mind, let's get to the code!

Tom Christiansen's list

Tom Christiansen posted a list of one-liners on Usenet years ago, and that list is still interesting and useful for any Perl programmer. We will look at the more complex one-liners from the list; the full list is available in the file tomc.txt (see Related topics to download this file). The list overlaps slightly with the " One-liners 101 " article, and I will try to point out those intersections.

Awk is commonly used for basic tasks such as breaking up text into fields; Perl excels at text manipulation by design. Thus, we come to our first one-liner, intended to add two columns in the text input to the script.

Listing 1. Like awk?
# add first and penultimate columns
# NOTE the equivalent awk script:
# awk '{i = NF - 1; print $1 + $i}'
perl -lane 'print $F[0] + $F[-2]'

So what does it do? The magic is in the switches. The -n and -a switches make the script a wrapper around input that splits the input on whitespace into the @F array; the -e switch adds an extra statement into the wrapper. The code of interest actually produced is:

Listing 2: The full Monty
while (<>) {
    @F = split(' ');
    print $F[0] + $F[-2]; # offset -2 means "2nd to last element of the array"
}

Another common task is to print the contents of a file between two markers or between two line numbers.

Listing 3: Printing a range of lines
# 1. just lines 15 to 17
perl -ne 'print if 15 .. 17'

# 2. just lines NOT between line 10 and 20
perl -ne 'print unless 10 .. 20'

# 3. lines between START and END
perl -ne 'print if /^START$/ .. /^END$/'

# 4. lines NOT between START and END
perl -ne 'print unless /^START$/ .. /^END$/'

A problem with the first one-liner in Listing 3 is that it will go through the whole file, even if the necessary range has already been covered. The third one-liner does not have that problem, because it will print all the lines between the START and END markers. If there are eight sets of START/END markers, the third one-liner will print the lines inside all eight sets.

Preventing the inefficiency of the first one-liner is easy: just use the $. variable, which tells you the current line. Start printing if $. is over 15 and exit if $. is greater than 17.

Listing 4: Printing a numeric range of lines more efficiently
# just lines 15 to 17, efficiently
perl -ne 'print if $. >= 15; exit if $. >= 17;'

Enough printing, let's do some editing. Needless to say, if you are experimenting with one-liners, especially ones intended to modify data, you should keep backups. You wouldn't be the first programmer to think a minor modification couldn't possibly make a difference to a one-liner program; just don't make that assumption while editing the Sendmail configuration or your mailbox.

Listing 5: In-place editing
# 1. in-place edit of *.c files changing all foo to bar
perl -p -i.bak -e 's/\bfoo\b/bar/g' *.c

# 2. delete first 10 lines
perl -i.old -ne 'print unless 1 .. 10' foo.txt

# 3. change all the isolated oldvar occurrences to newvar
perl -i.old -pe 's{\boldvar\b}{newvar}g' *.[chy]

# 4. increment all numbers found in these files
perl -i.tiny -pe 's/(\d+)/ 1 + $1 /ge' file1 file2 ....

# 5. delete the lines between START and END
perl -i.old -ne 'print unless /^START$/ .. /^END$/' foo.txt

# 6. binary edit (careful!)
perl -i.bak -pe 's/Mozilla/Slopoke/g' /usr/local/bin/netscape

Why does 1 .. 10 specify line numbers 1 through 10? Read the "perldoc perlop" manual page. Basically, the .. operator iterates through a range. Thus, the script does not count 10 lines, it counts 10 iterations of the loop generated by the -n switch (see "perldoc perlrun" and Listing 2 for an example of that loop).
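For instance, the deletion one-liner above (item 2 of Listing 5) behaves roughly like the following loop, a sketch based on the wrapper shown in Listing 2; a constant operand of the flip-flop operator is implicitly compared against $.:

# rough equivalent of: perl -i.old -ne 'print unless 1 .. 10' foo.txt
while (<>) {
    # the flip-flop is true from the 1st through the 10th input line,
    # i.e. while $. == 1 .. $. == 10 holds
    print unless 1 .. 10;
}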

The magic of the -i switch is that it replaces each file in @ARGV with the version produced by the script's output on that file. Thus, the -i switch makes Perl into an editing text filter. Do not forget to use the backup option to the -i switch. Following the i with an extension will make a backup of the edited file using that extension.

Note how the -p and -n switches are used. The -n switch is used when you want to print data explicitly. The -p switch implicitly inserts a print $_ statement in the loop produced by the -n switch. Thus, the -p switch is better for full processing of a file, while the -n switch is better for selective file processing, where only specific data needs to be printed.
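As a rough sketch (simplified from the loops documented in "perldoc perlrun"), the same substitution behaves like this under each switch:

# perl -pe 's/foo/bar/' -- every line is printed after the code runs
while (<>) {
    s/foo/bar/;
    print;                 # implicit with -p
}

# perl -ne 's/foo/bar/ and print' -- only lines you print explicitly appear
while (<>) {
    s/foo/bar/ and print;
}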

Examples of in-place editing can also be found in the " One-liners 101 " article.

Reversing the contents of a file is not a common task, but the following one-liners show that the -n and -p switches are not always the best choice when processing an entire file.

Listing 6: Reversal of files' fortunes
# 1. command-line that reverses the whole input by lines
# (printing each line in reverse order)
perl -e 'print reverse <>' file1 file2 file3 ....

# 2. command-line that shows each line with its characters backwards
perl -nle 'print scalar reverse $_' file1 file2 file3 ....

# 3. find palindromes in the /usr/dict/words dictionary file
perl -lne '$_ = lc $_; print if $_ eq reverse' /usr/dict/words

# 4. command-line that reverses all the bytes in a file
perl -0777e 'print scalar reverse <>' f1 f2 f3 ...

# 5. command-line that reverses each paragraph in the file but prints
# them in order
perl -00 -e 'print reverse <>' file1 file2 file3 ....

The -0 (zero) flag is very useful if you want to read a full paragraph or a full file into a single string. (It also works with any character number, so you can use a special character as a marker.) Be careful when reading a full file in one command ( -0777 ), because a large file will use up all your memory. If you need to read the contents of a file backwards (for instance, to analyze a log in reverse order), use the CPAN module File::ReadBackwards. Also see " One-liners 101 ," which shows an example of log analysis with File::ReadBackwards.
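If you do reach for File::ReadBackwards, a minimal sketch looks like this (the log file name is only an example):

use File::ReadBackwards;

my $bw = File::ReadBackwards->new('/var/log/messages')
    or die "Can't open the log file: $!";

# readline() hands back lines starting from the end of the file
while (defined(my $line = $bw->readline)) {
    print $line;
}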

Note the superficial similarity between the first and second scripts in Listing 6; functionally, however, they are completely different. The difference lies in using <> in scalar context (as -n does in the second script) or list context (as the first script does).

The third script, the palindrome detector, did not originally have the $_ = lc $_; segment. I added that to catch palindromes like "Bob" that are not the same backwards unless case is ignored.

My addition can be written as $_ = lc; as well, but explicitly stating the subject of the lc() function makes the one-liner more legible, in my opinion.

Paul Joslin's list

Paul Joslin was kind enough to send me some of his one-liners for this article.

Listing 7: Rewrite with a random number
# replace string XYZ with a random number less than 611 in these files
perl -i.bak -pe "s/XYZ/int rand(611)/e" f1 f2 f3

This is a filter that replaces XYZ with a random number less than 611 (that number is arbitrarily chosen). Remember the rand() function returns a random number between 0 and its argument.

Note that XYZ will be replaced by a different random number every time, because the substitution evaluates "int rand(611)" every time.
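To see that the /e replacement is evaluated separately for each match, here is a throwaway one-liner (output differs from run to run):

# three XYZ markers are replaced by three independently drawn random numbers
perl -e '$_ = "XYZ XYZ XYZ"; s/XYZ/int rand(611)/ge; print "$_\n"'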

Listing 8: Revealing the files' base nature
# 1. Run basename on contents of file
perl -pe "s@.*/@@gio" INDEX

# 2. Run dirname on contents of file
perl -pe 's@^(.*/)[^/]+@$1\n@' INDEX

# 3. Run basename on contents of file
perl -MFile::Basename -ne 'print basename $_' INDEX

# 4. Run dirname on contents of file
perl -MFile::Basename -ne 'print dirname $_' INDEX

One-liners 1 and 2 came from Paul, while 3 and 4 were my rewrites of them with the File::Basename module. Their purpose is simple, but any system administrator will find these one-liners useful.

Listing 9: Moving or renaming, it's all the same in UNIX
# 1. write command to mv dirs XYZ_asd to Asd
# (you may have to preface each '!' with a '\' depending on your shell)
ls | perl -pe 's!([^_]+)_(.)(.*)!mv $1_$2$3 \u$2\E$3!gio'

# 2. Write a shell script to move input from xyz to Xyz
ls | perl -ne 'chop; printf "mv $_ %s\n", ucfirst $_;'

For regular users or system administrators, renaming files based on a pattern is a very common task. The scripts above do two kinds of jobs: either remove the file name portion up to the _ character, or change each filename so that its first letter is uppercased according to the Perl ucfirst() function.
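If you would rather skip the generated shell script entirely, the second rename can be done directly in Perl with the built-in rename() function; a small sketch (the glob pattern is only an example):

for my $old (glob '*') {
    my $new = ucfirst $old;
    next if $new eq $old;                          # already capitalized
    rename $old, $new or warn "Could not rename $old: $!";
}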

There is a UNIX utility called "mmv" by Vladimir Lanin that may also be of interest. It allows you to rename files based on simple patterns, and it's surprisingly powerful. See the Related topics section for a link to this utility.

Some of mine

The following is not a one-liner, but it's a pretty useful script that started as a one-liner. It is similar to Listing 7 in that it replaces a fixed string, but the trick is that the replacement itself for the fixed string becomes the fixed string the next time.

The idea came from a newsgroup posting a long time ago, but I haven't been able to find the original version. The script is useful in case you need to replace one IP address with another in all your system files -- for instance, if your default router has changed. The script includes $0 (in UNIX, usually the name of the script) in the list of files to rewrite.

As a one-liner it ultimately proved too complex, and the messages regarding what is about to be executed are necessary when system files are going to be modified.

Listing 10: Replace one IP address with another one
#!/usr/bin/perl -w

use Regexp::Common qw/net/;        # provides the regular expressions for IP matching

my $replacement = shift @ARGV;     # get the new IP address

die "You must provide $0 with a replacement string for the IP 111.111.111.111"
 unless $replacement;

# we require that $replacement be JUST a valid IP address
die "Invalid IP address provided: [$replacement]"
 unless $replacement =~ m/^$RE{net}{IPv4}$/;

# replace the string in each file
foreach my $file ($0, qw[/etc/hosts /etc/defaultrouter /etc/ethers], @ARGV)
{
 # note that we know $replacement is a valid IP address, so this is
 # not a dangerous invocation
 my $command = "perl -p -i.bak -e 's/111.111.111.111/$replacement/g' $file";

 print "Executing [$command]\n";
 system($command);
}

Note the use of the Regexp::Common module, an indispensable resource for any Perl programmer today. Without Regexp::Common, you will be wasting a lot of time trying to match a number or other common patterns manually, and you're likely to get it wrong.

Conclusion

Thanks to Paul Joslin for sending me his list of one-liners. And in the spirit of conciseness that one-liners inspire, I'll refer you to " One-liners 101 " for some closing thoughts on one-line Perl scripts.
Articles by Teodor Zlatanov

Git gets demystified and Subversion control (Aug 27, 2009)

Build simple photo-sharing with Amazon cloud and Perl (Apr 06, 2009)

developerWorks: Use IMAP with Perl, Part 2 (May 26, 2005)

developerWorks: Complex Layered Configurations with AppConfig (Apr 11, 2005)

developerWorks: Perl 6 Grammars and Regular Expressions (Nov 09, 2004)

developerWorks: Genetic Algorithms Simulate a Multi-Celled Organism (Oct 28, 2004)

developerWorks: Cultured Perl: Managing Linux Configuration Files (Jun 15, 2004)

developerWorks: Cultured Perl: Fun with MP3 and Perl, Part 2 (Feb 09, 2004)

developerWorks: Cultured Perl: Fun with MP3 and Perl, Part 1 (Dec 16, 2003)

developerWorks: Inversion Lists with Perl (Oct 27, 2003)

developerWorks: Cultured Perl: One-Liners 102 (Mar 21, 2003)

developerWorks: Developing cfperl, From the Beginning (Jan 22, 2003)

IBM developerWorks: Using the xinetd program for system administration (Nov 28, 2001)

IBM developerWorks: Reading and writing Excel files with Perl (Sep 30, 2001)

IBM developerWorks: Automating UNIX system administration with Perl (Jul 22, 2001)

IBM developerWorks: A programmer's Linux-oriented setup - Optimizing your machine for your needs (Mar 25, 2001)

IBM developerWorks: Cultured Perl: Debugging Perl with ease (Nov 23, 2000)

IBM developerWorks: Cultured Perl: Review of Programming Perl, Third Edition (Sep 17, 2000)

IBM developerWorks: Cultured Perl: Writing Perl programs that speak English Using Parse::RecDescent (Aug 05, 2000)

IBM developerWorks: Perl: Small observations about the big picture (Jul 02, 2000)

IBM developerWorks: Parsing with Perl modules (Apr 30, 2000)

[Dec 27, 2016] Perl is a great choice for a variety of industries

Dec 27, 2016 | opensource.com

Opensource.com

Earlier this year, ActiveState conducted a survey of users who had downloaded our distribution of Perl over the prior year and a half. We received 356 responses: 99 commercial users and 257 individual users. I've been using Perl for a long time, and I expected that lengthy experience would be typical of the Perl community. Our survey results, however, tell a different story.

Almost one-third of the respondents have three or fewer years of experience. Nearly half of all respondents reported using Perl for fewer than five years, a statistic that could be attributed to Perl's outstanding, inclusive community. The powerful and pragmatic nature of Perl and its supportive community make it a great choice for a wide array of uses across a variety of industries.

For a deeper dive, check out this video of my talk at YAPC North America this year.

Perl careers

Right now you can search online and find Perl jobs related to Amazon and BBC, not to mention several positions at Boeing. A quick search on Dice.com, an IT and engineering career website, yielded 3,575 listings containing the word Perl at companies like Amazon, Athena Health, and Northrop Grumman. Perl is also found in the finance industry, where it's primarily used to pull data from databases and process it.

Perl benefits

Perl's consistent utilization is the result of myriad factors, but its open source background is a powerful attribute.

Projects using Perl reduce upfront costs and downstream risks, and when you factor in how clean and powerful Perl is, it becomes quite a compelling option. Add to this that Perl sees yearly releases (more than that, even, as Perl has seen seven releases since 2012), and you can begin to understand why Perl still runs large parts of the web.

Mojolicious, Dancer, and Catalyst are just a few of the powerful web frameworks built for Perl. Designed for simplicity and scalability, these frameworks provide aspiring Perl developers an easy entry point to the language, which might explain some of the numbers from the survey I mentioned above. The inclusive nature of the Perl community draws developers, as well. It's hard to find a more welcoming or active community, and you can see evidence of that in the online groups, open source projects, and regular worldwide conferences and workshops.

Perl modules

Perl also has a mature installation tool chain and a strong testing culture. Anyone who wants to create automated test suites for Perl projects has the assistance of the over 400 testing and quality modules available on CPAN (Comprehensive Perl Archive Network). They won't have to sort through all 400 to choose the best, though: Test::Most is a one-stop shop for the most commonly used test modules. CPAN is one of Perl's biggest advantages over other programming languages. The archive hosts tens of thousands of ready-to-use modules for Perl, and the breadth and variety of those modules is astounding.
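As a sketch of what using one of those testing modules looks like, here is a tiny test file built on Test::Most (the add() function is invented just for the example):

use Test::Most tests => 2;

is( add(2, 2), 4, 'add() handles small integers' );
ok( add(-1, 1) == 0, 'add() handles a negative argument' );

# stand-in for the code under test
sub add { my ($x, $y) = @_; return $x + $y }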

Even with a quick search you can find hardcore numerical modules, ODE (ordinary differential equations) solvers, and countless other types of modules written over the last 20 years by thousands of contributors. This contribution-based archive network helps keep Perl fresh and relevant, proliferating modules like pollen that will blow around to the incredible number of Perl projects out in the world.

You might think that community modules aren't the most reliable, but every distribution of modules on CPAN has been tested on myriad platforms and Perl configurations. As a testament to the determination of Perl users, the community has constructed a testing network and they spend time to make sure each Perl module works well on every available platform. They also maintain extensively-checked libraries that help Perl developers with big data projects.

What we're seeing today is a significant, dedicated community of Perl developers. This is not only because the language is pragmatic, effective, and powerful, but also because of the incredible community that these developers compose. The Perl community doesn't appear to be going anywhere, which means neither is Perl.

[Dec 26, 2016] Perl Advent Calendar Enters Its 17th Year

Dec 26, 2016 | developers.slashdot.org
(perladvent.org)

Posted by EditorDavid on Saturday December 03, 2016 @10:34AM

An anonymous reader writes: Thursday brought this year's first new posts on the Perl Advent Calendar , a geeky tradition first started back in 2000.

Friday's post described Santa's need for fast, efficient code, and the day that a Christmas miracle occurred during Santa's annual code review (involving the is_hashref subroutine from Perl's reference utility library). And for the last five years, the calendar has also had its own Twitter feed .

But in another corner of the North Pole, you can also unwrap the Perl 6 Advent Calendar , which this year celebrates the one-year anniversary of the official launch of Perl 6. Friday's post was by brian d foy, a writer on the classic Perl textbooks Learning Perl and Intermediate Perl (who's now also crowdfunding his next O'Reilly book , Learning Perl 6 ).

foy's post talked about Perl 6's object hashes, while the calendar kicked off its new season Thursday with a discussion about creating Docker images using webhooks triggered by GitHub commits as an example of Perl 6's "whipupitude".

[Nov 18, 2015] Beginning Perl

**** The author tried to cover way too much for an introductory book. If you skip some chapters it might work as an introductory book; otherwise it is tilted toward intermediate. Most material is well written, and it is clear that the author is knowledgeable in the subject he is trying to cover.
Sept 19, 2012 | Amazon.com
Athelbert Z. Athelstan on July 31, 2014

Nice attempt; flawed implementation

Utterly inadequate editing. e.g. In the references chapter, where a backslash is essential to the description at hand, the backslashes don't show. There are numerous other less critical editing failures.

The result makes the book useless as a training aid.

Craig Treptow, June 13, 2013

A Great Book to Learn About Perl

Preface
I have been dabbling in Perl on and off since about 1993. For a decade or so, it was mostly "off", and then I took a position programming Perl full time about a year ago. We currently use perl 5.8.9, and I spend part of my time teaching Perl to old school mainframe COBOL programmers. Dare I say, I am the target market for this book?

Chapter 1
The author takes the time to explain that you should never use `PERL', since it's not an acronym. I find it funny that the section headings utilize an "all caps" font, so the author does end up using `PERL'. That's not even a quibble, I just chuckle at such things.

The author covers the perlbrew utility. Fantastic! What about all of us schmucks that are stuck with Windows at work, or elsewhere? Throw us a bone!! Ok, I don't think there is a bone to throw us, but the author does a great job of covering the options for Windows.

He covers the community! Amazing! Wonderful! Of all things a beginner should know, this is one of them, and it's great that the author has taken some time to describe what's out there.

One other note are the...notes. I love the fact that the author has left little breadcrumbs in the book (each starts with "NOTE" in a grey box), warning you about things that could ultimately hurt you. Case in point, the warning on page 13 regarding the old OO docs that came with 5.8 and 5.10. Wonderful.

Chapter 2
An entire chapter on CPAN? Yes!!! CPAN is a great resource, and part of what makes Perl so great. The author even has some advice regarding how to evaluate a module. Odd, though, there is no mention of the wonderful http://metacpan.org site. That is quickly becoming the favorite of a lot of people.

It is great that the author covers the various cpan clients. However, if you end up in a shop like mine, that ends up being useless as you have to beg some sysadmin for every module you want installed.

Chapter 3
The basics of Perl are covered here in a very thorough way. The author takes you from "What is programming?" to package variables and some of the Perl built-in variables in short order.

Chapter 4
Much more useful stuff is contained in this chapter. I mean I wish pack() and unpack() were made known to me when I first saw Perl, but hey, Perl is huge and I can understand leaving such things out, but I'm happy the author left a lot of them in.

Herein lies another one of those wonderful grey boxes. On page 106 you'll find the box labeled `What is "TRUTH"?' So many seem to stumble over this, so it is great that it's in the book and your attention is drawn to it.

Chapter 5
Here you'll find the usual assortment of control-flow discussion including the experimental given/when, which most will know as a "switch" or "case" statement. The author even has a section to warn you against your temptation to use the "Switch" module. That's good stuff.

Chapter 6
Wow references so early in the book!?!? Upon reflecting a bit, I think this is a good move. They allow so much flexibility with Perl, that I'm happy the author has explored them so early.

Chapter 7
I do find it odd that a chapter on subroutines comes after a chapter on references, though. It seems like subroutines are the obvious choice to get a beginning programmer to start organizing their code. Hence, it should have come earlier.

Having said that, I love the author's technique of "Named Arguments" and calling the hash passed in "%arg_for". It reads so well! I'm a fan and now tend to use this. Of course, it is obvious now that references needed to be discussed first, or this technique would just be "black magic" to a new Perl person.

There are so many other good things in this chapter: Carp, Try::Tiny, wantarray, Closures, recursion, etc. This is definitely a good chapter to read a couple of times and experiment with the code.

Chapter 8
As the author points out, an entire book has been written on the topic of regular expressions (perhaps even more than one book). The author does a good job of pulling out the stuff you're most likely to use and run across in code.

Chapter 9
Here's one that sort of depends on what you do. It's good to know, but if you spend your days writing web apps that never interact with the file system, you'll never use this stuff. Of course thinking that will mean that you'll use it tomorrow, so read the chapter today anyway. :)

Chapter 10
A chapter on just sort, map, and grep? Yes, yes there is, and it is well worth reading. This kind of stuff is usually left for some sort of "intermediate" level book, but it's good to read about it now and try to use them to see how they can help.

Chapter 11
Ah, yes, a good chapter for when you've gotten past a single file with 100 subroutines and want to organize that in a more manageable way. I find it a bit odd that POD comes up in this chapter, rather than somewhere else. I guess it makes sense here, but would you really not document until you got to this point? Perhaps, but hey, at least you're documenting now. :)

Chapter 12 and 13
I like the author's presentation of OO. I think you get a good feel for the "old school" version that you are likely to see in old code bases with a good comparison of how that can be easier by using Moose. These two chapters are worth reading a few times and playing with some code.

Chapter 14
Unit testing for the win! I loved seeing this chapter. I walked into a shop with zero unit tests and have started the effort. Testing has been part of the Perl culture since the beginning. Embrace it. We can't live in a world without unit tests. I've been doing that and it hurts, don't do that to yourself.

Chapter 15
"The Interwebs", really? I don't know what I would have called this chapter, but I'm happy it exists. Plack is covered, yay!!! Actually, this is a good overview of "web programming", and just "how the web works". Good stuff.

Chapter 16
A chapter on DBI? Yes! This is useful. If you work in almost any shop, data will be in a database and you'll need to get to it.

Chapter 17
"Plays well with others"...hmmm....another odd title, yet I can't think of a more appropriate one. How about "The chapter about STDIN, STDOUT, and STDERR". That's pretty catchy, right?

Chapter 18
A chapter on common tasks, yet I've only had to do one of those things ( parsing and manipulating dates). I think my shop is weird, or I just haven't gotten involved with projects that required any of the other activities, such as reading/writing XML.

Including the debugger and a profiler is good. However, how do you use the debugger with a web app? I don't know. Perhaps one day I'll figure it out. That's a section I wish was in the book. The author doesn't mention modulinos, but I think that's the way to use the debugger for stepping through a module. I could be wrong. In any case, a little more on debugger scenarios would have been helpful. A lot of those comments also apply to profiling. I hope I just missed that stuff in this chapter. :)

Chapter 19
Wow, the sort of "leftover" chapter, yet still useful. It is good to know about ORMs for instance, even if you are like me and can't use them at work (yet).

Quick coverage of templates and web frameworks? Yes, and Yes! I love a book that doesn't mention CGI.pm, since it is defunct now. Having said that, there are probably tons of shops that use it (like mine) until their employees demand that it be deleted from systems without remorse. So, it probably should have been given at least some lip service.

I am an admitted "fanboy" of Ovid. Given that, I can see how you might think I got paid for this or something. I didn't. I just think that he did a great job covering Perl with this book. He gives you stuff here that other authors have separated into multiple books. So much, in fact, that you won't even miss the discussion of what was improved with Perl's past v5.10.

All in all, if you buy this book, I think you'll be quite happy with it.

[Nov 16, 2015] undef can be used as a dummy variable in split function

Instead of

($id, $not_used, $credentials, $home_dir, $shell ) = split /:/;

You can write

($id, undef, $credentials, $home_dir, $shell ) = split /:/;

In Perl 5.22 they even added some pretty fancy (and generally useless) stuff. Instead of

my(undef, $card_num, undef, undef, undef, $count) = split /:/;

You can write

use v5.22; 
my(undef, $card_num, (undef)x3, $count) = split /:/;

[Nov 15, 2015] Web Basics with LWP

Aug 20, 2002 | Perl.com

LWP (short for "Library for WWW in Perl") is a popular group of Perl modules for accessing data on the Web. Like most Perl module-distributions, each of LWP's component modules comes with documentation that is a complete reference to its interface. However, there are so many modules in LWP that it's hard to know where to look for information on doing even the simplest things.

Introducing you to using LWP would require a whole book--a book that just happens to exist, called Perl & LWP. This article offers a sampling of recipes that let you perform common tasks with LWP.

Getting Documents with LWP::Simple

If you just want to access what's at a particular URL, the simplest way to do it is to use LWP::Simple's functions.

In a Perl program, you can call its get($url) function. It will try getting that URL's content. If it works, then it'll return the content; but if there's some error, it'll return undef.

  my $url = 'http://freshair.npr.org/dayFA.cfm?todayDate=current';
    # Just an example: the URL for the most recent /Fresh Air/ show
  use LWP::Simple;
  my $content = get $url;
  die "Couldn't get $url" unless defined $content;

  # Then go do things with $content, like this:

  if($content =~ m/jazz/i) {
    print "They're talking about jazz today on Fresh Air!\n";
  } else {
    print "Fresh Air is apparently jazzless today.\n";
  }

The handiest variant on get is getprint, which is useful in Perl one-liners. If it can get the page whose URL you provide, it sends it to STDOUT; otherwise it complains to STDERR.


  % perl -MLWP::Simple -e "getprint 'http://cpan.org/RECENT'"

This is the URL of a plain-text file. It lists new files in CPAN in the past two weeks. You can easily make it part of a tidy little shell command, like this one that mails you the list of new Acme:: modules:


  % perl -MLWP::Simple -e "getprint 'http://cpan.org/RECENT'"  \
     | grep "/by-module/Acme" | mail -s "New Acme modules! Joy!" $USER

There are other useful functions in LWP::Simple, including one function for running a HEAD request on a URL (useful for checking links, or getting the last-revised time of a URL), and two functions for saving and mirroring a URL to a local file. See the LWP::Simple documentation for the full details, or Chapter 2, "Web Basics" of Perl & LWP for more examples.
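As a sketch of those two ideas, head() returns header information in list context, and mirror() copies a URL to a local file only when the remote copy has changed (the local file name here is just an example):

  use LWP::Simple;

  # head() returns (content type, length, last-modified time, ...) on success
  my ($content_type, $length, $mod_time) = head('http://cpan.org/RECENT');
  print "Last modified: ", scalar localtime($mod_time), "\n" if $mod_time;

  # mirror() returns the HTTP status code of the request
  my $status = mirror('http://cpan.org/RECENT', '/tmp/RECENT.txt');
  print "mirror() returned HTTP status $status\n";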

The Basics of the LWP Class Model

LWP::Simple's functions are handy for simple cases, but its functions don't support cookies or authorization; they don't support setting header lines in the HTTP request; and generally, they don't support reading header lines in the HTTP response (most notably the full HTTP error message, in case of an error). To get at all those features, you'll have to use the full LWP class model.

While LWP consists of dozens of classes, the two that you have to understand are LWP::UserAgent and HTTP::Response. LWP::UserAgent is a class for "virtual browsers," which you use for performing requests, and HTTP::Response is a class for the responses (or error messages) that you get back from those requests.

The basic idiom is $response = $browser->get($url), or fully illustrated:


  # Early in your program:
  
  use LWP 5.64; # Loads all important LWP classes, and makes
                #  sure your version is reasonably recent.

  my $browser = LWP::UserAgent->new;
  
  ...
  
  # Then later, whenever you need to make a get request:
  my $url = 'http://freshair.npr.org/dayFA.cfm?todayDate=current';
  
  my $response = $browser->get( $url );
  die "Can't get $url -- ", $response->status_line
   unless $response->is_success;

  die "Hey, I was expecting HTML, not ", $response->content_type
   unless $response->content_type eq 'text/html';
     # or whatever content-type you're equipped to deal with

  # Otherwise, process the content somehow:
  
  if($response->content =~ m/jazz/i) {
    print "They're talking about jazz today on Fresh Air!\n";
  } else {
    print "Fresh Air is apparently jazzless today.\n";
  }
There are two objects involved: $browser, which holds an object of the class LWP::UserAgent, and then the $response object, which is of the class HTTP::Response. You really need only one browser object per program; but every time you make a request, you get back a new HTTP::Response object, which will have some interesting attributes, such as is_success, status_line, content_type, and content, all of which appear in the examples that follow.

Adding Other HTTP Request Headers

The most commonly used syntax for requests is $response = $browser->get($url), but in truth, you can add extra HTTP header lines to the request by adding a list of key-value pairs after the URL, like so:


  $response = $browser->get( $url, $key1, $value1, $key2, $value2, ... );

For example, here's how to send more Netscape-like headers, in case you're dealing with a site that would otherwise reject your request:


  my @ns_headers = (
   'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)',
   'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, 
        image/pjpeg, image/png, */*',
   'Accept-Charset' => 'iso-8859-1,*,utf-8',
   'Accept-Language' => 'en-US',
  );

  ...
  
  $response = $browser->get($url, @ns_headers);

If you weren't reusing that array, you could just go ahead and do this:



  $response = $browser->get($url,
   'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)',
   'Accept' => 'image/gif, image/x-xbitmap, image/jpeg, 
        image/pjpeg, image/png, */*',
   'Accept-Charset' => 'iso-8859-1,*,utf-8',
   'Accept-Language' => 'en-US',
  );

If you were only going to change the 'User-Agent' line, you could just change the $browser object's default line from "libwww-perl/5.65" (or the like) to whatever you like, using LWP::UserAgent's agent method:


   $browser->agent('Mozilla/4.76 [en] (Win98; U)');
Enabling Cookies

A default LWP::UserAgent object acts like a browser with its cookies support turned off. There are various ways of turning it on, by setting its cookie_jar attribute. A "cookie jar" is an object representing a little database of all the HTTP cookies that a browser can know about. It can correspond to a file on disk (the way Netscape uses its cookies.txt file), or it can be just an in-memory object that starts out empty, and whose collection of cookies will disappear once the program is finished running.

To give a browser an in-memory empty cookie jar, you set its cookie_jar attribute like so:


  $browser->cookie_jar({});

To give it a copy that will be read from a file on disk, and will be saved to it when the program is finished running, set the cookie_jar attribute like this:


  use HTTP::Cookies;
  $browser->cookie_jar( HTTP::Cookies->new(
    'file' => '/some/where/cookies.lwp',
        # where to read/write cookies
    'autosave' => 1,
        # save it to disk when done
  ));

That file will be an LWP-specific format. If you want to access the cookies in your Netscape cookies file, you can use the HTTP::Cookies::Netscape class:


  use HTTP::Cookies;
    # yes, loads HTTP::Cookies::Netscape too
  
  $browser->cookie_jar( HTTP::Cookies::Netscape->new(
    'file' => 'c:/Program Files/Netscape/Users/DIR-NAME-HERE/cookies.txt',
        # where to read cookies
  ));

You could add an 'autosave' => 1 line as we did earlier, but at time of writing, it's uncertain whether Netscape might discard some of the cookies you could be writing back to disk.

Posting Form Data

Many HTML forms send data to their server using an HTTP POST request, which you can send with this syntax:


 $response = $browser->post( $url,
   [
     formkey1 => value1, 
     formkey2 => value2, 
     ...
   ],
 );
Or if you need to send HTTP headers:

 $response = $browser->post( $url,
   [
     formkey1 => value1, 
     formkey2 => value2, 
     ...
   ],
   headerkey1 => value1, 
   headerkey2 => value2, 
 );

For example, the following program makes a search request to AltaVista (by sending some form data via an HTTP POST request), and extracts from the HTML the report of the number of matches:


  use strict;
  use warnings;
  use LWP 5.64;
  my $browser = LWP::UserAgent->new;
  
  my $word = 'tarragon';
  
  my $url = 'http://www.altavista.com/sites/search/web';
  my $response = $browser->post( $url,
    [ 'q' => $word,  # the Altavista query string
      'pg' => 'q', 'avkw' => 'tgz', 'kl' => 'XX',
    ]
  );
  die "$url error: ", $response->status_line
   unless $response->is_success;
  die "Weird content type at $url -- ", $response->content_type
   unless $response->content_type eq 'text/html';

  if( $response->content =~ m{AltaVista found ([0-9,]+) results} ) {
    # The substring will be like "AltaVista found 2,345 results"
    print "$word: $1\n";
  } else {
    print "Couldn't find the match-string in the response\n";
  }
Sending GET Form Data

Some HTML forms convey their form data not by sending the data in an HTTP POST request, but by making a normal GET request with the data stuck on the end of the URL. For example, if you went to imdb.com and ran a search on Blade Runner, the URL you'd see in your browser window would be:


  http://us.imdb.com/Tsearch?title=Blade%20Runner&restrict=Movies+and+TV

To run the same search with LWP, you'd use this idiom, which involves the URI class:


  use URI;
  my $url = URI->new( 'http://us.imdb.com/Tsearch' );
    # makes an object representing the URL
  
  $url->query_form(  # And here the form data pairs:
    'title'    => 'Blade Runner',
    'restrict' => 'Movies and TV',
  );
  
  my $response = $browser->get($url);

See Chapter 5, "Forms" of Perl & LWP for a longer discussion of HTML forms and of form data, as well as Chapter 6 through Chapter 9 for a longer discussion of extracting data from HTML.

Absolutizing URLs

The URI class that we just mentioned above provides all sorts of methods for accessing and modifying parts of URLs (such as asking what sort of URL it is with $url->scheme, and asking what host it refers to with $url->host, and so on), as described in the docs for the URI class. However, the methods of most immediate interest are the query_form method seen above, and now the new_abs method for taking a probably relative URL string (like "../foo.html") and getting back an absolute URL (like "http://www.perl.com/stuff/foo.html"), as shown here:


  use URI;
  $abs = URI->new_abs($maybe_relative, $base);

For example, consider this program that matches URLs in the HTML list of new modules in CPAN:


  use strict;
  use warnings;
  use LWP 5.64;
  my $browser = LWP::UserAgent->new;
  
  my $url = 'http://www.cpan.org/RECENT.html';
  my $response = $browser->get($url);
  die "Can't get $url -- ", $response->status_line
   unless $response->is_success;
  
  my $html = $response->content;
  while( $html =~ m/<A HREF=\"(.*?)\"/g ) {    
      print "$1\n";  
  }

When run, it emits output that starts out something like this:


  MIRRORING.FROM
  RECENT
  RECENT.html
  authors/00whois.html
  authors/01mailrc.txt.gz
  authors/id/A/AA/AASSAD/CHECKSUMS
  ...

However, if you actually want to have those be absolute URLs, you can use the URI module's new_abs method, by changing the while loop to this:


  while( $html =~ m/<A HREF=\"(.*?)\"/g ) {    
      print URI->new_abs( $1, $response->base ) ,"\n";
  }

(The $response->base method from HTTP::Message is for returning the URL that should be used for resolving relative URLs--it's usually just the same as the URL that you requested.)

That program then emits nicely absolute URLs:


  http://www.cpan.org/MIRRORING.FROM
  http://www.cpan.org/RECENT
  http://www.cpan.org/RECENT.html
  http://www.cpan.org/authors/00whois.html
  http://www.cpan.org/authors/01mailrc.txt.gz
  http://www.cpan.org/authors/id/A/AA/AASSAD/CHECKSUMS
  ...

See Chapter 4, "URLs", of Perl & LWP for a longer discussion of URI objects.

Of course, using a regexp to match hrefs is a bit simplistic, and for more robust programs, you'll probably want to use an HTML-parsing module like HTML::LinkExtor, or HTML::TokeParser, or even maybe HTML::TreeBuilder.

Other Browser Attributes

LWP::UserAgent objects have many attributes for controlling how they work. Here are a few notable ones:
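As a sketch, here are three commonly used ones; timeout, agent, and max_size are all documented LWP::UserAgent methods, and the values below are only examples:

  use LWP::UserAgent;
  my $browser = LWP::UserAgent->new;
  $browser->timeout(15);                 # give up on a request after 15 seconds
  $browser->agent('MyReportBot/1.0');    # the User-Agent string sent with requests
  $browser->max_size(512 * 1024);        # stop reading a response after 512 KB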

For more options and information, see the full documentation for LWP::UserAgent.

Writing Polite Robots

If you want to make sure that your LWP-based program respects robots.txt files and doesn't make too many requests too fast, you can use the LWP::RobotUA class instead of the LWP::UserAgent class.

LWP::RobotUA class is just like LWP::UserAgent, and you can use it like so:


  use LWP::RobotUA;
  my $browser = LWP::RobotUA->new(
    'YourSuperBot/1.34', 'you@yoursite.com');
    # Your bot's name and your email address

  my $response = $browser->get($url);

But LWP::RobotUA adds these features:
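Continuing with the $browser object created above, here is a sketch of the main extras (delay and use_sleep are documented LWP::RobotUA methods; the robots.txt check happens automatically on each request):

  $browser->delay( 10/60 );   # wait at least 10 seconds between requests to one host
  $browser->use_sleep( 1 );   # sleep() when a request has to wait, instead of failing it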

For more options and information, see the full documentation for LWP::RobotUA.

Using Proxies

In some cases, you will want to (or will have to) use proxies for accessing certain sites or for using certain protocols. This is most commonly the case when your LWP program is running (or could be running) on a machine that is behind a firewall.

To make a browser object use proxies that are defined in the usual environment variables (HTTP_PROXY), just call the env_proxy method on a user-agent object before you go making any requests on it. Specifically:


  use LWP::UserAgent;
  my $browser = LWP::UserAgent->new;
  
  # And before you go making any requests:
  $browser->env_proxy;

For more information on proxy parameters, see the LWP::UserAgent documentation, specifically the proxy, env_proxy, and no_proxy methods.

HTTP Authentication

Many Web sites restrict access to documents by using "HTTP Authentication". This isn't just any form of "enter your password" restriction, but is a specific mechanism where the HTTP server sends the browser an HTTP code that says "That document is part of a protected 'realm', and you can access it only if you re-request it and add some special authorization headers to your request".

For example, the Unicode.org administrators stop email-harvesting bots from harvesting the contents of their mailing list archives by protecting them with HTTP Authentication, and then publicly stating the username and password (at http://www.unicode.org/mail-arch/)--namely username "unicode-ml" and password "unicode".

For example, consider this URL, which is part of the protected area of the Web site:


  http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html

If you access that with a browser, you'll get a prompt like "Enter username and password for 'Unicode-MailList-Archives' at server 'www.unicode.org'"; graphical browsers show an equivalent dialog.

In LWP, if you just request that URL, like this:


  use LWP 5.64;
  my $browser = LWP::UserAgent->new;

  my $url =
   'http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html';
  my $response = $browser->get($url);

  die "Error: ", $response->header('WWW-Authenticate') || 
    'Error accessing',
    #  ('WWW-Authenticate' is the realm-name)
    "\n ", $response->status_line, "\n at $url\n Aborting"
   unless $response->is_success;

Then you'll get this error:


  Error: Basic realm="Unicode-MailList-Archives"
   401 Authorization Required
   at http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html
   Aborting at auth1.pl line 9.  [or wherever]

because the $browser doesn't know the username and password for that realm ("Unicode-MailList-Archives") at that host ("www.unicode.org"). The simplest way to let the browser know about this is to use the credentials method to let it know about a username and password that it can try using for that realm at that host. The syntax is:


  $browser->credentials(
    'servername:portnumber',
    'realm-name',
    'username' => 'password'
  );

In most cases, the port number is 80, the default TCP/IP port for HTTP; and you usually call the credentials method before you make any requests. For example:


  $browser->credentials(
    'reports.mybazouki.com:80',
    'web_server_usage_reports',
    'plinky' => 'banjo123'
  );

So if we add the following to the program above, right after the $browser = LWP::UserAgent->new; line:


  $browser->credentials(  # add this to our $browser 's "key ring"
    'www.unicode.org:80',
    'Unicode-MailList-Archives',
    'unicode-ml' => 'unicode'
  );

and then when we run it, the request succeeds, instead of causing the die to be called.

Accessing HTTPS URLs

When you access an HTTPS URL, it'll work for you just like an HTTP URL would--if your LWP installation has HTTPS support (via an appropriate Secure Sockets Layer library). For example:


  use LWP 5.64;
  my $url = 'https://www.paypal.com/';   # Yes, HTTPS!
  my $browser = LWP::UserAgent->new;
  my $response = $browser->get($url);
  die "Error at $url\n ", $response->status_line, "\n Aborting"
   unless $response->is_success;
  print "Whee, it worked!  I got that ",
   $response->content_type, " document!\n";

If your LWP installation doesn't have HTTPS support set up, then the response will be unsuccessful, and you'll get this error message:


  Error at https://www.paypal.com/
   501 Protocol scheme 'https' is not supported
   Aborting at paypal.pl line 7.   [or whatever program and line]

If your LWP installation does have HTTPS support installed, then the response should be successful, and you should be able to consult $response just like with any normal HTTP response.

For information about installing HTTPS support for your LWP installation, see the helpful README.SSL file that comes in the libwww-perl distribution.

Getting Large Documents

When you're requesting a large (or at least potentially large) document, a problem with the normal way of using the request methods (like $response = $browser->get($url)) is that the response object in memory will have to hold the whole document--in memory. If the response is a 30-megabyte file, this is likely to be quite an imposition on this process's memory usage.

A notable alternative is to have LWP save the content to a file on disk, instead of saving it up in memory. This is the syntax to use:


  $response = $ua->get($url,
                         ':content_file' => $filespec,
                      );

For example,


  $response = $ua->get('http://search.cpan.org/',
                         ':content_file' => '/tmp/sco.html'
                      );

When you use this :content_file option, the $response will have all the normal header lines, but $response->content will be empty.

Note that this ":content_file" option isn't supported under older versions of LWP, so you should consider adding use LWP 5.66; to check the LWP version, if you think your program might run on systems with older versions.

If you need to be compatible with older LWP versions, then use this syntax, which does the same thing:


  use HTTP::Request::Common;
  $response = $ua->request( GET($url), $filespec );

Resources

Remember, this article is just the most rudimentary introduction to LWP--to learn more about LWP and LWP-related tasks, you really must read the LWP module documentation and the book Perl & LWP.


Copyright 2002, Sean M. Burke. You can redistribute this document and/or modify it, but only under the same terms as Perl itself.

[Nov 15, 2015] Unescaped left brace in regex is deprecated

Here the maintainers went in the wrong direction. Those guys are playing dangerous games and holding users hostage. I wonder why this warning was added at all, but in any case it is implemented incorrectly: it raised the warning in $zone =~/^(\d{4})\/(\d{1,2})\/(\d{1,2})$/, which breaks compatibility with a huge mass of Perl scripts and Perl books. Isn't this stupid? I think this is a death sentence for version 5.22. Reading the Perl delta, it looks like the developers do not have any clear idea of how version 5 of the language should evolve, and they do not write any documents about it that could be discussed.
Notable quotes:
"... A literal { should now be escaped in a pattern ..."
www.perlmonks.org

in reply to "Unescaped left brace in regex is deprecated"

From the perldelta for Perl v5.22.0:

A literal { should now be escaped in a pattern

If you want a literal left curly bracket (also called a left brace) in a regular expression pattern, you should now escape it by either preceding it with a backslash (\{) or enclosing it within square brackets [{], or by using \Q; otherwise a deprecation warning will be raised. This was first announced as forthcoming in the v5.16 release; it will allow future extensions to the language to happen.
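For example, these are the three spellings the perldelta text suggests for matching a literal left brace (matching against $_ here, purely for illustration):

/\{/;     # backslash-escaped
/[{]/;    # inside a character class
/\Q{/;    # quoted with \Q (quotes metacharacters to the end of the pattern)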

[Nov 15, 2015] Perl LWP

LWP is a better deal than CGI.pm.
June 30, 2002 | Amazon.com

Read sample chapters online...

The LWP (Library for WWW in Perl) suite of modules lets your programs download and extract information from the Web. Perl & LWP shows how to make web requests, submit forms, and even provide authentication information, and it demonstrates using regular expressions, tokens, and trees to parse HTML. This book is a must have for Perl programmers who want to automate and mine the Web.

Gavin

Excellent coverage of LWP, packed full of useful examples, on July 16, 2002

I was definitely interested when I first heard that O'Reilly were publishing a book on LWP. LWP is a definitive collection of perl modules covering everything you could think of doing with URIs, HTML, and HTTP. While 'web services' are the buzzword friendly technology of the day, sometimes you need to roll your sleeves up and get a bit dirty scraping screens and hacking at HTML. For such a deep subject, this book weighs in at a slim 242 pages. This is a very good thing. I'm far too busy to read these massive shelf-destroying tomes that seem to be churned out recently.

It covers everything you need to know with concise examples, which is what makes this book really shine. You start with the basics using LWP::Simple through to more advanced topics using LWP::UserAgent, HTTP::Cookies, and WWW::RobotRules. Sean shows finger saving tips and shortcuts that take you more than a couple notches above what you can learn from the lwpcook manpage, with enough depth to satisfy somebody who is an experienced LWP hacker.

This book is a great reference, just flick through and you'll find a relevant chapter with an example to save the day. Chapters include filling in forms and extracting data from HTML using regular expressions, then more advanced topics using HTML::TokeParser, and then my preferred tool, the author's own HTML::TreeBuilder. The book ends with a chapter on spidering, with excellent coverage of design and warnings to get your started on your web trawling.

[Nov 01, 2015] Stupid open() tricks

perltricks.com

Create an anonymous temporary file

If I give open a filename of an explicit undef and the read-write mode (+> or +<), Perl opens an anonymous temporary file:

open my $fh, '+>', undef;

Perl actually creates a named file and opens it, but immediately unlinks the name. No one else will be able to get to that file because no one else has the name for it. If I had used File::Temp, I might leave the temporary file there, or something else might be able to see it while I'm working with it.
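A typical use of that trick is to spool data somewhere invisible and then seek back to re-read it; a minimal sketch:

use v5.10;

open my $fh, '+>', undef or die "Can't open an anonymous temporary file: $!";
say {$fh} 'scratch data';
seek $fh, 0, 0;              # rewind to the beginning
print while <$fh>;           # read it back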

Print to a string

If my perl is compiled with PerlIO (it probably is), I can open a filehandle on a scalar variable if the filename argument is a reference to that variable.

open my $fh, '>', \ my $string;

This is handy when I want to capture output for an interface that expects a filehandle:

something_that_prints( $fh );

Now $string contains whatever was printed by the function. I can inspect it by printing it:

say "I captured:\n$string";

Read lines from a string

I can also read from a scalar variable by opening a filehandle on it.

open my $fh, '<', \$string;

Now I can play with the string line-by-line without messing around with regex anchors or line endings:

while ( <$fh> ) { ... }

I write about these sorts of filehandle-on-string tricks in Effective Perl Programming.

Make a pipeline

Most Unix programmers probably already know that they can read the output from a command as the input for another command. I can do that with Perl's open too:

use v5.10;

open my $pipe, '-|', 'date';
while ( <$pipe> ) {
    say "$_";
}

This reads the output of the date system command and prints it. But, I can have more than one command in that pipeline. I have to abandon the three-argument form which purposely prevents this nonsense:

open my $pipe, qq(cat '$0' | sort |);
while ( <$pipe> ) {
    print "$.: $_";
}

This captures the text of the current program, sorts each line alphabetically and prints the output with numbered lines. I might get a Useless Use of cat Award for that program that sorts the lines of the program, but it's still a feature.

gzip on the fly

In Gzipping data directly from Perl, I showed how I could compress data on the fly by using Perl's gzip IO layer. This is handy when I have limited disk space:

open my $fh, '>:gzip', $filename
    or die "Could not write to $filename: $!";

while ( $_ = something_interesting() ) {
    print { $fh } $_;
}

I can go the other direction as well, reading directly from compressed files when I don't have enough space to uncompress them first:

open my $fh, '<:gzip', $filename
    or die "Could not read from $filename: $!";

while ( <$fh> ) {
    print;
}

Change STDOUT

I can change the default output filehandle with select if I don't like standard output, but I can do that in another way. I can change STDOUT for the times when the easy way isn't fun enough. David Farrell showed some of this in How to redirect and restore STDOUT.

First I can "dupe" the standard output filehandle with the special & mode:

use v5.10;

open my $STDOLD, '>&', STDOUT;

Any of the file modes will work there as long as I append the & to it.

I can then re-open STDOUT:

open STDOUT, '>>', 'log.txt';
say 'This should be logged to log.txt.';

When I'm ready to change it back, I do the same thing:

open STDOUT, '>&', $STDOLD;
say 'This should show in the terminal';

If I only have the file descriptor, perhaps because I'm working with an old Unix programmer who thinks vi is a crutch, I can use that:

open my $fh, "<&=$fd"
    or die "Could not open filehandle on $fd\n";

This file descriptor has a three-argument form too:

open my $fh, '<&=', $fd
    or die "Could not open filehandle on $fd\n";

I can have multiple filehandles that go to the same place since they are different names for the same file descriptor:

use v5.10;

open my $fh, '>>&=', fileno(STDOUT);

say        'Going to default';
say $fh    'Going to duped version. fileno ' . fileno($fh);
say STDOUT 'Going to STDOUT. fileno '        . fileno($fh);

All of these print to STDOUT.

[Oct 31, 2015] A preview of Perl 5.22 by brian d foy

The release is quite disappointing, to say the least ... The warning about a non-escaped { in regexes is a SNAFU, as it is implemented so crudely that it does not distinguish important cases like \d{3} or .{3}, where no backslash should ever be required.
April 10, 2015 | perltricks.com

Perl v5.22 is bringing myriad new features and ways of doing things, making its perldelta file much more interesting than most releases. While I normally wait until after the first stable release to go through these features over at The Effective Perler, here's a preview of some of the big news.

A safer ARGV

The line input operator, <> looks at the @ARGV array for filenames to open and read through the ARGV filehandle. It has the same meta-character problem as the two-argument open. Special characters in the filename might do shell things. To get around this unintended feature (which I think might be useful if that's what you want), there's a new line-input operator, <<>>, that doesn't treat any character as special:

while ( <<>> ) {  # new, safe line input operator
    ...;
}
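
A small sketch of the difference, assuming we control @ARGV directly (don't feed the first loop anything you care about; with <> the crafted entry really does run a command):

use strict;
use warnings;
use v5.22;

# Old operator: two-argument open semantics, so this runs echo(1)
@ARGV = ('echo pwned |');
while (<>) {
    print;
}

# New operator: the same entry is treated as a literal (missing) filename
@ARGV = ('echo pwned |');
while (<<>>) {
    print;
}
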
CGI.pm and Module::Build disappear from core

The Perl maintainers have been stripping modules from the Standard Library. Sometimes that's because no one uses (or should use) that module anymore, no one wants to maintain that module, or it's better to get it from CPAN where the maintainer can update it faster than the Perl release cycle. You can still find these modules on CPAN, though.

The CGI.pm module, only one of Lincoln Stein's amazing contributions to the Perl community, is from another era. It was light years ahead of its Perl 4 predecessor, cgi.pl. It did everything, including HTML generation. This was the time before robust templating systems came around, and CGI.pm was good. But, they've laid it to rest.

Somehow, Module::Build fell out of favor. Before it, building and installing Perl modules depended on a non-Perl tool, make. That's a portability problem. However, we already know the user has Perl, so if there were a pure Perl tool that could do the same thing we could solve the portability problem. We could also do much fancier things. It was the wave of the future. I didn't really buy into Module::Build, although I had used it for a few distributions, but I'm still a bit sad to see it go. It had some technical limitations and was unmaintained for a bit, and now it's been cut loose. David Golden explains more about that in Paying respect to Module::Build.

This highlights a long-standing and usually undiscovered problem with modules that depend on modules in the Standard Library. For years, most authors did not bother to declare those dependencies because Perl was there and its modules must be there too. When those modules move to a CPAN-only state, they end up with undeclared dependencies. This also shows up in some Linux distributions that violate the Perl license by removing some modules or putting them in a different package. Either way, always declare a dependency on everything you use, regardless of its provenance.

Hexadecimal floating point values

Have you always felt too constrained by ten digits, but were also stuck with non-integers? Now your problems are solved with hexadecimal floating point numbers.

We already have exponential notation, which uses e to mark the exponent, as in 1.23e4. But e is a hexadecimal digit, so it can't mark the exponent of a hex number. Instead, we use p, and the exponent denotes a power of two:

use v5.22;

my $num = 0x1.8p1;   # 1.5 * 2**1 == 3
Variable aliases

We can now assign to the reference version of a non-reference variable. This creates an alias for the referenced value.

use v5.22;
use feature qw(refaliasing);

\%other_hash = \%hash;

I think we'll discover many interesting uses for this, and probably some dangerous ones, but the use case in the docs looks interesting. We can now assign to something other than a scalar for the foreach control variable:

use v5.22;
use feature qw(refaliasing);

foreach \my %hash ( @array_of_hashes ) { # named hash control variable
    foreach my $key ( keys %hash ) { # named hash now!
        ...;
    }
}

I don't think I'll use that particular pattern since I'm comfortable with references, but if you really hate the dereferencing arrow, this might be for you. Note that v5.12 allows us to write keys $hash_ref without the dereferencing %. See my Effective Perl items Use array references with the array operators, but also Don't use auto-dereferencing with each or keys.

Repetition in list assignment

Perl can assign one list of scalars to another. In Learning Perl we show assigning to undef. I could make dummy variables:

my($name, $card_num, $addr, $home, $work, $count) = split /:/;

But if I don't need all of those variables, I can put placeholder undefs in the assignment list:

my(undef, $card_num, undef, undef, undef, $count) = split /:/;

Those consecutive undefs can be a problem, as well as ugly. I don't have to count out separate undefs now:

use v5.22;

my(undef, $card_num, (undef)x3, $count) = split /:/;
List pipe opens on Win32

The three-argument open can take a pipe mode, which didn't previously work on Windows. Now it does, to the extent that the list form of system works on Win32:

open my $fh, '-|', 'some external command' or die;

I always have to check my notes to remember which side of the | the - goes on: '-|' reads from the command's output, while '|-' writes to the command's input. Those in the unix world know - as a special filename for standard input in many commands.
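
The other direction works the same way: with '|-' the filehandle feeds the command's standard input. A short sketch (it assumes a system with sort(1) available):

use strict;
use warnings;

# '|-' means we write to the command; '-|' means we read from it
open my $pipe, '|-', 'sort' or die "Could not run sort: $!";
print {$pipe} "$_\n" for qw(pear apple orange);
close $pipe or warn "sort exited with status $?";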

Various small fixes

We also get many smaller fixes I think are worth a shout out. Many of these are clean ups to warts and special cases:

[Oct 31, 2015] Starting with perl 5.14, local($_) will always strip all magic from $_, to make it possible to safely reuse $_ in a subroutine.

[Oct 06, 2015] Larry Wall Unveils Perl 6.0.0

October 06, 2015 | developers.slashdot.org

An anonymous reader writes: Last night Larry Wall unveiled the first development release of Perl 6, joking that now a top priority was fixing bugs that could be mistaken for features. The new language features meta-programming - the ability to define new bits of syntax on your own to extend the language, and even new infix operators. Larry also previewed what one reviewer called "exotic and new" features, including the sequence operator and new control structures like "react" and "gather and take" lists. "We don't want their language to run out of steam," Larry told the audience. "It might be a 30- or 40-year language. I think it's good enough."

Can't find independent verification cruff (171569)
Neither perl6.org nor its mailing lists seem to mention anything about this. The links in TFA are blocked by OpenDNS too.
Re:Can't find independent verification (Score:5, Informative) Tuesday October 06, 2015 @06:33PM (#50674655)
It's a development release (timed to coincide with Larry's birthday in September, according to Wikipedia).

Here's URLs where the event was announced.

http://www.meetup.com/SVPerl/e... [meetup.com]

http://perl6releasetalk.ticket... [ticketleap.com]

mbkennel (97636) on Tuesday October 06, 2015

Bugs mistaken as features? (Score:4, Funny)

Last night Larry Wall unveiled the first development release of Perl 6, joking that now a top priority was fixing bugs that could be mistaken for features.

Sounds good.

Anonymous Coward on Tuesday October 06, 2015 @07:20PM (#50675019)

No Coroutines (Score:5, Insightful)

No coroutines. So sad. That still leaves Lua and Stackless Python as the only languages with symmetric, transparent coroutines without playing games with the C stack.

Neither Lua nor Stackless Python implement recursion on the C stack. Python and apparently Perl6/MoarVM implement recursion on the C stack, which means that they can't easily create multiple stacks for juggling multiple flows of control. That's why in Python and Perl6 you have the async/await and take/gather syntax, whereas in Lua coroutine.resume and coroutine.yield can be called from any function, regardless of where it is in the stack call frame, without having to adorn the function definition. Javascript is similarly crippled. All the promise/future/lambda stuff could be made infinitely more elegant with coroutines, but all modern Javascript implementations assume a single call stack, so the big vendors rejected coroutines.

In Lua a new coroutine has the same allocation cost as a new lambda/empty closure. And switching doesn't involve dumping or restoring CPU registers. So in Lua you can use coroutines to implement great algorithms without thinking twice. Not just as a fancy green threading replacement, but for all sorts of algorithms where the coroutine will be quickly discarded (just as coroutines' little brothers, generators, are typically short lived). Kernel threads and "fibers" are comparatively heavyweight, both in terms of performance and memory, compared to VM-level coroutines.

The only other language with something approximating cheap coroutines is Go.

I was looking forward to Perl 6. But I think I'll skip it. The top two language abstractions I would have loved to see were coroutines and lazy evaluation. Perl6 delivered poor approximations of those things. Those approximations are okay for the most-used design patterns, but aren't remotely composable to the same degree. And of course the "most used" patterns are that way because of existing language limitations.

These days I'm mostly a Lua and C guy who implements highly concurrent network services. I was really looking forward to Perl6 (I always liked Perl 5), but it remains the case that the only interesting language alternatives in my space are Go and Rust. But Rust can't handle out-of-memory (OOM). (Impossible to, e.g., catch OOM when creating a Box). Apparently Rust developers think that it's okay to crash a service because a request failed, unless you want to create 10,000 kernel threads, which is possible but stupid. Too many Rust developers are desktop GUI and game developers with a very narrow, skewed experience about dealing with allocation and other resource failures. Even Lua can handle OOM trivially and cleanly without trashing the whole VM or unwinding the entire call stack. (Using protected calls, which is what Rust _should_ add.) So that basically just leaves Go, which is looking better and better. Not surprising given how similar Go and Lua are.

But the problem with Go is that you basically have to leave the C world behind for large applications (you can't call out to a C library from a random goroutine because it has to switch to a special C stack, which means you don't want to have 10,000 concurrent goroutines each calling into a third-party C library), whereas Lua is designed to treat C code as a first-class environment. (And you have to meet it half way.)

To make Lua coroutines integrate with C code which yields, you have to implement your own continuation logic because the C stack isn't preserved when yielding. It's not unlike chaining generators in Python, which requires a little effort. A tolerable issue, and doable in the few cases it's necessary in C, whereas in Python and now Perl6 it's _always_ an issue and hindrance.

Greyfox (87712) on Tuesday October 06, 2015 @08:45PM (#50675635) Homepage Journal

Re:Oh no (Score:4, Insightful)

Well you CAN write maintainable code in perl, you just have to use some discipline. Turn "use strict;" on everywhere, break your project up into packages across functional lines and have unit tests on everything. You know, all that stuff that no companies ever do. Given the choice between having to maintain a perl project and a ruby one, I'd take the perl project every time. At least you'll have some chance that the developers wrote some decent code, if only in self defense since they usually end up maintaining it themselves for a few years.

murr (214674) on Tuesday October 06, 2015 @09:13PM (#50675793)

"First Development Release" ? (Score:4, Interesting)

If that was the first development release, what on earth was the 2010 release of Rakudo Star?

The problem with Perl 6 was never a lack of development releases, it's 15 years of NOTHING BUT development releases.

hummassa (157160) on Wednesday October 07, 2015 @12:06AM (#50676619) Homepage Journal

Re:Perl? LOL. (Score:1)


DBI is stable; it just works. I have had lots of headaches with every single one of the other database middleware layers. Except DBI.

Lisandro (799651) on Wednesday October 07, 2015 @03:01AM (#50677185)

Re: Perl? LOL. (Score:3)

Same experience here. Say what you want about Perl 5 but it is still one of the fastest interpreted languages around.

I do a lot of prototyping in Python. But if I want speed, I usually use Perl.

randalware (720317) on Tuesday October 06, 2015 @10:03PM (#50676095) Journal

Perl (Score:5, Insightful)

I used perl a lot over the years.

Comparing it to a compiled language (C, Ada, Fortran, etc.) or a web-centric language (Java, JavaScript, PHP, etc.) is not a good comparison.

When I needed something done (and needed more than the shell) and had to maintain it, I wrote it in Perl: all sorts of sysadmin widgets. Many are still being used today (15+ years later).
I wrote clean, decent code with comments & modules.

How much code have you written? And had it stay running for decades?

The people that took over my positions when I changed jobs never had a problem updating the code or using it.

bytesex (112972) on Wednesday October 07, 2015 @05:48AM (#50677687) Homepage

Re:Perl? LOL. (Score:5, Insightful)

The cool kids jumped on the Python bandwagon saying Perl was old, but in all this time they have still failed to:

Which are all the reasons I love and use perl.

Persistent variables via state()

perlsub - perldoc.perl.org

Beginning with Perl 5.10.0, you can declare variables with the state keyword in place of my. For that to work, though, you must have enabled that feature beforehand, either by using the feature pragma, or by using -E on one-liners (see feature). Beginning with Perl 5.16, the CORE::state form does not require the feature pragma.

The state keyword creates a lexical variable (following the same scoping rules as my) that persists from one subroutine call to the next. If a state variable resides inside an anonymous subroutine, then each copy of the subroutine has its own copy of the state variable. However, the value of the state variable will still persist between calls to the same copy of the anonymous subroutine. (Don't forget that sub { ... } creates a new subroutine each time it is executed.)

For example, the following code maintains a private counter, incremented each time the gimme_another() function is called:

 
use feature 'state';
sub gimme_another { state $x; return ++$x }
 

And this example uses anonymous subroutines to create separate counters:

 
use feature 'state';
sub create_counter {
    return sub { state $x; return ++$x }
}
 

Also, since $x is lexical, it can't be reached or modified by any Perl code outside.

When combined with variable declaration, simple scalar assignment to state variables (as in state $x = 42 ) is executed only the first time. When such statements are evaluated subsequent times, the assignment is ignored. The behavior of this sort of assignment to non-scalar variables is undefined.
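
A minimal sketch of that initialization rule; the assignment to the state variable runs only on the first call:

use strict;
use warnings;
use feature qw(say state);

sub next_id {
    state $id = 42;   # initialized only on the first call
    return $id++;
}

say next_id() for 1 .. 3;   # prints 42, 43, 44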

Persistent variables with closures

Just because a lexical variable is lexically (also called statically) scoped to its enclosing block, eval, or do FILE, this doesn't mean that within a function it works like a C static. It normally works more like a C auto, but with implicit garbage collection.

Unlike local variables in C or C++, Perl's lexical variables don't necessarily get recycled just because their scope has exited. If something more permanent is still aware of the lexical, it will stick around. So long as something else references a lexical, that lexical won't be freed--which is as it should be. You wouldn't want memory being free until you were done using it, or kept around once you were done. Automatic garbage collection takes care of this for you.

This means that you can pass back or save away references to lexical variables, whereas to return a pointer to a C auto is a grave error. It also gives us a way to simulate C's function statics. Here's a mechanism for giving a function private variables with both lexical scoping and a static lifetime. If you do want to create something like C's static variables, just enclose the whole function in an extra block, and put the static variable outside the function but in the block.

 
{
    my $secret_val = 0;
    sub gimme_another {
        return ++$secret_val;
    }
}
# $secret_val now becomes unreachable by the outside
# world, but retains its value between calls to gimme_another
 

If this function is being sourced in from a separate file via require or use, then this is probably just fine. If it's all in the main program, you'll need to arrange for the my to be executed early, either by putting the whole block above your main program, or more likely, placing merely a BEGIN code block around it to make sure it gets executed before your program starts to run:

 
BEGIN {
    my $secret_val = 0;
    sub gimme_another {
        return ++$secret_val;
    }
}
 

See BEGIN, UNITCHECK, CHECK, INIT and END in perlmod for more about these special triggered code blocks.

If declared at the outermost scope (the file scope), then lexicals work somewhat like C's file statics. They are available to all functions in that same file declared below them, but are inaccessible from outside that file. This strategy is sometimes used in modules to create private variables that the whole module can see.
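
A small sketch of that strategy, using a hypothetical module whose file-scoped lexical acts as a private, module-wide counter:

package My::Counter;   # hypothetical module, saved as My/Counter.pm
use strict;
use warnings;

# File-scoped lexical: visible to every sub declared below it in this
# file, invisible to any code that uses the module.
my $count = 0;

sub bump  { return ++$count }
sub total { return $count }

1;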

[Jun 11, 2015] The Fall Of Perl, The Web's Most Promising Language By Conor Myhrvold

Rumors of Perl's death are greatly exaggerated ;-). Some people, like me, are not attracted to Python -- for many reasons. Perl is more flexible and, in the Unix scripting area, a more powerful, higher-level language. Some aspects of Python I like, but all in all I stay with Perl.
Fast Company Business + Innovation

And the rise of Python. Does Perl have a future?

I first heard of Perl when I was in middle school in the early 2000s. It was one of the world's most versatile programming languages, dubbed the Swiss army knife of the Internet. But compared to its rival Python, Perl has faded from popularity. What happened to the web's most promising language?

Perl's low entry barrier compared to compiled, lower level language alternatives (namely, C) meant that Perl attracted users without a formal CS background (read: script kiddies and beginners who wrote poor code). It also boasted a small group of power users ("hardcore hackers") who could quickly and flexibly write powerful, dense programs that fueled Perl's popularity to a new generation of programmers.

A central repository (the Comprehensive Perl Archive Network, or CPAN) meant that for every person who wrote code, many more in the Perl community (the Programming Republic of Perl) could employ it. This, along with the witty evangelism by eclectic creator Larry Wall, whose interest in language ensured that Perl led in text parsing, was a formula for success during a time in which lots of text information was spreading over the Internet.

As the 21st century approached, many pearls of wisdom were wrought to move and analyze information on the web. Perl did have a learning curve-often meaning that it was the third or fourth language learned by adopters-but it sat at the top of the stack.

"In the race to the millennium, it looks like C++ will win, Java will place, and Perl will show," Wall said in the third State of Perl address in 1999. "Some of you no doubt will wish we could erase those top two lines, but I don't think you should be unduly concerned. Note that both C++ and Java are systems programming languages. They're the two sports cars out in front of the race. Meanwhile, Perl is the fastest SUV, coming up in front of all the other SUVs. It's the best in its class. Of course, we all know Perl is in a class of its own."

Then came the upset.

The Perl vs. Python Grudge Match

Then Python came along. Compared to Perl's straight-jacketed scripting, Python was a lopsided affair. It even took after its namesake, Monty Python's Flying Circus. Fittingly, most of Wall's early references to Python were lighthearted jokes at its expense.

Well, the millennium passed, computers survived Y2K, and my teenage years came and went. I studied math, science, and humanities but kept myself an arm's distance away from typing computer code. My knowledge of Perl remained like the start of a new text file: cursory, followed by a lot of blank space to fill up.

In college, CS friends at Princeton raved about Python as their favorite language (in spite of popular professor Brian Kernighan on campus, who helped popularize C). I thought Python was new, but I later learned it was around when I grew up as well, just not visible on the charts.

By the late 2000s Python was not only the dominant alternative to Perl for many text parsing tasks typically associated with Perl (i.e. regular expressions in the field of bioinformatics) but it was also the most proclaimed popular language, talked about with elegance and eloquence among my circle of campus friends, who liked being part of an up-and-coming movement.

Side By Side Comparison: Binary Search

Despite Python and Perl's well documented rivalry and design decision differences-which persist to this day-they occupy a similar niche in the programming ecosystem. Both are frequently referred to as "scripting languages," even though later versions are retro-fitted with object oriented programming (OOP) capabilities.

Stylistically, Perl and Python have different philosophies. Perl's best-known motto is "There's More Than One Way to Do It". Python is designed to have one obvious way to do it. Python's construction gave an advantage to beginners: A syntax with more rules and stylistic conventions (for example, requiring whitespace indentations for functions) ensured newcomers would see a more consistent set of programming practices; code that accomplished the same task would look more or less the same. Perl's construction favors experienced programmers: a more compact, less verbose language with built-in shortcuts which made programming for the expert a breeze.

During the dotcom era and the tech recovery of the mid to late 2000s, high-profile websites and companies such as Dropbox (Python) and Amazon and Craigslist (Perl), in addition to some of the world's largest news organizations (BBC, Perl) used the languages to accomplish tasks integral to the functioning of doing business on the Internet.

But over the course of the last 15 years, not only has the way companies do business changed and grown, but so have the tools they use, unequally and to the detriment of Perl. (A growing trend that was identified in the last comparison of the languages, "A Perl Hacker in the Land of Python," as well as, from the Python side, a Pythonista's evangelism aggregator, also done in the year 2000.)

Perl's Slow Decline

Today, Perl's growth has stagnated. At the Orlando Perl Workshop in 2013, one of the talks was titled "Perl is not Dead, It is a Dead End," and claimed that Perl now existed on an island. Once Perl programmers checked out, they always left for good, never to return. Others point out that Perl is left out of the languages to learn first-in an era where Python and Java had grown enormously, and a new entrant from the mid-2000s, Ruby, continues to gain ground by attracting new users in the web application arena (via Rails), followed by the Django framework in Python (PHP has remained stable as the simplest option as well).

In bioinformatics, where Perl's position as the most popular scripting language powered many 1990s breakthroughs like genetic sequencing, Perl has been supplanted by Python and the statistical language R (a variant of S-plus and descendent of S, also developed in the 1980s).

In scientific computing, my present field, Python, not Perl, is the open source overlord, even expanding at Matlab's expense (also a child of the 1980s, and similarly retrofitted with OOP abilities). And upstart PHP grew in size to the point where it is now arguably the most common language for web development (although its position is dynamic, as Ruby and Python have quelled PHP's dominance and are now entrenched as legitimate alternatives.)

While Perl is not in danger of disappearing altogether, it is in danger of losing cultural relevance, an ironic fate given Wall's love of language. How has Perl become the underdog, and can this trend be reversed? (And, perhaps more importantly, will Perl 6 be released!?)

How I Grew To Love Python

Why Python, and not Perl? Perhaps an illustrative example of what happened to Perl is my own experience with the language.

In college, I still stuck to the contained environments of Matlab and Mathematica, but my programming perspective changed dramatically in 2012. I realized lacking knowledge of structured computer code outside the "walled garden" of a desktop application prevented me from fully simulating hypotheses about the natural world, let alone analyzing data sets using the web, which was also becoming an increasingly intellectual and financially lucrative skill set.

One year after college, I resolved to learn a "real" programming language in a serious manner: An all-in immersion taking me over the hump of knowledge so that, even if I took a break, I would still retain enough to pick up where I left off. An older alum from my college who shared similar interests-and an experienced programmer since the late 1990s-convinced me of his favorite language to sift and sort through text in just a few lines of code, and "get things done": Perl. Python, he dismissed, was "what academics used to think." I was about to be acquainted formally.

Before making a definitive decision on which language to learn, I took stock of online resources, lurked on PerlMonks, and acquired several used O'Reilly books, the Camel Book and the Llama Book, in addition to other beginner books. Yet once again, Python reared its head, and even Perl forums and sites dedicated to the language were lamenting the digital siege their language was succumbing to. What happened to Perl? I wondered. Ultimately undeterred, I found enough to get started (quality over quantity, I figured!), and began studying the syntax and working through examples.

But it was not to be. In trying to overcome the engineered flexibility of Perl's syntax choices, I hit a wall. I had adopted Perl for text analysis, but upon accepting an engineering graduate program offer, switched to Python to prepare.

By this point, CPAN's enormous advantage had been whittled away by ad hoc, hodgepodge efforts from uncoordinated but overwhelming groups of Pythonistas that now assemble in Meetups, at startups, and on college and corporate campuses to evangelize the Zen of Python. This has created a lot of issues with importing (pointed out by Wall), and package download synchronizations to get scientific computing libraries (as I found), but has also resulted in distributions of Python such as Anaconda that incorporate the most important libraries besides the standard library to ease the time tariff on imports.

As if to capitalize on the zeitgeist, technical book publisher O'Reilly ran this ad, inflaming Perl devotees.

By 2013, Python was the language of choice in academia, where I was to return for a year, and whatever it lacked in OOP classes, it made up for in college classes. Python was like Google, who helped spread Python and employed van Rossum for many years. Meanwhile, its adversary Yahoo (largely developed in Perl) did well, but comparatively fell further behind in defining the future of programming. Python was the favorite and the incumbent; roles had been reversed.

So after six months of Perl-making effort, this straw of reality broke the Perl camel's back and caused a coup that overthrew the programming Republic which had established itself on my laptop. I sheepishly abandoned the llama. Several weeks later, the tantalizing promise of a new MIT edX course teaching general CS principles in Python, in addition to numerous n00b examples, made Perl's syntax all too easy to forget instead of regret.

Measurements of the popularity of programming languages, in addition to friends and fellow programming enthusiasts I have met in the development community in the past year and a half, have confirmed this trend, along with the rise of Ruby in the mid-2000s, which has also eaten away at Perl's ubiquity in stitching together programs written in different languages.

While historically many arguments could explain away any one of these studies-perhaps Perl programmers do not cheerlead their language as much, since they are too busy productively programming. Job listings or search engine hits could mean that a programming language has many errors and issues with it, or that there is simply a large temporary gap between supply and demand.

The concomitant picture, and one that many in the Perl community now acknowledge, is that Perl is now essentially a second-tier language, one that has its place but will not be among the first several languages known outside of the Computer Science domain, such as Java, C, or now Python.

The Future Of Perl (Yes, It Has One)

I believe Perl has a future, but it could be one for a limited audience. Present-day Perl is more suitable to users who have worked with the language from its early days, already dressed to impress. Perl's quirky stylistic conventions, such as using $ in front to declare variables, stand in contrast to the other declarative symbol $ for practical programmers today-the money that goes into the continued development and feature set of Perl's frenemies such as Python and Ruby. And the high activation cost of learning Perl, instead of implementing a Python solution.

Ironically, much in the same way that Perl jested at other languages, Perl now finds itself at the receiving end. What's wrong with Perl, from my experience? Perl's eventual problem is that if the Perl community cannot attract beginner users like Python successfully has, it runs the risk of becoming like Children of Men, dwindling away to a standstill; vast repositories of hieroglyphic code looming in sections of the Internet and in data center partitions like the halls of the Mines of Moria. (Awe-inspiring and historical? Yes. Lively? No.)

Perl 6 has been an ongoing development since 2000. Yet after 14 years it is not officially done, making it the equivalent of Chinese Democracy for Guns N' Roses. In Larry Wall's words: "We're not trying to make Perl a better language than C++, or Python, or Java, or JavaScript. We're trying to make Perl a better language than Perl. That's all." Perl may be on the same self-inflicted path to perfection as Axl Rose, underestimating not others but itself. "All" might still be too much.

Absent a game-changing Perl release (which still could be "too little, too late") people who learn to program in Python have no need to switch if Python can fulfill their needs, even if it is widely regarded as second or third best in some areas. The fact that you have to import a library, or put up with some extra syntax, is significantly easier than the transactional cost of learning a new language and switching to it. So over time, Python's audience stays young through its gateway strategy that van Rossum himself pioneered, Computer Programming for Everybody. (This effort has been a complete success. For example, at MIT Python replaced Scheme as the first language of instruction for all incoming freshman, in the mid-2000s.)

Python Plows Forward

Python continues to gain footholds one by one in areas of interest, such as visualization (where Python still lags behind other language graphics, like Matlab, Mathematica, or the recent d3.js), website creation (the Django framework is now a mainstream choice), scientific computing (including NumPy/SciPy), parallel programming (mpi4py with CUDA), machine learning, and natural language processing (scikit-learn and NLTK) and the list continues.

While none of these efforts are centrally coordinated by van Rossum himself, a continually expanding user base, and getting to CS students first before other languages (such as even Java or C), increases the odds that collaborations in disciplines will emerge to build a Python library for themselves, in the same open source spirit that made Perl a success in the 1990s.

As for me? I'm open to returning to Perl if it can offer me a significantly different experience from Python (but "being frustrating" doesn't count!). Perhaps Perl 6 will be that release. However, in the interim, I have heeded the advice of many others with a similar dilemma on the web. I'll just wait and C

[Feb 28, 2012] Perl Books for modern Perl programming by Chromatic

February 28, 2012

We've just put letter and A4 sized PDFs of Modern Perl: the Book online. This is the new edition, updated for 5.14 and 2011-2012.

As usual, these electronic versions are free to download. Please do. Please share them with friends, family, colleagues, coworkers, and interested people.

Of course we're always thrilled if you buy a printed copy of Modern Perl: the book. Yet even if you don't, please share a copy with your friends, tell other people about it, and (especially) post kind reviews far and wide.

We'd love to see reviews on places like Slashdot, LWN, any bookseller, and any other popular tech site.

We're working on other forms, like ePub and Kindle. That's been the delay (along with personal business); the previous edition's Kindle formatting didn't have the quality we wanted, so we're doing it ourselves to get things right. I hope to have those available in the next couple of weeks, but that depends on how much more debugging we have to do.

Thanks, as always, for reading.

[Jan 18, 2012] I am looking forward to learn perl

LinkedIn

Q: Hi, I'm looking forward to learning Perl. I'm a systems administrator (Unix). I'm interested in an online course; any recommendations would be highly appreciated. Syed

A: I used to teach sysadmins Perl in a corporate environment and I can tell you that the main danger of learning Perl for a system administrator is the overcomplexity that many Perl books blatantly sell. In this sense anything written by Randal L. Schwartz is suspect, and Learning Perl is a horrible book to start with. I wonder how many sysadmins dropped Perl after trying to learn from this book.

See http://www.softpanorama.org/Bookshelf/perl.shtml

It might be that the best way is first to replace awk in your scripts with Perl, and only then gradually start writing full-blown Perl scripts. For inspiration you can look at collections of Perl one-liners, but please beware that some (many) of them are way too clever to be useful. Useless overcomplexity rules here too.
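
For example, many awk idioms translate almost mechanically into Perl one-liners (a rough sketch; the filenames are placeholders):

# awk '{ print $1 }' file                        # first field of each line
perl -lane 'print $F[0]' file

# awk -F: '$3 > 500 { print $1 }' /etc/passwd    # users with UID above 500
perl -F: -lane 'print $F[0] if $F[2] > 500' /etc/passwd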

I would also recommend avoiding the OO features of Perl that many books oversell. A lot can be done using regular Algol-style programming with subroutines and by translating awk into Perl. OO has its uses, but like many other programming paradigms it is oversold.

Perl is very well integrated in Unix (better than any of the competitors) and due to this it opens up for sysadmins levels of productivity simply incomparable with what is achievable using the shell. You can automate a lot of routine work and enhance existing monitoring systems and such with ease if you know Perl well.

[Jul 27, 2011] PAC

freshmeat.net
PAC provides a GUI to configure SSH and Telnet connections, including usernames, passwords, EXPECT regular expressions, and macros.

It is similar in function to SecureCRT or Putty. It is intended for people who connect to many servers through SSH.

It can automate logins and command executions.

Tags Perl GTK+ SSH Telnet GNOME Ubuntu Expect
Licenses GPLv3
Operating Systems Linux Ubuntu Debian
Implementation Perl GTK+ Expect
Translations English

[Mar 18, 2011] New Perl news site launches by Ranguard

March 17, 2011
http://perlnews.org/ has just launched and will be providing a source for major announcements related to The Perl Programming Language (http://www.perl.org/). Find out more at http://perlnews.org/about/ - or if you have a story submit it http://perlnews.org/submit/.

All stories are approved to ensure relevance.

Thanks, The Perl News Team.

[Mar 17, 2011] Stupid "make" Tricks: Workflow Control with "make" by Mark Leighton Fisher

March 16, 2011 | blogs.perl.org

Following up Stupid Unix Tricks: Workflow Control with GNU Make -- this trick works on any platform with a make(1) program, including Windows, QNX, VMS, and z/OS.

It also serves to de-couple dependency checking and the workflow execution engine from the rest of your program (with the caveat that your program may need to interpret the output from make(1).)

[Feb 16, 2011] Perl-Critic freshmeat.net

The problem is that Damian Conway's book Perl Best Practices contains a lot of questionable advice ;-)

Perl::Critic is an extensible framework for creating and applying coding standards to Perl source code.

Essentially, it is a static source code analysis engine. It is distributed with a number of Perl::Critic::Policy modules that attempt to enforce various coding guidelines.

Most Policy modules are based on Damian Conway's book Perl Best Practices. However, Perl::Critic is not limited to PBP, and will even support Policies that contradict Conway. You can enable, disable, and customize those Polices through the Perl::Critic interface. You can also create new Policy modules that suit your own tastes

Tags: Perl Admin Tools, Perl, Programming style, Program Understanding,

[Feb 15, 2011] PAC 2.5.5.4

Implementation: Perl, GTK+, SSH, Telnet, GNOME & Expect

PAC provides a GUI to configure SSH and Telnet connections, including usernames, passwords, EXPECT regular expressions, and macros. It is similar in function to SecureCRT or Putty.

It is intended for people who connect to many servers through SSH.

It can automate logins and command executions.

Selected Comments

archenroot

Well, PAC is really a nice piece of software, but I experience 2 issues: no support for using SSH keys, but I can live without that; and when I open a terminal and switch to another application on the desktop (GNOME type), I can't directly use PAC for some time (about 10-20 seconds), then it refreshes itself and I can continue using the opened terminal. Do you have any suggestion about what this could be?

Thank you .

[Jan 21, 2011] Which language is best for system admin except for shell LinkedIn

While I personally prefer Perl, paradoxically the answer is "it does not matter", if and only if the aspiring sysadmin pays sufficient attention to Unix philosophy. See also Unix philosophy - Wikipedia, the free encyclopedia
Lisa Penland

@Garry - I began my career on sys5 rel 3. I'm rather partial to ksh. Have to say though - I tend to avoid AIX like the plague..and THAT's a personal preference.

@robin - there's only one topic more likely to incite a flame war - and that's "what's the best editor"

If you are really interested in becoming a *nix sysadmin be certain to check out the unix philosophy as well. It's not enough to understand the mechanical how...a good admin understands the whys. Anyone on this list can help you learn the steps to perform x function. It's much harder to teach someone how to think like an admin. http://www.faqs.org/docs/artu/ch01s06.html

Nikolai Bezroukov

Many years ago I wrote a small tutorial "Introduction to Perl for Unix System Administrators"
which covers basic features of the language. Might be useful for some people here.

http://www.softpanorama.org/Scripting/Perlbook/index.shtml

As for Perl vs Python vs Ruby this kind of religious discussion is funny and entertaining but Lisa Penland made a very important point: "If you are really interested in becoming a *nix sysadmin be certain to check out the Unix philosophy as well."

-- Nikolai

P.S. There is one indisputable fact on the ground that we need to remember discussing this topic in context of enterprise Unix environment:

Typically in a large corporation sysadmin need to support two or more flavors of Unix. In many organizations installation of additional scripting language on all Unix boxes is a formidable task that requires political support on pretty high level of hierarchy. My God, even to make bash available on all boxes is an almost impossible task :-).

[Jan 15, 2011] When

When is an extremely simple personal calendar program, aimed at the Unix geek who wants something minimalistic.

It can keep track of things you need to do on particular dates.

It's a very short and simple program, so you can easily tinker with it yourself.

It doesn't depend on any libraries, so it's easy to install. You should be able to install it on any system where Perl is available, even if you don't have privileges for installing libraries. Its file format is a simple text file, which you can edit in your favorite editor.

[Jan 14, 2011] Brian Kernighan - Random thoughts on scripting languages

PDF. Pretty superficial. I would expect better from the famous author...

Other scripting languages

[Dec 25, 2010] 23 Years of Culture Hacking With Perl -

December 25 | Slashdot

Greg Lindahl:

Blekko's search engine and NoSQL database are written in Perl. We haven't had problems hiring experienced, smart Perl people, and we've also had no trouble getting experienced Python people to learn Perl.

grcumb: Re:Rambling, barely coherent, self-indulgent.

I suppose I learned a lot about the Perl community though.

Larry may sound glib most of the time, but if you took the time to look, you'd see method in his madness. He chooses to make his points lightly, because that's an important part of the message. Perl as a language is designed to reflect the idiosyncrasies of the human brain. It treats dogmatism as damage and routes around it. As Larry wrote, it is trollish in its nature. But its friendly, playful brand of trollishness is what allows it to continue to evolve as a culture.

Strip away the thin veneer of sillyness and you'll see that everything I've written has been lifted directly from Larry's missive. Just because he likes to act a little silly doesn't mean he's wrong.

One of the worst things a programmer can do is invest too much ego, pride or seriousness in his work. That is the path to painfully over-engineered, theoretically correct but practically useless software that often can't survive a single revision.

Perl as a language isn't immune to any of these sins, but as a culture, it goes to some lengths to mitigate against them.

[Dec 25, 2010] Day 24 Yule the Ancient Troll-tide Carol

Perl 6 Advent Calendar

Perl is not just a technology; it's a culture. Just as Perl is a technology of technological hacking, so too Perl is a culture of cultural hacking. Giving away a free language implementation with community support was the first cultural hack in Perl's history, but there have been many others since, both great and small. You can see some of those hacks in that mirror you are holding. Er... that is holding you.

The second big cultural hack was demonstrating to Unix culture that its reductionistic ideas could be subverted and put to use in a non-reductionistic setting.

Dual licensing was a third cultural hack to make Perl acceptable both to businesses and the FSF.

Yet another well-known hack was writing a computer book that was not just informative but also, gasp, entertaining! But these are all shallow hacks. The deep hack was to bootstrap an entire community that is continually hacking on itself recursively in a constructive way (well, usually).

[Dec 20, 2010] What hurts you the most in Perl

Overcomplexity junkies in Perl. That's a very important threat that can undermine the language's future.
LinkedIn

Steve Carrobis

Perl is a far better applications-type language than Java/C/C#. Each has its niche. Threads were always an issue in Perl, and like OO, if you don't need it or know it, don't use it. My issue with Perl is when people get overly obfuscated with their code because they think that fewer characters and a few pointers make the code faster. Unless you do some really smart OO-esque building, all you are doing is making it harder to figure out what you were thinking about. And please, Perl programmers, don't buy into "self-documenting code". I am an old mainframer, and self-documenting code meant that as you wrote you added comments to the core parts of the code ... I can call my subroutine "apple" to describe it, but is it really an apple? Or is it a tomato or pomegranate? If written properly, Perl is very efficient code, and like all the other languages, if written incorrectly it's HORRIBLE. I have been writing Perl since almost before 3.0 ;-)

That's my 3 cents.. Have a HAPPY and a MERRY!

Nikolai Bezroukov

@steve Thanks for a valuable comment about the threat of overcomplexity junkies in Perl. That's a very important threat that can undermine the language's future.

@Gabor: A well-known fact is that PHP, which is a horrible language both in its general design and in the implementation of most features you mentioned, is very successful and is widely used for large Web applications with a database backend (MediaWiki is one example). Also, if we think about all the dull, stupid and unreliable Java coding of large business applications that we see in the marketplace, the question arises whether we want this type of success ;-)

@Douglas: Mastering Perl requires a slightly higher level of qualification from developers than "Basic-style" development in PHP or commercial Java development (where Java typically plays the role of Cobol), which is mainstream these days. Also, many important factors are outside the technical domain: the ecosystem for Java is tremendous and is supported by players with deep pockets. The same is true for Python. Still, Perl has unique advantages, is universally deployed on Unix, and as such is and always will be attractive for thinking developers.

I think that for many large business applications, which these days often means a Web application with a database backend, one can use the virtual appliance model and use OS facilities for multitasking. There is nothing wrong with this approach on modern hardware. Here Perl provides important advantages due to its good integration with Unix.

Also, structuring a large application into modules that use pipes and sockets as the communication mechanism often provides very good maintainability. Pointers are also very helpful and unique to Perl: typically scripting languages do not provide pointers. Perl does, and as such gives the developer unique power and flexibility (with additional risks as an inevitable side effect).

Another important advantage of Perl is that it is a higher-level language than Python (to say nothing about Java) and stimulates the use of prototyping, which is tremendously important for large projects, as the initial specification is usually incomplete and incorrect. Also, despite the proliferation of overcomplexity junkies in the Perl community, some aspects of Perl prevent an excessive number of layers/classes, a common trap that undermines large projects in Java. Look at the IBM fiasco with Lotus Notes 8.5.

I think that Perl is great in the way it integrates with Unix and promotes thinking of complex applications as virtual appliances. BTW, this approach also permits the use of a second language for those parts of the system for which Perl does not present clear advantages.

Also, Perl provides an important bridge to system administrators, who often know the language and can use a subset of it productively. That makes it preferable for large systems which depend on customization, such as monitoring systems.

The absence of a bytecode compiler hurts the development of commercial applications in Perl in more ways than one, but that's just a question of money. I wonder why ActiveState missed this opportunity to increase its revenue stream. I also agree that the quality of many CPAN modules can be improved, but abuse of CPAN along with a fixation on OO is a typical trait of overcomplexity junkies, so this has some positive aspect too :-).

I don't think that OO is a problem for Perl, if you use it where it belongs: in GUI interfaces. In many cases OO is used when hierarchical namespaces are sufficient. Perl provides a clean implementation of the concept of namespaces. The problem is that many people are trained in the Java/C++ style of OO, and as we know, to a hammer everything looks like a nail. ;-)

Allan Bowhill:

I think the original question Gabor posed implies there is a problem 'selling' Perl to companies for large projects. Maybe it's a question of narrowing its role.

It seems to me that if you want an angle to sell Perl on, it would make sense to cast it (in a marketing sense) into a narrower role that doesn't pretend to be everything to everyone. Because, despite what some hard-core Perl programmers might say, the language is somewhat dated. It hasn't really changed all that much since the 1990s.

Perl isn't particularly specialized so it has been used historically for almost every kind of application imaginable. Since it was (for a long time in the dot-com era) a mainstay of IT development (remember the 'duct tape' of the internet?) it gained high status among people who were developing new systems in short time-frames. This may in fact be one of the problems in selling it to people nowadays.

The FreeBSD OS even included Perl as part of their main (full) distribution for some time and if I remember correctly, Perl scripts were included to manage the ports/packaging system for all the 3rd party software. It was taken out of the OS shortly after the bust and committee reorganization at FreeBSD, where it was moved into third-party software. The package-management scripts were re-written in C. Other package management utilities were effectively displaced by a Ruby package.

A lot of technologies have come along since the 90s which are more appealing platforms than Perl for web development, which is mainly what it's about now.

If you are going to build modern web sites these days, you'll more than likely use some framework that utilizes object-oriented languages. I suppose the Moose augmentation of Perl would have some appeal with that, but CPAN modules and addons like Moose are not REALLY the Perl language itself. So if we are talking about selling the Perl language alone to potential adopters, you have to be honest in discussing the merits of the language itself without all the extras.

Along those lines I could see Perl having special appeal being cast in a narrower role, as a kind of advanced systems batching language - more capable and effective than say, NT scripting/batch files or UNIX shell scripts, but less suitable than object-oriented languages, which pretty much own the market for web and console utilities development now.

But there is a substantial role for high-level batching languages, particularly in systems that build data for consumption by other systems. These are traditionally implemented in the highest-level batching language possible. Such systems build things like help files, structured (non-relational) databases (often used on high-volume commercial web services), and software. Not to mention automating many systems administration tasks.

There are not many features or advantages of Perl that are unique to it in the realm of scripting languages, as there were in the 90s. The simplicity of built-in Perl data structures and its regular expression capabilities are reflected almost identically in Ruby, and are at least accessible in other strongly-typed languages like Java and C#.

The fact that Perl is easy to learn, and holds consistent with the idea that "everything is a string" and there is no need to formalize things into an object-oriented model are a few of its selling points. If it is cast as an advanced batching language, there are almost no other languages that could compete with it in that role.

Dean Hamstead:

@Pascal: bytecode is nasty for the poor Sysadmin/Devop who has to run your code. She/he can never fix it when bugs arise. There is no advantage to bytecode over interpreted.

Which in fact leads me to a good point.

All the 'selling points' of Java have all failed to be of any real substance.

In truth, Java is popular because it is popular.

Lots of people don't like perl because its not popular any more. Similar to how lots of people hate Mac's but have no logical reason for doing so.

Douglas is almost certainly right, that Python is rapidly becoming the new fad language.

I'm not sure how perl OO is a 'hack'. When you bless a reference into an object it becomes an object... I can see that some people are confused by perl's honesty about what an object is. Other languages attempt to hide away how they have implemented objects in their compiler - who cares? Ultimately the objects are all converted into machine code and executed.

In general perl objects are more object oriented than java objects. They are certainly more polymorphic.

Perl objects can fully hide their internals if that's something you want to do. It's not even hard, and you don't need to use Moose. But does it afford any real benefit? Not really.

At the end of the day, if you want good software you need to hire good programmers; it has nothing to do with the language. Even though some languages try to force the code to be neat (Python) and try to force certain behaviours (Java?), you can write complete garbage in any of them, then curse that language for allowing the author to do so.

A syntactic argument is pointless. As is something oriented around OO. What benefits perl brings to a business are...

- massive centralised website of libraries (CPAN)
- MVC's
- DBI
- POE
- Other frameworks etc
- automated code review (perlcritic)
- automated code formatting and tidying (perltidy)
- document as you code (POD)
- natural test driven development (Test::More etc)
- platform independence
- perl environments on more platforms than java
- perl comes out of the box on every unix
- excellent canon of printed literature, from beginner to expert
- common language across Sysadmin/Devops and traditional developer roles (with source code always available to *fix* the problem quickly, rather than having to set up an ant environment and roll a new WAR file)
- rolled up perl applications (PAR files)
- Perl can use more than 3.6gig of ram (try that in java)

Brian Martin

Well said Dean.

Personally, I don't really care if a system is written in Perl or Python or some other high level language, I don't get religious about which high level language is used.

There are many high level languages, any one of them is vastly more productive & consequently less buggy than developing in a low level language like C or Java. Believe me, I have written more vanilla C code in my career than Perl or Python, by a factor of thousands, yet I still prefer Python or Perl as quite simply a more succinct expression of the intended algorithm.

If anyone wants to argue the meaning of "high level", well basically APL wins ok. In APL, to invert a matrix is a single operator. If you've never had to implement a matrix inversion from scratch, then you've never done serious programming. Meanwhile, Python or Perl are pretty convenient.

What I mean by a "high level language" is basically how many pages of code does it take to play a decent game of draughts (chequers), or chess ?

[Jun 10, 2010] Deep-protocol analysis of UNIX networks

Jun 08, 2010 | developerWorks
Parsing the raw data to understand the content

Another way to process the content from tcpdump is to save the raw network packet data to a file and then process the file to find and decode the information that you want.

There are a number of modules in different languages that provide functionality for reading and decoding the data captured by tcpdump and snoop. For example, within Perl, there are two modules: Net::SnoopLog (for snoop) and Net::TcpDumpLog (for tcpdump). These will read the raw data content. The basic interfaces for both of these modules are the same.

To start, you first need to create a binary record of the packets passing on the network by writing the data out to a file using either snoop or tcpdump. For this example, we'll use tcpdump and the Net::TcpDumpLog module: $ tcpdump -w packets.raw.
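Note that, depending on your tcpdump version, the default snapshot length may truncate packets, which would defeat payload matching such as the HTTP search in Listing 9; if in doubt, ask for full packets explicitly:

$ tcpdump -s 0 -w packets.raw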

Once you have amassed the network data, you can start to process its contents to find the information you want. The Net::TcpDumpLog module parses the raw network data saved by tcpdump. Because the data is in its raw binary format, parsing the information requires processing this binary data. For convenience, another suite of modules, NetPacket::*, provides decoding of the raw data.

For example, Listing 8 shows a simple script that prints out the IP address information for all of the packets.


Listing 8. Simple script that prints out the IP address info for all packets
use Net::TcpDumpLog;
use NetPacket::Ethernet;
use NetPacket::IP;

my $log = Net::TcpDumpLog->new();
$log->read("packets.raw");

foreach my $index ($log->indexes)
{
    my $packet = $log->data($index);

    my $ethernet = NetPacket::Ethernet->decode($packet);

    if ($ethernet->{type} == 0x0800)
    {
        my $ip = NetPacket::IP->decode($ethernet->{data});

        printf("  %s to %s protocol %s\n",
               $ip->{src_ip}, $ip->{dest_ip}, $ip->{proto});
    }
}
The first step is to extract each packet. The Net::TcpDumpLog module indexes each packet so that an individual packet can be read by its ID. The data() method then returns the raw data for the entire packet.

As with the output from snoop, we have to extract each of the blocks of data from the raw network packet information. So in this example, we first need to extract the Ethernet frame, including its data payload, from the raw network packet. The NetPacket::Ethernet module does this for us.

Since we are looking for IP packets, we can check the Ethernet frame type; IP packets carry the type 0x0800.

The NetPacket::IP module can then be used to extract the IP information from the data payload of the Ethernet packet. The module provides the source IP, destination IP and protocol information, among others, which we can then print.

Using this basic framework you can perform more complex lookups and decoding that do not rely on the automated solutions provided by tcpdump or snoop. For example, if you suspect that there is HTTP traffic going past on a non-standard port (i.e., not port 80), you could look for the string HTTP on ports other than 80 from the suspected host IP using the script in Listing 9.


Listing 9. Looking for the string HTTP on ports other than 80
use Net::TcpDumpLog;
use NetPacket::Ethernet;
use NetPacket::IP;
use NetPacket::TCP;

my $log = Net::TcpDumpLog->new();
$log->read("packets.raw");

foreach my $index ($log->indexes)
{
    my $packet = $log->data($index);

    my $ethernet = NetPacket::Ethernet->decode($packet);

    if ($ethernet->{type} == 0x0800)
    {
        my $ip = NetPacket::IP->decode($ethernet->{data});

        if ($ip->{src_ip} eq '192.168.0.2')
        {
            if ($ip->{proto} == 6)
            {
                my $tcp = NetPacket::TCP->decode($ip->{data});

                if (($tcp->{src_port} != 80) &&
                    ($tcp->{data} =~ m/HTTP/))
                {
                    print("Found HTTP traffic on non-port 80\n");
                    printf("%s (port: %d) to %s (port: %d)\n%s\n",
                           $ip->{src_ip},
                           $tcp->{src_port},
                           $ip->{dest_ip},
                           $tcp->{dest_port},
                           $tcp->{data});
                }
            }
        }
    }
}

Running the above script on a sample packet set returned the output shown in Listing 10.

Listing 10. Running the script on a sample packet set
$ perl http-non80.pl
Found HTTP traffic on non-port 80
192.168.0.2 (port: 39280) to 168.143.162.100 (port: 80)
GET /statuses/user_timeline.json HTTP/1.1
Found HTTP traffic on non-port 80
192.168.0.2 (port: 39282) to 168.143.162.100 (port: 80)
GET /statuses/friends_timeline.json HTTP/1

In this particular case we're seeing traffic from the host to an external website (Twitter).

Obviously, in this example, we are dumping out the raw data, but you could use the same basic structure to decode the data in any format, using any public or proprietary protocol structure. If you are using or developing a protocol with this method and know the protocol format, you could extract and monitor the data being transferred.
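As a sketch of what that might look like, suppose a hypothetical application protocol whose messages start with a 16-bit type and a 32-bit length, both big-endian (this layout is invented purely for illustration); inside the loop from Listing 9, the TCP payload could be unpacked directly:

my $tcp = NetPacket::TCP->decode($ip->{data});

# Hypothetical wire format: 2-byte type, 4-byte length, then the message body.
if (length($tcp->{data}) >= 6) {
    my ($type, $len) = unpack('n N', $tcp->{data});
    my $body = substr($tcp->{data}, 6, $len);
    printf("message type %d, length %d bytes\n", $type, $len);
}

The same unpack()-based approach scales to any fixed or length-prefixed record layout you care to monitor.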

[Jun 10, 2010] ack -- better than grep, a power search tool for programmers

Latest version of ack: 1.92, December 11, 2009

ack is a tool like grep, designed for programmers with large trees of heterogeneous source code.

ack is written purely in Perl, and takes advantage of the power of Perl's regular expressions.
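One practical consequence is that Perl-only constructs such as lookarounds work directly in the pattern; for example (the pattern and directory are purely illustrative):

$ ack 'open\s*\(\s*(?!my\b)' lib/

finds open() calls whose first argument is not a lexical filehandle, something a plain POSIX grep pattern cannot express.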

How to install ack

It can be installed in any number of ways, for example from CPAN or simply by downloading the standalone version and making it executable.

Ack in Project for Textmate users

Users of TextMate, the programmer's editor for the Mac, can use the Ack in Project plugin by Trevor Squires:

TextMate users know just how slow its "Find in Project" can be with large source trees. That's why you need "ack-in-project", a TextMate bundle that uses the super-speedy 'ack' tool to search your code FAST. It gives you beautiful, clickable results just as fast as "ack" can find them. Check it out at: http://github.com/protocool/ack-tmbundle/tree/master

Testimonials

"Whoa, this is *so* much better than grep it's not even funny." -- Jacob Kaplan-Moss, creator of Django.

"Thanks for creating ack and sharing it with the world. It makes my days just a little more pleasant. I'm glad to have it in my toolbox. That installation is as simple as downloading the standalone version and chmodding is a nice touch." -- Alan De Smet

"I came across ack today, and now grep is sleeping outside. It's very much like grep, except it assumes all the little things that you always wanted grep to remember, but that it never did. It actually left the light on for you, and put the toilet seat down." -- Samuel Huckins

"ack is the best tool I have added to my toolbox in the past year, hands down." -- Bill Mill on reddit

"I use it all the time and I can't imagine how I managed with only grep." -- Thomas Thurman

"This has been replacing a Rube Goldberg mess of find/grep/xargs that I've been using to search source files in a fairly large codebase." -- G. Wade Johnson

"You had me at --thpppt." -- John Gruber, Daring Fireball

"Grepping of SVN repositories was driving me crazy until I found ack. It fixes all of my grep annoyances and adds features I didn't even know I wanted." -- Paul Prescod

"I added ack standalone to our internal devtools project at work. People are all over it." -- Jason Gessner

"I just wanted to send you my praise for this wonderful little application. It's in my toolbox now and after one day of use has proven itself invaluable." -- Benjamin W. Smith

"ack has replaced grep for me for 90% of what I used it for. Obsoleted most of my 'grep is crippled' wrapper scripts, too." -- Randall Hansen

"ack's powerful search facilities are an invaluable tool for searching large repositories like Parrot. The ability to control the search domain by filetype--and to do so independent of platform--has made one-liners out of many complex queries previously done with custom scripts. Parrot developers are hooked on ack." -- Jerry Gay

"That thing is awesome. People see me using it and ask what the heck it is." -- Andrew Moore

Top 10 reasons to use ack instead of grep.

  1. It's blazingly fast because it only searches the stuff you want searched.
  2. ack is pure Perl, so it runs on Windows just fine.
  3. The standalone version uses no non-standard modules, so you can put it in your ~/bin without fear.
  4. Searches recursively through directories by default, while ignoring .svn, CVS and other VCS directories.
    • Which would you rather type?
      $ grep pattern $(find . -type f | grep -v '\.svn')
      $ ack pattern
  5. ack ignores most of the crap you don't want to search
    • VCS directories
    • blib, the Perl build directory
    • backup files like foo~ and #foo#
    • binary files, core dumps, etc
  6. Ignoring .svn directories means that ack is faster than grep for searching through trees.
  7. Lets you specify file types to search, as in --perl or --nohtml.
    • Which would you rather type?
      $ grep pattern $(find . -name '*.pl' -or -name '*.pm' -or -name '*.pod' | grep -v .svn)
      $ ack --perl pattern
    Note that ack's --perl also checks the shebang lines of files without suffixes, which the find command will not (a sketch of that kind of check appears after this list).
  8. File-filtering capabilities usable without searching with ack -f. This lets you create lists of files of a given type.
    $ ack -f --perl > all-perl-files
  9. Color highlighting of search results.
  10. Uses real Perl regular expressions, not a GNU subset.
  11. Allows you to specify output using Perl's special variables
    • Example: ack '(Mr|Mr?s)\. (Smith|Jones)' --output='$&'
  12. Many command-line switches are the same as in GNU grep:
    -w does word-only searching
    -c shows counts per file of matches
    -l gives the filename instead of matching lines
    etc.
  13. Command name is 25% fewer characters to type! Save days of free-time! Heck, it's 50% shorter compared to grep -r.
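To illustrate the shebang check mentioned in reason 7 above, this is roughly the kind of test involved, written as a simplified sketch rather than ack's actual implementation:

#!/usr/bin/perl
use strict;
use warnings;

sub looks_like_perl {
    my ($file) = @_;
    return 1 if $file =~ /\.(?:pl|pm|pod|t)$/;            # recognized Perl suffixes
    open my $fh, '<', $file or return 0;
    my $first = <$fh>;
    close $fh;
    return defined($first) && $first =~ /^#!.*\bperl\b/;  # shebang line mentions perl
}

print "$_\n" for grep { looks_like_perl($_) } @ARGV;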

ack's command flags

$ ack --help
Usage: ack [OPTION]... PATTERN [FILE]

Search for PATTERN in each source file in the tree from cwd on down.
If [FILES] is specified, then only those files/directories are checked.
ack may also search STDIN, but only if no FILE are specified, or if
one of FILES is "-".

Default switches may be specified in ACK_OPTIONS environment variable or
an .ackrc file. If you want no dependency on the environment, turn it
off with --noenv.

Example: ack -i select

Searching:
  -i, --ignore-case     Ignore case distinctions in PATTERN
  --[no]smart-case      Ignore case distinctions in PATTERN,
                        only if PATTERN contains no upper case
                        Ignored if -i is specified
  -v, --invert-match    Invert match: select non-matching lines
  -w, --word-regexp     Force PATTERN to match only whole words
  -Q, --literal         Quote all metacharacters; PATTERN is literal

Search output:
  --line=NUM            Only print line(s) NUM of each file
  -l, --files-with-matches
                        Only print filenames containing matches
  -L, --files-without-matches
                        Only print filenames with no matches
  -o                    Show only the part of a line matching PATTERN
                        (turns off text highlighting)
  --passthru            Print all lines, whether matching or not
  --output=expr         Output the evaluation of expr for each line
                        (turns off text highlighting)
  --match PATTERN       Specify PATTERN explicitly.
  -m, --max-count=NUM   Stop searching in each file after NUM matches
  -1                    Stop searching after one match of any kind
  -H, --with-filename   Print the filename for each match
  -h, --no-filename     Suppress the prefixing filename on output
  -c, --count           Show number of lines matching per file
  --column              Show the column number of the first match

  -A NUM, --after-context=NUM
                        Print NUM lines of trailing context after matching
                        lines.
  -B NUM, --before-context=NUM
                        Print NUM lines of leading context before matching
                        lines.
  -C [NUM], --context[=NUM]
                        Print NUM lines (default 2) of output context.

  --print0              Print null byte as separator between filenames,
                        only works with -f, -g, -l, -L or -c.

File presentation:
  --pager=COMMAND       Pipes all ack output through COMMAND.  For example,
                        --pager="less -R".  Ignored if output is redirected.
  --nopager             Do not send output through a pager.  Cancels any
                        setting in ~/.ackrc, ACK_PAGER or ACK_PAGER_COLOR.
  --[no]heading         Print a filename heading above each file's results.
                        (default: on when used interactively)
  --[no]break           Print a break between results from different files.
                        (default: on when used interactively)
  --group               Same as --heading --break
  --nogroup             Same as --noheading --nobreak
  --[no]color           Highlight the matching text (default: on unless
                        output is redirected, or on Windows)
  --[no]colour          Same as --[no]color
  --color-filename=COLOR
  --color-match=COLOR   Set the color for matches and filenames.
  --flush               Flush output immediately, even when ack is used
                        non-interactively (when output goes to a pipe or
                        file).

File finding:
  -f                    Only print the files found, without searching.
                        The PATTERN must not be specified.
  -g REGEX              Same as -f, but only print files matching REGEX.
  --sort-files          Sort the found files lexically.

File inclusion/exclusion:
  -a, --all-types       All file types searched;
                        Ignores CVS, .svn and other ignored directories
  -u, --unrestricted    All files and directories searched
  --[no]ignore-dir=name Add/Remove directory from the list of ignored dirs
  -r, -R, --recurse     Recurse into subdirectories (ack's default behavior)
  -n, --no-recurse      No descending into subdirectories
  -G REGEX              Only search files that match REGEX

  --perl                Include only Perl files.
  --type=perl           Include only Perl files.
  --noperl              Exclude Perl files.
  --type=noperl         Exclude Perl files.
                        See "ack --help type" for supported filetypes.

  --type-set TYPE=.EXTENSION[,.EXT2[,...]]
                        Files with the given EXTENSION(s) are recognized as
                        being of type TYPE. This replaces an existing
                        definition for type TYPE.
  --type-add TYPE=.EXTENSION[,.EXT2[,...]]
                        Files with the given EXTENSION(s) are recognized as
                        being of (the existing) type TYPE

  --[no]follow          Follow symlinks.  Default is off.

  Directories ignored by default:
    autom4te.cache, blib, _build, .bzr, .cdv, cover_db, CVS, _darcs, ~.dep,
    ~.dot, .git, .hg, ~.nib, nytprof, .pc, ~.plst, RCS, SCCS, _sgbak and
    .svn

  Files not checked for type:
    /~$/           - Unix backup files
    /#.+#$/        - Emacs swap files
    /[._].*\.swp$/ - Vi(m) swap files
    /core\.\d+$/   - core dumps

Miscellaneous:
  --noenv               Ignore environment variables and ~/.ackrc
  --help                This help
  --man                 Man page
  --version             Display version & copyright
  --thpppt              Bill the Cat

Exit status is 0 if match, 1 if no match.

This is version 1.92 of ack.

[Apr 25, 2010] The Perl Review Archives

Download Volume 0 Issue 0 (February 2002) (394k PDF)
Download Volume 0 Issue 1 (March 2002) (429k PDF)
Download Volume 0 Issue 2 (April 2002) (485k PDF)
Download Volume 0 Issue 3 (May 2002) (433k PDF)
Download Volume 0 Issue 4 (July 2002) (332k PDF)
Download Volume 0 Issue 5 (September 2002) (363k PDF)
Download Volume 0 Issue 6 (November 2002) (352k PDF)