Softpanorama

May the source be with you, but remember the KISS principle ;-)
Contents Bulletin Scripting in shell and Perl Network troubleshooting History Humor


Perl for Unix System Administrators

News

Scripting Languages

eBook: Perl for system admins

Recommended Perl Books Recommended Links Recommended Papers Perl Language Perl Reference
Perl as a command line tool Perl Regular Expressions Overview of Perl regular expressions More Complex Perl Regular Expressions Perl namespaces Perl modules Subroutines and Functions Pipes
Perl Debugging Perl applications Perl One Liners HTML Matching Examples Perl IDE Perl Certification Notes on Perl Hype Perl for Win32
Perl Style Beautifiers and Pretty Printers Perl Xref Perl lint Perl power tools Perl IDE and Programming Environment Perl Error Checklist Perl Warts
Perl POD documentation Larry Wall Quotes Larry Wall Articles and Interviews Tips Perl history and evolution Humor Etc

Introduction

This is a very limited effort to help Unix sysadmins to learn of Perl. It is based on my FDU lectures to CS students. We discuss mainly "simple Perl" and the site tries to counter "excessive complexity" drive that dominates many Perl-related sites and publications.  See also my ebook Introduction to Perl for Unix system administrators (It probably can be also used for preparing to the Certified Internet Web Professional Exam). Perl continue to evolve. Thankfully it is evolves slowly but during the last decade we got state variables (5.10) and a couple of other useful features along with several other which many would consider redundant or even harmful.

Syntax of Perl is pretty regular and is favorably compared with the disaster which is syntax of Borne shell and derivatives as well as with syntax of C and C-derivatives. Larry Wall managed to avoid most classic pitfalls in creating of the syntax of the language, pitfalls in which creators on PHP readily fell ("dangling else" is one example).

Systems administrators need to deal with many repetitive tasks in a very complex, and changing environment which often includes several different flavors of Linux (RHEL and Suse) and Unix (Solaris, HP-UX and AIX). Linux has Perl installed by  default.  It is also included in all major Unix flavors but version installed might be outdated and need upgrading. Oracle installs Perl too, so it is automatically present on all servers with Oracle.  Some other application install Perl too.  That means that it provides the simplest way to automate recurring tasks on multiple platforms. Among typical tasks which sysadmin need to deal with:

You definitely can greatly simplify your life as well as improve "manageability" of the servers (or group of servers) with additional Perl scripts. As long  as you still apply KISS principle, because at some point the task of maintaining the scripts became evident and unless scripts are simple the game might not worth the candles. 

As most sysadmins already know shell, the affinity with shell is one of the major advantages of using Perl as the second scripting language for Unix sysadmin. no other language come close in this respect: Perl allow to reuse most of the key concepts of the shell.

IMHO the main advantage of using powerful complex language like Perl is the ability to write simple programs. Perhaps the world has gone overboard on this object-oriented thing. You do not need many tricks used in lower level languages as Perl itself provides you high level primitives for the task. This page is linked to several sub-pages. The most important among them are:

All language have quirks, and all inflict a lot of pain before one can adapt to them. Once learned the quirks become incorporated into your understanding of the language. But there is no royal way to mastering the language. The more different is one's background is, more one needs to suffer. Generally any user of a new programming language needs to suffer a lot ;-)

When mastering a new language first you face a level of "cognitive overload" until the quirks of the language become easily handled by your unconscious mind. At that point, all of the sudden the quirky interaction becomes a "standard" way of performing the task. For example, regular expression syntax seems to be a weird base for serious programs, fraught with pitfalls, a big semantic mess as a result of outgrowing its primary purpose. On the other hand, in skilled hands this is a very powerful tool that can serve a reliable parser for complex data and in certain cases as a substitute for string functions such as index and substr.

There are three notable steps in adaptation to Perl idiosyncrasies for programmers who got used to other languages:

  1. One early sign is when you start to put $ on all scalar variables automatically.  That's easy for those people who use write their own shell sctips and generally is not a problem for sysadmins. Most mistakes when you omit $ in front of the variable are diagnosed by interpreter but some like $location{city} are not. the problems arise if  along with Unix shell you use the third language, for example C.  In this case you make mistakes, despite you experince, and you need conscious effort all the time. This is the case with me.
  2. The next step is to overcome notational difficulties of using two different comparison operations, one for strings and the other for numerical values (like "==" for numbers vs. eq for strings) for comparison numbers and strings. In case two variables are involved, the interpreter does not provide any warnings, so you need to be very careful as if you use other language in parallel with Perl such errors crop into your scripts automatically. If one of the operators of "==" is a string constant automatic diagnostic can be provided.
  3. Conscious use of "==" as equality predicate for numbers, if you previous language allowed to used "=" for assignments you are in trouble. The pitfall of using "=" for assignment,  results in the side effect of introducing errors in comparisons in Perl, as you put "=" instead of "--".  For example,    if ($a=1)...  instead of if  ($a==1)... This problem was understood by designers on Algol 60, which was designed is late 50th. But Perl designers like lemmings followed C designers and naturally stepped on this rake again. With predictable result. To avoid it they used := for assignment instead of plain =.  Actually the designers of Fortran, PL/1, C (as derivative of PL/1), C++ and Java ignored this lesson (Fortran designers are actually not guily as it predates Algol 60). But  because C  (with its derivatives such as C++ and Java) became dominant programming language we have, what we have: propagation of this blunder to many top programming languages. Now think a little, about the notion of progress in programming language design ;-)  It's sad that the design blunder about which designers knew 65 years ago still is present in the most popular languages used today ;-). In all languages that have lexical structure of C, this blunder remains one of the most rich source of subtle errors for novices. Naturally this list includes Perl. C programmers typically are already trained to be aware about  this language pitfall. But in Perl you too can use several protective measures
    1. Modify syntax highlighting in your editor  so that such cases were marked in bold red.
    2. Manually or automatically (simple regex done in the editor can detected ~ 99% of cases) reorganize such comparisons putting the constant on the left part of comparison, like in  if (1==$a)....
    3. Recent versions of Perl interpreter provide warning in this case, so checking your script with option -cw  or better using use warnings  pregma in all your scripts. It also helps if IDE provides capability to display of results of checking of syntax in one of the windows and jump to the line in code listed in the error or warning (this is a standard feature of all IDEs and actually this can be done in most editors too). 
  4. Learning to find missing closing "}". This problem is typical for all C-style languages and generally requires pretty printer to spot.  But Perl interpreter has a blunder -- it does not recognize the fact that in Perl subroutines can't be nested within blocks and does not point to the first subroutine as the diagnostic point -- it points to the end of the script.  You the best way is to extract suspicious section into a new file and check it separately, cutting not relevant parts, until you detect the problem. The longer the program is the more acute this problem becomes. BTW this problem was solved in PL/1 which was create in early 60th: PL/1 has labels for closing statements as in "mainloop: do... end mainloop" which close all intermediate constructs automatically Both C and Perl failed to adopt this innovation.  Neither Perl not C also use the concept of Pascal "local numeric labels"  -- labels that exist only until they are redefined, see discussion at Knuth.
  5. Leaning to find missing " (double quote) or ' (single quote).  With good editor this is not a problem as syntax highlighting points you where the problems begin.

Please note that as syntax of Perl is complex. So the diagnostic in Perl interpreter is really bad and often point to the spot far below where the error occurred.  It is nowhere near the quality of diagnostics that mainframe programmers got in IBM PL/1 diagnostic complier, which is also probably 50 years old and run on tiny by today standard machines with 256K (kilobytes, not megabytes)  of RAM and 7M (megabytes, not gigabytes, or terabytes) harddrives.  The only comfort  is that other scripting languages are even worse then Perl ;-).

Benefits that Perl brings to system administration 

All-in-all Perl is the language that fits most sysadmin needs, It' not fancy and its general usage is in decline since 2000 but fashion should never be primary factor in choosing the scripting language. Perl has stable and well tested  interpreter and is close to shell (to the extent  that most concepts of shell can be directly reused). And that's what important. As on modern servers Perl interpreter loads in a fraction of a second, Perl also allows to get rid of most usage of AWK and SED, making you environment more uniform and less complex. This is important advantage.  Among benefits that Perl bring to system administration are

In short if make sense to learn Perl as it makes sysadmin like a lot easier. Probably more so then any other tool in sysadmin arsenal...

Perl is really great for text processing and in this particular domain is probably unmatched. For example in Python regular expressions are implemented via standard library module; they are not even a part of the language.

A warning about relative popularity

 As of 2017 Perl no longer belongs to the top 10 programming languages (Scripting languages slip in popularity, except JavaScript and Python, Infoworld, Nov 13, 2017).  It's still more popular then Visual Basic, so there nothing to worry about.  But far less then popular then Python.  Of cause popularity is not everything. Python and Perl share some characteristics, but don't exactly occupy the same niches. But it is a lot: fashion rules the programming, so this is a factor that you need consciously evaluate and be aware of.

In large enterprise environment, outside system administration area Perl now is almost invisible. Python is gaining ground in research. Mostly because universities both in the USA and Europe now teach Python in introductory classes and engineers come "knowing some Python". This looks like "Java success story" of late 1990th on new level. Like Perl, Python is also now installed on all Linux distributions by default and there are several important linux system programs written in Python (yum, Anaconda, etc) which implicitly suggest that Python has Red Hat mark of adoption/approval too (yum was originally written at Duke University Department of Physics)

So there is now a pressure to adopt Python. That's sad, because IMHO Perl is a great scripting language which can be used on many different levels, starting from AWK/SED replacement tool. Going from Perl to Python for text processing to me feels like leaving a Corvette and driving a station wagon. Python will gets you there. But it's not fun and will take more time althouht you probably might feel more comfortable inside.

To say nothing about the list of IDE available. Perl does not even ship with the "Standard IDE" although Padre, which is somewhat competitive with Komodo is available for free, but the latest binary distribution suitable for beginners is from 2012. It's still highly usable. Komodo edit can serve as a surrogate IDE too.  Pycharm -- the most popular Python IDE can work with Perl and works well.  My feeling is that for Perl to remain competitive IDE should be maintained and shipped along with Perl interpreter (like in Python and R distributions).  May be at the expense of some esoteric modules included in standard  library.  Also number of books per year devoted to Python and available via Amazon for 2017 is at least one order of magnitude larger then the number of books devoted to Perl (quality issues aside). All this creates a real pressure to use Python everywhere, even in system administration.

Here is an insightful post on this topic (Which is better, Perl or Python Which one is more robust How do they compare with each other):

Joe Pepersack, Just Another Perl Hacker Answered May 30 2015

Perl is better. Perl has almost no constraints.  It's philosophy is that there is more than one way to do it (TIMTOWTDI, pronounced Tim Toady). Python artificially restricts what you can do as a programmer.  It's philosophy is that there should be one way to do it.   If you don't agree with Guido's way of doing it, you're sh*t out of luck.

Basically, Python is Perl with training wheels.   Training wheels are a great thing for a beginner, but eventually you should outgrow them.  Yes, riding without training wheels is less safe.   You can wreck and make a bloody mess of yourself.   But you can also do things that you can't do if you have training wheels.   You can go faster and do interesting and useful tricks that aren't possible otherwise. Perl gives you great power, but with great power comes great responsibility.

A big thing that Pythonistas tout as their superiority is that Python forces you to write clean code.   That's true, it does... at the point of a gun, sometimes at the detriment of simplicity or brevity.   Perl merely gives you the tools to write clean code (perltidy, perlcritic, use strict, /x option for commenting regexes) and gently encourages you to use them.

Perl gives you more than enough rope to hang yourself (and not just rope, Perl gives you bungee cords, wire, chain, string, and just about any other thing you can possibly hang yourself with).   This can be a problem.   Python was a reaction to this, and their idea of "solving" the problem was to only give you one piece of rope and make it so short you can't possibly hurt yourself with it.    If you want to tie a bungee cord around your waist and jump off a bridge, Python says "no way, bungee cords aren't allowed".  Perl says "Here you go, hope you know what you are doing... and by the way here are some things that you can optionally use if you want to be safer"

Some clear advantage of Perl:

Advantages of Python

The most common versions of Perl  5 in production

RHEL 6.x now ships with Perl 5.10. Many classic Unixes still ship with Perl 5.8.8. Older versions of  Solaris and HP-US servers might have version below Perl 5.8.8 but in 2017 that's rare as most of such servers are decommissioned (typical lifespan of a server in corporate environment is 5-7 years). 

It you need compatibility with all major flavor of Unix in you scripts it is a safe bet to write for Perl 5.8.8. Such a decision virtually guarantee compatibility with all enterprise servers, except those that should be discarded 5 or 10 years ago. In other words no "state" variables, if you want "perfect" compatibility. Non perfect, but acceptable.

If you need only Linux deployment compatibility that can be achieved by using version 5.10 which allow you to use "state" variables.

If you need compatibility with linux servers only version 5.10 look like a more or less safe bet too (very few enterprise servers in 2017 are now below version RHEL 6; those typically have Perl 5.8). 

Also too high version of Perl also is always desirable --  see note about Perl 5.22. (the current version of Perl 5 is version 26).  Hopefully warts added to version 22  will be corrected based on the feedback. Here is a slide from Oct 3, 2016 by Tom Radcliffe The Perl Paradox

The problem of Perl complexity junkies

There is a type of Perl books authors that enjoy the fact that Perl is complex non-orthogonal language and like to drive this notion to the extreme. I would call them complexity junkies. Be skeptical and do not take recommendations of Perl advocates like Randal L. Schwartz  or Tom Christiansen for granted :-) Fancy idioms are very bad for novices. Please remember about KISS principle and try to write simple Perl scripts without complex regular expressions and/or fancy idioms. Some Perl gurus pathological preoccupation with idioms is definitely not healthy and is part of the problem, not a part of the solution...

We can defines three main types of Perl complexity junkies:

My issues with Perl is when people get Overly Obfuscated with their code, because the person thinks that less characters and a few pointers makes the code faster.

Please remember about KISS principle and try to write simple Perl scripts without overly complex regular expressions or fancy idioms.  If you do this Perl is great language, unmatched for sysadmin domain. Simplicity has great merits even if goes again current fancy.

Generally the problems with OO mentioned above are more fundamental than the trivial "abstraction is the enemy of convenience". It is more like that badly chosen notational abstraction at one level can lead to an inhibition of innovative notational abstraction on others. In general OO is similar to idea of "compiler-copiler" when you create a new language in such a way that it allow to compile new constructs with the existing complier.  While in some cases useful or even indispensible, there is always a price to pay for such  fancy staff.

Tips

Missing semicolon problem in Perl

Some deficiencies of Perl syntax were directly inhered from C. One of the most notable "obligatory semi-colon after ach statement. Which lead to tremendous amount of errors. "soft semicolon" (implied semicolon on line end if round brackets or similar symmetrical symbols are balanced) is a better approach, but for some reason it was never implemented (semicolon is an option only before closing curvy bracket in Perl interpreter. ). See also discussion below.

Avoiding mistyping "=" instead of "==" blunders

One of most famous C design blunder was the introduction of a small lexical difference between assignment and comparison (remember that Algol used := for assignment; PL/1 uses = for both) caused by the design decision to make the language more compact (terminals at this type were not very reliable and number of symbols typed matter greatly. In C assignment is allowed in if statement but no attempts were made to make language more failsafe by avoiding possibility of mixing up "=" and "==". In C syntax if ($a = $b) assigns the contents of $b to a and executes the code following if b not equal to 0. It is easy to mix thing and write if ($a = $b ) instead of (if ($a == $b) which is a pretty nasty bug. You can often reverse the sequence and put constant first like in

if ( 1==$i ) ...
as
if ( 1=$i ) ...
does not make any sense, such a blunder will be detected on syntax level.

Locating unbalanced "}" errors

Typically this is connected with your recent changes so you should know where to look. If not use diff with tha last version that complied OK (I hope you use come king of CMS like subversion or git; if you that's time to switch)

The optimal way to spot this problem is to use pretty printer. In the absence of pretty printer you can insert '}' in binary search fashion until you find the spot where it is missing.

You can also extract part of your script and analyze it separately, deleting "balanced" parts one by one.

This error actually discourages writing long Perl scripts so the is a silver lining in each dark cloud.

You can also use pseudo comments that signify nesting level zero and check those points with special program or by writing an editor macro. One also can mark closing brackets with the name of construct it is closing

if (... ) { 

} # if 

Problem of unclosed quote at the end of the line string literal ("...")

Use a good editor. split long literals into one line literals and concatenate them with dot opaerator.

As a historical note specifying max length of literals is an effecting way of catching missing quote that was implemented in PL/1 compilers. You can also have an option to limit literal to a single line. In general multi-line literals should have different lexical markers (like "here" construct in shell). Perl provides the opportunity to use concatenation operator for splitting literals into multiple line, which are "merged" at compile time, so there is no performance penalty for this constructs. But there is no limit on the number of lines string literal can occupy so this does not help much. If such limit can be communicated via pragma statement at compile type in a particular fragment of text this is an effective way to avoid the problem. Usually only few places in program use multiline literals, if any. Editors that use coloring help to detect unclosed literal problem but there are cases when they are useless.

How to avoid using wrong comparison operator comparing two variable

If you are comparing a variable and a constant Perl interpret can help you to detect this error. but if you are comparing two variable you are on your own. And I often use wrong comparison operator just out of inertia or after usage of C. the most typical for ma error is to use == for stings comparison.

One way is to comment sting comparisons and then match comments with the comparison operator used using simple macro in editor (you should use programmable editor, and vim is programmable)

Usage of different set of comparison operator for number and string comparison is probably the blunder in Perl design (which Python actually avoided) and was inherited from shell. Programmer that use other languages along with Perl are in huge disadvantage her as other language experience force them to make the same errors again and again. Even shell solution (using different enclosing brackets); it might well be that in Perl usage of ( ) for arithmetic comparison and ((...)) for string would be a better deal. They still can be used as a part of defensive programming so that you can spot inconsistencies easier

Perl as a new programming paradigm

Perl + C and, especially Perl+Unix+shell represent a new programming paradigm in which the OS became a part of your programming toolkit and which is much more productive for large class of programs that OO-style development (OO-cult ;-). It became especially convenient in virtual machine environment when application typically "owns" the machine. In this case the level of integration of the language and operating system became of paramount importance and Perl excels in this respect. You can use shell for file manipulation and pipelines, Perl for high-level data structure manipulation and C when Perl is insufficient or too slow. The latter question for complex programs is non-trivial and correct detection of bottlenecks needs careful measurements; generally Perl is fast enough for most system programs.

The key idea here is that any sufficiently flexible and programmable environment - and Perl is such an environment -- gradually begins to take on characteristics of both language and operating system as it grows. See Stevey's Blog Rants Get Famous By Not Programming for more about this effect.

Any sufficiently flexible and programmable environment - and Perl is such an environment -- gradually begins to take on characteristics of both language and operating system as it grows.

Unix shell can actually provide a good "in the large" framework of complex programming system serving as a glue for the components.

From the point of view of typical application-level programming Perl is very under appreciated and very little understood language. Almost nobody is interested in details of interpreter, where debugger is integrated with the language really brilliantly. Also namespaces in Perl and OO constructs are very unorthodox and very interesting design.

References are major Perl innovation

References are Perl innovation: classic CS view is that scripting language should not contain references (OO languages operate with references but only implicitly). Role of list construct as implicit subroutine argument list is also implemented non trivially (elements are "by reference" not "by name") and against CS orthodoxy (which favors default "by name" passing of arguments). There are many other unique things about design of Perl. All-in-all for a professional like me, who used to write compilers, Perl is one of the few relatively "new" languages that is not boring :-).

Perl has a great debugger

Debugger for the language is as important as the language itself. Perl debugger is simply great. See Debugging Perl Scripts

Brilliance of Perl Artistic license

Perl license is a real brilliance. Incredible from my point of view feat taking into account when it was done. It provided peaceful co-existence with GPL which is no small feat ;-). Dual licensing was a neat, extremely elegant cultural hack to make Perl acceptable both to businesses and the FSF.

It's very sad that there no really good into for Perl written from the point of view of CS professional despite 100 or more books published.

Perl warts


A small, crocky feature that sticks out of an otherwise clean design. Something conspicuous for localized ugliness, especially a special-case exception to a general rule. ...

Jargon File's definition of the term "wart"

Language design warts

Perl extended C-style syntax in innovative way. For example if statement always uses {} block, never an individual statement, also ; before } is optional. But it shares several C-style syntax shortcomings and introduced a couple of its own:

For a language aficionado Larry Wall make way too many blunders in the design of Perl. Which is understandable (he has no computer science background and was hacker in heart), but sad.

There are also several semantically problems with the language:

Absence of good development environment

R-language has RStudio which probably can be viewed as gold standard of minimal features needed for scripting language GUI. While RStudio has a weak editor it has syntax highlighting and integration with debugger and as such is adequate for medium scripts.

There is no similar "established" as standard de-facto GUI shipped with Perl interpreter and looks like nobody cares. That's a bad design decision although you can use Orthodox file manager (such as Midnight commander, or in Windows Far or Total Commander) as poor man IDE. Komodo Edit is more or less OK editor for Perl and is free although in no way it is full IDE.

This is not a show stopper for system administrators as they can use screen and multiple/different terminal sessions for running scripting and editing them. Also mcedit is handy and generally adequate for small scripts. To say nothing that each sysadmin know badic set of command for vi/vim, and many know it well.

But this is a problem when you try to write Perl scripts with over 1K lines which consist of multiple files. Many things in modern IDE helps to avoid typical errors (for example identifiers can be picked up from the many by right clicking, braces are easier to match if editor provide small almost invisible vertical rulers, color of the string help to detect running string constants, etc.

Currently Komodo and free Komodo editor are almost the only viable game in town.

See

for additional discussion.

Lost development priorities

For mature language the key area of development is not questionable enhancements, but improvement of interpreter diagnostics and efforts in preventing typical errors (which at this point are known).

Perl version 5.10 was the version when two very useful enhancement to the language were added:

Still very little was done to improve interpreter in order to help programmers to avoid most typical Perl errors. that means that the quality of the editor for Perl programmers is of paramount importance. I would recommend free Komodo editor. It allows you to see the list of already declared variables in the program and thus avoid classic "typo in the variable" type of errors.

Not all enhancements that Perl developers adopters after version 5.10 have practical value. Some, as requirement to use backslash in regular expressions number of iterations ( so that /\d{2}/ in "normal" Perl became /\d\{2}/ in version 5.22), are counterproductive. For that reason I do not recommend using version 5.22. You can also use pragma

use v5.12.0

to avoid stupid warnings version 5.20 generates.

There is no attempts to standardize Perl and do enhancements via orderly, negotiated by major stakeholders process. Like is done with C or Fortran (each 11 years; which is a very reasonable period which allow current fads to die ;-). At the same time quality of diagnostics of typical errors by Perl interpreter remains weak (it imporved with the introduction of strict though).

Support for a couple of useful pragma, for example, the ability to limit the length of string constants to a given length (for example 120) for certain parts of the script is absent. Ot something similar like "do not cross the line" limitation.

Local labels might help to close multiple level of nesting (the problem of missing curvy bracket is typical in al C-style languages)

 1:if( $i==1 ){
     if( $k==0 ){
         if ($m==0 ){
   # the curvy bracket below closes all opened clock since the local label 1
 }:1 

Multiple entry points into subroutines might help to organize namespaces.

Working with namespaces can and should be improved and rules for Perl namespaces should be much better better documented. Like pointers namespaces provide powerful facity to structuring language programs. which can be used with or without modules framework. this is a very nice and very powerful Perl feature that makes Perl a class or its own for experienced programmers. Please note that modules are not the only game in town. Actually the way they were constructed has some issues and (sometime stupid) overemphasis on OO only exacerbate those issues. Multiple entry points in procedures would be probably more useful and more efficient addition to the language. Additional that is very easy to implement. The desire to be like the rest of the pack often backfire... From SE point of view scripting language as VHL stands above OO in pecking order ;-). OO is mainly force feed for low level guys who suffer from Java...

Actually there are certain features that should probably be eliminated from Perl 5. For example use of unquoted words as indexes to hashes is definitely a language designers blunder and should be gone. String functions and array functions should be better unified. Exception mechanism should be introduced. Assignment in if statements should be somehow restricted. Assignment of constants to variables in if statement (and all conditions) should be flagged as a clear error (as in if ($a=5) ... ). I think latest version of Perl interpreter do this already.

Problems with Perl 5. 22

Attention: The release contains an obvious newly introduced wart in regex tokenizer, which now requires backslash for number of repetitions part of basic regex symbols. For example in case of /\d{2}/ which you now need to write /\d\{2}/ -- pretty illogical as a curvy brace here a part of \d construct, not a separate symbol (which of course should be escaped);

Looks to me like a typical SNAFU. But the problem is wider and not limited to Perl. There is generally tendency for a gradual loss of architectural integrity after the initial author is gone and there is no strong "language standard committee" which drive the language development (like in Fortran, which issues an undated version of the standard of the language each 11 years).

So some languages like Python this is still in the future, but for many older languages is is already reality and a real danger. Mechanism for preventing this are not well understood. The same situation happens with OS like Linux (systemd).

This newly introduced bug (aka feature) also affects regexes that use opening curvy bracket as a delimiter. Which is a minor but pretty annoying "change we can believe in" ;-). I think that idiosyncrasy will prevent spread this version into production version of Linux Unix for a long, long time (say 10 years) or forever. Image the task of modification of somebody else 30-50K lines Perl scripts for those warnings that heavily uses curvy braces in regex or use \d{1,3} constructs for parsing IP addresses.

This looks more and more like an artificially created year 2000 problem for Perl.

Dr. Nikolai Bezroukov


Top Visited
Switchboard
Latest
Past week
Past month

NEWS CONTENTS

Old News ;-)

[Dec 08, 2017] Perl Debugger Tutorial 10 Easy Steps to Debug Perl Program

Dec 08, 2017 | www.thegeekstuff.com

Perl Debugger Tutorial: 10 Easy Steps to Debug Perl Program by Balakrishnan Mariyappan on May 19, 2010

https://apis.google.com/se/0/_/+1/fastbutton?usegapi=1&size=medium&origin=http%3A%2F%2Fwww.thegeekstuff.com&url=http%3A%2F%2Fwww.thegeekstuff.com%2F2010%2F05%2Fperl-debugger%2F&gsrc=3p&jsh=m%3B%2F_%2Fscs%2Fapps-static%2F_%2Fjs%2Fk%3Doz.gapi.en_US.7iE0RPXkeyg.O%2Fm%3D__features__%2Fam%3DAQ%2Frt%3Dj%2Fd%3D1%2Frs%3DAGLTcCPtrDcrcZ6TwfUke349lDWwAOzBUw#_methods=onPlusOne%2C_ready%2C_close%2C_open%2C_resizeMe%2C_renderstart%2Concircled%2Cdrefresh%2Cerefresh&id=I0_1512705132381&_gfid=I0_1512705132381&parent=http%3A%2F%2Fwww.thegeekstuff.com&pfname=&rpctoken=25025448

http://www.facebook.com/plugins/like.php?href=http%3A%2F%2Fwww.thegeekstuff.com%2F2010%2F05%2Fperl-debugger%2F&send=false&layout=button_count&width=450&show_faces=false&action=like&colorscheme=light&font&height=21

http://platform.twitter.com/widgets/tweet_button.6b8337773e8a8ecc4f0b054fec8f1482.en.html#dnt=false&id=twitter-widget-0&lang=en&original_referer=http%3A%2F%2Fwww.thegeekstuff.com%2F2010%2F05%2Fperl-debugger%2F&size=m&text=Perl%20Debugger%20Tutorial%3A%2010%20Easy%20Steps%20to%20Debug%20Perl%20Program&time=1512705132548&type=share&url=http%3A%2F%2Fwww.thegeekstuff.com%2F2010%2F05%2Fperl-debugger%2F

Earlier we discussed the basics of how to write and execute a perl program using Perl Hello World Example .

In this article, Let us review how to debug a perl program / script using Perl debugger , which is similar to the gdb tool for debugging C code .

To debug a perl program, invoke the perl debugger using "perl -d" as shown below.

# perl -d  ./perl_debugger.pl

To understand the perl debugger commands in detail, let us create the following sample perl program (perl_debugger.pl).

$ cat perl_debugger.pl
#!/usr/bin/perl -w

# Script to list out the filenames (in the pwd) that contains specific pattern.

#Enabling slurp mode
$/=undef;

# Function : get_pattern
# Description : to get the pattern to be matched in files.
sub get_pattern
{
my $pattern;
print "Enter search string: ";
chomp ($pattern = <> );
return $pattern;
}

# Function : find_files
# Description : to get list of filenames that contains the input pattern.
sub find_files
{
my $pattern = shift;
my (@files,@list,$file);

# using glob, obtaining the filenames,
@files = <./*>;

# taking out the filenames that contains pattern.
@list = grep {
$file = $_;
open $FH,"$file";
@lines = <$FH>;
$count = grep { /$pattern/ } @lines;
$file if($count);
} @files;
return @list;
}
# to obtain the pattern from STDIN
$pattern = get_pattern();

# to find-out the list of filenames which has the input pattern.
@list = find_files($pattern);

print join "\n",@list;
1. Enter Perl Debugger

# perl -d ./perl_debugger.pl

it prompts,
DB<1>

2. View specific lines or subroutine statements using (l)

DB<1> l 10
10: my $pattern;

DB<2> l get_pattern
11 {
12: my $pattern;
13: print "Enter search string: ";
14: chomp ($pattern = );
15: return $pattern;
16 }

3. Set the breakpoint on get_pattern function using (b)

DB<3> b find_files

4. Set the breakpoint on specific line using (b)

DB<4> b 44

5. View the breakpoints using (L)

DB<5> L
./perl_debugger.pl:
22: my $pattern = shift;
break if (1)
44: print join "\n",@list;
break if (1)

6. step by step execution using (s and n)

DB<5> s
main::(./perl_debugger.pl:39): $pattern = get_pattern();

DB<5> s
main::get_pattern(./perl_debugger.pl:12):
12: my $pattern;

Option s and n does step by step execution of each statements. Option s steps into the subroutine. Option n executes the subroutine in a single step (stepping over it).

The s option does stepping into the subroutine but while n option which would execute the subroutine(stepping over it).

7. Continue till next breakpoint (or line number, or subroutine) using (c)

DB<5> c
Enter search string: perl
main::find_files(./perl_debugger.pl:22):
22: my $pattern = shift;

8. Continue down to the specific line number using (c)

DB<5> c 36
main::find_files(./perl_debugger.pl:36):
36: return @list;

9. Print the value in the specific variable using (p)

DB<6> p $pattern
perl

DB<7> c
main::(./perl_debugger.pl:44): print join "\n",@list;
DB<7> c
./perl_debugger.pl
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.

After the last continue operation, the output gets printed on the stdout as "./perl_debugger.pl" since it matches the pattern "perl".

10. Get debug commands from the file (source)

Perl debugger can get the debug command from the file and execute it. For example, create the file called "debug_cmds" with the perl debug commands as,

c
p $pattern
q

Note that R is used to restart the operation(no need quit and start debugger again).
DB<7> R
DB<7> source debug_cmds
>> c
Enter search string: perl
./perl_debugger.pl
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.
>> p $pattern
perl
>> q

Note : If you are relatively new to perl, refer to our previous article: 20 perl programming tips for beginners .

Summary of perl debugger commands

Following options can be used once you enter the perl debugger.

[Dec 03, 2017] Debugging Regular Expressions

Dec 03, 2017 | my.safaribooksonline.com

Not for the fainthearted, if you want to see how a regular expression runs when used in a match or substitution, use the core re pragma with its debug option:

% perl -Mstrict -Mwarnings
use re qw(debug);
$_ = "cats=purr, dog=bark";
my %sound = /(\w+)=(\w+)/g;
^D
Compiling REx `(\w+)=(\w+)'
size 15 first at 4
1: OPEN1(3)
3: PLUS(5)
4: ALNUM(0)
5: CLOSE1(7)
7: EXACT <=>(9)
9: OPEN2(11)
11: PLUS(13)
12: ALNUM(0)
13: CLOSE2(15)
15: END(0)
floating `=' at 1..2147483647 (checking floating) stclass `ALNUM' plus
minlen 3
Guessing start of match, REx `(\w+)=(\w+)' against `cats=purr,
dog=bark'...
Found floating substr `=' at offset 4...
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx `(\w+)=(\w+)' against `cats=purr, dog=bark'
Setting an EVAL scope, savestack=3
0 <> <cats=purr, d> | 1: OPEN1
0 <> <cats=purr, d> | 3: PLUS
ALNUM can match 4 times out of 32767...
Setting an EVAL scope, savestack=3
4 <cats> <=purr, d> | 5: CLOSE1
4 <cats> <=purr, d> | 7: EXACT <=>
5 <cats=> <purr, d> | 9: OPEN2
5 <cats=> <purr, d> | 11: PLUS

Setting an EVAL scope, savestack=3
9 <=purr> <, dog=b> | 13: CLOSE2
9 <=purr> <, dog=b> | 15: END
Match successful!
Guessing start of match, REx `(\w+)=(\w+)' against `, dog=bark'...
Found floating substr `=' at offset 5...
By STCLASS: moving 0 --> 2
Guessed: match at offset 2
Matching REx `(\w+)=(\w+)' against `dog=bark'
Setting an EVAL scope, savestack=3
11 <urr, > <dog=bar> | 1: OPEN1
11 <urr, > <dog=bar> | 3: PLUS
ALNUM can match 3 times out of 32767...
Setting an EVAL scope, savestack=3
14 <rr, dog> <=bark> | 5: CLOSE1
14 <rr, dog> <=bark> | 7: EXACT <=>
15 <rr, dog=> <bark> | 9: OPEN2
15 <rr, dog=> <bark> | 11: PLUS
ALNUM can match 4 times out of 32767...
Setting an EVAL scope, savestack=3
19 <rr, dog=bark> <> | 13: CLOSE2
19 <rr, dog=bark> <> | 15: END
Match successful!
Freeing REx: `(\w+)=(\w+)'

debugcolor option instead of debug , you'll get some form of highlighting or coloring in the output that'll make it prettier, if not more understandable

[Dec 03, 2017] Just writing the tests is often a damn fine way of finding bugs

Nov 15, 2017 | perlmonks.com

GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC

Re^2: Strategies for maintenance of horrible code?

Actually, just writing the tests is often a damn fine way of finding bugs. No exactly what OP is after at the moment, but something that is at the forefront of my mind because I'm in the middle of writing a set of tests (in Perl :) for some XML processing C++ code and turning up a pile of bugs as I go.

However it does suggest another test avenue: write test harnesses for modules so that you can exercise them in isolation and better understand how they work. If the test harness ends up part of a regression test system so much the better.


DWIM is Perl's answer to Gödel

[Dec 03, 2017] Strategies for maintenance of horrible code?

Notable quotes:
"... Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem. ..."
Jul 12, 2006 | perlmonks.com

converter has asked for the wisdom of the Perl Monks concerning the following question:

For the past several months I've been busy rewriting the horrible Perl code left behind by my predecessor. His approach to development was "Write some code. If the code runs without revealing any of the damage it's done, ship it. If not, write some more code."

This code is so bad that when co-workers ask me what I'm working on, I tell them "The Madman's Diary." Yes, it would have been cheaper and faster to throw this code away and start over, but I wasn't given that option.

My latest assignment is the repair of a tangled mess of a show-stopper that was discovered in a product that was supposed to ship today. After adding an open() override that logs the arguments to open() and some quality time with the watch(1) utility observing changes to the files containing the data that are causing the problem, I've narrowed the list of suspects down to a couple in-house scripts and a few (probably altered) webmin modules.

Now that I know where to look, I'd like to identify as quickly as possible which details can be safely ignored. I plan to use Devel::DProf to produce an execution graph for reference and Tie::Watch to watch variables, but I wonder if there are other tools that I should look at. A utility or module that would allow me to incrementally build a profile with persistent notes would be wonderful.

Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.

eyepopslikeamosquito (Chancellor) on Jul 12, 2006 at 08:30 UTC

Re: Strategies for maintenance of horrible code?
Yes, it would have been cheaper and faster to throw this code away and start over
Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch .

I agree with adrianh . If a component is not broken, don't rewrite it. Rewrite a component when you find a number of bugs in it. But first write a regression test suite for the component. I've seen many folks over the years throw out old code, rewrite it ... and introduce a heap of new bugs in the process. If you come into a new company and introduce a swag of new bugs in previously working code, you will start to smell very badly.

See also:

GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC

Re^2: Strategies for maintenance of horrible code?

Actually, just writing the tests is often a damn fine way of finding bugs. No exactly what OP is after at the moment, but something that is at the forefront of my mind because I'm in the middle of writing a set of tests (in Perl :) for some XML processing C++ code and turning up a pile of bugs as I go.

However it does suggest another test avenue: write test harnesses for modules so that you can exercise them in isolation and better understand how they work. If the test harness ends up part of a regression test system so much the better.


DWIM is Perl's answer to Gödel

tinita (Parson) on Jul 12, 2006 at 12:28 UTC

Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch.
uh oh. why does this remind me of perl 6? =)

adrianh (Chancellor) on Jul 12, 2006 at 07:49 UTC

Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.

I'd recommend reading Perl Medic and Working Effectively with Legacy Code (the latter isn't Perl specific - but is chock full of useful advice).

I would not spend any time fixing the code if it's not breaking (assuming you're not being paid to review/fix the code). However evil it may be - if it's doing it's job leave it alone.

Instead - every time you need to fix a bug or add some new functionality just test/refactor the bits of the evil code that are touched by the changes. I've found incrementally adding tests and refactoring to be much more effective than any sort of "big bang" fixing things for the sake of them approach :-)

If you are being paid to do a review/fix then Perl::Critic might give you some useful places to look.

webfiend (Vicar) on Jul 14, 2006 at 21:24 UTC

Definitely agree about the approach of sorting things out with gradual refactoring and tests as the need arises. The problem with the "Big Bang" approach is that you have the potential for a very long stretch of time where there are two forks of the code: ugly shipping code that will need to be fixed and refactored as bugs are reported, and pretty nonfunctioning code that will need to incorporate those fixes as they are uncovered, resulting in a perpetual loop of "it's not quite ready yet."

Ovid (Cardinal) on Jul 12, 2006 at 10:37 UTC

Check out Suggestions for working with poor code and some of the replies.

Cheers,
Ovid

New address of my CGI Course .

GrandFather (Sage) on Jul 12, 2006 at 07:39 UTC

Re: Strategies for maintenance of horrible code?

What tools are you using already and on what platform? For a large range of "detail" debugging there is nothing like as good as an IDE with a good integrated debugger. For a higher level view of where things are going Devel::TraceCalls may be handy, although it's output can be rather voluminous.


DWIM is Perl's answer to Gödel

Moron (Curate) on Jul 12, 2006 at 12:13 UTC

Some basic CYA I can see:

1) Ensure there is sufficient functional and technical design documentation against which the routines can be tested.

2) (updated) Make sure there is a sufficiently detailed project plan to include tasks for: systems analysis, functional and technical design, test planning, test script writing (e.g. using Expect ), developing, unit-, integrated and functional testing, rework and implementation, to include a GANTT chart of the work done so far and by who to what % of completion, to avoid getting the blame for not meeting poorly conceived targets over which you had no control.

In response to formal testing against the plan, I find it a useful aid to bug-fixing to monitor execution with perl -d, setting breakpoints and examining variables to hunt down which line of code causes each failure.

-M

Free your mind

aufflick (Deacon) on Jul 13, 2006 at 00:17 UTC

You might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.

The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also the UML::Sequence sounds intriguing - it pupports to generate a sequence diagram by monitoring code execution.

[Dec 03, 2017] Core module Tie::File - Access the lines of a disk file via a Perl array

Dec 03, 2017 | perldoc.perl.org

Tie::File

NAME

Tie::File - Access the lines of a disk file via a Perl array

SYNOPSIS
  1. # This file documents Tie::File version 0.98
  2. use Tie::File
  3. tie @array 'Tie::File' filename or die ...
  4. $array 13 ] = 'blah' # line 13 of the file is now 'blah'
  5. print $array 42 # display line 42 of the file
  6. $n_recs = @array # how many records are in the file?
  7. $#array -= # chop two records off the end
  8. for @array
  9. s/PERL/Perl/g # Replace PERL with Perl everywhere in the file
  10. # These are just like regular push, pop, unshift, shift, and splice
  11. # Except that they modify the file in the way you would expect
  12. push @array new recs ...
  13. my $r1 = pop @array
  14. unshift @array new recs ...
  15. my $r2 = shift @array
  16. @old_recs = splice @array new recs ...
  17. untie @array # all finished
DESCRIPTION

Tie::File represents a regular text file as a Perl array. Each element in the array corresponds to a record in the file. The first line of the file is element 0 of the array; the second line is element 1, and so on.

The file is not loaded into memory, so this will work even for gigantic files.

Changes to the array are reflected in the file immediately.

Lazy people and beginners may now stop reading the manual.

recsep

What is a 'record'? By default, the meaning is the same as for the <...> operator: It's a string terminated by $/ , which is probably "\n" . (Minor exception: on DOS and Win32 systems, a 'record' is a string terminated by "\r\n" .) You may change the definition of "record" by supplying the recsep option in the tie call:

  1. tie @array 'Tie::File' $file recsep => 'es'

This says that records are delimited by the string es . If the file contained the following data:

  1. Curse these pesky flies !\

then the @array would appear to have four elements:

  1. "Curse th"
  2. "e p"
  3. "ky fli"
  4. "!\n"

An undefined value is not permitted as a record separator. Perl's special "paragraph mode" semantics (à la $/ = "" ) are not emulated.

Records read from the tied array do not have the record separator string on the end; this is to allow

  1. $array 17 ] .= "extra"

to work as expected.

(See autochomp , below.) Records stored into the array will have the record separator string appended before they are written to the file, if they don't have one already. For example, if the record separator string is "\n" , then the following two lines do exactly the same thing:

  1. $array 17 ] = "Cherry pie"
  2. $array 17 ] = "Cherry pie\n"

The result is that the contents of line 17 of the file will be replaced with "Cherry pie"; a newline character will separate line 17 from line 18. This means that this code will do nothing:

  1. chomp $array 17

Because the chomp ed value will have the separator reattached when it is written back to the file. There is no way to create a file whose trailing record separator string is missing.

Inserting records that contain the record separator string is not supported by this module. It will probably produce a reasonable result, but what this result will be may change in a future version. Use 'splice' to insert records or to replace one record with several.

autochomp

Normally, array elements have the record separator removed, so that if the file contains the text

  1. Gold
  2. Frankincense
  3. Myrrh

the tied array will appear to contain "Gold" "Frankincense" "Myrrh" . If you set autochomp to a false value, the record separator will not be removed. If the file above was tied with

  1. tie @gifts "Tie::File" $gifts autochomp =>

then the array @gifts would appear to contain "Gold\n" "Frankincense\n" "Myrrh\n" , or (on Win32 systems) "Gold\r\n" "Frankincense\r\n" "Myrrh\r\n" .

mode

Normally, the specified file will be opened for read and write access, and will be created if it does not exist. (That is, the flags O_RDWR | O_CREAT are supplied in the open call.) If you want to change this, you may supply alternative flags in the mode option. See Fcntl for a listing of available flags. For example:

  1. # open the file if it exists, but fail if it does not exist
  2. use Fcntl 'O_RDWR'
  3. tie @array 'Tie::File' $file mode => O_RDWR
  4. # create the file if it does not exist
  5. use Fcntl 'O_RDWR' 'O_CREAT'
  6. tie @array 'Tie::File' $file mode => O_RDWR | O_CREAT
  7. # open an existing file in read-only mode
  8. use Fcntl 'O_RDONLY'
  9. tie @array 'Tie::File' $file mode => O_RDONLY

Opening the data file in write-only or append mode is not supported.

memory

This is an upper limit on the amount of memory that Tie::File will consume at any time while managing the file. This is used for two things: managing the read cache and managing the deferred write buffer .

Records read in from the file are cached, to avoid having to re-read them repeatedly. If you read the same record twice, the first time it will be stored in memory, and the second time it will be fetched from the read cache . The amount of data in the read cache will not exceed the value you specified for memory . If Tie::File wants to cache a new record, but the read cache is full, it will make room by expiring the least-recently visited records from the read cache.

The default memory limit is 2Mib. You can adjust the maximum read cache size by supplying the memory option. The argument is the desired cache size, in bytes.

  1. # I have a lot of memory, so use a large cache to speed up access
  2. tie @array 'Tie::File' $file memory => 20_000_000

Setting the memory limit to 0 will inhibit caching; records will be fetched from disk every time you examine them.

The memory value is not an absolute or exact limit on the memory used. Tie::File objects contains some structures besides the read cache and the deferred write buffer, whose sizes are not charged against memory .

The cache itself consumes about 310 bytes per cached record, so if your file has many short records, you may want to decrease the cache memory limit, or else the cache overhead may exceed the size of the cached data.

dw_size

(This is an advanced feature. Skip this section on first reading.)

If you use deferred writing (See Deferred Writing , below) then data you write into the array will not be written directly to the file; instead, it will be saved in the deferred write buffer to be written out later. Data in the deferred write buffer is also charged against the memory limit you set with the memory option.

You may set the dw_size option to limit the amount of data that can be saved in the deferred write buffer. This limit may not exceed the total memory limit. For example, if you set dw_size to 1000 and memory to 2500, that means that no more than 1000 bytes of deferred writes will be saved up. The space available for the read cache will vary, but it will always be at least 1500 bytes (if the deferred write buffer is full) and it could grow as large as 2500 bytes (if the deferred write buffer is empty.)

If you don't specify a dw_size , it defaults to the entire memory limit.

Option Format

- mode is a synonym for mode . - recsep is a synonym for recsep . - memory is a synonym for memory . You get the idea.

Public Methods

The tie call returns an object, say $o . You may call

  1. $rec = $o->FETCH $n
  2. $o->STORE $n $rec

to fetch or store the record at line $n , respectively; similarly the other tied array methods. (See perltie for details.) You may also call the following methods on this object:

flock
  1. $o->flock MODE

will lock the tied file. MODE has the same meaning as the second argument to the Perl built-in flock function; for example LOCK_SH or LOCK_EX | LOCK_NB . (These constants are provided by the use Fcntl ':flock' declaration.)

MODE is optional; the default is LOCK_EX .

Tie::File maintains an internal table of the byte offset of each record it has seen in the file.

When you use flock to lock the file, Tie::File assumes that the read cache is no longer trustworthy, because another process might have modified the file since the last time it was read. Therefore, a successful call to flock discards the contents of the read cache and the internal record offset table.

Tie::File promises that the following sequence of operations will be safe:

  1. my $o = tie @array "Tie::File" $filename
  2. $o->flock

In particular, Tie::File will not read or write the file during the tie call. (Exception: Using mode => O_TRUNC will, of course, erase the file during the tie call. If you want to do this safely, then open the file without O_TRUNC , lock the file, and use @array = () .)

The best way to unlock a file is to discard the object and untie the array. It is probably unsafe to unlock the file without also untying it, because if you do, changes may remain unwritten inside the object. That is why there is no shortcut for unlocking. If you really want to unlock the file prematurely, you know what to do; if you don't know what to do, then don't do it.

All the usual warnings about file locking apply here. In particular, note that file locking in Perl is advisory , which means that holding a lock will not prevent anyone else from reading, writing, or erasing the file; it only prevents them from getting another lock at the same time. Locks are analogous to green traffic lights: If you have a green light, that does not prevent the idiot coming the other way from plowing into you sideways; it merely guarantees to you that the idiot does not also have a green light at the same time.

autochomp
  1. my $old_value = $o->autochomp # disable autochomp option
  2. my $old_value = $o->autochomp # enable autochomp option
  3. my $ac = $o->autochomp () # recover current value

See autochomp , above.

defer , flush , discard , and autodefer

See Deferred Writing , below.

offset
  1. $off = $o->offset $n

This method returns the byte offset of the start of the $n th record in the file. If there is no such record, it returns an undefined value.

Tying to an already-opened filehandle

If $fh is a filehandle, such as is returned by IO::File or one of the other IO modules, you may use:

  1. tie @array 'Tie::File' $fh ...

Similarly if you opened that handle FH with regular open or sysopen , you may use:

  1. tie @array 'Tie::File' \ *FH ...

Handles that were opened write-only won't work. Handles that were opened read-only will work as long as you don't try to modify the array. Handles must be attached to seekable sources of data---that means no pipes or sockets. If Tie::File can detect that you supplied a non-seekable handle, the tie call will throw an exception. (On Unix systems, it can detect this.)

Note that Tie::File will only close any filehandles that it opened internally. If you passed it a filehandle as above, you "own" the filehandle, and are responsible for closing it after you have untied the @array.

Deferred Writing

(This is an advanced feature. Skip this section on first reading.)

Normally, modifying a Tie::File array writes to the underlying file immediately. Every assignment like $a ] = ... rewrites as much of the file as is necessary; typically, everything from line 3 through the end will need to be rewritten. This is the simplest and most transparent behavior. Performance even for large files is reasonably good.

However, under some circumstances, this behavior may be excessively slow. For example, suppose you have a million-record file, and you want to do:

  1. for @FILE
  2. $_ = "> $_"

The first time through the loop, you will rewrite the entire file, from line 0 through the end. The second time through the loop, you will rewrite the entire file from line 1 through the end. The third time through the loop, you will rewrite the entire file from line 2 to the end. And so on.

If the performance in such cases is unacceptable, you may defer the actual writing, and then have it done all at once. The following loop will perform much better for large files:

  1. tied @a ->defer
  2. for @a
  3. $_ = "> $_"
  4. tied @a ->flush

If Tie::File 's memory limit is large enough, all the writing will done in memory. Then, when you call ->flush , the entire file will be rewritten in a single pass.

(Actually, the preceding discussion is something of a fib. You don't need to enable deferred writing to get good performance for this common case, because Tie::File will do it for you automatically unless you specifically tell it not to. See autodeferring , below.)

Calling ->flush returns the array to immediate-write mode. If you wish to discard the deferred writes, you may call ->discard instead of ->flush . Note that in some cases, some of the data will have been written already, and it will be too late for ->discard to discard all the changes. Support for ->discard may be withdrawn in a future version of Tie::File .

Deferred writes are cached in memory up to the limit specified by the dw_size option (see above). If the deferred-write buffer is full and you try to write still more deferred data, the buffer will be flushed. All buffered data will be written immediately, the buffer will be emptied, and the now-empty space will be used for future deferred writes.

If the deferred-write buffer isn't yet full, but the total size of the buffer and the read cache would exceed the memory limit, the oldest records will be expired from the read cache until the total size is under the limit.

push , pop , shift , unshift , and splice cannot be deferred. When you perform one of these operations, any deferred data is written to the file and the operation is performed immediately. This may change in a future version.

If you resize the array with deferred writing enabled, the file will be resized immediately, but deferred records will not be written. This has a surprising consequence: @a = ... erases the file immediately, but the writing of the actual data is deferred. This might be a bug. If it is a bug, it will be fixed in a future version.

Autodeferring

Tie::File tries to guess when deferred writing might be helpful, and to turn it on and off automatically.

  1. for @a
  2. $_ = "> $_"

In this example, only the first two assignments will be done immediately; after this, all the changes to the file will be deferred up to the user-specified memory limit.

You should usually be able to ignore this and just use the module without thinking about deferring. However, special applications may require fine control over which writes are deferred, or may require that all writes be immediate. To disable the autodeferment feature, use

  1. tied @o ->autodefer

or

  1. tie @array 'Tie::File' $file autodefer =>

Similarly, ->autodefer re-enables autodeferment, and ->autodefer () recovers the current value of the autodefer setting.

CONCURRENT ACCESS TO FILES

Caching and deferred writing are inappropriate if you want the same file to be accessed simultaneously from more than one process. Other optimizations performed internally by this module are also incompatible with concurrent access. A future version of this module will support a concurrent => option that enables safe concurrent access.

Previous versions of this documentation suggested using memory => for safe concurrent access. This was mistaken. Tie::File will not support safe concurrent access before version 0.96.

CAVEATS

(That's Latin for 'warnings'.)

SUBCLASSING

This version promises absolutely nothing about the internals, which may change without notice. A future version of the module will have a well-defined and stable subclassing API.

WHAT ABOUT DB_File ?

People sometimes point out that DB_File will do something similar, and ask why Tie::File module is necessary.

There are a number of reasons that you might prefer Tie::File . A list is available at http://perl.plover.com/TieFile/why-not-DB_File .

AUTHOR

Mark Jason Dominus

To contact the author, send email to: mjd perl tiefile @plover com

To receive an announcement whenever a new version of this module is released, send a blank email message to mjd perl tiefile subscribe @plover com .

The most recent version of this module, including documentation and any news of importance, will be available at

  1. http://perl.plover.com/TieFile/
LICENSE

Tie::File version 0.96 is copyright (C) 2003 Mark Jason Dominus.

This library is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

These terms are your choice of any of (1) the Perl Artistic Licence, or (2) version 2 of the GNU General Public License as published by the Free Software Foundation, or (3) any later version of the GNU General Public License.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this library program; it should be in the file COPYING . If not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA

For licensing inquiries, contact the author at:

  1. Mark Jason Dominus
  2. 255 S. Warnock St.
  3. Philadelphia, PA 19107
WARRANTY

Tie::File version 0.98 comes with ABSOLUTELY NO WARRANTY. For details, see the license.

THANKS

[Dec 01, 2017] regex - Debugging Perl Regular expression - Stack Overflow

Dec 01, 2017 | stackoverflow.com

down vote favorite 1

AnonGeek ,Jun 20, 2012 at 20:37

I am trying to debug few regular expressions using:
perl -Mre=debug file.pl

The file.pl script has many regular expression. Some of them are repeated. Using above syntax, all the regex in file.pl are being debugged.

Is there a way to tell Perl to debug only a particular regex in a script?

I am familiar with YAPE::Regex module, but that is not what I require. So please dont suggest to use that.

Ehtesh Choudhury ,Jun 20, 2012 at 20:45

Why not just comment out the other regexes, or run just the particular regex on the command line, via perl -e ? – Ehtesh Choudhury Jun 20 '12 at 20:45

AnonGeek ,Jun 20, 2012 at 20:55

the script is very compilcated(12000 LOC). If I will comment out any of regex then the execution will fail..also if I provide dummy values, then it will give unexpected results :( – AnonGeek Jun 20 '12 at 20:55

Oleg V. Volkov ,Jun 20, 2012 at 20:41

As with many other pragmas, you can use no to cancel previous use .
use re 'debug';

$str=~/\d{3}/;

no re 'debug';

$str=~/\d{3}/;

Denis Ibaev ,Jun 20, 2012 at 20:48

As of 5.9.5 the directive use re 'debug' and its equivalents are lexically scoped, as the other directives are.

Use:

{
    use re 'debug';
    # Debugged regexp here.
}

AnonGeek ,Jun 20, 2012 at 21:10

Is this also supported in 5.8.8? I am putting it under a condition but it is enabling it globally for all regex. – AnonGeek Jun 20 '12 at 21:10

Denis Ibaev ,Jun 21, 2012 at 5:24

No, since version 5.9.5. In 5.8.8 you need use no statement. – Denis Ibaev Jun 21 '12 at 5:24

[Nov 30, 2017] Working with character arrays in perl

Nov 30, 2017 | stackoverflow.com

up vote down vote

cbg ,Jul 7, 2014 at 11:36

my @char_array = split "", $s1;

@char_array now contains all the characters of the $s1 string and it's possible to manipulate it, iterate over it or do whatever to it just like with any other array.

You can you splice to insert elements at a given position of the array:
echo -e 'hello\ndisk\ncaller' | perl -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," "); foreach(@F){print}'
he l lo
di s k
ca l ler

You can use Data::Dumper for better visualization when working with arrays:

echo -n 'hello' | perl -MData::Dumper -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," ");print Dumper(\@F)'
$VAR1 = [
          'h',
          'e',
          ' ',
          'l',
          ' ',
          'l',
          'o'
        ];

[Nov 30, 2017] debugging - Perl Debugger Filehandle as Input

Highly recommended!
Nov 30, 2017 | stackoverflow.com
I have this problem: I need to control the perl-debugger from an external script. By research I found out about various solutions, but I don't understand them. I failed to properly set up the RemotePort option (editing ".perldb"), which was the first I tried, and found no useful information on providing a filehandle from which the debugger would get its input (by somehow setting @cmdfhs) I found both options over here: http://search.cpan.org/~nwclark/perl-5.8.6/lib/perl5db.pl

It would be nice if you could tell me how to provide the filehandle from which the debugger gets its input, or if you know a link where this is explained?

Casper ,Jun 28, 2015 at 21:53

Here's a simple example setting it up using RemotePort , which seemed easier to me:

The trick to using RemotePort is that you have to have someone listening on the remote end BEFORE you launch the script to be debugged.

As soon as you launch your script with -d Perl will attempt to connect to RemotePort . So you have to make sure the initial connection succeeds by having someone listening there beforehand.

Here I assume some Linux/Unix variant, which has the netcat utility installed. We use netcat to wait for incoming connections in this example, but you can use anything else you wish too which is able to create a service port and shuffle data between that and the current TTY:

In terminal 1

 # Use netcat to listen for incoming connections on port 9999
 > nc -l -p 9999

In terminal 2

 # Start perl with -d and request a RemotePort connection 
 > PERLDB_OPTS=RemotePort=127.0.0.1:9999 perl -d my_script.pl

As soon as you do that in terminal 1 you will see something like this:

Loading DB routines from perl5db.pl version 1.39_10
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(my_script.pl:4):
  DB<1>

There you go..debug away.

Devel::Trepan is a gdb-like debugger. Although it has remote control, you can also run it at the outset with the option --command which will "source" (in the gdb-sense) or run a series of debugger commands.

To go into remote control, either start the debugger using the --server option or inside the debugger use the " server " command once inside the debugger.

See Options for a list of options you can give at the outset.

[Nov 30, 2017] regex - How to read perl regular expression debugger - Stack Overflow

Nov 30, 2017 | stackoverflow.com

Wakan Tanka ,Mar 20, 2015 at 13:17

I've come across following materials:
  1. Mastering Perl by brian d foy , chapter: Debugging Regular Expressions.
  2. Debugging regular expressions which mentions re::debug module for perl

I've also try to use various another techniques:

  1. Module re=debugcolor which highlights it's output.
  2. Used following construction ?{print "$1 $2\n"} .

but still did not get the point how to read their output. I've also found another modules used for debugging regular expressions here but I did not tried them yet, can you please explain how to read output of use re 'debug' or another command used for debugging regular expressions in perl?

EDIT in reply to Borodin:

1st example:

perl -Mre=debug -e' "foobar"=~/(.)\1/'
Compiling REx "(.)\1"
Final program:
   1: OPEN1 (3)
   3:   REG_ANY (4)
   4: CLOSE1 (6)
   6: REF1 (8)
   8: END (0)
minlen 1
Matching REx "(.)\1" against "foobar"
   0 <> <foobar>             |  1:OPEN1(3)
   0 <> <foobar>             |  3:REG_ANY(4)
   1 <f> <oobar>             |  4:CLOSE1(6)
   1 <f> <oobar>             |  6:REF1(8)
                                  failed...
   1 <f> <oobar>             |  1:OPEN1(3)
   1 <f> <oobar>             |  3:REG_ANY(4)
   2 <fo> <obar>             |  4:CLOSE1(6)
   2 <fo> <obar>             |  6:REF1(8)
   3 <foo> <bar>             |  8:END(0)
Match successful!
Freeing REx: "(.)\1"
  1. What does OPEN1, REG_ANY, CLOSE1 ... mean ?
  2. What numbers like 1 3 4 6 8 mean?
  3. What does number in braces OPEN1(3) mean?
  4. Which output should I look at, Compiling REx or Matching REx?

2nd example:

 perl -Mre=debugcolor -e' "foobar"=~/(.*)\1/'
Compiling REx "(.*)\1"
Final program:
   1: OPEN1 (3)
   3:   STAR (5)
   4:     REG_ANY (0)
   5: CLOSE1 (7)
   7: REF1 (9)
   9: END (0)
minlen 0
Matching REx "(.*)\1" against "foobar"
   0 <foobar>|  1:OPEN1(3)
   0 <foobar>|  3:STAR(5)
                                  REG_ANY can match 6 times out of 2147483647...
   6 <foobar>|  5:  CLOSE1(7)
   6 <foobar>|  7:  REF1(9)
                                    failed...
   5 <foobar>|  5:  CLOSE1(7)
   5 <foobar>|  7:  REF1(9)
                                    failed...
   4 <foobar>|  5:  CLOSE1(7)
   4 <foobar>|  7:  REF1(9)
                                    failed...
   3 <foobar>|  5:  CLOSE1(7)
   3 <foobar>|  7:  REF1(9)
                                    failed...
   2 <foobar>|  5:  CLOSE1(7)
   2 <foobar>|  7:  REF1(9)
                                    failed...
   1 <foobar>|  5:  CLOSE1(7)
   1 <foobar>|  7:  REF1(9)
                                    failed...
   0 <foobar>|  5:  CLOSE1(7)
   0 <foobar>|  7:  REF1(9)
   0 <foobar>|  9:  END(0)
Match successful!
Freeing REx: "(.*)\1"
  1. Why are numbers descending 6 5 4 3 ... in this example?
  2. What does failed keyword mean?

Borodin ,Mar 20, 2015 at 14:45

Asking how to use a debugger is very broad. Can you show us the pattern that you are trying to debug, and explain what you don't understand? – Borodin Mar 20 '15 at 14:45

ThisSuitIsBlackNot ,Mar 20, 2015 at 15:38

When you run perl -Mre=debug , you're using the re module; you can see the documentation by running perldoc re . The section on "debug mode" is a bit sparse, but ends with "See 'Debugging regular expressions' in perldebug for additional info." perldoc perldebug is similarly short on details, but ends with "These matters are explored in some detail in 'Debugging regular expressions' in perldebguts." And now we have your answer . – ThisSuitIsBlackNot Mar 20 '15 at 15:38

axblount ,Mar 20, 2015 at 20:25

Regular expressions define finite state machines 1 . The debugger is more or less showing you how the state machine is progressing as the string is consumed character by character.

"Compiling REx" is the listing of instructions for that regular expression. The number in parenthesis after each instruction is where to go once the step succeeds. In /(.*)\1/ :

1: OPEN1 (3)
3:   STAR (5)
4:     REG_ANY (0)
5: CLOSE1 (7)

STAR (5) means compute STAR and once you succeed, go to instruction 5 CLOSE1 .

"Matching REx" is the step-by-step execution of those instructions. The number on the left is the total number of characters that have been consumed so far. This number can go down if the matcher has to go backwards because something it tried didn't work.

To understand these instructions, it's important to understand how regular expressions "work." Finite state machines are usually visualized as a kind of flow chart. I have produced a crude one below for /(.)\1/ . Because of the back reference to a capture group, I don't believe this regex is a strict finite state machine. The chart is useful none the less.

               Match                           
+-------+     Anything     +----------+        
| Start +------------------+  State 1 |        
+---^---+                  +--+---+---+        
    |                         |   |            
    |                         |   |Matched same
    +-------------------------+   | character  
            matched different     |            
                character    +----+------+     
                             |  Success  |     
                             +-----------+

We start on Start . It's easy to advance to the first state, we just consume any one character ( REG_ANY ). The only other thing that could happen is end of input. I haven't drawn that here. The REG_ANY instruction is wrapped in the capture group instructions. OPEN1 starts recording all matched characters into the first capture group. CLOSE1 stops recording characters to the first capture group.

Once we consume a character, we sit on State 1 and consume the next char. If it matches the previous char we move to success! REF1 is the instruction that attempts to match capture group #1. Otherwise, we failed and need to move back to the Start to try again. Whenever the matcher says "failed..." it's telling you that something didn't work, so it's returning to an earlier state (that may or may not include 'unconsuming' characters).

The example with * is more complicated. * (which corresponds to STAR ) tries to match the given pattern zero or more times, and it is greedy . That means it tries to match as many characters as it possibly can. Starting at the beginning of the string, it says "I can match up to 6 characters!" So, it matches all 6 characters ( "foobar" ), closes the capture group, and tries to match "foobar" again. That doesn't work! It tries again with 5, that doesn't work. And so on, until it tries to matching zero characters. That means the capture group is empty, matching the empty string always succeeds. So the match succeeds with \1 = "" .

I realize I've spent more time explaining regular expressions than I have Perl's regex debugger. But I think its output will become much more clear once you understand how regexes operate.

Here is a finite state machine simulator . You can enter a regex and see it executed. Unfortunately, it doesn't support back references.

1: I believe some of Perl's regular expression features push it beyond this definition but it's still useful to think about them this way.

> ,

The debug Iinformation contains description of the bytecode. Numbers denote the node indices in the op tree. Numbers in round brackets tell the engine to jump to a specific node upon match. The EXACT operator tells the regex engine to look for a literal string. REG_ANY means the . symbol. PLUS means the +. Code 0 is for the 'end' node. OPEN1 is a '(' symbol. CLOSE1 means ')'. STAR is a '*'. When the matcher reaches the end node, it returns a success code back to Perl, indicating that the entire regex has matched.

See more details at http://perldoc.perl.org/perldebguts.html#Debugging-Regular-Expressions and a more conceptual http://perl.plover.com/Rx/paper/

[Nov 30, 2017] Working with character arrays in perl

Nov 30, 2017 | stackoverflow.com

up vote down vote

cbg ,Jul 7, 2014 at 11:36

my @char_array = split "", $s1;

@char_array now contains all the characters of the $s1 string and it's possible to manipulate it, iterate over it or do whatever to it just like with any other array.

You can you splice to insert elements at a given position of the array:
echo -e 'hello\ndisk\ncaller' | perl -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," "); foreach(@F){print}'
he l lo
di s k
ca l ler

You can use Data::Dumper for better visualization when working with arrays:

echo -n 'hello' | perl -MData::Dumper -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," ");print Dumper(\@F)'
$VAR1 = [
          'h',
          'e',
          ' ',
          'l',
          ' ',
          'l',
          'o'
        ];

[Nov 30, 2017] debugging - Perl Debugger Filehandle as Input

Nov 30, 2017 | stackoverflow.com
I have this problem: I need to control the perl-debugger from an external script. By research I found out about various solutions, but I don't understand them. I failed to properly set up the RemotePort option (editing ".perldb"), which was the first I tried, and found no useful information on providing a filehandle from which the debugger would get its input (by somehow setting @cmdfhs) I found both options over here: http://search.cpan.org/~nwclark/perl-5.8.6/lib/perl5db.pl

It would be nice if you could tell me how to provide the filehandle from which the debugger gets its input, or if you know a link where this is explained?

Casper ,Jun 28, 2015 at 21:53

Here's a simple example setting it up using RemotePort , which seemed easier to me:

The trick to using RemotePort is that you have to have someone listening on the remote end BEFORE you launch the script to be debugged.

As soon as you launch your script with -d Perl will attempt to connect to RemotePort . So you have to make sure the initial connection succeeds by having someone listening there beforehand.

Here I assume some Linux/Unix variant, which has the netcat utility installed. We use netcat to wait for incoming connections in this example, but you can use anything else you wish too which is able to create a service port and shuffle data between that and the current TTY:

In terminal 1

 # Use netcat to listen for incoming connections on port 9999
 > nc -l -p 9999

In terminal 2

 # Start perl with -d and request a RemotePort connection 
 > PERLDB_OPTS=RemotePort=127.0.0.1:9999 perl -d my_script.pl

As soon as you do that in terminal 1 you will see something like this:

Loading DB routines from perl5db.pl version 1.39_10
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(my_script.pl:4):
  DB<1>

There you go..debug away.

Devel::Trepan is a gdb-like debugger. Although it has remote control, you can also run it at the outset with the option --command which will "source" (in the gdb-sense) or run a series of debugger commands.

To go into remote control, either start the debugger using the --server option or inside the debugger use the " server " command once inside the debugger.

See Options for a list of options you can give at the outset.

[Nov 30, 2017] regex - How to read perl regular expression debugger - Stack Overflow

Nov 30, 2017 | stackoverflow.com

Wakan Tanka ,Mar 20, 2015 at 13:17

I've come across following materials:
  1. Mastering Perl by brian d foy , chapter: Debugging Regular Expressions.
  2. Debugging regular expressions which mentions re::debug module for perl

I've also try to use various another techniques:

  1. Module re=debugcolor which highlights it's output.
  2. Used following construction ?{print "$1 $2\n"} .

but still did not get the point how to read their output. I've also found another modules used for debugging regular expressions here but I did not tried them yet, can you please explain how to read output of use re 'debug' or another command used for debugging regular expressions in perl?

EDIT in reply to Borodin:

1st example:

perl -Mre=debug -e' "foobar"=~/(.)\1/'
Compiling REx "(.)\1"
Final program:
   1: OPEN1 (3)
   3:   REG_ANY (4)
   4: CLOSE1 (6)
   6: REF1 (8)
   8: END (0)
minlen 1
Matching REx "(.)\1" against "foobar"
   0 <> <foobar>             |  1:OPEN1(3)
   0 <> <foobar>             |  3:REG_ANY(4)
   1 <f> <oobar>             |  4:CLOSE1(6)
   1 <f> <oobar>             |  6:REF1(8)
                                  failed...
   1 <f> <oobar>             |  1:OPEN1(3)
   1 <f> <oobar>             |  3:REG_ANY(4)
   2 <fo> <obar>             |  4:CLOSE1(6)
   2 <fo> <obar>             |  6:REF1(8)
   3 <foo> <bar>             |  8:END(0)
Match successful!
Freeing REx: "(.)\1"
  1. What does OPEN1, REG_ANY, CLOSE1 ... mean ?
  2. What numbers like 1 3 4 6 8 mean?
  3. What does number in braces OPEN1(3) mean?
  4. Which output should I look at, Compiling REx or Matching REx?

2nd example:

 perl -Mre=debugcolor -e' "foobar"=~/(.*)\1/'
Compiling REx "(.*)\1"
Final program:
   1: OPEN1 (3)
   3:   STAR (5)
   4:     REG_ANY (0)
   5: CLOSE1 (7)
   7: REF1 (9)
   9: END (0)
minlen 0
Matching REx "(.*)\1" against "foobar"
   0 <foobar>|  1:OPEN1(3)
   0 <foobar>|  3:STAR(5)
                                  REG_ANY can match 6 times out of 2147483647...
   6 <foobar>|  5:  CLOSE1(7)
   6 <foobar>|  7:  REF1(9)
                                    failed...
   5 <foobar>|  5:  CLOSE1(7)
   5 <foobar>|  7:  REF1(9)
                                    failed...
   4 <foobar>|  5:  CLOSE1(7)
   4 <foobar>|  7:  REF1(9)
                                    failed...
   3 <foobar>|  5:  CLOSE1(7)
   3 <foobar>|  7:  REF1(9)
                                    failed...
   2 <foobar>|  5:  CLOSE1(7)
   2 <foobar>|  7:  REF1(9)
                                    failed...
   1 <foobar>|  5:  CLOSE1(7)
   1 <foobar>|  7:  REF1(9)
                                    failed...
   0 <foobar>|  5:  CLOSE1(7)
   0 <foobar>|  7:  REF1(9)
   0 <foobar>|  9:  END(0)
Match successful!
Freeing REx: "(.*)\1"
  1. Why are numbers descending 6 5 4 3 ... in this example?
  2. What does failed keyword mean?

Borodin ,Mar 20, 2015 at 14:45

Asking how to use a debugger is very broad. Can you show us the pattern that you are trying to debug, and explain what you don't understand? – Borodin Mar 20 '15 at 14:45

ThisSuitIsBlackNot ,Mar 20, 2015 at 15:38

When you run perl -Mre=debug , you're using the re module; you can see the documentation by running perldoc re . The section on "debug mode" is a bit sparse, but ends with "See 'Debugging regular expressions' in perldebug for additional info." perldoc perldebug is similarly short on details, but ends with "These matters are explored in some detail in 'Debugging regular expressions' in perldebguts." And now we have your answer . – ThisSuitIsBlackNot Mar 20 '15 at 15:38

axblount ,Mar 20, 2015 at 20:25

Regular expressions define finite state machines 1 . The debugger is more or less showing you how the state machine is progressing as the string is consumed character by character.

"Compiling REx" is the listing of instructions for that regular expression. The number in parenthesis after each instruction is where to go once the step succeeds. In /(.*)\1/ :

1: OPEN1 (3)
3:   STAR (5)
4:     REG_ANY (0)
5: CLOSE1 (7)

STAR (5) means compute STAR and once you succeed, go to instruction 5 CLOSE1 .

"Matching REx" is the step-by-step execution of those instructions. The number on the left is the total number of characters that have been consumed so far. This number can go down if the matcher has to go backwards because something it tried didn't work.

To understand these instructions, it's important to understand how regular expressions "work." Finite state machines are usually visualized as a kind of flow chart. I have produced a crude one below for /(.)\1/ . Because of the back reference to a capture group, I don't believe this regex is a strict finite state machine. The chart is useful none the less.

               Match                           
+-------+     Anything     +----------+        
| Start +------------------+  State 1 |        
+---^---+                  +--+---+---+        
    |                         |   |            
    |                         |   |Matched same
    +-------------------------+   | character  
            matched different     |            
                character    +----+------+     
                             |  Success  |     
                             +-----------+

We start on Start . It's easy to advance to the first state, we just consume any one character ( REG_ANY ). The only other thing that could happen is end of input. I haven't drawn that here. The REG_ANY instruction is wrapped in the capture group instructions. OPEN1 starts recording all matched characters into the first capture group. CLOSE1 stops recording characters to the first capture group.

Once we consume a character, we sit on State 1 and consume the next char. If it matches the previous char we move to success! REF1 is the instruction that attempts to match capture group #1. Otherwise, we failed and need to move back to the Start to try again. Whenever the matcher says "failed..." it's telling you that something didn't work, so it's returning to an earlier state (that may or may not include 'unconsuming' characters).

The example with * is more complicated. * (which corresponds to STAR ) tries to match the given pattern zero or more times, and it is greedy . That means it tries to match as many characters as it possibly can. Starting at the beginning of the string, it says "I can match up to 6 characters!" So, it matches all 6 characters ( "foobar" ), closes the capture group, and tries to match "foobar" again. That doesn't work! It tries again with 5, that doesn't work. And so on, until it tries to matching zero characters. That means the capture group is empty, matching the empty string always succeeds. So the match succeeds with \1 = "" .

I realize I've spent more time explaining regular expressions than I have Perl's regex debugger. But I think its output will become much more clear once you understand how regexes operate.

Here is a finite state machine simulator . You can enter a regex and see it executed. Unfortunately, it doesn't support back references.

1: I believe some of Perl's regular expression features push it beyond this definition but it's still useful to think about them this way.

> ,

The debug Iinformation contains description of the bytecode. Numbers denote the node indices in the op tree. Numbers in round brackets tell the engine to jump to a specific node upon match. The EXACT operator tells the regex engine to look for a literal string. REG_ANY means the . symbol. PLUS means the +. Code 0 is for the 'end' node. OPEN1 is a '(' symbol. CLOSE1 means ')'. STAR is a '*'. When the matcher reaches the end node, it returns a success code back to Perl, indicating that the entire regex has matched.

See more details at http://perldoc.perl.org/perldebguts.html#Debugging-Regular-Expressions and a more conceptual http://perl.plover.com/Rx/paper/

[Nov 29, 2017] How can I have variable assertions in Perl

Notable quotes:
"... Smart::Comments++ When used with the -ENV switch, it's a fantastic tool for this sort of thing. Much better than having to strip all the tests out before going to production, as someone else suggested. ..."
Nov 29, 2017 | stackoverflow.com

Alex ,Jun 21, 2009 at 12:45

How can I check that a variable has a specific value in Perl? Is there a command to stop a script's execution to look up some of it's variables?

I wonder if I can use the Pythonic practice of inserting:

    assert 0, (foo, bar)

to debug scripts in a debuger-less way?

Telemachus ,Jun 21, 2009 at 12:58

A quick CPAN search suggests Carp::Assert .

Sinan Ünür ,Jun 21, 2009 at 13:30

+1 for typing in more characters faster than I did. I am going to edit the URL to be version agnostic though. – Sinan Ünür Jun 21 '09 at 13:30

Telemachus ,Jun 21, 2009 at 13:50

I was going to comment on your answer about the photo finish. As for the URL, I constantly forget that, so thanks. – Telemachus Jun 21 '09 at 13:50

Sinan Ünür ,Jun 21, 2009 at 12:58

See Carp::Assert .

zoul ,Jun 21, 2009 at 13:44

Smart::Comments are nice.

RET ,Jun 22, 2009 at 3:28

Smart::Comments++ When used with the -ENV switch, it's a fantastic tool for this sort of thing. Much better than having to strip all the tests out before going to production, as someone else suggested.

RET Jun 22 '09 at 3:28

nik ,Jun 21, 2009 at 12:54

There is a script at PerlMonks that introduces a fast assert method.

Speed is important since Perl is interpreted and any inline checks will impact performance (unlike simple C macros for example)


I am not sure if these things are going to be directly usable.


Ok! This is what i was looking for -- PDF Warning: Test-Tutorial.pdf . The Test::Harness is used for writing Perl module tests.

Ape-inago ,Jun 21, 2009 at 13:51

$var_to_check =~ /sometest/ or die "bad variable!";

I tend to throw things like this in my code, and later use a find and replace to get rid of them (in production code).

Also, ' eval ' can be used to run a section of code and capture errors and can be used to create exception handling functionality. If you are asserting that a value is not 0, perhaps you want to throw an exception and handle that case in a special way?

> ,

if ( $next_sunrise_time > 24*60*60 ) { warn( "assertion failed" ); } # Assert that the sun must rise in the next 24 hours.

You can do this if you do not have access to Perl 5.9 which is required for Carp::Assert .

[Nov 29, 2017] How can I have variable assertions in Perl

Nov 29, 2017 | stackoverflow.com

Alex ,Jun 21, 2009 at 12:45

How can I check that a variable has a specific value in Perl? Is there a command to stop a script's execution to look up some of it's variables?

I wonder if I can use the Pythonic practice of inserting:

    assert 0, (foo, bar)

to debug scripts in a debuger-less way?

Telemachus ,Jun 21, 2009 at 12:58

A quick CPAN search suggests Carp::Assert .

Sinan Ünür ,Jun 21, 2009 at 13:30

+1 for typing in more characters faster than I did. I am going to edit the URL to be version agnostic though. – Sinan Ünür Jun 21 '09 at 13:30

Telemachus ,Jun 21, 2009 at 13:50

I was going to comment on your answer about the photo finish. As for the URL, I constantly forget that, so thanks. – Telemachus Jun 21 '09 at 13:50

Sinan Ünür ,Jun 21, 2009 at 12:58

See Carp::Assert .

zoul ,Jun 21, 2009 at 13:44

Smart::Comments are nice.

RET ,Jun 22, 2009 at 3:28

Smart::Comments++ When used with the -ENV switch, it's a fantastic tool for this sort of thing. Much better than having to strip all the tests out before going to production, as someone else suggested.

RET Jun 22 '09 at 3:28

nik ,Jun 21, 2009 at 12:54

There is a script at PerlMonks that introduces a fast assert method.

Speed is important since Perl is interpreted and any inline checks will impact performance (unlike simple C macros for example)


I am not sure if these things are going to be directly usable.


Ok! This is what i was looking for -- PDF Warning: Test-Tutorial.pdf . The Test::Harness is used for writing Perl module tests.

Ape-inago ,Jun 21, 2009 at 13:51

$var_to_check =~ /sometest/ or die "bad variable!";

I tend to throw things like this in my code, and later use a find and replace to get rid of them (in production code).

Also, ' eval ' can be used to run a section of code and capture errors and can be used to create exception handling functionality. If you are asserting that a value is not 0, perhaps you want to throw an exception and handle that case in a special way?

> ,

if ( $next_sunrise_time > 24*60*60 ) { warn( "assertion failed" ); } # Assert that the sun must rise in the next 24 hours.

You can do this if you do not have access to Perl 5.9 which is required for Carp::Assert .

[Nov 23, 2017] Simple Module Tutorial

Notable quotes:
"... Quite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered. ..."
Aug 06, 2001 | perlmonks.com
So you find the Perl docs on modules a bit confusing? OK here is the world's simplest Perl module demonstrating all the salient features of Exporter and a script that uses this module. We also give a short rundown on @INC and finish with a note on using warnings and modules. Here is the module code. MyModule.pm package MyModule; use strict; use Exporter; use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS); $VERSION = 1.00; @ISA = qw(Exporter); @EXPORT = (); @EXPORT_OK = qw(func1 func2); %EXPORT_TAGS = ( DEFAULT => [qw(&func1)], Both => [qw(&func1 &func2)]); sub func1 { return reverse @_ } sub func2 { return map{ uc }@_ } 1; [download]

First we get a namespace by declaring a package name. This helps ensure our module's functions and variables remain separate from any script that uses it.

Use strict is a very good idea for modules to restrict the use of global variables. See use strict warnings and diagnostics or die for more details.

We need to use the Exporter module to export our functions from the MyModule:: namespace into the main:: namespace to make them available to scripts that 'use' MyModule.

We pacify strict with the use vars declaration of some variables. We can use an 'our' declaration in 5.6+

We now set a $VERSION number and make Exporter part of MyModule using the @ISA. See perlboot for all the gory details on what @ISA is or just use it as shown.

@EXPORT contains a list of functions that we export by default, in this case nothing. Generally the less you export by default using @EXPORT the better. This avoids accidentally clashing with functions defined in the script using the module. If a script wants a function let it ask.

@EXPORT_OK contains a list of functions that we export on demand so we export &func1 &func2 only if specifically requested to. Use this in preference to just blindly exporting functions via @EXPORT. You can also export variables like $CONFIG provided they are globals not lexicals scoped with my (read declare them with our or use vars).

%EXPORT_TAGS. For convenience we define two sets of export tags. The ':DEFAULT' tag exports only &func1; the ':Both' tag exports both &func1 &func2. This hash stores labels pointing to array references. In this case the arrays are anonymous.

We need the 1; at the end because when a module loads Perl checks to see that the module returns a true value to ensure it loaded OK. You could put any true value at the end (see Code::Police ) but 1 is the convention.

MyScript.pl (A simple script to use MyModule) #!/usr/bin/perl -w use strict; # you may need to set @INC here (see below) my @list = qw (J u s t ~ A n o t h e r ~ P e r l ~ H a c k e r !); # case 1 # use MyModule; # print func1(@list),"\n"; # print func2(@list),"\n"; # case 2 # use MyModule qw(&func1); # print func1(@list),"\n"; # print MyModule::func2(@list),"\n"; # case 3 # use MyModule qw(:DEFAULT); # print func1(@list),"\n"; # print func2(@list),"\n"; # case 4 # use MyModule qw(:Both); # print func1(@list),"\n"; # print func2(@list),"\n"; [download]

We use MyModule in MyScript.pl as shown. Uncomment the examples to see what happens. Just uncomment one at a time.

Case 1: Because our module exports nothing by default we get errors as &funct1 and &funct2 have not been exported thus do not exist in the main:: namespace of the script.

Case 2: This works OK. We ask our module to export the &func1 so we can use it. Although &func2 was not exported we reference it with its full package name so this works OK.

Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT.

Case 4: We specified the export of both our functions with the ':Both' thus this works.

A note on @INC

When you issue a use MyModule; directive perl searchs the @INC array for a module with the correct name. @INC usually contains:

/perl/lib 
/perl/site/lib
.

The . directory (dot dir) is the current working directory. CORE modules are installed under perl/lib whereas non-CORE modules install under perl/site/lib. You can add directories to the module search path in @INC like this:

BEGIN { push @INC, '/my/dir' } # or BEGIN { unshift @INC, '/my/dir' } # or use lib '/my/dir'; [download]

We need to use a BEGIN block to shift values into @INC at compile time as this is when perl checks for modules. If you wait until the script is comiled it is too late and perl will throw an exception saying it can't find MyModule in @INC... The difference between pushing a value and unshifting a value into @INC is that perl searches the @INC array for the module starting with the first dir in that array. Thus is you have a MyModule in /perl/lib/ and another in /perl/site/lib/ and another in ./ the one in /perl/lib will be found first and thus the one used. The use lib pragma effectively does the same as the BEGIN { unshift @INC, $dir } block - see perlman:lib:lib for full specifics.

What use Foo::Bar means

use Foo::Bar does not mean look for a module called "Foo::Bar.pm" in the @INC directories. It means search @INC for a *subdir* called "Foo" and a *module* called "Bar.pm".

Now once we have "use'd" a module its functions are available via the fully specified &PACKAGE::FUNCTION syntax. When we say &Foo::Bar::some_func we are refering to the *package name* not the (dir::)file name that we used in the use. This allows you to have many package names in one use'd file. In practice the names are usually the same.

use Warnings;

You should test your module with warnings enabled as this will pick up many subtle (and not so subtle :-) errors. You can activate warnings using the -w flag in the script you use to test the module. If you add use warnings to the module then your module will require Perl 5.6+ as this was not available before then. If you put $^W++ at the top of the module then you will globally enable warnings - this may break *other modules* a script may be using in addition to your module so is rather antisocial. An expert coder here called tye tests with warnings but does not include them directly in his/her modules.

Hope this explains how it works.

cheers

tachyon

Update

Fixed a typo and added a few comments. Thanks to John M. Dlugosz . Rewrote and restyled tute for compatibility with versions of Perl < 5.6 thanks to crazyinsomniac . Also thanks to tye for reminding me that $^W++ is globally scoped and a bit antisocial for a module.

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

John M. Dlugosz (Monsignor) on Aug 06, 2001 at 04:30 UTC

Re: Simple Module Tutorial

Very nice, getting everything into a short page like that. But, I have a few comments:

Are you sure you want to make $VERSION a float, rather than a v-string? And if so, illustrate the three-digit convention (e.g. 5.005_001 for version 5.5.1).

I'm also shocked that your pm file doesn't use strict !

I would also suggest adding a comment to the 1; line, saying that this means "loaded OK".

-- John

tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTC

Re: Re: Simple Module Tutorial


by tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTC

Thanks John I've updated the text a bit in line with your suggestions. Forgot the strict in the module! Oops it is back in its rightful place right at the top. I just used the simple $VERSION numbering because this is a simple tute :-) Here is an excerpt from the Exporter manpage for those interested.

Module Version Checking The Exporter module will convert an attempt to import a number from a module into a call to $module_name->require_version($value). This can be used to validate that the version of the module being used is greater than or equal to the required version. The Exporter module supplies a default require_version method which checks the value of $VERSION in the exporting module. Since the default require_version method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09. [download]

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC

Re: Re: Re: Simple Module Tutorial
by John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC Yea, I just posted a tutorial on VERSION.

For compatibility with mixing decimals and v-strings, the built-in UNIVERSAL::require_version uses three decimal digits per part.

If you have $MyModule::VERSION= 1.12; (a decimal number) and do a use MyModule 1.20.1 qw/bar/ , it will tell you that the module 1.120 and you asked for 1.020, so that's OK. You expected 1.20 to be greater than 1.12, not-OK.

-- John

tye (Sage) on Aug 06, 2001 at 22:09 UTC

(tye)Re: Simple Module Tutorial

$W++ will only give you run-time warnings and will affect other packages. Personally, I don't turn on warnings in modules that I write but I do make a point of testing them with warnings turned on (by putting "#!/usr/bin/perl -w" at the top of my test scripts).

- tye (but my friends call me "Tye")

tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTC

Re: (tye)Re: Simple Module Tutorial


by tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTC

Thanks I changed it from the lexically scoped 'use warnings;' so that this is applicable to versions < 5.6. but as it adds little value to the tutorial and has the unwanted side effects you point out I have just deleted it - saves a few lines of dubious value. I'll add a note on testing with warnings when I have a moment.

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

johnnywang (Priest) on Aug 09, 2004 at 22:58 UTC

Re: Simple Module Tutorial

Quite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered. Another already mentioned point is not to use @EXPORT too much. So my basic module is like the following (I have a emacs function to output this):

package MyModule; use strict; use Exporter qw(import); our $VERSION = 1.00; our @ISA = qw(Exporter); our @EXPORT_OK = qw(func1 func2); our %EXPORT_TAGS = ( DEFAULT => [qw(func1)], Both => [qw(func1 func2)]); sub func1 { return reverse @_ } sub func2 { return map{ uc }@_ } 1; [download] For those emacs users, here's the simple script to generate the skeleton: (defun perl-new-module () "Generate a skeleton source for a traditional perl module." (interactive) (setq var (split-string (read-from-minibuffer "Enter module name (eg. Web::Test): "nil nil nil nil nil nil) " ")) (setq name (car var)) (insert (format "package %s;\n\n" name)) (insert "use strict;\n\n") (insert "use Exporter qw(import);\n") (insert "our @ISA = qw(Exporter);\n") (insert "our @EXPORT_OK = qw();\n") (insert "our %EXPORT_TAGS = ();\n") (insert "our $VERSION = 1.00; \n\n") (insert "\n\n\n\n\n\n") (insert "1;") (insert "\n") (previous-line 6) (end-of-line) ) [download]

adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTC

Re^2: Simple Module Tutorial


by adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTC

Quite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered.

Some people prefer to use our rather than use vars (I'm one of them) - but the latter is not deprecated. Both do slightly different things and many people still prefer to use vars .

beretboy (Chaplain) on Aug 18, 2001 at 15:07 UTC

Re: Simple Module Tutorial

Exellent tutorial ++! I have never understood the writing of modules till now

"Sanity is the playground of the unimaginative" -Unknown

Codmate (Novice) on Sep 06, 2001 at 20:18 UTC

Re: Re: Simple Module Tutorial


by Codmate (Novice) on Sep 06, 2001 at 20:18 UTC

Fantastic stuff - I just converted a full script into a module (and added some bits) and it worked 1st time - WITH NO DEBUGGING REQUIRED (and yes - I am using strict)!!! I was very scared of modules before but now feel like I could write a hundred. Thanks very much for this - invaluable tutorial for a newbie like me :))))))

Jaap (Curate) on Jul 22, 2002 at 09:35 UTC

Re: Simple Module Tutorial

This is a nice tutorial tachyon. Are you considering writing a more advanced tutorial on modules (combined with OO)?

Especially, what a GOOD module looks like. Should we use carp, dynaloader and what not?

gawatkins (Monsignor) on Apr 10, 2003 at 11:44 UTC

Re: Simple Module Tutorial

Great Tutorial, It helped to clear up the muddy water created by my Perl Black Book .

Thanks again,

Greg W.

twotone (Beadle) on Oct 14, 2007 at 05:06 UTC

Re: Simple Module Tutorial

Great summary of module basics!

Here's a little code I came up with to add my module location to @INC (in a cgi environment) by dynamically determining the document root for the script. It works on the remote apache server and when testing locally in windows. It might be of some interest:

BEGIN { # get doc root from %ENV # implicitly declare file root path if %ENV not fount my $doc_root = $ENV{DOCUMENT_ROOT} || 'C:/Users/User/Documents/website/sites/mysite'; # change \ to / $doc_root =~ s/\\/\//g; # add module folder location $doc_root .= "/cgi-bin/cms/"; # add module location to @INC push(@INC,$doc_root); } [download]

bychan (Initiate) on Jan 28, 2008 at 08:45 UTC

Re^2: Simple Module Tutorial


by bychan (Initiate) on Jan 28, 2008 at 08:45 UTC

This tutorial is great. The only problem is, that I get the following result, if I comment out all the cases:

!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!

Shouldn't there be some error messages or warnings?

chexmix (Hermit) on Aug 12, 2008 at 13:24 UTC

Re: Simple Module Tutorial

I like this post very much, but the following is opaque to me for some reason:

" Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT."

I confess I still don't know what is going on here, and am wondering if someone can help me out.

The words "is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT" seem inaccurate to me, since when I look up at the code for MyModule.pm, I see the line

%EXPORT_TAGS = ( DEFAULT => qw(&func1) ,

and not the line

%EXPORT_TAGS = ( DEFAULT => \@EXPORT,

Is the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?

Thanks. I feel I am blanking on something obvious here, but just can't see it this rainy morning.

tye (Sage) on Aug 14, 2008 at 15:41 UTC

Re^2: Simple Module Tutorial (DEFAULT)


by tye (Sage) on Aug 14, 2008 at 15:41 UTC

Is the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?

Yes. Exporter.pm wants :DEFAULT to match @EXPORT so the module is incorrect in trying to define its own meaning for :DEFAULT. Based on what you've quoted, it appears that Exporter.pm forces this issue, but the more important point is that you shouldn't set $EXPORT_TAGS{DEFAULT} yourself.

- tye

sg (Pilgrim) on Feb 05, 2011 at 23:13 UTC

Re: Simple Module Tutorial

Thanks for the exposition; my inclination regarding a simple module is as follows:

MyModule.pm package MyModule; use strict; use warnings; use diagnostics; use Carp; our $VERSION = 1.08; sub see_me { my $foo = shift; print "\t\tDo you see this: $foo?\n"; } 1; __END__ last line of the module needs to be true; last line of the _file_ need not be true: 0; [download]

The above module is exercised by the following script:

exercise_my_module.pl

#!/c/opt/perl/bin/perl use strict; use warnings; use diagnostics; use Carp; use MyModule 1.05; #use MyModule 1.10; # will fail MyModule::see_me( 8 ); __END__ [download]

chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTC

Re^2: Simple Module Tutorial


by chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTC

Thank you for this post! It has gotten me past the first barrier of writing my own module. Thanks again! - chanslor

Anonymous Monk on Mar 09, 2010 at 12:21 UTC

Re: Simple Module Tutorial

Excellent representation of what seemed a tough nut to swallow. Thank you very much. Tanuj Bhargava

Anonymous Monk on Mar 07, 2011 at 15:12 UTC

Re: Simple Module Tutorial

Thanks to the writer for your trouble. But everyone seems to get it except me. I have tried to duplicate your results.

I have:

- MyScript.pl under /storage/username/PERL_SCRIPTS/dev

- Test.pm under /storage/username/local/perl/site/lib/Test/Test.pm (I just replaced MyModule.pm with Test.pm).

The module code is exactly the same. In MyScript.pl I have added

use lib '/storage/username/local/perl/site/lib'; <p>and typed in the first two cases.</p> <code>perl MyScript.pl [download] gives: Undefined subroutine &main::func1 called at MyScript.pl line 10

Line 10 is:

print func1(@list),"\n";

after typing "use Test;"

What am I missing here? Also, is the BEGIN command supposed to be used in the Perl script? It gives syntax errors when I try to use it.

Thanks in advance,

Gideon

toolic (Bishop) on Mar 07, 2011 at 15:25 UTC

Re^2: Simple Module Tutorial


by toolic (Bishop) on Mar 07, 2011 at 15:25 UTC

"Test" is a poor choice of a module name because there is a Core module of the same name ( Test ) which is part of the standard Perl distribution. Furthermore, since you placed your .pm file under a directory named "Test", you would need to type use Test::Test; . I strongly recommend you change the name of your module to something more unique in order to avoid this naming collision.

Anonymous Monk on Mar 07, 2011 at 16:11 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:11 UTC

Dear toolic,

Thank you for your reply. I now have changed all instances of "Test" with "MyModule" and also changed the name of the module. MyModule.pm is now situated on

/storage/username/local/perl/site/lib

and I use

use lib '/storage/username/local/perl/site/lib'; with the second case (as per the example): # case 2 use MyModule; print func1(@list),"\n"; print MyModule::func2(@list),"\n"; [download]

but I still get the same error: Undefined subroutine &main::func1 called at MyScript line 15.

Just to make sure I copied the module exactly from the example but to no avail. Interestingly, when I comment out the print func1 part, the line after that produces correct output. I hope that someone could point out to me where I am at fault.

Best regards,

Gideon

toolic (Bishop) on Mar 07, 2011 at 16:34 UTC

Re^4: Simple Module Tutorial
by toolic (Bishop) on Mar 07, 2011 at 16:34 UTC

Anonymous Monk on Mar 07, 2011 at 16:58 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:58 UTC

Hi toolic

For some reason I can't reply to your latest post but thanks a mil! I did copy the module exactly but not the script. Somehow I mixed up case 1 and case 2. I expected case 1 not to work but case 2 but instead of coding

use MyModule qw(&func1);

I simply used

use MyModule;

Thanks a lot for pointing it out, it seems to work now. I have learned quite a bit.

Best regards,

Gideon

Anonymous Monk on Nov 18, 2014 at 00:13 UTC

Re: Simple Module Tutorial

Thank you for this topic, it is very useful for a beginner. However i had a trouble with use of Module.

Anonymous Monk on Nov 18, 2014 at 01:06 UTC

Re^2: Simple Module Tutorial


by Anonymous Monk on Nov 18, 2014 at 01:06 UTC

You did not post any code

Anonymous Monk on Dec 16, 2014 at 23:03 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Dec 16, 2014 at 23:03 UTC Hello, I used exactly the same code than quoted at Re: Simple Module Tutorial by johnnywang on Aug 09, 2004 at 22:58 UTC I don't get trouble with the code, but it is just that once i used the module i can't modify function in it and see the effects. It looks like library are not updated. How could we do that ?
Replies are listed 'Best First'.

[Nov 23, 2017] A Perl array 'contains' example by Alvin Alexander

Jun 03, 2016 | alvinalexander.com

Perl array FAQ: How can I test to see if a Perl array already contains a given value? (Also written as, How do I search an array with the Perl grep function?)

I use the Perl grep function to see if a Perl array contains a given entry. For instance, in this Perl code:

if ( grep { $_ eq $clientAddress} @ip_addresses ) {
  # the array already contains this ip address; skip it this time
  next;
} else {
  # the array does not yet contain this ip address; add it
  push @ip_addresses, $clientAddress;
}

I'm testing to see if the Perl array "@ip_addresses" contains an entry given by the variable "$clientAddress".

Just use this Perl array search technique in an "if" clause, as shown, and then add whatever logic you want within your if and else statements. In this case, if the current IP address is not already in the array, I add it to the array in the "else" clause, but of course your logic will be unique.

An easier "Perl array contains" example

If it's easier to read without a variable in there, here's another example of this "Perl array contains" code:

if ( grep { $_ eq '192.168.1.100'} @ip_addresses )

if you'd like more details, I didn't realize it, but I have another good example out here in my " Perl grep array tutorial ." (It's pretty bad when you can't find things on your own website.)

[Nov 23, 2017] Simple Module Tutorial

Aug 06, 2001 | perlmonks.com
So you find the Perl docs on modules a bit confusing? OK here is the world's simplest Perl module demonstrating all the salient features of Exporter and a script that uses this module. We also give a short rundown on @INC and finish with a note on using warnings and modules. Here is the module code. MyModule.pm package MyModule; use strict; use Exporter; use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS); $VERSION = 1.00; @ISA = qw(Exporter); @EXPORT = (); @EXPORT_OK = qw(func1 func2); %EXPORT_TAGS = ( DEFAULT => [qw(&func1)], Both => [qw(&func1 &func2)]); sub func1 { return reverse @_ } sub func2 { return map{ uc }@_ } 1; [download]

First we get a namespace by declaring a package name. This helps ensure our module's functions and variables remain separate from any script that uses it.

Use strict is a very good idea for modules to restrict the use of global variables. See use strict warnings and diagnostics or die for more details.

We need to use the Exporter module to export our functions from the MyModule:: namespace into the main:: namespace to make them available to scripts that 'use' MyModule.

We pacify strict with the use vars declaration of some variables. We can use an 'our' declaration in 5.6+

We now set a $VERSION number and make Exporter part of MyModule using the @ISA. See perlboot for all the gory details on what @ISA is or just use it as shown.

@EXPORT contains a list of functions that we export by default, in this case nothing. Generally the less you export by default using @EXPORT the better. This avoids accidentally clashing with functions defined in the script using the module. If a script wants a function let it ask.

@EXPORT_OK contains a list of functions that we export on demand so we export &func1 &func2 only if specifically requested to. Use this in preference to just blindly exporting functions via @EXPORT. You can also export variables like $CONFIG provided they are globals not lexicals scoped with my (read declare them with our or use vars).

%EXPORT_TAGS. For convenience we define two sets of export tags. The ':DEFAULT' tag exports only &func1; the ':Both' tag exports both &func1 &func2. This hash stores labels pointing to array references. In this case the arrays are anonymous.

We need the 1; at the end because when a module loads Perl checks to see that the module returns a true value to ensure it loaded OK. You could put any true value at the end (see Code::Police ) but 1 is the convention.

MyScript.pl (A simple script to use MyModule) #!/usr/bin/perl -w use strict; # you may need to set @INC here (see below) my @list = qw (J u s t ~ A n o t h e r ~ P e r l ~ H a c k e r !); # case 1 # use MyModule; # print func1(@list),"\n"; # print func2(@list),"\n"; # case 2 # use MyModule qw(&func1); # print func1(@list),"\n"; # print MyModule::func2(@list),"\n"; # case 3 # use MyModule qw(:DEFAULT); # print func1(@list),"\n"; # print func2(@list),"\n"; # case 4 # use MyModule qw(:Both); # print func1(@list),"\n"; # print func2(@list),"\n"; [download]

We use MyModule in MyScript.pl as shown. Uncomment the examples to see what happens. Just uncomment one at a time.

Case 1: Because our module exports nothing by default we get errors as &funct1 and &funct2 have not been exported thus do not exist in the main:: namespace of the script.

Case 2: This works OK. We ask our module to export the &func1 so we can use it. Although &func2 was not exported we reference it with its full package name so this works OK.

Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT.

Case 4: We specified the export of both our functions with the ':Both' thus this works.

A note on @INC

When you issue a use MyModule; directive perl searchs the @INC array for a module with the correct name. @INC usually contains:

/perl/lib 
/perl/site/lib
.

The . directory (dot dir) is the current working directory. CORE modules are installed under perl/lib whereas non-CORE modules install under perl/site/lib. You can add directories to the module search path in @INC like this:

BEGIN { push @INC, '/my/dir' } # or BEGIN { unshift @INC, '/my/dir' } # or use lib '/my/dir'; [download]

We need to use a BEGIN block to shift values into @INC at compile time as this is when perl checks for modules. If you wait until the script is comiled it is too late and perl will throw an exception saying it can't find MyModule in @INC... The difference between pushing a value and unshifting a value into @INC is that perl searches the @INC array for the module starting with the first dir in that array. Thus is you have a MyModule in /perl/lib/ and another in /perl/site/lib/ and another in ./ the one in /perl/lib will be found first and thus the one used. The use lib pragma effectively does the same as the BEGIN { unshift @INC, $dir } block - see perlman:lib:lib for full specifics.

What use Foo::Bar means

use Foo::Bar does not mean look for a module called "Foo::Bar.pm" in the @INC directories. It means search @INC for a *subdir* called "Foo" and a *module* called "Bar.pm".

Now once we have "use'd" a module its functions are available via the fully specified &PACKAGE::FUNCTION syntax. When we say &Foo::Bar::some_func we are refering to the *package name* not the (dir::)file name that we used in the use. This allows you to have many package names in one use'd file. In practice the names are usually the same.

use Warnings;

You should test your module with warnings enabled as this will pick up many subtle (and not so subtle :-) errors. You can activate warnings using the -w flag in the script you use to test the module. If you add use warnings to the module then your module will require Perl 5.6+ as this was not available before then. If you put $^W++ at the top of the module then you will globally enable warnings - this may break *other modules* a script may be using in addition to your module so is rather antisocial. An expert coder here called tye tests with warnings but does not include them directly in his/her modules.

Hope this explains how it works.

cheers

tachyon

Update

Fixed a typo and added a few comments. Thanks to John M. Dlugosz . Rewrote and restyled tute for compatibility with versions of Perl < 5.6 thanks to crazyinsomniac . Also thanks to tye for reminding me that $^W++ is globally scoped and a bit antisocial for a module.

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

John M. Dlugosz (Monsignor) on Aug 06, 2001 at 04:30 UTC

Re: Simple Module Tutorial

Very nice, getting everything into a short page like that. But, I have a few comments:

Are you sure you want to make $VERSION a float, rather than a v-string? And if so, illustrate the three-digit convention (e.g. 5.005_001 for version 5.5.1).

I'm also shocked that your pm file doesn't use strict !

I would also suggest adding a comment to the 1; line, saying that this means "loaded OK".

-- John

tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTC

Re: Re: Simple Module Tutorial


by tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTC

Thanks John I've updated the text a bit in line with your suggestions. Forgot the strict in the module! Oops it is back in its rightful place right at the top. I just used the simple $VERSION numbering because this is a simple tute :-) Here is an excerpt from the Exporter manpage for those interested.

Module Version Checking The Exporter module will convert an attempt to import a number from a module into a call to $module_name->require_version($value). This can be used to validate that the version of the module being used is greater than or equal to the required version. The Exporter module supplies a default require_version method which checks the value of $VERSION in the exporting module. Since the default require_version method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09. [download]

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC

Re: Re: Re: Simple Module Tutorial
by John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC Yea, I just posted a tutorial on VERSION.

For compatibility with mixing decimals and v-strings, the built-in UNIVERSAL::require_version uses three decimal digits per part.

If you have $MyModule::VERSION= 1.12; (a decimal number) and do a use MyModule 1.20.1 qw/bar/ , it will tell you that the module 1.120 and you asked for 1.020, so that's OK. You expected 1.20 to be greater than 1.12, not-OK.

-- John

tye (Sage) on Aug 06, 2001 at 22:09 UTC

(tye)Re: Simple Module Tutorial

$W++ will only give you run-time warnings and will affect other packages. Personally, I don't turn on warnings in modules that I write but I do make a point of testing them with warnings turned on (by putting "#!/usr/bin/perl -w" at the top of my test scripts).

- tye (but my friends call me "Tye")

tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTC

Re: (tye)Re: Simple Module Tutorial


by tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTC

Thanks I changed it from the lexically scoped 'use warnings;' so that this is applicable to versions < 5.6. but as it adds little value to the tutorial and has the unwanted side effects you point out I have just deleted it - saves a few lines of dubious value. I'll add a note on testing with warnings when I have a moment.

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

johnnywang (Priest) on Aug 09, 2004 at 22:58 UTC

Re: Simple Module Tutorial

Quite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered. Another already mentioned point is not to use @EXPORT too much. So my basic module is like the following (I have a emacs function to output this):

package MyModule; use strict; use Exporter qw(import); our $VERSION = 1.00; our @ISA = qw(Exporter); our @EXPORT_OK = qw(func1 func2); our %EXPORT_TAGS = ( DEFAULT => [qw(func1)], Both => [qw(func1 func2)]); sub func1 { return reverse @_ } sub func2 { return map{ uc }@_ } 1; [download] For those emacs users, here's the simple script to generate the skeleton: (defun perl-new-module () "Generate a skeleton source for a traditional perl module." (interactive) (setq var (split-string (read-from-minibuffer "Enter module name (eg. Web::Test): "nil nil nil nil nil nil) " ")) (setq name (car var)) (insert (format "package %s;\n\n" name)) (insert "use strict;\n\n") (insert "use Exporter qw(import);\n") (insert "our @ISA = qw(Exporter);\n") (insert "our @EXPORT_OK = qw();\n") (insert "our %EXPORT_TAGS = ();\n") (insert "our $VERSION = 1.00; \n\n") (insert "\n\n\n\n\n\n") (insert "1;") (insert "\n") (previous-line 6) (end-of-line) ) [download]

adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTC

Re^2: Simple Module Tutorial


by adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTC

Quite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered.

Some people prefer to use our rather than use vars (I'm one of them) - but the latter is not deprecated. Both do slightly different things and many people still prefer to use vars .

beretboy (Chaplain) on Aug 18, 2001 at 15:07 UTC

Re: Simple Module Tutorial

Exellent tutorial ++! I have never understood the writing of modules till now

"Sanity is the playground of the unimaginative" -Unknown

Codmate (Novice) on Sep 06, 2001 at 20:18 UTC

Re: Re: Simple Module Tutorial


by Codmate (Novice) on Sep 06, 2001 at 20:18 UTC

Fantastic stuff - I just converted a full script into a module (and added some bits) and it worked 1st time - WITH NO DEBUGGING REQUIRED (and yes - I am using strict)!!! I was very scared of modules before but now feel like I could write a hundred. Thanks very much for this - invaluable tutorial for a newbie like me :))))))

Jaap (Curate) on Jul 22, 2002 at 09:35 UTC

Re: Simple Module Tutorial

This is a nice tutorial tachyon. Are you considering writing a more advanced tutorial on modules (combined with OO)?

Especially, what a GOOD module looks like. Should we use carp, dynaloader and what not?

gawatkins (Monsignor) on Apr 10, 2003 at 11:44 UTC

Re: Simple Module Tutorial

Great Tutorial, It helped to clear up the muddy water created by my Perl Black Book .

Thanks again,

Greg W.

twotone (Beadle) on Oct 14, 2007 at 05:06 UTC

Re: Simple Module Tutorial

Great summary of module basics!

Here's a little code I came up with to add my module location to @INC (in a cgi environment) by dynamically determining the document root for the script. It works on the remote apache server and when testing locally in windows. It might be of some interest:

BEGIN { # get doc root from %ENV # implicitly declare file root path if %ENV not fount my $doc_root = $ENV{DOCUMENT_ROOT} || 'C:/Users/User/Documents/website/sites/mysite'; # change \ to / $doc_root =~ s/\\/\//g; # add module folder location $doc_root .= "/cgi-bin/cms/"; # add module location to @INC push(@INC,$doc_root); } [download]

bychan (Initiate) on Jan 28, 2008 at 08:45 UTC

Re^2: Simple Module Tutorial


by bychan (Initiate) on Jan 28, 2008 at 08:45 UTC

This tutorial is great. The only problem is, that I get the following result, if I comment out all the cases:

!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!

Shouldn't there be some error messages or warnings?

chexmix (Hermit) on Aug 12, 2008 at 13:24 UTC

Re: Simple Module Tutorial

I like this post very much, but the following is opaque to me for some reason:

" Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT."

I confess I still don't know what is going on here, and am wondering if someone can help me out.

The words "is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT" seem inaccurate to me, since when I look up at the code for MyModule.pm, I see the line

%EXPORT_TAGS = ( DEFAULT => qw(&func1) ,

and not the line

%EXPORT_TAGS = ( DEFAULT => \@EXPORT,

Is the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?

Thanks. I feel I am blanking on something obvious here, but just can't see it this rainy morning.

tye (Sage) on Aug 14, 2008 at 15:41 UTC

Re^2: Simple Module Tutorial (DEFAULT)


by tye (Sage) on Aug 14, 2008 at 15:41 UTC

Is the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?

Yes. Exporter.pm wants :DEFAULT to match @EXPORT so the module is incorrect in trying to define its own meaning for :DEFAULT. Based on what you've quoted, it appears that Exporter.pm forces this issue, but the more important point is that you shouldn't set $EXPORT_TAGS{DEFAULT} yourself.

- tye

sg (Pilgrim) on Feb 05, 2011 at 23:13 UTC

Re: Simple Module Tutorial

Thanks for the exposition; my inclination regarding a simple module is as follows:

MyModule.pm package MyModule; use strict; use warnings; use diagnostics; use Carp; our $VERSION = 1.08; sub see_me { my $foo = shift; print "\t\tDo you see this: $foo?\n"; } 1; __END__ last line of the module needs to be true; last line of the _file_ need not be true: 0; [download]

The above module is exercised by the following script:

exercise_my_module.pl

#!/c/opt/perl/bin/perl use strict; use warnings; use diagnostics; use Carp; use MyModule 1.05; #use MyModule 1.10; # will fail MyModule::see_me( 8 ); __END__ [download]

chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTC

Re^2: Simple Module Tutorial


by chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTC

Thank you for this post! It has gotten me past the first barrier of writing my own module. Thanks again! - chanslor

Anonymous Monk on Mar 09, 2010 at 12:21 UTC

Re: Simple Module Tutorial

Excellent representation of what seemed a tough nut to swallow. Thank you very much. Tanuj Bhargava

Anonymous Monk on Mar 07, 2011 at 15:12 UTC

Re: Simple Module Tutorial

Thanks to the writer for your trouble. But everyone seems to get it except me. I have tried to duplicate your results.

I have:

- MyScript.pl under /storage/username/PERL_SCRIPTS/dev

- Test.pm under /storage/username/local/perl/site/lib/Test/Test.pm (I just replaced MyModule.pm with Test.pm).

The module code is exactly the same. In MyScript.pl I have added

use lib '/storage/username/local/perl/site/lib'; <p>and typed in the first two cases.</p> <code>perl MyScript.pl [download] gives: Undefined subroutine &main::func1 called at MyScript.pl line 10

Line 10 is:

print func1(@list),"\n";

after typing "use Test;"

What am I missing here? Also, is the BEGIN command supposed to be used in the Perl script? It gives syntax errors when I try to use it.

Thanks in advance,

Gideon

toolic (Bishop) on Mar 07, 2011 at 15:25 UTC

Re^2: Simple Module Tutorial


by toolic (Bishop) on Mar 07, 2011 at 15:25 UTC

"Test" is a poor choice of a module name because there is a Core module of the same name ( Test ) which is part of the standard Perl distribution. Furthermore, since you placed your .pm file under a directory named "Test", you would need to type use Test::Test; . I strongly recommend you change the name of your module to something more unique in order to avoid this naming collision.

Anonymous Monk on Mar 07, 2011 at 16:11 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:11 UTC

Dear toolic,

Thank you for your reply. I now have changed all instances of "Test" with "MyModule" and also changed the name of the module. MyModule.pm is now situated on

/storage/username/local/perl/site/lib

and I use

use lib '/storage/username/local/perl/site/lib'; with the second case (as per the example): # case 2 use MyModule; print func1(@list),"\n"; print MyModule::func2(@list),"\n"; [download]

but I still get the same error: Undefined subroutine &main::func1 called at MyScript line 15.

Just to make sure I copied the module exactly from the example but to no avail. Interestingly, when I comment out the print func1 part, the line after that produces correct output. I hope that someone could point out to me where I am at fault.

Best regards,

Gideon

toolic (Bishop) on Mar 07, 2011 at 16:34 UTC

Re^4: Simple Module Tutorial
by toolic (Bishop) on Mar 07, 2011 at 16:34 UTC

Anonymous Monk on Mar 07, 2011 at 16:58 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:58 UTC

Hi toolic

For some reason I can't reply to your latest post but thanks a mil! I did copy the module exactly but not the script. Somehow I mixed up case 1 and case 2. I expected case 1 not to work but case 2 but instead of coding

use MyModule qw(&func1);

I simply used

use MyModule;

Thanks a lot for pointing it out, it seems to work now. I have learned quite a bit.

Best regards,

Gideon

Anonymous Monk on Nov 18, 2014 at 00:13 UTC

Re: Simple Module Tutorial

Thank you for this topic, it is very useful for a beginner. However i had a trouble with use of Module.

Anonymous Monk on Nov 18, 2014 at 01:06 UTC

Re^2: Simple Module Tutorial


by Anonymous Monk on Nov 18, 2014 at 01:06 UTC

You did not post any code

Anonymous Monk on Dec 16, 2014 at 23:03 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Dec 16, 2014 at 23:03 UTC Hello, I used exactly the same code than quoted at Re: Simple Module Tutorial by johnnywang on Aug 09, 2004 at 22:58 UTC I don't get trouble with the code, but it is just that once i used the module i can't modify function in it and see the effects. It looks like library are not updated. How could we do that ?
Replies are listed 'Best First'.

[Nov 23, 2017] A Perl array 'contains' example by Alvin Alexander

June 3, 2016 | alvinalexander.com

Perl array FAQ: How can I test to see if a Perl array already contains a given value? (Also written as, How do I search an array with the Perl grep function?)

I use the Perl grep function to see if a Perl array contains a given entry. For instance, in this Perl code:

if ( grep { $_ eq $clientAddress} @ip_addresses ) {
  # the array already contains this ip address; skip it this time
  next;
} else {
  # the array does not yet contain this ip address; add it
  push @ip_addresses, $clientAddress;
}

I'm testing to see if the Perl array "@ip_addresses" contains an entry given by the variable "$clientAddress".

Just use this Perl array search technique in an "if" clause, as shown, and then add whatever logic you want within your if and else statements. In this case, if the current IP address is not already in the array, I add it to the array in the "else" clause, but of course your logic will be unique.

An easier "Perl array contains" example

If it's easier to read without a variable in there, here's another example of this "Perl array contains" code:

if ( grep { $_ eq '192.168.1.100'} @ip_addresses )

if you'd like more details, I didn't realize it, but I have another good example out here in my " Perl grep array tutorial ." (It's pretty bad when you can't find things on your own website.)

[Nov 22, 2017] edited yesterday

Notable quotes:
"... Comment on possible multiple conflicting options ..."
"... Higher-Order Perl ..."
Nov 22, 2017 | stackoverflow.com

down vote favorite

Speeddymon ,2 days ago

I've been reading up on dispatch tables and I get the general idea of how they work, but I'm having some trouble taking what I see online and applying the concept to some code I originally wrote as an ugly mess of if-elsif-else statements.

I have options parsing configured by using GetOpt::Long , and in turn, those options set a value in the %OPTIONS hash, depending on the option used.

Taking the below code as an example... ( UPDATED WITH MORE DETAIL

use     5.008008;
use     strict;
use     warnings;
use     File::Basename qw(basename);
use     Getopt::Long qw(HelpMessage VersionMessage :config posix_default require_order no_ignore_case auto_version auto_help);

my $EMPTY      => q{};

sub usage
{
    my $PROG = basename($0);
    print {*STDERR} $_ for @_;
    print {*STDERR} "Try $PROG --help for more information.\n";
    exit(1);
}

sub process_args
{
    my %OPTIONS;

    $OPTIONS{host}              = $EMPTY;
    $OPTIONS{bash}              = 0;
    $OPTIONS{nic}               = 0;
    $OPTIONS{nicName}           = $EMPTY;
    $OPTIONS{console}           = 0;
    $OPTIONS{virtual}           = 0;
    $OPTIONS{cmdb}              = 0;
    $OPTIONS{policyid}          = 0;
    $OPTIONS{showcompliant}     = 0;
    $OPTIONS{backup}            = 0;
    $OPTIONS{backuphistory}     = 0;
    $OPTIONS{page}              = $EMPTY;

    GetOptions
      (
        'host|h=s'              => \$OPTIONS{host}               ,
        'use-bash-script'       => \$OPTIONS{bash}               ,
        'remote-console|r!'     => \$OPTIONS{console}            ,
        'virtual-console|v!'    => \$OPTIONS{virtual}            ,
        'nic|n!'                => \$OPTIONS{nic}                ,
        'nic-name|m=s'          => \$OPTIONS{nicName}            ,
        'cmdb|d!'               => \$OPTIONS{cmdb}               ,
        'policy|p=i'            => \$OPTIONS{policyid}           ,
        'show-compliant|c!'     => \$OPTIONS{showcompliant}      ,
        'backup|b!'             => \$OPTIONS{backup}             ,
        'backup-history|s!'     => \$OPTIONS{backuphistory}      ,
        'page|g=s'              => \$OPTIONS{page}               ,
        'help'                  => sub      { HelpMessage(-exitval => 0, -verbose ->1)     },
        'version'               => sub      { VersionMessage()  },
      ) or usage;

    if ($OPTIONS{host} eq $EMPTY)
    {
        print {*STDERR} "ERROR: Must specify a host with -h flag\n";
        HelpMessage;
    }

    sanity_check_options(\%OPTIONS);

    # Parse anything else on the command line and throw usage
    for (@ARGV)
    {
        warn "Unknown argument: $_\n";
        HelpMessage;
    }

    return {%OPTIONS};
}

sub sanity_check_options
{
    my $OPTIONS     = shift;

    if (($OPTIONS->{console}) and ($OPTIONS->{virtual}))
    {
        print "ERROR: Cannot use flags -r and -v together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -r and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flags -r and -b together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -r and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{virtual}) and ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flags -v and -b together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{virtual}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -v and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{virtual}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -v and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{backup}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -b and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{backup}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -b and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{nic}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -n and -d together\n";
        HelpMessage;
    }

    if (($OPTIONS->{policyid} != 0) and not ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flag -p without also specifying -d\n";
        HelpMessage;
    }

    if (($OPTIONS->{showcompliant}) and not ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flag -c without also specifying -d\n";
        HelpMessage;
    }

    if (($OPTIONS->{backuphistory}) and not ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flag -s without also specifying -b\n";
        HelpMessage;
    }

    if (($OPTIONS->{nicName}) and not ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flag -m without also specifying -n\n";
        HelpMessage;
    }

    return %{$OPTIONS};
}

I'd like to turn the above code into a dispatch table, but can't figure out how to do it.

Any help is appreciated.

Jim Garrison ,2 days ago

Are the sets of conflicting options always pairs? Can you have situations where options a , b , and c cannot occur together but any two are OK? Before you can pick a representation you need to be sure your model can handle the logic you need in a general way. This is not an easy problem. – Jim Garrison 2 days ago

simbabque ,yesterday

Don't use English, it's horribly slow and makes your code harder to read. – simbabque yesterday

Speeddymon ,yesterday

Removed English module and changed $ARG / @ARG to $_ / @_ Added $EMPTY as I forgot I had it defined globally. – Speeddymon yesterday

Speeddymon ,yesterday

@JimGarrison -- you are correct. The if-elsif-else does not explicitly account for 3 options that conflict (though it does account for that implicitly) As an example, using -h is required with all of the other options. But, using -h , -r , v , all together is not allowed, while -h , -r , and -d is allowed. – Speeddymon yesterday

ikegami ,yesterday

Since the host must be provided, it should be an argument, not an option. – ikegami yesterday

zdim ,2 days ago

I am not sure how a dispatch table would help since you need to go through pair-wise combinations of specific possibilities, and thus cannot trigger a suitable action by one lookup.

Here is another way to organize it

use List::MoreUtils 'firstval';

sub sanity_check_options
{
    my ($OPTIONS, $opt_excl) = @_;

    # Check each of 'opt_excl' against all other for ConFLict
    my @excl = sort keys %$opt_excl;
    while (my $eo = shift @excl) 
    {
        if (my $cfl = firstval { $OPTIONS->{$eo} and $OPTIONS->{$_} } @excl) 
        {
            say "Can't use -$opt_excl->{$eo} and -$opt_excl->{$cfl} together";
            HelpMessage();
            last;
        }
    }

    # Go through specific checks on
    # policyid, showcompliant, backuphistory, and nicName
    ...
    return 1;  # or some measure of whether there were errors
}

# Mutually exclusive options
my %opt_excl = (
    console => 'r', virtual => 'v', cmdb => 'c', backup => 'b', nic => 'n'
); 

sanity_check_options(\%OPTIONS, \%opt_excl);

This checks all options listed in %opt_excl against each other for conflict, removing the segments of elsif involving the (five) options that are mutually exclusive. It uses List::MoreUtils::firstval . The few other specific invocations are best checked one by one.

There is no use of returning $OPTIONS since it is passed as reference so any changes apply to the original structure (while it's not meant to be changed either). Perhaps you can keep track of whether there were errors and return that if it can be used in the caller, or just return 1 .

This addresses the long elsif chain as asked, and doesn't go into the rest of code. Here is one comment though: There is no need for {%OPTIONS} , which copies the hash in order to create an anonymous one; just use return \%OPTIONS;


Comment on possible multiple conflicting options

This answer as it stands does not print all conflicting options that have been used if there are more than two, as raised by ikegami in comments; it does catch any conflicts so that the run is aborted.

The code is readily adjusted for this. Instead of the code in the if block either

However, one is expected to know of allowed invocations and any listing of conflicts is a courtesy to the forgetful user (or a debugging aid); a usage message is printed as well anyway.

Given the high number of conflicting options the usage message should contain a prominent note on this. Also consider that so many conflicting options may indicate a design flaw.

Finally, this code fully relies on the fact that this processing goes once per run and operates with a handful of options; thus it is not concerned with efficiency and freely uses ancillary data structures.

Speeddymon ,yesterday

Updated the question to clarify. – Speeddymon yesterday

zdim ,yesterday

@Speeddymon Thank you, updated. This brings together checks of those five options which can't go one with another. The remaining few I leave to be checked one by one; "encoding" one or two possibilities in some all-encompassing system would just increase complexity (and may end up less readable). – zdim yesterday

zdim ,yesterday

@Speeddymon Added the missing include, use List::MoreUtils 'firstval' . Edited a little in the meanwhile, as well. – zdim yesterday

Speeddymon ,yesterday

Thank you for the easy to follow example. I went with yours as it was the clearest and contained the least duplicate code. – Speeddymon yesterday

ikegami ,23 hours ago

@Speeddymon, Apparently, it's not clear as you think since you didn't realize if doesn't work. It doesn't mention the error of using -r and -c together if -b is also provided. And why is a hash being used at all? Wasteful and needlessly complex. – ikegami 23 hours ago

simbabque ,yesterday

You can use a dispatch table if there are a lot of options. I would build that table programmatically. It might not be the best option here, but it works and the configuration is more readable than your elsif construct.
use strict;
use warnings;
use Ref::Util::XS 'is_arrayref';    # or Ref::Util

sub create_key {
    my $input = shift;

    # this would come from somewhere else, probably the Getopt config
    my @opts = qw( host bash nic nicName console virtual cmdb
        policyid showcompliant backup backuphistory page );

    # this is to cover the configuration with easier syntax
    $input = { map { $_ => 1 } @{$input} }
        if is_arrayref($input);

    # options are always prefilled with false values
    return join q{}, map { $input->{$_} ? 1 : 0 }
        sort @opts;
}

my %forbidden_combinations = (
    map { create_key( $_->[0] ) => $_->[1] } (
        [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ],
        [ [qw( console cmdb )]    => q{Cannot use flags -r and -d together} ],
        [ [qw( console backup )]  => q{Cannot use flags -r and -b together} ],
        [ [qw( console nic )]     => q{Cannot use flags -r and -n together} ],
    )
);

p %forbidden_combinations; # from Data::Printer

The output of the p function is the dispatch table.

{
    00101   "Cannot use flags -r and -v together",
    00110   "Cannot use flags -r and -n together",
    01100   "Cannot use flags -r and -d together",
    10100   "Cannot use flags -r and -b together"
}

As you can see, we've sorted all the options ascii-betically to use them as keys. That way, you could in theory build all kinds of combinations like exclusive options.

Let's take a look at the configuration itself.

my %forbidden_combinations = (
    map { create_key( $_->[0] ) => $_->[1] } (
        [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ],
        # ...
    )
);

We use a list of array references. Each entry is on one line and contains two pieces of information. Using the fat comma => makes it easy to read. The first part, which is much like a key in a hash, is the combination. It's a list of fields that should not occur together. The second element in the array ref is the error message. I've removed all the recurring elements, like the newline, to make it easier to change how and where the error can be displayed.

The map around this list of combination configuration runs the options through our create_key function, which translates it to a simple bitmap-style string. We assign all of it to a hash of that map and the error message.

Inside create_key , we check if it was called with an array reference as its argument. If that's the case, the call was for building the table, and we convert it to a hash reference so we have a proper map to look stuff up in. We know that the %OPTIONS always contains all the keys that exist, and that those are pre-filled with values that all evaluate to false . We can harness that convert the truthiness of those values to 1 or 0 , which then builds our key.

We will see in a moment why that is useful.

Now how do we use this?

sub HelpMessage { exit; }; # as a placeholder

# set up OPTIONS
my %OPTIONS = (
    host          => q{},
    bash          => 0,
    nic           => 0,
    nicName       => q{},
    console       => 0,
    virtual       => 0,
    cmdb          => 0,
    policyid      => 0,
    showcompliant => 0,
    backup        => 0,
    backuphistory => 0,
    page          => q{},
);

# read options with Getopt::Long ...
$OPTIONS{console} = $OPTIONS{virtual} = 1;

# ... and check for wrong invocations
if ( exists $forbidden_combinations{ my $key = create_key($OPTIONS) } ) {
    warn "ERROR: $forbidden_combinations{$key}\n";
    HelpMessage;
}

All we need to do now is get the $OPTIONS hash reference from Getopt::Long, and pass it through our create_key function to turn it into the map string. Then we can simply see if that key exists in our %forbidden_combinations dispatch table and show the corresponding error message.


Advantages of this approach

If you want to add more parameters, all you need to do is include them in @opts . In a full implementation that would probably be auto-generated from the config for the Getopt call. The keys will change under the hood, but since that is abstracted away you don't have to care.

Furthermore, this is easy to read. The create_key aside, the actual dispatch table syntax is quite concise and even has documentary character.

Disadvantages of this approach

There is a lot of programmatic generation going on for just a single call. It's certainly not the most efficient way to do it.


To take this further, you can write functions that auto-generate entries for certain scenarios.

I suggest you take a look at the second chapter in Mark Jason Dominus' excellent book Higher-Order Perl , which is available for free as a PDF.

Speeddymon ,yesterday

Thank you for the detailed answer. I've updated the question to help clarify how the $OPTIONS hash is setup. Can your example work within the bounds of what I have already, or should I rewrite the whole thing from scratch? – Speeddymon yesterday

simbabque ,yesterday

@Speeddymon yeah, that should work. I see you've got %OPTIONS , and it is always pre-set with values. That's going to be interesting. Let me try. – simbabque yesterday

Speeddymon ,yesterday

Speaking of the HOP book... That was actually what I was using to try to learn and where I was having trouble in applying the concept to my code. :-) I couldn't find a PDF version before, so thank you for the link! – Speeddymon yesterday

simbabque ,yesterday

@Speeddymon I've updated the answer and changed it to match your updated code. I suggest you read the diff first. What I don't like about it yet is that the possible keys are there twice, but that can be solved with some more trickery. I think that would blow up the answer even more, so I didn't do that. – simbabque yesterday

ikegami ,23 hours ago

Doesn't detect the case when -r , -v and -b are provided as an error. – ikegami 23 hours ago

ikegami ,2 days ago

You shouldn't be using elsif here because multiple condition could be true. And since multiple conditions could be true, a dispatch table can't be used. Your code can still be simplified greatly.
my @errors;

push @errors, "ERROR: Host must be provided\n"
   if !defined($OPTIONS{host});

my @conflicting =
   map { my ($opt, $flag) = @$_; $OPTIONS->{$opt} ? $flag : () }
      [ 'console', '-r' ],
      [ 'virtual', '-v' ],
      [ 'cmdb',    '-d' ],
      [ 'backup',  '-b' ],
      [ 'nic',     '-n' ];

push @errors, "ERROR: Can only use one the following flags at a time: @conflicting\n"
   if @conflicting > 1;

push @errors, "ERROR: Can't use flag -p without also specifying -d\n"
   if defined($OPTIONS->{policyid}) && !$OPTIONS->{cmdb};

push @errors, "ERROR: Can't use flag -c without also specifying -d\n"
   if $OPTIONS->{showcompliant} && !$OPTIONS->{cmdb};

push @errors, "ERROR: Can't use flag -s without also specifying -b\n"
   if $OPTIONS->{backuphistory} && !$OPTIONS->{backup};

push @errors, "ERROR: Can't use flag -m without also specifying -n\n"
   if defined($OPTIONS->{nicName}) && !$OPTIONS->{nic};

push @errors, "ERROR: Incorrect number of arguments\n"
   if @ARGV;

usage(@errors) if @errors;

Note that the above fixes numerous errors in your code.


Help vs Usage Error

Calling HelpMessage indifferently in both situations is therefore incorrect.

Create the following sub named usage to use (without arguments) when GetOptions returns false, and with an error message when some other usage error occurs:

use File::Basename qw( basename );

sub usage {
   my $prog = basename($0);
   print STDERR $_ for @_;
   print STDERR "Try '$prog --help' for more information.\n";
   exit(1);
}

Keep using HelpMessage in response to --help , but the defaults for the arguments are not appropriate for --help . You should use the following:

'help' => sub { HelpMessage( -exitval => 0, -verbose => 1 ) },

Speeddymon ,yesterday

I wondered if it would be impossible because of multiple conditions being true, but based on other answers, it seems that it is possible to still build a table and compare... – Speeddymon yesterday

ikegami ,yesterday

What are you talking about? No answer used a dispatch table. All the answers (including mine) used a ( for or map ) loop that performs as many checks as there are conditions. The points of a dispatch table is to do a single check no matter how many conditions there are. Since all conditions can be true, you need to check all conditions, so a dispatch table is impossible by definition. (And that's without even mentioning that the value of a dispatch table should be a code reference or similar (something to dispatch to).) – ikegami yesterday

ikegami ,yesterday

The difference between mine and the others is that mine avoids using an inefficient unordered hash and uses an efficient ordered list instead. (You could place the list in an array if you prefer.) – ikegami yesterday

ikegami ,yesterday

Updated to match updated question. That fact that none of the other answers can be extended for your updated question proves my pointthat trying to put everything into one loop or table just makes things less flexible, longer and more complex. – ikegami yesterday

Speeddymon ,yesterday

In response to the "help" tip -- HelpMessage is defined by GetOpt::Long and reads from the PODs at the end of the file. – Speeddymon yesterday

[Nov 22, 2017] Bitwise operators supported by Perl language

Nov 18, 2017 | www.tutorialspoint.com
Bitwise operator works on bits and perform bit by bit operation. Assume if $a = 60; and $b = 13; Now in binary format they will be as follows − $a = 0011 1100 $b = 0000 1101 ----------------- $a&$b = 0000 1100 $a|$b = 0011 1101 $a^$b = 0011 0001 ~$a = 1100 0011 There are following Bitwise operators supported by Perl language, assume if $a = 60; and $b = 13
S.No. Operator & Description
1 &

Binary AND Operator copies a bit to the result if it exists in both operands.

Example − ($a & $b) will give 12 which is 0000 1100

2 |

Binary OR Operator copies a bit if it exists in eather operand.

Example − ($a | $b) will give 61 which is 0011 1101

3 ^

Binary XOR Operator copies the bit if it is set in one operand but not both.

Example − ($a ^ $b) will give 49 which is 0011 0001

4 ~

Binary Ones Complement Operator is unary and has the efect of 'flipping' bits.

Example − (~$a ) will give -61 which is 1100 0011 in 2's complement form due to a signed binary number.

5 <<

Binary Left Shift Operator. The left operands value is moved left by the number of bits specified by the right operand.

Example − $a << 2 will give 240 which is 1111 0000

6 >>

Binary Right Shift Operator. The left operands value is moved right by the number of bits specified by the right operand.

Example − $a >> 2 will give 15 which is 0000 1111

Example

Try the following example to understand all the bitwise operators available in Perl. Copy and paste the following Perl program in test.pl file and execute this program.

#!/usr/local/bin/perl

use integer;
 
$a = 60;
$b = 13;

print "Value of \$a = $a and value of \$b = $b\n";

$c = $a & $b;
print "Value of \$a & \$b = $c\n";

$c = $a | $b;
print "Value of \$a | \$b = $c\n";

$c = $a ^ $b;
print "Value of \$a ^ \$b = $c\n";

$c = ~$a;
print "Value of ~\$a = $c\n";

$c = $a << 2;
print "Value of \$a << 2 = $c\n";

$c = $a >> 2;
print "Value of \$a >> 2 = $c\n";

When the above code is executed, it produces the following result −

Value of $a = 60 and value of $b = 13
Value of $a & $b = 12
Value of $a | $b = 61
Value of $a ^ $b = 49
Value of ~$a = -61
Value of $a << 2 = 240
Value of $a >> 2 = 15

[Nov 22, 2017] perl - How can I also get an element's index when I grep through an array - Stack Overflow

Nov 22, 2017 | stackoverflow.com

Learn more up vote down vote favorite

Geo ,Jun 10, 2010 at 16:39

Let's say I have this list:
my @list = qw(one two three four five);

and I want to grab all the elements containing o . I'd have this:

my @containing_o = grep { /o/ } @list;

But what would I have to do to also receive an index, or to be able to access the index in grep 's body?

,

my @index_containing_o = grep { $list[$_] =~ /o/ } 0..$#list;  # ==> (0,1,3)

my %hash_of_containing_o = map { $list[$_]=~/o/?($list[$_]=>$_):() } 0..$#list
            # ==> ( 'one' => 0, 'two' => 1, 'four' => 3 )

[Nov 22, 2017] Perl grep array FAQ - How to search an array-list of strings alvinalexander.com

Nov 22, 2017 | alvinalexander.com

Perl grep array FAQ - How to search an array/list of strings By Alvin Alexander. Last updated: June 3 2016 Perl "grep array" FAQ: Can you demonstrate a Perl grep array example? (Related: Can you demonstrate how to search a Perl array?)

A very cool thing about Perl is that you can search lists (arrays) with the Perl grep function. This makes it very easy to find things in large lists -- without having to write your own Perl for/foreach loops.

A simple Perl grep array example (Perl array search)

Here's a simple Perl array grep example. First I create a small string array (pizza toppings), and then search the Perl array for the string "pepper":

# create a perl list/array of strings
@pizzas = qw(cheese pepperoni veggie sausage spinach garlic);

# use the perl grep function to search the @pizzas list for the string "pepper"
@results = grep /pepper/, @pizzas;

# print the results
print "@results\n";

As you might guess from looking at the code, my @results Perl array prints the following output:

pepperoni
Perl grep array - case-insensitive searching

If you're familiar with Perl regular expressions, you might also guess that it's very easy to make this Perl array search example case-insensitive using the standard i operator at the end of my search string.

Here's what our Perl grep array example looks like with this change:

@results = grep /pepper/i, @pizzas;
Perl grep array and regular expressions (regex)

You can also use more complex Perl regular expressions (regex) in your array search. For instance, if for some reason you wanted to find all strings in your array that contain at least eight consecutive word characters, you could use this search pattern:

@results = grep /\w{8}/, @pizzas;

That example results in the following output:

pepperoni
Perl grep array - Summary

I hope this Perl grep array example (Perl array search example) has been helpful. For related Perl examples, see the Related block on this web page, or use the search form on this website. If you have any questions, or better yet, more Perl array search examples, feel free to use the Comments section below.

[Nov 22, 2017] Perl Searching for item in an Array - Stack Overflow

Nov 22, 2017 | stackoverflow.com

Perl: Searching for item in an Array Ask Question up vote down vote favorite 1

Majic Johnson ,Apr 20, 2012 at 4:53

Given an array @A we want to check if the element $B is in it. One way is to say this:
Foreach $element (@A){
    if($element eq $B){
        print "$B is in array A";
    }
}

However when it gets to Perl, I am thinking always about the most elegant way. And this is what I am thinking: Is there a way to find out if array A contains B if we convert A to a variable string and use

index(@A,$B)=>0

Is that possible?

cHao ,Apr 20, 2012 at 4:55

grep { $_ eq $B } @A ? – cHao Apr 20 '12 at 4:55

daxim ,Apr 20, 2012 at 7:06

Related: stackoverflow.com/questions/7898499/ stackoverflow.com/questions/3086874/daxim Apr 20 '12 at 7:06

Nikhil Jain ,Apr 20, 2012 at 5:49

There are many ways to find out whether the element is present in the array or not:
  1. Using foreach
    foreach my $element (@a) {
        if($element eq $b) {
           # do something             
           last;
        }
    }
    
  2. Using Grep:
    my $found = grep { $_ eq $b } @a;
    
  3. Using List::Util module
    use List::Util qw(first); 
    
    my $found = first { $_ eq $b } @a;
    
  4. Using Hash initialised by a Slice
    my %check;
    @check{@a} = ();
    
    my $found = exists $check{$b};
    
  5. Using Hash initialised by map
    my %check = map { $_ => 1 } @a;
    
    my $found = $check{$b};
    

pilcrow ,May 2, 2012 at 19:56

The List::Util::first() example is (potentially) subtly incorrect when searching for false values, since $found will also evaluate false. ( die unless $found ... oops!) List::MoreUtils::any does the right thing here. – pilcrow May 2 '12 at 19:56

yazu ,Apr 20, 2012 at 4:56

use 5.10.1;

$B ~~ @A and say '$B in @A';

brian d foy ,Apr 20, 2012 at 13:07

You have to be very careful with this because this distributes the match over the elements. If @A has an array reference element that contains $B, this will still match even though $B isn't a top level element of @A. The smart match is fundamentally broken for this and many other reasons. – brian d foy Apr 20 '12 at 13:07

obmib ,Apr 20, 2012 at 5:51

use List::AllUtils qw/ any /;
print "\@A contains $B" if any { $B eq $_ } @A;

bvr ,Apr 20, 2012 at 7:43

I would recommend first in this case, as it does not have to traverse whole array. It can stop when item is found. – bvr Apr 20 '12 at 7:43

brian d foy ,Apr 20, 2012 at 13:10

any can stop too because it needs only one element to be true. – brian d foy Apr 20 '12 at 13:10

pilcrow ,May 3, 2012 at 1:38

Beware that first can also return a false value if it finds, e.g., "0", which would confound the example given in this answer. any has the desired semantics. – pilcrow May 3 '12 at 1:38

[Nov 22, 2017] edited yesterday

Nov 22, 2017 | stackoverflow.com

down vote favorite

Speeddymon ,2 days ago

I've been reading up on dispatch tables and I get the general idea of how they work, but I'm having some trouble taking what I see online and applying the concept to some code I originally wrote as an ugly mess of if-elsif-else statements.

I have options parsing configured by using GetOpt::Long , and in turn, those options set a value in the %OPTIONS hash, depending on the option used.

Taking the below code as an example... ( UPDATED WITH MORE DETAIL

use     5.008008;
use     strict;
use     warnings;
use     File::Basename qw(basename);
use     Getopt::Long qw(HelpMessage VersionMessage :config posix_default require_order no_ignore_case auto_version auto_help);

my $EMPTY      => q{};

sub usage
{
    my $PROG = basename($0);
    print {*STDERR} $_ for @_;
    print {*STDERR} "Try $PROG --help for more information.\n";
    exit(1);
}

sub process_args
{
    my %OPTIONS;

    $OPTIONS{host}              = $EMPTY;
    $OPTIONS{bash}              = 0;
    $OPTIONS{nic}               = 0;
    $OPTIONS{nicName}           = $EMPTY;
    $OPTIONS{console}           = 0;
    $OPTIONS{virtual}           = 0;
    $OPTIONS{cmdb}              = 0;
    $OPTIONS{policyid}          = 0;
    $OPTIONS{showcompliant}     = 0;
    $OPTIONS{backup}            = 0;
    $OPTIONS{backuphistory}     = 0;
    $OPTIONS{page}              = $EMPTY;

    GetOptions
      (
        'host|h=s'              => \$OPTIONS{host}               ,
        'use-bash-script'       => \$OPTIONS{bash}               ,
        'remote-console|r!'     => \$OPTIONS{console}            ,
        'virtual-console|v!'    => \$OPTIONS{virtual}            ,
        'nic|n!'                => \$OPTIONS{nic}                ,
        'nic-name|m=s'          => \$OPTIONS{nicName}            ,
        'cmdb|d!'               => \$OPTIONS{cmdb}               ,
        'policy|p=i'            => \$OPTIONS{policyid}           ,
        'show-compliant|c!'     => \$OPTIONS{showcompliant}      ,
        'backup|b!'             => \$OPTIONS{backup}             ,
        'backup-history|s!'     => \$OPTIONS{backuphistory}      ,
        'page|g=s'              => \$OPTIONS{page}               ,
        'help'                  => sub      { HelpMessage(-exitval => 0, -verbose ->1)     },
        'version'               => sub      { VersionMessage()  },
      ) or usage;

    if ($OPTIONS{host} eq $EMPTY)
    {
        print {*STDERR} "ERROR: Must specify a host with -h flag\n";
        HelpMessage;
    }

    sanity_check_options(\%OPTIONS);

    # Parse anything else on the command line and throw usage
    for (@ARGV)
    {
        warn "Unknown argument: $_\n";
        HelpMessage;
    }

    return {%OPTIONS};
}

sub sanity_check_options
{
    my $OPTIONS     = shift;

    if (($OPTIONS->{console}) and ($OPTIONS->{virtual}))
    {
        print "ERROR: Cannot use flags -r and -v together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -r and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flags -r and -b together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -r and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{virtual}) and ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flags -v and -b together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{virtual}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -v and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{virtual}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -v and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{backup}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -b and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{backup}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -b and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{nic}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -n and -d together\n";
        HelpMessage;
    }

    if (($OPTIONS->{policyid} != 0) and not ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flag -p without also specifying -d\n";
        HelpMessage;
    }

    if (($OPTIONS->{showcompliant}) and not ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flag -c without also specifying -d\n";
        HelpMessage;
    }

    if (($OPTIONS->{backuphistory}) and not ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flag -s without also specifying -b\n";
        HelpMessage;
    }

    if (($OPTIONS->{nicName}) and not ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flag -m without also specifying -n\n";
        HelpMessage;
    }

    return %{$OPTIONS};
}

I'd like to turn the above code into a dispatch table, but can't figure out how to do it.

Any help is appreciated.

Jim Garrison ,2 days ago

Are the sets of conflicting options always pairs? Can you have situations where options a , b , and c cannot occur together but any two are OK? Before you can pick a representation you need to be sure your model can handle the logic you need in a general way. This is not an easy problem. – Jim Garrison 2 days ago

simbabque ,yesterday

Don't use English, it's horribly slow and makes your code harder to read. – simbabque yesterday

Speeddymon ,yesterday

Removed English module and changed $ARG / @ARG to $_ / @_ Added $EMPTY as I forgot I had it defined globally. – Speeddymon yesterday

Speeddymon ,yesterday

@JimGarrison -- you are correct. The if-elsif-else does not explicitly account for 3 options that conflict (though it does account for that implicitly) As an example, using -h is required with all of the other options. But, using -h , -r , v , all together is not allowed, while -h , -r , and -d is allowed. – Speeddymon yesterday

ikegami ,yesterday

Since the host must be provided, it should be an argument, not an option. – ikegami yesterday

zdim ,2 days ago

I am not sure how a dispatch table would help since you need to go through pair-wise combinations of specific possibilities, and thus cannot trigger a suitable action by one lookup.

Here is another way to organize it

use List::MoreUtils 'firstval';

sub sanity_check_options
{
    my ($OPTIONS, $opt_excl) = @_;

    # Check each of 'opt_excl' against all other for ConFLict
    my @excl = sort keys %$opt_excl;
    while (my $eo = shift @excl) 
    {
        if (my $cfl = firstval { $OPTIONS->{$eo} and $OPTIONS->{$_} } @excl) 
        {
            say "Can't use -$opt_excl->{$eo} and -$opt_excl->{$cfl} together";
            HelpMessage();
            last;
        }
    }

    # Go through specific checks on
    # policyid, showcompliant, backuphistory, and nicName
    ...
    return 1;  # or some measure of whether there were errors
}

# Mutually exclusive options
my %opt_excl = (
    console => 'r', virtual => 'v', cmdb => 'c', backup => 'b', nic => 'n'
); 

sanity_check_options(\%OPTIONS, \%opt_excl);

This checks all options listed in %opt_excl against each other for conflict, removing the segments of elsif involving the (five) options that are mutually exclusive. It uses List::MoreUtils::firstval . The few other specific invocations are best checked one by one.

There is no use of returning $OPTIONS since it is passed as reference so any changes apply to the original structure (while it's not meant to be changed either). Perhaps you can keep track of whether there were errors and return that if it can be used in the caller, or just return 1 .

This addresses the long elsif chain as asked, and doesn't go into the rest of code. Here is one comment though: There is no need for {%OPTIONS} , which copies the hash in order to create an anonymous one; just use return \%OPTIONS;


Comment on possible multiple conflicting options

This answer as it stands does not print all conflicting options that have been used if there are more than two, as raised by ikegami in comments; it does catch any conflicts so that the run is aborted.

The code is readily adjusted for this. Instead of the code in the if block either

However, one is expected to know of allowed invocations and any listing of conflicts is a courtesy to the forgetful user (or a debugging aid); a usage message is printed as well anyway.

Given the high number of conflicting options the usage message should contain a prominent note on this. Also consider that so many conflicting options may indicate a design flaw.

Finally, this code fully relies on the fact that this processing goes once per run and operates with a handful of options; thus it is not concerned with efficiency and freely uses ancillary data structures.

Speeddymon ,yesterday

Updated the question to clarify. – Speeddymon yesterday

zdim ,yesterday

@Speeddymon Thank you, updated. This brings together checks of those five options which can't go one with another. The remaining few I leave to be checked one by one; "encoding" one or two possibilities in some all-encompassing system would just increase complexity (and may end up less readable). – zdim yesterday

zdim ,yesterday

@Speeddymon Added the missing include, use List::MoreUtils 'firstval' . Edited a little in the meanwhile, as well. – zdim yesterday

Speeddymon ,yesterday

Thank you for the easy to follow example. I went with yours as it was the clearest and contained the least duplicate code. – Speeddymon yesterday

ikegami ,23 hours ago

@Speeddymon, Apparently, it's not clear as you think since you didn't realize if doesn't work. It doesn't mention the error of using -r and -c together if -b is also provided. And why is a hash being used at all? Wasteful and needlessly complex. – ikegami 23 hours ago

simbabque ,yesterday

You can use a dispatch table if there are a lot of options. I would build that table programmatically. It might not be the best option here, but it works and the configuration is more readable than your elsif construct.
use strict;
use warnings;
use Ref::Util::XS 'is_arrayref';    # or Ref::Util

sub create_key {
    my $input = shift;

    # this would come from somewhere else, probably the Getopt config
    my @opts = qw( host bash nic nicName console virtual cmdb
        policyid showcompliant backup backuphistory page );

    # this is to cover the configuration with easier syntax
    $input = { map { $_ => 1 } @{$input} }
        if is_arrayref($input);

    # options are always prefilled with false values
    return join q{}, map { $input->{$_} ? 1 : 0 }
        sort @opts;
}

my %forbidden_combinations = (
    map { create_key( $_->[0] ) => $_->[1] } (
        [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ],
        [ [qw( console cmdb )]    => q{Cannot use flags -r and -d together} ],
        [ [qw( console backup )]  => q{Cannot use flags -r and -b together} ],
        [ [qw( console nic )]     => q{Cannot use flags -r and -n together} ],
    )
);

p %forbidden_combinations; # from Data::Printer

The output of the p function is the dispatch table.

{
    00101   "Cannot use flags -r and -v together",
    00110   "Cannot use flags -r and -n together",
    01100   "Cannot use flags -r and -d together",
    10100   "Cannot use flags -r and -b together"
}

As you can see, we've sorted all the options ascii-betically to use them as keys. That way, you could in theory build all kinds of combinations like exclusive options.

Let's take a look at the configuration itself.

my %forbidden_combinations = (
    map { create_key( $_->[0] ) => $_->[1] } (
        [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ],
        # ...
    )
);

We use a list of array references. Each entry is on one line and contains two pieces of information. Using the fat comma => makes it easy to read. The first part, which is much like a key in a hash, is the combination. It's a list of fields that should not occur together. The second element in the array ref is the error message. I've removed all the recurring elements, like the newline, to make it easier to change how and where the error can be displayed.

The map around this list of combination configuration runs the options through our create_key function, which translates it to a simple bitmap-style string. We assign all of it to a hash of that map and the error message.

Inside create_key , we check if it was called with an array reference as its argument. If that's the case, the call was for building the table, and we convert it to a hash reference so we have a proper map to look stuff up in. We know that the %OPTIONS always contains all the keys that exist, and that those are pre-filled with values that all evaluate to false . We can harness that convert the truthiness of those values to 1 or 0 , which then builds our key.

We will see in a moment why that is useful.

Now how do we use this?

sub HelpMessage { exit; }; # as a placeholder

# set up OPTIONS
my %OPTIONS = (
    host          => q{},
    bash          => 0,
    nic           => 0,
    nicName       => q{},
    console       => 0,
    virtual       => 0,
    cmdb          => 0,
    policyid      => 0,
    showcompliant => 0,
    backup        => 0,
    backuphistory => 0,
    page          => q{},
);

# read options with Getopt::Long ...
$OPTIONS{console} = $OPTIONS{virtual} = 1;

# ... and check for wrong invocations
if ( exists $forbidden_combinations{ my $key = create_key($OPTIONS) } ) {
    warn "ERROR: $forbidden_combinations{$key}\n";
    HelpMessage;
}

All we need to do now is get the $OPTIONS hash reference from Getopt::Long, and pass it through our create_key function to turn it into the map string. Then we can simply see if that key exists in our %forbidden_combinations dispatch table and show the corresponding error message.


Advantages of this approach

If you want to add more parameters, all you need to do is include them in @opts . In a full implementation that would probably be auto-generated from the config for the Getopt call. The keys will change under the hood, but since that is abstracted away you don't have to care.

Furthermore, this is easy to read. The create_key aside, the actual dispatch table syntax is quite concise and even has documentary character.

Disadvantages of this approach

There is a lot of programmatic generation going on for just a single call. It's certainly not the most efficient way to do it.


To take this further, you can write functions that auto-generate entries for certain scenarios.

I suggest you take a look at the second chapter in Mark Jason Dominus' excellent book Higher-Order Perl , which is available for free as a PDF.

Speeddymon ,yesterday

Thank you for the detailed answer. I've updated the question to help clarify how the $OPTIONS hash is setup. Can your example work within the bounds of what I have already, or should I rewrite the whole thing from scratch? – Speeddymon yesterday

simbabque ,yesterday

@Speeddymon yeah, that should work. I see you've got %OPTIONS , and it is always pre-set with values. That's going to be interesting. Let me try. – simbabque yesterday

Speeddymon ,yesterday

Speaking of the HOP book... That was actually what I was using to try to learn and where I was having trouble in applying the concept to my code. :-) I couldn't find a PDF version before, so thank you for the link! – Speeddymon yesterday

simbabque ,yesterday

@Speeddymon I've updated the answer and changed it to match your updated code. I suggest you read the diff first. What I don't like about it yet is that the possible keys are there twice, but that can be solved with some more trickery. I think that would blow up the answer even more, so I didn't do that. – simbabque yesterday

ikegami ,23 hours ago

Doesn't detect the case when -r , -v and -b are provided as an error. – ikegami 23 hours ago

ikegami ,2 days ago

You shouldn't be using elsif here because multiple condition could be true. And since multiple conditions could be true, a dispatch table can't be used. Your code can still be simplified greatly.
my @errors;

push @errors, "ERROR: Host must be provided\n"
   if !defined($OPTIONS{host});

my @conflicting =
   map { my ($opt, $flag) = @$_; $OPTIONS->{$opt} ? $flag : () }
      [ 'console', '-r' ],
      [ 'virtual', '-v' ],
      [ 'cmdb',    '-d' ],
      [ 'backup',  '-b' ],
      [ 'nic',     '-n' ];

push @errors, "ERROR: Can only use one the following flags at a time: @conflicting\n"
   if @conflicting > 1;

push @errors, "ERROR: Can't use flag -p without also specifying -d\n"
   if defined($OPTIONS->{policyid}) && !$OPTIONS->{cmdb};

push @errors, "ERROR: Can't use flag -c without also specifying -d\n"
   if $OPTIONS->{showcompliant} && !$OPTIONS->{cmdb};

push @errors, "ERROR: Can't use flag -s without also specifying -b\n"
   if $OPTIONS->{backuphistory} && !$OPTIONS->{backup};

push @errors, "ERROR: Can't use flag -m without also specifying -n\n"
   if defined($OPTIONS->{nicName}) && !$OPTIONS->{nic};

push @errors, "ERROR: Incorrect number of arguments\n"
   if @ARGV;

usage(@errors) if @errors;

Note that the above fixes numerous errors in your code.


Help vs Usage Error

Calling HelpMessage indifferently in both situations is therefore incorrect.

Create the following sub named usage to use (without arguments) when GetOptions returns false, and with an error message when some other usage error occurs:

use File::Basename qw( basename );

sub usage {
   my $prog = basename($0);
   print STDERR $_ for @_;
   print STDERR "Try '$prog --help' for more information.\n";
   exit(1);
}

Keep using HelpMessage in response to --help , but the defaults for the arguments are not appropriate for --help . You should use the following:

'help' => sub { HelpMessage( -exitval => 0, -verbose => 1 ) },

Speeddymon ,yesterday

I wondered if it would be impossible because of multiple conditions being true, but based on other answers, it seems that it is possible to still build a table and compare... – Speeddymon yesterday

ikegami ,yesterday

What are you talking about? No answer used a dispatch table. All the answers (including mine) used a ( for or map ) loop that performs as many checks as there are conditions. The points of a dispatch table is to do a single check no matter how many conditions there are. Since all conditions can be true, you need to check all conditions, so a dispatch table is impossible by definition. (And that's without even mentioning that the value of a dispatch table should be a code reference or similar (something to dispatch to).) – ikegami yesterday

ikegami ,yesterday

The difference between mine and the others is that mine avoids using an inefficient unordered hash and uses an efficient ordered list instead. (You could place the list in an array if you prefer.) – ikegami yesterday

ikegami ,yesterday

Updated to match updated question. That fact that none of the other answers can be extended for your updated question proves my pointthat trying to put everything into one loop or table just makes things less flexible, longer and more complex. – ikegami yesterday

Speeddymon ,yesterday

In response to the "help" tip -- HelpMessage is defined by GetOpt::Long and reads from the PODs at the end of the file. – Speeddymon yesterday

[Nov 22, 2017] Bitwise operators supported by Perl language

Nov 18, 2017 | www.tutorialspoint.com
Bitwise operator works on bits and perform bit by bit operation. Assume if $a = 60; and $b = 13; Now in binary format they will be as follows − $a = 0011 1100 $b = 0000 1101 ----------------- $a&$b = 0000 1100 $a|$b = 0011 1101 $a^$b = 0011 0001 ~$a = 1100 0011 There are following Bitwise operators supported by Perl language, assume if $a = 60; and $b = 13
S.No. Operator & Description
1 &

Binary AND Operator copies a bit to the result if it exists in both operands.

Example − ($a & $b) will give 12 which is 0000 1100

2 |

Binary OR Operator copies a bit if it exists in eather operand.

Example − ($a | $b) will give 61 which is 0011 1101

3 ^

Binary XOR Operator copies the bit if it is set in one operand but not both.

Example − ($a ^ $b) will give 49 which is 0011 0001

4 ~

Binary Ones Complement Operator is unary and has the efect of 'flipping' bits.

Example − (~$a ) will give -61 which is 1100 0011 in 2's complement form due to a signed binary number.

5 <<

Binary Left Shift Operator. The left operands value is moved left by the number of bits specified by the right operand.

Example − $a << 2 will give 240 which is 1111 0000

6 >>

Binary Right Shift Operator. The left operands value is moved right by the number of bits specified by the right operand.

Example − $a >> 2 will give 15 which is 0000 1111

Example

Try the following example to understand all the bitwise operators available in Perl. Copy and paste the following Perl program in test.pl file and execute this program.

#!/usr/local/bin/perl

use integer;
 
$a = 60;
$b = 13;

print "Value of \$a = $a and value of \$b = $b\n";

$c = $a & $b;
print "Value of \$a & \$b = $c\n";

$c = $a | $b;
print "Value of \$a | \$b = $c\n";

$c = $a ^ $b;
print "Value of \$a ^ \$b = $c\n";

$c = ~$a;
print "Value of ~\$a = $c\n";

$c = $a << 2;
print "Value of \$a << 2 = $c\n";

$c = $a >> 2;
print "Value of \$a >> 2 = $c\n";

When the above code is executed, it produces the following result −

Value of $a = 60 and value of $b = 13
Value of $a & $b = 12
Value of $a | $b = 61
Value of $a ^ $b = 49
Value of ~$a = -61
Value of $a << 2 = 240
Value of $a >> 2 = 15

[Nov 22, 2017] Perl grep array FAQ - How to search an array-list of strings alvinalexander.com

Nov 22, 2017 | alvinalexander.com

Perl grep array FAQ - How to search an array/list of strings By Alvin Alexander. Last updated: June 3 2016 Perl "grep array" FAQ: Can you demonstrate a Perl grep array example? (Related: Can you demonstrate how to search a Perl array?)

A very cool thing about Perl is that you can search lists (arrays) with the Perl grep function. This makes it very easy to find things in large lists -- without having to write your own Perl for/foreach loops.

A simple Perl grep array example (Perl array search)

Here's a simple Perl array grep example. First I create a small string array (pizza toppings), and then search the Perl array for the string "pepper":

# create a perl list/array of strings
@pizzas = qw(cheese pepperoni veggie sausage spinach garlic);

# use the perl grep function to search the @pizzas list for the string "pepper"
@results = grep /pepper/, @pizzas;

# print the results
print "@results\n";

As you might guess from looking at the code, my @results Perl array prints the following output:

pepperoni
Perl grep array - case-insensitive searching

If you're familiar with Perl regular expressions, you might also guess that it's very easy to make this Perl array search example case-insensitive using the standard i operator at the end of my search string.

Here's what our Perl grep array example looks like with this change:

@results = grep /pepper/i, @pizzas;
Perl grep array and regular expressions (regex)

You can also use more complex Perl regular expressions (regex) in your array search. For instance, if for some reason you wanted to find all strings in your array that contain at least eight consecutive word characters, you could use this search pattern:

@results = grep /\w{8}/, @pizzas;

That example results in the following output:

pepperoni
Perl grep array - Summary

I hope this Perl grep array example (Perl array search example) has been helpful. For related Perl examples, see the Related block on this web page, or use the search form on this website. If you have any questions, or better yet, more Perl array search examples, feel free to use the Comments section below.

[Nov 22, 2017] Perl Searching for item in an Array - Stack Overflow

Nov 22, 2017 | stackoverflow.com

Perl: Searching for item in an Array Ask Question up vote down vote favorite 1

Majic Johnson ,Apr 20, 2012 at 4:53

Given an array @A we want to check if the element $B is in it. One way is to say this:
Foreach $element (@A){
    if($element eq $B){
        print "$B is in array A";
    }
}

However when it gets to Perl, I am thinking always about the most elegant way. And this is what I am thinking: Is there a way to find out if array A contains B if we convert A to a variable string and use

index(@A,$B)=>0

Is that possible?

cHao ,Apr 20, 2012 at 4:55

grep { $_ eq $B } @A ? – cHao Apr 20 '12 at 4:55

daxim ,Apr 20, 2012 at 7:06

Related: stackoverflow.com/questions/7898499/ stackoverflow.com/questions/3086874/daxim Apr 20 '12 at 7:06

Nikhil Jain ,Apr 20, 2012 at 5:49

There are many ways to find out whether the element is present in the array or not:
  1. Using foreach
    foreach my $element (@a) {
        if($element eq $b) {
           # do something             
           last;
        }
    }
    
  2. Using Grep:
    my $found = grep { $_ eq $b } @a;
    
  3. Using List::Util module
    use List::Util qw(first); 
    
    my $found = first { $_ eq $b } @a;
    
  4. Using Hash initialised by a Slice
    my %check;
    @check{@a} = ();
    
    my $found = exists $check{$b};
    
  5. Using Hash initialised by map
    my %check = map { $_ => 1 } @a;
    
    my $found = $check{$b};
    

pilcrow ,May 2, 2012 at 19:56

The List::Util::first() example is (potentially) subtly incorrect when searching for false values, since $found will also evaluate false. ( die unless $found ... oops!) List::MoreUtils::any does the right thing here. – pilcrow May 2 '12 at 19:56

yazu ,Apr 20, 2012 at 4:56

use 5.10.1;

$B ~~ @A and say '$B in @A';

brian d foy ,Apr 20, 2012 at 13:07

You have to be very careful with this because this distributes the match over the elements. If @A has an array reference element that contains $B, this will still match even though $B isn't a top level element of @A. The smart match is fundamentally broken for this and many other reasons. – brian d foy Apr 20 '12 at 13:07

obmib ,Apr 20, 2012 at 5:51

use List::AllUtils qw/ any /;
print "\@A contains $B" if any { $B eq $_ } @A;

bvr ,Apr 20, 2012 at 7:43

I would recommend first in this case, as it does not have to traverse whole array. It can stop when item is found. – bvr Apr 20 '12 at 7:43

brian d foy ,Apr 20, 2012 at 13:10

any can stop too because it needs only one element to be true. – brian d foy Apr 20 '12 at 13:10

pilcrow ,May 3, 2012 at 1:38

Beware that first can also return a false value if it finds, e.g., "0", which would confound the example given in this answer. any has the desired semantics. – pilcrow May 3 '12 at 1:38

[Nov 22, 2017] perl - How can I also get an element's index when I grep through an array - Stack Overflow

Nov 22, 2017 | stackoverflow.com

Learn more up vote down vote favorite

Geo ,Jun 10, 2010 at 16:39

Let's say I have this list:
my @list = qw(one two three four five);

and I want to grab all the elements containing o . I'd have this:

my @containing_o = grep { /o/ } @list;

But what would I have to do to also receive an index, or to be able to access the index in grep 's body?

,

my @index_containing_o = grep { $list[$_] =~ /o/ } 0..$#list;  # ==> (0,1,3)

my %hash_of_containing_o = map { $list[$_]=~/o/?($list[$_]=>$_):() } 0..$#list
            # ==> ( 'one' => 0, 'two' => 1, 'four' => 3 )

[Nov 22, 2017] perl modules

Nov 17, 2017 | perlmonks.com

Discipulus (Monsignor) on Nov 16, 2017 at 09:04 UTC

Re: perl modules

Hello codestroman and welcome to the monastery and to the wonderful world of Perl!

First of all, please, add <c> code tags </c> around your code and output.

Then be sure to have read the standard documentation: perlmod and perlnewmod

Infact a basic Perl module define a package and use Exporter to export functions in the using Perl program.

In my homenode i've collected a lot of links on about module creation

L*

Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

thanos1983 (Priest) on Nov 16, 2017 at 09:17 UTC

Re: perl modules

Hello codestroman

Just to add a minor suggestion here, to the full cover reply of fellow monk Discipulus . It will assist you a lot also to read Simple Module Tutorial

Update: Direct answer to your question can be found here How to add a relative directory to @INC with multiple possible solutions. I would strongly recommend to go through all the articles that all monks proposed.

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

hippo (Abbot) on Nov 16, 2017 at 09:21 UTC

Re: perl modules (Can't locate in @INC)
PLEASE HELP!!

This is a monastery - a place of quite contemplation. The louder you shout the less wisdom shall you receive.

The error message Can't locate dog.pm in @INC is pretty explicit. Either your module file is not called dog.pm in which case, change it or else your file dog.pm is not in any of the directories listed in @INC in which case either move it to one of those directories or else change @INC with use lib .

I also see, despite the lack of formatting in your post that your module doesn't use any namespace. You should probably address that. Perhaps a solid read through Simple Module Tutorial would be a good idea?

Anonymous Monk on Nov 16, 2017 at 09:07 UTC

Re: perl modules

use an absolute pathname in use lib

Anonymous Monk on Nov 16, 2017 at 15:16 UTC

Re: perl modules

Welcome to the language ... and, to the Monastery. The "simple module tutorial" listed above is a very good place to start. Like all languages of its kind, Perl looks at runtime for external modules in a prescribed list of places, in a specified order. You can affect this in several ways, as the tutorials describe. Please read them carefully.

In the Perl(-5) language, this list is stored in a pre-defined array variable called @INC and it is populated from a variety of sources: a base-list that is compiled directly into the Perl interpreter, the PERL5LIB environment-variable, use lib statements, and even direct modification of the variable itself. Perl searches this list from beginning to end and processes (only) the first matching file that it finds.

(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Corion (Pope) on Nov 16, 2017 at 15:23 UTC

Re^2: perl modules


by Corion (Pope) on Nov 16, 2017 at 15:23 UTC

(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Please no.

The word "pragma" has a special meaning in Perl, and it is highly confusing to claim that a Perl "keyword" would be a "pragma". use certainly is a keyword and nothing else.

If you mean to say something different, please describe in more words what you want to say.

[Nov 18, 2017] Using the built-in debugger of Perl as REPL by Gabor Szabo

Youtube video, Mainly explain how to use x command in Perl debugger.
Nov 18, 2017 | www.youtube.com

The command line debugger that comes with perl is very powerful.
Not only does it allow us to debug script but it can be used as a REPL - a Read Eval Print Loop to explore the capabilities of the language. There are a few basic examples in this screencast.

http://perlmaven.com/using-the-built-...

To see all the Perl tutorials visit http://perlmaven.com/perl-tutorial

About Perl Programming and Perl programmers.

In this screencast:

perl -d e 1

p - print scalar
x - print data structure
b subname - set breakpoint

[Nov 17, 2017] Why key function applied to hash reference adds reference to the hash if it does not exist

Nov 17, 2017 | perlmonks.com

nikmit has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I came across this behaviour in perl which I find unintuitive, was wondering what the use case scenario for it is or whether I have done something wrong to bring it about...

I had a statement checking for the existence of data like so return 0 unless keys %{$hashref->{$key}} and I failed to realise that $key may not always exist.

I would have expected to see an error if $href->{$key} is undefined and therefore not a reference, but instead $key was just added to the hash.

Example:

#!/usr/bin/perl -w #perl-5.22.3 use strict; my $href = { cat => {milk => 1}, dog => {bone => 1} }; if (keys %{$href->{cow}}) { print "noop\n"; } else { if (exists $href->{cow}) { print "holy cow\n"; } else { print "no cow\n"; } } [download]

This prints 'holy cow'

Discipulus (Monsignor) on Nov 17, 2017 at 09:32 UTC

Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?

Hello nikmit ,

it's not a case of autovivification?

It is explained in perlref and for more informations see Explaining Autovivication and Autovivification in perl and https://perlmaven.com/autovivification

On CPAN there is a pragma to disable it if unwanted.

PS exists fixes your snippet: if (exists $href->{cow} and keys %{$href->{cow}}) { # no cow

L*


Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

1nickt (Prior) on Nov 17, 2017 at 13:54 UTC

Re^2: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?


by 1nickt (Prior) on Nov 17, 2017 at 13:54 UTC

Care must be used with exists as it will indeed autovivify intermediate hashes:

use strict; use warnings; use feature 'say'; use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1; my $href = { cat => { milk => 1 }, dog => { bone => 1 }, }; say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow'; say Dumper $href __END__ [download] Output: no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'cow' => {}, # uh-oh 'dog' => { 'bone' => 1 } }; [download] So you would have to either use exists on all levels of the structure as haukex suggested : use strict; use warnings; use feature 'say'; use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1; my $href = { cat => { milk => 1 }, dog => { bone => 1 }, }; say exists $href->{'cow'} && exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow'; say Dumper $href __END__ [download] Output: no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'dog' => { 'bone' => 1 } }; [download] ... or use autovivification : use strict; use warnings; use feature 'say'; use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1; my $href = { cat => { milk => 1 }, dog => { bone => 1 }, }; no autovivification; say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow'; say Dumper $href __END__ [download] Output: no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'dog' => { 'bone' => 1 } }; [download] Note that autovivification.pm has effect lexically: use strict; use warnings; use feature 'say'; use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1; my $href = { cat => { milk => 1 }, dog => { bone => 1 }, }; { no autovivification; say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow'; say Dumper $href } say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'still no cow'; say Dumper $href; __END__ [download] no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'dog' => { 'bone' => 1 } }; still no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'cow' => {}, # uh-oh 'dog' => { 'bone' => 1 } }; [download]
The way forward always starts with a minimal test.

nikmit (Sexton) on Nov 17, 2017 at 10:15 UTC

Re^2: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?


by nikmit (Sexton) on Nov 17, 2017 at 10:15 UTC

Thanks - no autovivification will become a permanent presence for me, next to use strict .

haukex (Monsignor) on Nov 17, 2017 at 09:42 UTC

Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?

This is "autovivification" and was just discussed the other day, see the replies in the thread Array dereference in foreach() , including the ones deeper down in the thread.

Use exists to check if a hash key exists. As described in its documentation, if you have multi-level data structures (hashes of hashes), you need to check every level. Update: Discipulus just updated to show an example.

Eily (Parson) on Nov 17, 2017 at 10:01 UTC

Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?

FYI, while keys %hash returns the number of keys in scalar context, you can also use the hash itself, the value will be false if the hash is empty and true otherwise (actually 0 when empty, and information on the content otherwise). So if (exists $href->{cow} and %{ $href->{cow} }) . Unlike the keys version, scalar %{ $href->{cow} } will not create a new hash (autovivify) if the cow key doesn't exist, but die instead (at least if you forgot to check if the key exists, you'll get an error in the right place).

[Nov 17, 2017] Meteoalarm - Weather warnings

Nov 17, 2017 | perlmonks.com


1 direct reply -- Read more / Contribute by walto
on Sep 23, 2017 at 00:50

http://meteoalarm.eu ) is the official website from European national weather services that gives out warnings in extreme weather situations. It has been a while ago that i wrote a perl module for processing this information. ( Weather warnings from www.meteoalarm.eu ). The website is still on but has changed since. That made some changes necessary. I wrote the module only for informational purposes and it is not meant to use it for anything critical. Here is the code: #!/usr/bin/perl # # package Meteoalarm; use strict; use warnings; use Carp; use LWP; use HTML::Entities; use HTML::TreeBuilder; use utf8; binmode STDOUT, ":encoding(UTF-8)"; our $VERSION = "0.06"; sub new { my $class = shift; my $self = {}; my %passed_params = @_; $self->{'user_agent'} = _make_user_agent( $passed_params{'user_agent'} ); bless( $self, $class ); return $self; } sub countries { my $self = shift; my %passed_params = @_; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = _make_country_url( $passed_params{day}, $passed_params{type} ); my $content = _fetch_content( $url, $self->{'user_agent'} ); my $country_warnings = _parse_country_warnings($content); return $country_warnings; } sub regions { my ($self) = shift; my %passed_params = @_; my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } croak "Invalid country_code: $passed_params{country_code}" unless $passed_params{country_code}; my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $country_codes{ $passed_params{country_code} } . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); my $region_warnings = _parse_region_warnings($content); return $region_warnings; } sub details { my $self = shift; my %passed_params = @_; my %country_codes = ( 'AT' => 10, 'BA' => 10, 'BE' => 801, 'BG' => 28, 'CH' => 319, 'CY' => 1, 'CZ' => 14, 'DE' => 808, 'DK' => 8, 'EE' => 805, 'ES' => 831, 'FI' => 813, 'FR' => 94, 'GR' => 16, 'HR' => 806, 'HU' => 7, 'IE' => 804, 'IL' => 803, 'IS' => 11, 'IT' => 20, 'LT' => 801, 'LU' => 2, 'LV' => 804, 'MD' => 37, 'ME' => 3, 'MK' => 6, 'MT' => 1, 'NL' => 807, 'NO' => 814, 'PL' => 802, 'PT' => 26, 'RO' => 42, 'RS' => 11, 'SE' => 813, 'SI' => 801, 'SK' => 16, 'UK' => 16 ); my ( $region, $code ) = $passed_params{region_code} =~ /^([ABCDEFGHILMNPRSU][A-Z])(\d\d\ d)/; $code =~ s /^0//; croak "Invalid region_code: $passed_params{region_code}" unless ( $country_codes{$region} and ( $code <= $country_codes{$region} ) ); my $details; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $passed_params{region_code} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); $details = _parse_details($content); return $details; } sub codes { my $self = shift; my @codes; my @countries_short; if (@_) { @countries_short = @_; } else { @countries_short = qw(AT BA BE BG CH CY CZ DE DK EE ES FI FR GR HR HU IE IL IS IT LT LU LV MD ME MK MT NL NO PL PT RO RS SE SI SK UK); } my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); foreach my $country_short (@countries_short) { my $url = 'http://meteoalarm.eu/en_UK/' . '0' . '/' . '0' . '/' . $country_codes{$country_short} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); push @codes, _parse_codes($content); } return @codes; } sub _make_country_url { my ( $day, $type ) = @_; my $url = 'http://meteoalarm.eu/en_UK/' . $day . '/' . $type . '/EU-Europe .html'; return $url; } sub _fetch_content { my ( $url, $user_agent ) = @_; my $ua = LWP::UserAgent->new; $ua->agent($user_agent); my $res = $ua->request( HTTP::Request->new( GET => $url ) ); croak " Can't fetch http://meteoalarm.eu: $res->status_line \n" unless ( $res->is_success ); return $res->decoded_content; } sub _parse_country_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down( _tag => q{td}, class => qr/^col[12]$/ ) ; for my $cell (@cells) { my @src; my $div = $cell->look_down( _tag => q{div} ); my $id = $div->id; my $alt = $div->attr(q{alt}); $data{$id}{fullname} = $alt; my @weather_events = $div->look_down( _tag => 'span', class => qr{warn awt} ); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'div', class => qr{tendenz awt nt l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt nt l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_region_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down(_tag=>qr{div}, id=>qr{area}); for my $cell (@cells) { $cell->id =~ /area_([A-Z][A-Z]\d+)/; my $id = $1; my $fullname = $cell->look_down(_tag=>'span',id=>'cname')->as_text ; my $div = $cell->look_down( _tag => q{div} ); $data{$id}{fullname} = $fullname; my @weather_events = $div->look_down(_tag=> 'span', class=>qr{warnflag warn2}); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'span', class => qr{tendenz awt\d l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt\d l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_weather_events { my $events = shift; my %weather_to_text = ( # lower case for consistency 1 => 'wind', 2 => 'snow/ice', 3 => 'thunderstorm', 4 => 'fog', 5 => 'extreme high temperature', 6 => 'extreme low temperature', 7 => 'coastal event', 8 => 'forestfire', 9 => 'avalanches', 10 => 'rain', 11 => 'unnamed', 12 => 'flood', 13 => 'rainflood' ); my %literal_warnings; for my $event (@$events) { #print $event->{class}, "\n"; $event->{class} =~ /warn\d* awt l(\d+) t(\d+)/; my $warn_level = $1; my $weather = $2; $literal_warnings{ $weather_to_text{$weather} } = $warn_level; } return \%literal_warnings; } sub _parse_details { my $content = shift; my (%data); my $p = HTML::TreeBuilder->new_from_content( decode_entities $cont ent); $data{fullname} = $p->look_down( _tag => q{h1} )->as_text; if ( $p->look_down( _tag => q{div}, class => q{warnbox awt nt l l1} ) ) { $data{warnings} = 'no warnings'; } else { my @warnboxes = $p->look_down( _tag => q{div}, class => qr/warnbox awt/ ); for my $warnbox (@warnboxes) { my ($as_txt); my @info_divs = $warnbox->look_down( _tag => q{div}, class => q{info} ); $as_txt = $info_divs[0]->as_text; my ( $from, $until ) = $as_txt =~ /valid from (.*) Until ( .*)$/; $as_txt = $info_divs[1]->as_text; my ( $warning, $level ) = $as_txt =~ /(.+?)\s+Awareness Level:\s+(.*)/; $warning =~ s/s$//; my $text = $warnbox->look_down( _tag => q{div}, class => q{text} )->as_text; $data{warnings}{ lc $warning } = { #lower case for constistency level => $level, from => $from, until => $until, text => $text, }; } } return \%data; } sub _parse_codes { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); #my @cells = $p->look_down( _tag => 'div', class => 'flags' ); my @cells = $p->look_down( _tag => qr{a} ); for my $cell (@cells) { if ( $cell->attr('xlink:href') ) { if ( $cell->attr('xlink:href') =~ /\/([A-Z][A-Z]\d+)-(.+?) .html/ ) { my $code = $1; my $fullname = $2; $data{$fullname} = $code; } } } return \%data; } sub _make_user_agent { my $ua = shift; $ua = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Fire fox/55.0' unless ($ua); return $ua; } sub _extract_details_fullname { my $content = shift; my $region; if ( $content =~ /<h1>Weather warnings: (.+?)<\/h1>/ ) { $region = $1; decode_entities($region); if ( $region =~ /.??<.*<\/a>/ ) { $region =~ s/.??<.*<\/a>//; } } else { carp "Can't get region name\n"; } return $region; } 1; __END__ =head1 NAME B<Meteoalarm> - OO Interface for meteoalarm.eu =head1 SYNOPSIS This Module gets weather warnings from meteoalarm.eu. For further reading of terms and conditions see http://meteoalarm.eu/t erms.php?lang=en_UK use Meteoalarm; my $meteo = Meteoalarm->new( 'user_agent' => 'Meteobot 0.001' ); my $countries = $meteo -> countries ('type' => 'all', 'day' => 'today' ); foreach my $country_code (sort keys %{$countries}){ print "Country: $countries->{$country_code}->{'fullname'}\n"; print "Tendency = $countries->{$country_code}->{tendency}\n" if ( $countries->{$country_code}->{'tendency'}); if (keys %{$countries->{$country_code}->{'warnings'}}){ foreach my $warning (keys %{$countries->{$country_code}->{'warning s'}}){ print "Event: $warning, severity: $countries->{$country_co de}->{'warnings'}->{$warning}\n"; } } else {print "No Warnings\n";} } my $regions = $meteo->regions( 'country_code' => 'PT', 'day' => 'today ', 'type' => 'all' ); foreach my $code ( sort keys %{$regions} ) { print "Region : $regions->{$code}->{'fullname'}: region_code = $co de\n" if ( keys %{ $regions->{$code}->{'warnings'} } ); print "Tendency = $regions->{$code}->{tendency}\n" if ( $regions-> {$code}->{'tendency'}); foreach my $type ( keys %{ $regions->{$code}->{'warnings'} } ) { print "$type Severity: $regions->{$code}->{'warnings'}->{$type}\n"; } } my $details = $meteo->details( 'region_code' => 'UK010', 'day' => 'tod ay'); my $name = $details->{'fullname'}; print "$name\n"; if ( $details->{warnings} eq 'no warnings' ) { print $details->{warnings}, "\n"; } else { foreach my $warning ( keys %{ $details->{'warnings'} } ) { print "$warning\n"; foreach my $detail ( keys %{ $details->{'warnings'}->{$warning } } ) { print "$detail: $details->{'warnings'}->{$warning}->{$deta il}\n"; } } } my $codes = $meteo->codes('FR'); my @codes = $meteo->codes(); foreach my $code (@codes) { foreach my $region ( sort keys %{$code} ) { print "Region name: $region, region code: $code->{$region}\n"; } } =head1 DESCRIPTION $meteo -> countries returns hashref of warnings for all countries. $meteo -> regions returns hashref of warnings for all regions in a spe cified country $meteo -> details returns hashref of detailled warnings for a specifie d region $meteo -> codes returns arrayref of hash of name and region code of a country =head1 METHODS =head1 new( ) creates a new meteoalarm object =head2 Optional Arguments: new( 'user_agent' => 'Meteobot 0.001'); changes the user agent string =head1 my $country = $meteo -> countries (); =head2 Optional Arguments: 'day' => 'today' || 'tomorrow' if day is not defined, default value is today 'type' => 'all' || 'wind' || 'snow' || 'ice' || 'snow/ice' || 'snow' || 'ice' || 'thunderstorm' || 'fog' || 'extreme high temperature' || 'extreme low temperature' || 'coastal event' || 'fores tfire' || 'avalanches' || 'rain' if type is not defined, default type is all =head1 $regions = $meteo -> regions ('country_code' => 'DE'); country_code is a 2 letter abbreviation =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $details = $meteo->details ('region_code' => 'ES005'); region_code consits of 2 letters for the country and 3 digits =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $code = $meteo -> codes (); Returns arrayref of hash for region names and codes for all countries =head2 Optional Arguments $code = $meteo -> codes ('PL'); Countrycode for a specific country =cut

[Nov 17, 2017] Safe string handling

Nov 17, 2017 | perlmonks.com


2 direct replies -- Read more / Contribute by tdlewis77
on Aug 25, 2017 at 13:07

# This string has a mixture of ASCII, UTF-8, 2 byte wide, and 4 byte # wide characters my $crazy = "Hello\x{26c4}".encode("utf-8","\x{26f0}"). "\x{10102}\x{2fa1b}"; # Now the string only has ASCII and UTF-8 characters my $sane = safeString($crazy); # testString($crazy) returns 7 # testString($sane) returns 3 # length($sane) returns 19 # trueLength($sane) returns 9 my $snowman = safeSubstr($crazy,5,1); ######################################## # safeString($string) # return a safe version of the string sub safeString { my ($string) = @_; return "" unless defined($string); my $t = testString($string); return $string if $t <= 3; return encode("utf-8",$string) if $t <= 5; # The string has both UTF-8 and wide characters so it needs # tender-loving care my @s = unpack('C*',$string); my @r; for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { push @r, $s[$i]; $i++; } elsif ($s[$i] > 255) { # encode a wide character push @r,unpack("C*",encode("utf-8",chr($s[$i]))); $i++; } else { # copy all the utf-8 bytes $n = _charBytes($i,@s) - 1; map { push @r, $s[$i+$_] } 0..$n; $i += $n + 1; } } return pack("C*",@r); } ######################################## # safeSubstr($string,$pos,$n) # return a safe substring (treats utf-8 sequences as a single # character) sub safeSubstr { my ($string,$pos,$n) = @_; $s = safeString($string); my $p = 0; my $rPos = 0; my $rEnd = -1; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $i++; } elsif ($s[$i] > 255) { $i++; } else { $i += _charBytes($i,@s); } $p++; $rPos = $i if $p == $pos; $rEnd = $i-1 if $p == $pos + $n; } $rEnd = scalar(@s) - 1 if $rEnd < 0; return "" if $rPos > $rEnd; my @r; map { push @r, $s[$_] } $rPos..$rEnd; return pack("C*",@r); } ######################################## # testString($string) # returns information about the characters in the string # # The 1, 2, and 4 bits of the result are for ASCII, UTF-8, and # wide characters respectively. If multiple bits are set, # characters of each type appear in the string. If the result is: # <= 1 simple ASCII string # <= 3 simple UTF-8 string # >3 && <= 5 mixed ASCII & wide characters # >= 6 mixed UTF-8 & wide characters sub testString { my ($s) = @_; return undef unless defined($s); my $r = 0; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $r |= 1; $i++; } elsif ($s[$i] > 255) { $r |= 4; $i++; } else { $r |= 2; $i += _charBytes($i,@s); } } return $r; } ######################################## # trueLength($string) # returns the number of UTF-8 characters in a string sub trueLength { my ($s) = @_; return unless defined($s); my $len = 0; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $i++; } elsif ($s[$i] > 255) { $i++; } else { $i += _charBytes($i,@s); } $len++; } return $len; } ######################################## # String support routines sub _charBytes { my $n = shift(@_); my $len = scalar(@_); if ($_[$n] < 128) { return 1; } elsif ($_[$n] > 65535) { return 4; } elsif ($_[$n] > 255) { return 2; } elsif (($_[$n] & 0xFC) == 0xFC) { return min(6,$len); } elsif (($_[$n] & 0xF8) == 0xF8) { return min(5,$len); } elsif (($_[$n] & 0xF0) == 0xF0) { return min(4,$len); } elsif (($_[$n] & 0xE0) == 0xE0) { return min(3,$len); } elsif (($_[$n] & 0xC0) == 0xC0) { return min(2,$len); } else { return 1; } }

[Nov 17, 2017] ndexed Flat File databases (for ISAM, NoSQL, Perl Embedded databases)

Nov 17, 2017 | perlmonks.com


1 direct reply -- Read more / Contribute by erichansen1836
on Oct 08, 2017 at 11:13

TOPIC: FAST!! Random Access Indexed, Relational Flat File Databases, Indexed by external Perl SDBM databases of key/value pairs tied to program "in memory" hash tables, where the Key in the Key/Value Pair is one or more fields and/or partial fields concatenated together (separated by a delimiter such as a pipe "|") and contained within the Flat File records for you to arbitrarily seek to a single record or a sorted/related group of records within your database.

Since it has been over 2 years ago since I first posted about this TOPIC I discovered, I wanted to alert the Perl community to the original thread where you can find Perl source code now for examples of how to implement Joint Database Technology/Methodology. Inparticular the King James Bible Navigator software DEMO I posted which used FlatFile/SDBM for its database. I have made this a native Windows GUI application (TreeView/RichEdit COMBO interface) to demonstrate how to show your end-users a summary of the information of the data contained within a database, and allow them to drill down to a small amount of specific information (e.g. verses within a single book/chapter) for actual viewing (and retrieving from the database). The TreeView Double Click Event was originally written to random access the first verse within a chapter, then sequentially access the remaining verses within a chapter - performing a READ for each verse. I posted a separate modified TreeView Double Click Event for you to insert into the Application which reads an entire chapter in one (1) giant READ, breaking out the individual verses (into an array) using the UNPACK statement. -- Eric

Joint Database Technology: http://www.perlmonks.org/?node_id=1121222

[Nov 17, 2017] How do the Perl 6 set operations compare elements?

Notable quotes:
"... Running under moar (2016.10) ..."
Nov 17, 2017 | stackoverflow.com

Ask Question up vote down vote favorite 1

brian d foy ,Nov 26, 2016 at 4:32

Running under moar (2016.10)

Consider this code that constructs a set and tests for membership:

my $num_set = set( < 1 2 3 4 > );
say "set: ", $num_set.perl;
say "4 is in set: ", 4 ∈ $num_set;
say "IntStr 4 is in set: ", IntStr.new(4, "Four") ∈ $num_set;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in set: ", 5 ∈ $num_set;

A straight 4 is not in the set, but the IntStr version is:

set: set(IntStr.new(4, "4"),IntStr.new(1, "1"),IntStr.new(2, "2"),IntStr.new(3, "3"))
4 is in set: False
IntStr 4 is in set: True
IntStr(4,...) is 4: True
5 is in set: False

I think most people aren't going to expect this, but the docs doesn't say anything about how this might work. I don't have this problem if I don't use the quote words (i.e. set( 1, 2, 3, 4) ).

timotimo ,Nov 26, 2016 at 5:47

You took a wrong turn in the middle. The important part is what nqp::existskey is called with: the k.WHICH . This method is there for value types, i.e. immutable types where the value - rather than identity - defines if two things are supposed to be the same thing (even if created twice). It returns a string representation of an object's value that is equal for two things that are supposed to be equal. For <1>.WHICH you get IntStr|1 and for 1.WHICH you get just Int|1 .

brian d foy ,Nov 26, 2016 at 6:18

Ah, okay. I can see a lot of pain for regular people trying to debug these things. – brian d foy Nov 26 '16 at 6:18

smls ,Nov 26, 2016 at 14:46

As explained in the Set documentation, sets compare object identity, same as the === operator:

Within a Set, every element is guaranteed to be unique (in the sense that no two elements would compare positively with the === operator)

The identity of an object is defined by the .WHICH method, as timotimo elaborates in his answer.

brian d foy ,Nov 26, 2016 at 21:28

That's not really clear from that statement. That's talking about which elements are in the set. Beyond that, even if you choose to compare with ===, you have to know how other things are stored. This is the sort of info that should show up next to the Set operators. – brian d foy Nov 26 '16 at 21:28

brian d foy ,Nov 26, 2016 at 23:09

Indeed, I think I've found a bug. The qw docs says this should be true: < a b 137 > eqv ( 'a', 'b', '137' ) , but in the same version of Rakudo Star I get false. It's different object types on each side. – brian d foy Nov 26 '16 at 23:09

brian d foy ,Nov 26, 2016 at 23:16

Despite all this, your answer was the A-ha! moment that led me to look at the right thing. Thanks for all of your help. – brian d foy Nov 26 '16 at 23:16

raiph ,Nov 27, 2016 at 4:50

Write your list of numbers using commas

As you mention in your answer, your code works if you write your numbers as a simple comma separated list rather than using the <...> construct.

Here's why:

4 ∈ set 1, 2, 3, 4 # True

A bare numeric literal in code like the 4 to the left of constructs a single value with a numeric type. (In this case the type is Int, an integer.) If a set constructor receives a list of similar literals on the right then everything works out fine.

<1 2 3 4> produces a list of "dual values"

The various <...> "quote words" constructs turn the list of whitespace separated literal elements within the angle brackets into an output list of values.

The foundational variant ( qw<...> ) outputs nothing but strings. Using it for your use case doesn't work:

4 ∈ set qw<1 2 3 4> # False

The 4 on the left constructs a single numeric value, type Int . In the meantime the set constructor receives a list of strings, type Str : ('1','2','3','4') . The operator doesn't find an Int in the set because all the values are Str s so returns False .

Moving along, the huffmanized <...> variant outputs Str s unless an element is recognized as a number. If an element is recognized as a number then the output value is a "dual value". For example a 1 becomes an IntStr .

According to the doc "an IntStr can be used interchangeably where one might use a Str or an Int". But can it?

Your scenario is a case in point. While 1 ∈ set 1,2,3 and <1> ∈ set <1 2 3> both work, 1 ∈ set <1 2 3> and <1> ∈ set 1, 2, 3 both return False .

So it seems the operator isn't living up to the quoted doc's claim of dual value interchangeability

This may already be recognized as a bug in the set operation and/or other operations. Even if not, this sharp "dual value" edge of the <...> list constructor may eventually be viewed as sufficiently painful that Perl 6 needs to change.

brian d foy ,Nov 26, 2016 at 23:29

I think this is a bug, but not in the set stuff. The other answers were very helpful in sorting out what was important and what wasn't.

I used the angle-brackets form of the quote words . The quote words form is supposed to be equivalent to the quoting version (that is, True under eqv ). Here's the doc example:

<a b c> eqv ('a', 'b', 'c')

But, when I try this with a word that is all digits, this is broken:

 $ perl6
 > < a b 137 > eqv ( 'a', 'b', '137' )
 False

But, the other forms work:

> qw/ a b 137 / eqv ( 'a', 'b', '137' )
True
> Q:w/ a b 137 / eqv ( 'a', 'b', '137' )
True

The angle-bracket word quoting uses IntStr :

> my @n = < a b 137 >
[a b 137]
> @n.perl
["a", "b", IntStr.new(137, "137")]

Without the word quoting, the digits word comes out as [Str]:

> ( 'a', 'b', '137' ).perl
("a", "b", "137")
> ( 'a', 'b', '137' )[*-1].perl
"137"
> ( 'a', 'b', '137' )[*-1].WHAT
(Str)
> my @n = ( 'a', 'b', '137' );
[a b 137]
> @n[*-1].WHAT
(Str)

You typically see these sorts of errors when there are two code paths to get to a final result instead of shared code that converges to one path very early. That's what I would look for if I wanted to track this down (but, I need to work on the book!)

This does highlight, though, that you have to be very careful about sets. Even if this bug was fixed, there are other, non-buggy ways that eqv can fail. I would have still failed because 4 as Int is not "4" as Str . I think this level of attention to data types in unperly in it's DWIMery. It's certainly something I'd have to explain very carefully in a classroom and still watch everyone mess up on it.

For what it's worth, I think the results of gist tend to be misleading in their oversimplification, and sometimes the results of perl aren't rich enough (e.g. hiding Str which forces me to .WHAT ). The more I use those, the less useful I find them.

But, knowing that I messed up before I even started would have saved me from that code spelunking that ended up meaning nothing!

Christoph ,Nov 26, 2016 at 23:55

Could you clarify what you consider the bug to be? As far as I can tell, this is all by design: (a) <...> goes through &val , which returns allomorphs if possible (b) set membership is defined in terms of identity, which distinguishes between allomorphs and their corresponding value types; so I would not classify it as a bug, but 'broken' by design; or phrased another way, it's just the WAT that comes with this particular DWIMChristoph Nov 26 '16 at 23:55

Brad Gilbert ,Nov 26, 2016 at 23:59

This was intentionally added, and is part of the testsuite . ( I can't seem to find anywhere that tests for < > being equivalent to q:w:v< > and << >> / " " being equivalent to qq:ww:v<< >> ) – Brad Gilbert Nov 26 '16 at 23:59

brian d foy ,Nov 27, 2016 at 0:02

The docs say the two lists should be eqv, and they are not. If they are not meant to be equivalent, the docs need to change. Nothing in docs.perl6.org/language/quoting#Word_quoting:_qw mentions any of this stuff. – brian d foy Nov 27 '16 at 0:02

Christoph ,Nov 27, 2016 at 0:17

The documentation seems to be just wrong here, <...> does not correspond to qw(...) , but qw:v(...) . Cf S02 for the description of the adverb and this test that Brad was <del>looking for</del> already linked toChristoph Nov 27 '16 at 0:17

Christoph ,Nov 27, 2016 at 0:45

or perhaps not outright wrong, but rather 'just' misleading: <...> is indeed a :w form, and the given example code does compare equal according to eqvChristoph Nov 27 '16 at 0:45

dwarring ,Nov 27, 2016 at 18:33

Just to add to the other answers and point out a consistancy here between sets and object hashes .

An object hash is declared as my %object-hash{Any} . This effectively hashes on objects .WHICH method, which is similar to how sets distinguish individual members.

Substituting the set with an object hash:

my %obj-hash{Any};

%obj-hash< 1 2 3 4 > = Any;
say "hash: ", %obj-hash.keys.perl;
say "4 is in hash: ", %obj-hash{4}:exists;
say "IntStr 4 is in hash: ", %obj-hash{ IntStr.new(4, "Four") }:exists;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in hash: ", %obj-hash{5}:exists;

gives similar results to your original example:

hash: (IntStr.new(4, "4"), IntStr.new(1, "1"), IntStr.new(2, "2"), IntStr.new(3, "3")).Seq
4 is in hash: False
IntStr 4 is in hash: True
IntStr(4,...) is 4: True
5 is in hash: False

brian d foy ,Nov 29, 2016 at 21:52

Oh, this is going to suck hard while teaching a class. – brian d foy Nov 29 '16 at 21:52

dwarring ,Nov 30, 2016 at 4:21

I agree its not great, as it is. – dwarring Nov 30 '16 at 4:21

dwarring ,Nov 30, 2016 at 18:26

Have raised an RT rt.perl.org/Ticket/Display.html?id=130222dwarring Nov 30 '16 at 18:26

[Nov 17, 2017] Introducing TestSimple for testing Perl programs - YouTube

Nov 17, 2017 | www.youtube.com

Published on Oct 17, 2015

For details visit: http://perlmaven.com/introducing-test... Category Science & Technology License Standard YouTube License

[Nov 17, 2017] Bruce Gray - Your Perl 5 Brain, on Perl 6 > by Bruce Gray

Nov 17, 2017 | www.youtube.com

Published on Jun 21, 2017

In which I detail the Perl 6 elements that have most changed my Perl 5 coding, and share the Perl 5 techniques I have adopted.

I eat, sleep, live, and breathe Perl!

Consultant and Contract Programmer Frequent PerlMongers speaker Dedicated Shakespeare theater-goer Armchair Mathematician Author of Blue_Tiger, a tool for modernizing Perl.

36 years coding 22 years Perl 16 years Married 15 YAPCs 7 Hackathons 3 PerlWhirls Perl interests: Refactoring, Perl Idioms / Micropatterns, RosettaCode, and Perl 6.

[Nov 17, 2017] Bit operations in Perl

Nov 17, 2017 | stackoverflow.com

Ask Question up vote down vote favorite

Toren ,Jan 12, 2011 at 14:50

I have an attribute (32 bits-long), that each bit responsible to specific functionality. Perl script I'm writing should turn on 4th bit, but save previous definitions of other bits.

I use in my program:

Sub BitOperationOnAttr
{

my $a="";

MyGetFunc( $a);

$a |= 0x00000008;

MySetFunc( $a);

}

** MyGetFunc/ MySetFunc my own functions that know read/fix value.

Questions:

  1. if usage of $a |= 0x00000008; is right ?
  2. how extract hex value by Regular Expression from string I have : For example:

"Attribute: Somestring: value (8 long (0x8))"

Michael Carman ,Jan 12, 2011 at 16:13

Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13

toolic ,Jan 12, 2011 at 16:47

Same question asked on PerlMonks: perlmonks.org/?node_id=881892toolic Jan 12 '11 at 16:47

psmears ,Jan 12, 2011 at 15:00

  1. if usage of $a |= 0x00000008; is right ?

Yes, this is fine.

  1. how extract hex value by Regular Expression from string I have : For example:

"Attribute: Somestring: value (8 long (0x8))"

I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:

if ($string =~ m/0x([0-9a-fA-F]+)/) {
    $value = hex($1);
} else {
    # string didn't match
}

should work.

Michael Carman ,Jan 12, 2011 at 16:32

Perl provides several ways for dealing with binary data:

Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:

my $mask = 1 << 3;   # 0x0008
$value |=  $mask;    # set bit
$value &= ~$mask;    # clear bit
if ($value & $mask)  # check bit

vec is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:

vec($value, 3, 1) = 1;  # set bit
vec($value, 3, 1) = 0;  # clear bit
if (vec($value, 3, 1))  # check bit

pack and unpack are better suited for working with things like C structs or endianness.

sdaau ,Jul 15, 2014 at 5:01

I upvoted, but there is something very important missing: vec operates on a string!

If we use a number; say:

$val=5;

printf("b%08b",$val);

(this gives b00000101 ) -- then one can see that the "check bit" syntax, say:

for($ix=7;$ix>=0;$ix--) {

print vec($val, $ix, 1);

};

print "\n";

will not work (it gives 00110101 , which is not the same number). The correct is to convert the number to ASCII char, i.e.

print vec(sprintf("%c", $val), $ix, 1); .

sdaau Jul 15 '14 at 5:01

[Nov 17, 2017] date - How to convert epoch seconds to normal time in perl - Stack Overflow

Nov 17, 2017 | stackoverflow.com

confused ,2 days ago

I have a string of epoch seconds "1510652305" which when i convert to normal time on unix command line using
`date -d @1510652305`

i get Tue Nov 14 15:08:25 IST 2017

But when i tried it in perl using something like this

use POSIX qw(strftime);
use Time::Local;
use Time::localtime;

$kickoff_time=1510652305;
$kickoff_time=ctime($kickoff_time);

i get

Thu Jan 1 05:30:00 1970

How can i achieve the result i am getting in linux in perl?

Thanks!!

mwp ,2 days ago

Don't overthink it!
my $kickoff_time = localtime 1510652305;
say $kickoff_time; # Tue Nov 14 15:08:25 2017

If you absolutely, positively need the timezone in there:

use POSIX qw{strftime};

my $kickoff_time = strftime '%a %b %e %H:%M:%S %Z %Y', localtime 1510652305;
say $kickoff_time; # Tue Nov 14 15:08:25 IST 2017

Note that this is locale-dependent.

confused ,yesterday

We have to use localtime to convert in time from epoch seconds and gmtime to convert in time from normal seconds i got it now....Thanks!! – confused yesterday

mob ,yesterday

Still confused. Both localtime and gmtime expect the input to be epoch seconds. – mob yesterday

mwp ,4 hours ago

Exactly. To expound, localtime() takes the epoch and returns a string (or date parts array) representing the time in your local timezone; gmtime() takes the epoch and returns a string (or date parts array) representing the time in UTC. – mwp 4 hours ago

,

I would recommend using Time::Piece for this job - it's core in perl.
#!/usr/bin/env perl

use strict;
use warnings; 
use Time::Piece;

my $t = localtime ( 1510652305 );

print $t;

It'll print default format, or you can use formatted using strftime .

[Nov 17, 2017] Meteoalarm - Weather warnings

Nov 17, 2017 | perlmonks.com


1 direct reply -- Read more / Contribute by walto
on Sep 23, 2017 at 00:50

Meteoalarm.eu ( http://meteoalarm.eu ) is the official website from European national weather services that gives out warnings in extreme weather situations. It has been a while ago that i wrote a perl module for processing this information. ( Weather warnings from www.meteoalarm.eu ). The website is still on but has changed since. That made some changes necessary. I wrote the module only for informational purposes and it is not meant to use it for anything critical. Here is the code: #!/usr/bin/perl # # package Meteoalarm; use strict; use warnings; use Carp; use LWP; use HTML::Entities; use HTML::TreeBuilder; use utf8; binmode STDOUT, ":encoding(UTF-8)"; our $VERSION = "0.06"; sub new { my $class = shift; my $self = {}; my %passed_params = @_; $self->{'user_agent'} = _make_user_agent( $passed_params{'user_agent'} ); bless( $self, $class ); return $self; } sub countries { my $self = shift; my %passed_params = @_; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = _make_country_url( $passed_params{day}, $passed_params{type} ); my $content = _fetch_content( $url, $self->{'user_agent'} ); my $country_warnings = _parse_country_warnings($content); return $country_warnings; } sub regions { my ($self) = shift; my %passed_params = @_; my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } croak "Invalid country_code: $passed_params{country_code}" unless $passed_params{country_code}; my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $country_codes{ $passed_params{country_code} } . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); my $region_warnings = _parse_region_warnings($content); return $region_warnings; } sub details { my $self = shift; my %passed_params = @_; my %country_codes = ( 'AT' => 10, 'BA' => 10, 'BE' => 801, 'BG' => 28, 'CH' => 319, 'CY' => 1, 'CZ' => 14, 'DE' => 808, 'DK' => 8, 'EE' => 805, 'ES' => 831, 'FI' => 813, 'FR' => 94, 'GR' => 16, 'HR' => 806, 'HU' => 7, 'IE' => 804, 'IL' => 803, 'IS' => 11, 'IT' => 20, 'LT' => 801, 'LU' => 2, 'LV' => 804, 'MD' => 37, 'ME' => 3, 'MK' => 6, 'MT' => 1, 'NL' => 807, 'NO' => 814, 'PL' => 802, 'PT' => 26, 'RO' => 42, 'RS' => 11, 'SE' => 813, 'SI' => 801, 'SK' => 16, 'UK' => 16 ); my ( $region, $code ) = $passed_params{region_code} =~ /^([ABCDEFGHILMNPRSU][A-Z])(\d\d\ d)/; $code =~ s /^0//; croak "Invalid region_code: $passed_params{region_code}" unless ( $country_codes{$region} and ( $code <= $country_codes{$region} ) ); my $details; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $passed_params{region_code} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); $details = _parse_details($content); return $details; } sub codes { my $self = shift; my @codes; my @countries_short; if (@_) { @countries_short = @_; } else { @countries_short = qw(AT BA BE BG CH CY CZ DE DK EE ES FI FR GR HR HU IE IL IS IT LT LU LV MD ME MK MT NL NO PL PT RO RS SE SI SK UK); } my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); foreach my $country_short (@countries_short) { my $url = 'http://meteoalarm.eu/en_UK/' . '0' . '/' . '0' . '/' . $country_codes{$country_short} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); push @codes, _parse_codes($content); } return @codes; } sub _make_country_url { my ( $day, $type ) = @_; my $url = 'http://meteoalarm.eu/en_UK/' . $day . '/' . $type . '/EU-Europe .html'; return $url; } sub _fetch_content { my ( $url, $user_agent ) = @_; my $ua = LWP::UserAgent->new; $ua->agent($user_agent); my $res = $ua->request( HTTP::Request->new( GET => $url ) ); croak " Can't fetch http://meteoalarm.eu: $res->status_line \n" unless ( $res->is_success ); return $res->decoded_content; } sub _parse_country_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down( _tag => q{td}, class => qr/^col[12]$/ ) ; for my $cell (@cells) { my @src; my $div = $cell->look_down( _tag => q{div} ); my $id = $div->id; my $alt = $div->attr(q{alt}); $data{$id}{fullname} = $alt; my @weather_events = $div->look_down( _tag => 'span', class => qr{warn awt} ); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'div', class => qr{tendenz awt nt l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt nt l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_region_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down(_tag=>qr{div}, id=>qr{area}); for my $cell (@cells) { $cell->id =~ /area_([A-Z][A-Z]\d+)/; my $id = $1; my $fullname = $cell->look_down(_tag=>'span',id=>'cname')->as_text ; my $div = $cell->look_down( _tag => q{div} ); $data{$id}{fullname} = $fullname; my @weather_events = $div->look_down(_tag=> 'span', class=>qr{warnflag warn2}); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'span', class => qr{tendenz awt\d l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt\d l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_weather_events { my $events = shift; my %weather_to_text = ( # lower case for consistency 1 => 'wind', 2 => 'snow/ice', 3 => 'thunderstorm', 4 => 'fog', 5 => 'extreme high temperature', 6 => 'extreme low temperature', 7 => 'coastal event', 8 => 'forestfire', 9 => 'avalanches', 10 => 'rain', 11 => 'unnamed', 12 => 'flood', 13 => 'rainflood' ); my %literal_warnings; for my $event (@$events) { #print $event->{class}, "\n"; $event->{class} =~ /warn\d* awt l(\d+) t(\d+)/; my $warn_level = $1; my $weather = $2; $literal_warnings{ $weather_to_text{$weather} } = $warn_level; } return \%literal_warnings; } sub _parse_details { my $content = shift; my (%data); my $p = HTML::TreeBuilder->new_from_content( decode_entities $cont ent); $data{fullname} = $p->look_down( _tag => q{h1} )->as_text; if ( $p->look_down( _tag => q{div}, class => q{warnbox awt nt l l1} ) ) { $data{warnings} = 'no warnings'; } else { my @warnboxes = $p->look_down( _tag => q{div}, class => qr/warnbox awt/ ); for my $warnbox (@warnboxes) { my ($as_txt); my @info_divs = $warnbox->look_down( _tag => q{div}, class => q{info} ); $as_txt = $info_divs[0]->as_text; my ( $from, $until ) = $as_txt =~ /valid from (.*) Until ( .*)$/; $as_txt = $info_divs[1]->as_text; my ( $warning, $level ) = $as_txt =~ /(.+?)\s+Awareness Level:\s+(.*)/; $warning =~ s/s$//; my $text = $warnbox->look_down( _tag => q{div}, class => q{text} )->as_text; $data{warnings}{ lc $warning } = { #lower case for constistency level => $level, from => $from, until => $until, text => $text, }; } } return \%data; } sub _parse_codes { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); #my @cells = $p->look_down( _tag => 'div', class => 'flags' ); my @cells = $p->look_down( _tag => qr{a} ); for my $cell (@cells) { if ( $cell->attr('xlink:href') ) { if ( $cell->attr('xlink:href') =~ /\/([A-Z][A-Z]\d+)-(.+?) .html/ ) { my $code = $1; my $fullname = $2; $data{$fullname} = $code; } } } return \%data; } sub _make_user_agent { my $ua = shift; $ua = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Fire fox/55.0' unless ($ua); return $ua; } sub _extract_details_fullname { my $content = shift; my $region; if ( $content =~ /<h1>Weather warnings: (.+?)<\/h1>/ ) { $region = $1; decode_entities($region); if ( $region =~ /.??<.*<\/a>/ ) { $region =~ s/.??<.*<\/a>//; } } else { carp "Can't get region name\n"; } return $region; } 1; __END__ =head1 NAME B<Meteoalarm> - OO Interface for meteoalarm.eu =head1 SYNOPSIS This Module gets weather warnings from meteoalarm.eu. For further reading of terms and conditions see http://meteoalarm.eu/t erms.php?lang=en_UK use Meteoalarm; my $meteo = Meteoalarm->new( 'user_agent' => 'Meteobot 0.001' ); my $countries = $meteo -> countries ('type' => 'all', 'day' => 'today' ); foreach my $country_code (sort keys %{$countries}){ print "Country: $countries->{$country_code}->{'fullname'}\n"; print "Tendency = $countries->{$country_code}->{tendency}\n" if ( $countries->{$country_code}->{'tendency'}); if (keys %{$countries->{$country_code}->{'warnings'}}){ foreach my $warning (keys %{$countries->{$country_code}->{'warning s'}}){ print "Event: $warning, severity: $countries->{$country_co de}->{'warnings'}->{$warning}\n"; } } else {print "No Warnings\n";} } my $regions = $meteo->regions( 'country_code' => 'PT', 'day' => 'today ', 'type' => 'all' ); foreach my $code ( sort keys %{$regions} ) { print "Region : $regions->{$code}->{'fullname'}: region_code = $co de\n" if ( keys %{ $regions->{$code}->{'warnings'} } ); print "Tendency = $regions->{$code}->{tendency}\n" if ( $regions-> {$code}->{'tendency'}); foreach my $type ( keys %{ $regions->{$code}->{'warnings'} } ) { print "$type Severity: $regions->{$code}->{'warnings'}->{$type}\n"; } } my $details = $meteo->details( 'region_code' => 'UK010', 'day' => 'tod ay'); my $name = $details->{'fullname'}; print "$name\n"; if ( $details->{warnings} eq 'no warnings' ) { print $details->{warnings}, "\n"; } else { foreach my $warning ( keys %{ $details->{'warnings'} } ) { print "$warning\n"; foreach my $detail ( keys %{ $details->{'warnings'}->{$warning } } ) { print "$detail: $details->{'warnings'}->{$warning}->{$deta il}\n"; } } } my $codes = $meteo->codes('FR'); my @codes = $meteo->codes(); foreach my $code (@codes) { foreach my $region ( sort keys %{$code} ) { print "Region name: $region, region code: $code->{$region}\n"; } } =head1 DESCRIPTION $meteo -> countries returns hashref of warnings for all countries. $meteo -> regions returns hashref of warnings for all regions in a spe cified country $meteo -> details returns hashref of detailled warnings for a specifie d region $meteo -> codes returns arrayref of hash of name and region code of a country =head1 METHODS =head1 new( ) creates a new meteoalarm object =head2 Optional Arguments: new( 'user_agent' => 'Meteobot 0.001'); changes the user agent string =head1 my $country = $meteo -> countries (); =head2 Optional Arguments: 'day' => 'today' || 'tomorrow' if day is not defined, default value is today 'type' => 'all' || 'wind' || 'snow' || 'ice' || 'snow/ice' || 'snow' || 'ice' || 'thunderstorm' || 'fog' || 'extreme high temperature' || 'extreme low temperature' || 'coastal event' || 'fores tfire' || 'avalanches' || 'rain' if type is not defined, default type is all =head1 $regions = $meteo -> regions ('country_code' => 'DE'); country_code is a 2 letter abbreviation =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $details = $meteo->details ('region_code' => 'ES005'); region_code consits of 2 letters for the country and 3 digits =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $code = $meteo -> codes (); Returns arrayref of hash for region names and codes for all countries =head2 Optional Arguments $code = $meteo -> codes ('PL'); Countrycode for a specific country =cut

[Nov 17, 2017] Safe string handling

Nov 17, 2017 | perlmonks.com


2 direct replies -- Read more / Contribute by tdlewis77
on Aug 25, 2017 at 13:07

Dealing with data that comes from webpages can be really complicated. There is likely to be a combination of ASCII, UTF-8, and wide characters in the data returned and you cannot depend on the website to tell you what type of content is being returned. The routines safeString, safeSubstr, testString, and trueLength can be used to easily manipulate these strings. Pass any string to safeString and you will never get a wide character warning from print. Use safeSubstr to extract complete UTF-8 characters sequences from a string. Use testString to tell you what's really in the string. Use trueLength to find out how many characters wide the output will be. # This string has a mixture of ASCII, UTF-8, 2 byte wide, and 4 byte # wide characters my $crazy = "Hello\x{26c4}".encode("utf-8","\x{26f0}"). "\x{10102}\x{2fa1b}"; # Now the string only has ASCII and UTF-8 characters my $sane = safeString($crazy); # testString($crazy) returns 7 # testString($sane) returns 3 # length($sane) returns 19 # trueLength($sane) returns 9 my $snowman = safeSubstr($crazy,5,1); ######################################## # safeString($string) # return a safe version of the string sub safeString { my ($string) = @_; return "" unless defined($string); my $t = testString($string); return $string if $t <= 3; return encode("utf-8",$string) if $t <= 5; # The string has both UTF-8 and wide characters so it needs # tender-loving care my @s = unpack('C*',$string); my @r; for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { push @r, $s[$i]; $i++; } elsif ($s[$i] > 255) { # encode a wide character push @r,unpack("C*",encode("utf-8",chr($s[$i]))); $i++; } else { # copy all the utf-8 bytes $n = _charBytes($i,@s) - 1; map { push @r, $s[$i+$_] } 0..$n; $i += $n + 1; } } return pack("C*",@r); } ######################################## # safeSubstr($string,$pos,$n) # return a safe substring (treats utf-8 sequences as a single # character) sub safeSubstr { my ($string,$pos,$n) = @_; $s = safeString($string); my $p = 0; my $rPos = 0; my $rEnd = -1; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $i++; } elsif ($s[$i] > 255) { $i++; } else { $i += _charBytes($i,@s); } $p++; $rPos = $i if $p == $pos; $rEnd = $i-1 if $p == $pos + $n; } $rEnd = scalar(@s) - 1 if $rEnd < 0; return "" if $rPos > $rEnd; my @r; map { push @r, $s[$_] } $rPos..$rEnd; return pack("C*",@r); } ######################################## # testString($string) # returns information about the characters in the string # # The 1, 2, and 4 bits of the result are for ASCII, UTF-8, and # wide characters respectively. If multiple bits are set, # characters of each type appear in the string. If the result is: # <= 1 simple ASCII string # <= 3 simple UTF-8 string # >3 && <= 5 mixed ASCII & wide characters # >= 6 mixed UTF-8 & wide characters sub testString { my ($s) = @_; return undef unless defined($s); my $r = 0; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $r |= 1; $i++; } elsif ($s[$i] > 255) { $r |= 4; $i++; } else { $r |= 2; $i += _charBytes($i,@s); } } return $r; } ######################################## # trueLength($string) # returns the number of UTF-8 characters in a string sub trueLength { my ($s) = @_; return unless defined($s); my $len = 0; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $i++; } elsif ($s[$i] > 255) { $i++; } else { $i += _charBytes($i,@s); } $len++; } return $len; } ######################################## # String support routines sub _charBytes { my $n = shift(@_); my $len = scalar(@_); if ($_[$n] < 128) { return 1; } elsif ($_[$n] > 65535) { return 4; } elsif ($_[$n] > 255) { return 2; } elsif (($_[$n] & 0xFC) == 0xFC) { return min(6,$len); } elsif (($_[$n] & 0xF8) == 0xF8) { return min(5,$len); } elsif (($_[$n] & 0xF0) == 0xF0) { return min(4,$len); } elsif (($_[$n] & 0xE0) == 0xE0) { return min(3,$len); } elsif (($_[$n] & 0xC0) == 0xC0) { return min(2,$len); } else { return 1; } }

[Nov 17, 2017] ndexed Flat File databases (for ISAM, NoSQL, Perl Embedded databases)

Nov 17, 2017 | perlmonks.com


1 direct reply -- Read more / Contribute by erichansen1836
on Oct 08, 2017 at 11:13

TOPIC: FAST!! Random Access Indexed, Relational Flat File Databases, Indexed by external Perl SDBM databases of key/value pairs tied to program "in memory" hash tables, where the Key in the Key/Value Pair is one or more fields and/or partial fields concatenated together (separated by a delimiter such as a pipe "|") and contained within the Flat File records for you to arbitrarily seek to a single record or a sorted/related group of records within your database.

Since it has been over 2 years ago since I first posted about this TOPIC I discovered, I wanted to alert the Perl community to the original thread where you can find Perl source code now for examples of how to implement Joint Database Technology/Methodology. Inparticular the King James Bible Navigator software DEMO I posted which used FlatFile/SDBM for its database. I have made this a native Windows GUI application (TreeView/RichEdit COMBO interface) to demonstrate how to show your end-users a summary of the information of the data contained within a database, and allow them to drill down to a small amount of specific information (e.g. verses within a single book/chapter) for actual viewing (and retrieving from the database). The TreeView Double Click Event was originally written to random access the first verse within a chapter, then sequentially access the remaining verses within a chapter - performing a READ for each verse. I posted a separate modified TreeView Double Click Event for you to insert into the Application which reads an entire chapter in one (1) giant READ, breaking out the individual verses (into an array) using the UNPACK statement. -- Eric

Joint Database Technology: http://www.perlmonks.org/?node_id=1121222

[Nov 17, 2017] How do the Perl 6 set operations compare elements?

Nov 17, 2017 | stackoverflow.com

Ask Question up vote down vote favorite 1

brian d foy ,Nov 26, 2016 at 4:32

Running under moar (2016.10)

Consider this code that constructs a set and tests for membership:

my $num_set = set( < 1 2 3 4 > );
say "set: ", $num_set.perl;
say "4 is in set: ", 4 ∈ $num_set;
say "IntStr 4 is in set: ", IntStr.new(4, "Four") ∈ $num_set;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in set: ", 5 ∈ $num_set;

A straight 4 is not in the set, but the IntStr version is:

set: set(IntStr.new(4, "4"),IntStr.new(1, "1"),IntStr.new(2, "2"),IntStr.new(3, "3"))
4 is in set: False
IntStr 4 is in set: True
IntStr(4,...) is 4: True
5 is in set: False

I think most people aren't going to expect this, but the docs doesn't say anything about how this might work. I don't have this problem if I don't use the quote words (i.e. set( 1, 2, 3, 4) ).

timotimo ,Nov 26, 2016 at 5:47

You took a wrong turn in the middle. The important part is what nqp::existskey is called with: the k.WHICH . This method is there for value types, i.e. immutable types where the value - rather than identity - defines if two things are supposed to be the same thing (even if created twice). It returns a string representation of an object's value that is equal for two things that are supposed to be equal. For <1>.WHICH you get IntStr|1 and for 1.WHICH you get just Int|1 .

brian d foy ,Nov 26, 2016 at 6:18

Ah, okay. I can see a lot of pain for regular people trying to debug these things. – brian d foy Nov 26 '16 at 6:18

smls ,Nov 26, 2016 at 14:46

As explained in the Set documentation, sets compare object identity, same as the === operator:

Within a Set, every element is guaranteed to be unique (in the sense that no two elements would compare positively with the === operator)

The identity of an object is defined by the .WHICH method, as timotimo elaborates in his answer.

brian d foy ,Nov 26, 2016 at 21:28

That's not really clear from that statement. That's talking about which elements are in the set. Beyond that, even if you choose to compare with ===, you have to know how other things are stored. This is the sort of info that should show up next to the Set operators. – brian d foy Nov 26 '16 at 21:28

brian d foy ,Nov 26, 2016 at 23:09

Indeed, I think I've found a bug. The qw docs says this should be true: < a b 137 > eqv ( 'a', 'b', '137' ) , but in the same version of Rakudo Star I get false. It's different object types on each side. – brian d foy Nov 26 '16 at 23:09

brian d foy ,Nov 26, 2016 at 23:16

Despite all this, your answer was the A-ha! moment that led me to look at the right thing. Thanks for all of your help. – brian d foy Nov 26 '16 at 23:16

raiph ,Nov 27, 2016 at 4:50

Write your list of numbers using commas

As you mention in your answer, your code works if you write your numbers as a simple comma separated list rather than using the <...> construct.

Here's why:

4 ∈ set 1, 2, 3, 4 # True

A bare numeric literal in code like the 4 to the left of constructs a single value with a numeric type. (In this case the type is Int, an integer.) If a set constructor receives a list of similar literals on the right then everything works out fine.

<1 2 3 4> produces a list of "dual values"

The various <...> "quote words" constructs turn the list of whitespace separated literal elements within the angle brackets into an output list of values.

The foundational variant ( qw<...> ) outputs nothing but strings. Using it for your use case doesn't work:

4 ∈ set qw<1 2 3 4> # False

The 4 on the left constructs a single numeric value, type Int . In the meantime the set constructor receives a list of strings, type Str : ('1','2','3','4') . The operator doesn't find an Int in the set because all the values are Str s so returns False .

Moving along, the huffmanized <...> variant outputs Str s unless an element is recognized as a number. If an element is recognized as a number then the output value is a "dual value". For example a 1 becomes an IntStr .

According to the doc "an IntStr can be used interchangeably where one might use a Str or an Int". But can it?

Your scenario is a case in point. While 1 ∈ set 1,2,3 and <1> ∈ set <1 2 3> both work, 1 ∈ set <1 2 3> and <1> ∈ set 1, 2, 3 both return False .

So it seems the operator isn't living up to the quoted doc's claim of dual value interchangeability

This may already be recognized as a bug in the set operation and/or other operations. Even if not, this sharp "dual value" edge of the <...> list constructor may eventually be viewed as sufficiently painful that Perl 6 needs to change.

brian d foy ,Nov 26, 2016 at 23:29

I think this is a bug, but not in the set stuff. The other answers were very helpful in sorting out what was important and what wasn't.

I used the angle-brackets form of the quote words . The quote words form is supposed to be equivalent to the quoting version (that is, True under eqv ). Here's the doc example:

<a b c> eqv ('a', 'b', 'c')

But, when I try this with a word that is all digits, this is broken:

 $ perl6
 > < a b 137 > eqv ( 'a', 'b', '137' )
 False

But, the other forms work:

> qw/ a b 137 / eqv ( 'a', 'b', '137' )
True
> Q:w/ a b 137 / eqv ( 'a', 'b', '137' )
True

The angle-bracket word quoting uses IntStr :

> my @n = < a b 137 >
[a b 137]
> @n.perl
["a", "b", IntStr.new(137, "137")]

Without the word quoting, the digits word comes out as [Str]:

> ( 'a', 'b', '137' ).perl
("a", "b", "137")
> ( 'a', 'b', '137' )[*-1].perl
"137"
> ( 'a', 'b', '137' )[*-1].WHAT
(Str)
> my @n = ( 'a', 'b', '137' );
[a b 137]
> @n[*-1].WHAT
(Str)

You typically see these sorts of errors when there are two code paths to get to a final result instead of shared code that converges to one path very early. That's what I would look for if I wanted to track this down (but, I need to work on the book!)

This does highlight, though, that you have to be very careful about sets. Even if this bug was fixed, there are other, non-buggy ways that eqv can fail. I would have still failed because 4 as Int is not "4" as Str . I think this level of attention to data types in unperly in it's DWIMery. It's certainly something I'd have to explain very carefully in a classroom and still watch everyone mess up on it.

For what it's worth, I think the results of gist tend to be misleading in their oversimplification, and sometimes the results of perl aren't rich enough (e.g. hiding Str which forces me to .WHAT ). The more I use those, the less useful I find them.

But, knowing that I messed up before I even started would have saved me from that code spelunking that ended up meaning nothing!

Christoph ,Nov 26, 2016 at 23:55

Could you clarify what you consider the bug to be? As far as I can tell, this is all by design: (a) <...> goes through &val , which returns allomorphs if possible (b) set membership is defined in terms of identity, which distinguishes between allomorphs and their corresponding value types; so I would not classify it as a bug, but 'broken' by design; or phrased another way, it's just the WAT that comes with this particular DWIMChristoph Nov 26 '16 at 23:55

Brad Gilbert ,Nov 26, 2016 at 23:59

This was intentionally added, and is part of the testsuite . ( I can't seem to find anywhere that tests for < > being equivalent to q:w:v< > and << >> / " " being equivalent to qq:ww:v<< >> ) – Brad Gilbert Nov 26 '16 at 23:59

brian d foy ,Nov 27, 2016 at 0:02

The docs say the two lists should be eqv, and they are not. If they are not meant to be equivalent, the docs need to change. Nothing in docs.perl6.org/language/quoting#Word_quoting:_qw mentions any of this stuff. – brian d foy Nov 27 '16 at 0:02

Christoph ,Nov 27, 2016 at 0:17

The documentation seems to be just wrong here, <...> does not correspond to qw(...) , but qw:v(...) . Cf S02 for the description of the adverb and this test that Brad was <del>looking for</del> already linked toChristoph Nov 27 '16 at 0:17

Christoph ,Nov 27, 2016 at 0:45

or perhaps not outright wrong, but rather 'just' misleading: <...> is indeed a :w form, and the given example code does compare equal according to eqvChristoph Nov 27 '16 at 0:45

dwarring ,Nov 27, 2016 at 18:33

Just to add to the other answers and point out a consistancy here between sets and object hashes .

An object hash is declared as my %object-hash{Any} . This effectively hashes on objects .WHICH method, which is similar to how sets distinguish individual members.

Substituting the set with an object hash:

my %obj-hash{Any};

%obj-hash< 1 2 3 4 > = Any;
say "hash: ", %obj-hash.keys.perl;
say "4 is in hash: ", %obj-hash{4}:exists;
say "IntStr 4 is in hash: ", %obj-hash{ IntStr.new(4, "Four") }:exists;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in hash: ", %obj-hash{5}:exists;

gives similar results to your original example:

hash: (IntStr.new(4, "4"), IntStr.new(1, "1"), IntStr.new(2, "2"), IntStr.new(3, "3")).Seq
4 is in hash: False
IntStr 4 is in hash: True
IntStr(4,...) is 4: True
5 is in hash: False

brian d foy ,Nov 29, 2016 at 21:52

Oh, this is going to suck hard while teaching a class. – brian d foy Nov 29 '16 at 21:52

dwarring ,Nov 30, 2016 at 4:21

I agree its not great, as it is. – dwarring Nov 30 '16 at 4:21

dwarring ,Nov 30, 2016 at 18:26

Have raised an RT rt.perl.org/Ticket/Display.html?id=130222dwarring Nov 30 '16 at 18:26

[Nov 17, 2017] Introducing TestSimple for testing Perl programs - YouTube

Nov 17, 2017 | www.youtube.com

Published on Oct 17, 2015

For details visit: http://perlmaven.com/introducing-test... Category Science & Technology License Standard YouTube License

[Nov 17, 2017] Bruce Gray - Your Perl 5 Brain, on Perl 6 by Bruce Gray

Nov 17, 2017 | www.youtube.com

Published on Jun 21, 2017

In which I detail the Perl 6 elements that have most changed my Perl 5 coding, and share the Perl 5 techniques I have adopted.

I eat, sleep, live, and breathe Perl!

Consultant and Contract Programmer Frequent PerlMongers speaker Dedicated Shakespeare theater-goer Armchair Mathematician Author of Blue_Tiger, a tool for modernizing Perl.

36 years coding 22 years Perl 16 years Married 15 YAPCs 7 Hackathons 3 PerlWhirls Perl interests: Refactoring, Perl Idioms / Micropatterns, RosettaCode, and Perl 6.

[Nov 17, 2017] Bit operations in Perl

Nov 17, 2017 | stackoverflow.com

Ask Question up vote down vote favorite

Toren ,Jan 12, 2011 at 14:50

I have an attribute (32 bits-long), that each bit responsible to specific functionality. Perl script I'm writing should turn on 4th bit, but save previous definitions of other bits.

I use in my program:

Sub BitOperationOnAttr
{

my $a="";

MyGetFunc( $a);

$a |= 0x00000008;

MySetFunc( $a);

}

** MyGetFunc/ MySetFunc my own functions that know read/fix value.

Questions:

  1. if usage of $a |= 0x00000008; is right ?
  2. how extract hex value by Regular Expression from string I have : For example:

"Attribute: Somestring: value (8 long (0x8))"

Michael Carman ,Jan 12, 2011 at 16:13

Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13

toolic ,Jan 12, 2011 at 16:47

Same question asked on PerlMonks: perlmonks.org/?node_id=881892toolic Jan 12 '11 at 16:47

psmears ,Jan 12, 2011 at 15:00

  1. if usage of $a |= 0x00000008; is right ?

Yes, this is fine.

  1. how extract hex value by Regular Expression from string I have : For example:

"Attribute: Somestring: value (8 long (0x8))"

I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:

if ($string =~ m/0x([0-9a-fA-F]+)/) {
    $value = hex($1);
} else {
    # string didn't match
}

should work.

Michael Carman ,Jan 12, 2011 at 16:32

Perl provides several ways for dealing with binary data:

Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:

my $mask = 1 << 3;   # 0x0008
$value |=  $mask;    # set bit
$value &= ~$mask;    # clear bit
if ($value & $mask)  # check bit

vec is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:

vec($value, 3, 1) = 1;  # set bit
vec($value, 3, 1) = 0;  # clear bit
if (vec($value, 3, 1))  # check bit

pack and unpack are better suited for working with things like C structs or endianness.

sdaau ,Jul 15, 2014 at 5:01

I upvoted, but there is something very important missing: vec operates on a string!

If we use a number; say:

$val=5;

printf("b%08b",$val);

(this gives b00000101 ) -- then one can see that the "check bit" syntax, say:

for($ix=7;$ix>=0;$ix--) {

print vec($val, $ix, 1);

};

print "\n";

will not work (it gives 00110101 , which is not the same number). The correct is to convert the number to ASCII char, i.e.

print vec(sprintf("%c", $val), $ix, 1); .

sdaau Jul 15 '14 at 5:01

[Nov 17, 2017] date - How to convert epoch seconds to normal time in perl - Stack Overflow

Nov 17, 2017 | stackoverflow.com

confused ,2 days ago

I have a string of epoch seconds "1510652305" which when i convert to normal time on unix command line using
`date -d @1510652305`

i get Tue Nov 14 15:08:25 IST 2017

But when i tried it in perl using something like this

use POSIX qw(strftime);
use Time::Local;
use Time::localtime;

$kickoff_time=1510652305;
$kickoff_time=ctime($kickoff_time);

i get

Thu Jan 1 05:30:00 1970

How can i achieve the result i am getting in linux in perl?

Thanks!!

mwp ,2 days ago

Don't overthink it!
my $kickoff_time = localtime 1510652305;
say $kickoff_time; # Tue Nov 14 15:08:25 2017

If you absolutely, positively need the timezone in there:

use POSIX qw{strftime};

my $kickoff_time = strftime '%a %b %e %H:%M:%S %Z %Y', localtime 1510652305;
say $kickoff_time; # Tue Nov 14 15:08:25 IST 2017

Note that this is locale-dependent.

confused ,yesterday

We have to use localtime to convert in time from epoch seconds and gmtime to convert in time from normal seconds i got it now....Thanks!! – confused yesterday

mob ,yesterday

Still confused. Both localtime and gmtime expect the input to be epoch seconds. – mob yesterday

mwp ,4 hours ago

Exactly. To expound, localtime() takes the epoch and returns a string (or date parts array) representing the time in your local timezone; gmtime() takes the epoch and returns a string (or date parts array) representing the time in UTC. – mwp 4 hours ago

,

I would recommend using Time::Piece for this job - it's core in perl.
#!/usr/bin/env perl

use strict;
use warnings; 
use Time::Piece;

my $t = localtime ( 1510652305 );

print $t;

It'll print default format, or you can use formatted using strftime .

[Nov 17, 2017] perl modules

Nov 17, 2017 | perlmonks.com

Discipulus (Monsignor) on Nov 16, 2017 at 09:04 UTC

Re: perl modules

Hello codestroman and welcome to the monastery and to the wonderful world of Perl!

First of all, please, add <c> code tags </c> around your code and output.

Then be sure to have read the standard documentation: perlmod and perlnewmod

Infact a basic perl module define a package and use Exporter to export functions in the using perl program.

In my homenode i've collected a lot of links on about module creation

L*


Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

thanos1983 (Priest) on Nov 16, 2017 at 09:17 UTC

Re: perl modules

Hello codestroman

Just to add a minor suggestion here, to the full cover reply of fellow monk Discipulus . It will assist you a lot also to read Simple Module Tutorial

Update: Direct answer to your question can be found here How to add a relative directory to @INC with multiple possible solutions. I would strongly recommend to go through all the articles that all monks proposed.

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

hippo (Abbot) on Nov 16, 2017 at 09:21 UTC

Re: perl modules (Can't locate in @INC)
PLEASE HELP!!

This is a monastery - a place of quite contemplation. The louder you shout the less wisdom shall you receive.

The error message Can't locate dog.pm in @INC is pretty explicit. Either your module file is not called dog.pm in which case, change it or else your file dog.pm is not in any of the directories listed in @INC in which case either move it to one of those directories or else change @INC with use lib .

I also see, despite the lack of formatting in your post that your module doesn't use any namespace. You should probably address that. Perhaps a solid read through Simple Module Tutorial would be a good idea?

Anonymous Monk on Nov 16, 2017 at 09:07 UTC

Re: perl modules

use an absolute pathname in use lib

Anonymous Monk on Nov 16, 2017 at 15:16 UTC

Re: perl modules

Welcome to the language ... and, to the Monastery. The "simple module tutorial" listed above is a very good place to start. Like all languages of its kind, Perl looks at runtime for external modules in a prescribed list of places, in a specified order. You can affect this in several ways, as the tutorials describe. Please read them carefully.

In the Perl(-5) language, this list is stored in a pre-defined array variable called @INC and it is populated from a variety of sources: a base-list that is compiled directly into the Perl interpreter, the PERL5LIB environment-variable, use lib statements, and even direct modification of the variable itself. Perl searches this list from beginning to end and processes (only) the first matching file that it finds.

(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Corion (Pope) on Nov 16, 2017 at 15:23 UTC

Re^2: perl modules


by Corion (Pope) on Nov 16, 2017 at 15:23 UTC

(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Please no.

The word "pragma" has a special meaning in Perl, and it is highly confusing to claim that a Perl "keyword" would be a "pragma". use certainly is a keyword and nothing else.

If you mean to say something different, please describe in more words what you want to say.

[Nov 17, 2017] Using the built-in debugger of Perl by Gabor Szabo

Youtube video, 9 min.
Nov 17, 2017 | www.youtube.com

Perl comes with a very powerful built-in command line debugger. In this screencast you can see basics how to use it.

For blog entries and for more screencasts see http://perlmaven.com/

About Perl Programming and Perl programmers.

For the blog entry of this screencast visit
http://perlmaven.com/using-the-built-...

Debugger commands used:
q - quit,
h - help,
p - print,
s - step in,
n - step over,
r - step out,
T - stack trace
l - listing code

The Padre project can be found here: http://padre.perlide.org/

The book mentioned was Pro Perl Debugging: http://www.apress.com/9781590594544

If you are interested an on-site Perl training contact me http://szabgab.com/contact.html

[Nov 17, 2017] Modern Perl by chromatic

Notable quotes:
"... 'shift key br0ken' ..."
"... # appease the Mitchell estate ..."
Nov 17, 2017 | www.amazon.com

Regex Modifiers

Several modifiers change the behavior of the regular expression operators. These modifiers appear at the end of the match, substitution, and qr// operators. For example, here's how to enable case-insensitive matching:

​ my ​ $pet = ​ 'ELLie' ​;
like $pet, ​ qr ​/Ellie/, ​ 'Nice puppy!' ​;
like $pet, ​ qr ​/Ellie/i, ​ 'shift key br0ken' ​;

The first like() will fail because the strings contain different letters. The second like() will pass, because the /i modifier causes the regex to ignore case distinctions. and are effectively equivalent in the second regex due to the modifier.

You may also embed regex modifiers within a pattern:

​ my ​ $find_a_cat = ​ qr ​/(?<feline>(?i)cat)/;

The (?i) syntax enables case-insensitive matching only for its enclosing group -- in this case, the named capture. You may use multiple modifiers with this form. Disable specific modifiers by preceding them with the minus character ( ):

​ my ​ $find_a_rational = ​ qr ​/(?<number>(?-i)Rat)/;

... ... ...

The /e modifier lets you write arbitrary code on the right side of a substitution operation. If the match succeeds, the regex engine will use the return value of that code as the substitution value. The earlier global substitution example could be simpler with code like the following:

# appease the Mitchell estate
$sequel =~ ​ ​{Scarlett( O​ 'Hara)?}
{
' ​Mauve​ ' . defined $1
? ' ​ Midway​ '
: ''
}ge;

Each additional occurrence of the /e modifier will cause another evaluation of the result of the expression, though only Perl golfers use anything beyond /ee

[Nov 16, 2017] Connecting with NetFTP in Perl, but fails to upload - Stack Overflow

Nov 16, 2017 | stackoverflow.com

Andrew Newby, Nov 10 at 11:33

I am trying to use Net::FTP ( http://search.cpan.org/~shay/libnet-3.10/lib/Net/FTP.pm ) to upload a file to a remote server. I have:
use Net::FTP;

my $ftp = Net::FTP->new("example.com", Debug => 1) or die "Cannot connect to example.com: $@";

$ftp->login("username",'xxxx') or die "Cannot login ", $ftp->message;
$ftp->pasv();
$ftp->binary();
$ftp->cwd("/web/example.com/public_html/cgi-bin/links/admin/IMPORT") or die "Cannot change working directory ", $ftp->message;
print "Currently in: " . $ftp->pwd(). "\n";
$ftp->put("/home/chambres/web/example.com/public_html/cgi-bin/links/admin/org.csv") or die "Cannot upload ", $ftp->message;
$ftp->quit;

However, when I run it I get:

Net::FTP>>> Net::FTP(3.05)
Net::FTP>>>   Exporter(5.72)
Net::FTP>>>   Net::Cmd(3.05)
Net::FTP>>>   IO::Socket::SSL(2.024)
Net::FTP>>>     IO::Socket::IP(0.37)
Net::FTP>>>       IO::Socket(1.38)
Net::FTP>>>         IO::Handle(1.35)
Net::FTP=GLOB(0x182e348)<<< 220 (vsFTPd 3.0.3)
Net::FTP=GLOB(0x182e348)>>> USER username
Net::FTP=GLOB(0x182e348)<<< 331 Please specify the password.
Net::FTP=GLOB(0x182e348)>>> PASS ....
Net::FTP=GLOB(0x182e348)<<< 230 Login successful.
Net::FTP=GLOB(0x182e348)>>> EPSV
Net::FTP=GLOB(0x182e348)<<< 229 Entering Extended Passive Mode (|||12065|)
Net::FTP=GLOB(0x182e348)>>> TYPE I
Net::FTP=GLOB(0x182e348)<<< 200 Switching to Binary mode.
Net::FTP=GLOB(0x182e348)>>> CWD /web/example.com/public_html/cgi-bin/links/admin/IMPORT
Net::FTP=GLOB(0x182e348)<<< 250 Directory successfully changed.
Net::FTP=GLOB(0x182e348)>>> PWD
Net::FTP=GLOB(0x182e348)<<< 257 "/web/example.com/public_html/cgi-bin/links/admin/IMPORT" is the current directory
Currently in: /web/example.com/public_html/cgi-bin/links/admin/IMPORT
Net::FTP=GLOB(0x182e348)>>> PORT 139,162,208,252,155,199
Net::FTP=GLOB(0x182e348)<<< 200 PORT command successful. Consider using PASV.
Net::FTP=GLOB(0x182e348)>>> FEAT
Net::FTP=GLOB(0x182e348)<<< 211-Features:
Net::FTP=GLOB(0x182e348)<<<  EPRT
Net::FTP=GLOB(0x182e348)<<<  EPSV
Net::FTP=GLOB(0x182e348)<<<  MDTM
Net::FTP=GLOB(0x182e348)<<<  PASV
Net::FTP=GLOB(0x182e348)<<<  REST STREAM
Net::FTP=GLOB(0x182e348)<<<  SIZE
Net::FTP=GLOB(0x182e348)<<<  TVFS
Net::FTP=GLOB(0x182e348)<<< 211 End
Net::FTP=GLOB(0x182e348)>>> HELP ALLO
Net::FTP=GLOB(0x182e348)<<< 214-The following commands are recognized.
Net::FTP=GLOB(0x182e348)<<<  ABOR ACCT ALLO APPE CDUP CWD  DELE EPRT EPSV FEAT HELP LIST MDTM MKD
Net::FTP=GLOB(0x182e348)<<<  MODE NLST NOOP OPTS PASS PASV PORT PWD  QUIT REIN REST RETR RMD  RNFR
Net::FTP=GLOB(0x182e348)<<<  RNTO SITE SIZE SMNT STAT STOR STOU STRU SYST TYPE USER XCUP XCWD XMKD
Net::FTP=GLOB(0x182e348)<<<  XPWD XRMD
Net::FTP=GLOB(0x182e348)<<< 214 Help OK.
Net::FTP=GLOB(0x182e348)>>> ALLO 37954326
Net::FTP=GLOB(0x182e348)<<< 202 ALLO command ignored.
Net::FTP=GLOB(0x182e348)>>> STOR org.csv
Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection.
<h1>Software error:</h1>
<pre>Cannot upload Failed to establish connection.
</pre>
<p>
For help, please send mail to this site's webmaster, giving this error message
and the time and date of the error.

</p>
[Fri Nov 10 10:57:33 2017] export-csv-other-sites.cgi: Cannot upload Failed to establish connection.

It seems to work up until the put() command. Any ideas as to what is going on?

Gerhard Barnard, Nov 10 at 11:36

huh? $ftp->put("/home/chambres/web/example.com/public_html/cgi- That seems incomplete. – Gerhard Barnard Nov 10 at 11:36

Gerhard Barnard, Nov 10 at 11:38

Secondly, it is not connecting. It tells you that twice Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection. and then again <pre>Cannot upload Failed to establish connectionGerhard Barnard Nov 10 at 11:38

Andrew Newby, Nov 10 at 11:41

@GerhardBarnard - I know that :) The weird part, is that it says it IS connected: Currently in: /web/example.com/public_html/cgi-bin/links/admin/IMPORT . – Andrew Newby Nov 10 at 11:41

Andrew Newby, Nov 10 at 11:42

"That seems incomplete." - what seems incomplete? – Andrew Newby Nov 10 at 11:42

Gerhard Barnard, Nov 10 at 11:43

I suspect it is not keeping the connection open. can you also fix the code? it seems incomplete. $ftp->put("/home/chambres/web/example.com/public_html/cgi-Gerhard Barnard Nov 10 at 11:43
Net::FTP=GLOB(0x182e348)>>> PORT 139,162,208,252,155,199
Net::FTP=GLOB(0x182e348)<<< 200 PORT command successful. Consider using PASV.

FTP uses a control connection for the command and data connections for each data transfer. With the PORT command your local system is instructing the server to connect to the given IP address (139.162.208.252) and port (39879=155*256+199). Connecting from outside to some arbitrary port on your system will not work if you are behind a firewall or some NAT or if there is a firewall configured on your system. In these cases it might work to use the passive mode where the client opens a connection to the server and not the server a connection to the client.

Net::FTP=GLOB(0x182e348)>>> STOR org.csv
Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection.

It looks like the server could not connect to your system in order to create a connection to transfer the data. Probably a firewall or NAT involved. Try passive mode.

It looks like that you tried to use passive mode already:

$ftp->pasv();
...
Net::FTP=GLOB(0x182e348)>>> EPSV
Net::FTP=GLOB(0x182e348)<<< 229 Entering Extended Passive Mode (|||12065|)

Only you did it the wrong way. The command above just sends the PASV/EPSV command to the server but does not change which mode gets used for the next data transfer. To cite from the documentation :

If for some reason you want to have complete control over the data connection, this includes generating it and calling the response method when required, then the user can use these methods to do so.
However calling these methods only affects the use of the methods above that can return a data connection. They have no effect on methods get, put, put_unique and those that do not require data connections.

To instead enable passive mode in connection with put , get etc use passive not pasv :

$ftp->passive(1);

[Nov 16, 2017] Re^4 Strange behaviour of tr function in case the set1 is supplied by a variable

Nov 16, 2017 | perlmonks.com

likbez

// is an abbreviation for m// (be careful of context). But // is can be replaced by (almost?) any delimiter, by using m or s or tr.

You make a very good point. Now I started to understand why they put description of tr, which is actually a function into this strange place

http://perldoc.perl.org/perlop.html#Quote-Like-Operators
Strings with arbitrary delimiters after tr, m, s, etc are a special, additional type of literals. Each with its own rules. And those rules are different from rules that exist for single quoted strings, or double quoted strings or regex (three most popular types of literals in Perl).

For example, the treatment of backslash in "tr literal" is different from single quoted strings:

"A single-quoted, literal string. A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated."

This means that in Perl there is a dozen or so of different types of literals, each with its own idiosyncratic rules. Which create confusion even for long type Perl users as they tend to forget detail of constructs they use rarely and extrapolate them from more often used constructs.

For example, in my case, I was burned by the fact that "m literals" allows interpolation of variables, but "tr literals" do not. And even created a test case to study this behavior :-)

In other words, the nature of those "context-dependent-literals" (on the level of lexical scanner they are all literals) is completely defined not by delimiters they are using (which are arbitrary), but by the operator used before it. If there none, m is assumed.

This "design decision" (in retrospect this is a design decision, although in reality it was "absence of design decition" situation ;-) adds unnecessary complexity to the language and several new (and completely unnecessary) types of bugs.

This "design decision" is also poorly documented and for typical "possible blunders" (for tr that would be usage of "[","$","@" without preceding backslash) there is no warnings.

This trick of putting tr description into http://perldoc.perl.org/perlop.html that I mentioned before now can be viewed as an attempt to hide this additional complexity. It might be beneficial to revise the docs along the lines I proposed.

In reality in Perl q, qq, qr, m, s, tr are functions each of which accepts (and interpret) a specific, unique type of "context-dependent-literal" as the argument. That's the reality of this, pretty unique, situation with the language, as I see it.

Quote-Like-Operators shows 2 interesting examples with tr: tr[aeiouy][yuoiea] or tr(+\-*/)/ABCD/. [download]
The second variant look like a perversion for me. I never thought that this is possible. I thought that the "arbitrary delimiter" is "catched" after the operator and after that they should be uniform within the operator ;-).

And the first is not without problems either: if you "extrapolate" your skills with regex into tr you can write instead of tr[aeiouy][yuoiea] obviously incorrect< code>tr/ aeiouy /] yuoiea / that will work fine as long as strings are of equal length.

[Nov 16, 2017] Accessing a filehandle which is defined in main program from different modules

Nov 11, 2017 | stackoverflow.com

Rotch Miller, Nov 11 at 6:48

I have following query in Perl regarding the accessing of file handlers.

Consider the following code snippet which describes the exact scenario.

Main.pl
#!/usr/bin/perl -w
use warnings;
use strict;
use strict 'refs';

use File::Basename;
use Fcntl ':flock';

use feature qw/say switch/;

use File::Spec::Functions;
use File::Find;

require( "/home/rxa3kor/Mastering_Perl/sample.pm" );

our $LOGFILE = "sample";
open( LOGFILE, ">$LOGFILE" ) or die "__ERROR: can't open file\n'", $LOGFILE, "'!\n";
flock( LOGFILE, LOCK_EX );
print LOGFILE ( "Tool Start\n" );

&sample::func();

flock( LOGFILE, LOCK_UN );
close( LOGFILE );
sample.pm
#!/usr/bin/perl -w
package sample;

sub func() {
    print $main::LOGFILE ( "Printing in subroutine\n" );
}

when I execute the above said code snippet I am getting the following error.

print() on unopened filehandle Mastering at /home/rxa3kor/Mastering_Perl/sample.pm line 6.

Th error is because the filehandle LOGFILE is not visible under sample.pm module.

How this concept can be implemented?

I want to open a file in Main.pl and I need this file handle to be accessible in different Perl modules.

Dave Cross ,Nov 11 at 6:54

I don't think this is the code you are using. This code doesn't compile. You are missing a semicolon at the end of the use File::Find line. And once I fix that, I get another problem as you are not loading sample.pm in your main program. Please don't waste our time by posting sample code where we have to fix simple errors like that. – Dave Cross Nov 11 at 6:54

Dave Cross ,Nov 11 at 6:55

Two more errors. sample.pm does not return a true value. And the filename is different between this sample code and the error message that you quote. – Dave Cross Nov 11 at 6:55

Rotch Miller ,Nov 11 at 7:01

Basically i wanted to know whether we can open a file under main.pl and i need this file handle to be accessible in different Perl modules. – Rotch Miller Nov 11 at 7:01

DavidO ,Nov 11 at 7:11

This is unrelated to the problem that you are asking about, but what do you think will happen when you open your logfile in '>' mode, and then discover you're unable to obtain an exclusive lock because someone else has it locked? – DavidO Nov 11 at 7:11

DavidO ,Nov 11 at 7:13

Hint: Clobber-output mode will clobber the output file before you've obtained a lock. This means if someone else already had the file opened with a lock, you just clobbered them. – DavidO Nov 11 at 7:13

Dave Cross ,Nov 11 at 7:19

The reason why you're seeing this error is that $main::LOGFILE refers to the scalar variable $LOGFILE which contains the filename, sample . The filehandle, LOGFILE , is a completely different variable. And here we see the dangers of having two variables of different types (scalar vs filehandle) with the same name.

Bareword filehandles (the ones in capital letters with no sigil attached, the type you are using) are slightly strange variables. They don't need a sigil, so you shouldn't use one. So the simplest fix is to just remove the $ .

sub func()
{
  print main::LOGFILE ("Printing in subroutine\n");
}

But using global variables like this is a terrible idea. It will quickly lead to your code turning into an unmaintainable mess.

Far better to use a lexical filehandle and to pass that into your subroutine.

our $LOGFILE="sample";
open( my $log_fh, ">$LOGFILE" ) or die "__ERROR: can't open file\n'",$LOGFILE,"'!\n";
flock( $log_fh, LOCK_EX );
print $log_fh ("Tool Start\n");
&sample::func($log_fh);
flock( $log_fh, LOCK_UN );
close( $log_fh );

And in sample.pm :

sub func
{
  my ($fh) = @_;
  print $fh ("Printing in subroutine\n");
}

Note that as I'm now passing a parameter to func() . I've removed the prototype saying that it takes no parameters (although the fact that you were calling it with & turns off parameter checking!)

A few other points.

I'd write your code like this:

# main.pl
use warnings;
use strict;

use File::Basename; # Not used. Remove?
use Fcntl ':flock'; # Not user. Remove?
use feature qw/say switch/;
use File::Spec::Functions; # Not user. Remove?
use File::Find; # Not user. Remove?
use Sample;

my $LOGFILE = 'sample';
# Lexical filehandle. Three-arg version of open()
open( my $log_fh, '>', $LOGFILE )
  or die "__ERROR: can't open file\n'$LOGFILE'!\n";
flock( $log_fh, LOCK_EX );

print $log_fh ("Tool Start\n");
sample::func($log_fh);

flock( $log_fh, LOCK_UN );
close( $log_fh );

And...

package Sample;
use strict;
use warnings;

sub func {
  my ($fh) = @_;
  print $fh ("Printing in subroutine\n");
}

1;

Rotch Miller ,Nov 11 at 7:29

Is there any method where we can avoid passing the file handler to a subroutine ? I need to directly access the file handler in the perl module which is present in main,pl. Reason for this requirement is because i may have different Perl modules and different subroutines inside each modules, every time i need to pass the file handlers to each of these subroutines in Perl module. Another difficulty will be always subroutine need not be called from main.pl file, subroutine defined in a *.pm file may call other subroutine which is defined in another *.pm module. – Rotch Miller Nov 11 at 7:29

Dave Cross ,Nov 11 at 7:32

@RotchMiller: My answer already tells you how to do that. But I think it's a very bad idea. – Dave Cross Nov 11 at 7:32

Rotch Miller ,Nov 11 at 7:54

Main underlying problem is the way how the file handler's can be made visible in the subroutine of different Perl modules. Like how we have to export a scalar variables from one *.pm module to any perl modules using the EXPORTER, similar concept for file handlers would be good. – Rotch Miller Nov 11 at 7:54

Dave Cross ,Nov 11 at 7:59

@RotchMiller: Exporter works fine for filehandles. Obviously not if they're lexical variables. But for package variables and bareword filehandles, there's no problem. – Dave Cross Nov 11 at 7:59

Dave Cross ,Nov 11 at 8:03

@RotchMiller The traditional way to make a variable visible within subroutines in many different modules is to pass it in as a parameter. But if you want to ignore seventy years of good software engineering practice - feel free :-) – Dave Cross Nov 11 at 8:03

> ,

You've got an extremely detailed analysis from Dave Cross .

Here I'd like to offer a way to cleanly provide a log file for all modules to write to.

Introduce a module that performs the writes to a log file in a sub; load it by all modules that need that. In that sub open the log file to append, using state filehandle which thus stays open across the calls. Then the modules write by invoking this sub, and this can be initiated by a call from main .

The logger module

package LogAll;

use warnings;
use strict;
use feature qw(say state);
use Carp qw(croak);    
use Exporter qw(import);

our @EXPORT_OK = qw(write_log);

sub write_log {
    state $fh = do {               # initialize; stays open across calls
        my $log = 'LOG_FILE.txt';
        open my $afh, '>>', $log or croak "Can't open $log: $!";
        $afh;
    };  
    say $fh $_ for @_;
}
1;

Two other modules, that need to log, are virtually the same for this example; here is one

package Mod1;

use warnings;
use strict;

use Exporter qw(import);    
use LogAll qw(write_log);

our @EXPORT_OK = qw(f1);

sub f1 {
    write_log(__PACKAGE__ . ": @_");
}
1;

The main

use warnings;
use strict;

use LogAll qw(write_log);    
use Mod1 qw(f1);
use Mod2 qw(f2);

write_log('START');

f1("hi from " . __PACKAGE__);
f2("another " . __PACKAGE__);

A run results in the file LOG_FILE.txt

START
Mod1: hi from main
Mod2: another main

I print START for a demo but the file need not be opened from main .

Please develop the printer module further as suitable. For example, and a way for the file name to be passed optionally so that main can name the log (by varying type and number of arguments), and add a way to close the log controllably,

[Nov 16, 2017] Generating a range of Unicode characters

Notable quotes:
"... The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^ a-zA-Z * 0-9 *\z/ , the increment is done as a string, preserving each character within its range, with carry: ..."
"... print ++($foo = "99"); # prints "100" print ++($foo = "a0"); # prints "a1" print ++($foo = "Az"); # prints "Ba" print ++($foo = "zz"); # prints "aaa" [download] ..."
Nov 16, 2017 | perlmonks.com

davido (Archbishop) on Nov 16, 2017 at 05:46 UTC

Re: Generating a range of Unicode characters

Check out perlop Auto-increment and Auto-decrement for an explanation.

The thing to consider here is that the .. range operator leverages the semantics provided by ++ (auto-increment). The documentation for auto-increment says this:

The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^ a-zA-Z * 0-9 *\z/ , the increment is done as a string, preserving each character within its range, with carry:

print ++($foo = "99"); # prints "100" print ++($foo = "a0"); # prints "a1" print ++($foo = "Az"); # prints "Ba" print ++($foo = "zz"); # prints "aaa" [download]

The components of the range you are trying to construct do not meet the criteria for Perl's built-in autoincrement behavior.

However, if you're using Perl 5.26 or newer, and enable unicode_strings you can use the following, as documented in perlop Range Operators .

use charnames "greek"; my @greek_small = map { chr } (ord("\N{alpha}") .. ord("\N{omega}")); [download]

Or forgo the \N{charname} lookups and just use the actual ordinal values:

my @chars = map {chr} $ord_first .. $ord_last; [download]

Dave

Your Mother (Chancellor) on Nov 16, 2017 at 06:13 UTC

Re: Generating a range of Unicode characters

Is this what you're after?

perl -CSD -le 'print chr for 0xDF .. 0x0101' [download]

Update: I hadn't read all the way down davido 's post. He is making the same suggestion already at the end.

[Nov 16, 2017] regex - Parsing a whole file in Perl

Nov 12, 2017 | stackoverflow.com

pleriche, Nov 12 at 9:52

I have an HTML file containing a 2-column table which I want to parse in order to extract pairs of strings representing the columns. The page layout of the HTML (white space, new lines) is arbitrary, hence I can't parse the file line by line.

I recall that you can parse such a thing by slurping the whole file into a string and operating on the entire string, which I'm finding a bit more challenging. I'm trying things like the following:

#!/usr/bin/perl

open(FILE, "Glossary") || die "Couldn't open file\n";
@lines = <FILE>;
close(FILE);

$data = join(' ', @lines);

while ($data =~ /<tr>.*(<td>.*<\/td>).*(<td>.*<\/td>).*<\/tr>/g) {
    print $1, ":", $2, "\n";
}

which gives a null output. Here's a section of the input file:

<table class="wikitable">
    <tr>
        <td><b>Term</b>
        </td>
        <td><b>Meaning</b>
        </td></tr>
    <tr>
        <td><span id="0-Day">0-Day</span>
        </td>
        <td>
        <p>See <a href="#Zero_Day">Zero Day</a>.
        </p>
        </td>

Can someone help me out?

Borodin ,Nov 12 at 21:20

Use HTML::TableExtractBorodin Nov 12 at 21:20

zdim ,Nov 12 at 21:46

To correct my early comment (removed), while I recommend HTML::TreeBuilder for general parsing of HTML (and there are others), here you indeed want HTML::TableExtract . And you do not want to use regex – zdim Nov 12 at 21:46

Dave Cross ,2 days ago

You can't parse HTML with a regexDave Cross 2 days ago

Miguel Prz ,Nov 12 at 10:03

There is a HTML::TableExtract module in CPAN, which simplifies the problem you are trying to solve:
use strict;
use warnings;
use HTML::TableExtract qw(tree);

my $te = HTML::TableExtract->new( headers => qw(Term Meaning) );
my $html_file = "Glossary";
$te->parse_file($html_file);
my $table = $te->first_table_found;
# ...

pleriche ,yesterday

Thank you and I'm sure TableExtract is the better way of doing it, but the object of my question was to improve my understanding of how to use regular expressions since they're so central to Perl. Adding gs to the regexpr as someone suggested (since deleted) was the leg-up I needed. – pleriche yesterday

Miguel Prz ,yesterday

I see your point, and it's really important build a solid knowlegment on regexpr. But, like other people have said, it's not a goot idea apply regexpr to parsing html documents – Miguel Prz yesterday

,

You already have answers explaining why you shouldn't parse HTML with regexes. And you really shouldn't. But you've asked for an explanation of why your code doesn't work. So here goes...

You have two problems in your code. One stops it working and the other stops it working as you expect.

Firstly, you are using . in your regex to match any character. But . doesn't match any character. It matches any character except a newline. And you have newlines in your string. You fix that by adding the /s option to your match operator (so it has /gs instead of /s ).

With that fix in place, you get a result from your code. Using your test data, I see:

<td><b>Term</b>
         </td>:<td><b>Meaning</b>
         </td>

Which is correct. But looking at your test data, I wondered why I wasn't getting two results - because of the /g . I soon realised it was because your test data is missing the closing </td> . When I added that, I got this result:

<td><span id="0-Day">0-Day</span>
         </td>:<td>
         <p>See <a href="#Zero_Day">Zero Day</a>.
         </p>
         </td>

Ok. It's now finding the second result. But what has happened to the first one? That's the second error in your code.

You have .* a few times in your regex. That means "zero or more of any character". But it's the "or more" that is a problem here. By default, Perl regex qualifiers ( * or + ) are greedy. That means they will use up as much of the string as possible. And the first .* in your regex is eating up a lot of your string. All of it up to the second <tr> in fact.

The solution to that is to make the .* non-greedy. And you do that by adding ? to the end. So you can replace all of the .* with .*? . Having done that, I get this output:

<td><b>Term</b>
         </td>:<td><b>Meaning</b>
         </td>
<td><span id="0-Day">0-Day</span>
         </td>:<td>
         <p>See <a href="#Zero_Day">Zero Day</a>.
         </p>
         </td>

Which seems correct to me.

So, to summarise:

  1. By default, . doesn't match newlines. To do that, you need /s .
  2. Beware of greedy qualifiers.

[Nov 16, 2017] Reading/dumping a perl hash from shell

Nov 11, 2017 | stackoverflow.com

newbie ,Nov 11 at 0:27

I have a read-only perl file with a huge hash defined in it. Is there anyway for me to read this perl file and dump out the hash contents?

this is basic structure of the hash within the file.

%hash_name = {
    -files => [
         '<some_path>',
    ],
    -dirs => [
         '<some_path>',
         '<some_path>',
         '<some_path>',
         '<some_path>',
         '<some_path>',
    ],
};

Davy M ,Nov 11 at 0:30

can you not cat the file and redirect it into a one that does have write permissions? cat perl_file_name > new_perl_file_nameDavy M Nov 11 at 0:30

newbie ,Nov 11 at 0:34

yes I did consider that but will go with that approach only if there is no other way to dump the hash without creating a new file. – newbie Nov 11 at 0:34

zdim ,Nov 11 at 2:59

@newbie Thank you, and to repeat the question: Does this file have other Perl code or just this hash? Also, is the hash undeclared (just %hash_name ), as you show it, or is it "lexical," so with my such as: my %hash_name ? – zdim Nov 11 at 2:59

zdim ,Nov 11 at 3:12

@newbie What you show is invalid in Perl: the % in %hash_name indicates that the variable is a hash , but { .. } form a hash reference , which is a scalar variable (not a hash). So it should be either %hash_name = ( .. ) or it's $hashref_name = { .. }zdim Nov 11 at 3:12

Schwern ,Nov 11 at 6:59

Note this is an insecure way to store data. The data file must be evaluated as perl code. Any arbitrary code could be in the file. In addition, the data file can only be read by Perl programs. Instead, use JSON or similar data format. JSON::MaybeXS can convert between JSON and Perl. – Schwern Nov 11 at 6:59

zdim ,Nov 11 at 3:40

Ideally you'd copy the file so that you can edit it, then turn it into a module so to use it nicely.

But if for some reason this isn't feasible here are your options.

If that hash is the only thing in the file , "load" it using do and assign to a hash

use warnings;
use strict;

my $file = './read_this.pl';  # the file has *only* that one hash

my %hash = do $file;

This form of do executes the file (runs it as a script), returning the last expression that is evaluated. With only the hash in the file that last expression is the hash definition, precisely what you need.

If the hash is undeclared , so a global variable (or declared with our ), then declare as our a hash with the same name in your program and again load the file with do

our %hash_name;  # same name as in the file
do $file;        # file has "%hash" or "our %hash" (not "my %hash")

Here we "pick up" the hash that is evaluated as do runs the file by virtues of our

If the hash is "lexical" , declared as my %hash (as it should be!) ... well, this is bad. Then you need to parse the text of the file so to extract lines with the hash. This is in general very hard to do, as it amounts to parsing Perl. (A hash can be built using map , returned from a sub as a reference or a flat list ...) Once that is done you eval the variable which contains the text defining that hash.

However, if you know how the hash is built, as you imply, with no () anywhere inside

use warnings; 
use strict;

my $file = './read_this.pl';

my $content = do {  # "slurp" the file -- read it into a variable
    local $/;
    open my $fh, '<', $file or die "Can't open $file: $!";
    <$fh>;
};

my ($hash_text) = $content =~ /\%hash_name\s*=\s*(\(.*?\)/s;
my %hash = eval $hash_text;

This simple shot leaves out a lot, assuming squarely that the hash is as shown. Also note that this form of eval carries real and serious security risks.


Files are also loaded using require . Apart from it doing a lot more than do , the important thing here is that even if it runs multiple times require still loads that file only once . This matters for modules in the first place, which shouldn't be loaded multiple times, and use indeed uses require .

On the other hand, do does it every time, what makes it suitable for loading files to be used as data, which presumably should be read every time. This is the recommended method. Note that require itself uses do to actually load the file.

Thanks to Schwern for a comment.

Schwern ,Nov 11 at 4:31

do will always load the file. require will only load it once. Since you want to get data from the file, it's recommended to use do . Else the second or third time anything in that process loads the file they'll end up with 1 . – Schwern Nov 11 at 4:31

zdim ,Nov 11 at 4:41

@Schwern Right, thank you for the comment. I wanted to avoid excessive explanation thus I simply use do . (I still mention require since it is feasible that the data is loaded once.) But it is good to state this, thank you -- I am adding the comment. – zdim Nov 11 at 4:41

Schwern ,Nov 11 at 4:57

It's bad practice to use require because a future person maintaining the code may also require the same file elsewhere (not even in the same code file, it's per process) and not realize it has already been required. I'd suggest instead explaining why do is the right thing to do here instead of require , it's a necessary complexity. – Schwern Nov 11 at 4:57

zdim ,Nov 11 at 5:40

@Schwern A good point, thank you. Adjusted the post. – zdim Nov 11 at 5:40

zdim ,Nov 13 at 2:15

@DavyM Thank you for kind words. You are right, and there are so many such questions that it even seems more common (than otherwise) when people start out. All that we can do is to keep pointing it out, and write it always in full code examples. – zdim Nov 13 at 2:15

[Nov 16, 2017] perl perlpacktut not making sense for me - Stack Overflow

Nov 13, 2017 | stackoverflow.com

brian d foy ,Nov 13 at 2:34

The pack function puts one or more things together in a single string. It represents things as octets (bytes) in a way that it can unpack reliably in some other program. That program might be far away (like, the distance to Mars far away). It doesn't matter if it starts as something human readable or not. That's not the point.

Consider some task where you have a numeric ID that's up to about 65,000 and a string that might be up to six characters.

print pack 'S A6', 137, $ARGV[0];

It's easier to see what this is doing if you run it through a hex dumper as you run it:

$ perl pack.pl Snoopy | hexdump -C
00000000  89 00 53 6e 6f 6f 70 79                           |..Snoopy|

The first column counts the position in the output so ignore that. Then the first two octets represent the S (short, 'word', whatever, but two octets) format. I gave it the number 137 and it stored that as 0x8900. Then it stored 'Snoopy' in the next six octets.

Now try it with a shorter name:

$ perl test.pl Linus | hexdump -C
00000000  89 00 4c 69 6e 75 73 20                           |..Linus |

Now there's a space character at the end (0x20). The packed data still has six octets. Try it with a longer name:

$ perl test.pl 'Peppermint Patty' | hexdump -C
00000000  89 00 50 65 70 70 65 72                           |..Pepper|

Now it truncates the string to fit the six available spaces.

Consider the case where you immediately send this through a socket or some other way of communicating with something else. The thing on the other side knows it's going to get eight octets. It also knows that the first two will be the short and the next six will be the name. Suppose the other side stored that it $tidy_little_package . It gets the separate values by unpacking them:

my( $id, $name ) = unpack 'S A6', $tidy_little_package;

That's the idea. You can represent many values of different types in a binary format that's completely reversible. You send that packed string wherever it needs to be used.

I have many more examples of pack in Learning Perl and Programming Perl .

[Nov 16, 2017] perl - Grep Two Dimensional Array - Stack Overflow

Nov 16, 2017 | stackoverflow.com

Grep Two Dimensional Array Ask Question up vote down vote favorite

Taranasaur ,yesterday

Since this is not a question directly covered here, thought best I ask and answer it.

I had an issue where I wanted to add a node name to a list only if the same node doesn't already exist. The array was built using:

push (@fin_nodes, [$node, $hindex, $e->{$hip}->{FREQ}]);

So given when given array (@fin_nodes) that looks like:

$VAR1 = [
      'first-node',
      '4',
      3
    ];
$VAR2 = [
      'second-node',
      '1',
      3
    ];
$VAR3 = [
      'another-node',
      '1',
      5
    ];
$VAR4 = [
      'some-node',
      '0',
      5
    ];

To do a grep on this the following works:

my @match = grep { grep { $_ =~ $node } @$_ } @fin_nodes;

So given a $node "second-node" the above statement will return @match as:

$VAR1 = [
  'second-node',
  '1',
  3
];

Sobrique ,yesterday

Why not use a hash instead? – Sobrique yesterday

ysth ,yesterday

when dumping an array, do Data::Dumper::Dumper(\@array), not ...(@array). if passed a list, Dumper dumps each element individually, which is not what you want here – ysth yesterday

,

I would say "don't" and instead:
my %fin_nodes;
$fin_nodes{$node} = [$hindex, $e->{$hip}->{FREQ}]);

And then you can simply if ($fin_nodes{$node}) {

Failing that though - you don't need to grep every element, as your node name is always first.

So:

 my @matches = grep { $_ -> [0] eq $node } @fin_nodes;

eq is probably a better choice than =~ here, because the latter will substring match. (And worse, can potentially do some quite unexpected things if you've metacharacters in there, since you're not quoting or escaping them)

E.g. in your example - if you look for a node called "node" you'll get multiple hits.

Note - if you're only looking for one match, you can do something like:

my ( $first_match ) =  grep { $_ -> [0] eq $node } @fin_nodes;

This will just get you the first result, and the rest will be discarded. (Which isn't too efficient, because grep will continue to iterate the whole list).

Taranasaur ,yesterday

Your last statement was on point, I only needed one match. Then before pushing a node onto fin_nodes this was enough: "if (!$first_match)" – Taranasaur yesterday

Borodin ,yesterday

@Taranasaur: I think you missed the point of Sobrique's answer. A hash is by far the better choice for this, and you can simply write $fin_nodes{$node} //= [ $hindex, $e->{$hip}{FREQ} ] and avoid the need for any explicit test altogether. – Borodin yesterday

Taranasaur ,yesterday

@Borodin, no I do get Sobrique's point. The fin_nodes array is being used for a simple list function that another method is already using quite happily in my program. I will at some point go back and create a hash as there might be more attributes I'll need to include in that array/hash – Taranasaur yesterday

ysth ,yesterday

"because the latter will substring match" assuming no regex metacharacters; if there are any, it will be even worse – ysth yesterday

Sobrique ,yesterday

Good point @ysth I will add that. – Sobrique yesterday

[Nov 16, 2017] Generating a range of Unicode characters

Nov 16, 2017 | perlmonks.com

davido (Archbishop) on Nov 16, 2017 at 05:46 UTC

Re: Generating a range of Unicode characters

Check out perlop Auto-increment and Auto-decrement for an explanation.

The thing to consider here is that the .. range operator leverages the semantics provided by ++ (auto-increment). The documentation for auto-increment says this:

The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^ a-zA-Z * 0-9 *\z/ , the increment is done as a string, preserving each character within its range, with carry:

print ++($foo = "99"); # prints "100" print ++($foo = "a0"); # prints "a1" print ++($foo = "Az"); # prints "Ba" print ++($foo = "zz"); # prints "aaa" [download]

The components of the range you are trying to construct do not meet the criteria for Perl's built-in autoincrement behavior.

However, if you're using Perl 5.26 or newer, and enable unicode_strings you can use the following, as documented in perlop Range Operators .

use charnames "greek"; my @greek_small = map { chr } (ord("\N{alpha}") .. ord("\N{omega}")); [download]

Or forgo the \N{charname} lookups and just use the actual ordinal values:

my @chars = map {chr} $ord_first .. $ord_last; [download]

Dave

Your Mother (Chancellor) on Nov 16, 2017 at 06:13 UTC

Re: Generating a range of Unicode characters

Is this what you're after?

perl -CSD -le 'print chr for 0xDF .. 0x0101' [download]

Update: I hadn't read all the way down davido 's post. He is making the same suggestion already at the end.

[Nov 16, 2017] regex - Parsing a whole file in Perl

Nov 12, 2017 | stackoverflow.com

pleriche, Nov 12 at 9:52

I have an HTML file containing a 2-column table which I want to parse in order to extract pairs of strings representing the columns. The page layout of the HTML (white space, new lines) is arbitrary, hence I can't parse the file line by line.

I recall that you can parse such a thing by slurping the whole file into a string and operating on the entire string, which I'm finding a bit more challenging. I'm trying things like the following:

#!/usr/bin/perl

open(FILE, "Glossary") || die "Couldn't open file\n";
@lines = <FILE>;
close(FILE);

$data = join(' ', @lines);

while ($data =~ /<tr>.*(<td>.*<\/td>).*(<td>.*<\/td>).*<\/tr>/g) {
    print $1, ":", $2, "\n";
}

which gives a null output. Here's a section of the input file:

<table class="wikitable">
    <tr>
        <td><b>Term</b>
        </td>
        <td><b>Meaning</b>
        </td></tr>
    <tr>
        <td><span id="0-Day">0-Day</span>
        </td>
        <td>
        <p>See <a href="#Zero_Day">Zero Day</a>.
        </p>
        </td>

Can someone help me out?

Borodin ,Nov 12 at 21:20

Use HTML::TableExtractBorodin Nov 12 at 21:20

zdim ,Nov 12 at 21:46

To correct my early comment (removed), while I recommend HTML::TreeBuilder for general parsing of HTML (and there are others), here you indeed want HTML::TableExtract . And you do not want to use regex – zdim Nov 12 at 21:46

Dave Cross ,2 days ago

You can't parse HTML with a regexDave Cross 2 days ago

Miguel Prz ,Nov 12 at 10:03

There is a HTML::TableExtract module in CPAN, which simplifies the problem you are trying to solve:
use strict;
use warnings;
use HTML::TableExtract qw(tree);

my $te = HTML::TableExtract->new( headers => qw(Term Meaning) );
my $html_file = "Glossary";
$te->parse_file($html_file);
my $table = $te->first_table_found;
# ...

pleriche ,yesterday

Thank you and I'm sure TableExtract is the better way of doing it, but the object of my question was to improve my understanding of how to use regular expressions since they're so central to Perl. Adding gs to the regexpr as someone suggested (since deleted) was the leg-up I needed. – pleriche yesterday

Miguel Prz ,yesterday

I see your point, and it's really important build a solid knowlegment on regexpr. But, like other people have said, it's not a goot idea apply regexpr to parsing html documents – Miguel Prz yesterday

,

You already have answers explaining why you shouldn't parse HTML with regexes. And you really shouldn't. But you've asked for an explanation of why your code doesn't work. So here goes...

You have two problems in your code. One stops it working and the other stops it working as you expect.

Firstly, you are using . in your regex to match any character. But . doesn't match any character. It matches any character except a newline. And you have newlines in your string. You fix that by adding the /s option to your match operator (so it has /gs instead of /s ).

With that fix in place, you get a result from your code. Using your test data, I see:

<td><b>Term</b>
         </td>:<td><b>Meaning</b>
         </td>

Which is correct. But looking at your test data, I wondered why I wasn't getting two results - because of the /g . I soon realised it was because your test data is missing the closing </td> . When I added that, I got this result:

<td><span id="0-Day">0-Day</span>
         </td>:<td>
         <p>See <a href="#Zero_Day">Zero Day</a>.
         </p>
         </td>

Ok. It's now finding the second result. But what has happened to the first one? That's the second error in your code.

You have .* a few times in your regex. That means "zero or more of any character". But it's the "or more" that is a problem here. By default, Perl regex qualifiers ( * or + ) are greedy. That means they will use up as much of the string as possible. And the first .* in your regex is eating up a lot of your string. All of it up to the second <tr> in fact.

The solution to that is to make the .* non-greedy. And you do that by adding ? to the end. So you can replace all of the .* with .*? . Having done that, I get this output:

<td><b>Term</b>
         </td>:<td><b>Meaning</b>
         </td>
<td><span id="0-Day">0-Day</span>
         </td>:<td>
         <p>See <a href="#Zero_Day">Zero Day</a>.
         </p>
         </td>

Which seems correct to me.

So, to summarise:

  1. By default, . doesn't match newlines. To do that, you need /s .
  2. Beware of greedy qualifiers.

[Nov 16, 2017] Reading/dumping a perl hash from shell

Nov 11, 2017 | stackoverflow.com

newbie ,Nov 11 at 0:27

I have a read-only perl file with a huge hash defined in it. Is there anyway for me to read this perl file and dump out the hash contents?

this is basic structure of the hash within the file.

%hash_name = {
    -files => [
         '<some_path>',
    ],
    -dirs => [
         '<some_path>',
         '<some_path>',
         '<some_path>',
         '<some_path>',
         '<some_path>',
    ],
};

Davy M ,Nov 11 at 0:30

can you not cat the file and redirect it into a one that does have write permissions? cat perl_file_name > new_perl_file_nameDavy M Nov 11 at 0:30

newbie ,Nov 11 at 0:34

yes I did consider that but will go with that approach only if there is no other way to dump the hash without creating a new file. – newbie Nov 11 at 0:34

zdim ,Nov 11 at 2:59

@newbie Thank you, and to repeat the question: Does this file have other Perl code or just this hash? Also, is the hash undeclared (just %hash_name ), as you show it, or is it "lexical," so with my such as: my %hash_name ? – zdim Nov 11 at 2:59

zdim ,Nov 11 at 3:12

@newbie What you show is invalid in Perl: the % in %hash_name indicates that the variable is a hash , but { .. } form a hash reference , which is a scalar variable (not a hash). So it should be either %hash_name = ( .. ) or it's $hashref_name = { .. }zdim Nov 11 at 3:12

Schwern ,Nov 11 at 6:59

Note this is an insecure way to store data. The data file must be evaluated as perl code. Any arbitrary code could be in the file. In addition, the data file can only be read by Perl programs. Instead, use JSON or similar data format. JSON::MaybeXS can convert between JSON and Perl. – Schwern Nov 11 at 6:59

zdim ,Nov 11 at 3:40

Ideally you'd copy the file so that you can edit it, then turn it into a module so to use it nicely.

But if for some reason this isn't feasible here are your options.

If that hash is the only thing in the file , "load" it using do and assign to a hash

use warnings;
use strict;

my $file = './read_this.pl';  # the file has *only* that one hash

my %hash = do $file;

This form of do executes the file (runs it as a script), returning the last expression that is evaluated. With only the hash in the file that last expression is the hash definition, precisely what you need.

If the hash is undeclared , so a global variable (or declared with our ), then declare as our a hash with the same name in your program and again load the file with do

our %hash_name;  # same name as in the file
do $file;        # file has "%hash" or "our %hash" (not "my %hash")

Here we "pick up" the hash that is evaluated as do runs the file by virtues of our

If the hash is "lexical" , declared as my %hash (as it should be!) ... well, this is bad. Then you need to parse the text of the file so to extract lines with the hash. This is in general very hard to do, as it amounts to parsing Perl. (A hash can be built using map , returned from a sub as a reference or a flat list ...) Once that is done you eval the variable which contains the text defining that hash.

However, if you know how the hash is built, as you imply, with no () anywhere inside

use warnings; 
use strict;

my $file = './read_this.pl';

my $content = do {  # "slurp" the file -- read it into a variable
    local $/;
    open my $fh, '<', $file or die "Can't open $file: $!";
    <$fh>;
};

my ($hash_text) = $content =~ /\%hash_name\s*=\s*(\(.*?\)/s;
my %hash = eval $hash_text;

This simple shot leaves out a lot, assuming squarely that the hash is as shown. Also note that this form of eval carries real and serious security risks.


Files are also loaded using require . Apart from it doing a lot more than do , the important thing here is that even if it runs multiple times require still loads that file only once . This matters for modules in the first place, which shouldn't be loaded multiple times, and use indeed uses require .

On the other hand, do does it every time, what makes it suitable for loading files to be used as data, which presumably should be read every time. This is the recommended method. Note that require itself uses do to actually load the file.

Thanks to Schwern for a comment.

Schwern ,Nov 11 at 4:31

do will always load the file. require will only load it once. Since you want to get data from the file, it's recommended to use do . Else the second or third time anything in that process loads the file they'll end up with 1 . – Schwern Nov 11 at 4:31

zdim ,Nov 11 at 4:41

@Schwern Right, thank you for the comment. I wanted to avoid excessive explanation thus I simply use do . (I still mention require since it is feasible that the data is loaded once.) But it is good to state this, thank you -- I am adding the comment. – zdim Nov 11 at 4:41

Schwern ,Nov 11 at 4:57

It's bad practice to use require because a future person maintaining the code may also require the same file elsewhere (not even in the same code file, it's per process) and not realize it has already been required. I'd suggest instead explaining why do is the right thing to do here instead of require , it's a necessary complexity. – Schwern Nov 11 at 4:57

zdim ,Nov 11 at 5:40

@Schwern A good point, thank you. Adjusted the post. – zdim Nov 11 at 5:40

zdim ,Nov 13 at 2:15

@DavyM Thank you for kind words. You are right, and there are so many such questions that it even seems more common (than otherwise) when people start out. All that we can do is to keep pointing it out, and write it always in full code examples. – zdim Nov 13 at 2:15

[Nov 16, 2017] perl perlpacktut not making sense for me - Stack Overflow

Nov 13, 2017 | stackoverflow.com

brian d foy ,Nov 13 at 2:34

The pack function puts one or more things together in a single string. It represents things as octets (bytes) in a way that it can unpack reliably in some other program. That program might be far away (like, the distance to Mars far away). It doesn't matter if it starts as something human readable or not. That's not the point.

Consider some task where you have a numeric ID that's up to about 65,000 and a string that might be up to six characters.

print pack 'S A6', 137, $ARGV[0];

It's easier to see what this is doing if you run it through a hex dumper as you run it:

$ perl pack.pl Snoopy | hexdump -C
00000000  89 00 53 6e 6f 6f 70 79                           |..Snoopy|

The first column counts the position in the output so ignore that. Then the first two octets represent the S (short, 'word', whatever, but two octets) format. I gave it the number 137 and it stored that as 0x8900. Then it stored 'Snoopy' in the next six octets.

Now try it with a shorter name:

$ perl test.pl Linus | hexdump -C
00000000  89 00 4c 69 6e 75 73 20                           |..Linus |

Now there's a space character at the end (0x20). The packed data still has six octets. Try it with a longer name:

$ perl test.pl 'Peppermint Patty' | hexdump -C
00000000  89 00 50 65 70 70 65 72                           |..Pepper|

Now it truncates the string to fit the six available spaces.

Consider the case where you immediately send this through a socket or some other way of communicating with something else. The thing on the other side knows it's going to get eight octets. It also knows that the first two will be the short and the next six will be the name. Suppose the other side stored that it $tidy_little_package . It gets the separate values by unpacking them:

my( $id, $name ) = unpack 'S A6', $tidy_little_package;

That's the idea. You can represent many values of different types in a binary format that's completely reversible. You send that packed string wherever it needs to be used.

I have many more examples of pack in Learning Perl and Programming Perl .

[Nov 16, 2017] perl - Grep Two Dimensional Array - Stack Overflow

Nov 16, 2017 | stackoverflow.com

Grep Two Dimensional Array Ask Question up vote down vote favorite

Taranasaur ,yesterday

Since this is not a question directly covered here, thought best I ask and answer it.

I had an issue where I wanted to add a node name to a list only if the same node doesn't already exist. The array was built using:

push (@fin_nodes, [$node, $hindex, $e->{$hip}->{FREQ}]);

So given when given array (@fin_nodes) that looks like:

$VAR1 = [
      'first-node',
      '4',
      3
    ];
$VAR2 = [
      'second-node',
      '1',
      3
    ];
$VAR3 = [
      'another-node',
      '1',
      5
    ];
$VAR4 = [
      'some-node',
      '0',
      5
    ];

To do a grep on this the following works:

my @match = grep { grep { $_ =~ $node } @$_ } @fin_nodes;

So given a $node "second-node" the above statement will return @match as:

$VAR1 = [
  'second-node',
  '1',
  3
];

Sobrique ,yesterday

Why not use a hash instead? – Sobrique yesterday

ysth ,yesterday

when dumping an array, do Data::Dumper::Dumper(\@array), not ...(@array). if passed a list, Dumper dumps each element individually, which is not what you want here – ysth yesterday

,

I would say "don't" and instead:
my %fin_nodes;
$fin_nodes{$node} = [$hindex, $e->{$hip}->{FREQ}]);

And then you can simply if ($fin_nodes{$node}) {

Failing that though - you don't need to grep every element, as your node name is always first.

So:

 my @matches = grep { $_ -> [0] eq $node } @fin_nodes;

eq is probably a better choice than =~ here, because the latter will substring match. (And worse, can potentially do some quite unexpected things if you've metacharacters in there, since you're not quoting or escaping them)

E.g. in your example - if you look for a node called "node" you'll get multiple hits.

Note - if you're only looking for one match, you can do something like:

my ( $first_match ) =  grep { $_ -> [0] eq $node } @fin_nodes;

This will just get you the first result, and the rest will be discarded. (Which isn't too efficient, because grep will continue to iterate the whole list).

Taranasaur ,yesterday

Your last statement was on point, I only needed one match. Then before pushing a node onto fin_nodes this was enough: "if (!$first_match)" – Taranasaur yesterday

Borodin ,yesterday

@Taranasaur: I think you missed the point of Sobrique's answer. A hash is by far the better choice for this, and you can simply write $fin_nodes{$node} //= [ $hindex, $e->{$hip}{FREQ} ] and avoid the need for any explicit test altogether. – Borodin yesterday

Taranasaur ,yesterday

@Borodin, no I do get Sobrique's point. The fin_nodes array is being used for a simple list function that another method is already using quite happily in my program. I will at some point go back and create a hash as there might be more attributes I'll need to include in that array/hash – Taranasaur yesterday

ysth ,yesterday

"because the latter will substring match" assuming no regex metacharacters; if there are any, it will be even worse – ysth yesterday

Sobrique ,yesterday

Good point @ysth I will add that. – Sobrique yesterday

[Nov 16, 2017] Namespaces and modules

Nov 16, 2017 | perlmonks.com

on Feb 09, 2015 at 13:21 UTC ( # 1116049 = perlquestion : print w/replies , xml ) Need Help?? kzwix has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, Ô wise monks !

I come to you because of a mystery I'd like to unravel: The module import code doesn't work as I expected. So, as I'm thinking that it probably is a problem with my chair-keyboard interface, rather than with the language, I need your help.

So, there are these modules I have, the first one goes like this:

use utf8; use Date::Manip; use LogsMarcoPolo; package LibOutils; BEGIN { require Exporter; # set the version for version checking our $VERSION = 1.00; # Inherit from Exporter to export functions and variables our @ISA = qw(Exporter); # Functions and variables which are exported by default our @EXPORT = qw(getDateDuJour getHeureActuelle getInfosSemaine ge tTailleRepertoire getInfosPartition getHashInfosContenuRepertoire dor mir); # Functions and variables which can be optionally exported our @EXPORT_OK = qw(); } # Under this line are definitions of local variables, and the subs. [download]

I also have another module, which goes like that:

use utf8; use strict; use warnings; use Cwd; # Module "CORE" use Encode; use LibOutils qw(getHeureActuelle); package LogsMarcoPolo; BEGIN { require Exporter; # set the version for version checking our $VERSION = 1.00; # Inherit from Exporter to export functions and variables our @ISA = qw(Exporter); # Functions and variables which are exported by default our @EXPORT = qw(setNomProgramme ouvreFichierPourLog assigneFluxPo urLog pushFlux popFlux init printAndLog); # Functions and variables which can be optionally exported our @EXPORT_OK = qw(); } # Here are other definitions of variables and subs, which I removed fo r the sake of clarity sub init { my ($nomDuProgramme, $pathLogGeneral, $pathLogErreurs) = @_; my $date = LibOutils::getDateDuJour(); # La date de l'appel à init() my $time = LibOutils::getHeureActuelle(); # L'heure de l'appel à init() $nomProgramme = $nomDuProgramme; # Ouverture du flux pour STDOUT: my $stdout = assigneFluxPourLog(*STDOUT); # On l'ajoute à la liste de flux 'OUT': pushFlux('OUT', $stdout); # Ouverture du flux pour STDERR: my $stderr = assigneFluxPourLog(*STDERR); # On l'ajoute à la liste de flux 'ERR', et à la liste 'DUO': pushFlux('ERR', $stderr); pushFlux('DUO', $stderr); if (defined $pathLogGeneral) { my $plg = $pathLogGeneral; $plg =~ s/<DATE>/$date/g; $plg =~ s/<TIME>/$time/g; my $logG = ouvreFichierPourLog($plg); pushFlux('OUT', $logG); pushFlux('DUO', $logG); } if (defined $pathLogErreurs) { my $ple = $pathLogErreurs; $ple =~ s/<DATE>/$date/g; $ple =~ s/<TIME>/$time/g; my $logE = ouvreFichierPourLog($ple); pushFlux('ERR', $logE); pushFlux('DUO', $logE); } } [download]

Now, look at the second module: When, in the "init" sub, I call the getDateDuJour() and getHeureActuelle() functions with an explicit namespace, it works fine.

If I remove the prefix, it doesn't work, even for the function whose name I put in the "qw(...)" chain after the use.

Would a fellow monk know why ?

choroba (Bishop) on Feb 09, 2015 at 13:24 UTC

Re: Namespaces and modules

By putting package after the use clauses, you are importing all the functions to the "main" namespace, not into your package's namespace. Moving the package declaration up should help. لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

kzwix (Sexton) on Feb 09, 2015 at 13:34 UTC

Re^2: Namespaces and modules


by kzwix (Sexton) on Feb 09, 2015 at 13:34 UTC

I wonder, could it have something to do with loop-including ?

I mean, package "LibOutils" uses "LogsMarcoPolo" (for its logging system), but "LogsMarcoPolo" uses "LibOutils" for its dates and times.

Could that circular include be the origin of this bug ?

Anonymous Monk on Feb 09, 2015 at 14:18 UTC

Re^3: Namespaces and modules
by Anonymous Monk on Feb 09, 2015 at 14:18 UTC
I wonder, could it have something to do with loop-including ?

Circular dependencies don't automatically cause a problem, it also depends on what the module does in its body (which you haven't shown). If you think there is a problem, a short piece of example code that reproduces the problem would help, see http://sscce.org/

But first, did you try what choroba suggested ?

kzwix (Sexton) on Feb 09, 2015 at 15:04 UTC

Re^4: Namespaces and modules
by kzwix (Sexton) on Feb 09, 2015 at 15:04 UTC

Corion (Pope) on Feb 09, 2015 at 15:11 UTC

Re^5: Namespaces and modules
by Corion (Pope) on Feb 09, 2015 at 15:11 UTC

Anonymous Monk on Feb 09, 2015 at 15:59 UTC

Re^5: Namespaces and modules
by Anonymous Monk on Feb 09, 2015 at 15:59 UTC

Anonymous Monk on Feb 09, 2015 at 14:11 UTC

Re: Namespaces and modules
doesn't work as I expected ... it works fine ... it doesn't work

What are the exact error messages? What is the expected behavior vs. the behavior you're getting? See How do I post a question effectively?

Replies are listed 'Best First'.

[Nov 16, 2017] perl - Passing an inner array to a function - Stack Overflow

Nov 16, 2017 | stackoverflow.com

,

There are no arrays in your code. And there are no method calls in your code.

Your hash is defined incorrectly. You cannot embed hashes inside other hashes. You need to use hash references. Like this:

my %data = (
    'a' => {
        x => 'Hello',
        y => 'World'
    },
    'b' => {
        x => 'Foo',
        y => 'Bar'
    }
);

Note, I'm using { ... } to define your inner hashes, not ( ... ) .

That still gives us an error though.

Type of arg 1 to main::p must be hash (not hash element) at passhash line 20, near "})"

If that's unclear, we can always try adding use diagnostics to get more details of the error:

(F) This function requires the argument in that position to be of a certain type. Arrays must be @NAME or @{EXPR}. Hashes must be %NAME or %{EXPR}. No implicit dereferencing is allowed--use the {EXPR} forms as an explicit dereference. See perlref.

Parameter type definitions come from prototypes. Your prototype is \% . People often think that means a hash reference. It doesn't. It means, "give me a real hash in this position and I'll take a reference to it and pass that reference to the subroutine".

(See, this is why people say that prototypes shouldn't be used in Perl - they often don't do what you think they do.)

You're not passing a hash. You're passing a hash reference. You can fix it by dereferencing the hash in the subroutine call.

p(%{$data{a}});

But that's a really silly idea. Take a hash reference and turn it into a hash, so that Perl can take its reference to pass it into a subroutine.

What you really want to do is to change the prototype to just $ so the subroutine accepts a hash reference. You can then check that you have a hash reference using ref .

But that's still overkill. People advise against using Perl prototypes for very good reasons. Just remove it

> ,

Your definition of the structure is wrong. Inner hashes need to use {} , not () .
my %data = (
    a => {
        x => 'Hello',
        y => 'World'
    },
    b => {
        x => 'Foo',
        y => 'Bar'
    }
);

Also, to get a single hash element, use $data{'a'} (or even $data{a} ), not %data{'a'} .

Moreover, see Why are Perl 5's function prototypes bad? on why not to use prototypes. After correcting the syntax as above, the code works even without the prototype. If you really need the prototype, use % , not \% . But you clearly don't know exactly what purpose prototypes serve, so don't use them.

[Nov 16, 2017] perl get reference to temp list returned by function without making a copy - Stack Overflow

Nov 16, 2017 | stackoverflow.com

newguy, 2 days ago

I have a function in perl that returns a list. It is my understanding that when foo() is assigned to list a copy is made:
sub foo() { return `ping 127.0.0.1` }

my @list = foo();

That @list then needs to be transferred to another list like @oldlist = @list; and another copy is made. So I was thinking can I just make a reference from the returned list like my $listref = \foo(); and then I can assign that reference, but that doesn't work.

The function I'm working with runs a command that returns a pretty big list (the ping command is just for example purposes) and I have call it often so I want to minimize the copies if possible. what is a good way to deal with that?

zdim ,2 days ago

Make an anonymous array reference of the list that is returned
my $listref = [ foo() ];

But, can you not return an arrayref to start with? That is better in general, too.


What you attempted "takes a reference of a list" ... what one cannot do in the literal sense; lists are "elusive" things , while a reference can be taken

By using the backslash operator on a variable, subroutine, or value.

and a "list" isn't either (with a subroutine we need syntax \&sub_name )

However, with the \ operator a reference is taken, either to all elements of the list if in list context

my @ref_of_LIST = \( 1,2,3 );  #-->  @ref_of_LIST: (\1, \2, \3)

or to a scalar if in scalar context, which is what happens in your attempt. Since your sub returns a list of values, they are evaluated by the comma operator and discarded, one by one, until the last one. The reference is then taken of that scalar

my $ref_of_LIST = \( 1,2,3 );  #--> $ref_of_LIST: \3

As it happens, all this applies without parens as well, with \foo() .

newguy ,2 days ago

I don't know how to return an array ref from a command that returns a list. Would it be acceptable to do it as return [`ping 1.2.3.4`];newguy 2 days ago

zdim ,2 days ago

@newguy Yes, that would be a fine way to do it. Another is to store the command's return in an array variable (say, @ary ) -- if you need it elsewhere in the sub -- and then return \@ary;zdim 2 days ago

newguy ,2 days ago

Ok thanks. Wouldn't the @ary way create a copy though – newguy 2 days ago

zdim ,2 days ago

@newguy For one, those elements must be stored somewhere, either anonymously by [ .. ] or associated with a named variable by @ary = .. . I don't know whether yet an extra copy is made in order to construct an array, but I'd expect that it isn't When you return \@ary no new copies are made. I would expect that they are about the same. – zdim 2 days ago

zdim ,2 days ago

@newguy I added an explanation of what happens with \foo()zdim 2 days ago

[Nov 16, 2017] Perl captured digits from string are always 1

Nov 16, 2017 | stackoverflow.com

The match operator in scalar context evaluates to a boolean that indicates whether the match succeeded or not.

my $success = $user =~ /(\d+)/;

The match operator in list context returns the captured strings (or 1 if there are no captures) on success and an empty list on error.

my ($num) = $user =~ /(\d+)/;

You used the former, but you want the latter. That gives you the following (after a few other small fixes):

sub next_level {
    my ($user) = @_;
    my ($num) = $user =~ /(\d+)\z/;
    $user =~ s/\d+\z//g;
    $user .= ++$num;
    return $user;
}

But that approach is complicated and inefficient. Simpler solution:

sub next_level {
    my ($user) = @_;
    $user =~ s/(\d+)\z/ $1 + 1 /e;
    return $user;
}

[Nov 16, 2017] regex - Use of uninitialized value $a in concatenation (.) or string - Stack Overflow

Nov 16, 2017 | stackoverflow.com

sampath, yesterday

I am trying to remove the old files in a dir if the count is more than 3 over SSH

Kindly suggest how to resolve the issue.

Please refer the code snippet

#!/usr/bin/perl
use strict;
use warnings;

my $HOME="/opt/app/latest";
my $LIBS="${HOME}/libs";
my $LIBS_BACKUP_DIR="${HOME}/libs_backups";
my $a;
my $b;
my $c;
my $d;

my $command =qq(sudo /bin/su - jenkins -c "ssh username\@server 'my $a=ls ${LIBS_BACKUP_DIR} | wc -l;my $b=`$a`;if ($b > 3); { print " Found More than 3 back up files , removing older files..";my $c=ls -tr ${LIBS_BACKUP_DIR} | head -1;my $d=`$c`;print "Old file name $d";}else { print "No of back up files are less then 3 .";} '");

print "$command\n";
system($command);

output:

sudo /bin/su - jenkins -c "ssh username@server 'my ; =ls /opt/app/latest/libs_backups | wc -l;my ; =``;if ( > 3); { print " Found More than 3 back up files , removing older files..";my ; =ls -tr /opt/app/latest/libs_backups | head -1;my ; =``;print "Old file name ";}else { print "No of back up files are less then 3 .";} '" Found: -c: line 0: unexpected EOF while looking for matching `'' Found: -c: line 1: syntax error: unexpected end of file

janh ,yesterday

Are you trying to execute parts of your local perl script in an ssh session on a remote server? That will not work. – janh yesterday

simbabque ,yesterday

Look into Object::Remote. Here is a good talk by the author from the German Perl Workshop 2014. It will essentially let you write Perl code locally, and execute it completely on a remote machine. It doesn't even matter what Perl version you have there. – simbabque yesterday

simbabque ,yesterday

You should also not use $a and $b . They are reserved global variables for sort . – simbabque yesterday

Chris Turner ,yesterday

Why are you sudoing when your command is running on an entirely different server? – Chris Turner yesterday

shawnhcorey ,yesterday

Never put sudo or su in a script. This is security breach. Instead run the script as sudo or su . – shawnhcorey yesterday
If you have three levels of escaping, you're bound to get it wrong if you do it manually. Use String::ShellQuote's shell_quote instead.

Furthermore, avoid generating code. You're bound to get it wrong! Pass the necessary information using arguments, the environment or some other channel of communication instead.

There were numerous errors in the interior Perl script on top of the fact that you tried to execute a Perl script without actually invoking perl !

#!/usr/bin/perl

use strict;
use warnings;

use String::ShellQuote qw( shell_quote );

my $HOME = "/opt/app/latest";
my $LIBS = "$HOME/libs";
my $LIBS_BACKUP_DIR = "$HOME/libs_backups";

my $perl_script = <<'__EOI__';
   use strict;
   use warnings;

   use String::ShellQuote qw( shell_quote );

   my ($LIBS_BACKUP_DIR) = @ARGV;

   my $cmd = shell_quote("ls", "-tr", "--", $LIBS_BACKUP_DIR);
   chomp( my @files =  `$cmd` );
   if (@files > 3) {
      print "Found more than 3 back up files. Removing older files...\n";
      print "$_\n" for @files;
   } else {
      print "Found three or fewer backup files.\n";
   }
__EOI__

my $remote_cmd = shell_quote("perl", "-e", $perl_script, "--", $LIBS_BACKUP_DIR);
my $ssh_cmd = shell_quote("ssh", 'username@server', "--", $remote_cmd);
my $local_cmd = shell_quote("sudo", "su", "-c", $ssh_ccmd);
system($local_cmd);

[Nov 15, 2017] converter (Priest)

on Jul 12, 2006 at 05:21 UTC ( # 560614 = perlquestion : print w/replies , xml ) Need Help?? converter has asked for the wisdom of the Perl Monks concerning the following question:

For the past several months I've been busy rewriting the horrible perl code left behind by my predecessor. His approach to development was "Write some code. If the code runs without revealing any of the damage it's done, ship it. If not, write some more code." This code is so bad that when co-workers ask me what I'm working on, I tell them "The Madman's Diary." Yes, it would have been cheaper and faster to throw this code away and start over, but I wasn't given that option.

My latest assignment is the repair of a tangled mess of a show-stopper that was discovered in a product that was supposed to ship today. After adding an open() override that logs the arguments to open() and some quality time with the watch(1) utility observing changes to the files containing the data that are causing the problem, I've narrowed the list of suspects down to a couple in-house scripts and a few (probably altered) webmin modules.

Now that I know where to look, I'd like to identify as quickly as possible which details can be safely ignored. I plan to use Devel::DProf to produce an execution graph for reference and Tie::Watch to watch variables, but I wonder if there are other tools that I should look at. A utility or module that would allow me to incrementally build a profile with persistent notes would be wonderful.

Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.

converter

eyepopslikeamosquito (Chancellor) on Jul 12, 2006 at 08:30 UTC

Re: Strategies for maintenance of horrible code?
Yes, it would have been cheaper and faster to throw this code away and start over
Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch .

I agree with adrianh . If a component is not broken, don't rewrite it. Rewrite a component when you find a number of bugs in it. But first write a regression test suite for the component. I've seen many folks over the years throw out old code, rewrite it ... and introduce a heap of new bugs in the process. If you come into a new company and introduce a swag of new bugs in previously working code, you will start to smell very badly.

See also:

GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC

Re^2: Strategies for maintenance of horrible code?


by GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC

Actually, just writing the tests is often a damn fine way of finding bugs. No exactly what OP is after at the moment, but something that is at the forefront of my mind because I'm in the middle of writing a set of tests (in Perl :) for some XML processing C++ code and turning up a pile of bugs as I go.

However it does suggest another test avenue: write test harnesses for modules so that you can exercise them in isolation and better understand how they work. If the test harness ends up part of a regression test system so much the better.


DWIM is Perl's answer to Gödel

tinita (Parson) on Jul 12, 2006 at 12:28 UTC

Re^2: Strategies for maintenance of horrible code?


by tinita (Parson) on Jul 12, 2006 at 12:28 UTC

Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch.
uh oh. why does this remind me of perl 6? =)

adrianh (Chancellor) on Jul 12, 2006 at 07:49 UTC

Re: Strategies for maintenance of horrible code?
Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.

I'd recommend reading Perl Medic and Working Effectively with Legacy Code (the latter isn't Perl specific - but is chock full of useful advice).

I would not spend any time fixing the code if it's not breaking (assuming you're not being paid to review/fix the code). However evil it may be - if it's doing it's job leave it alone.

Instead - every time you need to fix a bug or add some new functionality just test/refactor the bits of the evil code that are touched by the changes. I've found incrementally adding tests and refactoring to be much more effective than any sort of "big bang" fixing things for the sake of them approach :-)

If you are being paid to do a review/fix then Perl::Critic might give you some useful places to look.

webfiend (Vicar) on Jul 14, 2006 at 21:24 UTC

Re^2: Strategies for maintenance of horrible code?


by webfiend (Vicar) on Jul 14, 2006 at 21:24 UTC

Definitely agree about the approach of sorting things out with gradual refactoring and tests as the need arises. The problem with the "Big Bang" approach is that you have the potential for a very long stretch of time where there are two forks of the code: ugly shipping code that will need to be fixed and refactored as bugs are reported, and pretty nonfunctioning code that will need to incorporate those fixes as they are uncovered, resulting in a perpetual loop of "it's not quite ready yet."

Ovid (Cardinal) on Jul 12, 2006 at 10:37 UTC

Re: Strategies for maintenance of horrible code?

Check out Suggestions for working with poor code and some of the replies.

Cheers,
Ovid

New address of my CGI Course .

GrandFather (Sage) on Jul 12, 2006 at 07:39 UTC

Re: Strategies for maintenance of horrible code?

What tools are you using already and on what platform? For a large range of "detail" debugging there is nothing like as good as an IDE with a good integrated debugger. For a higher level view of where things are going Devel::TraceCalls may be handy, although it's output can be rather voluminous.


DWIM is Perl's answer to Gödel

Moron (Curate) on Jul 12, 2006 at 12:13 UTC

Re: Strategies for maintenance of horrible code?

Some basic CYA I can see:

1) Ensure there is sufficient functional and technical design documentation against which the routines can be tested.

2) (updated) Make sure there is a sufficiently detailed project plan to include tasks for: systems analysis, functional and technical design, test planning, test script writing (e.g. using Expect ), developing, unit-, integrated and functional testing, rework and implementation, to include a GANTT chart of the work done so far and by who to what % of completion, to avoid getting the blame for not meeting poorly conceived targets over which you had no control.

In response to formal testing against the plan, I find it a useful aid to bug-fixing to monitor execution with perl -d, setting breakpoints and examining variables to hunt down which line of code causes each failure.

-M

Free your mind

aufflick (Deacon) on Jul 13, 2006 at 00:17 UTC

Re: Strategies for maintenance of horrible code?

You might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.

The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also the UML::Sequence sounds intriguing - it pupports to generate a sequence diagram by monitoring code execution.

Anonymous Monk on Jul 12, 2006 at 06:40 UTC

Re: Strategies for maintenance of horrible code?

Ignore nothing. Whats the nature of the problem?

Replies are listed 'Best First'.

[Nov 15, 2017] Suggestions for working with poor code

Notable quotes:
"... Still looking for time to record time usage ..."
Nov 15, 2017 | perlmonks.com

Suggestions for working with poor code by Ovid (Cardinal)

on May 10, 2001 at 01:34 UTC ( # 79261 = perlmeditation : print w/replies , xml ) Need Help??

I am currently working on adding a fair amount of functionality to a Web site whose programs have been designed very poorly. Amongst other things, taint checking and strict have not been used. Code has been thrown together without regard to side effects, massive Here docs are used to output HTML, etc. Since I am getting a fair amount of experience with these issues, I thought I would offer some of my observations for fellow monks. Some of these pertain to the existing code and concentrates on 'quick fixes'. Some pertains to new code that's added.

Quick (?) Fixes Adding new functionality

Any and all tips that others wish to add are welcome!

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the the link and check out our stats.

dws (Chancellor) on May 10, 2001 at 01:51 UTC

Re: Suggestions for working with poor code

Bad formatting can hide a number of sins.

If necessary, the first thing I do when taking on bad code is reformat it. It doesn't matter whether it's Perl, Java, C, or HTML. A surprising number of problems (like mangled boolean conditions in branches and loops) fall right out when the code is tidied up so that you can actually see what it's doing.

Then it's a lot easier to get on with the fixes Ovid suggests.

tinman (Curate) on May 10, 2001 at 02:02 UTC

Re: Suggestions for working with poor code

I've found that taking a deep breath and a step back from the turmoil of badly written code can help immensely.. quite a few instances where you can see places where code can be consolidated into a single reusable library..

With this in mind, trying to understand the basic intent of the code is really important to me.. I write down a small note describing what each section of code tries to do... this allows me to focus on reuse as well as consolidate several segments together..

Related to this: in addition to liberal comments, updating documentation or in some cases, writing some document that describes the structure and function of a code block is very helpful to any person maintaining the code.. you don't have to wonder "what was that guy thinking" or "why did he do *that* ?".. its all there in a document.. and also provides a cursory overview of what has been going on without jumping straight into the code (I'm a big fan of the saying that goes "the less time you spend planning, the more time you spend coding" )...
Caveat: Docs that aren't updated are worse than useless, though...

clemburg (Curate) on May 10, 2001 at 12:20 UTC

Re: Suggestions for working with poor code

Track how long it takes you to fix bugs.

I agree enthusiastically. It will be your only argument when somebody comes and asks you where all the hours have gone. For this kind of job (take responsibility for badly written code, fixing bugs, etc.) this is an absolute must.

For these purposes, two little forms (or spreadsheets, or editor modes/templates, or whatever) will be very helpful (pedantically detailed discussion of these can be found in An Introduction to the Personal Software Process , electronic materials are available at The PSP Resource Page , including time tracking tools, emacs modes, forms, etc.):

  • Time recording log
  • Defect recording log

These are the essentials of both (header columns, add date, person, project, client, etc. as you need):

Time recording log:

  • Start Time
  • Stop Time
  • Interruption Time
  • Delta Time
  • Activity Category (coding, testing, reading docs - make up your own)
  • Comments (more detailed description of task)

Defect recording log:

  • Defect ID (e.g., sequential number)
  • Type (one of: documentation, syntax, build/package, assignment, interface, checking, data, function, system, environment - your own are welcome)
  • Inject Phase (when was the defect put into the program - estimate - design, coding, testing, linking, etc.)
  • Remove Phase (when was the defect found - compile time, testing, etc.)
  • Fix Time (how long did it take to fix)
  • Description (description of defect)

Contrary to what you may think, it does *not* take much time to use these forms (or similar means to record the information). But it will give you all the data you need to be sure you did the Right Thing, and the confidence and evidence to convince your boss or client that what you did was worth the time and the money.

Christian Lemburg
Brainbench MVP for Perl
http://www.brainbench.com

coreolyn (Parson) on May 11, 2001 at 18:55 UTC

Re: Re: Suggestions for working with poor code


by coreolyn (Parson) on May 11, 2001 at 18:55 UTC

You mean these logs haven't been automated into CPAN module yet??

coreolyn Still looking for time to record time usage

r.joseph (Hermit) on May 10, 2001 at 04:04 UTC

Re: Suggestions for working with poor code

Wonderful post Ovid - just added to my favs list. For someone who had the great misfortune a while back of inheriting a large, ill-maintained and astrociously coded website, I know what you mean and this post really highlights some of the main points that go into fixing it.

I also have to agree heartily with the replies, although I would like to add something. I find sometimes that it actually helps, with particularily insubordinate code, to take part of it out of the main file (say, a sub) and put it into another script that has major error-checking, lots of warnings and what not, and then test it from there. Sometimes this will yield a solution very quickly, and other times it has quickly allowed me to see what was wrong and what needed to be recoded.

Just thought I'd offer a quick idea...great job again!

r.
"Violence is a last resort of the incompetent" - Salvor Hardin, Foundation by Issac AsimovW

knobunc (Pilgrim) on May 10, 2001 at 18:19 UTC

Re: Suggestions for working with poor code

Very cool node.

With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work. However, I have a make rule for todo that searches for all of the lines with TODO in them and prints them out. So a usage of a TODO:

if ($whatever) { # TODO - Finish code to take over the world } [download]

Becomes:

To Do List Dir/file.pl 132: Finish code to take over the world [download]

When run through the following (ugly, suboptimal, but working) code in Tools/todo.sh :

#/bin/sh echo 'To Do List' find . -type f | xargs grep -n TODO | perl -ne '($file, $line, $rest) = split /:/, $_, 3; $file =~ s|^./||; $rest =~ s|.*?TODO.*?[-\s:]+||; $rest =~ s|"[.;,]\s*$||; $rest =~ s|\\n||g; print "$file $line: \u$r est\n"' | sort | uniq | grep -v '.#' | grep -v Makefile | grep -v CVS [download]

Which I call from my Makefile:

todo: Tools/todo.sh [download]

Kinda ugly, but it lets me put the TODO statements where I actually need to do the work. So I can proof out a block of code by writing narrative comments with TODO at the start of the line (behind comment characters of course). Then fill in the code later and not worry about missing a piece. Also since the TODOs are where the stuff needs to be filled in, I have lots of context around the issue and don't need to write as much as I would if they were at the top of the file. Plus anyone without something to do in the group can just type make todo and add some code. Finally, it is easier to add a TODO right where you need it, than bop up to the top of the file and then have to find where you were back in the code.

-ben

Replies are listed 'Best First'.

[Nov 15, 2017] A crucial element in controlling time is controlling the amount of detail needed to gain understanding. It is easy to lose sight of the forest for the trees.

Notable quotes:
"... The Perl Monks website has 83 data tables, two main type hierarchies (nodetypes and perl classes), a core engine of about 12K and about 600 additional code units spread throughout the database. Documentation is scattered and mostly out of date. ..."
"... The initial architecture seems solid but its features have been used inconsistently over time. ..."
Nov 15, 2017 | perlmonks.com

Re^2: Swallowing an elephant in 10 easy steps
by ELISHEVA (Prior) on Aug 13, 2009 at 18:27 UTC

The time drivers are the overall quality of the design, ease of access to code and database schemas, and the size of the system: the number of database tables, the complexity of the type/class system(s), the amount of code, and the number of features in whatever subsystem you explore in step 10. Rather than an average, I'll take the most recent example, Perl Monks.

The Perl Monks website has 83 data tables, two main type hierarchies (nodetypes and perl classes), a core engine of about 12K and about 600 additional code units spread throughout the database. Documentation is scattered and mostly out of date.

The initial architecture seems solid but its features have been used inconsistently over time. Accessing the schema and code samples is slow because there is no tarball to download - it has to be done through the web interface or manually cut and pasted into files off line. The database/class assessment (1-4) took about 16 hours. Steps 5-7 took about 30 hours. Steps 8-10 took about 24 hours. All told that is 70 hours, including writing up documentation and formatting it with HTML.

However, I always like to leave myself some breathing space. If I were contracting to learn a system that size, I'd want 90 hours and an opportunity to reassess time schedules after the initial code walk through was complete. If a system is very poorly designed this process takes somewhat longer.

A crucial element in controlling time is controlling the amount of detail needed to gain understanding. It is easy to lose sight of the forest for the trees. That is why I advise stopping and moving onto the next phase once your categories give a place to most design elements and the categories work together to tell story. That is also why I recommend backtracking as needed. Sometimes we make mistakes about which details really matter and which can be temporarily blackboxed. Knowing I can backtrack lets me err on the side of black boxing.

The other element affecting time is, of course, the skill of the analyst or developer. I have the advantage that I have worked both at the coding and the architecture level of software. I doubt I could work that fast if I didn't know how to read code fluently and trace the flow of data through code. Having been exposed to many different system designs over the years also helps - architectural strategies leave telltale footprints and experience helps me pick up on those quickly.

However one can also learn these skills by doing. The more you practice scanning, categorizing and tracing through code and data the better you get at it. It will take longer, but the steps are designed to build on themselves and are, in a way, self-teaching. That is why you can't just do the 10 steps in parallel as jdporter jokingly suggests below.

However some theoretical context and a naturally open mind definitely helps: if you think that database tables should always have a one-to-one relationship with classes you will be very very confused by a system where that isn't true. If I had to delegate this work to someone else I probably would work up a set of reading materials on different design strategies that have been used in the past 30 years. Alternatively or in addition, I might pair an analyst with a programmer so that they could learn from each other (with neither having priority!)

Best, beth

Update: expanded description of the PerlMonks system so that it addresses all of the time drivers mentioned in the first paragaph.

Update: fixed miscalculation of time

[Nov 15, 2017] Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph-creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.

Nov 15, 2017 | perlmonks.com

dave0 (Friar) on Apr 15, 2005 at 15:32 UTC

Re: Analyzing large Perl code base.

Having recently done this on a fairly large codebase that grew organically (no design, no refactoring) over the course of four years, I feel your pain.

Writing a testsuite, on any level, is nearly essential for this. If you're rewriting an existing module, you'll need to ensure it's compatible with the old one, and the only sane way to do that is to test. If the old code is monolithic, it might be difficult to test individual units, but don't let that stop you from testing at a higher level.

B::Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph-creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.

Pretty much the same tactic was used on the actual code. Print it out, annotate it away from the computer, and then sit down with the notes to implement the refactoring. If your codebase is huge (mine was about 4-5k lines in several .pl and .pm files, and was still manageable) you might not want to do this, though.

[Nov 15, 2017] Generating documentation from Perl code

Nov 15, 2017 | perlmonks.com

Re: Strategies for maintenance of horrible code?
by aufflick (Deacon) on Jul 13, 2006 at 00:17 UTC

You might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.

The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also the UML::Sequence sounds intriguing - it pupports to generate a sequence diagram by monitoring code execution.

[Nov 15, 2017] Generating documentation from Perl code (not just POD)

Nov 15, 2017 | perlmonks.com

Generating documentation from Perl code (not just POD) by aufflick (Deacon)

on Jul 11, 2006 at 05:15 UTC ( # 560312 = perlquestion : print w/replies , xml ) Need Help?? aufflick has asked for the wisdom of the Perl Monks concerning the following question:

Ideally a script/module would do that, and also interleave the POD from the file, so any POD directly before the method/sub would be linked to it. Any method/sub without POD would at least be documented by it's name.

Of course a major limitation is that (for OO Perl at least), we have no idea what the method arguments are, simply from robotically inspecting the code. Something I always liked in OpenACS is the way that they replace the builtin Tcl proc keyword with a custom ad_proc keyword that works just the same as proc but which takes an optional documentation block that accepts javadoc-like keyword embedding and also a block detailing any arguments and their default values. Because of the tight coupling, the generated documentation is very rich for little developer effort.

Does anyone know of attempts at this sort of thing in Perl, or have any good ideas to offer? Ideally I want to come up with something that will work with existing Perl code, and that any extensions won't break normal Perl compilation (no literate programming preprocessors need apply).

/Mark

planetscape (Chancellor) on Jul 11, 2006 at 06:27 UTC

Re: Generating documentation from Perl code (not just POD)

These links should get you started:


DoxyFilt ( Doxygen for Perl) offsite
Analyzing large Perl code base.
Becoming familiar with a too-big codebase?

HTH,

aufflick (Deacon) on Jul 13, 2006 at 00:34 UTC

Re^2: Generating documentation from Perl code (not just POD)


by aufflick (Deacon) on Jul 13, 2006 at 00:34 UTC

Wow - Doxygen + Doxyfilt is *exactly* what I was looking for - fantastic!

planetscape (Chancellor) on Jul 13, 2006 at 00:39 UTC

Re^3: Generating documentation from Perl code (not just POD)
by planetscape (Chancellor) on Jul 13, 2006 at 00:39 UTC

Glad to hear it! :-D

If you have any questions about configuring Doxyfile to run Doxygen / DoxyFilt under Cygwin , for instance, please /msg me .

HTH,

aufflick (Deacon) on Jul 13, 2006 at 02:12 UTC

Re^4: Generating documentation from Perl code (not just POD)
by aufflick (Deacon) on Jul 13, 2006 at 02:12 UTC

BrowserUk (Pope) on Jul 13, 2006 at 02:31 UTC

Re^3: Generating documentation from Perl code (not just POD)
by BrowserUk (Pope) on Jul 13, 2006 at 02:31 UTC
Wow - Doxygen + Doxyfilt is *exactly* what I was looking for - fantastic!

Now, please someone with influence sell the P6 guys on Doxygen. Let's have it built into the language and allow the terminally-ill POD slip away peacefully.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.

philcrow (Priest) on Jul 11, 2006 at 13:41 UTC

Re: Generating documentation from Perl code (not just POD)

I'm interested in this area. Once, in the past, I wrote UML::Sequence which runs perl programs to produce sequence diagrams of what they actually do. This leads me to think that you could write a special driver using the debugger hooks to load the modules from your app, then dump out their inheritence relationships, etc. (based on what is loaded and what those modules have in their @ISA and symbol tables). Maybe that could be incorporated with some good POD parsing, but I'm just rambling now.

Phil

Replies are listed 'Best First'.

[Nov 15, 2017] With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work.

Nov 15, 2017 | perlmonks.com

knobunc (Pilgrim) on May 10, 2001 at 18:19 UTC

Re: Suggestions for working with poor code

Very cool node.

With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work. However, I have a make rule for todo that searches for all of the lines with TODO in them and prints them out. So a usage of a TODO:

if ($whatever) {
    # TODO - Finish code to take over the world
}
[download]

Becomes:

To Do List
Dir/file.pl 132: Finish code to take over the world
[download]

When run through the following (ugly, suboptimal, but working) code in Tools/todo.sh :

#/bin/sh

echo 'To Do List'

find . -type f | xargs grep -n TODO | perl -ne \
'($file, $line, $rest) += split /:/, $_, 3; 
$file =~ s|^./||;  
$rest =~ s|.*?TODO.*?[-\s:]+||; 
$rest =~ s|"[.;,]\s*$||;  
$rest =~ s|\\n||g; print "$file $line: \u$rest\n"' 

| sort | uniq | grep -v '.#' | grep -v Makefile | grep -v CVS
[download]

Which I call from my Makefile:

todo: Tools/todo.sh [download]

Kinda ugly, but it lets me put the TODO statements where I actually need to do the work.

So I can proof out a block of code by writing narrative comments with TODO at the start of the line (behind comment characters of course).

Then fill in the code later and not worry about missing a piece. Also since the TODOs are where the stuff needs to be filled in, I have lots of context around the issue and don't need to write as much as I would if they were at the top of the file. Plus anyone without something to do in the group can just type make todo and add some code. Finally, it is easier to add a TODO right where you need it, than bop up to the top of the file and then have to find where you were back in the code.

-ben

[Nov 15, 2017] Basic Debugger Commands

Notable quotes:
"... pseudo-signal handlers, ..."
"... programmatic debugger control ..."
Nov 15, 2017 | my.safaribooksonline.com

Debugging is just an extreme case of dynamic analysis. Third-party code can be extremely convoluted (so can your own code, of course, but you don't usually think of it that way because you're familiar with it; you knew it when it was just a subroutine); sometimes you just can't tell how part of the code fits in, or whether it's called at all. The code is laid out in some arrangement that makes no sense; if only you could see where the program would actually go when it was run.

Well, you can, using Perl's built-in debugger. Even though you're not actually trying to find a bug, the code-tracing ability of the debugger is perfect for the job.

This isn't the place for a full treatment of the debugger (you can see more detail in [ SCOTT01 ]), but fortunately you don't need a full treatment; a subset of the commands is enough for what you need to do. (Using the debugger is like getting in a fight; it's usually over very quickly without using many of the fancy moves you trained for.)

-d command-line flag; either edit the program to add -d to the shebang line, or run the program by invoking Perl explicitly:

% perl -d program argument argument...

Make sure that the perl in your path is the same one in the shebang line of program or you'll go crazy if there are differences between the two perls.

Basic Debugger Commands

Armed with these commands, we can go code spelunking. Suppose you are debugging a program containing the following code fragment:

77 for my $url (@url_queue)
78 {
79 my $res = $ua->request($url);
80 summarize($res->content);
81 }

and you know that whenever the program gets to the URL http://www.perlmedic.com/fnord.html something strange happens in the summarize() subroutine. You'd like to check the HTTP::Response object to see if there were any redirects you didn't know about. You start the program under the debugger and type:

DB<1> b 80 $url =~ /fnord/
DB<2>

The program will run until it has fetched the URL you're interested in, at which point you can examine the response object -- here's an example of what it might look like:

Perl 5.8.0 and later will give you a stack trace anyway if you run a program under the debugger and some code triggers a warning. But suppose you are either running under an earlier perl, or you'd really like to have a debugger prompt at the point the warning was about to happen.

You can combine two advanced features of Perl to do this: pseudo-signal handlers, and programmatic debugger control .

A signal handler is a subroutine you can tell Perl to execute whenever your program receives a signal. For instance, when the user interrupts your program by pressing Control-C, that works by sending an INT signal to your program, which interprets it by default as an instruction to stop executing.

There are two pseudo-signals, called __WARN__ and __DIE__ . They aren't real signals, but Perl "generates" them whenever it's told to issue a warning or to die, respectively. You can supply code to be run in those events by inserting a subroutine reference in the %SIG hash (see perlvar ) as follows:

$SIG{__WARN__} = sub { print "Ouch, I'm bad" };

(Try it on some code that generates a warning.)

The next piece of the solution is that the debugger can be controlled from within your program; the variable $single in the special package DB determines what Perl does at each statement: 0 means keep going, and 1 or 2 mean give a user prompt. 1 So setting $DB::single to 1 in a pseudo-signal handler will give us a debugger prompt at just the point we wanted.

1 . The difference between the two values is that a 1 causes the debugger to act as though the last n or s command the user typed was s , whereas a 2 is equivalent to an n . When you type an empty command in the debugger (just hit Return), it repeats whatever the last n or s command was.

Putting the pieces together, you can start running the program under the debugger and give the commands:

DB<1> $SIG{__WARN__} = sub { warn @_; $DB::single = 1 }
DB<2>

Now the program will breakpoint where it was about to issue a warning, and you can issue a T command to see a stack trace, examine data, or do anything else you want. 2 The warning is still printed first.

2 . Under some circumstances, the breakpoint might not occur at the actual place of warning: The current routine might return if the statement triggering the warning is the last one being executed in that routine.

Unfortunately, no __DIE__ pseudo-signal handler will return control to the debugger (evidently death is considered too pressing an engagement to be interrupted). However, you can get a stack trace by calling the confess() function in the Carp module:

DB<1> use Carp
DB<2> $SIG{__DIE__} = sub { confess (@_) }

The output will look something like this:

DB<3>
Insufficient privilege to launch preemptive strike at wargames line
109.
main::__ANON__[(eval 17)[/usr/lib/perl5/5.6.1/
perl5db.pl:1521]:2]('Insufficient privilege to launch preemptive
strike at wargames line 109.^J') called at wargames line 121
main::preemptive('Strike=HASH(0x82069d4)') called at wargames
line 109
main::make_strike('ICBM=HASH(0x820692c)') called at wargames
line 74
main::icbm('Silo_ND') called at wargames line 32
main::wmd('ICBM') called at wargames line 22
main::strike() called at wargames line 11
main::menu() called at wargames line 5
Debugged program terminated. Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.

I've often found it amusing that the debugger refers to the program at this point as "debugged."

[Nov 15, 2017] Strange behaviour of tr function in case the set1 is supplied by a variable

Notable quotes:
"... Characters may be literals or any of the escape sequences accepted in double-quoted strings. But there is no interpolation, so "$" and "@" are treated as literals. ..."
Nov 15, 2017 | perlmonks.com
Nov 16, 2017 at 02:50 UTC ( # 1203542 = perlquestion : print w/replies , xml ) Need Help??

likbez has asked for the wisdom of the Perl Monks concerning the following question:

Looks like in tr function a scalar variable is accepted as the first argument, but is not compiled properly into set of characters

use strict;
use warnings;

my $str1 = 'abcde';
my $str2 = 'eda';
my $diff1 = 0;

eval "\$diff1=\$str1=~tr/$str2//";

print "diff1: $diff1\n";

$ perl foo.pl
diff1: 3

[download]

This produces in perl 5, version 26:

Test 1: strait set diff1=0, diff2=3
Test 2: complement set diff1=5, diff2=2

[download]

Obviously only the second result in both tests is correct. Looks like only explicitly given first set is correctly compiled. Is this a feature or a bug ?

Athanasius (Chancellor) on Nov 16, 2017 at 03:08 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Hello likbez ,

The transliteration operator tr/SEARCHLIST/REPLACEMENTLIST/ does not interpolate its SEARCHLIST , so in your first example the search list is simply the literal characters , , , , . See Quote and Quote like Operators .

Hope that helps,

Athanasius  < contra mundum Iustus alius egestas vitae, eros Piratica,

roboticus (Chancellor) on Nov 16, 2017 at 03:08 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

likbez :

Feature, per the tr docs

Characters may be literals or any of the escape sequences accepted in double-quoted strings. But there is no interpolation, so "$" and "@" are treated as literals.

A hyphen at the beginning or end, or preceded by a backslash is considered a literal. Escape sequence details are in the table near the beginning of this section.

So if you want to use a string to specify the values in a tr statement, you'll probably have to do it via a string eval:

$ cat foo.pl use strict; use warnings;
my $str1 = 'abcde';
my $str2 = 'eda';
my $diff1 = 0;
eval "\$diff1=\$str1=~tr/$str2//";
print "diff1: $diff1\n";
perl foo.pl diff1: 3

[download]

... roboticus

When your only tool is a hammer, all problems look like your thumb.

Anonymous Monk on Nov 16, 2017 at 03:09 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Looks like in tr function a scalar variable is accepted as the fist argument, but is not compiled properly into set of characters

:)

you're guessing how tr /// works, you're guessing it works like s/// or m///, but you can't guess , it doesn't work like that, it doesn't interpolate variables, read perldoc -f tr for the details

likbez !!! on Nov 16, 2017 at 04:41 UTC

Re^2: Strange behaviour of tr function in case the set1 is supplied by a variable
you're guessing how tr/// works, you're guessing it works like s/// or m///, but you can't guess , it doesn't work like that, it doesn't interpolate variables, read perldoc -f tr for the details
Houston, we have a problem ;-)

First of all that limits tr area of applicability.

The second, it's not that I am guessing, I just (wrongly) extrapolated regex behavior on tr, as people more often use regex then tr. Funny, but searching my old code and comments in it is clear that I remembered (probably discovered the hard way, not by reading the documentation ;-) this nuance several years ago. Not now. Completely forgotten. Erased from memory. And that tells you something about Perl complexity (actually tr is not that frequently used by most programmers, especially for counting characters).

And that's a real situation, that we face with Perl in other areas too (and not only with Perl): Perl exceeds typical human memory capacity to hold the information about the language. That's why we need "crutches" like strict.

You simply can't remember all the nuances of more then a dozen of string-related built-in functions, can you? You probably can (and should) for index/rindex and substr , but that's about it.

So here are two problems here:

1. Are / / strings uniformly interpreted in the language, or there is a "gotcha" because they are differently interpreted by tr (essentially as a single quoted strings) and regex (as double quoted strings) ?

2. If so, what is the quality of warnings about this gotcha? There is no warning issued, if you use strict and warnings. BTW, it looks like $ can be escaped:

main::(-e:1): 0
DB<5> $_='\$bba\$'
DB<6> tr/\$/?/
DB<7> print $_
\?bba\?

[download]

Right now there is zero warnings issued with use strict and use warnings enabled. Looks like this idea of using =~ for tr was not so good, after all. Regular syntax like tr(set1, set2) would be much better. But it's to late to change and now we need warnings to be implemented.

likbez !!! on Nov 16, 2017 at 03:10 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

With eval statement works correctly. So it looks like $ is treated by tr as a regular symbol and no warnings are issued.

$statement='$diff1=$str1'."=~tr/$str2//;";
eval($statement);
print "With eval: diff1=$diff1\n";
[download]

that will produce:

With eval: diff1=3

ww (Archbishop) on Nov 16, 2017 at 03:16 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Same results in AS 5.24 under Win7x64.

Suspected problem might have arisen from lack of strict, warnings. Wrong, same results BUT using both remains a generally good idea.

Also wondered if compiling (with qr/.../ ) might change the outcome. Wrong again, albeit with variant (erroneous) output.

Correct me if I'm wrong, guessing that "strait" is a typo or personal shortening of "straight."

Update: Now that I've seen earlier replies... ouch, pounding forehead into brick wall!

[Nov 15, 2017] Preface (Modern Perl 2011-2012)

Nov 15, 2017 | modernperlbooks.com

Modern Perl is one way to describe the way the world's most effective Perl 5 programmers work. They use language idioms. They take advantage of the CPAN. They show good taste and craft to write powerful, maintainable, scalable, concise, and effective code. You can learn these skills too!

Perl first appeared in 1987 as a simple tool for system administration. Though it began by declaring and occupying a comfortable niche between shell scripting and C programming, it has become a powerful, general-purpose language family. Perl 5 has a solid history of pragmatism and a bright future of polish and enhancement Perl 6 is a reinvention of programming based on the solid principles of Perl, but it's a subject of another book.

Over Perl's long history -- especially the 17 years of Perl 5 -- our understanding of what makes great Perl programs has changed. While you can write productive programs which never take advantage of all the language has to offer, the global Perl community has invented, borrowed, enhanced, and polished ideas and made them available to anyone willing to learn them.

[Nov 14, 2017] Exporter - search.cpan.org

Nov 14, 2017 | search.cpan.org

Todd Rinaldo > Exporter-5.72 > Exporter

Download:
Exporter-5.72.tar.gz

Dependencies

Annotate this POD

View/Report Bugs
Module Version: 5.72 Source NAME ^

Exporter - Implements default import method for modules

SYNOPSIS ^

In module YourModule.pm :

  package YourModule;
  require Exporter;
  @ISA = qw(Exporter);
  @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request

or

  package YourModule;
  use Exporter 'import'; # gives you Exporter's import() method directly
  @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request

In other files which wish to use YourModule :

  use YourModule qw(frobnicate);      # import listed symbols
  frobnicate ($left, $right)          # calls YourModule::frobnicate

Take a look at "Good Practices" for some variants you will like to use in modern Perl code.

DESCRIPTION ^

The Exporter module implements an import method which allows a module to export functions and variables to its users' namespaces. Many modules use Exporter rather than implementing their own import method because Exporter provides a highly flexible interface, with an implementation optimised for the common case.

Perl automatically calls the import method when processing a use statement for a module. Modules and use are documented in perlfunc and perlmod . Understanding the concept of modules and how the use statement operates is important to understanding the Exporter.

How to Export

The arrays @EXPORT and @EXPORT_OK in a module hold lists of symbols that are going to be exported into the users name space by default, or which they can request to be exported, respectively. The symbols can represent functions, scalars, arrays, hashes, or typeglobs. The symbols must be given by full name with the exception that the ampersand in front of a function is optional, e.g.

    @EXPORT    = qw(afunc $scalar @array);   # afunc is a function
    @EXPORT_OK = qw(&bfunc %hash *typeglob); # explicit prefix on &bfunc

If you are only exporting function names it is recommended to omit the ampersand, as the implementation is faster this way.

Selecting What to Export

Do not export method names!

Do not export anything else by default without a good reason!

Exports pollute the namespace of the module user. If you must export try to use @EXPORT_OK in preference to @EXPORT and avoid short or common symbol names to reduce the risk of name clashes.

Generally anything not exported is still accessible from outside the module using the YourModule::item_name (or $blessed_ref->method ) syntax. By convention you can use a leading underscore on names to informally indicate that they are 'internal' and not for public use.

(It is actually possible to get private functions by saying:

  my $subref = sub { ... };
  $subref->(@args);            # Call it as a function
  $obj->$subref(@args);        # Use it as a method

However if you use them for methods it is up to you to figure out how to make inheritance work.)

As a general rule, if the module is trying to be object oriented then export nothing. If it's just a collection of functions then @EXPORT_OK anything but use @EXPORT with caution. For function and method names use barewords in preference to names prefixed with ampersands for the export lists.

Other module design guidelines can be found in perlmod .

How to Import

In other files which wish to use your module there are three basic ways for them to load your module and import its symbols:

use YourModule;
This imports all the symbols from YourModule's @EXPORT into the namespace of the use statement.
use YourModule ();
This causes perl to load your module but does not import any symbols.
use YourModule qw(...);
This imports only the symbols listed by the caller into their namespace. All listed symbols must be in your @EXPORT or @EXPORT_OK , else an error occurs. The advanced export features of Exporter are accessed like this, but with list entries that are syntactically distinct from symbol names.

Unless you want to use its advanced features, this is probably all you need to know to use Exporter.

Advanced Features ^ Specialised Import Lists

If any of the entries in an import list begins with !, : or / then the list is treated as a series of specifications which either add to or delete from the list of names to import. They are processed left to right. Specifications are in the form:

    [!]name         This name only
    [!]:DEFAULT     All names in @EXPORT
    [!]:tag         All names in $EXPORT_TAGS{tag} anonymous array
    [!]/pattern/    All names in @EXPORT and @EXPORT_OK which match

A leading ! indicates that matching names should be deleted from the list of names to import. If the first specification is a deletion it is treated as though preceded by :DEFAULT. If you just want to import extra names in addition to the default set you will still need to include :DEFAULT explicitly.

e.g., Module.pm defines:

    @EXPORT      = qw(A1 A2 A3 A4 A5);
    @EXPORT_OK   = qw(B1 B2 B3 B4 B5);
    %EXPORT_TAGS = (T1 => [qw(A1 A2 B1 B2)], T2 => [qw(A1 A2 B3 B4)]);

Note that you cannot use tags in @EXPORT or @EXPORT_OK.

Names in EXPORT_TAGS must also appear in @EXPORT or @EXPORT_OK.

An application using Module can say something like:

    use Module qw(:DEFAULT :T2 !B3 A3);

Other examples include:

    use Socket qw(!/^[AP]F_/ !SOMAXCONN !SOL_SOCKET);
    use POSIX  qw(:errno_h :termios_h !TCSADRAIN !/^EXIT/);

Remember that most patterns (using //) will need to be anchored with a leading ^, e.g., /^EXIT/ rather than /EXIT/ .

You can say BEGIN { $Exporter::Verbose=1 } to see how the specifications are being processed and what is actually being imported into modules.

Exporting Without Using Exporter's import Method

Exporter has a special method, 'export_to_level' which is used in situations where you can't directly call Exporter's import method. The export_to_level method looks like:

    MyPackage->export_to_level(
        $where_to_export, $package, @what_to_export
    );

where $where_to_export is an integer telling how far up the calling stack to export your symbols, and @what_to_export is an array telling what symbols *to* export (usually this is @_ ). The $package argument is currently unused.

For example, suppose that you have a module, A, which already has an import function:

    package A;

    @ISA = qw(Exporter);
    @EXPORT_OK = qw($b);

    sub import
    {
        $A::b = 1;     # not a very useful import method
    }

and you want to Export symbol $A::b back to the module that called package A. Since Exporter relies on the import method to work, via inheritance, as it stands Exporter::import() will never get called. Instead, say the following:

    package A;
    @ISA = qw(Exporter);
    @EXPORT_OK = qw($b);

    sub import
    {
        $A::b = 1;
        A->export_to_level(1, @_);
    }

This will export the symbols one level 'above' the current package - ie: to the program or module that used package A.

Note: Be careful not to modify @_ at all before you call export_to_level - or people using your package will get very unexplained results!

Exporting Without Inheriting from Exporter

By including Exporter in your @ISA you inherit an Exporter's import() method but you also inherit several other helper methods which you probably don't want. To avoid this you can do:

  package YourModule;
  use Exporter qw(import);

which will export Exporter's own import() method into YourModule. Everything will work as before but you won't need to include Exporter in @YourModule::ISA .

Note: This feature was introduced in version 5.57 of Exporter, released with perl 5.8.3.

Module Version Checking

The Exporter module will convert an attempt to import a number from a module into a call to $module_name->VERSION($value) . This can be used to validate that the version of the module being used is greater than or equal to the required version.

For historical reasons, Exporter supplies a require_version method that simply delegates to VERSION . Originally, before UNIVERSAL::VERSION existed, Exporter would call require_version .

Since the UNIVERSAL::VERSION method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09.

Managing Unknown Symbols

In some situations you may want to prevent certain symbols from being exported. Typically this applies to extensions which have functions or constants that may not exist on some systems.

The names of any symbols that cannot be exported should be listed in the @EXPORT_FAIL array.

If a module attempts to import any of these symbols the Exporter will give the module an opportunity to handle the situation before generating an error. The Exporter will call an export_fail method with a list of the failed symbols:

  @failed_symbols = $module_name->export_fail(@failed_symbols);

If the export_fail method returns an empty list then no error is recorded and all the requested symbols are exported. If the returned list is not empty then an error is generated for each symbol and the export fails. The Exporter provides a default export_fail method which simply returns the list unchanged.

Uses for the export_fail method include giving better error messages for some symbols and performing lazy architectural checks (put more symbols into @EXPORT_FAIL by default and then take them out if someone actually tries to use them and an expensive check shows that they are usable on that platform).

Tag Handling Utility Functions

Since the symbols listed within %EXPORT_TAGS must also appear in either @EXPORT or @EXPORT_OK , two utility functions are provided which allow you to easily add tagged sets of symbols to @EXPORT or @EXPORT_OK :

  %EXPORT_TAGS = (foo => [qw(aa bb cc)], bar => [qw(aa cc dd)]);

  Exporter::export_tags('foo');     # add aa, bb and cc to @EXPORT
  Exporter::export_ok_tags('bar');  # add aa, cc and dd to @EXPORT_OK

Any names which are not tags are added to @EXPORT or @EXPORT_OK unchanged but will trigger a warning (with -w ) to avoid misspelt tags names being silently added to @EXPORT or @EXPORT_OK . Future versions may make this a fatal error.

Generating Combined Tags

If several symbol categories exist in %EXPORT_TAGS , it's usually useful to create the utility ":all" to simplify "use" statements.

The simplest way to do this is:

  %EXPORT_TAGS = (foo => [qw(aa bb cc)], bar => [qw(aa cc dd)]);

  # add all the other ":class" tags to the ":all" class,
  # deleting duplicates
  {
    my %seen;

    push @{$EXPORT_TAGS{all}},
      grep {!$seen{$_}++} @{$EXPORT_TAGS{$_}} foreach keys %EXPORT_TAGS;
  }

CGI.pm creates an ":all" tag which contains some (but not really all) of its categories. That could be done with one small change:

  # add some of the other ":class" tags to the ":all" class,
  # deleting duplicates
  {
    my %seen;

    push @{$EXPORT_TAGS{all}},
      grep {!$seen{$_}++} @{$EXPORT_TAGS{$_}}
        foreach qw/html2 html3 netscape form cgi internal/;
  }

Note that the tag names in %EXPORT_TAGS don't have the leading ':'.

AUTOLOAD ed Constants

Many modules make use of AUTOLOAD ing for constant subroutines to avoid having to compile and waste memory on rarely used values (see perlsub for details on constant subroutines). Calls to such constant subroutines are not optimized away at compile time because they can't be checked at compile time for constancy.

Even if a prototype is available at compile time, the body of the subroutine is not (it hasn't been AUTOLOAD ed yet). perl needs to examine both the () prototype and the body of a subroutine at compile time to detect that it can safely replace calls to that subroutine with the constant value.

A workaround for this is to call the constants once in a BEGIN block:

   package My ;

   use Socket ;

   foo( SO_LINGER );  ## SO_LINGER NOT optimized away; called at runtime
   BEGIN { SO_LINGER }
   foo( SO_LINGER );  ## SO_LINGER optimized away at compile time.

This forces the AUTOLOAD for SO_LINGER to take place before SO_LINGER is encountered later in My package.

If you are writing a package that AUTOLOAD s, consider forcing an AUTOLOAD for any constants explicitly imported by other packages or which are usually used when your package is use d.

Good Practices ^ Declaring @EXPORT_OK and Friends

When using Exporter with the standard strict and warnings pragmas, the our keyword is needed to declare the package variables @EXPORT_OK , @EXPORT , @ISA , etc.

  our @ISA = qw(Exporter);
  our @EXPORT_OK = qw(munge frobnicate);

If backward compatibility for Perls under 5.6 is important, one must write instead a use vars statement.

  use vars qw(@ISA @EXPORT_OK);
  @ISA = qw(Exporter);
  @EXPORT_OK = qw(munge frobnicate);
Playing Safe

There are some caveats with the use of runtime statements like require Exporter and the assignment to package variables, which can be very subtle for the unaware programmer. This may happen for instance with mutually recursive modules, which are affected by the time the relevant constructions are executed.

The ideal (but a bit ugly) way to never have to think about that is to use BEGIN blocks. So the first part of the "SYNOPSIS" code could be rewritten as:

  package YourModule;

  use strict;
  use warnings;

  our (@ISA, @EXPORT_OK);
  BEGIN {
     require Exporter;
     @ISA = qw(Exporter);
     @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request
  }

The BEGIN will assure that the loading of Exporter.pm and the assignments to @ISA and @EXPORT_OK happen immediately, leaving no room for something to get awry or just plain wrong.

With respect to loading Exporter and inheriting, there are alternatives with the use of modules like base and parent .

  use base qw(Exporter);
  # or
  use parent qw(Exporter);

Any of these statements are nice replacements for BEGIN { require Exporter; @ISA = qw(Exporter); } with the same compile-time effect. The basic difference is that base code interacts with declared fields while parent is a streamlined version of the older base code to just establish the IS-A relationship.

For more details, see the documentation and code of base and parent .

Another thorough remedy to that runtime vs. compile-time trap is to use Exporter::Easy , which is a wrapper of Exporter that allows all boilerplate code at a single gulp in the use statement.

   use Exporter::Easy (
       OK => [ qw(munge frobnicate) ],
   );
   # @ISA setup is automatic
   # all assignments happen at compile time
What Not to Export

You have been warned already in "Selecting What to Export" to not export:

There's one more item to add to this list. Do not export variable names. Just because Exporter lets you do that, it does not mean you should.

  @EXPORT_OK = qw($svar @avar %hvar); # DON'T!

Exporting variables is not a good idea. They can change under the hood, provoking horrible effects at-a-distance that are too hard to track and to fix. Trust me: they are not worth it.

To provide the capability to set/get class-wide settings, it is best instead to provide accessors as subroutines or class methods instead.

SEE ALSO ^

Exporter is definitely not the only module with symbol exporter capabilities. At CPAN, you may find a bunch of them. Some are lighter. Some provide improved APIs and features. Pick the one that fits your needs. The following is a sample list of such modules.

    Exporter::Easy
    Exporter::Lite
    Exporter::Renaming
    Exporter::Tidy
    Sub::Exporter / Sub::Installer
    Perl6::Export / Perl6::Export::Attrs
LICENSE ^

This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.

[Nov 14, 2017] Exporter - search.cpan.org

Nov 14, 2017 | search.cpan.org

Todd Rinaldo > Exporter-5.72 > Exporter

Download:
Exporter-5.72.tar.gz

Dependencies

Annotate this POD

View/Report Bugs
Module Version: 5.72 Source NAME ^

Exporter - Implements default import method for modules

SYNOPSIS ^

In module YourModule.pm :

  package YourModule;
  require Exporter;
  @ISA = qw(Exporter);
  @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request

or

  package YourModule;
  use Exporter 'import'; # gives you Exporter's import() method directly
  @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request

In other files which wish to use YourModule :

  use YourModule qw(frobnicate);      # import listed symbols
  frobnicate ($left, $right)          # calls YourModule::frobnicate

Take a look at "Good Practices" for some variants you will like to use in modern Perl code.

DESCRIPTION ^

The Exporter module implements an import method which allows a module to export functions and variables to its users' namespaces. Many modules use Exporter rather than implementing their own import method because Exporter provides a highly flexible interface, with an implementation optimised for the common case.

Perl automatically calls the import method when processing a use statement for a module. Modules and use are documented in perlfunc and perlmod . Understanding the concept of modules and how the use statement operates is important to understanding the Exporter.

How to Export

The arrays @EXPORT and @EXPORT_OK in a module hold lists of symbols that are going to be exported into the users name space by default, or which they can request to be exported, respectively. The symbols can represent functions, scalars, arrays, hashes, or typeglobs. The symbols must be given by full name with the exception that the ampersand in front of a function is optional, e.g.

    @EXPORT    = qw(afunc $scalar @array);   # afunc is a function
    @EXPORT_OK = qw(&bfunc %hash *typeglob); # explicit prefix on &bfunc

If you are only exporting function names it is recommended to omit the ampersand, as the implementation is faster this way.

Selecting What to Export

Do not export method names!

Do not export anything else by default without a good reason!

Exports pollute the namespace of the module user. If you must export try to use @EXPORT_OK in preference to @EXPORT and avoid short or common symbol names to reduce the risk of name clashes.

Generally anything not exported is still accessible from outside the module using the YourModule::item_name (or $blessed_ref->method ) syntax. By convention you can use a leading underscore on names to informally indicate that they are 'internal' and not for public use.

(It is actually possible to get private functions by saying:

  my $subref = sub { ... };
  $subref->(@args);            # Call it as a function
  $obj->$subref(@args);        # Use it as a method

However if you use them for methods it is up to you to figure out how to make inheritance work.)

As a general rule, if the module is trying to be object oriented then export nothing. If it's just a collection of functions then @EXPORT_OK anything but use @EXPORT with caution. For function and method names use barewords in preference to names prefixed with ampersands for the export lists.

Other module design guidelines can be found in perlmod .

How to Import

In other files which wish to use your module there are three basic ways for them to load your module and import its symbols:

use YourModule;
This imports all the symbols from YourModule's @EXPORT into the namespace of the use statement.
use YourModule ();
This causes perl to load your module but does not import any symbols.
use YourModule qw(...);
This imports only the symbols listed by the caller into their namespace. All listed symbols must be in your @EXPORT or @EXPORT_OK , else an error occurs. The advanced export features of Exporter are accessed like this, but with list entries that are syntactically distinct from symbol names.

Unless you want to use its advanced features, this is probably all you need to know to use Exporter.

Advanced Features ^ Specialised Import Lists

If any of the entries in an import list begins with !, : or / then the list is treated as a series of specifications which either add to or delete from the list of names to import. They are processed left to right. Specifications are in the form:

    [!]name         This name only
    [!]:DEFAULT     All names in @EXPORT
    [!]:tag         All names in $EXPORT_TAGS{tag} anonymous array
    [!]/pattern/    All names in @EXPORT and @EXPORT_OK which match

A leading ! indicates that matching names should be deleted from the list of names to import. If the first specification is a deletion it is treated as though preceded by :DEFAULT. If you just want to import extra names in addition to the default set you will still need to include :DEFAULT explicitly.

e.g., Module.pm defines:

    @EXPORT      = qw(A1 A2 A3 A4 A5);
    @EXPORT_OK   = qw(B1 B2 B3 B4 B5);
    %EXPORT_TAGS = (T1 => [qw(A1 A2 B1 B2)], T2 => [qw(A1 A2 B3 B4)]);

Note that you cannot use tags in @EXPORT or @EXPORT_OK.

Names in EXPORT_TAGS must also appear in @EXPORT or @EXPORT_OK.

An application using Module can say something like:

    use Module qw(:DEFAULT :T2 !B3 A3);

Other examples include:

    use Socket qw(!/^[AP]F_/ !SOMAXCONN !SOL_SOCKET);
    use POSIX  qw(:errno_h :termios_h !TCSADRAIN !/^EXIT/);

Remember that most patterns (using //) will need to be anchored with a leading ^, e.g., /^EXIT/ rather than /EXIT/ .

You can say BEGIN { $Exporter::Verbose=1 } to see how the specifications are being processed and what is actually being imported into modules.

Exporting Without Using Exporter's import Method

Exporter has a special method, 'export_to_level' which is used in situations where you can't directly call Exporter's import method. The export_to_level method looks like:

    MyPackage->export_to_level(
        $where_to_export, $package, @what_to_export
    );

where $where_to_export is an integer telling how far up the calling stack to export your symbols, and @what_to_export is an array telling what symbols *to* export (usually this is @_ ). The $package argument is currently unused.

For example, suppose that you have a module, A, which already has an import function:

    package A;

    @ISA = qw(Exporter);
    @EXPORT_OK = qw($b);

    sub import
    {
        $A::b = 1;     # not a very useful import method
    }

and you want to Export symbol $A::b back to the module that called package A. Since Exporter relies on the import method to work, via inheritance, as it stands Exporter::import() will never get called. Instead, say the following:

    package A;
    @ISA = qw(Exporter);
    @EXPORT_OK = qw($b);

    sub import
    {
        $A::b = 1;
        A->export_to_level(1, @_);
    }

This will export the symbols one level 'above' the current package - ie: to the program or module that used package A.

Note: Be careful not to modify @_ at all before you call export_to_level - or people using your package will get very unexplained results!

Exporting Without Inheriting from Exporter

By including Exporter in your @ISA you inherit an Exporter's import() method but you also inherit several other helper methods which you probably don't want. To avoid this you can do:

  package YourModule;
  use Exporter qw(import);

which will export Exporter's own import() method into YourModule. Everything will work as before but you won't need to include Exporter in @YourModule::ISA .

Note: This feature was introduced in version 5.57 of Exporter, released with perl 5.8.3.

Module Version Checking

The Exporter module will convert an attempt to import a number from a module into a call to $module_name->VERSION($value) . This can be used to validate that the version of the module being used is greater than or equal to the required version.

For historical reasons, Exporter supplies a require_version method that simply delegates to VERSION . Originally, before UNIVERSAL::VERSION existed, Exporter would call require_version .

Since the UNIVERSAL::VERSION method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09.

Managing Unknown Symbols

In some situations you may want to prevent certain symbols from being exported. Typically this applies to extensions which have functions or constants that may not exist on some systems.

The names of any symbols that cannot be exported should be listed in the @EXPORT_FAIL array.

If a module attempts to import any of these symbols the Exporter will give the module an opportunity to handle the situation before generating an error. The Exporter will call an export_fail method with a list of the failed symbols:

  @failed_symbols = $module_name->export_fail(@failed_symbols);

If the export_fail method returns an empty list then no error is recorded and all the requested symbols are exported. If the returned list is not empty then an error is generated for each symbol and the export fails. The Exporter provides a default export_fail method which simply returns the list unchanged.

Uses for the export_fail method include giving better error messages for some symbols and performing lazy architectural checks (put more symbols into @EXPORT_FAIL by default and then take them out if someone actually tries to use them and an expensive check shows that they are usable on that platform).

Tag Handling Utility Functions

Since the symbols listed within %EXPORT_TAGS must also appear in either @EXPORT or @EXPORT_OK , two utility functions are provided which allow you to easily add tagged sets of symbols to @EXPORT or @EXPORT_OK :

  %EXPORT_TAGS = (foo => [qw(aa bb cc)], bar => [qw(aa cc dd)]);

  Exporter::export_tags('foo');     # add aa, bb and cc to @EXPORT
  Exporter::export_ok_tags('bar');  # add aa, cc and dd to @EXPORT_OK

Any names which are not tags are added to @EXPORT or @EXPORT_OK unchanged but will trigger a warning (with -w ) to avoid misspelt tags names being silently added to @EXPORT or @EXPORT_OK . Future versions may make this a fatal error.

Generating Combined Tags

If several symbol categories exist in %EXPORT_TAGS , it's usually useful to create the utility ":all" to simplify "use" statements.

The simplest way to do this is:

  %EXPORT_TAGS = (foo => [qw(aa bb cc)], bar => [qw(aa cc dd)]);

  # add all the other ":class" tags to the ":all" class,
  # deleting duplicates
  {
    my %seen;

    push @{$EXPORT_TAGS{all}},
      grep {!$seen{$_}++} @{$EXPORT_TAGS{$_}} foreach keys %EXPORT_TAGS;
  }

CGI.pm creates an ":all" tag which contains some (but not really all) of its categories. That could be done with one small change:

  # add some of the other ":class" tags to the ":all" class,
  # deleting duplicates
  {
    my %seen;

    push @{$EXPORT_TAGS{all}},
      grep {!$seen{$_}++} @{$EXPORT_TAGS{$_}}
        foreach qw/html2 html3 netscape form cgi internal/;
  }

Note that the tag names in %EXPORT_TAGS don't have the leading ':'.

AUTOLOAD ed Constants

Many modules make use of AUTOLOAD ing for constant subroutines to avoid having to compile and waste memory on rarely used values (see perlsub for details on constant subroutines). Calls to such constant subroutines are not optimized away at compile time because they can't be checked at compile time for constancy.

Even if a prototype is available at compile time, the body of the subroutine is not (it hasn't been AUTOLOAD ed yet). perl needs to examine both the () prototype and the body of a subroutine at compile time to detect that it can safely replace calls to that subroutine with the constant value.

A workaround for this is to call the constants once in a BEGIN block:

   package My ;

   use Socket ;

   foo( SO_LINGER );  ## SO_LINGER NOT optimized away; called at runtime
   BEGIN { SO_LINGER }
   foo( SO_LINGER );  ## SO_LINGER optimized away at compile time.

This forces the AUTOLOAD for SO_LINGER to take place before SO_LINGER is encountered later in My package.

If you are writing a package that AUTOLOAD s, consider forcing an AUTOLOAD for any constants explicitly imported by other packages or which are usually used when your package is use d.

Good Practices ^ Declaring @EXPORT_OK and Friends

When using Exporter with the standard strict and warnings pragmas, the our keyword is needed to declare the package variables @EXPORT_OK , @EXPORT , @ISA , etc.

  our @ISA = qw(Exporter);
  our @EXPORT_OK = qw(munge frobnicate);

If backward compatibility for Perls under 5.6 is important, one must write instead a use vars statement.

  use vars qw(@ISA @EXPORT_OK);
  @ISA = qw(Exporter);
  @EXPORT_OK = qw(munge frobnicate);
Playing Safe

There are some caveats with the use of runtime statements like require Exporter and the assignment to package variables, which can be very subtle for the unaware programmer. This may happen for instance with mutually recursive modules, which are affected by the time the relevant constructions are executed.

The ideal (but a bit ugly) way to never have to think about that is to use BEGIN blocks. So the first part of the "SYNOPSIS" code could be rewritten as:

  package YourModule;

  use strict;
  use warnings;

  our (@ISA, @EXPORT_OK);
  BEGIN {
     require Exporter;
     @ISA = qw(Exporter);
     @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request
  }

The BEGIN will assure that the loading of Exporter.pm and the assignments to @ISA and @EXPORT_OK happen immediately, leaving no room for something to get awry or just plain wrong.

With respect to loading Exporter and inheriting, there are alternatives with the use of modules like base and parent .

  use base qw(Exporter);
  # or
  use parent qw(Exporter);

Any of these statements are nice replacements for BEGIN { require Exporter; @ISA = qw(Exporter); } with the same compile-time effect. The basic difference is that base code interacts with declared fields while parent is a streamlined version of the older base code to just establish the IS-A relationship.

For more details, see the documentation and code of base and parent .

Another thorough remedy to that runtime vs. compile-time trap is to use Exporter::Easy , which is a wrapper of Exporter that allows all boilerplate code at a single gulp in the use statement.

   use Exporter::Easy (
       OK => [ qw(munge frobnicate) ],
   );
   # @ISA setup is automatic
   # all assignments happen at compile time
What Not to Export

You have been warned already in "Selecting What to Export" to not export:

There's one more item to add to this list. Do not export variable names. Just because Exporter lets you do that, it does not mean you should.

  @EXPORT_OK = qw($svar @avar %hvar); # DON'T!

Exporting variables is not a good idea. They can change under the hood, provoking horrible effects at-a-distance that are too hard to track and to fix. Trust me: they are not worth it.

To provide the capability to set/get class-wide settings, it is best instead to provide accessors as subroutines or class methods instead.

SEE ALSO ^

Exporter is definitely not the only module with symbol exporter capabilities. At CPAN, you may find a bunch of them. Some are lighter. Some provide improved APIs and features. Pick the one that fits your needs. The following is a sample list of such modules.

    Exporter::Easy
    Exporter::Lite
    Exporter::Renaming
    Exporter::Tidy
    Sub::Exporter / Sub::Installer
    Perl6::Export / Perl6::Export::Attrs
LICENSE ^

This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.

[Nov 14, 2017] Perl Medic Transforming Legacy Code Peter Scott 9780201795264 Amazon.com Books

Nov 14, 2017 | www.amazon.com

By Ricardo Signes on August 5, 2007

great if you inherit wretched old code

When Perl Medic came out, we received a review copy. It was the first review book we got, and I was pretty excited. I took it to (of all places) the gym with me, and read it while I ran. A lot of Perl People were raving about how it was the awesomest book in a long time, and I just couldn't get that excited about it. Despite that, I find myself recommending it to more and more fellow programmers.Here are the things that I find especially useful in the book:

Tests

I am all for testing. I like testing. Testing helps me code better. Testing helps me figure out what badly documented features should do, and helps me notice that my patches are going to break in production. It's just the right thing to do. When reading, I thought the coverage of testing was a too long, but that was because I was already at home with it. Really, it's the right length for someone who's not already testing, and Peter Scott should be applauded for writing one of the first Perl books to really explain and encourage testing. Moreover, he stays true to the book's "for maintenance programmers" nature and talks about the troubles with writing tests for code you didn't write, including testing traditionally hard to test things like CGI scripts.

perldelta

The book walks through historical versions of perl from version 4 (!) to 5.8.3. The authors tells you what changed, and what you should probably do if you upgrade the platform your old Perl code is running on. (Moving to 5.6? Now you can use C<our>. Moving to 5.8? Restricted hashes!)

Cargo Cult Perl

The author devotes almost a whole chapter to pointing out stupid things that people write because they don't really know what they're doing. open without checking the return, symbolic references, three part for, and (argh!) return undef.

I hear Dominus is working on a book on this topic. Until then, this should help people write code that will be less of a pain to maintain.

*other stuff*

Perl Medic offers a concise explanation of scoping and variable types. It's no Coping with Scoping, but it's quite clear and covers C<our>. While a lot of Perl Medic is really for experienced folks maintaining old code, this section is something every new Perl hacker should read.

There's a nice section on figuring out WTF existing code /does/, suggesting modules to benchmark, profile, deobfuscate, and otherwise dissect the horrible code you're handed.

In the appendices, there's a few pages on "How to Ask Questions that Get Answered." There exist many of these guides, but I don't care. If every technical book spent two or three pages on this, it would be a blessing. Somebody who knows how to ask a question is going to get help, and is going to quickly be received into the community where he asks it.

*the down side*

I guess the biggest down side for me was that I just didn't learn many new things from this book. A number of things were well stated, and I recognized that this book would be useful for many people, but for the most part I didn't have many "A-ha!" moments.

There's a section on "how to use the CPAN," and I felt it was out of place. This might be, though, because I find the idea of Perl without CPAN (for non-tiny projects) to be insane. I guess there must be people who do have to deal with "anti-CPAN policies," though.

This book is definitely a good buy for anybody who's going to be taking over someone else's code, especially code that's old or just lousy. (Even if it's good code, this book can help.) Whenever I leave my job, I will make sure they pick up a copy for the new guy.

[Nov 14, 2017] Perl archeology Need help in refactoring of old Perl code that does not use strict

Nov 14, 2017 | perlmonks.com

likbez has asked for the wisdom of the Perl Monks concerning the following question:

Edit

This is kind of topic that previously was reserved to Cobol and PL/1 forums ;-) but now Perl is almost 30 years old and it looks like the space for Perl archeology is gradually opening ;-).

I got a dozen of fairly large scripts (several thousand lines each) written in a (very) early version of Perl 5 (below Perl 5.6), I now need:

1. Convert them to use strict pragma. The problem is that all of them share (some heavily, some not) information from main program to subroutines (and sometimes among subroutines too) via global variables in addition to (or sometimes instead of) parameters. Those scripts mostly do not use my declarations either.

So I need to map variables into local and global namespaces for each subroutine (around 40 per script; each pretty small -- less then hundred lines) to declare them properly.

As initial step I just plan use global variable with namespace qualification or our lists for each subroutine. Currently I plan to postprocess output of perl -MO=Xref old_perl_script.pl

and generate such statement. Is there a better way ?

2. If possible, I want to split the main namespace into at least two chunks putting all subroutines into another namespace, or module. I actually do not know how to export subroutines names into other namespace (for example main::) when just package statements is used in Perl as in example below. Modules do some magic via exporter that I just use but do not fully understand. For example if we have

#main_script ... ... ... x:a(1,2,3); ... ... ... package x; sub a {...) sub b {...} sub c {...} package y; ... ... ... [download] How can I access subs a,b,c without qualifying them with namespace x from the main:: namespace?

3. Generally this task looks like a case of refactoring. I wonder, if any Perl IDE has some of required capabilities, or are there tools that can helpful.

My time to make the conversion is limited and using some off the shelf tools that speed up the process would be a great help.

Any advice will be greatly appreciated.

AnomalousMonk (Chancellor) on Nov 14, 2017 at 07:20 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict

I'd like to suggest that you also need a

Step 0: Write a test suite that the current code passes for all normal modes of operation and for all failure modes.
With this test suite, you can be reasonably certain that refactored code isn't just going to be spreading the devastation.

Given that you seem to be describing a spaghetti-coded application with communication from function to function via all kinds of secret tunnels and spooky-action-at-a-distance global variables, I'd say you have a job on your hands just with Step 0. But you've already taken a test suite into consideration... Right?


Give a man a fish : <%-{-{-{-<

Monk::Thomas (Friar) on Nov 14, 2017 at 12:14 UTC

Re^2: Perl archeology: Need help in refactoring of old Perl code that does not use strict


by Monk::Thomas (Friar) on Nov 14, 2017 at 12:14 UTC

This is what I would do after 'Step 0':

If the variable does change during the run then pick a different function first. When you got the global state disentangled a bit it's a lot easier to reason about what this code is doing. Everything that's still using a global needs to be treated with very careful attention.

Corion (Pope) on Nov 14, 2017 at 08:45 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict

In addition to AnomalousMonk s advice of a test suite, I would suggest at the very least to invest the time up front to run automatic regression tests between whatever development version of the program you have and the current "good" (but ugly) version. That way you can easily verify whether your change affected the output and operation of the program. Ideally, the output of your new program and the old program should remain identical while you are cleaning things up.

Note that you can enable strict locally in blocks, so you don't need to make the main program compliant but can start out with subroutines or files and slowly convert them.

For your second question, have a look at Exporter . Basically it allows you to im/export subroutine names between packages:

package x;
use Exporter 'import';
our @EXPORT_OK = ('a', 'b', 'c');
[download] #main_script use x 'a', 'b'; # makes a() and b() available in the main namespace [download]

To find and collect the global variables, maybe it helps you to dump the global namespace before and after your program has run. All these names are good candidates for being at least declared via our to make them visible, and then ideally removed to pass the parameters explicitly instead of implicitly:

      #!perl -w
use strict;

our $already_fixed = 1; # this won't show up

# Put this right before the "uncleaned" part of the script starts
my %initial_variables;
BEGIN {
    %initial_variables = %main::; # make a copy at the start of the program
}
END {
#use Data::Dumper;
#warn Dumper \%initial_variables;
#warn Dumper \%main::;
    # At the end, look what names came newly into being, and tell us about them:
    for my $key (sort keys %main::) {
        if( ! exists $initial_variables{ $key } ) {
            print "Undeclared global variable '$key' found\n";
            
            my $glob = $main::{ $key };
            
            if( defined *{ $glob }{GLOB}) {
                print "used as filehandle *'$key', replace by a lexical filehandle\n";
            };
            if( defined *{ $glob }{CODE}) {
                print "used as subroutine '$key'\n"; # so maybe a false alarm unless you dynamically load code?!
            };
            if( defined *{ $glob }{SCALAR}) {
                print "used as scalar \$'$key', declare as 'our'\n";
            };
            if( defined *{ $glob }{ARRAY}) {
                print "used as array \@'$key', declare as 'our'\n";
            };
            if( defined *{ $glob }{HASH}) {
                print "used as hash \%'$key', declare as 'our'\n";
            };
        };
    };
}
no strict;

$foo = 1;
@bar = (qw(baz bat man));
open LOG, '<', *STDIN;
sub foo_2 {}
 
[download]

The above code is a rough cut and for some reason it claims all global names as scalars in addition to their real use, but it should give you a start at generating a list of undeclared names.

Also see Of Symbol Tables and Globs .

Anonymous Monk on Nov 14, 2017 at 08:26 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict (hurry up and wait)

1) ... strict pragma ...My time to make the conversion is limited and using some off the shelf tools that speed up the process would be a great help.

Hurry up and leave it alone :)

use strict; itself confers no benefits; The benefits come from avoidance of the bad practices forbidden by strict :)

That pretty much means convert one at a time by hand after you have learned the understanding of importance of knowing :) Speed kills

2. If possible ... I do not understand ...

That is a hint you shouldn't be refactoring anything programmatically. There are a million nodes on perlmonks, and a readers digest version might be Modern Perl a loose description of how experienced and effective Perl 5 programmers work....You can learn this too.

Hurry up and bone up

3. Generally this task looks like a case of refactoring. I wonder, if any Perl IDE has some of required capabilities, or are there tools that can helpful.

I hope you have foot insurance :) happy hunting :) perlcritic , PPI / PPIx::XPath , PPIx::EditorTools ,
App::EditorTools - Command line tool for Perl code refactoring
Code::CutNPaste - Find Duplicate Perl Code

So enjoy, test first, step0++

[Nov 14, 2017] scoping - What is the difference between my and local in Perl - Stack Overflow

Notable quotes:
"... temporarily changes the value of the variable ..."
"... within the scope ..."
"... Unlike dynamic variables created by the local operator, lexical variables declared with my are totally hidden from the outside world, including any called subroutines. ..."
Nov 14, 2017 | stackoverflow.com

down vote favorite 10

Brian G ,Sep 24, 2008 at 20:12

I am seeing both of them used in this script I am trying to debug and the literature is just not clear. Can someone demystify this for me?

J.J. ,Sep 24, 2008 at 20:24

Dynamic Scoping. It is a neat concept. Many people don't use it, or understand it.

Basically think of my as creating and anchoring a variable to one block of {}, A.K.A. scope.

my $foo if (true); # $foo lives and dies within the if statement.

So a my variable is what you are used to. whereas with dynamic scoping $var can be declared anywhere and used anywhere. So with local you basically suspend the use of that global variable, and use a "local value" to work with it. So local creates a temporary scope for a temporary variable.

$var = 4;
print $var, "\n";
&hello;
print $var, "\n";

# subroutines
sub hello {
     local $var = 10;
     print $var, "\n";
     &gogo; # calling subroutine gogo
     print $var, "\n";
}
sub gogo {
     $var ++;
}

This should print:

4
10
11
4

Brad Gilbert ,Sep 24, 2008 at 20:50

You didn't call the subroutines. – Brad Gilbert Sep 24 '08 at 20:50

brian d foy ,Sep 25, 2008 at 18:23

Don't conditionally declare lexical variables: it has undefined behavior. – brian d foy Sep 25 '08 at 18:23

Jeremy Bourque ,Sep 24, 2008 at 20:26

The short answer is that my marks a variable as private in a lexical scope, and local marks a variable as private in a dynamic scope.

It's easier to understand my , since that creates a local variable in the usual sense. There is a new variable created and it's accessible only within the enclosing lexical block, which is usually marked by curly braces. There are some exceptions to the curly-brace rule, such as:

foreach my $x (@foo) { print "$x\n"; }

But that's just Perl doing what you mean. Normally you have something like this:

sub Foo {
   my $x = shift;

   print "$x\n";
}

In that case, $x is private to the subroutine and it's scope is enclosed by the curly braces. The thing to note, and this is the contrast to local , is that the scope of a my variable is defined with respect to your code as it is written in the file. It's a compile-time phenomenon.

To understand local , you need to think in terms of the calling stack of your program as it is running. When a variable is local , it is redefined from the point at which the local statement executes for everything below that on the stack, until you return back up the stack to the caller of the block containing the local .

This can be confusing at first, so consider the following example.

sub foo { print "$x\n"; }
sub bar { local $x; $x = 2; foo(); }

$x = 1;
foo(); # prints '1'
bar(); # prints '2' because $x was localed in bar
foo(); # prints '1' again because local from foo is no longer in effect

When foo is called the first time, it sees the global value of $x which is 1. When bar is called and local $x runs, that redefines the global $x on the stack. Now when foo is called from bar , it sees the new value of 2 for $x . So far that isn't very special, because the same thing would have happened without the call to local . The magic is that when bar returns we exit the dynamic scope created by local $x and the previous global $x comes back into scope. So for the final call of foo , $x is 1.

You will almost always want to use my , since that gives you the local variable you're looking for. Once in a blue moon, local is really handy to do cool things.

Drew Stephens ,Sep 24, 2008 at 22:58

Quoting from Learning Perl :

But local is misnamed, or at least misleadingly named. Our friend Chip Salzenberg says that if he ever gets a chance to go back in a time machine to 1986 and give Larry one piece of advice, he'd tell Larry to call local by the name "save" instead.[14] That's because local actually will save the given global variable's value away, so it will later automatically be restored to the global variable. (That's right: these so-called "local" variables are actually globals!) This save-and-restore mechanism is the same one we've already seen twice now, in the control variable of a foreach loop, and in the @_ array of subroutine parameters.

So, local saves a global variable's current value and then set it to some form of empty value. You'll often see it used to slurp an entire file, rather than leading just a line:

my $file_content;
{
    local $/;
    open IN, "foo.txt";
    $file_content = <IN>;
}

Calling local $/ sets the input record separator (the value that Perl stops reading a "line" at) to an empty value, causing the spaceship operator to read the entire file, so it never hits the input record separator.

Aristotle Pagaltzis ,Sep 25, 2008 at 23:25

I can't believe no one has linked to Mark Jason Dominus' exhaustive treatises on the matter:

dan1111 ,Jan 28, 2013 at 11:21

Word of warning: both of these articles are quite old, and the second one (by the author's own warning) is obsolete. It demonstrates techniques for localization of file handles that have been superseded by lexical file handles in modern versions of Perl. – dan1111 Jan 28 '13 at 11:21

Floegipoky ,Jan 23, 2015 at 16:51

As in Clinton was President (of the US) when the first was written – Floegipoky Jan 23 '15 at 16:51

Steve Jessop ,Sep 24, 2008 at 20:21

http://perldoc.perl.org/perlsub.html#Private-Variables-via-my()

Unlike dynamic variables created by the local operator, lexical variables declared with my are totally hidden from the outside world, including any called subroutines. This is true if it's the same subroutine called from itself or elsewhere--every call gets its own copy.

http://perldoc.perl.org/perlsub.html#Temporary-Values-via-local()

A local modifies its listed variables to be "local" to the enclosing block, eval, or do FILE --and to any subroutine called from within that block. A local just gives temporary values to global (meaning package) variables. It does not create a local variable. This is known as dynamic scoping. Lexical scoping is done with my, which works more like C's auto declarations.

I don't think this is at all unclear, other than to say that by "local to the enclosing block", what it means is that the original value is restored when the block is exited.

dlamblin ,Sep 24, 2008 at 20:14

Well Google really works for you on this one: http://www.perlmonks.org/?node_id=94007

From the link:

Quick summary: 'my' creates a new variable, 'local' temporarily amends the value of a variable.

ie, 'local' temporarily changes the value of the variable , but only within the scope it exists in.

Generally use my, it's faster and doesn't do anything kind of weird.

Kevin Crumley ,Sep 24, 2008 at 20:27

While this may be true, it's basically a side effect of the fact that "local"s are intended to be visible down the callstack, while "my"s are not. And while overriding the value of a global may be the main reason for using "local", there's no reason you can't use "local" to define a new variable. – Kevin Crumley Sep 24 '08 at 20:27

1800 INFORMATION ,Jan 21, 2009 at 10:02

local does not actually define a new variable. For example, try using local to define a variable when option explicit is enabled. You need to use "our" or "my" to define a new global or local variable. "local" is correctly used to give a variable a new value – 1800 INFORMATION Jan 21 '09 at 10:02

1800 INFORMATION ,Jan 29, 2009 at 10:45

Jesus did I really say option explicit to refer to the Perl feature. I meant obviously "use strict". I've obviously not coded in Perl in a while – 1800 INFORMATION Jan 29 '09 at 10:45

catfood ,Sep 24, 2008 at 20:18

From man perlsub :

Unlike dynamic variables created by the local operator, lexical variables declared with my are totally hidden from the outside world, including any called subroutines.

So, oversimplifying, my makes your variable visible only where it's declared. local makes it visible down the call stack too. You will usually want to use my instead of local .

Michael Carman ,Sep 25, 2008 at 2:00

Your confusion is understandable. Lexical scoping is fairly easy to understand but dynamic scoping is an unusual concept. The situation is made worse by the names my and local being somewhat inaccurate (or at least unintuitive) for historical reasons.

my declares a lexical variable -- one that is visible from the point of declaration until the end of the enclosing block (or file). It is completely independent from any other variables with the same name in the rest of the program. It is private to that block.

local , on the other hand, declares a temporary change to the value of a global variable. The change ends at the end of the enclosing scope, but the variable -- being global -- is visible anywhere in the program.

As a rule of thumb, use my to declare your own variables and local to control the impact of changes to Perl's built-in variables.

For a more thorough description see Mark Jason Dominus' article Coping with Scoping .

skiphoppy ,Sep 25, 2008 at 18:52

local is an older method of localization, from the times when Perl had only dynamic scoping. Lexical scoping is much more natural for the programmer and much safer in many situations. my variables belong to the scope (block, package, or file) in which they are declared.

local variables instead actually belong to a global namespace. If you refer to a variable $x with local, you are actually referring to $main::x, which is a global variable. Contrary to what it's name implies, all local does is push a new value onto a stack of values for $main::x until the end of this block, at which time the old value will be restored. That's a useful feature in and of itself, but it's not a good way to have local variables for a host of reasons (think what happens when you have threads! and think what happens when you call a routine that genuinely wants to use a global that you have localized!). However, it was the only way to have variables that looked like local variables back in the bad old days before Perl 5. We're still stuck with it.

andy ,Sep 24, 2008 at 20:18

"my" variables are visible in the current code block only. "local" variables are also visible where ever they were visible before. For example, if you say "my $x;" and call a sub-function, it cannot see that variable $x. But if you say "local $/;" (to null out the value of the record separator) then you change the way reading from files works in any functions you call.

In practice, you almost always want "my", not "local".

Abhishek Kulkarni ,Apr 10, 2013 at 5:44

Look at the following code and its output to understand the difference.
our $name = "Abhishek";

sub sub1
{
    print "\nName = $name\n";
    local $name = "Abhijeet";

    &sub2;
    &sub3;
}

sub sub2
{
    print "\nName = $name\n";
}

sub sub3
{
    my $name = "Abhinav";
    print "\nName = $name\n";
}


&sub1;

Output is :

Name = Abhishek

Name = Abhijeet

Name = Abhinav

phreakre ,Oct 1, 2008 at 16:01

dinomite's example of using local to redefine the record delimiter is the only time I have ran across in a lot of perl programming. I live in a niche perl environment [security programming], but it really is a rarely used scope in my experience.

Saravanarajan

add a comment,Aug 6, 2009 at 8:12
&s;

sub s()
{
    local $s="5";
    &b;
    print $s;
}

sub b()
{
    $s++;
}

The above script prints 6.

But if we change local to my it will print 5.

This is the difference. Simple.

,

I think the easiest way to remember it is this way. MY creates a new variable. LOCAL temporarily changes the value of an existing variable.

[Nov 13, 2017] no title

Notable quotes:
"... What happens if the delimiter is indicated to be a null string (a string of zero characters)? ..."
Dec 28, 2006 | perlmonks.com

Re: Understanding Split and Join

I'd put more emphasis on the fact that the first argument to split is always, always, always a regular expression (except for the one special case where it isn't :-). Too often do I see people write code like this:

@stuff = split "|", $string; # or worse ... $delim = "|"; @stuff = split $delim, $string; [download] And expect it to split on the pipe symbol because they have fooled themselves into thinking that the first argument is somehow interpreted as a string rather than a regular expression. duff

jwkrahn (Monsignor) on Dec 28, 2006 at 13:23 UTC

There are cases where it is equally easy to use a regexp in list context to split a string as it is to use the split function. Consider the following examples:my @list = split /\s+/, $string; my @list = $string =~ /(\S+)/g; [download]In the first example you're defining what to throw away. In the second, you're defining what to keep. But you're getting the same results. That is a case where it's equally easy to use either syntax.

In your regexp example you don't need the parentheses, it will work the same without them.

If $string contains leading whitespace then you will NOT get the same results. To demonstrate examples that produce the same results:

my @list = split ' ', $string; my @list = $string =~ /\S+/g; [download]

chromatic (Archbishop) on Dec 29, 2006 at 00:52 UTC

What happens if the delimiter is indicated to be a null string (a string of zero characters)?

perl behaves inconsistently with regard to the "empty" regex:

my $string = 'Monk'; exit unless $string =~ /(o)/; my @matches = $string =~ //; warn join('=', @matches), "\n"; exit unless $string =~ /(o)/; my @letters = split( //, $string ); warn join('-', @letters), "\n"; [download]

ysth (Canon) on Dec 29, 2006 at 08:02 UTC

chromatic has pointed out that split treats an empty pattern normally, not as a directive to reuse the last successfully matching pattern, as m// and s/// do.

A pattern that split treats specially but m// and s/// treat normally is /^/. Normally, ^ only matches at the beginning of a string. Given the /m flag, it also matches after newlines in the interior of the string. It's common to want to break a string up into lines without removing the newlines as splitting on /\n/ would do. One way to do this is @lines = /^(.*\n?)/mg . Another, perhaps more straightforward, is @lines = split /^/m . Without the /m, the ^ should match only at the beginning of the string, so the split should return only one element, containing the entire original string. Since this is useless, and splitting on /^/m instead is common, /^/ silently becomes /^/m.

This only applies to a pattern consisting of just ^; even the apparently equivalent /^(?#)/ or /^ /x are treated normally and don't split the string at all.

ferreira (Chaplain) on Dec 30, 2006 at 19:34 UTC

Both exceptions, the special treatment of // and /^/ by split, are documented in split .

Both may deserve to be mentioned in the tutorial quickly for the profit of the unaware.

The last remark by ysth about the non-equivalence of /^(?#)/ and /^ /x with // for split purposes is a subtle thing.

More subtle if you compare to the fact that / /x , / # /x or even / (?#)/x have the same treatment as // when passed to this function.

Looks like a case to be fixed either in the docs or in the code of the Perl interpreter itself (if not barred by compatibility issues).

[Nov 13, 2017] Strip Pod as Pod from Perl file - Stack Overflow

Nov 13, 2017 | stackoverflow.com

Håkon Hægland ,Nov 2, 2014 at 12:10

I am trying to extract the Pod documentation from a Perl file. I do not want to convert the documentation to text as is done by Pod::Simple::Text . I just want the Pod text as Pod text, such that I can feed it into Pod::Template later. For example:
use warnings;
use strict;
use Pod::Simple::Text;
my $ps=Pod::Simple::Text->new();
my $str;
$ps->output_string( \$str );
$ps->parse_file($0);
print $str;

__END__

=head1 SYNOPSIS

prog [OPTIONS]

This will print the Pod as text. Is there a CPAN module that can give me the Pod text, that is:

=head1 SYNOPSIS

prog [OPTIONS]

instead?

Update

The solution should be able to handle Pod docs in strings, like

my $str = '__END__

=head1 SYNOPSIS';

Miller ,Nov 2, 2014 at 18:42

This can be done using PPI :
use strict;
use warnings;

use PPI;

# Slurp source code
my $src = do { local ( @ARGV, $/ ) = $0; <> };

# Load a document
my $doc = PPI::Document->new( \$src );

# Find all the pod within the doc
my $pod = $doc->find('PPI::Token::Pod');
for (@$pod) {
    print $_->content, "\n";
}

=comment
Hi Pod
=cut

1;

__END__

=head1 SYNOPSIS

prog [OPTIONS]

Outputs:

=comment
Hi Pod
=cut

=head1 SYNOPSIS

prog [OPTIONS]

Håkon Hægland ,Nov 3, 2014 at 12:51

Thanks for this great solution. It even works with Pod docs embedded in strings, like my $str='__END__ =head1 SYNOPSIS';Håkon Hægland Nov 3 '14 at 12:51

Tim ,Nov 2, 2014 at 13:58

Use the -u option for perldoc . This strips out the POD and displays it raw.

If you want to extract the POD from within a Perl program, you could do something like this:

my $rawpod;
if (open my $fh, '-|', 'perldoc', '-u', $filename) {
  local $/;
  my $output = <$fh>;
  if (close $fh) {
    $rawpod = $output;
  }
}

If you really don't want to run perldoc as an executable, you might be interested that the perldoc executable is a very simple wrapper around Pod::Perldoc which you might want to consider using yourself.

Håkon Hægland ,Nov 3, 2014 at 12:55

Thanks, but it does not work with Pod docs embedded in strings. See my updated question for an example.. – Håkon Hægland Nov 3 '14 at 12:55

Tim ,Nov 3, 2014 at 18:54

Well, if you change the question, it's not that surprising that a given answer no longer works. I'm pleased you've found a solution to your new question. – Tim Nov 3 '14 at 18:54

Håkon Hægland ,Nov 5, 2014 at 6:40

The problem with perldoc is that there is a bug, so it thinks Pod embedded in a string belongs to the document. – Håkon Hægland Nov 5 '14 at 6:40

Calle Dybedahl ,Nov 2, 2014 at 12:54

Pod::Simple::SimpleTree will give it to you as a parse tree. You can convert that back to POD source easily enough.

toolic ,Nov 2, 2014 at 13:51

+1 if you provide a runnable code example. – toolic Nov 2 '14 at 13:51

[Nov 13, 2017] Perl for Newbies - Part 3 by Shlomi Fish

Nov 13, 2017 | perl-begin.org

Perl Modules

Perl modules are namespaces that contain function and variables. Two distinct modules may each contain a function (or a variable) with the same name, yet the perl interpreter will be able to tell them apart. Furthermore, both functions can be invoked from the same code.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesDeclaring a Package 3.1. Declaring a Package

In order to designate that a code belongs to a different namespace you should use the package directive. For instance, if you want your module name to be "MyModule" your file should look something like this:

# This is the file MyModule.pm
#

package MyModule;

use strict;
use warnings;

sub a_function
{
    print "Hello, World!\n";
}

1;

Note that a module has to return a true value to the perl interpreter, which explains the use of "1;".

A namespace may contain sub-namespaces. To separate namespace components, use the :: separator. For example:

# This is the file Hoola/Hoop.pm
#

package Hoola::Hoop;

use strict;
use warnings;

my $counter = 0;

sub get_counter
{
    return $counter++;
}

1;

Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesDeclaring a Package Where to find a module 3.1.1. Where to find a module

A module is a separate file that may contain one or more different namespaces. That file is found in a place that corresponds to the module's name. To find the filename of the module relative to the script's directory, replace every :: with a slash and add ".pm" to the name of the last component.

For instance: the MyModule module will be found in the file "MyModule.pm"; the Hello::World module will be found in the file "World.pm" under the Hello sub-directory; etc.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesLoading Modules and Importing their Functions 3.2. Loading Modules and Importing their Functions

In order to have access to a module from within your script (or from within another module) you can use the use directive followed by the name of the module as it is deduced from its path. Here's an example: assume that the file "MyModule.pm" is this:

# This is the file MyModule.pm
#

package MyModule;

use strict;
use warnings;

sub a_function
{
    print "Hello, World!\n";
}

1;

And this is your script:

#!/usr/bin/perl

use strict;
use warnings;

use MyModule;

# Notice that we use "::" to call the function out of the module.
MyModule::a_function();

That way, the program will print "Hello, World!" on the screen.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesLoading Modules and Importing their Functions Accessing Functions from a Different Module 3.2.1. Accessing Functions from a Different Module

As could be seen from the last example, once the module has been use 'd, its functions can be invoked by typing the full path of the package followed by :: and followed by the function name.

Note that if you are in package Foo and you are trying to access functions from package Foo::Bar , then typing Bar::my_func() will not do. You have to type the full path of the module. ( Foo::Bar::my_func() in our case)


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesLoading Modules and Importing their Functions Exporting and Importing Functions 3.2.2. Exporting and Importing Functions

It is possible to make a functions of your module automatically available in any other namespace or script that uses it. To do so one needs to type the following code fragment near the beginning of the module:

use Exporter;

use vars qw(@ISA @EXPORT);

@ISA=qw(Exporter);

@EXPORT=("function1", "function2", "function3");

What this fragment does is make the module inherit the Exporter module which is a special Perl module that can export symbols. Then it declares the special variable @EXPORT which should be filled with all the functions that one wishes to export.

Here is an example which has a module called "Calc" and a script that uses it:

# File: Calc.pm
#
package Calc;

use strict;
use warnings;

use Exporter;

use vars qw(@ISA @EXPORT);

@ISA=qw(Exporter);

@EXPORT=("gcd");

# This function calculates the greatest common divisor of two integers
sub gcd
{
    my $m = shift;
    my $n = shift;

    if ($n > $m)
    {
        ($m, $n) = ($n , $m);
    }

    while ($m % $n > 0)
    {
        ($m, $n) = ($n, $m % $n);
    }

    return $n;
}

1;
#!/usr/bin/perl

use strict;
use warnings;

use Calc;

my $m = 200;
my $n = 15;

print "gcd($m,$n) == " , gcd($m,$n), "\n";

As you can see, the script invokes the "gcd" function of the "Calc" module without having to invoke it with Calc::gcd() . Exporting functions like that should be used with care, as the function names may conflict with those of the importing namespace.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesLoading Modules and Importing their Functions Using Variables from a Different Namespace 3.2.3. Using Variables from a Different Namespace

It is also possible to use the global variables of different packages. However, such variables need to be declared using the use vars qw($myvar1 @myvar2) construct.

Here's an example for a module that defines a variable and another one that access it:

# This file is MyVar.pm
#
package MyVar;

use strict;
use warnings;

# Declare a namespace-scoped variable named $myvar.
use vars qw($myvar);

sub print_myvar
{
    print $myvar, "\n";
}

1;
#!/usr/bin/perl

use strict;
use warnings;

use MyVar;

$MyVar::myvar = "Hello";

MyVar::print_myvar();

$MyVar::myvar = "World";

MyVar::print_myvar();

Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesBEGIN and END 3.3. BEGIN and END

There are two special code blocks for perl modules - BEGIN and END . These blocks are executed when a module is first loaded, and when the perl interpreter is about to unload it, respectively.

Here's an example for a logging module that makes use of this facility:

# File : MyLog.pm
#

package MyLog;

use strict;
use warnings;

BEGIN
{
    open MYLOG, ">", "mylog.txt";
}

sub log
{
    my $what = shift;

    # Strip the string of newline characters
    $what =~ s/\n//g;

    # The MYLOG filehandle is already open by virtue of the BEGIN
    # block.
    print MYLOG $what, "\n";
}

END
{
    close(MYLOG);
}

1;

Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesThe "main" Namespace 3.4. The "main" Namespace

One can access the main namespace (i.e, the namespace of the script), from any other namespace by designating main as the namespace path. For instance, main::hello() will invoke the function named "hello" in the script file.

Usually, the "main" part can be omitted and the symbols be accessed with the notation of ::hello() alone.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesDifference between Namespaces and Modules 3.5. Difference between Namespaces and Modules

[Nov 13, 2017] Translation Substring Error

Notable quotes:
"... "Biological data are typically huge. For reasons of efficiency, when dealing with this type of data, you should choose a fast solution over a slower one. Perl's string handling functions ... are measurably faster than regexes ..." ..."
Nov 13, 2017 | perlmonks.com


5 direct replies -- Read more / Contribute by FIJI42
on Nov 09, 2017 at 10:26

I have a subroutine for a basic one frame translation that is giving me an error for "Use of uninitialized value $codon in hash element" and "substr outside of string". I think my problem is I need to modify the subroutine's for loop to account for nucleotide sequences with odd numbers of acids (i.e. not in multiples of 3).

Does anyone have suggestions for how to modify the code properly?

Here is the subroutine I'm using in a simple example:

use strict;
use warnings; 

my $amino_acid='';
my $s1 = 'ATGCCCGTAC';     ## Sequence 1
my $s2 = 'GCTTCCCAGCGC';   ## Sequence 2

print "Sequence 1 Translation:";
OneFrameTranslation ($s1);        ## Calls subroutine
print "$amino_acid\n";

print "Sequence 2 Translation:";
OneFrameTranslation ($s2);       ## Calls subroutine
print "$amino_acid\n";

### Subroutine ###

sub OneFrameTranslation {
    my ($seq) = shift;
    my $amino_acid='';
    my $seqarray='';
    
my %genetic_code = (
     'TTT' => 'F', 'TTC' => 'F', 'TTA' => 'L', 'TTG' => 'L',
     'CTT' => 'L', 'CTC' => 'L', 'CTA' => 'L', 'CTG' => 'L', 
     'ATT' => 'I', 'ATC' => 'I', 'ATA' => 'I', 'ATG' => 'M',
     'GTT' => 'V', 'GTC' => 'V', 'GTA' => 'V', 'GTG' => 'V',
     'TCT' => 'S', 'TCC' => 'S', 'TCA' => 'S', 'TCG' => 'S',
     'CCT' => 'P', 'CCC' => 'P', 'CCA' => 'P', 'CCG' => 'P',
     'ACT' => 'T', 'ACC' => 'T', 'ACA' => 'T', 'ACG' => 'T',
     'GCT' => 'A', 'GCC' => 'A', 'GCA' => 'A', 'GCG' => 'A',
     'TAT' => 'Y', 'TAC' => 'Y', 'TAA' => '*', 'TAG' => '*',
     'CAT' => 'H', 'CAC' => 'H', 'CAA' => 'Q', 'CAG' => 'Q',
     'AAT' => 'N', 'AAC' => 'N', 'AAA' => 'K', 'AAG' => 'K',
     'GAT' => 'D', 'GAC' => 'D', 'GAA' => 'E', 'GAG' => 'E',
     'TGT' => 'C', 'TGC' => 'C', 'TGA' => '*', 'TGG' => 'W',
     'CGT' => 'R', 'CGC' => 'R', 'CGA' => 'R', 'CGG' => 'R',
     'AGT' => 'S', 'AGC' => 'S', 'AGA' => 'R', 'AGG' => 'R',
     'GGT' => 'G', 'GGC' => 'G', 'GGA' => 'G', 'GGG' => 'G' 
    );
## '---' = 3 character codon in hash above
## '-' = one letter amino acid abbreviation in hash above
my @seqarray = split(//,$seq);   ## Explodes the string
    for (my $i=0; $i
      
      
   



   
      Re: Translation Substring 
      Error (updated) 
      by haukex 
      (Monsignor) on Nov 09, 2017 at 15:47 UTC 
   
   
      
      

         @seqarray and $seqarray are two 
         different variables, and you never assign anything to $seqarray, 
         so using substr on it does not 
         make much sense, I suspect you just want to look directly at $seq 
         instead of splitting it (BTW, to 
         get multiple elements out of an array, use
         Slices or
         splice). Also, note that you 
         overwrite $amino_acid on every loop iteration. The following 
         minimal changes make your code work for me:


my $seq = shift; my $amino_acid; for (my $i=0; $i<=length($seq)-3; $i=$i+3) { my $codon = substr($seq,$i,3); $amino_acid .= $genetic_code{$codon}; } return $amino_acid; [download]

<update2> Fixed an off-by-one error in the above code; I initially incorrectly translated your $#seqarray-2 into length($seq)-2 ( $#seqarray returns the last index of the array, not its length like scalar(@seqarray) does, or length does for strings). That's a good argument against the classic for(;;) and for the two solutions below instead :-) </update2>

If you output the return value from OneFrameTranslation (your current code is ignoring the return value), this gives you:

print OneFrameTranslation('ATGCCCGTAC'),"\n"; print OneFrameTranslation('GCTTCCCAGCGC'),"\n"; __END__ MPV ASQR [download]

By the way, you can probably move your %genetic_code to the top of your code (outside of the sub ), so that it only gets initialized once instead of on every call to the sub , and making its name uppercase is the usual convention to indicate it is a constant that should not be changed.

Another way to break up a string is using regular expressions, the following also works - it matches three characters, and then matches again at the position that the previous match finished, and so on:

my $amino_acid; while ($seq=~/\G(...)/sg) { $amino_acid .= $genetic_code{$1}; } return $amino_acid; [download]

Or, possibly going a little overboard, here's a technique I describe in Building Regex Alternations Dynamically to make the replacements using a single regex. I have left out the quotemeta and sort steps only because I know for certain that all keys are three-character strings without any special characters, if you have any doubts about the input data, put those steps back in!

# build the regex, this only needs to be done once my ($genetic_regex) = map qr/$_/, join '|', keys %genetic_code; # apply the regex (my $amino_acid = $seq) =~ s/($genetic_regex)/$genetic_code{$1}/g; return $amino_acid; [download]

However, note this produces slightly different output for the first input: " MPVC " (the leftover C remains unchanged). Whether or not you want this behavior or not is up to you; it can also be accomplished in the first two solutions (although slightly less elegantly than with a regex). Update: Also, in the first two solutions you haven't defined what would happen if a code happens to not be available in the table; the third regex solution would simply leave it unchanged. Also minor edits for clarification.

[reply]
[d/l]
[select]

FIJI42 (Acolyte) on Nov 09, 2017 at 16:12 UTC

Re^2: Translation Substring Error (updated)


by FIJI42 (Acolyte) on Nov 09, 2017 at 16:12 UTC

Good point. If a nucleotide triplet with an unknown nucleotide appears (ex. ANC instead of ATC), I'd want to either skip those, or mark them with a letter like 'X'.

I do like the regex solution though, it's quite elegant.

haukex (Monsignor) on Nov 09, 2017 at 16:18 UTC

Re^3: Translation Substring Error
by haukex (Monsignor) on Nov 09, 2017 at 16:18 UTC
If a nucleotide triplet with an unknown nucleotide appears (ex. ANC instead of ATC), I'd want to either skip those, or mark them with a letter like 'X'.

In the first two solutions, you can use exists , e.g.:

if ( exists $genetic_code{$codon} ) { $amino_acid .= $genetic_code{$codon}; } else { $amino_acid .= $codon; # - OR - $amino_acid .= 'X'; # or something else... } [download]

Update: Or, written more tersely, either $amino_acid .= exists $genetic_code{$codon} ? $genetic_code{$codon} : 'X'; or $amino_acid .= $genetic_code{$codon} // 'X'; (the former uses the Conditional Operator , and the latter uses Logical Defined Or instead of exists , assuming you don't have any undef values in your hash).

haukex (Monsignor) on Nov 09, 2017 at 16:36 UTC

Re^3: Translation Substring Error
by haukex (Monsignor) on Nov 09, 2017 at 16:36 UTC
I do like the regex solution though, it's quite elegant.

You can combine my second and third suggestions (for nonexistent codes, this uses the defined-or solution I showed here , the exists solution would work as well):

(my $amino_acid = $seq) =~ s{(...)} { $genetic_code{$1} // 'X' }esg; return $amino_acid; [download]

toolic (Bishop) on Nov 09, 2017 at 15:49 UTC

Re: Translation Substring Error

The reason for the "substr outside of string" warning is that you assign the $seqarray variable to the empty string and you never assign it any other value. You are likely getting confused because you use the same name for two variables (an array and a scalar): $seqarray is a different variable from @seqarray. If you can specify what you want for output, you will get more specific help.

See also:

FIJI42 (Acolyte) on Nov 09, 2017 at 16:06 UTC

Re^2: Translation Substring Error


by FIJI42 (Acolyte) on Nov 09, 2017 at 16:06 UTC

Basically, I was just trying to get a string for the translated amino acids:

Example: MLVG

If I have sequence like this: ATGGCGA, then I'd just like the translation: MA. The "A" from the end of "ATGGCGA" can be ignored/not output.

Laurent_R (Canon) on Nov 09, 2017 at 16:12 UTC

Re^3: Translation Substring Error
by Laurent_R (Canon) on Nov 09, 2017 at 16:12 UTC $ perl dna.pl Sequence 1 Translation:MPV Sequence 2 Translation:ASQR [download]

Laurent_R (Canon) on Nov 09, 2017 at 16:01 UTC

Re: Translation Substring Error

Hi,

Try this:

use strict; use warnings; my $s1 = 'ATGCCCGTAC'; ## Sequence 1 my $s2 = 'GCTTCCCAGCGC'; ## Sequence 2 print "Sequence 1 Translation:"; my $amino_acid = OneFrameTranslation ($s1); ## Calls subroutine print "$amino_acid\n"; print "Sequence 2 Translation:"; $amino_acid = OneFrameTranslation ($s2); ## Calls subroutine print "$amino_acid\n"; ### Subroutine ### sub OneFrameTranslation { my ($seq) = shift; my $amino_acid=''; my $seqarray=''; my %genetic_code = ( 'TTT' => 'F', 'TTC' => 'F', 'TTA' => 'L', 'TTG' => 'L', 'CTT' => 'L', 'CTC' => 'L', 'CTA' => 'L', 'CTG' => 'L', 'ATT' => 'I', 'ATC' => 'I', 'ATA' => 'I', 'ATG' => 'M', 'GTT' => 'V', 'GTC' => 'V', 'GTA' => 'V', 'GTG' => 'V', 'TCT' => 'S', 'TCC' => 'S', 'TCA' => 'S', 'TCG' => 'S', 'CCT' => 'P', 'CCC' => 'P', 'CCA' => 'P', 'CCG' => 'P', 'ACT' => 'T', 'ACC' => 'T', 'ACA' => 'T', 'ACG' => 'T', 'GCT' => 'A', 'GCC' => 'A', 'GCA' => 'A', 'GCG' => 'A', 'TAT' => 'Y', 'TAC' => 'Y', 'TAA' => '*', 'TAG' => '*', 'CAT' => 'H', 'CAC' => 'H', 'CAA' => 'Q', 'CAG' => 'Q', 'AAT' => 'N', 'AAC' => 'N', 'AAA' => 'K', 'AAG' => 'K', 'GAT' => 'D', 'GAC' => 'D', 'GAA' => 'E', 'GAG' => 'E', 'TGT' => 'C', 'TGC' => 'C', 'TGA' => '*', 'TGG' => 'W', 'CGT' => 'R', 'CGC' => 'R', 'CGA' => 'R', 'CGG' => 'R', 'AGT' => 'S', 'AGC' => 'S', 'AGA' => 'R', 'AGG' => 'R', 'GGT' => 'G', 'GGC' => 'G', 'GGA' => 'G', 'GGG' => 'G' ); ## '---' = 3 character codon in hash above ## '-' = one letter amino acid abbreviation in hash above my @seqarray = split(//,$seq); ## Explodes the string for (my $i=0; $i<=$#seqarray-2; $i=$i+3) { my $codon = substr($seq,$i,3); $amino_acid .= $genetic_code{$codon}; } return ($amino_acid); } [download] The main errors in your code is that the $seqarray is never initialized to anything (note that this is different from @seqarray ) and that you don't use the return values from your subroutines.

Update: haukex and toolic were faster than me. Also note I only made the minimal changes, you don't really need to create @seqarray , since you're not really using it (except in the $i<=$#seqarray-2 for loop termination clause where you could simply use the length of the sequence).

FIJI42 (Acolyte) on Nov 09, 2017 at 16:08 UTC

Re^2: Translation Substring Error


by FIJI42 (Acolyte) on Nov 09, 2017 at 16:08 UTC

This works great, thank you.

Yeah, I see the error with with $seqarray - I'll try to more dynamic variable names to minimize confusion next time.

johngg (Abbot) on Nov 09, 2017 at 23:10 UTC

Re: Translation Substring Error

This is not addressing the problem you were having, rather it is a suggestion for a simpler way of initialising your %genetic_code hash that would save some typing. The glob function can be used to generate combinations of letters. Your hash contains 64 keys which are all possible 3-character combinations of A, C, G and T. These can be generated using glob like this ...

johngg@shiraz:~/perl/Monks > perl -E 'say for glob q{{A,C,G,T}} x 3' AAA AAC AAG AAT ACA ACC ACG ACT ... TGA TGC TGG TGT TTA TTC TTG TTT [download]

Arranging the corresponding amino acid letters in an array allows us to map keys (genetic codes) and values (amino acids) shift 'ed from the array together to create the hash lookup.

my %genetic_code = do { my @amino_acids = qw{ K N K N T T T T R S R S I I M I Q H Q H P P P P R R R R L L L L E D E D A A A A G G G G V V V V * Y * Y S S S S * C W C L F L F }; map { $_ => shift @amino_acids } glob q{{A,C,G,T}} x 3; }; [download]

I hope this is of interest.

Cheers,

JohnGG

kcott (Chancellor) on Nov 11, 2017 at 07:52 UTC

Re: Translation Substring Error

G'day FIJI42 ,

I wrote in " Re: Identifying Overlapping Matches in Nucleotide Sequence ":

"Biological data are typically huge. For reasons of efficiency, when dealing with this type of data, you should choose a fast solution over a slower one. Perl's string handling functions ... are measurably faster than regexes ..."

Here's a solution that uses the string handling functions length and substr (no regexes are used at all):

#!/usr/bin/env perl -l use strict; use warnings; my @dna_seqs = qw{ATGCCCGTAC GCTTCCCAGCGC}; print "$_ => ", dna_prot_map($_) for @dna_seqs; { my %code; BEGIN { %code = qw{ATG M CCC P GTA V GCT A TCC S CAG Q CGC R} } sub dna_prot_map { join '', map $code{substr $_[0], $_*3, 3}, 0..length($_[0])/3- 1 } } [download]

Output:

ATGCCCGTAC => MPV GCTTCCCAGCGC => ASQR [download]

Notes:

My %code is just a subset of your %genetic_code : it only has the data required for your example sequences. You will still need all the data; you can save yourself some typing by omitting the 128 single quotes around all the keys.

You can use state within your subroutine (if you're using Perl version 5.10 or higher); although, be aware that limits the scope. I often find that when I write code like:

sub f { state $static_var = ... ... do something with $static_var here ... } [download]

instead of like:

{ my $static_var; BEGIN { $static_var = ... } sub f { ... do something with $static_var here ... } } [download]

I subsequently find I need to share $static_var with another routine. This requires a major rewrite which ends up looking very much like the version with BEGIN :

{ my $static_var; BEGIN { $static_var = ... } sub f { ... do something with $static_var here ... } sub g { ... do something with $static_var here ... } } [download]

Just having to add ' sub g { ... } ' to existing code is a lot less work and a lot less error-prone.

How you choose to do it is up to you: I'm only providing advice of possible pitfalls based on my experience.

-- Ken

[Nov 13, 2017] Strip Pod as Pod from Perl file - Stack Overflow

Nov 13, 2017 | stackoverflow.com

Håkon Hægland ,Nov 2, 2014 at 12:10

I am trying to extract the Pod documentation from a Perl file. I do not want to convert the documentation to text as is done by Pod::Simple::Text . I just want the Pod text as Pod text, such that I can feed it into Pod::Template later. For example:
use warnings;
use strict;
use Pod::Simple::Text;
my $ps=Pod::Simple::Text->new();
my $str;
$ps->output_string( \$str );
$ps->parse_file($0);
print $str;

__END__

=head1 SYNOPSIS

prog [OPTIONS]

This will print the Pod as text. Is there a CPAN module that can give me the Pod text, that is:

=head1 SYNOPSIS

prog [OPTIONS]

instead?

Update

The solution should be able to handle Pod docs in strings, like

my $str = '__END__

=head1 SYNOPSIS';

Miller ,Nov 2, 2014 at 18:42

This can be done using PPI :
use strict;
use warnings;

use PPI;

# Slurp source code
my $src = do { local ( @ARGV, $/ ) = $0; <> };

# Load a document
my $doc = PPI::Document->new( \$src );

# Find all the pod within the doc
my $pod = $doc->find('PPI::Token::Pod');
for (@$pod) {
    print $_->content, "\n";
}

=comment
Hi Pod
=cut

1;

__END__

=head1 SYNOPSIS

prog [OPTIONS]

Outputs:

=comment
Hi Pod
=cut

=head1 SYNOPSIS

prog [OPTIONS]

Håkon Hægland ,Nov 3, 2014 at 12:51

Thanks for this great solution. It even works with Pod docs embedded in strings, like my $str='__END__ =head1 SYNOPSIS';Håkon Hægland Nov 3 '14 at 12:51

Tim ,Nov 2, 2014 at 13:58

Use the -u option for perldoc . This strips out the POD and displays it raw.

If you want to extract the POD from within a Perl program, you could do something like this:

my $rawpod;
if (open my $fh, '-|', 'perldoc', '-u', $filename) {
  local $/;
  my $output = <$fh>;
  if (close $fh) {
    $rawpod = $output;
  }
}

If you really don't want to run perldoc as an executable, you might be interested that the perldoc executable is a very simple wrapper around Pod::Perldoc which you might want to consider using yourself.

Håkon Hægland ,Nov 3, 2014 at 12:55

Thanks, but it does not work with Pod docs embedded in strings. See my updated question for an example.. – Håkon Hægland Nov 3 '14 at 12:55

Tim ,Nov 3, 2014 at 18:54

Well, if you change the question, it's not that surprising that a given answer no longer works. I'm pleased you've found a solution to your new question. – Tim Nov 3 '14 at 18:54

Håkon Hægland ,Nov 5, 2014 at 6:40

The problem with perldoc is that there is a bug, so it thinks Pod embedded in a string belongs to the document. – Håkon Hægland Nov 5 '14 at 6:40

Calle Dybedahl ,Nov 2, 2014 at 12:54

Pod::Simple::SimpleTree will give it to you as a parse tree. You can convert that back to POD source easily enough.

toolic ,Nov 2, 2014 at 13:51

+1 if you provide a runnable code example. – toolic Nov 2 '14 at 13:51

[Nov 13, 2017] Perl for Newbies - Part 3 by Shlomi Fish

Nov 13, 2017 | perl-begin.org

Perl Modules

Perl modules are namespaces that contain function and variables. Two distinct modules may each contain a function (or a variable) with the same name, yet the perl interpreter will be able to tell them apart. Furthermore, both functions can be invoked from the same code.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesDeclaring a Package 3.1. Declaring a Package

In order to designate that a code belongs to a different namespace you should use the package directive. For instance, if you want your module name to be "MyModule" your file should look something like this:

# This is the file MyModule.pm
#

package MyModule;

use strict;
use warnings;

sub a_function
{
    print "Hello, World!\n";
}

1;

Note that a module has to return a true value to the perl interpreter, which explains the use of "1;".

A namespace may contain sub-namespaces. To separate namespace components, use the :: separator. For example:

# This is the file Hoola/Hoop.pm
#

package Hoola::Hoop;

use strict;
use warnings;

my $counter = 0;

sub get_counter
{
    return $counter++;
}

1;

Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesDeclaring a Package Where to find a module 3.1.1. Where to find a module

A module is a separate file that may contain one or more different namespaces. That file is found in a place that corresponds to the module's name. To find the filename of the module relative to the script's directory, replace every :: with a slash and add ".pm" to the name of the last component.

For instance: the MyModule module will be found in the file "MyModule.pm"; the Hello::World module will be found in the file "World.pm" under the Hello sub-directory; etc.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesLoading Modules and Importing their Functions 3.2. Loading Modules and Importing their Functions

In order to have access to a module from within your script (or from within another module) you can use the use directive followed by the name of the module as it is deduced from its path. Here's an example: assume that the file "MyModule.pm" is this:

# This is the file MyModule.pm
#

package MyModule;

use strict;
use warnings;

sub a_function
{
    print "Hello, World!\n";
}

1;

And this is your script:

#!/usr/bin/perl

use strict;
use warnings;

use MyModule;

# Notice that we use "::" to call the function out of the module.
MyModule::a_function();

That way, the program will print "Hello, World!" on the screen.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesLoading Modules and Importing their Functions Accessing Functions from a Different Module 3.2.1. Accessing Functions from a Different Module

As could be seen from the last example, once the module has been use 'd, its functions can be invoked by typing the full path of the package followed by :: and followed by the function name.

Note that if you are in package Foo and you are trying to access functions from package Foo::Bar , then typing Bar::my_func() will not do. You have to type the full path of the module. ( Foo::Bar::my_func() in our case)


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesLoading Modules and Importing their Functions Exporting and Importing Functions 3.2.2. Exporting and Importing Functions

It is possible to make a functions of your module automatically available in any other namespace or script that uses it. To do so one needs to type the following code fragment near the beginning of the module:

use Exporter;

use vars qw(@ISA @EXPORT);

@ISA=qw(Exporter);

@EXPORT=("function1", "function2", "function3");

What this fragment does is make the module inherit the Exporter module which is a special Perl module that can export symbols. Then it declares the special variable @EXPORT which should be filled with all the functions that one wishes to export.

Here is an example which has a module called "Calc" and a script that uses it:

# File: Calc.pm
#
package Calc;

use strict;
use warnings;

use Exporter;

use vars qw(@ISA @EXPORT);

@ISA=qw(Exporter);

@EXPORT=("gcd");

# This function calculates the greatest common divisor of two integers
sub gcd
{
    my $m = shift;
    my $n = shift;

    if ($n > $m)
    {
        ($m, $n) = ($n , $m);
    }

    while ($m % $n > 0)
    {
        ($m, $n) = ($n, $m % $n);
    }

    return $n;
}

1;
#!/usr/bin/perl

use strict;
use warnings;

use Calc;

my $m = 200;
my $n = 15;

print "gcd($m,$n) == " , gcd($m,$n), "\n";

As you can see, the script invokes the "gcd" function of the "Calc" module without having to invoke it with Calc::gcd() . Exporting functions like that should be used with care, as the function names may conflict with those of the importing namespace.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesLoading Modules and Importing their Functions Using Variables from a Different Namespace 3.2.3. Using Variables from a Different Namespace

It is also possible to use the global variables of different packages. However, such variables need to be declared using the use vars qw($myvar1 @myvar2) construct.

Here's an example for a module that defines a variable and another one that access it:

# This file is MyVar.pm
#
package MyVar;

use strict;
use warnings;

# Declare a namespace-scoped variable named $myvar.
use vars qw($myvar);

sub print_myvar
{
    print $myvar, "\n";
}

1;
#!/usr/bin/perl

use strict;
use warnings;

use MyVar;

$MyVar::myvar = "Hello";

MyVar::print_myvar();

$MyVar::myvar = "World";

MyVar::print_myvar();

Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesBEGIN and END 3.3. BEGIN and END

There are two special code blocks for perl modules - BEGIN and END . These blocks are executed when a module is first loaded, and when the perl interpreter is about to unload it, respectively.

Here's an example for a logging module that makes use of this facility:

# File : MyLog.pm
#

package MyLog;

use strict;
use warnings;

BEGIN
{
    open MYLOG, ">", "mylog.txt";
}

sub log
{
    my $what = shift;

    # Strip the string of newline characters
    $what =~ s/\n//g;

    # The MYLOG filehandle is already open by virtue of the BEGIN
    # block.
    print MYLOG $what, "\n";
}

END
{
    close(MYLOG);
}

1;

Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesThe "main" Namespace 3.4. The "main" Namespace

One can access the main namespace (i.e, the namespace of the script), from any other namespace by designating main as the namespace path. For instance, main::hello() will invoke the function named "hello" in the script file.

Usually, the "main" part can be omitted and the symbols be accessed with the notation of ::hello() alone.


Contents Up Prev Next
Perl for Newbies - Part 3 - Modules and ObjectsPerl ModulesDifference between Namespaces and Modules 3.5. Difference between Namespaces and Modules

[Nov 13, 2017] Translation Substring Error

Nov 13, 2017 | perlmonks.com


5 direct replies -- Read more / Contribute by FIJI42
on Nov 09, 2017 at 10:26

I have a subroutine for a basic one frame translation that is giving me an error for "Use of uninitialized value $codon in hash element" and "substr outside of string". I think my problem is I need to modify the subroutine's for loop to account for nucleotide sequences with odd numbers of acids (i.e. not in multiples of 3).

Does anyone have suggestions for how to modify the code properly?

Here is the subroutine I'm using in a simple example:

use strict;
use warnings; 

my $amino_acid='';
my $s1 = 'ATGCCCGTAC';     ## Sequence 1
my $s2 = 'GCTTCCCAGCGC';   ## Sequence 2

print "Sequence 1 Translation:";
OneFrameTranslation ($s1);        ## Calls subroutine
print "$amino_acid\n";

print "Sequence 2 Translation:";
OneFrameTranslation ($s2);       ## Calls subroutine
print "$amino_acid\n";

### Subroutine ###

sub OneFrameTranslation {
    my ($seq) = shift;
    my $amino_acid='';
    my $seqarray='';
    
my %genetic_code = (
     'TTT' => 'F', 'TTC' => 'F', 'TTA' => 'L', 'TTG' => 'L',
     'CTT' => 'L', 'CTC' => 'L', 'CTA' => 'L', 'CTG' => 'L', 
     'ATT' => 'I', 'ATC' => 'I', 'ATA' => 'I', 'ATG' => 'M',
     'GTT' => 'V', 'GTC' => 'V', 'GTA' => 'V', 'GTG' => 'V',
     'TCT' => 'S', 'TCC' => 'S', 'TCA' => 'S', 'TCG' => 'S',
     'CCT' => 'P', 'CCC' => 'P', 'CCA' => 'P', 'CCG' => 'P',
     'ACT' => 'T', 'ACC' => 'T', 'ACA' => 'T', 'ACG' => 'T',
     'GCT' => 'A', 'GCC' => 'A', 'GCA' => 'A', 'GCG' => 'A',
     'TAT' => 'Y', 'TAC' => 'Y', 'TAA' => '*', 'TAG' => '*',
     'CAT' => 'H', 'CAC' => 'H', 'CAA' => 'Q', 'CAG' => 'Q',
     'AAT' => 'N', 'AAC' => 'N', 'AAA' => 'K', 'AAG' => 'K',
     'GAT' => 'D', 'GAC' => 'D', 'GAA' => 'E', 'GAG' => 'E',
     'TGT' => 'C', 'TGC' => 'C', 'TGA' => '*', 'TGG' => 'W',
     'CGT' => 'R', 'CGC' => 'R', 'CGA' => 'R', 'CGG' => 'R',
     'AGT' => 'S', 'AGC' => 'S', 'AGA' => 'R', 'AGG' => 'R',
     'GGT' => 'G', 'GGC' => 'G', 'GGA' => 'G', 'GGG' => 'G' 
    );
## '---' = 3 character codon in hash above
## '-' = one letter amino acid abbreviation in hash above
my @seqarray = split(//,$seq);   ## Explodes the string
    for (my $i=0; $i
      
      
   



   
      Re: Translation Substring 
      Error (updated) 
      by haukex 
      (Monsignor) on Nov 09, 2017 at 15:47 UTC 
   
   
      
      

         @seqarray and $seqarray are two 
         different variables, and you never assign anything to $seqarray, 
         so using substr on it does not 
         make much sense, I suspect you just want to look directly at $seq 
         instead of splitting it (BTW, to 
         get multiple elements out of an array, use
         Slices or
         splice). Also, note that you 
         overwrite $amino_acid on every loop iteration. The following 
         minimal changes make your code work for me:


my $seq = shift; my $amino_acid; for (my $i=0; $i<=length($seq)-3; $i=$i+3) { my $codon = substr($seq,$i,3); $amino_acid .= $genetic_code{$codon}; } return $amino_acid; [download]

<update2> Fixed an off-by-one error in the above code; I initially incorrectly translated your $#seqarray-2 into length($seq)-2 ( $#seqarray returns the last index of the array, not its length like scalar(@seqarray) does, or length does for strings). That's a good argument against the classic for(;;) and for the two solutions below instead :-) </update2>

If you output the return value from OneFrameTranslation (your current code is ignoring the return value), this gives you:

print OneFrameTranslation('ATGCCCGTAC'),"\n"; print OneFrameTranslation('GCTTCCCAGCGC'),"\n"; __END__ MPV ASQR [download]

By the way, you can probably move your %genetic_code to the top of your code (outside of the sub ), so that it only gets initialized once instead of on every call to the sub , and making its name uppercase is the usual convention to indicate it is a constant that should not be changed.

Another way to break up a string is using regular expressions, the following also works - it matches three characters, and then matches again at the position that the previous match finished, and so on:

my $amino_acid; while ($seq=~/\G(...)/sg) { $amino_acid .= $genetic_code{$1}; } return $amino_acid; [download]

Or, possibly going a little overboard, here's a technique I describe in Building Regex Alternations Dynamically to make the replacements using a single regex. I have left out the quotemeta and sort steps only because I know for certain that all keys are three-character strings without any special characters, if you have any doubts about the input data, put those steps back in!

# build the regex, this only needs to be done once my ($genetic_regex) = map qr/$_/, join '|', keys %genetic_code; # apply the regex (my $amino_acid = $seq) =~ s/($genetic_regex)/$genetic_code{$1}/g; return $amino_acid; [download]

However, note this produces slightly different output for the first input: " MPVC " (the leftover C remains unchanged). Whether or not you want this behavior or not is up to you; it can also be accomplished in the first two solutions (although slightly less elegantly than with a regex). Update: Also, in the first two solutions you haven't defined what would happen if a code happens to not be available in the table; the third regex solution would simply leave it unchanged. Also minor edits for clarification.

[reply]
[d/l]
[select]

FIJI42 (Acolyte) on Nov 09, 2017 at 16:12 UTC

Re^2: Translation Substring Error (updated)


by FIJI42 (Acolyte) on Nov 09, 2017 at 16:12 UTC

Good point. If a nucleotide triplet with an unknown nucleotide appears (ex. ANC instead of ATC), I'd want to either skip those, or mark them with a letter like 'X'.

I do like the regex solution though, it's quite elegant.

haukex (Monsignor) on Nov 09, 2017 at 16:18 UTC

Re^3: Translation Substring Error
by haukex (Monsignor) on Nov 09, 2017 at 16:18 UTC
If a nucleotide triplet with an unknown nucleotide appears (ex. ANC instead of ATC), I'd want to either skip those, or mark them with a letter like 'X'.

In the first two solutions, you can use exists , e.g.:

if ( exists $genetic_code{$codon} ) { $amino_acid .= $genetic_code{$codon}; } else { $amino_acid .= $codon; # - OR - $amino_acid .= 'X'; # or something else... } [download]

Update: Or, written more tersely, either $amino_acid .= exists $genetic_code{$codon} ? $genetic_code{$codon} : 'X'; or $amino_acid .= $genetic_code{$codon} // 'X'; (the former uses the Conditional Operator , and the latter uses Logical Defined Or instead of exists , assuming you don't have any undef values in your hash).

haukex (Monsignor) on Nov 09, 2017 at 16:36 UTC

Re^3: Translation Substring Error
by haukex (Monsignor) on Nov 09, 2017 at 16:36 UTC
I do like the regex solution though, it's quite elegant.

You can combine my second and third suggestions (for nonexistent codes, this uses the defined-or solution I showed here , the exists solution would work as well):

(my $amino_acid = $seq) =~ s{(...)} { $genetic_code{$1} // 'X' }esg; return $amino_acid; [download]

toolic (Bishop) on Nov 09, 2017 at 15:49 UTC

Re: Translation Substring Error

The reason for the "substr outside of string" warning is that you assign the $seqarray variable to the empty string and you never assign it any other value. You are likely getting confused because you use the same name for two variables (an array and a scalar): $seqarray is a different variable from @seqarray. If you can specify what you want for output, you will get more specific help.

See also:

FIJI42 (Acolyte) on Nov 09, 2017 at 16:06 UTC

Re^2: Translation Substring Error


by FIJI42 (Acolyte) on Nov 09, 2017 at 16:06 UTC

Basically, I was just trying to get a string for the translated amino acids:

Example: MLVG

If I have sequence like this: ATGGCGA, then I'd just like the translation: MA. The "A" from the end of "ATGGCGA" can be ignored/not output.

Laurent_R (Canon) on Nov 09, 2017 at 16:12 UTC

Re^3: Translation Substring Error
by Laurent_R (Canon) on Nov 09, 2017 at 16:12 UTC This is what I'm getting with your program modified as in my earlier post below: $ perl dna.pl Sequence 1 Translation:MPV Sequence 2 Translation:ASQR [download]

Laurent_R (Canon) on Nov 09, 2017 at 16:01 UTC

Re: Translation Substring Error

Hi,

Try this:

use strict; use warnings; my $s1 = 'ATGCCCGTAC'; ## Sequence 1 my $s2 = 'GCTTCCCAGCGC'; ## Sequence 2 print "Sequence 1 Translation:"; my $amino_acid = OneFrameTranslation ($s1); ## Calls subroutine print "$amino_acid\n"; print "Sequence 2 Translation:"; $amino_acid = OneFrameTranslation ($s2); ## Calls subroutine print "$amino_acid\n"; ### Subroutine ### sub OneFrameTranslation { my ($seq) = shift; my $amino_acid=''; my $seqarray=''; my %genetic_code = ( 'TTT' => 'F', 'TTC' => 'F', 'TTA' => 'L', 'TTG' => 'L', 'CTT' => 'L', 'CTC' => 'L', 'CTA' => 'L', 'CTG' => 'L', 'ATT' => 'I', 'ATC' => 'I', 'ATA' => 'I', 'ATG' => 'M', 'GTT' => 'V', 'GTC' => 'V', 'GTA' => 'V', 'GTG' => 'V', 'TCT' => 'S', 'TCC' => 'S', 'TCA' => 'S', 'TCG' => 'S', 'CCT' => 'P', 'CCC' => 'P', 'CCA' => 'P', 'CCG' => 'P', 'ACT' => 'T', 'ACC' => 'T', 'ACA' => 'T', 'ACG' => 'T', 'GCT' => 'A', 'GCC' => 'A', 'GCA' => 'A', 'GCG' => 'A', 'TAT' => 'Y', 'TAC' => 'Y', 'TAA' => '*', 'TAG' => '*', 'CAT' => 'H', 'CAC' => 'H', 'CAA' => 'Q', 'CAG' => 'Q', 'AAT' => 'N', 'AAC' => 'N', 'AAA' => 'K', 'AAG' => 'K', 'GAT' => 'D', 'GAC' => 'D', 'GAA' => 'E', 'GAG' => 'E', 'TGT' => 'C', 'TGC' => 'C', 'TGA' => '*', 'TGG' => 'W', 'CGT' => 'R', 'CGC' => 'R', 'CGA' => 'R', 'CGG' => 'R', 'AGT' => 'S', 'AGC' => 'S', 'AGA' => 'R', 'AGG' => 'R', 'GGT' => 'G', 'GGC' => 'G', 'GGA' => 'G', 'GGG' => 'G' ); ## '---' = 3 character codon in hash above ## '-' = one letter amino acid abbreviation in hash above my @seqarray = split(//,$seq); ## Explodes the string for (my $i=0; $i<=$#seqarray-2; $i=$i+3) { my $codon = subs