Softpanorama: May the source be with you, but remember the KISS principle ;-)
	(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and bastardization of classic Unix

Perl Style: defensive programming in Perl

Introduction
Defensive programming
General recommendation from The Elements of Programming Style
Lexical style
Programming style
- Avoid dogmatism and religious zealotry
- Do not imitate complexity junkies

Introduction

Perl gives you enough rope to hang yourself, so it is prudent to practice defensive programming.

Programming style is not a dogma, and textbooks with the word "style" on the cover are better avoided ;-). The key feature of style is consistency.

Wikipedia defines programming style rather narrowly, emphasizing lexical style, but in reality the concept is broader than that:

Programming style is a set of rules or guidelines used when writing the source code for a computer program. It is often claimed that following a particular programming style will help programmers to read and understand source code conforming to the style, and help to avoid introducing errors.

A classic work on the subject was The Elements of Programming Style, written in the 1970s, and illustrated with examples from the Fortran and PL/I languages prevalent at the time.

The programming style used in a particular program may be derived from the coding standards or code conventions of a company or other computing organization, as well as the preferences of the author of the code. Programming styles are often designed for a specific programming language (or language family): style considered good in C source code may not be appropriate for BASIC source code, and so on. However, some rules are commonly applied to many languages.

We will distinguish three elements of style:

Defensive programming related elements of style.
Lexical style. Lexical formatting is now usually enforced by a pretty-printer, but within those constraints some choices remain.
Programming style. This is a pretty individual thing, but excessive "cleverness" can bite you later.

Defensive programming

One important trend in programming style is so-called defensive programming.

For example, the first thing to do when you use somebody else's code is to read it. That will give you a general idea of whether it is OK to use or whether it is overkill. Often it is overkill. In this case it might be better to create a simpler derivative.

Also, with CPAN modules, each time a new version is released you need to diff it against the previous one. As this advice is difficult to follow, overuse of CPAN modules can lower the quality of your code. Unmaintained CPAN modules carry another danger: they fall behind changes in the OS, and you become their maintainer.

The number of bugs in your scripts depends on the style you have consciously or unconsciously adopted. Also, each language has typical errors which need to be consistently checked for, as they crop up again and again. See Perl Error Checklist.

As an example of an early attempt to formulate some principles of defensive programming style, we can list Tom Christiansen's recommendations (Jan 1, 1998) for Perl. Perl does not have strict typing of variables and, by default, does not require any declaration of variables, which creates the potential for misspelled variables slipping into the production version of a program (unless you use the strict pragma, which has become standard in modern Perl). While these recommendations are more than 20 years old, they are still relevant:

use strict
#!/usr/bin/perl -w
Check all syscall return values, printing $!
Watch for external program failures in $?
Check $@ after eval"" or s///ee.
[Use] Parameter asserts
#!/usr/bin/perl -T (taint mode in which Perl distrust any data from outside world, see below)
Always have an else after a chain of elsifs
Put commas at the end of lists so your program won't break if someone inserts another item at the end of the list.
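
A minimal sketch of how three of these checks ($!, $? and $@) might look in practice. The file path and the child command are hypothetical, chosen only to force a failure in each case; the variable names are illustrative and are not taken from Christiansen's text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# 1. Check every system call and report $! on failure.
my $path = '/no/such/dir/input.txt';    # hypothetical path, expected to be missing
my $open_error = '';
if ( open( my $fh, '<', $path ) ) {
   close($fh) or warn "Cannot close $path: $!\n";
} else {
   $open_error = "$!";                  # copy $! at once; later calls may overwrite it
   warn "Cannot open $path: $open_error\n";
}

# 2. Watch for external program failures in $?.
#    $^X is the running perl binary; the child exits with status 3 on purpose.
my $exit_code;
system( $^X, '-e', 'exit 3' );
if ( $? == -1 ) {
   warn "Failed to start child: $!\n";
} elsif ( $? & 127 ) {
   warn 'Child died with signal ' . ( $? & 127 ) . "\n";
} else {
   $exit_code = $? >> 8;
   warn "Child exited with status $exit_code\n" if $exit_code != 0;
}

# 3. Check $@ after eval.
my $eval_error = '';
my $ok = eval { die "simulated failure\n"; 1 };
unless ($ok) {
   $eval_error = $@;
   warn "eval failed: $eval_error";
}
```

Ending the eval block with a true value and testing the result of eval, rather than testing $@ alone, is one common way to avoid missing an error when $@ gets cleared before it is inspected.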

General recommendation from The Elements of Programming Style

The lessons of the book are summarized at the end of each section in pithy maxims, such as "Let the machine do the dirty work" (The Elements of Programming Style - Wikipedia):

Write clearly — don't be too clever.
Say what you mean, simply and directly.
Use library functions whenever feasible.
Avoid too many temporary variables.
Write clearly — don't sacrifice clarity for efficiency.
Let the machine do the dirty work.
Replace repetitive expressions by calls to common functions.
Parenthesize to avoid ambiguity.
Choose variable names that won't be confused.
Avoid unnecessary branches.
If a logical expression is hard to understand, try transforming it.
Choose a data representation that makes the program simple.
Write first in easy-to-understand pseudo language; then translate into whatever language you have to use.
Modularize. Use procedures and functions.
Avoid gotos completely if you can keep the program readable.
Don't patch bad code — rewrite it.
Write and test a big program in small pieces.
Use recursive procedures for recursively-defined data structures.
Test input for plausibility and validity.
Make sure input doesn't violate the limits of the program.
Terminate input by end-of-file marker, not by count.
Identify bad input; recover if possible.
Make input easy to prepare and output self-explanatory.
Use uniform input formats.
Make input easy to proofread.
Use self-identifying input. Allow defaults. Echo both on output.
Make sure all variables are initialized before use.
Don't stop at one bug.
Use debugging compilers.
Watch out for off-by-one errors.
Take care to branch the right way on equality.
Be careful if a loop exits to the same place from the middle and the bottom.
Make sure your code does "nothing" gracefully.
Test programs at their boundary values.
Check some answers by hand.
10.0 times 0.1 is hardly ever 1.0.
7/8 is zero while 7.0/8.0 is not zero.
Don't compare floating point numbers solely for equality.
Make it right before you make it faster.
Make it fail-safe before you make it faster.
Make it clear before you make it faster.
Don't sacrifice clarity for small gains in efficiency.
Let your compiler do the simple optimizations.
Don't strain to re-use code; reorganize instead.
Make sure special cases are truly special.
Keep it simple to make it faster.
Don't diddle code to make it faster — find a better algorithm.
Instrument your programs. Measure before making efficiency changes.
Make sure comments and code agree.
Don't just echo the code with comments — make every comment count.
Don't comment bad code — rewrite it.
Use variable names that mean something.
Use statement labels that mean something.
Format a program to help the reader understand it.
Document your data layouts.
Don't over-comment
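
The floating point maxims above translate to Perl with one twist, sketched here under the assumption of a typical build that uses 64-bit doubles. The helper nearly_equal and its default tolerance are hypothetical, not part of any standard module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compare two numbers within a tolerance.
sub nearly_equal {
   my ( $x, $y, $eps ) = @_;
   $eps = 1e-9 unless defined $eps;
   return abs( $x - $y ) <= $eps;
}

# Add 0.1 ten times; the rounding errors accumulate.
my $sum = 0;
$sum += 0.1 for 1 .. 10;

printf "sum = %.17g\n", $sum;
print 'exact comparison:    ', ( $sum == 1.0 ? 'equal' : 'not equal' ), "\n";
print 'tolerant comparison: ', ( nearly_equal( $sum, 1.0 ) ? 'equal' : 'not equal' ), "\n";

# Perl division is floating point unless "use integer" is in effect,
# so the "7/8 is zero" trap only appears after an explicit int().
my $fraction  = 7 / 8;
my $truncated = int( 7 / 8 );
print "7/8 = $fraction, int(7/8) = $truncated\n";
```

With 64-bit doubles the exact comparison reports "not equal" while the tolerant one reports "equal"; the right tolerance depends on the magnitude of the numbers involved.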

Lexical style

The key idea of proper lexical style is to make typical lexical errors quicker to spot. Several types of errors rampant in C-style languages are easier to diagnose if the program source is pretty-printed. Among them:

Missing closing curly bracket
Unclosed single-quoted or double-quoted literal
Missing round bracket in an expression

Perl inherited the C-language design decision to use curly brackets for blocks, which creates great problems with determining which block a particular curly bracket closes. Indenting blocks via a pretty-printer makes this more apparent, which means that a good pretty-printer is a must for Perl. A similar wart exists due to the "overuse" of "regular" round brackets:

if ($tar=substr($text,1,3) {
# In the line above ) - bracket is missing  
... 
}

Here the fact that one closing bracket is missing is not immediately apparent, and Perl diagnostics for this type of error are typically simply horrible (aka misleading). In a way, in this area the designers of the Perl interpreter not only failed to make progress relative to the level of compilers that existed in the early 1970s, but managed to take a step back ;-). Surrounding brackets with spaces makes this situation better, but not by much:

if ( $tar=substr($text,1,3) ) {
... 
}

The key requirement for lexical style is that it should be consistent. This is best achieved using automatic pretty-printing programs. They have great value for easy spotting of the types of errors mentioned above, and they generally improve the readability of programs.

That does not mean that each type of lexical error is addressed equally well. The missing curly bracket problem is addressed very well indeed. But the problem of missing round brackets needs some additional effort to detect and solve. Here there is a difference between a good pretty-printer and an excellent one. As for literals, this problem is now solved at the editor level, as syntax highlighting makes this error apparent. Also, better editors change the font and color of the matching bracket when you type or select an opening or closing bracket.

Still, the problem of an "unclosed" single or double quote is badly diagnosed at the interpreter level. This handling can be improved. For example, many years ago PL/1 pioneered a special parameter that represented the maximum length of literals in the program. Other languages provide a pragma that limits literals to a single line (concatenation of literals can be done at compile time, so this solution makes more sense than is commonly accepted). None of these solutions is available for Perl as of Perl 5.10. Here a good pretty-printing program can definitely help, as it can find and mark lines with unbalanced quotes quite easily. Perl lint is another good option.
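
To illustrate how cheap a first-pass check for missing round brackets can be, here is a crude sketch. The helper names paren_balance and unbalanced_lines are hypothetical, and the check is deliberately naive: it counts brackets inside strings, regexes and comments too, and it flags expressions that legitimately span several lines, so it is no substitute for a real pretty-printer or lint tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: number of "(" minus number of ")" on one line.
sub paren_balance {
   my ($line) = @_;
   my $open  = () = $line =~ /\(/g;   # list assignment in scalar context counts matches
   my $close = () = $line =~ /\)/g;
   return $open - $close;
}

# Return the 1-based numbers of the lines whose round brackets do not balance.
sub unbalanced_lines {
   my @lines = @_;
   my @suspects;
   for my $i ( 0 .. $#lines ) {
      push @suspects, $i + 1 if paren_balance( $lines[$i] ) != 0;
   }
   return @suspects;
}

my @source = (
   'if ($tar=substr($text,1,3) {',      # closing bracket is missing
   'if ( $tar=substr($text,1,3) ) {',   # balanced
);
print "Suspicious line: $_\n" for unbalanced_lines(@source);
```

Run against the two variants of the if statement shown earlier, the sketch reports only the first line as suspicious.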

One interesting aspect of lexical style is that a better style makes programs less verbose, in the sense that the program occupies fewer lines without compromising readability. In this sense, putting the opening curly bracket on the same line as the keyword and using the "cuddled else" style ( } else { ) are more compact than the alternatives and equally readable. For example:

5 lines:

if ($a>$b) {
   $max=$a;
} else {
   $max=$b;
}

8 lines:

if ($a>$b)
{
   $max=$a;
}
else
{
   $max=$b;
}

This example demonstrates certain elements of Perl lexical style that are desirable from my point of view:

Three-space indents.
Blanks instead of tabs for nesting.
Cuddled else: } else {
Opening curly on the same line as the keyword it belongs to.
Closing bracket vertically aligned with that keyword.
No comments after the opening curly bracket.

The only exception that I would consider worthwhile is to use a separate line for the opening curly bracket in procedures, to mark procedures as entities different from regular blocks:

sub test 
{ 

}

Another important element of lexical style is that you should indent with spaces, not tabs. The problem is that different editors treat tabs differently and often have different default tab settings. That "mangles" your layout if you open the program in a new editor.

Programming style

Programming style is a set of rules for using language constructs. The pioneering book in this respect is The Elements of Programming Style by Brian W. Kernighan and P. J. Plauger, first published in 1974 (second edition 1978).

In reality, style recommendations are far from being absolute. Much depends on the tools and the language used. Some extremes, like the structured programming or verification movements, while containing some useful bits, distract from writing a good program more than they help. As Knuth noted, the weakness of structured programming is that it emphasizes the absence of goto, not the presence of structure.

Style can easily become a religious issue. And that's a big, big danger, as fanatics poison everything they touch. As Rob Pike aptly noted:

"Under no circumstances should you program the way I say to because I say to; program the way you think expresses best what you're trying to accomplish in the program. And do so consistently and ruthlessly."

As Dennis Ritchie noted:

"It is very hard to get things both right (coherent and correct) and usable (consistent enough, attractive enough)".

That's a matter of elegance. Most of the time, making code shorter improves maintainability. But with Perl it is important to avoid overusing or abusing idioms, an addiction some Perl book authors suffer from.

One important aspect of programming style is related to what is called defensive programming. Here are some recommendations from Tom Christiansen:

Use option -w to find misspelled variables and get more meaningful diagnostics. For example, to check the syntax of a script with warnings enabled, without running it:
```
perl -wc your_script
```
Check all syscall return values, printing $!
Watch for external program failures in $?
Check $@ after eval"" or s///ee.
Use parameter asserts
#!/usr/bin/perl -T (taint mode)
Always have an else after a chain of elsifs
Put commas at the end of lists so your program won't break if someone inserts another item at the end of the list.
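
The last three items (parameter asserts, a final else, trailing commas) can be sketched together. The function describe_port and the hash %port_of are hypothetical names made up for this illustration; croak comes from the standard Carp module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Carp qw(croak);

# Trailing comma after the last item: adding a line later cannot break the list.
my %port_of = (
   http  => 80,
   https => 443,
   ssh   => 22,
);

# Hypothetical function with parameter assertions and a final else.
sub describe_port {
   my ($port) = @_;
   croak 'describe_port: port is required' unless defined $port;
   croak "describe_port: '$port' is not a number" unless $port =~ /^\d+$/;

   if ( $port < 1024 ) {
      return 'privileged';
   } elsif ( $port <= 65535 ) {
      return 'unprivileged';
   } else {
      # The "impossible" case is caught here instead of falling through silently.
      croak "describe_port: $port is out of range";
   }
}

print 'ssh: ', describe_port( $port_of{ssh} ), "\n";
```

Using croak rather than die reports the error from the caller's point of view, which is usually where a bad parameter has to be fixed.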

Advice on the naming of variables and functions is highly subjective. I use the following rules:

While short identifiers like $gotit are probably OK for my variables, they are a bad idea for globals.
You can use underscores for local variables ($var_names_like_this) and mixed case for globals ($VarNamesLikeThis). Constants can be put in all caps (VAR_NAMES_LIKE_THIS). Names with underscores are generally easier to read, especially for non-native speakers of English, and it is a simple rule that works consistently with VAR_NAMES_LIKE_THIS.

You may find it helpful to use letter case to indicate the scope or nature of a variable. For example:

   $ALL_CAPS_HERE   constants only (beware clashes with perl vars!)
   $Some_Caps_Here  package-wide global/static
   $no_caps_here    function scope my() or local() variables

Function and method names seem to work best as all lowercase. E.g., $obj->as_string().

`Procedure names should reflect what they do; function names should reflect what they return.' --Rob Pike. Name objects so that they read well in English. For example, predicate functions should usually be named with `is', `does', `can', or `has'. Thus, &is_ready is better than &ready for the same function. Therefore, use &canonize for a void function (procedure), &canonical_version for a value-returning function, and &is_canonical for a boolean check.
The &abc2xyz and &abc_to_xyz forms are also well established for conversion functions or hash mappings.

Hashes usually express some property of the keys, and are used with the English word `of' or the possessive form. Name hashes for their values, not their keys.

   GOOD:
        %color = ('apple' => 'red', 'banana' => 'yellow');
        print $color{'apple'};          # Prints `red'

   BAD:
        %fruit = ('apple' => 'red', 'banana' => 'yellow');
        print $fruit{'apple'};          # Prints `red'
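
A short sketch that puts these naming conventions side by side. All identifiers here ($MAX_LENGTH, $Lookup_Count, %color_of, is_known_fruit, canonical_name) are hypothetical and serve only to show the letter case and wording rules.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $MAX_LENGTH    = 20;    # constant: all caps
our $Lookup_Count = 0;     # package-wide global: some caps

# Hash named for its values: $color_of{apple} reads "color of apple".
my %color_of = ( apple => 'red', banana => 'yellow' );

# Predicate: the name starts with "is" and the result is a boolean.
sub is_known_fruit {
   my ($fruit) = @_;       # function scope: no caps
   $Lookup_Count++;
   return exists $color_of{$fruit} ? 1 : 0;
}

# Value-returning function: named for what it returns.
sub canonical_name {
   my ($name) = @_;
   $name =~ s/^\s+|\s+$//g;                      # trim surrounding whitespace
   return substr( lc($name), 0, $MAX_LENGTH );   # lowercase, limited length
}

my $fruit = canonical_name('  Apple ');
print "$fruit is $color_of{$fruit}\n" if is_known_fruit($fruit);
```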

Avoid dogmatism and religious zealotry

Those things helped to kill Perl's popularity, promoting languages which in many respects are one step forward, two steps back.

Here are a couple of other useful points which help to avoid dogmatism, which is a real danger for a programmer. A religious zealot is usually a very bad, dangerous programmer, ready to mold the world to his often completely faulty vision. The proliferation of OO zealots is a sign of the degeneration of programming as a profession. Maybe it has something to do with neoliberalism, with its fake metrics for everything, I do not know ;-) They are usually only half-clever.

Below is a great post which expands on those points (and actually shows that the PerlMonks community contains, or used to contain, real wizards):

Best practices, revisited on Jul 05, 2009 at 14:20 by ELISHEVA

... ... ...

In the last 10 years the term "best practice" has lost much of its association with the process of learning. Instead best practice has become a buzz word that is increasingly associated with a laundry list of rules and procedures. Perhaps it is our innate need to measure ourselves against a standard. Or perhaps it is the word "best". There can only be one best, even if it takes a process to find it. Why reinvent the wheel once the best has been found?

Nowhere is this more clear than in the way many organizations and some monks seem to use Damian Conway's book on Perl best practices. The best practice in Damian Conway's book refers (or should refer) to the process that Damian Conway went through while developing his coding practice. He wrote this book in part because, over the years, his own coding style had come to resemble an archeological dig through his own coding history. Interview with Damian Conway, Brian d foy. However, few people talk about his process, whereas many preach (or complain about) his rule list.

It may be human nature to turn best practices into best rules, but it isn't good management:

1. Best practice by the rulebook oversimplifies the knowledge transfer process. Knowledge consists of several components: facts, recipes, thinking processes, information gathering skills, and methods of evaluation. Rules are only effective in transferring the first two of these. However, all the rest are essential. Without them rules get out of date or will be applied in counter productive ways.

Facts, recipes, and coding standards are like wheels and brakes. But they do not drive the car. If the driver doesn't know the difference between the brake and the accelerator, the car will crash no matter how wonderful the wheels. Hard to communicate skills like information gathering and methods of evaluation are what drive the coding car, not the rules capturing layout and syntax.

If we focus only on rules, it is natural to assume that knowledge will be transferred simply by giving people enough motivation to follow rules. But this doesn't turn out to be the case.

In 1996 (Strategic Management Journal), Gabriel Szulanski (The Wharton School) published a study analyzing the impediments to knowledge transfer. (see Exploring internal stickiness: Impediments to the transfer of best practice in the firm). He considered many factors that might get in the way. The study concluded that motivation was overshadowed by three other issues: "lack of absorptive capacity", "causal ambiguity", and "arduousness of the relationship".

If rules alone were enough none of these would matter. "Lack of absorptive capacity" means that the necessary background knowledge to understand and value the rule is missing. Causal ambiguity means insufficient knowledge of how the rules relate to outcomes. Put in plain English: we aren't very good at applying rules without reasons or context.

However, explaining rules also means transferring judgment - something that cannot be captured purely in the rules themselves. And this brings us to the last barrier to knowledge transfer: "arduousness of the relationship". This awkward term refers to how well the knowledge provider and receiver work together. Do they have a mentoring relationship that can answer questions and provide background information? Or are they simply conduits for authority, insisting on the value of the rules without helping show how the knowledge can be adapted to exceptional situations?

2. An overemphasis on rules is a short-term investment in a long-term illusion. Software is full of symbols and a great deal of code is boiler plate. It is easy to imagine that rules play a large role in software and the right set of rules will have a large payback.

This might be true if writing software were merely a transformation process. But if it were, we'd have developed software to automatically translate business processes, math books, and motion studies into software long ago. To be sure some of the coding today could probably be done by software, but not all of it. In every human endeavor there is a certain amount of boiler plate activity that passes for intellectual labour. This can be automated. But there is also a certain amount of genuine creativity and analysis. It takes a human being to know which is which. It takes a human being to do the latter.

If we want superior development teams, we need to spend our energy nurturing what only we humans can do. This is where our investment needs to sit. As for the things we can do with rules: if we focus our skills on the creative portions we will figure out a way to write software that makes the boiler plate things go away. It is only a matter of time.

3. Rules that free us from thinking do not provide for change. Rules that free us from thinking are, by their very nature, static. In 1994 a management book "Built to Last" took the management world by storm and became a knock out best seller for several years thereafter. 10 years later, the magazine "Fast Company" wrote an article reviewing the impact of the book and the companies featured in that book. Was Built to Last Built to Last - in 2004 about half the companies described no longer would qualify as built to last. When interviewed for the article, one of the authors of the book, James C. Collins, argued that these companies had lost sight of what had made them great. He emphasized "Theeee most important part of the book is chapter four! ... Preserve the core! And! Stimulate progress! To be built to last, you have to be built for change!"

4. If it isn't abnormal it can't produce abnormal returns. The things that can be reduced to judgment-free rules offer no competitive advantage because they can be easily reproduced. No matter how hard we try we cannot build the best coding shop by following everybody else's rules. To excel, our practices need to be closely matched to our team's strengths and weaknesses.

Some of the more recent management literature has begun stressing the concept of "signature practices". Signature practices are practices that are unique to an organization. They capture its special ethos and talents and serve as a focal point around which the company (or coding team) can develop its competitive edge. (See, for example "Beyond Best Practice", by Lynda Gratton and Sumantra Ghoshal, Sloan Management Review, April 15, 2005).

I don't mean to be knocking rules. They have their place. But if we want to have an outstanding development team, our definition of best practice needs to expand beyond rules. We need to think about what makes our teams thrive. What helps them be at their most creative? What gets them into flow? When are they best at sharing knowledge with each other? At understanding each other's code? Incorporating new team members? At meeting customers' needs? And then we have to be prepared to be ruthless in getting rid of anything that gets in the way of that. Even if it is the rules themselves.

Best, beth

Update: Clarification of point #4, in response to mzedeler below.

Do not imitate complexity junkies

Complexity junkies are determined to show us their inventiveness, often to the detriment of clarity. By saving a couple of lines here and there, they create difficulties in maintenance, and their style should be avoided. Here is one example from the standard module Getopt::Std (Std.pm):

sub getopts ($;$) {
    my ($argumentative, $hash) = @_;
    my (@args,$first,$rest,$exit);
    my $errs = 0;
    local $_;
    local @EXPORT;

    @args = split( / */, $argumentative );
    while(@ARGV && ($_ = $ARGV[0]) =~ /^-(.)(.*)/s) {
	($first,$rest) = ($1,$2);
	if (/^--$/) {	# early exit if --
	    shift @ARGV;
	    last;
	}
	my $pos = index($argumentative,$first);
	if ($pos >= 0) {
	    if (defined($args[$pos+1]) and ($args[$pos+1] eq ':')) {
		shift(@ARGV);
		if ($rest eq '') {
		    ++$errs unless @ARGV;
		    $rest = shift(@ARGV);
		}
		if (ref $hash) {
		    $$hash{$first} = $rest;
		}
		else {
		    ${"opt_$first"} = $rest;
		    push( @EXPORT, "\$opt_$first" );
		}
	    }
	    else {
		if (ref $hash) {
		    $$hash{$first} = 1;
		}
		else {
		    ${"opt_$first"} = 1;
		    push( @EXPORT, "\$opt_$first" );
		}
		if ($rest eq '') {
		    shift(@ARGV);
		}
		else {
		    $ARGV[0] = "-$rest";
		}
	    }
	}
	else {
	    if ($first eq '-' and $rest eq 'help') {
		version_mess($argumentative, 'main');
		help_mess($argumentative, 'main');
		try_exit();
		shift(@ARGV);
		next;
	    } elsif ($first eq '-' and $rest eq 'version') {
		version_mess($argumentative, 'main');
		try_exit();
		shift(@ARGV);
		next;
	    }
	    warn "Unknown option: $first\n";
	    ++$errs;
	    if ($rest ne '') {
		$ARGV[0] = "-$rest";
	    }
	    else {
		shift(@ARGV);
	    }
	}
    }
    unless (ref $hash) { 
	local $Exporter::ExportLevel = 1;
	import Getopt::Std;
    }
    $errs == 0;
}

Can be simplified to

sub getopts
{
my ($options_def,$options_hash)=@_;
my ($first,$rest,$pos,$cur_opt);
   while(@ARGV){
      $cur_opt=$ARGV[0];
      last if( substr($cur_opt,0,1) ne '-' );
      if ($cur_opt eq '--'){
          shift @ARGV;
          last;
      }
      $first=substr($cur_opt,1,1);
      $pos = index($options_def,$first);
      if( $pos==-1) {
         warn("Undefined option -$first skipped without processing\n");
         shift(@ARGV);
         next;
      }
      $rest=substr($cur_opt,2);
      if( $pos<length($options_def)-1 && substr($options_def,$pos+1,1) eq ':' ){
         # option with parameters
         if( $rest eq ''){
           shift(@ARGV); # get the value of option
           unless( @ARGV ){
              warn("End of line reached for option -$first which requires argument\n");
              $$options_hash{$first}='';
              last;
           }
           if ( $ARGV[0] =~/^-/ ) {
               warn("Option -$first requires argument\n");
               $$options_hash{$first} = '';
           }else{
               $$options_hash{$first}=$ARGV[0];
               shift(@ARGV); # get next chunk
           }
         } else {
            #value is concatenated with option like -ddd
            if( ($first x length($rest)) eq $rest ){
               $$options_hash{$first} = length($rest)+1;
            }else{
               $$options_hash{$first}=$rest;
            }
            shift(@ARGV);
         }
      }else {
         $$options_hash{$first} = 1; # set the option
         if ($rest eq '') {
            shift(@ARGV);
         } else {
            $ARGV[0] = "-$rest"; # there can be other options without arguments after the first
         }
      }
   }
}
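
Note that the simplified version always stores results through $options_hash, so it is meant to be called with a hash reference as the second argument. The calling convention can be sketched with the standard Getopt::Std module, so that the example runs as-is; the option letters and the simulated command line are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;

# Simulated command line: -v is a flag, -d takes an argument.
@ARGV = ( '-v', '-d', 'out', 'file.txt' );

my %opts;
getopts( 'vd:', \%opts ) or die "Usage: $0 [-v] [-d dir] file...\n";

print "verbose:   $opts{v}\n";
print "directory: $opts{d}\n";
print "remaining: @ARGV\n";
```

After the call, the processed options are removed from @ARGV and only the non-option arguments remain.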

Larry Wall's guidance (from the perlstyle documentation)

Regarding aesthetics of code lay out, about the only thing Larry cares strongly about is that the closing curly brace of a multi-line BLOCK should line up with the keyword that started the construct.

Beyond that, he has other preferences that aren't so strong:

4-column indent.
Opening curly on same line as keyword, if possible, otherwise line up.
Space before the opening curly of a multi-line BLOCK.
One-line BLOCK may be put on one line, including curlies.
No space before the semicolon.
Semicolon omitted in "short" one-line BLOCK.
Space around most operators.
Space around a "complex" subscript (inside brackets).
Blank lines between chunks that do different things.
Uncuddled elses.
No space between function name and its opening parenthesis.
Space after each comma.
Long lines broken after an operator (except "and" and "or").
Space after last parenthesis matching on current line.
Line up corresponding items vertically.
Omit redundant punctuation as long as clarity doesn't suffer.

NEWS CONTENTS

20200930 : Postfix if conditions are useful mainly to specify exit condition in the loop, rarely elsewhere ( Sep 30, 2020 , perlmonks.org )
20200929 : At what point excessive syntactic flexibility stimulates perverted programming style which is reflected in the derogative term "complexity junkies"? ( Sep 29, 2020 , perlmonks.org )
20191121 : Tux.nl - Style and Layout ( Nov 21, 2019 , tux.nl )
20171129 : How can I have variable assertions in Perl ( Nov 29, 2017 , stackoverflow.com )
20171117 : Bruce Gray - Your Perl 5 Brain, on Perl 6 by Bruce Gray ( Nov 17, 2017 , www.youtube.com )
20171113 : Perls Worst Best Practices by Daina Pettit ( Perl's Worst Best Practices, Nov 13, 2017 )
20120126 : Appendix B Perl Best Practices - OReilly Media ( Appendix B Perl Best Practices - O'Reilly Media, Jan 26, 2012 )
20111027 : How can I convert Perl code into HTML with syntax highlighting? ( Jun 16, 2002 , www.perlmonks.org )
20111027 : togotutor ( softpanorama.org, )
20111027 : Perl style guide - Juerds site ( Perl style guide - Juerd's site , )
20050714 : Perl.com- Ten Essential Development Practices by Damian Conway ( Perl.com- Ten Essential Development Practices, July 14, 2005 )
20031017 : Part 13 Perl Style by Dan Richter ( Part 13 Perl Style, Oct 17,2003 )
20031017 : The road to better programming Introduction and chapter 1 by Teodor Zlatanov ( November 1, 2001 )
20031017 : The road to better programming ( The road to better programming, )

Old News ;-)

[Sep 30, 2020] Postfix if conditions are useful mainly to specify exit condition in the loop, rarely elsewhere

Sep 30, 2020 | perlmonks.org

Re^6: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 27, 2020 at 22:37 UTC

We need not "assume that somebody uses this formatting". I do it frequently, and I have often seen it in other people's code.

The fact that you use it and have seen it in other people's code means nothing. People often adopt and use bad programming style. Even talented programmers do. Look at the classic The Elements of Programming Style, by Brian W. Kernighan and P. J. Plauger. They include such recommendations as (cited from https://en.wikipedia.org/wiki/The_Elements_of_Programming_Style):

Write clearly -- don't be too clever.

Say what you mean, simply and directly.

... ... ...

Write clearly -- don't sacrifice clarity for efficiency.

... ... ...

Parenthesize to avoid ambiguity.

... ... ...

Make sure special cases are truly special.

... ... ...

The real question is whether the use you advocate represents a good Perl programming style or not.

I would understand the use of post-fix if construct in a loop to specify exit condition. Something like:
return if ($exit_condition);
They make code more readable in comparison with the regular if statement and as such have certain value and IMHO represent a good programming style.
In many other cases the desire to save two curly braces looks to me a very questionable practice and a bad programming style. Your mileage may vary.
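
A minimal sketch of the loop use described in this post, with last and next as the loop-control statements. The record list and the end marker are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: process records until the end marker, skipping comments.
my @records = ( 'alpha', '# comment', 'beta', '__END__', 'gamma' );
my @processed;

for my $record (@records) {
   last if $record eq '__END__';    # exit condition
   next if $record =~ /^#/;         # skip condition
   push @processed, uc $record;
}

print "@processed\n";
```

Here the postfix form keeps both conditions on one line each at the top of the loop body, where a reader looks for them first.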

[Sep 29, 2020] At what point excessive syntactic flexibility stimulates perverted programming style which is reflected in the derogative term "complexity junkies"?

Notable quotes:

"... In your private role you are free to do whatever you wish. After all programming open source is about fun, not so much about discipline. ..."

"... The situation radically changes in commercial projects. If you are a manager of a large project you need to ensure a uniform and preferably simple style via guidelines that explicitly prohibit such "excesses" and to step on the throat of such "excessively creative" people to make them "behave". ..."

"... That's why languages that allow too much syntactic freedom are generally not welcomed in large commercial projects, even if they are able to manage large namespaces more or less OK. ..."

Sep 29, 2020 | perlmonks.org

Re^11: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 29, 2020 at 18:01 UTC

If I have to maintain (as only maintainer) a piece of perl code, I will *rewrite* *all* statements as you state from action if expression; to expression and action; as that (to me) is waaaaaaaaaay easier to read/understand/maintain. Nothing to do with "idiomatic perl". Nothing at all!
People are extremely flexible. The same is true for programmers. Many of the talented programmers I have encountered have a somewhat idiosyncratic style...
In your private role you are free to do whatever you wish. After all programming open source is about fun, not so much about discipline.

The situation radically changes in commercial projects. If you are a manager of a large project you need to ensure a uniform and preferably simple style via guidelines that explicitly prohibit such "excesses" and to step on the throat of such "excessively creative" people to make them "behave".

That's why languages that allow too much syntactic freedom are generally not welcomed in large commercial projects, even if they are able to manage large namespaces more or less OK.

Let's be frank: Perl lost Web applications development more or less completely. The reasons are not clear and can be argued, but the fact is indisputable. But the problem that I see is that Perl can also lose its attraction among sysadmins because of the excessive push of the OO programming style by OO fanatics and second-rate book authors, as well as the inability to direct the remaining scarce development resources toward modest, unglamorous improvements in the procedural programming arena (the area which this post is all about) rather than toward closures, accessors, frameworks and other fancy stuff.

An interesting question is: at what point does excessive syntactic flexibility stimulate a perverted programming style, which is reflected in the derogatory term "complexity junkies"? That is the point at which simple things in a program look complex, and complex things look unmanageable. "Object oriented spaghetti" ('Lasagna code' with too many layers) is another term that addresses the same problem. See, for example, discussion at https://medium.com/better-programming/is-object-oriented-programming-garbage-66c4f41adcaa

Also https://www.youtube.com/watch?time_continue=9&v=V6VP-2aIcSc&feature=emb_logo

[Nov 21, 2019] Tux.nl - Style and Layout

Nov 21, 2019 | tux.nl

Why my style is better

I will try to explain the logic behind the style decisions taken over the last 35+ years of programming in different languages.

About programming style and layout there are as many opinions as there are people. Most important in my opinion is to think about the reasoning behind what you, your team or your company chooses to follow as guides.

I seriously think that way too many (young) programmers leave school, brainwashed with GNU-style coding without realizing that the amount of indentation and the placing of braces, brackets and parentheses were well thought about.

Several well known styles (including mine) are discussed at wikimedia. It is worth reading through them to see the pros and cons of each.

For me personally, the GNU coding style is one of the reasons I do NOT contribute a lot to these projects. The style does not fit my logic, and if I send patches that are rejected simply because I wrote them in a style/layout that I think is way better because I then understand the underlying logic, I give up.

Here I will take a tour through what I think is the only correct way of (perl) code layout, and why. Most of this can be achieved with Perl::Tidy and a correct .perltidyrc. I'll use their configuration definitions as a guide.
Indentation in code blocks

Opening Block Brace Right or Left

Braces Left
Because braces are just syntactic sugar to keep a block together, it should visually also bind to the block, and not to the conditional. As the closing brace - or END in languages like PASCAL - is visually showing me the end of the block, it should obviously have the same indent as the block itself. An advantage is that the alignment of the closing brace with the block emphasizes the fact that the entire block is conceptually (as well as programmatically) a single compound statement.
In other words: I see the braces being part of the block, and as all statements inside a block share the same indentation, in my opinion the brace - being part of the block - should have the same indentation too.

Indent width is 4, tabs are allowed (when set to 8). I prefer having it being spaces only, but as I cannot see the difference with good editors, I do not really care.

Opening brace should be on the same line as the conditional

Block should be indented

Closing brace should have the same indent as the block
  if ($flag eq "a") {
      $anchor = $header;
      }
This style is also referred to as Ratliff style on wikipedia or Banner style on wikimedia.
Continuation Indentation
  if ($flag eq "a") {
      $anchor = substr ($header, 0, 6) .
                substr ($char_list, $place_1, 1) .
                substr ($char_list, $place_2, 1);
      }
Or, also acceptable:
  if ($flag eq "a") {
      $anchor =
          substr ($header, 0, 6) .
          substr ($char_list, $place_1, 1) .
          substr ($char_list, $place_2, 1);
      }
Braces Right
  if ($bigwasteofspace1 && $bigwasteofspace2 ||
      $bigwasteofspace3 && $bigwasteofspace4) {
      big_waste_of_time ();
      }
also acceptable:
  if (   $bigwasteofspace1 && $bigwasteofspace2
      || $bigwasteofspace3 && $bigwasteofspace4) {
      big_waste_of_time ();
      }
also acceptable:
  if (  $bigwasteofspace1 && $bigwasteofspace2 ||
        $bigwasteofspace3 && $bigwasteofspace4) {
      big_waste_of_time ();
      }
(No) Cuddled Else
Of course cuddled else is not the way to go, as it makes removing either branch more difficult and makes the indent of the closing brace go wrong. The only right way to use if/else indent is uncuddled:
  if ($flag eq "h") {
      $headers = 0;
      }
  elsif ($flag eq "f") {
      $sectiontype = 3;
      }
  else {
      print "invalid option: " . substr ($arg, $i, 1) . "\n";
      dohelp ();
      }
Vertical tightness
  sub _directives
  {
      {   ENDIF => \&_endif,
          IF    => \&_if,
          };
      } # _directives
The opening brace of a sub may optionally be put on a new line. If so, it should be in column one, for all those that use 'vi' or one of its clones, so }, {, ]], and [[ work as expected.
If the opening brace is on the same line, which I prefer, it requires a single leading space
  sub _directives {
      {   ENDIF => \&_endif,
          IF    => \&_if,
          };
      } # _directives
Indentation Style for Other Containers

Opening Vertical Tightness
  $dbh = DBI->connect (undef, undef, undef, {
      PrintError => 0,
      RaiseError => 1,
      });
  if (!defined (start_slip ($DEVICE, $PHONE,  $ACCOUNT, $PASSWORD,
                            $LOCAL,  $REMOTE, $NETMASK, $MTU)) &&
       $continuation_flag) {
      do_something_about_it ();
      }
Closing Token Placement
  my @month_of_year = ( "Jan", "Feb", "Mar", "Apr", "May", "Jun",
                        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
                        );
also acceptable:
  my @month_of_year = (qw(
      Jan Feb Mar Apr May Jun
      Jul Aug Sep Oct Nov Dec
      ));
As with the closing brace of a block, the closing parenthesis belongs to the data in the container it closes, and thus should have the same indentation.
Define Horizontal Tightness
Of course function <space> <paren> <no-space> <first-arg> <comma> <space>
  if ((my $duration = travel ($target, $means)) > 1200) {
One of my pet-peeves. Having white-space between the function name and its opening parenthesis is the best match to how we think. As an example, if I would ask someone to describe his/her day, he/she might answer
  I woke up
  I freshened myself
  I had breakfast
  I got to work
  I worked
  I had lunch
  I worked again
  I went home
  I had dinner
  I watched TV
  I brushed my teeth
  I went to bed
In computer-speak
  wake_up ();
  wash ($self);
  eat ("breakfast");
  goto ("work");
  work ();
  eat ("lunch");
  work ();
  goto ("home");
  eat ("dinner");
  watch_tv ();
  wash ($teeth);
  sleep ();
In which the seasoned programmer might see
  for $day in (qw( Mon Tue Wed Thu Fri )) {
     wake_up ();
     wash ($self);
     eat ("breakfast");
     :
     :
Or, more extreme to show the sequence of actions
  for $day in (qw( Mon Tue Wed Thu Fri )) {
     wake_up ();
     wash    ($self);
     eat     ("breakfast");
     :
     :
Where it, IMHO, clearly shows that the actions are far more important than what it takes to perform the action. When I read through the process, I don't care about what transport the person uses to get to work and if eggs are part of the breakfast. These are the parameters to the actions
  for $day in (qw( Mon Tue Wed Thu Fri )) {
     wake_up ();
     wash    ($day eq "Fri" ? "bath" : "shower", water_temp => "47");
     eat     (type   => "breakfast", eggs  => 2, toast => 4, Tea => "yes");
     travel  (target => $work,       means => "train");
     :
     :
I will only have a look at the function's arguments if I need to. In reading that I eat, I see what action is taken. That's enough for understanding the program flow. The arguments to the function have to be grouped together using parentheses for the function to know that all the arguments are for the function: the parentheses are there to group the arguments, not to make the function a function. So the parentheses belong to the arguments and not to the function, and therefore are to be close to the arguments and not to the function.
Arguments are separated by a comma and a space, just to separate the arguments more for better readability
  my $width = $col[$j + $k] - $col[$j];

  my %bf = map { $_ => -M $_ } grep { m/\.deb$/ } dirents ".";
Statement modifiers
  $work_done and go_home ();
A rule of thumb is to NEVER use statement modifiers like
  go_home () unless $work_done; # WRONG!
As it will draw the attention to going home (unconditionally) instead of to the condition, which is more important. This is especially annoying when using exit, die, croak or return. Any of these will visually end the current scope, so you do not have to read on. Unless there is a statement modifier and you need to re-read the entire section.
No else after return/exit/die/croak/throw
  if (expression) {
      return;
      }
  else {
      return 42;
      }
As any of return, exit, die, croak, or throw will immediately exit the current scope, the mind will read the code as to stop processing it right there, which is exactly what those keywords are for.
In an if/else construct, the code after the construct is supposed to be executed when either of the if/else branches was followed. If the if-branch exits the current scope, there is no need to run the code after the construct, so the else is useless.
This is the main reason why these keywords should never have a statement modifier (and no, you cannot come up with a valid exception to this rule).
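A small sketch of the else-less form this rule leads to, laid out in the style described above. The sub classify and its return values are invented for illustration only.

```perl
use strict;
use warnings;

# Hypothetical example: each branch that returns simply ends there,
# so no else is needed for the code that follows.
sub classify {
    my ($n) = @_;

    if (!defined $n) {
        return;                 # leaves the sub right here
        }
    if ($n < 0) {
        return "negative";
        }
    return "non-negative";      # reached only if nothing above returned
    } # classify

my $label = classify (-3);
print "$label\n";
$label = classify (7);
print "$label\n";
```

Reading top to bottom, every return ends the story for that case, and the remaining code needs no extra level of indentation.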
Statement Termination Semicolon Spaces
  my $i = 1;
For Loop Semicolon Spaces
  for (@a = @$ap, $u = shift @a; @a; $u = $v) {
Block Comment Indentation

If comment is aligned to the left margin, leave it there

If the original comment was indented, match the indent to the surrounding code.

Never reformat the comments themselves. Do not wrap.

Outdenting Long Quotes
  if ($source_stream) {
      if (@ARGV > 0) {
          die "You may not specify any filenames when a source array is given\n";
          }
      }
  if ($source_stream) {
      if (@ARGV > 0) {
          die "You may not specify any filenames ".
              "when a source array is given\n";
          }
      }
  for (@methods) {
      push @results, {
          name => $_->name,
          help => $_->help,
          };
      }

[Nov 29, 2017] How can I have variable assertions in Perl

Nov 29, 2017 | stackoverflow.com

Alex ,Jun 21, 2009 at 12:45
How can I check that a variable has a specific value in Perl? Is there a command to stop a script's execution to look up some of its variables?
I wonder if I can use the Pythonic practice of inserting:
    assert 0, (foo, bar)
to debug scripts in a debugger-less way?
Telemachus ,Jun 21, 2009 at 12:58

A quick CPAN search suggests Carp::Assert.

Sinan Ünür ,Jun 21, 2009 at 13:30

+1 for typing in more characters faster than I did. I am going to edit the URL to be version agnostic though.

Telemachus ,Jun 21, 2009 at 13:50

I was going to comment on your answer about the photo finish. As for the URL, I constantly forget that, so thanks.

Sinan Ünür ,Jun 21, 2009 at 12:58

See Carp::Assert.

zoul ,Jun 21, 2009 at 13:44

Smart::Comments are nice.

RET ,Jun 22, 2009 at 3:28

Smart::Comments++ When used with the -ENV switch, it's a fantastic tool for this sort of thing. Much better than having to strip all the tests out before going to production, as someone else suggested.

nik ,Jun 21, 2009 at 12:54

There is a script at PerlMonks that introduces a fast assert method.
Speed is important since Perl is interpreted and any inline checks will impact performance (unlike simple C macros for example)

I am not sure if these things are going to be directly usable.

There is Test::Harness in default installs. Here is a starter tutorial. The more recent module is TAP::Harness.

A slower version along the lines you describe is Sub::Assert.

OK! This is what I was looking for -- PDF Warning: Test-Tutorial.pdf. Test::Harness is used for writing Perl module tests.

Ape-inago ,Jun 21, 2009 at 13:51
$var_to_check =~ /sometest/ or die "bad variable!";
I tend to throw things like this in my code, and later use a find and replace to get rid of them (in production code).

Also, 'eval' can be used to run a section of code and capture errors, and can be used to create exception handling functionality. If you are asserting that a value is not 0, perhaps you want to throw an exception and handle that case in a special way?

if ( $next_sunrise_time > 24*60*60 ) { warn( "assertion failed" ); } # Assert that the sun must rise in the next 24 hours.
You can do this if you do not have access to Perl 5.9, which is required for Carp::Assert.
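The hand-rolled approach from the answers above can be sketched with core modules only. The assert sub below is a hypothetical helper, not the API of Carp::Assert or any other module mentioned in the thread, and the sunrise value is made up.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: die (reporting from the caller's point of view)
# when the condition is false.
sub assert {
    my ($condition, $message) = @_;
    if (!$condition) {
        croak "Assertion failed: "
            . (defined $message ? $message : "(no message)");
    }
    return 1;
}

my $next_sunrise_time = 6 * 60 * 60;    # seconds from now; made-up value

# Passes silently when the condition holds.
assert($next_sunrise_time <= 24 * 60 * 60, "sun must rise within 24 hours");

# A failing assertion can be trapped with block eval where that is useful.
my $ok = eval { assert(0, "deliberate failure"); 1 };
if (!$ok) {
    print "caught: $@";
}
```

Unlike the warn variant, this stops execution at the failed check unless the caller explicitly traps it.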

[Nov 17, 2017] Bruce Gray - Your Perl 5 Brain, on Perl 6 by Bruce Gray

Nov 17, 2017 | www.youtube.com

Published on Jun 21, 2017

In which I detail the Perl 6 elements that have most changed my Perl 5 coding, and share the Perl 5 techniques I have adopted.

I eat, sleep, live, and breathe Perl!

Consultant and Contract Programmer; Frequent PerlMongers speaker; Dedicated Shakespeare theater-goer; Armchair Mathematician; Author of Blue_Tiger, a tool for modernizing Perl.

36 years coding; 22 years Perl; 16 years Married; 15 YAPCs; 7 Hackathons; 3 PerlWhirls. Perl interests: Refactoring, Perl Idioms / Micropatterns, RosettaCode, and Perl 6.

[Nov 13, 2017] Perl's Worst Best Practices by Daina Pettit

YouTube 2016

Some good critique

[Jan 26, 2012] Appendix B Perl Best Practices - O'Reilly Media

These are very questionable recommendations and should be taken with a grain of salt.

Table of Contents

B.1. Chapter 2, [Code Layout]

B.2. Chapter 3, [Naming Conventions]

B.3. Chapter 4, [Values and Expressions]

B.4. Chapter 5, [Variables]

B.5. Chapter 6, [Control Structures]

B.6. Chapter 7, [Documentation]

B.7. Chapter 8, [Built-in Functions]

B.8. Chapter 9, [Subroutines]

B.9. Chapter 10, [I/O]

B.10. Chapter 11, [References]

B.11. Chapter 12, [Regular Expressions]

B.12. Chapter 13, [Error Handling]

B.13. Chapter 14, [Command-Line Processing]

B.14. Chapter 15, [Objects]

B.15. Chapter 16, [Class Hierarchies]

B.16. Chapter 17, [Modules]

B.17. Chapter 18, [Testing and Debugging]

B.18. Chapter 19, [Miscellanea]

This appendix lists the complete set of 256 guidelines presented in this book. The section heading under which each guideline appears is also provided in square brackets.
Chapter 2, [Code Layout]

Brace and parenthesize in K&R style. [Bracketing]

Separate your control keywords from the following opening bracket. [Keywords]

Don't use unnecessary parentheses for builtins and "honorary" builtins. [Builtins]

Separate complex keys or indices from their surrounding brackets. [Keys and Indices]

Use whitespace to help binary operators stand out from their operands. [Operators]

Place a semicolon after every statement. [Semicolons]

Place a comma after every value in a multiline list. [Commas]

Use 78-column lines. [Line Lengths]

Use four-column indentation levels. [Indentation]

Indent with spaces, not tabs. [Tabs]

Code in paragraphs. [Chunking]

Align corresponding items vertically. [Vertical Alignment]

Break long expressions before an operator. [Breaking Long Lines]

Factor out long expressions in the middle of statements. [Non-Terminal Expressions]

Always break a long expression at the operator of the lowest possible precedence. [Breaking by Precedence]

Break long assignments before the assignment operator. [Assignments]

Enforce your chosen layout style mechanically. [Automated Layout]

Chapter 3, [Naming Conventions]

Use grammatical templates when forming identifiers. [Identifiers]

Name booleans after their associated test. [Booleans]

Mark variables that store references with a _ref suffix. [Reference Variables]

Name arrays in the plural and hashes in the singular. [Arrays and Hashes]

Use underscores to separate words in multiword identifiers. [Underscores]

Distinguish different program components by case. [Capitalization]

Abbr idents by prefx. [Abbreviations]

Abbreviate only when the meaning remains unambiguous. [Ambiguous Abbreviations]

Avoid using inherently ambiguous words in names. [Ambiguous Names]

Prefix "for internal use only" subroutines with an underscore. [Utility Subroutines]

Chapter 4, [Values and Expressions]

Use interpolating string delimiters only for strings that actually interpolate. [String Delimiters]

Don't use "" or '' for an empty string. [Empty Strings]

Don't write one-character strings in visually ambiguous ways. [Single-Character Strings]

Use named character escapes instead of numeric escapes. [Escaped Characters]

Use named constants, but don't use constant. [Constants]

Don't pad decimal numbers with leading zeros. [Leading Zeros]

Use underscores to improve the readability of long numbers. [Long Numbers]

Lay out multiline strings over multiple lines. [Multiline Strings]

Use a heredoc when a multiline string exceeds two lines. [Here Documents]

Use a "theredoc" when a heredoc would compromise your indentation. [Heredoc Indentation]

Make every heredoc terminator a single uppercase identifier with a standard prefix. [Heredoc Terminators]

When introducing a heredoc, quote the terminator. [Heredoc Quoters]

Don't use barewords. [Barewords]

Reserve => for pairs. [Fat Commas]

Don't use commas to sequence statements. [Thin Commas]

Don't mix high- and low-precedence booleans. [Low-Precedence Operators]

Parenthesize every raw list. [Lists]

Use table-lookup to test for membership in lists of strings; use any() for membership of lists of anything else. [List Membership]
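As an illustration of the last guideline in this chapter, here is a rough sketch of both forms. The names %WEEKDAY, is_weekday and @sizes are invented for the example, and any is taken from List::Util (it needs List::Util 1.33 or later, which ships with Perl 5.20 and up).

```perl
use strict;
use warnings;
use List::Util qw(any);

# Table lookup for membership in a list of strings:
# build the hash once, then every test is a single exists.
my %WEEKDAY = map { $_ => 1 } qw(Mon Tue Wed Thu Fri);

sub is_weekday {
    my ($day) = @_;
    return exists $WEEKDAY{$day} ? 1 : 0;
}

for my $day (qw(Tue Sat)) {
    my $verdict = is_weekday($day) ? "weekday" : "not a weekday";
    print "$day: $verdict\n";
}

# any() for membership tests that are not simple string equality.
my @sizes     = (12, 250, 7);
my $has_large = any { $_ > 100 } @sizes;
print "has a large item\n" if $has_large;
```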

Chapter 5, [Variables]

Avoid using non-lexical variables. [Lexical Variables]

Don't use package variables in your own development. [Package Variables]

If you're forced to modify a package variable, localize it. [Localization]

Initialize any variable you localize. [Initialization]

use English for the less familiar punctuation variables. [Punctuation Variables]

If you're forced to modify a punctuation variable, localize it. [Localizing Punctuation Variables]

Don't use the regex match variables. [Match Variables]

Beware of any modification via $_. [Dollar-Underscore]

Use negative indices when counting from the end of an array. [Array Indices]

Take advantage of hash and array slicing. [Slicing]

Use a tabular layout for slices. [Slice Layout]

Factor large key or index lists out of their slices. [Slice Factoring]

Chapter 6, [Control Structures]

Use block if, not postfix if. [If Blocks]

Reserve postfix if for flow-of-control statements. [Postfix Selectors]

Don't use postfix unless, for, while, or until. [Other Postfix Modifiers]

Don't use unless or until at all. [Negative Control Statements]

Avoid C-style for statements. [C-Style Loops]

Avoid subscripting arrays or hashes within loops. [Unnecessary Subscripting]

Never subscript more than once in a loop. [Necessary Subscripting]

Use named lexicals as explicit for loop iterators. [Iterator Variables]

Always declare a for loop iterator variable with my. [Non-Lexical Loop Iterators]

Use map instead of for when generating new lists from old. [List Generation]

Use grep and first instead of for when searching for values in a list. [List Selections]

Use for instead of map when transforming a list in place. [List Transformation]

Use a subroutine call to factor out complex list transformations. [Complex Mappings]

Never modify $_ in a list function. [List Processing Side Effects]

Avoid cascading an if. [Multipart Selections]

Use table look-up in preference to cascaded equality tests. [Value Switches]

When producing a value, use tabular ternaries. [Tabular Ternaries]

Reject as many iterations as possible, as early as possible. [Linear Coding]

Don't contort loop structures just to consolidate control. [Distributed Control]

Use for and redo instead of an irregularly counted while. [Redoing]

Label every loop that is exited explicitly, and use the label with every next, last, or redo. [Loop Labels]

Chapter 7, [Documentation]

Distinguish user documentation from technical documentation. [Types of Documentation]

Create standard POD templates for modules and applications. [Boilerplates]

Extend and customize your standard POD templates. [Extended Boilerplates]

Put user documentation in source files. [Location]

Keep all user documentation in a single place within your source file. [Contiguity]

Place POD as close as possible to the end of the file. [Position]

Subdivide your technical documentation appropriately. [Technical Documentation]

Use block templates for major comments. [Comments]

Use full-line comments to explain the algorithm. [Algorithmic Documentation]

Use end-of-line comments to point out subtleties and oddities. [Elucidating Documentation]

Comment anything that has puzzled or tricked you. [Defensive Documentation]

Consider whether it's better to rewrite than to comment. [Indicative Documentation]

Use "invisible" POD sections for longer technical discussions. [Discursive Documentation]

Check the spelling, syntax, and sanity of your documentation. [Proofreading]

Chapter 8, [Built-in Functions]

Don't recompute sort keys inside a sort. [Sorting]

Use reverse to reverse a list. [Reversing Lists]

Use scalar reverse to reverse a scalar. [Reversing Scalars]

Use unpack to extract fixed-width fields. [Fixed-Width Data]

Use split to extract simple variable-width fields. [Separated Data]

Use Text::CSV_XS to extract complex variable-width fields. [Variable-Width Data]

Avoid string eval. [String Evaluations]

Consider building your sorting routines with Sort::Maker. [Automating Sorts]

Use 4-arg substr instead of lvalue substr. [Substrings]

Make appropriate use of lvalue values. [Hash Values]

Use glob, not <...>. [Globbing]

Avoid a raw select for non-integer sleeps. [Sleeping]

Always use a block with a map and grep. [Mapping and Grepping]

Use the "non-builtin builtins". [Utilities]
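One common way to follow the first guideline of this chapter (not recomputing sort keys inside a sort) is to compute each key once, sort on the cached keys, and then throw the keys away. The sketch below assumes a case-insensitive sort of made-up file names; it shows the general idiom, not code from the book.

```perl
use strict;
use warnings;

my @names = ("delta.txt", "Alpha.TXT", "charlie.txt", "Bravo.txt");

# Read bottom-up: pair each name with its lowercased key (computed once),
# sort the pairs by key, then keep only the original names.
my @sorted =
    map  { $_->[1] }
    sort { $a->[0] cmp $b->[0] }
    map  { [ lc($_), $_ ] }
    @names;

for my $name (@sorted) {
    print "$name\n";
}
```

With an inexpensive key like lc the gain is negligible; the idiom pays off when the key is costly to derive, such as a file's modification time.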

Chapter 9, [Subroutines]

Call subroutines with parentheses but without a leading &. [Call Syntax]

Don't give subroutines the same names as built-in functions. [Homonyms]

Always unpack @_ first. [Argument Lists]

Use a hash of named arguments for any subroutine that has more than three parameters. [Named Arguments]

Use definedness or existence to test for missing arguments. [Missing Arguments]

Resolve any default argument values as soon as @_ is unpacked. [Default Argument Values]

Always return scalar in scalar returns. [Scalar Return Values]

Make list-returning subroutines return the "obvious" value in scalar context. [Contextual Return Values]

When there is no "obvious" scalar context return value, consider Contextual::Return instead. [Multi-Contextual Return Values]

Don't use subroutine prototypes. [Prototypes]

Always return via an explicit return. [Implicit Returns]

Use a bare return to return failure. [Returning Failure]
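Several of these subroutine guidelines can be seen together in one small sketch. The sub format_line and its argument names are hypothetical; the example only illustrates unpacking @_ first, named arguments, early resolution of defaults, explicit return, and a bare return for failure.

```perl
use strict;
use warnings;

# Hypothetical subroutine: pad a text to a given width with a filler.
sub format_line {
    my ($arg_ref) = @_;                     # unpack @_ first

    # Resolve default values as soon as the arguments are unpacked.
    my $text   = $arg_ref->{text};
    my $width  = defined $arg_ref->{width}  ? $arg_ref->{width}  : 20;
    my $filler = defined $arg_ref->{filler} ? $arg_ref->{filler} : q{.};

    return if !defined $text;               # bare return signals failure

    my $padding = $width - length($text);
    if ($padding < 0) {
        $padding = 0;
    }
    return $text . ($filler x $padding);    # explicit return
}

my $line = format_line({ text => 'Total', width => 10 });
print "$line\n";
```

The bare return yields undef in scalar context and an empty list in list context, so the caller's test for failure behaves the same either way.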

Chapter 10, [I/O]

Don't use bareword filehandles. [Filehandles]

Use indirect filehandles. [Indirect Filehandles]

If you have to use a package filehandle, localize it first. [Localizing Filehandles]

Use either the IO::File module or the three-argument form of open. [Opening Cleanly]

Never open, close, or print to a file without checking the outcome. [Error Checking]

Close filehandles explicitly, and as soon as possible. [Cleanup]

Use while (<>), not for (<>). [Input Loops]

Prefer line-based I/O to slurping. [Line-Based Input]

Slurp a filehandle with a do block for purity. [Simple Slurping]

Slurp a stream with Perl6::Slurp for power and simplicity. [Power Slurping]

Avoid using *STDIN, unless you really mean it. [Standard Input]

Always put filehandles in braces within any print statement. [Printing to Filehandles]

Always prompt for interactive input. [Simple Prompting]

Don't reinvent the standard test for interactivity. [Interactivity]

Use the IO::Prompt module for prompting. [Power Prompting]

Always convey the progress of long non-interactive operations within interactive applications. [Progress Indicators]

Consider using the Smart::Comments module to automate your progress indicators. [Automatic Progress Indicators]

Avoid a raw select when setting autoflushes. [Autoflushing]
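A sketch of what several of these I/O guidelines look like in practice: lexical (indirect) filehandles, three-argument open, checking every open, print and close, braces around the filehandle in print, and a while loop for line-based input. The scratch file from File::Temp exists only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file so the sketch does not touch any real data.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh or die "Cannot close '$path': $!";

# Write: lexical filehandle, three-argument open, every outcome checked.
open my $out, '>', $path or die "Cannot open '$path' for writing: $!";
print {$out} "first\nsecond\n" or die "Cannot write to '$path': $!";
close $out or die "Cannot close '$path': $!";

# Read line by line with while, not for, and close as soon as possible.
open my $in, '<', $path or die "Cannot open '$path' for reading: $!";
my $count = 0;
while (my $line = <$in>) {
    chomp $line;
    $count++;
    print "$count: $line\n";
}
close $in or die "Cannot close '$path': $!";
```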

Chapter 11, [References]

Wherever possible, dereference with arrows. [Dereferencing]

Where prefix dereferencing is unavoidable, put braces around the reference. [Braced References]

Never use symbolic references. [Symbolic References]

Use weaken to prevent circular data structures from leaking memory. [Cyclic References]
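The last guideline can be illustrated with a minimal parent/child structure. The data below is invented; weaken and isweak come from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Parent holds a strong reference to the child; the child points back.
my $parent = { name => 'root', children => [] };
my $child  = { name => 'leaf', parent   => $parent };
push @{ $parent->{children} }, $child;

# Without this, parent and child would keep each other alive forever.
# A weak reference does not count toward the parent's reference count.
weaken($child->{parent});

if (isweak($child->{parent})) {
    print "back-reference is weak\n";
}
print "parent name via child: $child->{parent}{name}\n";
```

The usual convention is to weaken the "upward" link, so that the structure stays alive exactly as long as something holds the top of it.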

Chapter 12, [Regular Expressions]

Always use the /x flag. [Extended Formatting]

Always use the /m flag. [Line Boundaries]

Use \A and \z as string boundary anchors. [String Boundaries]

Use \z, not \Z, to indicate "end of string". [End of String]

Always use the /s flag. [Matching Anything]

Consider mandating the Regexp::Autoflags module. [Lazy Flags]

Use m{...} in preference to /.../ in multiline regexes. [Brace Delimiters]

Don't use any delimiters other than /.../ or m{...}. [Other Delimiters]

Prefer singular character classes to escaped metacharacters. [Metacharacters]

Prefer named characters to escaped metacharacters. [Named Characters]

Prefer properties to enumerated character classes. [Properties]

Consider matching arbitrary whitespace, rather than specific whitespace characters. [Whitespace]

Be specific when matching "as much as possible". [Unconstrained Repetitions]

Use capturing parentheses only when you intend to capture. [Capturing Parentheses]

Use the numeric capture variables only when you're sure that the preceding match succeeded. [Captured Values]

Always give captured substrings proper names. [Capture Variables]

Tokenize input using the /gc flag. [Piecewise Matching]

Build regular expressions from tables. [Tabular Regexes]

Build complex regular expressions from simpler pieces. [Constructing Regexes]

Consider using Regexp::Common instead of writing your own regexes. [Canned Regexes]

Always use character classes instead of single-character alternations. [Alternations]

Factor out common affixes from alternations. [Factoring Alternations]

Prevent useless backtracking. [Backtracking]

Prefer fixed-string eq comparisons to fixed-pattern regex matches. [String Comparisons]
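To make a few of these regex guidelines concrete, here is a hedged sketch that combines the /x, /m and /s flags, the \A and \z anchors, brace delimiters, and named lexicals for the captures. The date format and the names $DATE and parse_date are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical pattern: a date written as YYYY-MM-DD and nothing else.
my $DATE = qr{
    \A          # start of string
    (\d{4})     # year
    -
    (\d{2})     # month
    -
    (\d{2})     # day
    \z          # true end of string, a trailing newline does not match
}xms;

sub parse_date {
    my ($text) = @_;

    # Captures go straight into named lexicals, only if the match succeeded.
    if (my ($year, $month, $day) = $text =~ $DATE) {
        return { year => $year, month => $month, day => $day };
    }
    return;
}

my $date = parse_date("2020-09-29");
print "month: $date->{month}\n";
```

Whether "always use /xms" is good advice is debatable, as noted at the top of this list, but the anchors and named captures cost nothing.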

Chapter 13, [Error Handling]

Throw exceptions instead of returning special values or setting flags. [Exceptions]

Make failed builtins throw exceptions too. [Builtin Failures]

Make failures fatal in all contexts. [Contextual Failure]

Be careful when testing for failure of the system builtin. [Systemic Failure]

Throw exceptions on all failures, including recoverable ones. [Recoverable Failure]

Have exceptions report from the caller's location, not from the place where they were thrown. [Reporting Failure]

Compose error messages in the recipient's dialect. [Error Messages]

Document every error message in the recipient's dialect. [Documenting Errors]

Use exception objects whenever failure data needs to be conveyed to a handler. [OO Exceptions]

Use exception objects when error messages may change. [Volatile Error Messages]

Use exception objects when two or more exceptions are related. [Exception Hierarchies]

Catch exception objects in most-derived-first order. [Processing Exceptions]

Build exception classes automatically. [Exception Classes]

Unpack the exception variable in extended exception handlers. [Unpacking Exceptions]

Chapter 14, [Command-Line Processing]

Enforce a single consistent command-line structure. [Command-Line Structure]

Adhere to a standard set of conventions in your command-line syntax. [Command-Line Conventions]

Standardize your meta-options. [Meta-options]

Allow the same filename to be specified for both input and output. [In-situ Arguments]

Standardize on a single approach to command-line processing. [Command-Line Processing]

Ensure that your interface, run-time messages, and documentation remain consistent. [Interface Consistency]

Factor out common command-line interface components into a shared module. [Interapplication Consistency]

Chapter 15, [Objects]

Make object orientation a choice, not a default. [Using OO]

Choose object orientation using appropriate criteria. [Criteria]

Don't use pseudohashes. [Pseudohashes]

Don't use restricted hashes. [Restricted Hashes]

Always use fully encapsulated objects. [Encapsulation]

Give every constructor the same standard name. [Constructors]

Don't let a constructor clone objects. [Cloning]

Always provide a destructor for every inside-out class. [Destructors]

When creating methods, follow the general guidelines for subroutines. [Methods]

Provide separate read and write accessors. [Accessors]

Don't use lvalue accessors. [Lvalue Accessors]

Don't use the indirect object syntax. [Indirect Objects]

Provide an optimal interface, rather than a minimal one. [Class Interfaces]

Overload only the isomorphic operators of algebraic classes. [Operator Overloading]

Always consider overloading the boolean, numeric, and string coercions of objects. [Coercions]

Chapter 16, [Class Hierarchies]

Don't manipulate the list of base classes directly. [Inheritance]

Use distributed encapsulated objects. [Objects]

Never use the one-argument form of bless. [Blessing Objects]

Pass constructor arguments as labeled values, using a hash reference. [Constructor Arguments]

Distinguish arguments for base classes by class name as well. [Base Class Initialization]

Separate your construction, initialization, and destruction processes. [Construction and Destruction]

Build the standard class infrastructure automatically. [Automating Class Hierarchies]

Use Class::Std to automate the deallocation of attribute data. [Attribute Demolition]

Have attributes initialized and verified automatically. [Attribute Building]

Specify coercions as :STRINGIFY, :NUMERIFY, and :BOOLIFY methods. [Coercions]

Use :CUMULATIVE methods instead of SUPER:: calls. [Cumulative Methods]

Don't use AUTOLOAD(). [Autoloading]

Chapter 17, [Modules]

Design the module's interface first. [Interfaces]

Place original code inline. Place duplicated code in a subroutine. Place duplicated subroutines in a module. [Refactoring]

Use three-part version numbers. [Version Numbers]

Enforce your version requirements programmatically. [Version Requirements]

Export judiciously and, where possible, only by request. [Exporting]

Consider exporting declaratively. [Declarative Exporting]

Never make variables part of a module's interface. [Interface Variables]

Build new module frameworks automatically. [Creating Modules]

Use core modules wherever possible. [The Standard Library]

Use CPAN modules where feasible. [CPAN]

Chapter 18, [Testing and Debugging]

Write the test cases first. [Test Cases]

Standardize your tests with Test::Simple or Test::More. [Modular Testing]

Standardize your test suites with Test::Harness. [Test Suites]

Write test cases that fail. [Failure]

Test both the likely and the unlikely. [What to Test]

Add new test cases before you start debugging. [Debugging and Testing]

Always use strict. [Strictures]

Always turn on warnings explicitly. [Warnings]

Never assume that a warning-free compilation implies correctness. [Correctness]

Turn off strictures or warnings explicitly, selectively, and in the smallest possible scope. [Overriding Strictures]

Learn at least a subset of the perl debugger. [The Debugger]

Use serialized warnings when debugging "manually". [Manual Debugging]

Consider using "smart comments" when debugging, rather than warn statements. [Semi-Automatic Debugging]
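
As an illustration of the rule about overriding strictures, here is a small sketch that generates accessors. The Point class is hypothetical. Symbolic references are needed for exactly one statement, so strict 'refs' is switched off for exactly one statement and nothing more.

```perl
use strict;
use warnings;

package Point;    # hypothetical class

sub new {
    my ($class, $arg_ref) = @_;
    return bless { %{$arg_ref} }, $class;
}

# Generate simple read-only accessors for each field.
for my $field (qw( x y )) {
    my $accessor = sub {
        my ($self) = @_;
        return $self->{$field};
    };

    # Explicit, selective, and in the smallest possible scope:
    # only 'refs' is relaxed, and only inside this bare block.
    {
        no strict 'refs';
        *{ __PACKAGE__ . "::$field" } = $accessor;
    }
}

package main;

my $point = Point->new({ x => 3, y => 4 });
print $point->x(), ',', $point->y(), "\n";
```

Everything outside the bare block, including the body of the generated accessor, is still compiled under full strictures.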

Chapter 19, [Miscellanea]

Use a revision control system. [Revision Control]

Integrate non-Perl code into your applications via the Inline:: modules. [Other Languages]

Keep your configuration language uncomplicated. [Configuration Files]

Don't use formats. [Formats]

Don't tie variables or filehandles. [Ties]

Don't be clever. [Cleverness]

If you must rely on cleverness, encapsulate it. [Encapsulated Cleverness]

Don't optimize code--benchmark it. [Benchmarking]

Don't optimize data structures--measure them. [Memory]

Look for opportunities to use caches. [Caching]

Automate your subroutine caching. [Memoization]

Benchmark any caching strategy you use. [Caching for Optimization]

Don't optimize applications--profile them. [Profiling]

Be careful to preserve semantics when refactoring syntax. [Enbugging]
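
The caching rules above can be tried out with the core Memoize module. The sketch below is only an illustration: fib() is a textbook example, not code from the book, and counting calls is a crude stand-in for a real measurement (for timings the core Benchmark module would be the usual tool).

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;

# Deliberately naive recursive Fibonacci, instrumented with a call counter.
sub fib {
    my ($n) = @_;
    $calls++;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# First run: no caching.
my $plain_result = fib(20);
my $plain_calls  = $calls;

# Automate the caching instead of hand-coding a lookup hash.
memoize('fib');

# Second run: each distinct argument is computed only once.
$calls = 0;
my $cached_result = fib(20);
my $cached_calls  = $calls;

print "plain:    $plain_result in $plain_calls calls\n";
print "memoized: $cached_result in $cached_calls calls\n";
```

Both runs must return the same value. If they do not, the function has side effects or depends on external state and should not be cached at all, which is the point of measuring any caching strategy rather than assuming it helps.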

[Oct 27, 2011] How can I convert Perl code into HTML with syntax highlighting?

Jun 16, 2002 | www.perlmonks.org

ChemBoy

Use Perltidy's HTML formatting options (-html etc.). As shown in the Perltidy manual, you can generate a whole page or a simple preformatted section, with embedded or linked style sheets, and set different colors for all manner of different constructs.

yodabjorn

vim can do it.

:runtime! syntax/2html.vim

To view the help page:
:help 2html

Details are in the vim help under "convert-to-HTML".

epoptai

Code2HTML is a free Perl script that syntax highlights 15 different programming languages.

choroba

Emacs can do it. I often use the htmlfontify library.

togotutor

There is a robust online Source Code to HTML option at ToGoTutor - Code2Html. The website has tools for Perl, Java and other languages.

Perl style guide - Juerd's site

This is how I like my code, in no specific order. :)

4 space indents

No tabs in code (includes indents)

Always Class->method, never method Class (this includes "new"!)

Cuddled else: } else {

Opening curly on the same line as the keyword it belongs to

Closing vertically aligned with that keyword

Space after comma or semi-colon, but not before

No extra spaces around or inside parens: foo, (bar, baz), quux

Extra spaces in arrayref constructor: [ foo, bar ]

Extra spaces in hashref constructor: { foo => bar }

Extra spaces in code delimiting curlies: sort { $a <=> $b } @foo

No $a or $b except when sorting

No parens unless needed for clarity

Space between special keyword and its arguments: if (...) { ... }

No space between keyword and its arguments if the "looks like a function, therefore it is a function" rule applies: print((split)[22]), not print ((split)[22]). (And of course not print (split)[22])

No subroutine prototypes if they're ignored anyway

No subroutine prototypes just to hint the number of arguments

Prototypes enforce context, so use them only if that makes sense

No globals when access from another package is not needed

use strict and -w. Loading of normal modules comes after loading strict.

Lots of modules, but not to replace few-liners or simple regexes

Comments on code lines have two spaces before and one after the # symbol

No double spaces except for vertical alignment and comments

Only && || ! where parens would be needed with and or not

No double empty lines

Empty line between logical code chunks

Explicit returns from subs

Guards (return if ...) are nicer than large else-blocks

No space between array/hash and index/key: $foo[0], $foo{bar}

No quotes for simple literal hash keys

Space around index/key if it is complex: $foo{ $bar{baz}{bar} }

Long lines: indent according to parens, but always 4 spaces (or [], {}, etc)

Long lines: continuing lines are indented

Long lines: Lines end with operator, unless it's || && and or

No "outdent"s

No half indents

No double indents

grep EXPR and map EXPR when BLOCK is not needed

Logical order in comparisons: $foo == 4, but never 4 == $foo

English identifiers

Not the English.pm module

Multi-word identifiers have no separation, or are separated by underscores

Lowercase identifiers, but uppercase for constants

Whatever tool is useful: no OO when it does not make sense

It's okay to import symbols

No here-documents, but multi-line q/qq. Even repeated prints are better :) (Okay, here-docs can be used when they're far away from code that contains any logic. Code MUST NOT break when (un)indented.)

Always check return values where they are important

No spaces around: -> **

Spaces around: =~ !~ * / % + - . << >> comparison_ops & | ^ && || ?: assignment_ops => and or xor

Spaces or no spaces, depending on complexity: .. ... x

No space after, unless complex: ~ u+ u-

Long lines: break between method calls, -> comes first on a line, space after it

=> where it makes sense

qw where useful

qw when importing, but '' when specifying pragma behaviour

() for empty list, not qw()

-> to dereference, where possible

No abbreviations (acronyms are okay, and so are VERY common abbreviations) NEVER "ary"

Data type not represented in variable name: %foo and @foo, but not %foo_hash or @foo_array

Sometimes: data type of referent in reference variable names: $bla_hash is okay

Sometimes: data type 'reference' in reference variable names: $hashref is okay

No one-letter variable names, unless $i or alike

$i is a(n index) counter

Dummy variables can be called foo, bar, baz, quux or just dummy

Taint mode *only* for setuid programs

No sub main(), unless it needs to be called more often than once

Subs before main code!

Declare variables on first use, not before (unless required)

\cM > \x0d > \015. \r only where it makes sense as carriage return.

Complex regexes get /x

No space between ++/-- and the variable

List assignment for parameters/arguments, not lots of shifts

Only shift $self from @_ if @_ is used elsewhere in the sub

Direct @_ access is okay in very short subs

No eval STRING if not needed

Constructor "new" does not clone. Only handles a *class* as $_[0]

Constructor that clones is called "clone"

Constructor can be something else than "new", but "new" is an alias

No setting of $| when it is not needed

Lexical filehandles

No v-strings

Single quotes when double-quote features not used

In DBI: value interpolation using placeholders only

use base 'BaseClass' instead of use BaseClass and setting @ISA

Comments where code is unclear

Comments usually explain the WHY, not the HOW

POD at the bottom, not top, not interleaved

Sane variable scopes

No local, except for perlvar vars

No C-style loop for skipless iteration

No looping over indexes if only the element is used

80 characters width. It's okay to give up some whitespace

Unbalanced custom delimiters are not metacharacters and not alphanumeric

RHS of complex s///e is delimited by {}

Favourite custom delimiter is []

Semi-colon only left out for implicit return or in single-statement block

No $&, $` or $'

Localization of globals if they're to be changed (local $_ often avoids weird bugs)

Semi-colon not on its own line

(in|de)crement in void context is post(in|de)crement

No map or grep in void context

? and : begin lines in complex expressions

True and false are always implied. No $foo == 0 when testing for truth.

Only constructors return $self. Accessor methods never do this.

Stacking methods is okay, but a non-constructor method should never return $self.

Accessor methods should behave like variables (Attribute::Property!)

Other methods should behave like subroutines

our $VERSION, not use vars qw($VERSION);

Module version numbers are ^\d+\.\d\d\z

Error checking is done using or. This means open or do { ... } instead of unless (open) { ... } when handling the error is more than a simple statement.

The result of the modulus operator (%) has no useful boolean meaning (it is reversed), so explicit == 0 should be used.
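
Several of the rules above fit into one short subroutine. This is a sketch written for this page, not code from Juerd's site; the subroutine name and the file format (one integer per line) are assumptions made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Count the even numbers in a file that holds one integer per line.
sub count_even_lines {
    my ($path) = @_;    # list assignment for arguments, not shift

    return 0 if !defined $path;    # guard instead of a large else-block

    # Lexical filehandle; multi-statement error handling hangs off "or".
    open my $fh, '<', $path or do {
        warn "Cannot open $path: $!\n";
        return 0;
    };

    my $even = 0;
    while (my $line = <$fh>) {
        chomp $line;
        next if $line !~ /\A-?\d+\z/;
        $even++ if $line % 2 == 0;    # explicit == 0 with the modulus operator
    }
    close $fh;

    return $even;    # explicit return
}

# Usage: write the numbers 1..10 to a temporary file and count the even ones.
my ($tmp_fh, $tmp_path) = tempfile( UNLINK => 1 );
print {$tmp_fh} "$_\n" for 1 .. 10;
close $tmp_fh;

print count_even_lines($tmp_path), "\n";
```

Writing the test as `if !($line % 2)` would work too, but it reads backwards, which is exactly why the guide asks for an explicit comparison with zero.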

[July 14, 2005] Perl.com- Ten Essential Development Practices by Damian Conway

Beware, Damian Conway is an OO fundamentalist...

The following ten tips come from Perl Best Practices, a new book of Perl coding and development guidelines by Damian Conway.

1. Design the Module's Interface First

The most important aspect of any module is not how it implements the facilities it provides, but the way in which it provides those facilities in the first place. If the module's API is too awkward, or too complex, or too extensive, or too fragmented, or even just poorly named, developers will avoid using it. They'll write their own code instead. In that way, a poorly designed module can actually reduce the overall maintainability of a system.

Designing module interfaces requires both experience and creativity. Perhaps the easiest way to work out how an interface should work is to "play test" it: to write examples of code that will use the module before implementing the module itself. These examples will not be wasted when the design is complete. You can usually recycle them into demos, documentation examples, or the core of a test suite.

The key, however, is to write that code as if the module were already available, and write it the way you'd most like the module to work.

Once you have some idea of the interface you want to create, convert your "play tests" into actual tests (see Tip #2). Then it's just a Simple Matter Of Programming to make the module work the way that the code examples and the tests want it to.

Of course, it may not be possible for the module to work the way you'd most like, in which case attempting to implement it that way will help you determine what aspects of your API are not practical, and allow you to work out what might be an acceptable alternative.

2. Write the Test Cases Before the Code

Probably the single best practice in all of software development is writing your test suite first.

A test suite is an executable, self-verifying specification of the behavior of a piece of software. If you have a test suite, you can--at any point in the development process--verify that the code works as expected. If you have a test suite, you can--after any changes during the maintenance cycle--verify that the code still works as expected.

Write the tests first. Write them as soon as you know what your interface will be (see #1). Write them before you start coding your application or module. Unless you have tests, you have no unequivocal specification of what the software should do, and no way of knowing whether it does it.

Writing tests always seems like a chore, and an unproductive chore at that: you don't have anything to test yet, so why write tests? Yet most developers will--almost automatically--write driver software to test their new module in an ad hoc way:

    > cat try_inflections.pl

    # Test my shiny new English inflections module...

    use Lingua::EN::Inflect qw( inflect );

    # Try some plurals (both standard and unusual inflections)...
    my %plural_of = (
        'house'         => 'houses',
        'mouse'         => 'mice',
        'box'           => 'boxes',
        'ox'            => 'oxen',
        'goose'         => 'geese',
        'mongoose'      => 'mongooses',
        'law'           => 'laws',
        'mother-in-law' => 'mothers-in-law',
    );

    # For each of them, print both the expected result and the actual inflection...
    for my $word ( keys %plural_of ) {
        my $expected = $plural_of{$word};
        my $computed = inflect( "PL_N($word)" );

        print "For $word:\n",
              "\tExpected: $expected\n",
              "\tComputed: $computed\n";
    }

A driver like that is actually harder to write than a test suite, because you have to worry about formatting the output in a way that is easy to read. It's also much harder to use the driver than it would be to use a test suite, because every time you run it you have to wade through that formatted output and verify "by eye" that everything is as it should be. That's also error-prone; eyes are not optimized for picking out small differences in the middle of large amounts of nearly identical text.

Instead of hacking together a driver program, it's easier to write a test program using the standard Test::Simple module. Instead of print statements showing what's being tested, you just write calls to the ok() subroutine, specifying as its first argument the condition under which things are okay, and as its second argument a description of what you're actually testing:

    > cat inflections.t

    use Lingua::EN::Inflect qw( inflect );

    use Test::Simple qw( no_plan );

    my %plural_of = (
        'mouse'         => 'mice',
        'house'         => 'houses',
        'ox'            => 'oxen',
        'box'           => 'boxes',
        'goose'         => 'geese',
        'mongoose'      => 'mongooses',
        'law'           => 'laws',
        'mother-in-law' => 'mothers-in-law',
    );

    for my $word ( keys %plural_of ) {
        my $expected = $plural_of{$word};
        my $computed = inflect( "PL_N($word)" );

        ok( $computed eq $expected, "$word -> $expected" );
    }

Note that this code loads Test::Simple with the argument qw( no_plan ). Normally that argument would be tests => count, indicating how many tests to expect, but here the tests are generated from the %plural_of table at run time, so the final count will depend on how many entries are in that table. Specifying a fixed number of tests when loading the module is useful if you happen to know that number at compile time, because then the module can also "meta-test": verify that you carried out all the tests you expected to.

The Test::Simple program is slightly more concise and readable than the original driver code, and the output is much more compact and informative:

    > perl inflections.t

    ok 1 - house -> houses
    ok 2 - law -> laws
    not ok 3 - mongoose -> mongooses
    #     Failed test (inflections.t at line 21)
    ok 4 - goose -> geese
    ok 5 - ox -> oxen
    not ok 6 - mother-in-law -> mothers-in-law
    #     Failed test (inflections.t at line 21)
    ok 7 - mouse -> mice
    ok 8 - box -> boxes
    1..8
    # Looks like you failed 2 tests of 8.

More importantly, this version requires far less effort to verify the correctness of each test. You just scan down the left margin looking for a not and a comment line.

You might prefer to use the Test::More module instead of Test::Simple. Then you can specify the actual and expected values separately, by using the is() subroutine, rather than ok():

    use Lingua::EN::Inflect qw( inflect );

    use Test::More qw( no_plan );    # Now using more advanced testing tools

    my %plural_of = (
        'mouse'         => 'mice',
        'house'         => 'houses',
        'ox'            => 'oxen',
        'box'           => 'boxes',
        'goose'         => 'geese',
        'mongoose'      => 'mongooses',
        'law'           => 'laws',
        'mother-in-law' => 'mothers-in-law',
    );

    for my $word ( keys %plural_of ) {
        my $expected = $plural_of{$word};
        my $computed = inflect( "PL_N($word)" );

        # Test expected and computed inflections for string equality...
        is( $computed, $expected, "$word -> $expected" );
    }

Apart from no longer having to type the eq yourself, this version also produces more detailed error messages:

    > perl inflections.t

    ok 1 - house -> houses
    ok 2 - law -> laws
    not ok 3 - mongoose -> mongooses
    #     Failed test (inflections.t at line 20)
    #          got: 'mongeese'
    #     expected: 'mongooses'
    ok 4 - goose -> geese
    ok 5 - ox -> oxen
    not ok 6 - mother-in-law -> mothers-in-law
    #     Failed test (inflections.t at line 20)
    #          got: 'mothers-in-laws'
    #     expected: 'mothers-in-law'
    ok 7 - mouse -> mice
    ok 8 - box -> boxes
    1..8
    # Looks like you failed 2 tests of 8.

The Test::Tutorial documentation that comes with Perl 5.8 provides a gentle introduction to both Test::Simple and Test::More.
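
For comparison, here is a sketch of the fixed-plan variant mentioned above, using tests => count instead of no_plan. To keep it self-contained it does not load Lingua::EN::Inflect; the plural_of() subroutine is a hypothetical stand-in for the code under test and handles only the few words shown.

```perl
use strict;
use warnings;
use Test::More tests => 4;    # fixed plan: the count is known in advance

# Hypothetical stand-in for the module under test.
sub plural_of {
    my ($word) = @_;

    my %irregular = ( mouse => 'mice', ox => 'oxen' );
    return $irregular{$word} if exists $irregular{$word};

    return $word . 'es' if $word =~ /(?:x|s|ch|sh)\z/;
    return $word . 's';
}

# Each is() call compares the computed value with the expected one.
is( plural_of('house'), 'houses', 'house -> houses' );
is( plural_of('mouse'), 'mice',   'mouse -> mice'   );
is( plural_of('box'),   'boxes',  'box -> boxes'    );
is( plural_of('ox'),    'oxen',   'ox -> oxen'      );
```

If a later edit accidentally deletes one of the is() calls, the plan no longer matches the number of tests run and the harness reports the script as failed, which is the "meta-test" the article refers to.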

3. Create Standard POD Templates for Modules and Applications

One of the main reasons documentation can often seem so unpleasant is the "blank page effect." Many programmers simply don't know how to get started or what to say.

Perhaps the easiest way to make writing documentation less forbidding (and hence, more likely to actually occur) is to circumvent that initial empty screen by providing a template that developers can cut and paste into their code.

For a module, that documentation template might look something like this:

    =head1 NAME

    <Module::Name> - <One-line description of module's purpose>

    =head1 VERSION

    The initial template usually just has:

    This documentation refers to <Module::Name> version 0.0.1.

    =head1 SYNOPSIS

        use <Module::Name>;

        # Brief but working code example(s) here showing the most common usage(s)

        # This section will be as far as many users bother reading, so make it as
        # educational and exemplary as possible.

    =head1 DESCRIPTION

    A full description of the module and its features.
    May include numerous subsections (i.e., =head2, =head3, etc.).

    =head1 SUBROUTINES/METHODS

    A separate section listing the public components of the module's interface.
    These normally consist of either subroutines that may be exported, or methods
    that may be called on objects belonging to the classes that the module
    provides. Name the section accordingly.

    In an object-oriented module, this section should begin with a sentence (of the
    form "An object of this class represents ...") to give the reader a high-level
    context to help them understand the methods that are subsequently described.

    =head1 DIAGNOSTICS

    A list of every error and warning message that the module can generate (even
    the ones that will "never happen"), with a full explanation of each problem,
    one or more likely causes, and any suggested remedies.

    =head1 CONFIGURATION AND ENVIRONMENT

    A full explanation of any configuration system(s) used by the module, including
    the names and locations of any configuration files, and the meaning of any
    environment variables or properties that can be set. These descriptions must
    also include details of any configuration language used.

    =head1 DEPENDENCIES

    A list of all of the other modules that this module relies upon, including any
    restrictions on versions, and an indication of whether these required modules
    are part of the standard Perl distribution, part of the module's distribution,
    or must be installed separately.

    =head1 INCOMPATIBILITIES

    A list of any modules that this module cannot be used in conjunction with.
    This may be due to name conflicts in the interface, or competition for system
    or program resources, or due to internal limitations of Perl (for example, many
    modules that use source code filters are mutually incompatible).

    =head1 BUGS AND LIMITATIONS

    A list of known problems with the module, together with some indication of
    whether they are likely to be fixed in an upcoming release.

    Also, a list of restrictions on the features the module does provide: data types
    that cannot be handled, performance issues and the circumstances in which they
    may arise, practical limitations on the size of data sets, special cases that
    are not (yet) handled, etc.

    The initial template usually just has:

    There are no known bugs in this module.

    Please report problems to <Maintainer name(s)> (<contact address>)

    Patches are welcome.

    =head1 AUTHOR

    <Author name(s)> (<contact address>)

    =head1 LICENSE AND COPYRIGHT

    Copyright (c) <year> <copyright holder> (<contact address>).
    All rights reserved.

    followed by whatever license you wish to release it under.
    For Perl code that is often just:

    This module is free software; you can redistribute it and/or modify it under
    the same terms as Perl itself. See L<perlartistic>.

    This program is distributed in the hope that it will be useful, but WITHOUT
    ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
    FOR A PARTICULAR PURPOSE.

Of course, the specific details that your templates provide may vary from those shown here, according to your other coding practices. The most likely variation will be in the license and copyright, but you may also have specific in-house conventions regarding version numbering, the grammar of diagnostic messages, or the attribution of authorship.

4. Use a Revision Control System

Maintaining control over the creation and modification of your source code is utterly essential for robust team-based development. And not just over source code: you should be revision controlling your documentation, and data files, and document templates, and makefiles, and style sheets, and change logs, and any other resources your system requires.

Just as you wouldn't use an editor without an Undo command or a word processor that can't merge documents, so too you shouldn't use a file system you can't rewind, or a development environment that can't integrate the work of many contributors.

Programmers make mistakes, and occasionally those mistakes will be catastrophic. They will reformat the disk containing the most recent version of the code. Or they'll mistype an editor macro and write zeros all through the source of a critical core module. Or two developers will unwittingly edit the same file at the same time and half their changes will be lost. Revision control systems can prevent those kinds of problems.

Moreover, occasionally the very best debugging technique is to just give up, stop trying to get yesterday's modifications to work correctly, roll the code back to a known stable state, and start over again. Less drastically, comparing the current condition of your code with the most recent stable version from your repository (even just a line-by-line diff) can often help you isolate your recent "improvements" and work out which of them is the problem.

Revision control systems such as RCS, CVS, Subversion, Monotone, darcs, Perforce, GNU arch, or BitKeeper can protect against calamities, and ensure that you always have a working fallback position if maintenance goes horribly wrong. The various systems have different strengths and limitations, many of which stem from fundamentally different views on what exactly revision control is. It's a good idea to audition the various revision control systems, and find the one that works best for you. Pragmatic Version Control Using Subversion, by Mike Mason (Pragmatic Bookshelf, 2005) and Essential CVS, by Jennifer Vesperman (O'Reilly, 2003) are useful starting points.

5. Create Consistent Command-Line Interfaces

Command-line interfaces have a strong tendency to grow over time, accreting new options as you add features to the application. Unfortunately, the evolution of such interfaces is rarely designed, managed, or controlled, so the set of flags, options, and arguments that a given application accepts are likely to be ad hoc and unique.

This also means they're likely to be inconsistent with the unique ad hoc sets of flags, options, and arguments that other related applications provide. The result is inevitably a suite of programs, each of which is driven in a distinct and idiosyncratic way. For example:

    > orchestrate source.txt -to interim.orc

    > remonstrate +interim.rem -interim.orc

    > fenestrate --src=interim.rem --dest=final.wdw
    Invalid input format

    > fenestrate --help
    Unknown option: --help.
    Type 'fenestrate -hmo' for help

Here, the orchestrate utility expects its input file as its first argument, while the -to flag specifies its output file. The related remonstrate tool uses -infile and +outfile options instead, with the output file coming first. The fenestrate program seems to require GNU-style "long options:" --src=infile and --dest=outfile, except, apparently, for its oddly named help flag. All in all, it's a mess.

When you're providing a suite of programs, all of them should appear to work the same way, using the same flags and options for the same features across all applications. This enables your users to take advantage of existing knowledge--instead of continually asking you.

Those three programs should work like this:

    > orchestrate -i source.txt -o dest.orc

    > remonstrate -i source.orc -o dest.rem

    > fenestrate -i source.rem -o dest.wdw
    Input file ('source.rem') not a valid Remora file
    (type "fenestrate --help" for help)

    > fenestrate --help
    fenestrate - convert Remora .rem files to Windows .wdw format

    Usage:
        fenestrate [-i <infile>] [-o <outfile>] [-cstq] [-h|-v]

    Options:
        -i <infile>     Specify input source [default: STDIN]
        -o <outfile>    Specify output destination [default: STDOUT]
        -c              Attempt to produce a more compact representation
        -h              Use horizontal (landscape) layout
        -v              Use vertical (portrait) layout
        -s              Be strict regarding input
        -t              Be extra tolerant regarding input
        -q              Run silent
        --version       Print version information
        --usage         Print the usage line of this summary
        --help          Print this summary
        --man           Print the complete manpage

Here, every application that takes input and output files uses the same two flags to do so. A user who wants to use the substrate utility (to convert that final .wdw file to a subroutine) is likely to be able to guess correctly the required syntax:
> substrate -i dest.wdw -o dest.sub
Anyone who can't guess that probably can guess that:
> substrate --help
is likely to render aid and comfort.

A large part of making interfaces consistent is being consistent in specifying the individual components of those interfaces. Some conventions that may help to design consistent and predictable interfaces include:
Require a flag preceding every piece of command-line data, except filenames.
Users don't want to have to remember that your application requires "input file, output file, block size, operation, fallback strategy," and requires them in that precise order:
> lustrate sample_data proc_data 1000 normalize log
They want to be able to say explicitly what they mean, in any order that suits them:
> lustrate sample_data proc_data -op=normalize -b1000 --fallback=log
Provide a flag for each filename, too, especially when a program can be given files for different purposes.
Users might also not want to remember the order of the two positional filenames, so let them label those arguments as well, and specify them in whatever order they prefer:
> lustrate -i sample_data -op normalize -b1000 --fallback log -o proc_data
Use a single - prefix for short-form flags, up to three letters (-v, -i, -rw, -in, -out).
Experienced users appreciate short-form flags as a way of reducing typing and limiting command-line clutter. Don't make them type two dashes in these shortcuts.

Use a double -- prefix for longer flags (--verbose, --interactive, --readwrite, --input, --output).
Flags that are complete words improve the readability of a command line (in a shell script, for example). The double dash also helps to distinguish between the longer flag name and any nearby file names.
If a flag expects an associated value, allow an optional = between the flag and the value.
Some people prefer to visually associate a value with its preceding flag:
> lustrate -i=sample_data -op=normalize -b=1000 --fallback=log -o=proc_data
Others don't:
> lustrate -i sample_data -op normalize -b1000 --fallback log -o proc_data
Still others want a bit each way:
> lustrate -i sample_data -o proc_data -op=normalize -b=1000 --fallback=log
Let the user choose.
Allow single-letter options to be "bundled" after a single dash.
It's irritating to have to type repeated dashes for a series of flags:
> lustrate -i sample_data -v -l -x
Allow experienced users to also write:
> lustrate -i sample_data -vlx
Provide a multi-letter version of every single-letter flag.
Short-form flags may be nice for experienced users, but they can be troublesome for new users: hard to remember and even harder to recognize. Don't force people to do either. Give them a verbose alternative to every concise flag; full words that are easier to remember, and also more self-documenting in shell scripts.

Always allow - as a special filename.
A widely used convention is that a dash (-) where an input file is expected means "read from standard input," and a dash where an output file is expected means "write to standard output."

Always allow -- as a file list marker.
Another widely used convention is that the appearance of a double dash (--) on the command line marks the end of any flagged options, and indicates that the remaining arguments are a list of filenames, even if some of them look like flags.
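
Several of these conventions come almost for free with the core Getopt::Long module. The sketch below is one possible arrangement, not the article's own code: the option names are assumptions chosen for the example, and the parse_command_line() helper is hypothetical. Note one limitation: with Getopt::Long's "bundling" mode switched on, single-dash multi-letter flags such as -op or -in are not available, so the sketch sticks to single-letter short flags plus full-word long flags.

```perl
use strict;
use warnings;
use Getopt::Long qw( GetOptionsFromArray );

# "bundling" lets -vq stand for -v -q.
Getopt::Long::Configure('bundling');

# Parse a list of arguments; returns a hash reference of settings.
sub parse_command_line {
    my @args = @_;

    # By convention "-" means standard input or standard output.
    my %opt = ( input => '-', output => '-', verbose => 0, quiet => 0 );

    # Every short flag has a multi-letter equivalent.
    GetOptionsFromArray(
        \@args,
        'input|i=s'  => \$opt{input},
        'output|o=s' => \$opt{output},
        'verbose|v'  => \$opt{verbose},
        'quiet|q'    => \$opt{quiet},
    ) or die "Usage: lustrate [-i <infile>] [-o <outfile>] [-vq] [--] [files]\n";

    # Whatever is left (including anything after "--") is a file list.
    $opt{files} = \@args;
    return \%opt;
}

my $opt_ref = parse_command_line(
    qw( -i sample_data -vq --output=proc_data -- -odd-name )
);

print "input=$opt_ref->{input} output=$opt_ref->{output} ",
      "verbose=$opt_ref->{verbose} quiet=$opt_ref->{quiet} ",
      "files=@{ $opt_ref->{files} }\n";
```

The double dash ends option processing, so -odd-name is passed through as a filename even though it looks like a flag.
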
6. Agree Upon a Coherent Layout Style and Automate It with perltidy

Formatting. Indentation. Style. Code layout. Whatever you choose to call it, it's one of the most contentious aspects of programming discipline. More and bloodier wars have been fought over code layout than over just about any other aspect of coding.

What is the best practice here? Should you use classic Kernighan and Ritchie style? Or go with BSD code formatting? Or adopt the layout scheme specified by the GNU project? Or conform to the Slashcode coding guidelines?

Of course not! Everyone knows that <insert your personal coding style here> is the One True Layout Style, the only sane choice, as ordained by <insert your favorite Programming Deity here> since Time Immemorial! Any other choice is manifestly absurd, willfully heretical, and self-evidently a Work of Darkness!

That's precisely the problem. When deciding on a layout style, it's hard to decide where rational choices end and rationalized habits begin.

Adopting a coherently designed approach to code layout, and then applying that approach consistently across all your coding, is fundamental to best-practice programming. Good layout can improve the readability of a program, help detect errors within it, and make the structure of your code much easier to comprehend. Layout matters.

However, most coding styles--including the four mentioned earlier--confer those benefits almost equally well. While it's true that having a consistent code layout scheme matters very much indeed, the particular code layout scheme you ultimately decide upon does not matter at all! All that matters is that you adopt a single, coherent style; one that works for your entire programming team, and, having agreed upon that style, that you then apply it consistently across all your development.

In the long term, it's best to train yourself and your team to code in a consistent, rational, and readable style. However, the time and commitment necessary to accomplish that isn't always available. In such cases, a reasonable compromise is to prescribe a standard code-formatting tool that must be applied to all code before it's committed, reviewed, or otherwise displayed in public.

There is now an excellent code formatter available for Perl: perltidy. It provides an extensive range of user-configurable options for indenting, block delimiter positioning, column-like alignment, and comment positioning.

Using perltidy, you can convert code like this:

    if($sigil eq '$'){
        if($subsigil eq '?'){
            $sym_table{substr($var_name,2)}=delete $sym_table{locate_orig_var($var)};
            $internal_count++;$has_internal{$var_name}++
        } else {
            ${$var_ref} =
                q{$sym_table{$var_name}}; $external_count++; $has_external{$var_name}++;
    }} elsif ($sigil eq '@'&&$subsigil eq '?') {
        @{$sym_table{$var_name}} = grep
            {defined $_} @{$sym_table{$var_name}};
    } elsif ($sigil eq '%' && $subsigil eq '?') {
    delete $sym_table{$var_name}{$EMPTY_STR}; } else {
    ${$var_ref}
    =
    q{$sym_table{$var_name}}
    }

into something readable:

    if ( $sigil eq '$' ) {
        if ( $subsigil eq '?' ) {
            $sym_table{ substr( $var_name, 2 ) }
                = delete $sym_table{ locate_orig_var($var) };
            $internal_count++;
            $has_internal{$var_name}++;
        }
        else {
            ${$var_ref} = q{$sym_table{$var_name}};
            $external_count++;
            $has_external{$var_name}++;
        }
    }
    elsif ( $sigil eq '@' && $subsigil eq '?' ) {
        @{ $sym_table{$var_name} }
            = grep {defined $_} @{ $sym_table{$var_name} };
    }
    elsif ( $sigil eq '%' && $subsigil eq '?' ) {
        delete $sym_table{$var_name}{$EMPTY_STR};
    }
    else {
        ${$var_ref} = q{$sym_table{$var_name}};
    }

Mandating that everyone use a common tool to format their code can also be a simple way of sidestepping the endless objections, acrimony, and dogma that always surround any discussion on code layout. If perltidy does all the work for them, then it will cost developers almost no effort to adopt the new guidelines. They can simply set up an editor macro that will "straighten" their code whenever they need to.

7. Code in Commented Paragraphs

A paragraph is a collection of statements that accomplish a single task: in literature, it's a series of sentences conveying a single idea; in programming, a series of instructions implementing a single step of an algorithm.

Break each piece of code into sequences that achieve a single task, placing a single empty line between each sequence. To further improve the maintainability of the code, place a one-line comment at the start of each such paragraph, describing what the sequence of statements does. Like so:
    # Process an array that has been recognized...
    sub addarray_internal {
        my ($var_name, $needs_quotemeta) = @_;

        # Cache the original...
        $raw .= $var_name;

        # Build meta-quoting code, if requested...
        my $quotemeta = $needs_quotemeta ? q{map {quotemeta $_} } : $EMPTY_STR;

        # Expand elements of variable, conjoin with ORs...
        my $perl5pat = qq{(??{join q{|}, $quotemeta \@{$var_name}})};

        # Insert debugging code if requested...
        my $type = $quotemeta ? 'literal' : 'pattern';
        debug_now("Adding $var_name (as $type)");
        add_debug_mesg("Trying $var_name (as $type)");

        return $perl5pat;
    }
Paragraphs are useful because humans can focus on only a few pieces of information at once. Paragraphs are one way of aggregating small amounts of related information, so that the resulting "chunk" can fit into a single slot of the reader's limited short-term memory. Paragraphs enable the physical structure of a piece of writing to reflect and emphasize its logical structure.

Adding comments at the start of each paragraph further enhances the chunking by explicitly summarizing the purpose of each chunk (note: the purpose, not the behavior). Paragraph comments need to explain why the code is there and what it achieves, not merely paraphrase the precise computational steps it's performing.

Note, however, that the contents of paragraphs are only of secondary importance here. It is the vertical gaps separating each paragraph that are critical. Without them, the readability of the code declines dramatically, even if the comments are retained:
    sub addarray_internal {
        my ($var_name, $needs_quotemeta) = @_;
        # Cache the original...
        $raw .= $var_name;
        # Build meta-quoting code, if required...
        my $quotemeta = $needs_quotemeta ? q{map {quotemeta $_} } : $EMPTY_STR;
        # Expand elements of variable, conjoin with ORs...
        my $perl5pat = qq{(??{join q{|}, $quotemeta \@{$var_name}})};
        # Insert debugging code if requested...
        my $type = $quotemeta ? 'literal' : 'pattern';
        debug_now("Adding $var_name (as $type)");
        add_debug_mesg("Trying $var_name (as $type)");
        return $perl5pat;
    }
8. Throw Exceptions Instead of Returning Special Values or Setting Flags

Returning a special error value on failure, or setting a special error flag, is a very common error-handling technique. Collectively, they're the basis for virtually all error notification from Perl's own built-in functions. For example, the built-ins eval, exec, flock, open, print, stat, and system all return special values on error. Unfortunately, they don't all use the same special value. Some of them also set a flag on failure. Sadly, it's not always the same flag. See the perlfunc manpage for the gory details.
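The inconsistency is easy to see when a few of those built-ins are placed side by side. The following is only a sketch, not part of the original text: the helper name demo_error_conventions and the non-existent path are made up for illustration, and $^X (the running perl) stands in for an arbitrary external command.

```perl
use strict;
use warnings;

# Illustrative only: three built-ins, three different failure conventions.
sub demo_error_conventions {
    my %report;

    # open: returns false on failure and sets $! (the OS error).
    my $opened = open my $fh, '<', '/no/such/dir/no-such-file.txt';
    $report{open} = $opened ? 'ok' : "failed: $!";

    # system: returns the wait status, so 0 means success and non-zero
    # means failure; the child's exit code is in the high byte of $?.
    my $status = system( $^X, '-e', 'exit 3' );
    $report{system} = $status == 0 ? 'ok' : 'failed: exit code ' . ( $? >> 8 );

    # eval: returns undef on failure and sets $@ (the exception text).
    my $result = eval { die "boom\n" };
    chomp( my $error = $@ );
    $report{eval} = defined $result ? 'ok' : "failed: $error";

    return \%report;
}

my $report = demo_error_conventions();
printf "%-6s => %s\n", $_, $report->{$_} for sort keys %{$report};
```

Note that for open and eval a false value signals failure, while for system a false value (zero) signals success.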

Apart from the obvious consistency problems, error notification via flags and return values has another serious flaw: developers can silently ignore flags and return values, and ignoring them requires absolutely no effort on the part of the programmer. In fact, in a void context, ignoring return values is Perl's default behavior. Ignoring an error flag that has suddenly appeared in a special variable is just as easy: you simply don't bother to check the variable.

Moreover, because ignoring a return value is the void-context default, there's no syntactic marker for it. There's no way to look at a program and immediately see where a return value is deliberately being ignored, which means there's also no way to be sure that it's not being ignored accidentally.

The bottom line: regardless of the programmer's (lack of) intention, an error indicator is being ignored. That's not good programming.

Ignoring error indicators frequently causes programs to propagate errors in entirely the wrong direction. For example:
    # Find and open a file by name, returning the filehandle
    # or undef on failure...
    sub locate_and_open {
        my ($filename) = @_;

        # Check acceptable directories in order...
        for my $dir (@DATA_DIRS) {
            my $path = "$dir/$filename";

            # If file exists in an acceptable directory, open and return it...
            if (-r $path) {
                open my $fh, '<', $path;
                return $fh;
            }
        }

        # Fail if all possible locations tried without success...
        return;
    }

    # Load file contents up to the first <DATA/> marker...
    sub load_header_from {
        my ($fh) = @_;

        # Use DATA tag as end-of-"line"...
        local $/ = '<DATA/>';

        # Read to end-of-"line"...
        return <$fh>;
    }

    # and later...
    for my $filename (@source_files) {
        my $fh = locate_and_open($filename);
        my $head = load_header_from($fh);
        print $head;
    }
The locate_and_open() subroutine simply assumes that the call to open works, immediately returning the filehandle ($fh), whatever the actual outcome of the open. Presumably, the expectation is that whoever calls locate_and_open() will check whether the return value is a valid filehandle.

Except, of course, "whoever" doesn't check. Instead of testing for failure, the main for loop takes the failure value and immediately propagates it "across" the block, to the rest of the statements in the loop. That causes the call to load_header_from() to propagate the error value "downwards." It's in that subroutine that the attempt to treat the failure value as a filehandle eventually kills the program:
readline() on unopened filehandle at demo.pl line 28.
Code like that--where an error is reported in an entirely different part of the program from where it actually occurred--is particularly onerous to debug.

Of course, you could argue that the fault lies squarely with whoever wrote the loop, for using locate_and_open() without checking its return value. In the narrowest sense, that's entirely correct--but the deeper fault lies with whoever actually wrote locate_and_open() in the first place, or at least, whoever assumed that the caller would always check its return value.

Humans simply aren't like that. Rocks almost never fall out of the sky, so humans soon conclude that they never do, and stop looking up for them. Fires rarely break out in their homes, so humans soon forget that they might, and stop testing their smoke detectors every month. In the same way, programmers inevitably abbreviate "almost never fails" to "never fails," and then simply stop checking.

That's why so very few people bother to verify their print statements:
    if (!print 'Enter your name: ') {
        print {*STDLOG} warning => 'Terminal went missing!'
    }
It's human nature to "trust but not verify."

Human nature is why returning an error indicator is not best practice. Errors are (supposed to be) unusual occurrences, so error markers will almost never be returned. Those tedious and ungainly checks for them will almost never do anything useful, so eventually they'll be quietly omitted. After all, leaving the tests off almost always works just fine. It's so much easier not to bother. Especially when not bothering is the default!

Don't return special error values when something goes wrong; throw an exception instead. The great advantage of exceptions is that they reverse the usual default behaviors, bringing untrapped errors to immediate and urgent attention. On the other hand, ignoring an exception requires a deliberate and conspicuous effort: you have to provide an explicit eval block to neutralize it.
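A minimal sketch of that reversal of defaults, using the standard Carp module. The parse_port() helper and the default port of 80 are hypothetical, invented only for this illustration:

```perl
use strict;
use warnings;
use Carp qw( croak );

# Hypothetical helper: converts a string to a port number, throwing an
# exception (instead of returning undef) when the input is unusable.
sub parse_port {
    my ($text) = @_;
    croak "Not a valid port: '$text'"
        if $text !~ m{\A \d+ \z}xms || $text < 1 || $text > 65535;
    return $text + 0;
}

# Untrapped, a failure here would terminate the program at once...
my $port = parse_port('8080');

# ...whereas ignoring a failure takes a deliberate, visible eval block.
my $fallback = eval { parse_port('eighty') };
if ( !defined $fallback ) {
    warn "Using default port: $@";
    $fallback = 80;
}

print "port=$port fallback=$fallback\n";
```

Anyone reading the second call can see at a glance that a failure is expected and handled there; the first call makes no such promise, and does not need to.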

The locate_and_open() subroutine would be much cleaner and more robust if the errors within it threw exceptions:
    # Find and open a file by name, returning the filehandle
    # or throwing an exception on failure...
    sub locate_and_open {
        my ($filename) = @_;

        # Check acceptable directories in order...
        for my $dir (@DATA_DIRS) {
            my $path = "$dir/$filename";

            # If file exists in acceptable directory, open and return it...
            if (-r $path) {
                open my $fh, '<', $path
                    or croak( "Located $filename at $path, but could not open");
                return $fh;
            }
        }

        # Fail if all possible locations tried without success...
        croak( "Could not locate $filename" );
    }

    # and later...
    for my $filename (@source_files) {
        my $fh = locate_and_open($filename);
        my $head = load_header_from($fh);
        print $head;
    }
Notice that the main for loop didn't change at all. The developer using locate_and_open() still assumes that nothing can go wrong. Now there's some justification for that expectation, because if anything does go wrong, the thrown exception will automatically terminate the loop.

Exceptions are a better choice even if you are the careful type who religiously checks every return value for failure:
    SOURCE_FILE:
    for my $filename (@source_files) {
        my $fh = locate_and_open($filename);
        next SOURCE_FILE if !defined $fh;
        my $head = load_header_from($fh);
        next SOURCE_FILE if !defined $head;
        print $head;
    }
Constantly checking return values for failure clutters your code with validation statements, often greatly decreasing its readability. In contrast, exceptions allow an algorithm to be implemented without having to intersperse any error-handling infrastructure at all. You can factor the error-handling out of the code and either relegate it to after the surrounding eval, or else dispense with it entirely:
    for my $filename (@directory_path) {
        # Just ignore any source files that don't load...
        eval {
            my $fh = locate_and_open($filename);
            my $head = load_header_from($fh);
            print $head;
        }
    }
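The loop above takes the "dispense with it entirely" route. The other option, relegating the error handling to after the eval, might look like the following sketch. It is not from the original text: load_header() and the %header_for table are hypothetical stand-ins for locate_and_open() and load_header_from(), so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical stand-in for locate_and_open() plus load_header_from():
# returns a header string, or throws if the name is unknown.
my %header_for = ( 'a.dat' => 'HEADER A', 'c.dat' => 'HEADER C' );

sub load_header {
    my ($filename) = @_;
    die "Could not locate $filename\n" if !exists $header_for{$filename};
    return $header_for{$filename};
}

my ( @headers, @failures );

for my $filename (qw( a.dat b.dat c.dat )) {
    # The algorithm itself, free of error checks...
    my $succeeded = eval {
        push @headers, load_header($filename);
        1;    # The eval yields true only if nothing above threw
    };

    # ...and the error handling, gathered in one place after the eval
    if ( !$succeeded ) {
        chomp( my $error = $@ );
        push @failures, $error;
    }
}

print "Loaded: @headers\n";
print "Failed: @failures\n";
```

Ending the eval block with a true value and testing the result of the eval, rather than testing $@ alone, is a common precaution: it still detects the failure if $@ happens to be cleared before it is examined.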
9. Add New Test Cases Before You Start Debugging

The first step in any debugging process is to isolate the incorrect behavior of the system, by producing the shortest demonstration of it that you reasonably can. If you're lucky, this may even have been done for you:
    To: [email protected]
    From: [email protected]
    Subject: Bug in inflect module

    Zdravstvuite,

    I have been using your Lingua::EN::Inflect module to normalize terms in a
    data-mining application I am developing, but there seems to be a bug in it,
    as the following example demonstrates:

        use Lingua::EN::Inflect qw( PL_N );

        print PL_N('man'), "\n";      # Prints "men", as expected
        print PL_N('woman'), "\n";    # Incorrectly prints "womans"
Once you have distilled a short working example of the bug, convert it to a series of tests, such as:
    use Lingua::EN::Inflect qw( PL_N );
    use Test::More qw( no_plan );

    is(PL_N('man') ,  'men',   'man -> men'     );
    is(PL_N('woman'), 'women', 'woman -> women' );
Don't try to fix the problem straight away, though. Instead, immediately add those tests to your test suite. If that testing has been well set up, that can often be as simple as adding a couple of entries to a table:
    my %plural_of = (
        'mouse'         => 'mice',
        'house'         => 'houses',
        'ox'            => 'oxen',
        'box'           => 'boxes',
        'goose'         => 'geese',
        'mongoose'      => 'mongooses',
        'law'           => 'laws',
        'mother-in-law' => 'mothers-in-law',

        # Sascha's bug, reported 27 August 2004...
        'man'           => 'men',
        'woman'         => 'women',
    );
The point is: if the original test suite didn't report this bug, then that test suite was broken. It simply didn't do its job (finding bugs) adequately. Fix the test suite first by adding tests that cause it to fail:
    > perl inflections.t
    ok 1 - house -> houses
    ok 2 - law -> laws
    ok 3 - man -> men
    ok 4 - mongoose -> mongooses
    ok 5 - goose -> geese
    ok 6 - ox -> oxen
    not ok 7 - woman -> women
    #     Failed test (inflections.t at line 20)
    #          got: 'womans'
    #     expected: 'women'
    ok 8 - mother-in-law -> mothers-in-law
    ok 9 - mouse -> mice
    ok 10 - box -> boxes
    1..10
    # Looks like you failed 1 tests of 10.
Once the test suite is detecting the problem correctly, then you'll be able to tell when you've correctly fixed the actual bug, because the tests will once again fall silent.

This approach to debugging is most effective when the test suite covers the full range of manifestations of the problem. When adding test cases for a bug, don't just add a single test for the simplest case. Make sure you include the obvious variations as well:
    my %plural_of = (
        'mouse'         => 'mice',
        'house'         => 'houses',
        'ox'            => 'oxen',
        'box'           => 'boxes',
        'goose'         => 'geese',
        'mongoose'      => 'mongooses',
        'law'           => 'laws',
        'mother-in-law' => 'mothers-in-law',

        # Sascha's bug, reported 27 August 2004...
        'man'           => 'men',
        'woman'         => 'women',
        'human'         => 'humans',
        'man-at-arms'   => 'men-at-arms',
        'lan'           => 'lans',
        'mane'          => 'manes',
        'moan'          => 'moans',
    );
The more thoroughly you test the bug, the more completely you will fix it.
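The excerpt shows the test table but not the loop that consumes it. One plausible shape for such a table-driven test file is sketched below. The plural_noun() function is a hypothetical stand-in, written only so that the sketch runs without CPAN modules; a real suite would call PL_N() from Lingua::EN::Inflect instead.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the function under test (a real suite would
# call Lingua::EN::Inflect's PL_N() here).
sub plural_noun {
    my ($word) = @_;
    my %irregular = ( man => 'men', woman => 'women', ox => 'oxen' );
    return $irregular{$word} if exists $irregular{$word};
    return $word . 'es' if $word =~ m{ (?: x | s | ch | sh ) \z }xms;
    return $word . 's';
}

# The test table: reproducing a newly reported bug means adding rows here
my %plural_of = (
    'house' => 'houses',
    'ox'    => 'oxen',
    'box'   => 'boxes',
    'man'   => 'men',
    'woman' => 'women',
    'human' => 'humans',
    'mane'  => 'manes',
);

# One test per table entry, in a stable order
for my $singular ( sort keys %plural_of ) {
    is( plural_noun($singular), $plural_of{$singular},
        "$singular -> $plural_of{$singular}" );
}

done_testing();
```

With this layout the test logic never changes; only the data grows, which keeps the cost of recording each new bug report close to zero.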

10. Don't Optimize Code--Benchmark It

If you need a function to remove duplicate elements of an array, it's natural to think that a "one-liner" like this:
sub uniq { return keys %{ { map {$_=>1} @_ } } }
will be more efficient than two statements:
sub uniq { my %seen; return grep {!$seen{$_}++} @_; }
Unless you are deeply familiar with the internals of the Perl interpreter (in which case you already have far more serious personal issues to deal with), intuitions about the relative performance of two constructs are exactly that: unconscious guesses.

The only way to know for sure which of two--or more--alternatives will perform better is to actually time each of them. The standard Benchmark module makes that easy:
    # A short list of not-quite-unique values...
    our @data = qw( do re me fa so la ti do );

    # Various candidates...
    sub unique_via_anon {
        return keys %{ { map {$_=>1} @_ } };
    }

    sub unique_via_grep {
        my %seen;
        return grep { !$seen{$_}++ } @_;
    }

    sub unique_via_slice {
        my %uniq;
        @uniq{@_} = ();
        return keys %uniq;
    }

    # Compare the current set of data in @data
    sub compare {
        my ($title) = @_;
        print "\n[$title]\n";

        # Create a comparison table of the various timings, making sure that
        # each test runs at least 10 CPU seconds...
        use Benchmark qw( cmpthese );
        cmpthese -10, {
            anon  => 'my @uniq = unique_via_anon(@data)',
            grep  => 'my @uniq = unique_via_grep(@data)',
            slice => 'my @uniq = unique_via_slice(@data)',
        };

        return;
    }

    compare('8 items, 10% repetition');

    # Two copies of the original data...
    @data = (@data) x 2;
    compare('16 items, 56% repetition');

    # One hundred copies of the original data...
    @data = (@data) x 50;
    compare('800 items, 99% repetition');
The cmpthese() subroutine takes a number, followed by a reference to a hash of tests. The number specifies either the exact number of times to run each test (if the number is positive), or the absolute number of CPU seconds to run the test for (if the number is negative). Typical values are around 10,000 repetitions or ten CPU seconds, but the module will warn you if the test is too short to produce an accurate benchmark.

The keys of the test hash are the names of your tests, and the corresponding values specify the code to be tested. Those values can be either strings (which are eval'd to produce executable code) or subroutine references (which are called directly).

The benchmarking code shown above would print out something like the following:
    [8 items, 10% repetition]
             Rate  anon  grep slice
    anon  28234/s    --  -24%  -47%
    grep  37294/s   32%    --  -30%
    slice 53013/s   88%   42%    --

    [16 items, 56% repetition]
             Rate  anon  grep slice
    anon  21283/s    --  -28%  -51%
    grep  29500/s   39%    --  -32%
    slice 43535/s  105%   48%    --

    [800 items, 99% repetition]
            Rate  anon  grep slice
    anon   536/s    --  -65%  -89%
    grep  1516/s  183%    --  -69%
    slice 4855/s  806%  220%    --
Each of the tables printed has a separate row for each named test. The first column lists the absolute speed of each candidate in repetitions per second, while the remaining columns allow you to compare the relative performance of any two tests. For example, in the final test, tracing across the grep row to the anon column reveals that the grepped solution was 183 percent faster than the anonymous-hash solution (that is, it ran at about 2.8 times the rate). Tracing further across the same row also indicates that grepping was 69 percent slower (-69 percent faster) than slicing.
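The percentages in the table are consistent with a simple ratio of the two rates. The sketch below recomputes a few of them from the figures in the 800-item table; the relative_speed() helper is hypothetical and is not claimed to be how the Benchmark module itself is implemented.

```perl
use strict;
use warnings;

# Rates (repetitions per second) copied from the 800-item table
my %rate = ( anon => 536, grep => 1516, slice => 4855 );

# Percentage by which the row candidate is faster (+) or slower (-)
# than the column candidate
sub relative_speed {
    my ( $row, $column ) = @_;
    return sprintf '%.0f%%', ( $rate{$row} / $rate{$column} - 1 ) * 100;
}

print 'grep vs anon:  ', relative_speed( 'grep', 'anon' ),  "\n";    # 183%
print 'grep vs slice: ', relative_speed( 'grep', 'slice' ), "\n";    # -69%
```

Since 1516 / 536 is roughly 2.83, "183 percent faster" means running at about 2.8 times the rate, not 1.83 times.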

Overall, the indication from the three tests is that the slicing-based solution is consistently the fastest for this particular set of data on this particular machine. It also appears that as the data set increases in size, slicing also scales much better than either of the other two approaches.

However, those two conclusions are effectively drawn from only three data points (namely, the three benchmarking runs). To get a more definitive comparison of the three methods, you'd also need to test other possibilities, such as a long list of non-repeating items, or a short list with nothing but repetitions.

Better still, test on the real data that you'll actually be "unique-ing."

For example, if that data is a sorted list of a quarter of a million words, with only minimal repetitions, and which has to remain sorted, then test exactly that:
    our @data = slurp '/usr/share/biglongwordlist.txt';

    use Benchmark qw( cmpthese );

    cmpthese 10, {
        # Note: the non-grepped solutions need a post-uniqification re-sort
        anon  => 'my @uniq = sort(unique_via_anon(@data))',
        grep  => 'my @uniq = unique_via_grep(@data)',
        slice => 'my @uniq = sort(unique_via_slice(@data))',
    };
Not surprisingly, this benchmark indicates that the grepped solution is markedly superior on a large sorted data set:
          s/iter  anon slice  grep
    anon    4.28    --   -3%  -46%
    slice   4.15    3%    --  -44%
    grep    2.30   86%   80%    --
Perhaps more interestingly, the grepped solution still benchmarks as being marginally faster when the two hash-based approaches aren't re-sorted. This suggests that the better scalability of the sliced solution as seen in the earlier benchmark is a localized phenomenon, and is eventually undermined by the growing costs of allocation, hashing, and bucket-overflows as the sliced hash grows very large.

Above all, that last example demonstrates that benchmarks only benchmark the cases you actually benchmark, and that you can only draw useful conclusions about performance from benchmarking real data.
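As noted in the description of cmpthese(), the test values may be subroutine references rather than strings. A minimal sketch of that form follows. Two things in it are additions not found in the original text: the sanity check that the candidates agree before they are timed, and the arbitrary repetition count of 50,000.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

my @data = qw( do re me fa so la ti do );

sub unique_via_grep {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

sub unique_via_slice {
    my %uniq;
    @uniq{@_} = ();
    return keys %uniq;
}

# Sanity check first: a fast candidate that gives a different answer is
# not worth timing. Sort before comparing, because only the grep version
# preserves the original order.
my @via_grep  = sort { $a cmp $b } unique_via_grep(@data);
my @via_slice = sort { $a cmp $b } unique_via_slice(@data);
die "Candidates disagree\n" if "@via_grep" ne "@via_slice";

# Subroutine references are called directly and, unlike strings, can see
# lexical (my) variables such as @data.
cmpthese( 50_000, {
    grep  => sub { my @uniq = unique_via_grep(@data) },
    slice => sub { my @uniq = unique_via_slice(@data) },
} );
```

The explicit sort block is deliberate: sort followed directly by a subroutine name can be parsed as "sort using this comparison routine", a trap described in the perlfunc entry for sort.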

[Oct 17,2003] Part 13 Perl Style by Dan Richter

Part 13: Perl Style

1) Introduction

2) The Many Faces of Perl

3) The Special File Handle "ARGV"

4) Exercise

5) [Non-]Answer to Previous Exercise

6) Past Information

7) Credits

8) Licensing

1) Introduction

An important concept in Perl is that "There Is More Than One Way To Do It" (TIMTOWTDI, pronounced "Tim Toady"). We have seen this to some degree already. A simple example is "tr///" and "y///", which mean exactly the same thing. As another example, "unless" is the same as "if not". Likewise, "s///" can usually do the job of "tr///".

This is one of the ways in which Perl is similar to human languages, which have many different words to express similar concepts (as a quick look through a thesaurus will demonstrate). Perl borrows other ideas from human languages as well. One simple example is the way "if" can be placed before or after the command to be conditionally executed. In addition, the implied use of "$_" is foreign to English speakers, but implied subjects are common in other languages such as Latin and Chinese.
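The equivalences mentioned so far can be tried out directly. The following is just an illustrative sketch; the variable names are invented for the example.

```perl
use strict;
use warnings;

# tr/// and y/// are the same operator under two names
( my $upper_tr = 'perl' ) =~ tr/a-z/A-Z/;
( my $upper_y  = 'perl' ) =~ y/a-z/A-Z/;

# "unless" and "if not" express the same condition
my @log;
my $debug = 0;
push @log, 'quiet (unless)' unless $debug;
push @log, 'quiet (if not)' if not $debug;

# "if" may come before or after the statement it controls
my $count = 3;
if ( $count > 2 ) { push @log, 'many (prefix if)'; }
push @log, 'many (postfix if)' if $count > 2;

# The implied $_ : both loops do the same thing
my @shouted;
for (qw( a b )) { push @shouted, uc; }    # uc defaults to $_
for my $word (qw( a b )) { push @shouted, uc $word; }

print "$upper_tr $upper_y\n";
print "$_\n" for @log;
print "@shouted\n";
```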

Larry Wall, who invented Perl, is fascinated with human languages. He even studied them in graduate school - and for an interesting reason: "At the time, [my wife and I] were actually planning to be missionaries (more specifically, Bible translators), but we had to drop that idea for health reasons." But Wall doesn't regret the change of plans: he figures that Perl is actually more useful to the missionaries than he personally could have been.

Perl's similarity to a human language goes hand-in-hand with its ability to do many different types of jobs. As Jon Udell puts it, "[Larry] Wall [inventor of Perl] believes that people think about things in different ways, that natural languages accommodate many mindsets, and that programming languages should too."

We're now going to see a bit of Perl's human-language-like versatility.

2) The Many Faces of Perl

As we have seen, Perl commands can be written as functions or statements.
   die("Error!") unless defined($foo);   # Look like functions.
   die "Error!"  unless defined $foo;    # Look like statements.

This changes the appearance only: the functionality is exactly the same.
More interestingly, the following code looks like it comes from a shell script, but it is actually valid Perl code:
   (-d $dir) || mkdir $dir;    # Create directory if it doesn't exist.
   $size=`wc -c $dir/*`;       # Get sizes of files.
   print <<EOF;
      Looks like
      everything's OK!
   EOF

Yes, Perl borrows "-d", "mkdir", back-ticks and "here-documents" (the "<<EOF" part) from your favourite shell.
Of course, the similarity is limited. Functions like "mkdir" are Perl built-ins; you cannot simply write an arbitrary shell command as a Perl statement. (That's why we had to use back-ticks to execute "wc".) In addition, you can't use "<", ">" and "|" to pipe I/O the way you would in a shell script (except when using "open", as we saw last week).
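Besides back-ticks, Perl can run an external command with "system" or read its output through a piped "open". The sketch below is only an illustration and assumes a Unix-like system; it uses $^X (the path of the running perl) as the external command simply because that program is certain to exist.

```perl
use strict;
use warnings;

# $^X is the path of the running perl; any other program could take
# its place here.
my @command = ( $^X, '-e', 'print qq{hello\n}; exit 0' );

# 1. system: run a command and get its exit status back
#    (the child's output goes straight to our own standard output)
my $status = system(@command);

# 2. Piped open: read the command's output as if it were a file
#    (the list form bypasses the shell, so no quoting is needed)
open my $pipe, '-|', @command
    or die "Cannot run $command[0]: $!";
my @lines = <$pipe>;
close $pipe
    or die "Command failed with status $?";

print "system status: $status\n";
print "captured: $_" for @lines;
```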

But I still think that it's impressive that a Perl program can be written like C or like a shell script.

3) The Special File Handle "ARGV"
Okay, I admit that this section isn't too related to Perl style, but I had to put it somewhere!
"ARGV" is a special file handle which sequentially opens (for reading) all the files specified on the command-line (i.e., every command-line argument is interpreted as a file name and is opened). The files are read in the order in which they were specified on the command-line and as though the contents formed one big file (i.e., when you reach the end of a file, Perl automatically closes it and opens the next). The variable "$ARGV" always contains the name of the file that is currently being read by "<ARGV>".

In practice, you never see "<ARGV>". Instead, you see it written as "<>", because "ARGV" is the default file handle for reading if none is specified.

For example, consider the following code:
   while ( <> ) {    # while ( defined ( $_ = <ARGV> ) ) {
      print;         #    print $_;
   }
If you store this code in a file named "test.pl" and run:
  perl test.pl file1 file2 file3
the program will output the contents of "file1", then the contents of "file2", then the contents of "file3".
A dash ("-") given as a command-line argument refers to standard input. Also, if no command-line arguments are given at all, standard input is assumed.
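To see "$ARGV" in action, here is a small sketch that labels every line with the name of the file it came from. The labelled_lines() helper is hypothetical; the "close ARGV if eof" idiom is the one described in the perlfunc entry for eof.

```perl
use strict;
use warnings;

# Hypothetical helper: reads every file named in @ARGV through <>
# and returns each line labelled with its file name and line number.
sub labelled_lines {
    my @labelled;
    while ( my $line = <> ) {
        push @labelled, "$ARGV:$.: $line";
    }
    continue {
        # Closing ARGV at the end of each file restarts $. at 1;
        # without this the line numbers run on across files.
        close ARGV if eof;
    }
    return @labelled;
}

# Only run when files were named, rather than wait on standard input
print labelled_lines() if @ARGV;
```

Run as "perl test.pl file1 file2", this would print each line of "file1" and then "file2", each prefixed with its own file name and a line number that starts again at 1 for every file.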

The road to better programming Introduction and chapter 1 by Teodor Zlatanov

November 1, 2001

Teodor Zlatanov

Programmer, Gold Software Systems

The success or failure of any software programming group depends largely on its ability to work together as a team. From manager to members, to well-conceived, yet dynamic guidelines, the team as a whole is defined by the unison of its parts. Shattering the myth of the faultless programmer, Teodor dismantles the uninspired software group and then builds it up again into a synchronized, energized ensemble.

Welcome to a series of articles on developerWorks comprising a complete guide to better programming in Perl. In this first installment, Teodor introduces his book and looks at coding guidelines from a fresh perspective.

This is the book for the beginner to intermediate Perl programmer. But even an advanced Perl programmer can find the majority of the chapters exciting and relevant, from the tips of Part I to the project management tools presented in Part II to the Parse::RecDescent source code analysis scripts in Part III.

The words program and script are used interchangeably. In Perl, the two mean pretty much the same thing. A program can, indeed, be made up of many scripts, and a script can contain many programs, but for simplicity's sake, we will use the two terms with the understanding that one script file contains only one program.

Goals of the book

Part I is full of tips to improve your Perl skills, ranging from best programming practices to code debugging. It does not teach you Perl programming. There are many books with that purpose, and they would be hard to surpass in clarity and completeness.

Part II will teach you how a small Perl software team can be better managed with the standard tools of software project management. Often, Perl programmers embody the "herd of cats" view of software teams. Part II will apply project management tools to a small (2- to 6-person) Perl development team, and will examine how managing such a team successfully is different from the classic project management approach.

Part III will develop tools to analyze source code (Perl and C examples will be developed) and to help you manage your team better. Analysis of source code is superficial at best today, ranging from the obvious and irrelevant "lines of code" metrics to function points (see Resources later in this article), which do not help in understanding the programmer's mindset. Understanding the programmer's mindset will be the goal of Part III. Tools will be developed that help track metrics such as comment legibility and consistency, repetitiveness of code, and code legibility. These metrics will be introduced as a part of a software project, not its goal.

There is no perfection in programming, only its pursuit. Good programmers learn something new every day and continually improve their skills and technique. Rigidity and inflexibility are forever the enemy of ingenuity and creativity.

In pursuit of perfection

The most common mistake a programmer can make is not in the list of bugs for his program. It is not a function of the programmer's age or language of choice. It is, simply, the assumption that his abilities are complete and there is no room for improvement.

Arguably, such is human nature; but I would argue that human nature is always on the prowl for knowledge and improvement. Only hubris and the fear of being proven wrong hold us back. Resisting them both will not only make a better programmer, but a better person as well.

The social interactions and the quality of the people, I believe, are what create successful software teams more than any other factors. Constant improvement in a programmer's skills and the ability to take criticism are fundamental requirements for members of a software team. These requirements should precede all others.

Think back to the last time you changed your style. Was it the new algorithm you learned, or commenting style, or simply a different way of naming your variables? Whatever it was, it was only a step along the way, not the final change that made your code complete and perfect.

A programmer shouldn't be required to follow precise code guidelines to the letter; nor should he improvise those guidelines to get the job done. Consider an orchestra -- neither static, soulless performers nor wildly improvisational virtuosos (though the latter is more acclaimed). A static performer simply follows the notes without putting effort and soul into the music; the virtuoso must restrain herself from errantly exploring new pieces of the melody or marching to the beat of her own drum.

Striking a concordant tone

Code guidelines are like the written directions a musician follows -- when to come on, when to come off, how fast to play, what beat, etc. The notes themselves, to extend the analogy somewhat precariously, are the goals of the project -- sometimes lone high notes, and sometimes a harmony of instruments.

In an orchestra, there is a conductor that directs but does not tell every musician how to play, and everyone has a part in the performance. The conductor creates harmony. Because music has been around for many more centuries than the art of programming, perhaps these are lessons well worth learning. The software project manager is neither a gorilla nor a walled-off convict. She is a part of the team just like everyone else.

The guidelines presented in this series are not to be blindly extracted into an official coding policy. The coding standards in your project are uniquely yours, and they reflect your very own orchestral composition. Don't force programmers to do things exactly right, thereby creating an atmosphere of distrust and fear. You can forget about code reviews, or admission of responsibility for the smallest bugs.

Instead, present the guidelines and watch how people react. If no one adopts the comment format you like, perhaps it's a bad format. If people write without cleverness, perhaps you have been too clever in the guidelines. If the debugger you thought everyone must run is sitting in a dusty room, still packed, then rethink the need for Whizzo Debugger 3.4. Maybe everyone is happy with Acme Debugger 1.0 for a reason.

Of course, programmers can be stubborn for no reason at all, only out of reluctance to change. It's hard to convince people that 20 years of experience do not entitle them to an organized religion. On the other hand, freshly minted college graduates often lack self-confidence. Recognize and adapt to those characteristics, and to all the others of your team. Present ideas to the stubbornly experienced in such a way that they feel they have helped with it. Build up the college graduates with guidance and support until they can fly on their own.

All this, just for a few coding guidelines?

Coding guidelines are fundamental to a software team, just as direction and harmony are to music. They create consistency and cohesiveness. New team members will feel welcome and gel more quickly. Ye olde team members will accept newcomers more readily. The loss of a team member will not cripple the project just because someone can't understand someone else's code.

Keep in mind that speed is not the only measure of improvement in a program's code. Consider ease of testing, documentation, and maintenance just as important to any software project, especially for the long term. A language as flexible as Perl facilitates good coding in every stage of the software project. Although this book focuses on Perl, many of the principles are valid for other languages such as C, C++, Java, and Python.

Finally, be an innovator. Regardless of your position in the team -- manager or member -- always look for new ideas and put them into action. Perfection may be impossible, but it's a worthy goal. Innovators are the true strength of a team and without them the melody grows stale very quickly. Stay in touch with your peers; continually learn new things from them. A medium such as Usenet (see Resources) is a great place for an exchange of ideas. Teach and learn, to and from each other. Remember, there's always room for improvement. Above all, have fun, and let the music begin.

Resources

Read more chapters of The road to better programming.

Read Applied Software Measurement: Assuring Productivity and Quality, 2nd edition (McGraw-Hill), by Capers Jones.

Read the Function Points FAQ, and check out the International Function Points User Group.

Read all you ever wanted to know about Usenet.

Read Teodor's other Perl articles in the developerWorks "Cultured Perl" series:

A programmer's Linux-oriented setup

Application configuration with Perl

Automating UNIX system administration with Perl

Debugging Perl with ease

The elegance of JAPH

Genetic algorithms applied with Perl

One-liners 101

Parsing with Perl modules

Perl 5.6 for C and Java programmers

Reading and writing Excel files with Perl

Review of Programming Perl, Third Edition

Small observations about the big picture

Writing Perl programs that speak English

Browse more Linux resources on developerWorks.

Browse more Open source resources on developerWorks.

About the author

Teodor Zlatanov graduated with an M.S. in computer engineering from Boston University in 1999. He has worked as a programmer since 1992, using Perl, Java, C, and C++. His interests are in open source work on text parsing

The road to better programming

The road to better programming: Introduction and chapter 1

The road to better programming: Chapter 2

The road to better programming: Chapter 3

The road to better programming: Chapter 4

The road to better programming: Chapter 5

The road to better programming: Chapter 6. Developing cfperl, from the beginning

The road to better programming: Chapter 7. Top-level control flow and configuration

The road to better programming: Chapter 8. The top-level and compound-class parsers

The road to better programming: Chapter 9. The classes and default parsers

The road to better programming: Chapter 10. User management with cfperl

The road to better programming: Chapter 11. Crontab management with cfperl

The road to better programming: Chapter 12. File editing with the perledit- section

Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense, so you need to be aware of the Google privacy policy. If you do not want to be tracked by Google, please disable Javascript for this site. This site is perfectly usable without Javascript.

Last updated: September 30, 2020

Perl Style: defensive programming in Perl

Larry Wall guidance (from Perl Style documentation)

[Sep 30, 2020] Postfix if conditions are useful mainly to specify an exit condition in a loop, rarely elsewhere

Sep 30, 2020 | perlmonks.org
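
A minimal sketch of the point made in this entry's title, not code taken from the linked discussion: postfix `if` reads naturally when it guards a loop-control statement such as `last` or `next`. The input list and the `__END__` marker are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input; the '__END__' marker is an assumption for this sketch.
my @lines = ('alpha', '# comment', 'beta', '__END__', 'gamma');

my @kept;
for my $line (@lines) {
    last if $line eq '__END__';    # exit condition: stop at the end marker
    next if $line =~ /^#/;         # skip comment lines
    push @kept, $line;
}

print join(',', @kept), "\n";      # prints "alpha,beta"
```

The loop-control keyword comes first on the line, so a reader scanning the left margin sees every way out of the loop without having to open a block.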

[Sep 29, 2020] At what point does excessive syntactic flexibility stimulate a perverted programming style, reflected in the derogatory term "complexity junkies"?

Notable quotes:

"... In your private role you are free to do whatever you wish. After all programming open source is about fun, not so much about discipline. ..."

"... That's why languages that allow too much syntactic freedom are generally not welcomed in large commercial projects, even if they are able to manage large namespaces more or less OK. ..."

Sep 29, 2020 | perlmonks.org

[Nov 21, 2019] Tux.nl - Style and Layout

Nov 21, 2019 | tux.nl

[Nov 29, 2017] How can I have variable assertions in Perl

Nov 29, 2017 | stackoverflow.com

[Nov 17, 2017] Bruce Gray - Your Perl 5 Brain, on Perl 6 by Bruce Gray

Nov 17, 2017 | www.youtube.com

[Nov 13, 2017] Perl's Worst Best Practices by Daina Pettit

Some good critique

[Jan 26, 2012] Appendix B Perl Best Practices - O'Reilly Media

These are very questionable recommendations and should be taken with a grain of salt.

[Oct 27, 2011] How can I convert Perl code into HTML with syntax highlighting?

vim can do it.

Jun 16, 2002 | www.perlmonks.org

[July 14, 2005] Perl.com- Ten Essential Development Practices by Damian Conway

Beware, Damian Conway is an OO fundamentalist...

1. Design the Module's Interface First

2. Write the Test Cases Before the Code

3. Create Standard POD Templates for Modules and Applications

4. Use a Revision Control System

5. Create Consistent Command-Line Interfaces

6. Agree Upon a Coherent Layout Style and Automate It with perltidy

7. Code in Commented Paragraphs

8. Throw Exceptions Instead of Returning Special Values or Setting Flags

9. Add New Test Cases Before you Start Debugging

10. Don't Optimize Code--Benchmark It
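
Practice 8 in the list above (throw exceptions instead of returning special values or setting flags) can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the Perl.com article: the `parse_pair` helper and its "key=value" input format are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: parse a "key=value" pair. On malformed input it dies
# instead of returning undef or setting a flag the caller might ignore.
sub parse_pair {
    my ($text) = @_;
    my ($key, $value) = $text =~ /^(\w+)=(.*)$/
        or die "Malformed pair: '$text'\n";
    return ($key, $value);
}

# The caller decides where failures are handled.
my @results;
for my $input ('color=red', 'no-equals-sign') {
    my @pair;
    # eval returns 1 only if parse_pair did not die.
    my $ok = eval { @pair = parse_pair($input); 1 };
    if (!$ok) {
        push @results, 'error';
        next;
    }
    push @results, "$pair[0] => $pair[1]";
}

print "$_\n" for @results;    # prints "color => red" then "error"
```

A forgotten check on a special return value lets bad data flow on silently, while an unhandled `die` stops the program at the point of failure. That is the defensive argument for this practice.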

[Oct 17, 2003] Part 13 Perl Style by Dan Richter

2) The Many Faces of Perl

3) The Special File Handle "ARGV"

The road to better programming Introduction and chapter 1 by Teodor Zlatanov

November 1, 2001
