Softpanorama
May the source be with you, but remember the KISS principle ;-)
Control structures are one of the strong points of Perl. Perl improves and extends C control structures in two major ways. First, after if, while, etc. there must always be a block (a set of statements in curly braces). That's just a logical correction of a C blunder ;-)
In Perl one should always use a block of statements after all control structures.
Second, it introduces several new variants of traditional control structures (such as unless) and new loop control statements (redo, etc.). We will discuss this in more detail in Ch. 6. Right now we will provide a brief overview of control structures. Sadly enough, Perl redefined several of the C loop control keywords, but we will discuss this later.
Like C, Perl does not have a logical (Boolean) data type. An interesting innovation in C was that arithmetic expressions in conditional statements are evaluated and then converted to an integer value. In C, if the result of expression evaluation is zero then it's false, and if the result is not zero it's true.
Perl continues this C tradition with numbers and tries to extend it to strings. If an expression evaluates to a string then Perl treats it differently. In this case a string taken as a logical expression behaves roughly like an integer equal to the length of the string: the empty string is considered false and all others true. But there is an important exception: the string "0" is treated as false. I do not understand the real reasons behind this decision, but you should be aware of it...
Not only the empty string, but also the string "0" (with length one) is treated as false in Perl.
Like in C a variable is acceptable as an expression in Perl. In this case the value of the variable is considered to be the value of expression. Like in C, you can write if ($a) {...}. Does this pay in terms of programmer convenience versus additional bugs? Who knows. This C innovation proved to be a mixed blessing...
But generally Perl trades possible subtle errors for (probably incorrectly understood :-) convenience of programmer.
All in all Perl is probably the only language in which there are three different cases in which an expression evaluates to false.
1. An expression is false if it is considered numeric and evaluates to zero.
2. An expression is false if it is considered a string and evaluates to an empty string ('') or the string zero ("0").
3. An expression is false if it evaluates to an empty list.
Corollary: An expression is false if it evaluates to the value undef.
Note that Perl does not convert a string to a number when the string itself is used as a logical expression. With the above-noted exception of the string "0", only the emptiness of the string matters. That means that, paradoxically, if ("00") {... } evaluates to true just because it is a non-empty string other than "0". All strings other than '0' and the empty string (and undef) always evaluate to true, so if ("a") {...} would be true too.
If we discount the rather strange idea of adding the "0" string to the false list, it generally looks like a more or less logical generalization of the idea used in the C language. C essentially substitutes the integer data type for a Boolean data type and treats 0 as false and any other integer value as true. Similarly, strings of zero length can be treated as false and strings of non-zero length as true. Here are examples of the three cases above and of the undef corollary:
1. Expression is false if it is considered numeric and evaluates to zero
if (0) { } # false
if ("0.0") { } # true because it is a string and its length is more than one.
if ("abba") { } # true
2. Expression is false if it is considered a string and evaluates to an empty string ('') or the string zero ("0")
$a='';
if ($a) {...} # false
3. Expression is false if it evaluates to an empty list. We will discuss this later
@X=();
if (@X) {...} # we will discuss this later
4. Expression is false if it evaluates to the special value undef. This is because in numeric context undef is converted to zero and in string context it is converted to an empty string. Both values correspond to false.
if ($a) {...} # if we assume that $a was not used before, then this expression will be false
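The rules above can be checked with a short script. This is only a sketch: is_true is a hypothetical helper introduced here for illustration, and the sample values are chosen to cover the cases listed above.

```perl
use strict;
use warnings;

# is_true is a hypothetical helper: it reports (as 1 or 0) how Perl
# treats a scalar value when it is used as a condition.
sub is_true {
    my ($value) = @_;
    return $value ? 1 : 0;
}

my @samples = (
    [ 'number 0',      0      ],
    [ 'number 0.0',    0.0    ],
    [ 'string "0"',    "0"    ],
    [ 'empty string',  ""     ],
    [ 'undef',         undef  ],
    [ 'string "00"',   "00"   ],
    [ 'string "0.0"',  "0.0"  ],
    [ 'string "abba"', "abba" ],
);

for my $sample (@samples) {
    my ($label, $value) = @$sample;
    printf "%-14s is %s\n", $label, is_true($value) ? "true" : "false";
}

# An array used as a condition is true only if it has elements.
my @empty = ();
printf "%-14s is %s\n", 'empty array', @empty ? "true" : "false";
```

Only the first five samples and the empty array come out false; "00" and "0.0" are true even though they look like zero.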
Perl adopted the design decision to use separate sets of comparison operators for numbers and strings instead of casting. A cast-based design could, for example, have changed the comment symbol to // (giving up its use for regular expressions too) and used the prefix # for scalars that need to be converted to numeric, as in if (#a==#b), so that a separate operator like eq would not be needed. It would also be nice to be able to control conversion to integer or to double float with type declaration statements like in ksh93. Currently one needs to use the pragma use integer for this. But that's fantasy, so let's return to the reality that we need to face. String comparison operators in Perl are different from numeric comparison operators:
gt means greater than, lt is less than, eq is equal, and ne is not equal (cmp is a special three-way comparison useful in sorting and searching). These string comparison operators compare values character by character, in a dictionary-like order based on character codes. Any scalar operands are converted to strings before comparison.
So please remember that if you want to compare two numbers you should use == and to compare two strings you should use eq. This unfortunate dichotomy (or assembler-style solution, if you wish) is a source of endless errors in Perl programs. Here are some examples of comparisons of numbers and strings.
($a == $b) # Is numerical representation of $a(if any)
# equal to numerical representation of $b (in double float)
($a=$b) # Beware frequent mistake: Don't use the = operator.
($a != $b) # Is $a converted to double float unequal to $b?
($a eq $b) # Is string $a equal to string $b?
($a == "abba") # Common error: using == instead of eq
($a ne $b) # Is string $a unequal to string $b?
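A small sketch of why the choice of operator matters: the same values ordered with the numeric operators and with the string operators. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @values = (10, 9, 100, 1);

# <=> compares as numbers, cmp compares as strings.
my @as_numbers = sort { $a <=> $b } @values;
my @as_strings = sort { $a cmp $b } @values;

print "numeric order: @as_numbers\n";   # 1 9 10 100
print "string order:  @as_strings\n";   # 1 10 100 9

# The same difference shows up in a single comparison.
my $numeric_less = (10 < 9)      ? 1 : 0;   # 0: ten is not less than nine
my $string_less  = ("10" lt "9") ? 1 : 0;   # 1: "1" sorts before "9"
print "10 < 9 is $numeric_less, \"10\" lt \"9\" is $string_less\n";
```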
All regular Boolean operations from C work as well. The three principal Boolean operations are ! ("NOT"), && ("AND") and || ("OR"). The following table shows the results of applying the NOT, AND and OR operations to truth values, with 0 for false and 1 for true (AND and OR are commutative):
| !(NOT) | &&(AND) | ||(OR) |
| !0=1 !1=0 | 0&&0=0 0&&1=0 1&&0=0 1&&1=1 | 0||0=0 0||1=1 1||0=1 1||1=1 |
For a summary of Boolean algebra see Boolean Algebra. and Venn Diagrams for Boolean Logic, a color-illustrated explanation of Boolean Searching. See also explanation of Boolean Algebra in terms of set theory.
($a && $b) # Is logical expression $a and expression $b true?
($a || $b) # Is either expressions $a or $b true?
!($a) # is expression $a false?
Both && and || are short-circuit operators. If the first operand evaluates to false in && (AND) or to true in || (OR), then the second operand is never evaluated.
Another very useful innovation in Perl is that the if statement (and some loops, see below) has two symmetrical syntaxes: one that executes the then part when the condition is true (the regular if statement) and one that executes it when the condition is false (the reversed if statement, unless). The first variant below is a regular if statement. The second is the same statement with the logical condition reversed: the then part is executed if the condition is false, so unless (.....) is equivalent to if (!(.....)). This is a useful extension, but I would prefer the keyword ifnot.
if (expression) {
    # Code block executed
    # if condition is true.
} else {
    # Code block executed
    # if condition is false.
}

unless (expression) {
    # Code block executed
    # if condition is false.
} else {
    # Code block executed
    # if condition is true.
}
Many operations in Perl, including assignment and input, are expressions that return a value, so you can put them in the condition of an if or a loop. This is often used as a shorthand for input operations, and later you will see a lot of idioms like while(<STDIN>) { ... } that use this possibility.
Perl improved the syntax of if-then-else statements in comparison with C: it does not accept a single statement in the then or else clause and always requires a group of statements in curly braces. That is a really good decision, as it prevents the error in which a programmer in a hurry adds a second statement thinking it will be in the then (or else) clause, while in reality it ends up outside because without braces the clause consists of a single statement:
if ($cost > 100) {
    print "This item is too expensive\n";
} else {
    print "Price is OK. We need to think about it\n";
}
The curly braces around the statement block are not optional in Perl as they are in C. Even one-line statement blocks must be surrounded by curly braces.
That means that parentheses around the conditional expression are redundant, but currently for some unclear reason Perl does require them.
Parentheses around conditional expressions are not optional. It's an error to omit them.
The important thing to remember is that Perl has two sets of comparison operators: one for numbers and one for strings. That creates a lot of problems for novices, but here the -w flag can help to detect these unpleasant errors.
The main problem with Perl is that you may want an operand to be interpreted as a string, but Perl will decide otherwise. In all numeric comparisons both operands are interpreted as numeric values. Please try to run the following two-line script:
$answer="No";
if ($answer == "Yes") { print "The answer equals Yes";}
Now you will see that you are in trouble. Both variable $answer and literal "Yes" will evaluate to numbers (zero in both cases) and comparison will always be true for any non-number value of the variable $answer.
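The effect can be made visible with warnings enabled. The sketch below assumes a second variable, $expected, in place of the literal "Yes", and collects the warnings in an array through a __WARN__ handler instead of letting them go to the terminal; both are choices made for this illustration only.

```perl
use strict;
use warnings;

my $answer   = "No";
my $expected = "Yes";

my @warnings;                      # messages Perl would normally print to STDERR
my ($numeric_equal, $string_equal);
{
    # Collect warnings locally so that they can be counted and shown.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    $numeric_equal = ($answer == $expected) ? 1 : 0;   # both sides become 0
    $string_equal  = ($answer eq $expected) ? 1 : 0;   # compared as text
}

print "== says: ", ($numeric_equal ? "equal" : "different"), "\n";
print "eq says: ", ($string_equal  ? "equal" : "different"), "\n";
print "warning: $_" for @warnings;
```

The numeric comparison reports the two strings as equal, while the warnings point out that neither argument is numeric.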
Empty string or uninitialized string (with value undef converted to an empty string) are considered to be false in Perl.
No else if style nesting is allowed. Cascading if statements can be created using the keyword elsif:
if ($name eq "TCP"){ # The first comparison (eq, not =, which would assign)
    print "This is a TCP/IP transmission protocol\n";
}
elsif ($name eq "SSL"){ # The second comparison
    print "This is a security protocol over TCP/IP\n";
}
elsif ($name eq "IPSEC"){ # The third comparison
    print "This is IP extension for secure connections\n";
}
else { # Default case- all conditions failed
    print "Unknown protocol";
}
IMHO by using the elsif keyword Larry Wall was reinventing the bicycle. It should really be elif ;-).
If you are using several cascading elsif clauses with all conditions being of the same type (for example checking for equality), you can make your code clearer by putting the values into an array. For example:
#Wrong way to perform the operation. Adapted from Medinets book
# Initialize $month to 2.
# If the value of $month is 1, then print January.
# If the value of $month is 2, then print February.
# If the value of $month is 3, then print March.
# For every other value of $month, print a message.
$month = 2;
if ($month == 1) {
print("January\n");
}
elsif ($month == 2) {
print("February\n");
}
elsif ($month == 3) {
print("March\n");
}
else {
print("Not one of the first three months\n");
}
Actually one should program this something like the following (note the usage of qw to simplify creation of a quoted list of words; the words are separated by whitespace, not commas):
@year=qw(January February March);
if ($month >= 1 && $month <= @year) {print "$year[$month-1]\n";}
else {print "wrong index $month -- not in 1..3 range\n";}
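The same table-driven idea works when the tested values are strings rather than small integers; a hash then plays the role of the array. The sketch below reuses the protocol descriptions from the elsif example; describe_protocol and %description are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Lookup table that replaces a cascade of "elsif ($name eq ...)" tests.
my %description = (
    TCP   => "This is a TCP/IP transmission protocol",
    SSL   => "This is a security protocol over TCP/IP",
    IPSEC => "This is IP extension for secure connections",
);

# describe_protocol is a hypothetical helper: it returns the description
# for a known protocol name and a default text for anything else.
sub describe_protocol {
    my ($name) = @_;
    return exists $description{$name} ? $description{$name}
                                      : "Unknown protocol";
}

print describe_protocol($_), "\n" for qw(TCP SSL HTTP);
```

Adding a protocol now means adding one line to the table instead of another elsif branch.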
Ternary ``?:'' is the conditional operator, just as in C. It works much like an if-then-else. If the argument before the ? is true, the argument before the : is returned, otherwise the argument after the : is returned. For example:
printf "I have %d dog%s.\n", $n,
($n == 1) ? '' : "s";
Scalar or list context propagates downward into the 2nd or 3rd argument, whichever is selected.
$a = $ok ? $b : $c; # get a scalar
@a = $ok ? @b : @c; # get an array
$a = $ok ? @b : @c; # oops, that's just a count!
The operator may be assigned to if both the 2nd and 3rd arguments are legal lvalues (meaning that you can assign to them):
($a_or_b ? $a : $b) = $c;
This is not necessarily guaranteed to contribute to the readability of your program. Because this operator produces an assignable result, using assignments without parentheses will get you in trouble. For example, this:
$a % 2 ? $a += 10 : $a += 2
Really means this:
(($a % 2) ? ($a += 10) : $a) += 2
Rather than this:
($a % 2) ? ($a += 10) : ($a += 2)
In the C tradition Perl has binary | and binary & as well as logical || and logical && operators. The latter, in the shell tradition, can be used for flow control.
Logical "&&'' performs a short-circuit logical AND operation. That is, if the left operand is false, the right operand is not even evaluated. Scalar or list context propagates down to the right operand if it is evaluated.
Logical "||'' performs a short-circuit logical OR operation. That is, if the left operand is true, the right operand is not even evaluated. Scalar or list context propagates down to the right operand if it is evaluated.
Like in shell and unlike C the || and && operators
return the last value evaluated.
The fact that both || and && operators return
the last value evaluated means that you shouldn't use them for selecting
between two aggregates in assignment:
@a = @b || @c; # this is wrong
@a = scalar(@b) || @c; # did you really mean this? Try it...
@a = @b ? @b : @c; # this works fine, though
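A minimal sketch of both sides of this behavior: || is handy for supplying a default for a scalar, but it puts an array on its left side into scalar context. All variable names and values here are illustrative.

```perl
use strict;
use warnings;

# || returns the last value evaluated, which gives a compact default idiom.
my $name  = "";
my $shown = $name || "anonymous";      # "" is false, so the default is used

# The same idiom treats a legitimate 0 as "missing".
my $retries = 0;
my $used    = $retries || 10;          # 0 is false, so 10 is used

# With arrays the left operand is evaluated in scalar context (its size).
my @b = (5, 6);
my @c = (1, 2, 3);
my @wrong = @b || @c;                  # gets the element count of @b
my @right = @b ? @b : @c;              # gets the elements of @b

print "shown=$shown used=$used wrong=(@wrong) right=(@right)\n";
```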
An often used Perl cliché for testing whether opening a file succeeded is:
open(STDIN,$myfile) || die "Can't open file $myfile!\n";
The && operator also can be used as an if-then statement:
( index($line,"Subject:")>-1 ) && ($subject=$line);
This is the same as
if ( index($line,"Subject:")>-1 ) { $subject=$line; }
So it is not clear what we win by using this notation: it is not much shorter.
Another popular use is for controlling debugging statements, which are typically print statements guarded by a special variable $DEBUG:
$DEBUG=1;
... ... ...
$DEBUG && print " text=$text\n";
Perl provides the and and or operators as more readable alternatives to && and || when used for control flow. The short-circuit behavior is identical. The precedence of and and or is much lower, however, so that you can safely use them after a list operator without the need for parentheses. Still, they are very rarely used.
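The precedence difference matters most when open is called without parentheses. The following sketch contrasts the two spellings; the path and the two helper subs are hypothetical, and the path is simply assumed not to exist.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on the machine running the sketch.
my $missing = "/no/such/directory/no-such-file.txt";

# Low-precedence "or": the whole open call is evaluated first,
# and die runs when open fails.
sub open_with_or {
    my ($path) = @_;
    my $ok = eval {
        open my $fh, '<', $path or die "Can't open file $path: $!\n";
        1;
    };
    return $ok ? "no error reported" : "died";
}

# High-precedence "||" without parentheses binds to $path, not to open,
# so die cannot run for a non-empty path and the failure goes unnoticed.
sub open_with_pipes {
    my ($path) = @_;
    my $ok = eval {
        open my $fh, '<', $path || die "Can't open file $path: $!\n";
        1;
    };
    return $ok ? "no error reported" : "died";
}

print "or: ", open_with_or($missing),    "\n";
print "||: ", open_with_pipes($missing), "\n";
```

With parentheses around the arguments of open, as in open(...) || die ..., both operators behave the same way.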
Perl is one of the few languages where loops are "done right". The notation used is more flexible and more powerful than in C, C++ or Java, despite the fact that the need for loops in scripting languages is less.
The while and for loops in Perl have semantics similar to C, and until is a while loop with the condition reversed. From shell languages a very useful loop, called the foreach loop, was added. Several constructs for exiting loops were also added, and in this respect Perl has better loop structuring facilities than any other language that I know.
Let me remind the key classification elements for loops which are applicable to any language:
Type of the loop, which is essentially the type of terminating condition. A loop is used to repeat the execution of a statement block until a certain condition is reached. This condition is called the terminating condition. For example, a loop can be used to iterate through an array looking for a value and terminate if the value is found (a simple search loop). Loops can also be used to count quantities. In this case the terminating condition is the end of the aggregate that is indexed in the loop. There are three types of loops:
while/until loops: the terminating condition is a Boolean expression.
for loops: the loop specifies a special variable called the counter, which is incremented or decremented, and the terminating condition is a Boolean expression on the counter (for example exceeding a certain value, or being decremented to zero or a negative value). The for loop originated in Fortran and paradoxically is older than while/until loops.
foreach loops: the loop body is executed for every element of an array or hash. This type of loop originated in Unix shells. It is a more modern type of loop than either the for loop or the while/until loop.
Prefix and postfix loops. There are two forms of the loop: one where the terminating condition is checked before the statements are executed (it is often called a while loop, but to avoid confusion we will call it a prefix-check loop or simply a prefix loop), and one in which the terminating condition is checked after the statements are executed (it is often called an until loop in computer science, but to avoid confusion we will call it a postfix-check loop or simply a postfix loop). This classification is usually applicable only to while/until loops, as for for loops and foreach loops the position of the terminating condition is fixed.
There are two forms of this loop: one with the body executed while the condition is true and the second with the body executed while the condition is false:
| Terminating condition is checked for "true" | Terminating condition is checked for "false" |
| while (expression) { ... } | until (expression) { ... } |
Prefix loops (called while loops) are most often used in Perl to read input from a file. This typical Perl construct should be used with caution. Here is an artificial example that will not work as intended, because the false condition for a string can be the empty string or the string "0". The following will print out 'h e l l' and stop:
@s = ('h','e','l','l','0',' ', 'w','o','r','l','d');
$i = 0;
while ($letter = $s[$i]) {
print "$letter ";
$i++;
}
It prints neither the fifth character nor 'world' because of a typo: the digit zero was typed instead of the letter 'o' in 'hello'. At this point the value of $letter becomes "0", which is false, and the loop terminates. Please try to run this example in the debugger.
Another example is also artificial but it shows inconvenience of pure prefix loop (code need to be duplicated):
print "Password? "; # Ask for the first input
$pass = <STDIN>; # Get the first input line
chop $pass; # Remove the newline at the end
while ($pass ne ":SeSaM:"){ # While input is wrong.
# Please note the use of ne operator with string
print "Sorry. Please reenter: "; # Ask again
$pass = <STDIN>; # Get next input line
chop $pass; # Chop off newline again
}
print "Welcome!\n";
The body of the loop (the block in curly braces) is executed while the input does not equal the password. The code should be fairly clear, but please notice several things. First, we can read from the standard input (the keyboard) without opening a file first. Second, when the password is entered, $pass is given that value including the newline character at the end. The chop function removes the last character of a string, which in this case is the newline (the related chomp function is safer, since it removes the last character only if it is a newline).
Logically this is a famous n+1/2 loop and as D. Knuth pointed out many years ago, it can not be adequately programmed using while or until constructs.
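One common way to write such a loop in Perl without duplicating code is an endless loop with an exit in the middle of the body, using the last statement. The sketch below is one possible shape, not the only one: count_attempts is a hypothetical helper, and the input is read from an in-memory filehandle (standing in for the keyboard) so that the example runs unattended.

```perl
use strict;
use warnings;

# count_attempts is a hypothetical helper: it reads candidate passwords
# from the given filehandle and returns the number of lines it took to
# get the right one (or 0 if the input ran out first).
sub count_attempts {
    my ($in, $secret) = @_;
    my $attempts = 0;
    while (1) {
        print $attempts ? "Sorry. Please reenter: " : "Password? ";
        my $pass = <$in>;                 # first half: always executed
        return 0 unless defined $pass;    # end of input
        chomp $pass;
        $attempts++;
        last if $pass eq $secret;         # exit from the middle of the body
        # second half: anything placed here runs only after a wrong answer
    }
    print "Welcome!\n";
    return $attempts;
}

# Simulated keyboard input: two wrong answers, then the right one.
my $typed = "guess\nletmein\n:SeSaM:\nignored\n";
open(my $in, '<', \$typed) or die "Can't open in-memory input: $!\n";
my $attempts = count_attempts($in, ":SeSaM:");
close($in);
print "Accepted after $attempts attempts\n";
```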
For arrays one can use while loop as a self-terminating loop (when array will be exhausted the next value will be undef):
while ($x[$i]) { print $x[$i]; $i++; }
But there is a danger that one of the array elements can be "0" or an empty string (a plain truth test cannot tell these from undef, because undef is automatically converted to an empty string or zero depending on the type of operator; the defined function is needed for that). In such cases the loop will terminate prematurely. For example, the following loop will terminate after printing 123, before it reaches the last element:
@x = (1,2,3,0,5,6,7);
$i = 0;
while ($x[$i]) { print $x[$i]; $i++; }
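A sketch of the difference between testing the element and testing the index, using the same data. The two result strings are collected in hypothetical variables so that they can be compared.

```perl
use strict;
use warnings;

my @x = (1, 2, 3, 0, 5, 6, 7);

# Testing the element itself: stops at the first false value (the 0).
my $by_truth = '';
my $i = 0;
while ($x[$i]) {
    $by_truth .= $x[$i];
    $i++;
}

# Testing the index against the array size: visits every element.
my $by_index = '';
for (my $j = 0; $j < @x; $j++) {
    $by_index .= $x[$j];
}

print "truth test: $by_truth\n";   # 123
print "index test: $by_index\n";   # 1230567
```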
An exit from a while loop in Perl also occurs if you are dealing with an array or a hash and the condition yields zero elements, i.e. something evaluates to the empty list '()'. This can also lead to subtle bugs. For example:
%weekend = ('St' => 'Saturday', 'Sn' => 'Sunday');
while (($abbrev, $fullname) = each (%weekend)) {
print "$abbrev $fullname";
}
prints out all pairs (but not necessarily in the order they were entered) and then stops. The loop stops when each returns the empty list after the last hash element has been delivered. This form of termination is very helpful with function calls: a function can be programmed so that it returns '()' when there is no more output to produce.
Here the condition is after the body and the body will execute at least once before the loop terminates. Like in the if statement you can reverse the condition. The following C-style syntax is used:
| do { ... } while (expression); | do { ... } until (expression); |
As the last example in this section shows, the statement block is executed even when the condition (there $i < 0) is false when the loop starts.
The fact that zero evaluates to false permits creation of loops that count down to zero. This is useful when you need a specific number of iterations and do not care that the index decreases rather than increases. For example:
$limit = 5;
$pageno = $limit; # number of pages still to print
do {
    printPage();
    $pageno--;
} while ($pageno); # zero is false, so the loop ends after $limit pages
When this loop is done, all five pages will be printed. Actually this type of loop will behave wrongly if you specify a negative number of pages, so a regular for loop (see below) would be much better.
Here is the example that we discussed in prefix loops rewritten using the postfix loop:
do {
    print "Please enter password: "; # Ask for input
    $pass = <STDIN>; # Get input
    chop $pass; # Chop off newline
}
while ($pass ne ":SeSaM-1999:"); # Redo while wrong input
Here is another example of the postfix loop.
$i = 1;
do {
print("inner loop iteration: i = $i\n");
$i++;
} while ($i < 0);
print("loop ended with: i = $i\n");
This program displays:
inner loop iteration: i = 1
loop ended with: i = 2
Perl has a for structure that mimics that of C. As in C it has the form
for (initialize; terminating condition; increment){
statement;
statement;
...
}
First the initialize statement is executed. Then, while the terminating condition is true, the block of statements is executed. After each execution of the block the increment takes place. Here is an example of a for loop that prints how many days have passed from the beginning of the year to the first of each month.
@month=( 'Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec');
@days = ( 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
$total=0;
for ($i=0; $i<12; $i++){
print("$month[$i] \t $total\n"); # print total for the current month
$total+=$days[$i]; # add days of the current month to total
}
It is clear that a for loop without a counter is essentially equivalent to a while loop:
for (; $i<12; ) {
}
The for loop with an empty header is the Perl idiom for the so-called "forever loop": a loop that has no terminating condition in its header or tail and can be terminated only by a last statement (Perl's counterpart of C's break) inside the body.
for(;;) {
}
Such a loop is convenient when on the last iteration only part of the body of the loop needs to be executed. This is often the case in loops that deal with input.
The terminating condition expression is used to determine whether the loop should continue or be ended. When the condition expression evaluates to false, the loop will end. When writing the condition, be sure to use the numeric comparison.
The increment/decrement expression is used to modify the loop variables in some way each time the code block has been executed.
For arrays the usual way to loop through all the elements of an array is to use test $i<@month, for example:
@month=( 'Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec');
for($i=0; $i<@month; $i++) { # Visit each item in turn
print "$month[$i]\n"; # Print the item
}
The foreach loop is a very popular form of loop that is typical for all scripting languages. Even the extremely primitive DOS command.com batch language has a similar form of loop. For those with a Unix background I would like to note that foreach is very similar to the "for..in" structure in the Bourne shell. In Java similar capabilities are provided by iterators.
Right now you probably will be better off skipping this section and returning to it after learning material of Chapter 3.
The foreach loop iterates through the elements of an array or list, making each element in turn available through a temporary variable (this variable is actually an alias for the current element, and changing it will change the element, see below).
The foreach statement provides a very convenient iteration mechanism without having to resort to a counter if one needs to process all elements of an array. Therefore, if a task requires scanning an array and checking each element, the foreach loop is the natural control structure to use. For example, it can be used for finding the max/min (but built-in functions are better), computing various sums, selecting elements that satisfy some condition (if grep is not suitable for the task), etc.
The idea is very simple -- the body of the loop is executed once for every element of an array from the starting element to the end. On each iteration the value of the current element is assigned to a selected temporary variable:
@month=( 'Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec');
foreach $m (@month) { # Visit each item in turn
print "$m\n"; # Print the item
}
Here the loop control variable is $m. But it is not a regular Perl variable: it is actually an alias for the element of the array being processed, and if you change it in the loop you will change the element of the array.
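A short sketch of this aliasing effect. The arrays and their contents are illustrative only.

```perl
use strict;
use warnings;

# Modifying the loop variable modifies the array element itself.
my @price = (10, 20, 30);
foreach my $p (@price) {
    $p *= 2;
}
print "after aliasing loop: @price\n";      # 20 40 60

# Copying the value first leaves the array untouched.
my @untouched = (10, 20, 30);
foreach my $p (@untouched) {
    my $doubled = $p;    # work on a copy, not on the alias
    $doubled *= 2;
}
print "after copying loop:  @untouched\n";  # 10 20 30
```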
As the elements of an array can be generated using ranges, for ranges foreach is more understandable than a regular for loop (in the current Perl implementation, no temporary array is created when the range operator is used as the expression in foreach loops):
foreach $i (0..11) { # equivalent to for ($i=0; $i<12; $i++) {
$s=$s+$days[$i];
}
The foreach keyword is actually a synonym for the for keyword, so in this case I recommend using for instead of foreach:
for $i (0..11) {
$s=$s+$days[$i];
}
If you do not need an index, then the foreach loop is a perfect way to perform some calculation on, or select some elements of, an array. In cases when one needs to analyze the elements of an array in sequence but does not actually need an index, it is better to use a foreach loop instead of for or while loops:
$total=0;
foreach $item (@expenses) {
$total=$total+$item; # some calculation
}
print "Total expenses=$total";
The variable $item is assigned the value of each array element, in turn until the end of the array is reached. Actually it is better to use foreach loop instead of while loop in many cases like that.
A foreach loop can be used for hashes, but the hash needs to be converted to a list first. The idea is to use the special built-in functions keys, values and each (none of them is limited to loops; they can be used outside loops too).
The following code, which prints all values of the hash by iterating over its keys, is pretty typical:
foreach $key (keys %hash) {
print $hash{$key};
}
The expression (keys %hash) will first generate an array containing all keys. Then this array will be used like in examples above -- in each iteration of the loop one element will be picked in sequence. We can rewrite this loop using values() function in the following way:
foreach $v (values %hash) {
print $v;
}
If we need to print both key and value then we should use the each() function, which returns one (key, value) pair per call and therefore is used with a while loop:
while (($k,$v) = each %hash) {
    print "Key=$k, Value=$v\n";
}
From the point of view of memory consumption the function each() is the most economical as it does not create an array of all keys and all values as keys() and values() do.
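The two styles can be compared side by side. In this sketch the hash contents are illustrative; the each-based loop visits pairs in an unspecified order, while sort keys gives a reproducible order at the cost of building the key list first.

```perl
use strict;
use warnings;

my %weekend = (St => 'Saturday', Sn => 'Sunday');

# each returns one (key, value) pair per call and () when the hash is done.
my @pairs;
while (my ($k, $v) = each %weekend) {
    push @pairs, "$k=$v";
}

# keys builds the full list of keys, which can then be sorted.
my @sorted = map { "$_=$weekend{$_}" } sort keys %weekend;

print "each order (unspecified): @pairs\n";
print "sorted by key:            @sorted\n";   # Sn=Sunday St=Saturday
```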
Please note that an if or unless statement ends with two closing parentheses if you use a function call as the test:
if (open(SYSIN, "<$fname")) {
        |________________|
   |______________________|
In case, god forbid, you miss one, Perl diagnostic is really misleading.
Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Site uses AdSense so you need to be aware of Google privacy policy. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: September 07, 2009