Softpanorama |
May the source be with you, but remember the KISS principle ;-)
Second look on scalar variables
Operations of scalar variables (to be written)
Second look on hashes (Associative arrays)
Perl has just four basic types of variables:
All Perl variables have special prefixes that act much like "on the fly" declarations of the type. You can also think of $ as a function that converts the variable it prefixes to a string.
The main achievement of Perl in this area is the introduction of the undef value. This is a very interesting and pretty innovative solution to the problem of uninitialized variables.
Let's reiterate what we already know about scalar variables from the previous section. By now you probably suspect that scalar variables in Perl always start with $ and can hold both strings and numbers. This is true: a scalar variable always has both a string value and a numeric value. Yes, any string in Perl has a numeric value; for a string that does not begin with a number, that value is zero. Like C, Perl is case sensitive, so the variables $a and $A are different.
When you assign a numeric literal to a scalar, Perl first checks that the right side of the assignment is a well-formed numeric value. It then converts it to a double-precision float (the internal floating-point representation of numeric values in Perl). When the string value is needed, the number is converted back to a string, so the string value of the scalar on the left side will be 5, not 5.0 as one might expect:
$price = 5.0; # this variable is assigned a numeric value
print "price=$price"; # will print price=5
If you supply the number as a string literal, then the string value is stored first. It is not converted to a numeric value until an operation requires one:
$price = '5.0';
print "price=$price"; # will print price=5.0
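The two cases above can be put side by side in a short sketch. The variable names are made up for illustration; the point is only that the same scalar yields a string or a number depending on how it is used.

```perl
use strict;
use warnings;

my $from_number = 5.0;     # numeric literal: stored as a number
my $from_string = '5.0';   # string literal: stored as text

my $text_of_number = "$from_number";    # the number is converted to text
my $text_of_string = "$from_string";    # the text is kept exactly as written
my $sum            = $from_string + 1;  # '5.0' is converted to 5 for the addition

print "number as text: $text_of_number\n";   # prints 5
print "string as text: $text_of_string\n";   # prints 5.0
print "string plus one: $sum\n";             # prints 6
```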
In general, variable names consist of letters, digits, and underscores, but they must not start with a digit. Names like $_ and several others are special, as we'll see later.
Many Perl operations return undef to indicate unusual situations: failure, end of file, system error, uninitialized variable, and other exceptional conditions. There is a defined() function that allows you to check whether a variable contains the undef value. A conditional expression will not distinguish among undef, zero, the empty string, and "0", which are all considered to be false.
Note that since undef is a valid scalar, its presence doesn't necessarily indicate an exceptional condition: many built-in functions return undef when there is no more output, but also when the element returned happens to have the value undef.
Use of built-in function defined() upon aggregates (hashes and arrays) is not guaranteed to produce intuitive results, and should probably be avoided. When used on the element of an associative array (hash -- see below), it tells you whether the value is defined, not whether the key exists in the hash. Use exists for the latter purpose.
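A minimal sketch of the difference between defined and exists on hash elements follows. The %stock hash and its keys are hypothetical sample data, not something taken from this tutorial.

```perl
use strict;
use warnings;

my %stock = (apples => 3, pears => undef);   # hypothetical sample data

my $pears_exists  = exists  $stock{pears} ? 1 : 0;   # the key is present
my $pears_defined = defined $stock{pears} ? 1 : 0;   # but its value is undef
my $plums_exists  = exists  $stock{plums} ? 1 : 0;   # this key was never stored

print "pears: exists=$pears_exists defined=$pears_defined\n";
print "plums: exists=$plums_exists\n";
```

Testing with exists does not create the key, so the hash still has two entries afterwards.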
One big advantage of interpreted languages is that they can check whether a variable was initialized or not. Here Perl provides a solution that is superior to any other I know of. It explicitly defines a special undefined value and a set of functions to check for this value.
You can check whether a variable has been given a value by using the special built-in function defined. Note that defined tests the value, not whether the name exists in the symbol table. All uninitialized variables are assumed to have a special value until they are explicitly assigned one; this value is called undef. Paradoxically, you can assign undef to variables like any other value. Logically this might suggest that the variable is deleted from the symbol table, but that is not what happens in Perl: the variable remains and simply holds undef again. For example:
$arg1 = undef; # set to undef
undef($arg1); # same as above
In Perl all uninitialized scalar variables are assumed to have the value undef.
Please note that the -w switch produces some partially useful warnings about uninitialized variables, and this is another case where it can be useful.
That leads to the major difference between Perl and most compiled languages (C, Pascal, etc.): a scalar variable can be used without first assigning a value to it. In this case it has the default value undef, which can be converted to a string (resulting in a zero-length string) or to a numeric value (resulting in 0).
In numeric operations the undef value is converted to zero, much like a non-numeric string literal. So it is perfectly legal to write
$k=$i+0;
Let's assume that in this expression the variable $i has never been assigned a value. It is first created with the initial value undef. Then, because the operator "+" requires a numeric value, it is converted to a numeric value (zero). Uninitialized values can also be used in comparisons: depending on the operator, they are converted either to a text value (zero-length string) or to a numeric value (zero).
For arithmetic operations undef behaves like 0.
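The conversions described above can be observed directly. This is a small sketch rather than a recommended style: the "uninitialized" warning category is switched off here only so that the demonstration runs quietly under warnings.

```perl
use strict;
use warnings;
no warnings 'uninitialized';   # silence the warnings this demo would otherwise trigger

my $i;                          # declared but never assigned: holds undef
my $k      = $i + 0;            # numeric context: undef becomes 0
my $label  = "[" . $i . "]";    # string context: undef becomes the empty string
my $is_def = defined $i ? "yes" : "no";

print "k=$k label=$label defined=$is_def\n";   # prints k=0 label=[] defined=no
```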
If you need to test whether or not a scalar has been initialized, you can use the defined function. For example, if an argument was passed to a subroutine, the corresponding variable can be either defined or undefined. If it is not defined, you can assign a default value to it, which gives Perl the capability to set default values for subroutine parameters. We will learn more about them later.
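One way to supply a default for a missing argument might look like the sketch below. The greet subroutine and its parameters are hypothetical and serve only to illustrate the defined test.

```perl
use strict;
use warnings;

# greet() is a hypothetical subroutine used only for illustration.
sub greet {
    my ($name, $greeting) = @_;
    $greeting = 'Hello' unless defined $greeting;   # supply the default
    return "$greeting, $name!";
}

my $with_default = greet('Larry');              # second argument omitted
my $explicit     = greet('Larry', 'Goodbye');   # second argument given
my $empty_kept   = greet('Larry', '');          # '' is defined, so it is kept

print "$with_default\n$explicit\n$empty_kept\n";
```

Testing with defined rather than with a plain Boolean test matters here: an empty string or "0" passed by the caller is false but defined, so it is not replaced by the default.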
From a logical standpoint you cannot use undef in a comparison, because the type of a value in Perl is determined by the operator and is either numeric or string depending on context:
if ( $arg1 == undef ) {...} # undef will always be converted to zero first
The undef value is often used in Perl instead of exceptions to signal the end of a data stream (Perl 5 traditionally handles exceptions with die and eval rather than with a dedicated try/catch construct). For example, as we will see in Ch.4, when you read a file and reach the end, the value undef is returned to signal that there are no more records in the file. Similarly, functions can signal that there are no more values by returning the undef value.
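A brief sketch of both situations follows. To stay self-contained it reads from an in-memory string instead of a real file; the sample text and the @queue array are assumptions made for the example.

```perl
use strict;
use warnings;

my $data = "first\nsecond\nthird\n";   # sample text standing in for a file
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";

my @lines;
while (defined(my $line = <$fh>)) {    # <$fh> returns undef at end of data
    chomp $line;
    push @lines, $line;
}
close $fh;

my @queue = (0, 'a');
my @taken;
while (defined(my $item = shift @queue)) {   # shift returns undef on an empty array
    push @taken, $item;                      # the value 0 is false but defined
}

print scalar(@lines), " lines read, ", scalar(@taken), " items taken\n";
```

Note the limitation mentioned earlier: if the array itself contained an undef element, the second loop would stop early, because undef as data cannot be told apart from undef as "no more values".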
Here is how defined is described in the Perl documentation (perlfunc):
Returns a Boolean value telling whether EXPR has a value other than the undefined value undef. If EXPR is not present, $_ will be checked.
Many operations return undef to indicate failure, end of file, system error, uninitialized variable, and other exceptional conditions. This function allows you to distinguish undef from other values. (A simple Boolean test will not distinguish among undef, zero, the empty string, and "0", which are all equally false.) Note that since undef is a valid scalar, its presence doesn't necessarily indicate an exceptional condition: pop() returns undef when its argument is an empty array, or when the element to return happens to be undef.
You may also use defined() to check whether a subroutine exists, by saying defined &func without parentheses. On the other hand, use of defined() upon aggregates (hashes and arrays) is not guaranteed to produce intuitive results, and should probably be avoided.
When used on a hash element, it tells you whether the value is defined, not whether the key exists in the hash. Use exists for the latter purpose.
(to be written)
Arrays allow access to their elements by number. Arrays in Perl are quite different from arrays in C and are more like the buffer of a text editor, with indexes as line numbers. There are two notations for arrays: regular index notation and so-called list notation. For example, (1,2,3) is a list with three elements that can be assigned to an array.
As in C, and unlike in a text editor, the first element of an array has index zero. For example, the first element of the array @mybuffer is $mybuffer[0], the second element is $mybuffer[1], and so on.
One needs to use the prefix @ for arrays and the prefix $ for array elements.
You can initialize arrays using list notation:
@x=(1,2,3); # now @x contains three elements $x[0], $x[1] and $x[2]
Negative indexes do not refer to positions before the start of the array; instead they count from its end. The last element is $mybuffer[-1], the element before last is $mybuffer[-2], etc. This is quite a convenient shortcut worth remembering.
Negative indexes are used to access elements from the end of the array.
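A short sketch of positive and negative indexing follows, reusing the @mybuffer name from the text with made-up contents.

```perl
use strict;
use warnings;

my @mybuffer = ('line one', 'line two', 'line three');   # made-up contents

my $first       = $mybuffer[0];    # first element
my $last        = $mybuffer[-1];   # last element
my $before_last = $mybuffer[-2];   # element before the last
my $last_index  = $#mybuffer;      # index of the last element (2 here)

print "first=$first last=$last before_last=$before_last\n";
```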
Again, arrays in Perl are much more like lists (or buffers of text editors, which are essentially lists): they have no fixed size and can hold any type of scalar value (both numbers and strings). All array names are prefixed by an @ symbol, but elements are prefixed with $. For example:
@workweek = ("Mn", "Ts", "Wn", "Th","Fr"); # initialization of array @workweek with 5 values
@weekend = ("St", "Sn");
These statements assign a five-element list to the array variable @workweek and a two-element list to the array variable @weekend. As in C, the array is accessed by using indices starting from 0, and square brackets are used to specify the index. For example:
print $workweek[4]; # will print Fr. Notice symbol $ instead of @
This substitution of @ with $ is a frequent source of errors for beginners. Be careful. In this particular case there is some logic in this convention (after all each element of array is a scalar) and you need to adapt to it.
If you want to access one element of an array you need to use a scalar, as in $week[0]. Using @week[0] is a frequent error. Watch your step!
If the index is non-numeric, then 0 will be used. So the following two statements are equivalent:
$color["abba"]="blue";
$color[0]="blue"; # same as above
Slices
Unlike C, it is possible to specify multiple indexes. This is called an array slice or simply a slice. For example
@x[1,2]=(3,4); # slice of array @x with indexes 1 and 2 is assigned values 3 and 4
Any slice in scalar context returns the last element of the slice, not the number of elements like an array. So
$s=@x[0]; # wrong way to assign $x[0] to a variable $s
Slices accept Pascal range notation (..). For example
@x[2..5]=(1,2,3,4);
@danger_levels[1..3]=('green','orange','red');
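The slice behaviour described above can be sketched as follows. The example reuses the @danger_levels array and adds a few illustrative variable names.

```perl
use strict;
use warnings;

my @danger_levels;
@danger_levels[1..3] = ('green', 'orange', 'red');   # element 0 stays undef

my @pair          = @danger_levels[1, 3];   # slice in list context: two elements
my $last_of_slice = @danger_levels[1, 2];   # slice in scalar context: last element
my $count         = @danger_levels;         # array in scalar context: element count
my $one           = $danger_levels[2];      # single element: use $, not @

print "pair=@pair last_of_slice=$last_of_slice count=$count one=$one\n";
```

The contrast between $last_of_slice and $count shows why a slice should not be used where a single element or an element count is wanted.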
In a list assignment, the whole right side is evaluated before anything is assigned to the left side. That means that in Perl you can exchange two variables using:
($a,$b)=($b, $a);
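The same idiom is sketched here with lexical variables, and extended to array elements via slices. The names $left and $right are chosen for the example, partly to stay clear of $a and $b, which Perl also uses inside sort blocks.

```perl
use strict;
use warnings;

my ($left, $right) = ('first', 'second');
($left, $right) = ($right, $left);   # swap two scalars without a temporary

my @x = (10, 20, 30);
@x[0, 2] = @x[2, 0];                 # swap two array elements using slices

print "left=$left right=$right x=@x\n";   # prints left=second right=first x=30 20 10
```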
Only uninitialized scalar variables in Perl have the undef value. Uninitialized arrays have the value of the empty list (). You can apply the undef function to an array (undef @x), and pretty logically it destroys the content of the array and frees all the memory. The same effect can be achieved by assigning the array an empty list, and this is the more common notation. Note that the assignment @x = undef does something different: it leaves the array with one element whose value is undef.
Unlike scalars, the initial value of an array is not the undef value, but an empty list ().
You cannot shorten an array or remove elements from an array by assigning the undef value to them. This is not very logical, but that's how it is. So here undef is a special value, not just the absence of an identifier from the symbol table, as the logical view of undef for scalars would presuppose.
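The following sketch contrasts the different ways undef interacts with arrays. The array names are invented for the example, and splice is shown only as one possible way to really remove an element.

```perl
use strict;
use warnings;

my @items = (1, 2, 3);
$items[1] = undef;                 # the slot is kept; only its value is gone
my $len_after_undef = @items;      # still 3

splice(@items, 1, 1);              # actually removes the element at index 1
my $len_after_splice = @items;     # now 2

my @emptied = (1, 2, 3);
@emptied = ();                     # the usual way to empty an array
my $len_emptied = @emptied;        # 0

my @cleared = (1, 2, 3);
undef @cleared;                    # undef as a function also empties the array
my $len_cleared = @cleared;        # 0

my @single = undef;                # caution: a ONE-element list holding undef
my $len_single = @single;          # 1

print "$len_after_undef $len_after_splice $len_emptied $len_cleared $len_single\n";
```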
Arrays are a classical data structure, so operations on arrays will be discussed in more detail in the next chapter.
Associative arrays, or hashes, are a generalization of regular arrays to non-numeric indexes. They provide a built-in search capability. You put values into the hash by defining key-value pairs. Like Perl arrays, hashes grow and shrink automatically when you add or delete elements. The main difference is that array indexes are converted to numbers before the value is retrieved, while in associative arrays they are converted to strings and are usually arbitrary strings (for regular arrays all non-numeric indexes are equivalent to the index 0).
The second important difference is that, unlike scalars, associative array entries are not created by merely being referenced.
To define an associative array we use the usual parenthesis notation, but the array itself is prefixed by a % sign. Suppose we want to create an array of site URLs and their IP addresses. It would look like this:
%ip =( "www.yahoo.com", "204.71.200.68", # note brackets "(" and ")"
"www.google.com", "209.185.108.220",
"www.northenlight.com", "128.11.1.1",
);
As a cosmetic improvement we can replace the "," between each key and its value with "=>". This way it's easier to count pairs, so it is the recommended notation in all cases where a Perl script is written by a human (in generated scripts the notation above is simpler and can be preferable):
%ip = ( "www.yahoo.com" => "204.71.200.68",
"www.google.com" => "209.185.108.220",
"www.northenlight.com" => "128.11.1.1",
);
Now we can find the IP addresses of sites with the following expressions (note curly brackets):
$ip{"www.yahoo.com"}; # Returns 204.71.200.68
$ip{"www.northenlight.com"}; # Returns 128.11.1.1
Notice that, as with arrays, to access an element of a hash the % sign is changed to a $ because that element is a scalar. Unlike list arrays, the index (in this case the site name) is enclosed in curly braces.
An associative array can be converted back into a list array just by assigning it to a list array variable. The order of the pairs is undetermined in this conversion. A list array can be converted into an associative array by assigning it to an associative array variable. Each pair will be converted into one hash element. Logically the list array should have an even number of elements, but if not, the value of the last will be undef.
@info = %ip; # @info is a list array. It now has 6 elements, but the order of pairs may have changed
%ip = @info; # Reverse operation
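A round trip between a hash and a list might be sketched like this, using a two-entry version of the %ip hash from the text. The %copy hash is an illustrative name.

```perl
use strict;
use warnings;

my %ip = (
    "www.yahoo.com"  => "204.71.200.68",
    "www.google.com" => "209.185.108.220",
);

my @info  = %ip;          # flattened key/value pairs; the pair order is not fixed
my %copy  = @info;        # each consecutive pair becomes one hash entry
my $pairs = @info / 2;    # 4 list elements make 2 pairs

print "list has ", scalar(@info), " elements, i.e. $pairs pairs\n";
print "yahoo => $copy{'www.yahoo.com'}\n";
```

Although the order of pairs in @info may differ from run to run, each key stays next to its own value, so the copy holds the same entries as the original.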
If you wish to access a value, you can say:
print $ip{'www.yahoo.com'}; # note curly brackets
Again note that one needs to use prefix $ instead of %. To change the value you can also say:
$ip{'www.yahoo.com'} = '204.71.200.67';
Hashes are not lists, so there is no previous and next element related operations defined on hashes. If you try to get all of them, then the order in which Perl will extract values is undetermined and can be different from the order in which you put elements into the hash.
You can delete a single element of hash with the operator delete, for example:
delete $ip{'www.yahoo.com'};
Like arrays, the initial value of a hash is an associative array with no elements. So if you convert such a hash to an array you will receive an empty list. The built-in function undef is applicable to hashes and will convert any hash to an empty one. As you will see in more detail in the next chapter, you need to apply the special built-in function delete to remove a single element from the hash.
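The difference between delete, assigning undef to an element, and applying undef to the whole hash can be sketched as follows, reusing the %ip data from above. The counter variables are illustrative.

```perl
use strict;
use warnings;

my %ip = (
    'www.yahoo.com'        => '204.71.200.68',
    'www.google.com'       => '209.185.108.220',
    'www.northenlight.com' => '128.11.1.1',
);

delete $ip{'www.yahoo.com'};      # removes the key together with its value
my $after_delete = keys %ip;      # 2 entries remain

$ip{'www.google.com'} = undef;    # the key stays; only the value is undef
my $key_kept = exists $ip{'www.google.com'} ? 1 : 0;

undef %ip;                        # empties the whole hash
my $after_undef = keys %ip;       # 0 entries
my @flattened   = %ip;            # an empty hash gives an empty list

print "$after_delete $key_kept $after_undef ", scalar(@flattened), "\n";
```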
Hashes are a non-traditional data structure, and here Perl is to a certain extent a pioneer. We will discuss operations on hashes in more detail in the next chapter.
Perl provides a facility for creating constant values, via the "use constant"
pragma. use constant works for scalars and arrays, not hashes.
use constant PI => 3.14159;
... use constant PI => 2.71828;
use constant CARRAY => (2, 3, 5, 7, 11, 13);
$a_prime = CARRAY[2]; # wrong!
$a_prime = (CARRAY)[2]; # right -- MUST use parentheses
use constant SOME_KEY => 'key';
%hash = (key => 'value', other_key => 'other_value');
$some_value = $hash{SOME_KEY}; # wrong!
$some_value = $hash{+SOME_KEY}; # right
(who thinks to use a unary plus when using a hash?)
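The pitfalls listed above can be collected into one runnable sketch. The constant name PRIMES and the result variables are invented for the example; calling the constant as SOME_KEY() is shown as an alternative to the unary plus.

```perl
use strict;
use warnings;
use constant PI       => 3.14159;
use constant PRIMES   => (2, 3, 5, 7, 11, 13);
use constant SOME_KEY => 'key';

my %hash = (key => 'value', other_key => 'other_value');

my $third_prime = (PRIMES)[2];         # a list constant needs parentheses
my $by_constant = $hash{+SOME_KEY};    # unary plus: the constant's value is the key
my $by_call     = $hash{SOME_KEY()};   # an explicit call works as well
my $bareword    = $hash{SOME_KEY};     # looks up the literal string 'SOME_KEY'

print "third prime: $third_prime\n";
print "by constant: $by_constant, by call: $by_call\n";
print "bareword lookup is ", (defined $bareword ? "defined" : "undefined"), "\n";
```

The bareword lookup fails silently because a simple identifier inside hash braces is treated as a quoted string, which is why the extra plus sign or parentheses are needed.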
Another way to create read-only scalars is to modify the symbol table entry for the variable by using a typeglob:
*a = \'value';
This works fine, but it only works for global variables ("my" variables have no symbol table entry). Also, the following similar constructs do not work:
*a = [1, 2, 3]; # Does NOT create a read-only array
*a = { a => 'A'}; # Does NOT create a read-only hash
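A minimal sketch of the read-only scalar created through a typeglob follows. The variable name $limit is an assumption made for the example, and eval is used only to catch the error so that the script can report it.

```perl
use strict;
use warnings;

our $limit;            # package variable, so it has a symbol table entry
*limit = \'value';     # alias $limit to a read-only string literal

my $ok    = eval { $limit = 'changed'; 1 };   # the assignment dies inside eval
my $error = $@;                               # holds the error message

print "limit is still '$limit'\n";
print "assignment failed: $error" unless $ok;
```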
There is also Readonly.pm but it imposes a performance penalty. In other words it's pretty slow and pretty complex. Not recommended. I think that this feature should be implemented on the language level not via crutches.
Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. Submit comments This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Site uses AdSense so you need to be aware of Google privacy policy. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: September 05, 2009