|
Home | Switchboard | Unix Administration | Red Hat | TCP/IP Networks | Neoliberalism | Toxic Managers |
May the source be with you, but remember the KISS principle ;-) |
|
See Nikolai Bezroukov. Introduction to Perl for Unix System Administrators for the introduction.
1.2. The Place of Perl among other scripting languages; good and bad things about Perl
1.3. Notes on Perl hype
2.1. Hello World in Perl
2.2. Overview of Perl Lexical Structure, Syntax and Operators
2.3. Perl Variables
2.5. Typical errors and pitfalls
2.6. Exercises
3.1. String operations
3.2 Numeric values and arithmetic operators
3.3. Operations on Arrays
3.4. Operations on Hashes
3.5. Typical errors
3.6. Summary
3.7. Exercises
5. 1 Introduction
5.2. Overview of Perl regular expressions
5.3. More complex Regular Expressions
5.4. Non-greedy matching
5.5. Split function
5.6 Best practices
5.8. Summary
5.9. Exercises
|
![]() |
![]() |
![]() |
Dec 21, 2017 | affy.blogspot.com
One very common error is to use elseif instead of the correct elsif keyword. As you program, you'll find that you consistently make certain kinds of errors. This is okay. Everyone has his or her own little quirks. Mine is that I keep using the assignment operator instead of the equality operator. Just remember what your particular blind spot is. When errors occur, check for your personal common errors first.
This section shows some common syntax errors and the error messages that are generated as a result. First, the error message is shown and then the script that generated it. After the script, I'll cast some light as to why that particular message was generated.
Missing semiconon in one of the statements
Scalar found where operator expected at test.pl line 2, near "$bar" (Missing semicolon on previous line?) $foo = { } # this line is missing a semi-colon. $bar = 5;Perl sees the anonymous hash on the first line and is expecting either an operator or the semicolon to follow it. The scalar variable that it finds, $bar , does not fit the syntax of an expression because two variables can't be right after each other. In this case, even though the error message indicates line 2, the problem is in line 1.Missing quote
Bare word found where operator expected at test.pl line 2, near "print("This" (Might be a runaway multi-line "" string starting on line 1) syntax error at test.pl line 2, near "print("This is " String found where operator expected at test.pl line 3, near "print("" (Might be a runaway multi-line "" string starting on line 2) (Missing semicolon on previous line?) Bare word found where operator expected at test.pl line 3, near "print("This" String found where operator expected at test.pl line 3, at end of line (Missing operator before "); ?) Can't find string terminator '"' anywhere before EOF at test.pl line 3.print("This is a test.\n); # this line is missing a ending quote. print("This is a test.\n"); print("This is a test.\n");In this example, a missing end quote has generated 12 lines of error messages! You really need to look only at the last one in order to find out that the problem is a missing string terminator. While the last error message describes the problem, it does not tell you where the problem is. For that piece of information, you need to look at the first line where it tells you to look at line two. Of course, by this time you already know that if the error message says line 2, the error is probably in line 1.Unquoted literal
Can't call method "a" in empty package "test" at test.pl line 1.print(This is a test.\n); # this line is missing a beginning quote.The error being generated here is very cryptic and has little to do with the actual problem. In order to understand why the message mentions methods and packages, you need to understand the different, arcane ways you can invoke methods when programming with objects. You probably need to add a beginning quote if you ever see this error message.... ... ..
This list of syntax errors could go on for quite a while, but you probably understand the basic concepts:
- Errors are not always located on the line mentioned in the error message.
- Errors frequently have nothing to do with the error message displayed.
![]() |
![]() |
![]() |
Dec 20, 2017 | www.davetill.com
Chapter 21 The Perl Debugger
CONTENTS
- Entering and Exiting the Perl Debugger
- Listing Your Program
- Stepping Through Programs
- Displaying Variable Values
- Breakpoints
- Tracing Program Execution
- Line Actions
- Other Debugging Commands
- Summary
- Q&A
- Workshop
Today's lesson describes the Perl debugging facility. You'll learn the following:
Entering and Exiting the Perl Debugger
- How to enter and exit the Perl debugger
- How to list parts of your program
- How to execute one statement at a time
- How to set breakpoints and trace program execution
- How to perform line actions
- About other useful debugging commands
The following sections describe how to start the Perl debugger and how to exit.
Entering the DebuggerTo debug a Perl program, specify the -d option when you run the program. For example, to debug a program named debugtest , specify the following command:
$ perl -d debugtestYou can supply other options along with -d if you want to.
When the Perl interpreter sees the -d option, it starts the Perl debugger. The debugger begins by displaying a message similar to the following one on your screen:
Loading DB routines from $RCSfile: perldb.pl,v $$Revision: 4.0.1.3 $$Date: 92/06/08 13:43:57 $ Emacs support available. Enter h for help. main::(debugtest:3): $dircount = 0; DB<1>The first few lines display the date on which this version of the debugger was created. The only lines of interest are the last two.
The second-to-last line in this display lists the line that the debugger is about to execute. When the debugger starts, the first executable line of the program is displayed.
When the debugger displays a line that it is about to execute, it also provides the following information about the line:
- The package in which the line is contained (in this case, the default package, which is main )
- The name of the file containing the line (here, the file is named debugtest )
- The current line number (which, in this example, is 3)
The last line of the display prompts you for a debugging command. The number enclosed in angle brackets indicates the command number; in this case, the number is 1 , because you are about to specify the first debugging command.
Later today you will learn how to use the debugging command number to re-enter debugging commands you have previously executed.
Exiting the Debugger
NOTE To enter the debugger without supplying a program, supply the -e option with the -d option:$ perl -d -e "1;"This line starts the debugger with a "program" consisting of the single statement1;(which is an expression that doesn't do anything meaningful).Starting the debugger without a program enables you to examine the predefined system variables or supply statements to be executed. You will learn how to perform both of these tasks later in today's lessonTo exit the debugger, enter the debugging command q :
DB<1> qThis command halts program execution immediately and returns you to the command shell.
Listing Your ProgramYou can list any part of your program from within the debugger. The following sections describe debugging commands that perform the display operations.
The commandThe simplest way to list part of your program is with the l command, which lists the next few statements in your program:
DB<1> l 3: $dircount = 0; 4: $curdir = ""; 5: while (1) { 6: # if we don't have a current directory, get one 7: if ($curdir eq "") { 8: print ("Enter directory to list:\n"); 9: $curdir = <STDIN>; 10: $curdir =~ s/^\s+|\s+$//g; 11: $curdir = &followlink($curdir); 12: &readsubdirs($curdir);The l command lists a window of executable statements, which is a group of statements small enough to be easily displayed on your screen. A window usually consists of about ten statements. The line number of each statement is displayed at the beginning of its line.
NOTE The statements displayed in today's lesson are taken from the program presented in "Week 2 in Review."Entering the l command again displays the window of statements immediately following the last displayed line, which in this example is line 12:
DB<1> l 13: } 14: $curdir = &menudir($curdir); 15: } 16: 17: 18: # Find all subdirectories of the given directory, 19: # and store them in an associative array. 20: # 21: # The associative array subscripts and values are: 22: # <directory name>: 1You can specify the lines displayed by the l command. If you specify a single line number with the l command, it displays that line:
DB<1> l 10 10: $curdir =~ s/^\s+|\s+$//g;To display a range of lines, specify the first and last line number, and separate them with a hyphen:
DB<1> l 10-15 10: $curdir =~ s/^\s+|\s+$//g; 11: $curdir = &followlink($curdir); 12: &readsubdirs($curdir); 13: } 14: $curdir = &menudir($curdir); 15: }To display a specified number of lines starting at a certain point, supply the starting line number, a + character, and a number of additional lines to display:
DB<1> l 10+5 10: $curdir =~ s/^\s+|\s+$//g; 11: $curdir = &followlink($curdir); 12: &readsubdirs($curdir); 13: } 14: $curdir = &menudir($curdir); 15: }You also can use the l command to specify a subroutine to display. To do this, provide the name of the subroutine to display:
DB<1> l readsubdirs 26: sub readsubdirs { 27: local ($dirname) = @_; 28: local ($dirvar, $subdircount, $name, $index); 29: 30: # open the current directory; 31: # $dircount ensures that each file variable is unique 32: $dirvar = "DIR" . ++$dircount; 33: if (!opendir ($dirvar, $dirname)) { 34: warn ("Can't open $dirname\n"); 35: return;This command lists the statements in the subroutine. If the subroutine is too large to fit in a single window, only the first few statements are listed; you can list subsequent statements by entering l with no arguments.
The CommandYou can display the lines immediately preceding the last displayed line by entering the - command. For example, the following - command lists the window of lines immediately preceding the subroutine readsubdirs .
DB<1> - 16: 17: 18: # Find all subdirectories of the given directory, 19: # and store them in an associative array. 20: # 21: # The associative array subscripts and values are: 22: # <directory name>: 1 23: # (indicates that directory has been read) 24: # <directory name>.<num> the <num>th subdirectory 25:Subsequent - commands go back further in the file.
The CommandTo list a window of lines containinga specified line, use the w command, and specify the number of the line to be included:
DB<1> w 7 4: $curdir = ""; 5: while (1) { 6: # if we don't have a current directory, get one 7: if ($curdir eq "") { 8: print ("Enter directory to list:\n"); 9: $curdir = <STDIN>; 10: $curdir =~ s/^\s+|\s+$//g; 11: $curdir = &followlink($curdir); 12: &readsubdirs($curdir); 13: }The w command displays the three lines before the specified line and fills the window with the lines following it.
The // and ?? CommandsYou can search for a line containing a particular pattern by enclosing the pattern in slashes:
DB<1> /Find/ 18: # Find all subdirectories of the given directory,The debugger searches forward from the last displayed line for a line matching the specified pattern. If it finds such a line, the line is displayed.
To search backward for a particular pattern, enclose the pattern in question marks:
DB<1> ?readsubdirs? 12: &readsubdirs($curdir);This command starts with the last displayed line and searches backward until it finds a line matching the specified pattern.
The Command
NOTE Patterns specified by // and ?? can contain any special character understood by the Perl interpreter.You optionally can omit the final / or ? character when you match a pattern.The S command lists all the subroutines in the current file, one subroutine per line:
DB<> S main::display main::followlink main::menudir main::readsubdirsEach subroutine name is preceded by the package name and a single quotation mark.
Stepping Through ProgramsOne of the most useful features of the Perl debugger is the capability to execute a program one statement at a time. The following sections describe the statements that carry out this action.
The CommandTo execute a single statement of your program, use the s command:
DB<2> s main::(debugtest:4): $curdir = "";This command executes one statement of your program and then displays the next statement to be executed. If the statement executed needs to read from the standard input file, the debugger waits until the input is provided before displaying the next line to execute.
TIP If you have forgotten which line is the next line to execute (because, for example, you have displayed lines using the l command), you can list the next line to execute using the L command:DB<2> L3: $dircount = 0;The L command lists the last lines executed by the program. It also lists any breakpoints and line actions that have been defined for particular lines. Breakpoints and line actions are discussed later today.If the statement executed by the s command calls a subroutine, the Perl debugger enters the subroutine but does not execute any statements in it. Instead, it stops at the first executable statement in the subroutine and displays it. For example, if the following is the current line:
main::(debugtest:12): &readsubdirs($curdir);specifying the s command tells the Perl debugger to enter readsubdirs and display the following, which is the first executable line of readsubdirs :
main::readsubdirs(debugtest:27): local ($dirname) = @_;The s command assumes that you want to debug the subroutine you have entered. If you know that a particular subroutine works properly and you don't want to step through it one statement at a time, use the n command, described in the following section.
The CommandThe n command, like the s command, executes one line of your program and displays the next line to be executed:
DB<2> n main::(debugtest:5): while (1) {The n statement, however, does not enter any subroutines. If the statement executed by n contains a subroutine call, the subroutine is executed in its entirety. After the subroutine is executed, the debugger displays the line immediately following the call.
For example, if the current line is
main::(debugtest:12): &readsubdirs($curdir);the n command tells the debugger to execute readsubdirs and then display the next line in the program, which is
main::(debugtest:13:): }Combining the use of s and n ensures that the debugger examines only the subroutines you want to see.
The command
NOTE The Perl debugger does not enable you to enter any library functions. You can enter only subroutines that you have created yourself or that have been created previously and added to a subroutine libraryThe f command tells the Perl debugger to execute the remainder of the statements in the current subroutine and then display the line immediately after the subroutine call. This is useful when you are looking for a bug and have determined that the current subroutine does not contain the problem.
The Carriage-Return CommandIf you are stepping through a program using s or n , you can save yourself some typing by just pressing Enter when you want to execute another statement. When you press Enter, the debugger repeats the last s or n command executed.
For example, to step from line 5 to line 7, you can use the s command as usual:
DB<3> s main::(debugtest:7): if ($curdir eq "") {(Line 6 is skipped because it contains no executable statements.) To execute line 7, you can now just press Enter:
DB<2> main::(debugtest:8): print ("Enter directory to list:\n");The Command
NOTE Pressing Enter has no effect if you have not specified any s or n commands.If you are inside a subroutine and decide that you no longer need to step through it, you can tell the Perl debugger to finish executing the subroutine and return to the statement after the subroutine call. To do this, use the r command:
DB<4> r main::(debugtest:13:): }The statement displayed by the debugger is the first statement following the call to the subroutine.
Displaying Variable ValuesAnother powerful feature of the Perl debugger is the capability to display the value of any variable at any time. The following sections describe the commands that perform this action.
The CommandThe X command displays variables in the current package (which is main if no other package has been specified). If the X command is specified by itself, it lists all the variables in the current package, including the system-defined variables and the variables used by the Perl interpreter itself. Usually, you won't want to use the X command by itself, because there are a lot of system-defined and internal variables known to the Perl interpreter.
To print the value of a particular variable or variables, specify the variable name or names with the X command:
DB<5> X dircount $dircount = '0'This capability often is useful when you are checking for errors in your program.
You must not supply the $ character with the variable name when you use the X command. If you supply the $ character (or the @ or % characters for arrays), the debugger displays nothing.You can use X to display the values of array variables and associative array variables.
DB<6> X regarray @regarray = ( 0 14 1 'hello' 2 36 ) DB<7> X assocarray %assoc_array = ( 'hi' 1 'there' 2 )Each command prints the subscripts of the array and their values. Regular arrays are printed in order of subscript; associative arrays are printed in no particular order.
The Command
NOTE If you have an array variable and a scalar variable with the same name, the X command prints both variables:DB<8> X var
$var = '0'
@var = (
0 'test1'
1 'test2'
)There is no way to use X to display one variable but not the other.The V command is identical to the X command except that it prints the values of variables in any package. If you specify just a package name, as in the following, this command displays the values of all variables in the package (including system-defined and internal variables):
DB<9> V mypackIf you specify a package name and one or more variable names, as in the following, the debugger prints the values of the variables (if they are defined in that package):
BreakpointsDB<10> V main dircount $dircount = '0'As you have seen, you can tell the Perl debugger to execute one statement at a time. Another way of controlling program execution is to tell the debugger to execute up to a certain specified point in the program, called a breakpoint .
The following sections describe the commands that create breakpoints, and the command that executes until a breakpoint is detected.
The CommandTo set a breakpoint in your program, use the b command. This command tells the debugger to halt program execution whenever it is about to execute the specified line. For example, the following command tells the debugger to halt when it is about to execute line 10:
DB<11> b 10(If the line is not breakable, the debugger will return Line 10 is not breakable .)
NOTE You can have as many breakpoints in your program as you want. The debugger will halt program execution if it is about to execute any of the statements at which a breakpoint has been defined.The b command also accepts subroutine names:
DB<12> b menudirThis sets a breakpoint at the first executable statement of the subroutine menudir .
You can use the b command to tell the program to halt only when a specified condition is true. For example, the following command tells the debugger to halt if it is about to execute line 10 and the variable $curdir is equal to the null string:
DB<12> b 10 ($curdir eq "")The condition specified with the b statement can be any legal Perl conditional expression.
The Command
If a statement is longer than a single line, you can set a breakpoint only at the first line of the statement:71: print ("Test",
72: " here is more output");Here, you can set a breakpoint at line 71, but not line 72.After you have set a breakpoint, you can tell the debugger to execute until it reaches either the breakpoint or the end of the program. To do this, use the c command:
DB<13> c main::(debugtest:10): $curdir =~ s/^\s+|\s+$//g; DB<14>When the debugger detects that it is about to execute line 10-the line at which the breakpoint was set-it halts and displays the line. (Recall that the debugger always displays the line it is about to execute.)
The debugger now prompts you for another debugging command. This action enables you to start executing one statement at a time using n or s , continue execution using c , set more breakpoints using b , or perform any other debugging operation.
You can specify a temporary (one-time-only) breakpoint with the c command by supplying a line number:
DB<15> c 12 main::(debugtest:12): &readsubdirs($curdir);The argument 12 supplied with the c command tells the debugger to define a temporary breakpoint at line 12 and then resume execution. When the debugger reaches line 12, it halts execution, displays the line, and deletes the breakpoint. (The line itself still exists, of course.)
Using c to define a temporary breakpoint is useful if you want to skip a few lines without wasting your time executing the program one statement at a time. Using c also means that you don't have to bother defining a breakpoint using b and deleting it using d (described in the following section).
The Command and Breakpoints
TIP If you intend to define breakpoints using c or b , it is a good idea to ensure that each line of your program contains at most one statement. If you are in the habit of writing lines that contain more than one statement, such as$x++; $y++;you won't get as much use out of the debugger, because it can't stop in the middle of a lineTo list all of your breakpoints, use the L command. This command lists the last few lines executed, the current line, the breakpoints you have defined, and the conditions under which the breakpoints go into effect.
DB<16> L 3: $dircount = 0; 4: $curdir = ""; 5: while (1) { 7: if ($curdir eq "") { 10: $curdir =~ s/^\s+|\s+$//g; break if (1)Here, the program has executed lines 3-7, and a breakpoint is defined for line 10. (Line 6 is not listed because it is a comment.) You can distinguish breakpoints from executed lines by looking for the breakpoint conditional expression, which immediately follows the breakpoint. Here, the conditional expression is (1) , which indicates that the breakpoint is always in effect.
The and CommandsWhen you are finished with a breakpoint, you can delete it using the d command.
DB<16> d 10This command tells the debugger to delete the breakpoint at line 10. The line itself remains in the program.
If you do not specify a breakpoint to delete, the debugger assumes that a breakpoint is defined for the next line to be executed, and deletes it.
main::(debugtest:12): &readsubdirs($curdir); DB<17> dHere, line 12 is the next line to be executed, so the debugger deletes the breakpoint at line 12.
To delete all your breakpoints, use the D command.
DB<18> DThis command deletes all the breakpoints you have defined with the b command.
Tracing Program ExecutionWhen you run a program using the Perl debugger, you can tell it to display each line as it is executed. When the debugger is doing this, it is said to be in trace mode .
To turn on trace mode, use the T command.
DB<18> t Trace = onWhen a statement is executed in trace mode, the statement is displayed. For example, if the current line is line 5 and the command c 10 (which executes up to line 10) is entered, the following is displayed:
DB<18> c 10 main::(debugtest:5): while (1) { main::(debugtest:7): if ($curdir eq "") { main::(debugtest:10): $curdir =~ s/^\s+|\s+$//g; DB<19>The debugger prints and executes line 5 and line 7, then displays line 10 and waits for further instructions.
To turn off trace mode, specify the t command again.
DB<19> t Trace = offAt this point, trace mode is turned off until another t command is entered.
Line ActionsThe Perl debugger enables you to specify one or more statements to be executed whenever the program reaches a specified line. Such statements are known as line actions. The most common line actions are printing the value of a variable and resetting a variable containing an erroneous value to the value you want.
The following sections describe the debugging commands that define line actions.
The CommandTo specify a line action for a particular line, use the a command.
DB<19> a 10 print ("curdir is $curdir\n");This command tells the debugger to execute the statement
print ("curdir is $curdir\n");whenever it is about to execute line 10 of the program. The debugger performs the action just after it displays the current line and before it asks for the next debugging command.
To create a line action containing more than one statement, just string the statements together. If you need more than one line for the statements, put a backslash at the end of the first line.
DB<20> a 10 print ("curdir is $curdir\n"); print \ ("this is a long line action\n");In this case, when the debugger reaches line 10, it executes the following statements:
The Commandprint ("curdir is $curdir\n"); print ("this is a long line action\n");To delete the line actions defined using the a command, use the A command.
DB<21> AThis command deletes all line actions currently defined.
The < and > Commands
NOTE The A command does not affect the < and > commands, described in the following section.To define a line action that is to be executed before the debugger executes any further statements, use the > command.
DB<21> > print ("curdir before execution is $curdir\n");This command tells the debugger to print the value of $curdir before continuing.
Similarly, the < command defines a line action that is to be performed after the debugger has finished executing statements and before it asks for another debugging command:
DB<22> < print ("curdir after execution is $curdir\n");This command tells the debugger to print the value of $curdir before halting execution again.
The < and > commands are useful when you know that one of your variables has the wrong value, but you don't know which statement assigned the wrong value to the variable. By single-stepping through the program using s or n , and printing the variable either before or after executing each statement, you can determine where the variable was given its incorrect value.
Displaying Line Actions Using the Command
NOTE To delete a line action defined by the < command, enter another < command with no line action defined.DB<23> <Similarly, the following command undoes the effects of a > command:DB<24> >The L command prints any line actions you have defined using the a command (as well as breakpoints and executed lines). For example, suppose that you have defined a line action using the following command:
DB<25> a 10 print ("curdir is $curdir\n");The L command then displays this line action as shown here:
main::(debugtest:10): $curdir =~ s/^\s+|\s+$//g; action: print ("curdir is $curdir\n");The line action is always displayed immediately after the line for which it is defined. This method of display enables you to distinguish lines containing line actions from other lines displayed by the L command.
Other Debugging CommandsThe following sections describe the debugging commands not previously covered.
Executing Other Perl StatementsIn the debugger, anything that is not a debugging command is assumed to be a Perl statement and is performed right away. For example:
DB<4> @array = (1, 2, 3);You can use statements such as this to alter values in your program as it is being executed. This capability is useful when you are testing your code.
The Command: Listing Preceding Commands
NOTE If you wish, you can omit the semicolon at the end of the statement.The H (for "history") command lists the preceding few commands you have entered.
DB<4> H 3: b 7 2: b 14 1: b 13The commands are listed in reverse order, with the most recently executed command listed first. Each command is preceded by its command number, which is used by the ! command (described in the following section).
The Command: Executing Previous Commands
NOTE The debugger saves only the commands that actually affect the debugging environment. Commands such as l and s , which perform useful work but do not change how the debugger behaves, are not listed by the H command.This is not a significant limitation because you can enter the letter again if needed.Each command that is saved by the debugger and can be listed by the H command has a command number. You can use this command number to repeat a previously executed command. For example, to repeat command number 5, make the following entry:
DB <11> !5 b 8 DB <12>The debugger displays command number 5-in this case, the command b 8 - and then executes it.
If you omit the number, the debugger repeats the last command executed.
DB <12> $foo += $bar + 1 DB <13> ! $foo += $bar + 1 DB <14>If you specify a negative number with ! , the debugger skips back that many commands:
DB <14> $foo += $bar + 1 DB <15> $foo *= 2 DB <16> ! -2 $foo += $bar + 1 DB <17>Here, the ! -2 command refers to the command $foo += $bar + 1 .
The Command: Stack Tracing
You can use ! only to repeat commands that are actually repeatable. Use the H command to list the commands that the debugger has saved and that can be repeatedThe T command enables you to display a stack trace, which is a collection of all the subroutines that have been called, listed in reverse order. Here is an example:
DB <16> T $ = &main::sub2('hi') from file debug1 line 7 $ = &main::sub1('hi') from file debug1 line 3Here, the T command indicates that the program is currently inside subroutine sub2 , which was called from line 7 of your program; this subroutine is part of the main package. The call to sub2 is passed the argument 'hi' .
The $ = preceding the subroutine name indicates that the subroutine call is expecting a scalar return value. If the call is expecting a list to be returned, the characters @ = appear in front of the subroutine name.
The next line of the displayed output tells you that sub2 was called by another subroutine, sub1 . This subroutine was also passed the argument 'hi' , and it was called by line 3 of the program. Because the stack trace lists no more subroutines, line 3 is part of your main program.
The Command: Printing an Expression
NOTE The list of arguments passed to a subroutine that is displayed by the stack trace is the list of actual values after variable substitution and expression evaluation are performed. This procedure enables you to use the stack trace to check whether your subroutines are being passed the values you expect.An easy way to print the value of an expression from inside the debugger is to use the p command.
DB <17> p $curdir + 1 1The p command evaluates the expression and displays the result.
The Command: Defining Aliases
NOTE The p command writes to the screen even when the program has redirected STDOUT to a file.If you find yourself repeatedly entering a long debugging command and you want to save yourself some typing, you can define an alias for the long command by using the = command. For example:
DB <15> = pc print ("curdir is $curdir\n"); = pc print ("curdir is $curdir\n");The = command prints the alias you have just defined and then stores it in the associative array %DB'alias (package DB , array name alias ) for future reference. From here on, the command
DB <16> pcis equivalent to the command
DB <16> print ("curdir is $curdir\n");To list the aliases you have defined so far, enter the = command by itself:
DB <17> = pc = print ("curdir is $curdir\n")This command displays your defined aliases and their equivalent values.
Predefining AliasesYou can define aliases that are to be created every time you enter the Perl debugger.
When the debugger starts, it first searches for a file named .perldb in your home directory. If the debugger finds this file, it executes the statements contained there.
To create an alias, add it to the .perldb file. For example, to add the alias
= pc print ("curdir is $curdir\n");add the following statement to your .perldb file:
$DB'alias{"pc"} = 's/^pc/print ("curdir is $curdir\n");/';Here's how this works: when the Perl debugger creates an alias, it adds an element to the $DB'alias associative array. The subscript for this element is the alias you are defining, and the value is a substitution command that replaces the alias with the actual command you want to use. In the preceding example, the substitution takes any command starting with pc and replaces it with
print ("curdir is $curdir\n");The Command: Debugger Help
Be careful when you define aliases in this way. For example, your substitution should match only the beginning of a command, as in /^pc/ . Otherwise, the alias will replace any occurrence of the letters pc with your print command, which is not what you want.The h (for help) command provides a list of each of the debugger commands listed in today's lesson, along with a one-line explanation of each. This is handy if you are in the middle of debugging a program and forget the syntax of a particular command.
SummaryToday, you have learned about the Perl debugger. This debugger enables you to perform the following tasks, among others:
Q&A
- List any part of your source file
- Step through your program one statement at a time
- Display any variables you have defined
- Set breakpoints, which tell the debugger when to stop and request further commands
- Set line actions, which are statements to be executed when the program reaches a particular line
- Trace program execution as it happens
- Print a stack trace, which lists the current subroutine you are in and the subroutines that called it
Workshop
Q: Is it possible to enter more than one debugging command at a time? A: No; however, there's no real need to do so. If you want to perform several single steps at once, use the c command to skip ahead to a specified point. If you want to both step ahead and print the value of a variable, use the < or > command. Q: Is it possible to examine variables in one package while inside another? A: Yes. Use the V command or the standard Perl package/variable syntax. Q: If I discover that my program works and I want to turn off debugging, what do I do? A: You cannot exit the debugger in the middle of a program. However, if you delete all breakpoints and line actions and then enter the c command, the program begins executing normally and is no longer under control of the debugger. Q: How can I convert to a reusable breakpoint a one-time breakpoint created using c ? A: By default, the b command sets a breakpoint at the line that is about to be executed. This is the line at which c has set its one-time breakpoint. Q: How can I execute other UNIX commands from inside the debugger? A: Enter a statement containing a call to the Perl system function. For example, to display the contents of the current directory, enter the following command:
DB <11> system ("ls");
To temporarily escape from the debugger to a UNIX shell, enter the following command:
DB <12> system ("sh");
When you are finished with the shell, enter the command exit, and you will return to the debugger.Q: What special built-in variables can be accessed from inside the debugger? A: All of them. The Workshop provides quiz questions to help you solidify your understanding of the material covered.
Quiz
- Define the following terms:
- trace mode
- stack trace
- breakpoint
- line action
- Explain the differences between the X and V commands.
- Explain the differences between the // and ?? commands.
- Explain the differences between the < and > commands.
- Explain the differences between the s and n commands.
- What do the following commands do?
- l
- l 26
- l 5-7
- l 5+7
- w
![]()
![]()
![]()
![]()
![]() |
![]() |
![]() |
Dec 20, 2017 | stackoverflow.com
The way I do this is by having the following line in my~/.perldb
file:
&parse_options("HistFile=$ENV{HOME}/.perldb.hist");
Debugger commands are then stored in
~/.perldb.hist
and accessible across sessions.I did the following:1) Created
~/.perldb
, which did not exist previously.2) Added
&parse_options("HistFile=$ENV{HOME}/.perldb.hist");
from mirod's answer.3) Added
export PERLDB_OPTS=HistFile=$HOME/.perldb.history
to ~/.bashrc from mephinet's answer.4) Ran
source .bashrc
5) Ran
perl -d my program.pl
, and got this warning/errorperldb: Must not source insecure rcfile /home/ics/.perldb. You or the superuser must be the owner, and it must not be writable by anyone but its owner.6) I protected
~/.perldb
with owner rwchmod 700 ~/.perldb
, and the error went away.
![]() |
![]() |
![]() |
Dec 20, 2017 | nnc3.com
Subroutines
There is one more variation of the list code command, l . It is the ability to list the code of a subroutine, by typing l sub , where sub is the subroutine name.
Running the code in Listing 2 returns:
Loading DB routines from perl5db.pl version 1 Emacs support available. Enter h or h h for help. main::(./p2.pl:3): require 5.001; DB<1>Entering l searchdir allows us to see the text of searchdir , which is the meat of this program.
22 sub searchdir { # takes directory as argument 23: my($dir) = @_; 24: my(@files, @subdirs); 25 26: opendir(DIR,$dir) or die "Can't open \" 27: $dir\" for reading: $!\n"; 28 29: while(defined($_ = readdir(DIR))) { 30: /^\./ and next; # if file begins with '.', skip 31 32 ### SUBTLE HINT ###As you can see, I left a subtle hint. The bug is that I deleted an important line at this point.Setting Breakpoints
If we were to step through every line of code in a subroutine that is supposed to be recursive, it would take all day. As I mentioned before, the code as in Listing 2 seems only to list the files in the current directory, and it ignores the files in any subdirectories. Since the code only prints the files in the current, initial directory, maybe the recursive calls aren't working. Invoke the Listing 2 code under the debugger.
Now, set a breakpoint. A breakpoint is a way to tell the debugger that we want normal execution of the program until it gets to a specific point in the code. To specify where the debugger should stop, we insert a breakpoint. In the Perl debugger, there there are two basic ways to insert a breakpoint. The first is by line number, with the syntax b linenum . If linenum is omitted, the breakpoint is inserted at the next line about to be executed. However, we can also specify breakpoints by subroutine, by typing b sub , where sub is the subroutine name. Both forms of breakpointing take an optional second argument, a Perl conditional. If when the flow of execution reached the breakpoint the conditional evaluates to true, the debugger will stop at the breakpoint; otherwise, it will continue. This gives greater control of execution.
For now we'll set a break at the searchdir subroutine with b searchdir . Once the breakpoint is set, we'll just execute until we hit the subroutine. To do this, enter c (for continue). Adding Actions
Looking at the code in Listing 2, we can see that the first call to searchdir comes in the main code. This seems to works fine, or else nothing would be printed out. Press c again to continue to the next invocation of searchdir , which occurs in the searchdir routine.
We wish to know what is in the $dir variable, which represents the directory that will be searched for files and subdirectories. Specifically, we want to know the contents of this variable each time we cycle through the code. We can do this by setting an action. By looking at the program listing, we see that by line 25, the variable $dir has been assigned. So, set an action at line 25 in this way:
a 25 print "dir is $dir\n"Now, whenever line 25 comes around, the print command will be executed. Note that for the a command, the line number is optional and defaults to the next line to be executed.
Pressing c will execute the code until we come across a breakpoint, executing action points that are set along the way. In our example, pressing c continuously will yield the following:
main::(../p2.pl:3): require 5.001; DB<1> b searchdir DB<2> a 25 print "dir is $dir\n" DB<3> c main::searchdir(../p2.pl:23): my($dir) = @_; DB<3> c dir is . main::searchdir(../p2.pl:23): my($dir) = @_; DB<3> c dir is dir1.0 main::searchdir(../p2.pl:23): my($dir) = @_; DB<3> c dir is dir2.0 main::searchdir(../p2.pl:23): my($dir) = @_; DB<3> c dir is dir3.0 file1 file1 file1 file1 DB::fake::(/usr/lib/perl5/perl5db.pl:2043): 2043: "Debugged program terminated. Use `q' to quit or `R' to restart."; DB<3>Note that older versions of the debugger don't output the last line as listed here, but instead exit the debugger. This newer version is nice because when the program has finished it still lets you have control so that you can restart the program.
It still seems that we aren't getting into any subdirectories. Enter D and A to clear all breakpoints and actions, respectively, and enter R to restart. Or, in older debugger versions, simply restart the program to begin again.
We now know that the searchdir subroutine isn't being called for any subdirectories except the first level ones. Looking back at the text of the program, notice in lines 44 through 46 that the only time the searchdir subroutine is called recursively is when there is something in the @subdirs list. Put an action at line 42 that will print the $dir and @subdirs variables by entering:
a 42 print "in $dir is @subdirs \n"Now, put a breakpoint at line 12 to prevent the program from outputting to our screen ( b 12 ), then enter c . This will tell us all the subdirectories that our program thinks are in the directory.
main::(../p2.pl:3): require 5.001; DB<1> a 42 print "in $dir is @subdirs \n" DB<2> b 12 DB<3> c in . is dir1.0 dir2.0 dir3.0 in dir1.0 is in dir2.0 is in dir3.0 is main::(../p2.pl:12): foreach (@files) { DB<3>This program sees that there are directories in ".", but not in any of the subdirectories within ".". Since we are printing out the value of @subdirs at line 42, we know that @subdirs has no elements in it. (Notice that when listing line 42, there is the letter "a" after the line number and a colon. This tells us that there is an action point here.) So, nothing is being assigned to @subdirs in line 37, but should be if the current (as held in $_ ) file is a directory. If it is, it should be pushed into the @subdirs list. This is not happening.One error I've committed (intentionally, of course) is on line 38. There is no catch-all "else" statement. I should probably put an error statement here. Instead of doing this, let's put in another action point. Reinitialize the program so that all points are cleared and enter the following:
a 34 if( ! -f $_ and ! -d $_ ) { print "in $dir: $_ is weird!\n" } b 12" cwhich reveals:
main::(../p2.pl:3): require 5.001; DB<1> a 34 if( ! -f $_ and ! -d $_ ) { print "in $dir: $_ is weird!\n" } DB<2> b 12 DB<3> c in dir1.0: dir1.1 is weird! in dir1.0: dir2.1 is weird! in dir1.0: file2 is weird! in dir1.0: file3 is weird! in dir2.0: dir2.1 is weird! in dir2.0: dir1.1 is weird! in dir2.0: file2 is weird! in dir2.0: file3 is weird! main::(../p2.pl:12): foreach (@files) { DB<3>While the program can read (through the readdir call on line 29) that dir1.1 is a file of some type in dir1.0, the file test (the -f construct) on dir1.1 says that it is not.It would be nice to halt the execution at a point (line 34) where we have a problem. We can use the conditional breakpoint that I mentioned earlier to do this. Reinitialize or restart the debugger, and enter:
b 34 ( ! -f $_ and ! -d $_ ) c p p $dirYou'll get output that looks like this:
main::(../p2.pl:3): require 5.001; DB<1> b 34 ( ! -f $_ and ! -d $_ ) DB<2> c main::searchdir(../p2.pl:34): if( -f $_) { # if its a file... DB<2> p dir1.1 DB<2> p $dir dir1.0 DB<3>The first line sets the breakpoint, the next c executes the program until the break point stops it. The p prints the contents of the variable $_ and the last command, p $dir prints out $dir . So, dir1.1 is a file in dir1.0, but the file tests ( -d and -f ) don't admit that it exists, and therefore dir1.1 is not being inserted into @subdirs (if it's a directory) or into @files (if it's a file).Now that we are back at a prompt, we could inspect all sorts of variables, subroutines or any other Perl construct. To save you from banging your heads against your monitors, and thus saving both your heads and your monitors, I'll tell you what is wrong.
All programs have something known as the current working directory (CWD). By default, the CWD is the directory where the program starts. Any and all file accesses (such as file tests or file and directory openings) are made in reference from the CWD. At no time does our program change its CWD. But the values returned by the readdir call on line 29 are simply file names relative to the directory that readdir is reading (which is in $dir ). So, when we do the readdir , $_ gets assigned a string representing a file (or directory) within the directory in $dir (which is why it's called a subdirectory). But when running the -f and -d file tests, they look for $_ in the context of the CWD. But it isn't in the CWD, it's in the directory represented by $dir . The moral of the story is that we should be working with $dir/$_ , not just $_ . So the string
###SUBTLE HINT###should be replaced by
$_ = "$dir/$_"; # make all path names absoluteThat sums it up. Our problem was we were dealing with relative paths, not absolute (from the CWD) paths.Putting it back into our example, we need to check dir1.0/dir1.1 , not dir1.1 . To check to make sure that this is what we want, we can put in another action point. Try typing:
a 34 $_ = "$dir/$_" cIn effect this temporarily places the corrective measure into our code. Action points are the first item on the line to be evaluated. You should now see the proper results of the execution of the program:
DB<1> a 34 $_ = "$dir/$_" DB<2> c ./file1 ./dir1.0/file1 ./dir1.0/file2 ./dir1.0/file3 ./dir1.0/dir1.1/file1 ./dir1.0/dir1.1/file2 ./dir1.0/dir1.1/file3 ./dir2.0/file1 ./dir2.0/file2 ./dir2.0/file3 ./dir2.0/dir2.1/file1 ./dir2.0/dir2.1/file2 ./dir3.0/file1 DB::fake::(/usr/lib/perl5/perl5db.pl:2043): 2043: "Debugged program terminated. Use `q' to quit or `R' to restart."; DB<2>Stack Traces
Now that we've got the recursive call debugged, let's play with the calling stack a bit. Giving the command T will display the current calling stack. The calling stack is a list of the subroutines which have been called between the current point in execution and the beginning of execution. In other words, if the main portion of the code executes subroutine "a", which in turn executes subroutine "b", which calls "c", then pressing "T" while in the middle of subroutine "c" outputs a list going from "c" all the way back to "main".
Start up the program and enter the following commands (omit the second one if you have fixed the bug we discovered in the last section):
b 34 ( $_ =~ /file2$/) a 34 $_ = "$dir/$_" cThese commands set a breakpoint that will only stop execution if the value of the variable $_ ends with the string file2 . Effectively, this code will halt execution at arbitrary points in the program. Press T and you'll get this:
@ = main::searchdir('./dir1.0/file2') called from file '../p2.pl' line 45 @ = main::searchdir(.) called from file '../p2.pl' line 10Enter c , then T again:
@ = main::searchdir('./dir1.0/dir1.1/file2') called from file `../p2.pl' line 45 @ = main::searchdir(undef) called from file '../p2.pl' line 45 @ = main::searchdir(.) called from file '../p2.pl' line 10Do it once more:
@ = main::searchdir('./dir2.0/file2') called from file '../p2.pl' line 45 @ = main::searchdir(.) called from file '../p2.pl' line 10You can go on, if you so desire, but I think we have enough data from the arbitrary stack dumps we've taken.
We see here which subroutines were called, the debugger's best guess of which arguments were passed to the subroutine and which line of which file the subroutine was called from. Since the lines begin with @ = , we know that searchdir will return a list. If it were going to return a scalar value, we'd see $ = . For hashes (also known as associative arrays), we would see % = .
I say "best guess of what arguments were passed" because in Perl, the arguments to subroutines are placed into the @_ magic list. However, manipulating @_ (or $_ ) in the body of the subroutine is allowed and even encouraged. When a T is entered, the stack trace is printed out, and the current value of @_ is printed as the arguments to the subroutine. So when @_ is changed, the trace doesn't reflect what was actually passed as arguments to the subroutine.
![]() |
![]() |
![]() |
Dec 20, 2017 | affy.blogspot.com
The = command is used to create command aliases. If you find yourself issuing the same long command over and over again, you can create an alias for that command. For example, the debugger command
= pFoo print("foo=$foo\n");creates an alias called pFoo . After this command is issued, typing pFoo at the debugger prompt produces the same results as typing print("foo=$foo\n"); .You use the = command without any arguments when you want a list of the current aliases.
If you want to set up some aliases that will always be defined, create a file called .perldb and fill it with your alias definitions. Use the following line as a template:
$DB::alias{'pFoo'} = 'print("foo=$foo\n");';After you create this file and its alias definitions, the aliases will be available in every debugging session.
![]() |
![]() |
![]() |
Dec 20, 2017 | shinnok.com
Revision 0.1 for Perl Debugger version 5.8.x
Copyright: Andrew Ford refcards.com™
... ... ...
Debugger Commands
The debugger reads commands from the files .perldb in the current and home directories, and stops before the first run-time executable statement, displaying the line it is about to execute and a prompt:
DB<1>
If you run code from the debugger and hit another breakpoint, the prompt will look like DB"42". The numbers within the angle brackets are the command numbers, used when repeating commands.
Any input to the debugger that is not recognized is executed as Perl code in the current package.
Prefixing a command with ' | ' pipes the output to your current pager.
Help and Quiting
- h -- display summary' help
- h h -- display extended help
- h command -- display help on command
- man [manpage] -- view manpage for the command
- q -- quit the debugger
Debugger Control
- ! number -- re-execute debugger command
- H [-number] -- display history of commands
- source filename execute commands from file
- save filename -- save history to file
- = alias string set alias
- = list aliases
- !! command -- execute shell command
... ... ...
![]() |
![]() |
![]() |
Dec 20, 2017 | ods.com.ua
... ... ... Looking at ValuesTo see the values of certain variables in the program, use the V command. Used by itself, V lists all the variables in scope at this time. Here's the syntax:
V [ package [ variable ]]To look at values in your program, you'll want to look at the main package. For example, to print the value of $reply , use this command:
V main reply
$reply = '1'Note that the dollar sign before the variable specified to V is not supplied. Therefore, if you specify the command V main $reply , you are actually asking for the value of $$reply and not $reply .
The trace option is available with the t toggle command. Issuing trace once turns it on, and issuing it again turns it off. See Figure 30.4 for a sample use of the trace command on Listing 30.2. In this example, trace is turned on, and then the c command is issued to run the debugger continuously. In trace mode, the debugger prints out each line of code that executes.
Figure 30.4 : Using the trace command with breakpoints.
The X command is helpful when displaying values of variables in the current package. Remember that the main package is the default package for a Perl script. Issued by itself with no options, the X command displays all the variables in the current package. Avoid issuing the X command by itself because it can generate a very long listing of all the variables in the main package.
To see the value of a particular variable instead of all the variables, type the name of the variable after the X command. For example, the following command
X fileNumberwill print the value of the fileNumber variable in the current package. If you have array variables and scalar variables with the same name in the same package, the X command will display the values of both these variables. For example, if you have a scalar variable called names and an array called names , the X command will show the values of both variables:
DB<3> X namesBreakpoints
$names = "kamran"
@names = (
"kamran"
"joe"
"donald"
)You can place breakpoints at suspect locations in your code and run the program until one of the specified breakpoints is hit. Breakpoints can be specified to be hit as soon as the line of code is about to be executed.
The c command is used to step forward until either the program stops or a specified breakpoint is hit. To specify a breakpoint at the current line, use the b command without any parameters. To specify a specific line, use the command of the form:
b linenumberUsually, you use trace statements to see statements between the current execution point and a breakpoint (refer to Figure 30.4). The program is run in continuous mode with the c command until it hits a breakpoint. There is a breakpoint in Listing 30.1 that causes the debugger to stop. The L command is issued in the example to list the breakpoints in the system.
Breakpoints can also be specified to occur at the first executable line of code within a subroutine. Simply use the b command with the name of the subroutine as the first parameter. For example, to break at the first line of code in the xyc subroutine, try this command:
b xycYou can also ask the debugger to look at a condition when a line is hit with a breakpoint tag on it. If the breakpoint is specified at a line and the condition is true, the debugger stops; otherwise, it keeps on going. For example, if you want the debugger to stop in xyc only when the global $reply is 1 , use this command:
b xyc ($reply == '1')To list all breakpoints defined during a debug session, use the L command. If you issue unconditional breakpoints, you'll see breakpoints listed as this:
break if (1)The L command will also list up to the last five executed lines of the program.
To remove a breakpoint, use the d command and specify the line number to delete. To remove all breakpoints, use the D command. For example, to delete a breakpoint at line 12, you would issue the command d 12 .
The DB package uses the following sequence to hit breakpoints and evaluate code on each line of executable code:
Actions
- Checks to see whether the breakpoint is defined at this line number. If there is no breakpoint defined for this line, it starts to process the next line. If there is a break-
point at this line, the debugger prepares to stop. If the condition for the defined breakpoint is true, the debugger stops execution and presents a prompt to the user.- Checks to see whether the line of code is printable. If so, it prints the entire line of code (including code spanning multiple lines).
- Checks to see whether there are any actions defined for this line and performs these actions. (An action is a set of Perl commands to be executed.)
- Checks to see whether the stop was due to a breakpoint. If the condition for the breakpoint is true and a breakpoint has been marked in this location, the debugger stops and presents a prompt for user interaction.
- Evaluates the line and gets ready to execute it. Gets user input if the user is stopping; otherwise, it executes the line and returns to item 1 in order to process the next line.
You can specify actions to take when a certain line of code is executed. This step is very important when you want to print out values as the program executes (see Figure 30.5). Notice how the value of reply is printed out when line 73 is reached. The action is defined with this statement:
Figure 30.5 : Using actions in the debugger.
a 73 print "I am on line 73 and reply is $reply"Notice that you did not have to terminate the action command with a semicolon. You need to use semicolons only if you have more than one statement for an action. If you forget to supply the terminating semicolon, the debugger will supply it for you. In any event, try to keep actions simple and short. Don't write lengthy actions unless absolutely necessary; otherwise, you'll slow down the debugger and clutter up the output on your terminal.
Actions are not limited to displaying values. For instance, you can use an action to reset a variable to a known value while in a loop, using a statement like this:
a 73 $reply = 1; print "forced reply to 1\n";To execute statements within the debugged program's space, simply type the command at the prompt. For example, to explicitly create and set the value of $kw to 2 in the code, use the following commands at the DB<> prompt:
DB<1> $kw = 2
... nothing is printed here ...
DB<1> print $kw
2
DB<1> V main kw
$kw = '2'In this example, the variable $kw is created and defined in the program environment. You cannot modify the source code in the original program, but you can add items to the name space.
In some cases, your program may have redirected its output to STDOUT and therefore whatever it is printing will not be shown on the console. To evaluate an expression and print its value out to the console regardless of how STDOUT is redirected, you can use the p command. The p command evaluates an expression in the current program's environment and prints it out to the debugger console. Basically, the print command prints the output to wherever STDOUT is redirected, whereas the p command is equivalent to the following print command:
print DB::OUTThe command above forces output from a print command to where the DB:: package prints its output.
Searching for PatternsTo look for certain strings in the source code, you can use the forward slash command followed by the string to look for. Note that there are no spaces between the / and the string you are looking for. The string can be specified between two slashes, but the second slash is optional. Actually, you can search for regular expressions, just as in Perl.
To search forward in the file, use the / operator. To search backward, use the question mark operator ( ? ).
The history of the commands you have executed is tracked in the debugger. Only commands greater than one character long are listed in this directory. To execute commands from the history list, use the bang operator ( ! ) followed by the index of the command. To execute a command from the history, type ! and the index of the command to redo. This should be familiar to Bash and C shell programmers.
To see the current history of commands in the buffer of commands in the debugger, type the H command. For example, in the middle of a debug session, if you type in the H command at the DB<3> prompt, you should expect to see three items listed in reverse order of execution:
DB<3> HSubroutines
3: b 79
2: w 2
1: w 9To list all the subroutines currently in the system, use the S command. The output from the S command lists all subroutines in any package that your code uses. For example, if you run the program in Listing 30.2 with the debugger, you will see output as shown in Figure 30.6.
Figure 30.6 : Listing subroutine names.
Listing 30.2. A sample listing.
1 #!/usr/bin/perl -d
2
3 use VRML;
4 use VRML::Cube;
5
6 my $header = VRML::new();
7 $header->VRML::startHeader;
8
9 $header->VRML::startSeparator;
10
11 my $cubeb = $header->VRML::putCube(
12 'width' => 0.5, 'height' => 0.5 , 'depth' => 0.5 ,
13 'translation' => [1,0,0]
14 );
15 my $cubed = $header->VRML::putCube(
16 'width' => 1, 'height' => 1 , 'depth' => 1 ,
17 'translation' => [1,1,0],
18 );
19 $header->VRML::stopSeparator;
At any time in a debug session, you can do a "stack trace," which is a listing of the calling order of the functions called so far. Be aware that if you are modifying the argument stack in any way, the values of the passed arguments might not be correct. The T command will do a stack trace for you.
CaveatsFirst of all, there is no way to restart the debugger if there is a problem. If you overstep something, you have to start all over. This means getting out of the program and restarting the debugger.
Second, the debugger itself is not completely debugged yet. If you notice certain problems, such as your commands not being recognized, it's probably because you typed too many characters at the prompt.
Table 30.1 lists the information about the available debugger commands. All information in this table is gleaned from the perl5db.pl source file. Keep this table handy so that you don't have to go to the file to see what options are available.
Table 30.1. The commands available from the debugger. Customizing Your Debugger Environment
Command Description a [ ln ] command Sets an action to take before the line is executed. b Sets an unconditional breakpoint at the current line. b [ ln ] [ cond ] Sets a breakpoint if the condition is true at the specified line number. b sname [ cond ] Sets a breakpoint at the first line inside the subroutine sname() . c Continues until the next breakpoint or until the end of the program. c line Continues and stops at the specified line. d [ line ] Deletes the breakpoint at a given line. D Deletes all breakpoints. f filename Switches to the filename as the default. H - number Displays history of all commands longer than one character. L Lists all breakpoints and actions. l min+incr Lists incr+1 lines starting at line #min . l min-max Lists lines from min to max , inclusively. l line Lists one line of code at a specified line. l Lists the next 10 lines of code from the last location. l name Lists a subroutine by name. n Next code at the same level. Steps over subroutine calls. p expr Same as print DB::OUT expr in current package. q or ^D Quits. You cannot use quit . r Returns from current subroutine. s Single-step over code. Steps into subroutines. S Lists all known subroutine names in the current scope. t Toggles trace mode on and off. T Performs a stack trace. V Lists all variables in all used packages. V pkg List all variables in a given package. V pkg var Lists all variables in a package that have var in them. w line Lists five lines before and five lines after current line. <CR> Repeats last n or s . - Lists the previous window. / regexp / Searches forward for a pattern using a regular expression. ? regexp ? Searches backward for a pattern using a regular expression. < command Defines the command before the prompt. > command Defines the command after the prompt. ! number Redoes a command (the default is the previous command). ! - number Redoes number\'th to the last command. = [ alias value ] Starts a command alias. = Lists all the current aliases. command Executes as a Perl statement in the current package. There are ways to customize your debugger environment. If you do not like the one-character commands that come with the debugger, you can use different aliases. There is a hash in the DB:: package called %alias() that contains the command strings. You can substitute your own commands in place of the existing ones using the = command. Since most of the time you'll want to keep your changes consistent between debug sessions, you can edit a file called .perldb in the current working directory and place the assignments there. Here's a sample .perldb file:
$DB::alias{'ln'} = 's/ln/p $1/';
$DB::alias{'z'} = 's/z/l/';These two lines will substitute the value of p for every command ln you type, and the value of l for every z command. Of course, you'll probably want to alias long commands into short one-character sequences to save yourself some time.
Using the debugger should not be your only method for getting bugs out of the system. The -w switch is important if you want Perl to do checking and warn you of error conditions while executing. The types of messages generated vary from warnings to notifications of fatal errors that can cause the program to abort.
For More InformationReading the source file perl5db.pl gives you a few clues about how the debugger works and the commands that are available during a debug session. Consult the perldebug.html page at www.metronet.com . This file contains the full list of all the options in the debug environment. Review the perldiag.html page for a list of possible diagnostic values you get from using the w switch.
SummaryNothing really beats the use of well-placed print statements to do debugging. However, Perl does offer a simple yet powerful debugging tool with the -d option. The interactive debugger lets you step through code, into or over subroutines, set breakpoints, execute commands, and look at variables in a Perl program.
![]() |
![]() |
![]() |
Dec 20, 2017 | perldoc.perl.org
chomp Perl functions A-Z | Perl functions by category | The 'perlfunc' manpage
- chomp VARIABLE
- chomp( LIST )
- chomp
This safer version of chop removes any trailing string that corresponds to the current value of $/ (also known as
$INPUT_RECORD_SEPARATOR
in the English module). It returns the total number of characters removed from all its arguments. It's often used to remove the newline from the end of an input record when you're worried that the final record may be missing its newline. When in paragraph mode ($/ = ''
), it removes all trailing newlines from the string. When in slurp mode ($/ = undef
) or fixed-length record mode ( $/ is a reference to an integer or the like; see perlvar ), chomp won't remove anything. If VARIABLE is omitted, it chomps $_ . Example:If VARIABLE is a hash, it chomps the hash's values, but not its keys, resetting the each iterator in the process.
You can actually chomp anything that's an lvalue, including an assignment:
If you chomp a list, each element is chomped, and the total number of characters removed is returned.
Note that parentheses are necessary when you're chomping anything that is not a simple variable.
This is because
chomp $cwd = `pwd`
is interpreted as(chomp $cwd) = `pwd`
, rather than aschomp ($cwd = `pwd`)
which you might expect.Similarly,
chomp $a $b
is interpreted aschomp($a) $b
rather than aschomp $a $b
![]() |
![]() |
![]() |
Oct 01, 2009 | perlide.org
In October 2009 we ran a poll asking people Which editor(s) or IDE(s) are you using for Perl development? . The poll was promoted via the blog of Gabor Szabo which is syndicated in several Perl related sites such as the Iron Man Challenge , Perlshpere and Planet Perl . It was also promoted via Twitter , the Perl group in Reddit , the Perl Mongers group in LinkedIn and the Perl Community Adserver to get more people to cast their vote. Request was also sent to the Perl Monger group leaders. Some of them have forwarded the request to their respective groups.
The list of editors was taken from the Perl Development Tools page on Perlmonks and the "randomize answers" checkbox was clicked after filling in the data. No idea if that really randomized the answers. During the poll people could mark other editors and type in the name of and editor. Some of these editors were added to the list of possible answers during the poll. In addition there were people who typed in the name of the editor in the other field even though the name appeared on the list.
At the begining we set the poll to allow multiple choice with up to 3 answers per person but later on we noticed that at one of the updates it became multiple choice unlimited answers. Unfortunatelly the free polling system we used gave details only on the number of answers and not the number of people who answered.
The poll ran between 21-24 October 2009 for about 72 hours. There were 3,234 answers when it was closed.
The results are as follows.
Vim (or vi or gvim) 1097 34% Emacs (or xemacs, with or without extensions) 430 13% Ultra Edit (plain or Studio) 224 7% Eclipse EPIC 210 6% Other answer... 143 4% Notepad++ 142 4% Komodo IDE 128 4% Komodo Edit 105 3% TextMate 105 3% Padre 101 3% Kate 56 2% Gedit 55 2% TextPad 49 2% nano 40 1% SciTE 38 1% Geany 36 1% NEdit 27 1% mcedit 26 1% EditPlus 26 1% BBEdit 25 1% JEdit 23 1% Joe 20 1% Smultron 16 0% TextWrangler 14 0% PSPad 12 0% Notepad2 12 0% Open Perl IDE 10 0% OptiPerl 9 0% Pico 7 0% Jed 6 0% Kephra 6 0% SlickEdit 6 0% KDevelop 6 0% Notepad 5 0% Crimson 4 0% Anjuta 3 0% EngInSite-Perl 3 0% KEdit 3 0% Perl Express 2 0% DzSoft Perl 2 0% PerlWiz 1 0% Far 1 0% Perl Studio 0 0% Perl Builder 0 0% Editeur 0 0% Perl Code Editor 0 0% ED for Windows 0 0% PerlEdit 0 0% FTE 0 0% visiPerl+ 0 0% Prof. Notepad 0 0% Perl Scripting Tool 0 0%
![]() |
![]() |
![]() |
Mar 13, 2007 | cs.rpi.edu
On this page, I will post aides and tools that Perl provides which allow you to more efficently debug your Perl code. I will post updates as we cover material necessary for understanding the tools mentioned.
- CGI::Dump
- Dump is one of the functions exported in CGI.pm's :standard set. It's functionality is similar to that of Data::Dumper . Rather than pretty-printing a complex data structure, however, this module pretty-prints all of the parameters passed to your CGI script. That is to say that when called, it generates an HTML list of each parameter's name and value, so that you can see exactly what parameters were passed to your script. Don't forget that you must print the return value of this function - it doesn't do any printing on its own.
use CGI qw/:standard/; print Dump;- Benchmark
- As you know by now, one of Perl's mottos is "There's More Than One Way To Do It" (TMTOWTDI ©). This is usually a Good Thing, but can occasionally lead to confusion. One of the most common forms of confusion that Perl's verstaility causes is wondering which of multiple ways one should use to get the job done most quickly.
Analyzing two or more chunks of code to see how they compare time-wise is known as "Benchmarking". Perl provides a standard module that will Benchmark your code for you. It is named, unsurprisingly, Benchmark . Benchmark provides several helpful subroutines, but the most common is called cmpthese() . This subroutine takes two arguments: The number of iterations to run each method, and a hashref containing the code blocks (subroutines) you want to compare, keyed by a label for each block. It will run each subroutine the number of times specified, and then print out statistics telling you how they compare.
For example, my solution to ICA5 contained three different ways of creating a two dimensional array. Which one of these ways is "best"? Let's have Benchmark tell us:
#!/usr/bin/perl use strict; use warnings; use Benchmark 'cmpthese'; sub explicit { my @two_d = ([ ('x') x 10 ], [ ('x') x 10 ], [ ('x') x 10 ], [ ('x') x 10 ], [ ('x') x 10 ]); } sub new_per_loop { my @two_d; for (0..4){ my @inner = ('x') x 10; push @two_d, \@inner; } } sub anon_ref_per_loop { my @two_d; for (0..4){ push @two_d, [ ('x') x 10 ]; } } sub nested { my @two_d; for my $i (0..4){ for my $j (0..9){ $two_d[$i][$j] = 'x'; } } } cmpthese (10_000, { 'Explicit' => \&explicit, 'New Array Per Loop' => \&new_per_loop, 'Anon. Ref Per Loop' => \&anon_ref_per_loop, 'Nested Loops' => \&nested, } );The above code will print out the following statistics (numbers may be slightly off, of course):Benchmark: timing 10000 iterations of Anon. Ref Per Loop, Explicit, Nested Loops, New Array Per Loop... Anon. Ref Per Loop: 2 wallclock secs ( 1.53 usr + 0.00 sys = 1.53 CPU) @ 6535.95/s (n=10000) Explicit: 1 wallclock secs ( 1.24 usr + 0.00 sys = 1.24 CPU) @ 8064.52/s (n=10000) Nested Loops: 4 wallclock secs ( 4.01 usr + 0.00 sys = 4.01 CPU) @ 2493.77/s (n=10000) New Array Per Loop: 2 wallclock secs ( 1.76 usr + 0.00 sys = 1.76 CPU) @ 5681.82/s (n=10000) Rate Nested Loops New Array Per Loop Anon. Ref Per Loop Explicit Nested Loops 2494/s -- -56% -62% -69% New Array Per Loop 5682/s 128% -- -13% -30% Anon. Ref Per Loop 6536/s 162% 15% -- -19% Explicit 8065/s 223% 42% 23% --The benchmark first tells us how many iterations of which subroutines it's running. It then tells us how long each method took to run the given number of iterations. Finally, it prints out the statistics table, sorted from slowest to fastest. The Rate column tells us how many iterations each subroutine was able to perform per second. The remaining colums tells us how fast each method was in comparison to each of the other methods. (For example, 'Explicit' was 223% faster than 'Nested Loops', while 'New Array Per Loop' is 13% slower than 'Anon. Ref Per Loop'). From the above, we can see that 'Explicit' is by far the fastest of the four methods. It is, however, only 23% faster than 'Ref Per Loop', which requires far less typing and is much more easily maintainable (if your boss suddenly tells you he'd rather have the two-d array be 20x17, and each cell init'ed to 'X' rather than 'x', which of the two would you rather had been used?).
You can, of course, read more about this module, and see its other options, by reading: perldoc Benchmark
- Command-line options
- Perl provides several command-line options which make it possible to write very quick and very useful "one-liners". For more information on all the options available, refer to perldoc perlrun
- -e
- This option takes a string and evaluates the Perl code within. This is the primary means of executing a one-liner
perl -e'print qq{Hello World\n};'(In windows, you may have to use double-quotes rather than single. Either way, it's probably better to use q// and qq// within your one liner, rather than remembering to escape the quotes).- -l
- This option has two distinct effects that work in conjunction. First, it sets $\ (the output record terminator) to the current value of $/ (the input record separator). In effect, this means that every print statement will automatically have a newline appended. Secondly, it auto-chomps any input read via the <> operator, saving you the typing necessary to do it.
perl -le 'while (<>){ $_ .= q{testing}; print; }'The above would automatically chomp $_, and then add the newline back on at the print statement, so that "testing" appears on the same line as the entered string.- -w
- This is the standard way to enable warnings in your one liners. This saves you from having to type use warnings;
- -M
- This option auto- use s a given module.
perl -MData::Dumper -le'my @foo=(1..10); print Dumper(\@foo);'- -n
- This disturbingly powerful option wraps your entire one-liner in a while (<>) { ... } loop. That is, your one-liner will be executed once for each line of each file specified on the command line, each time setting $_ to the current line and $. to current line number.
perl -ne 'print if /^\d/' foo.txt beta.txtThe above one-line of code would loop through foo.txt and beta.txt, printing out all the lines that start with a digit. ($_ is assigned via the implicit while (<>) loop, and both print and m// operate on $_ if an explict argument isn't given).- -p
- This is essentially the same thing as -n , except that it places a continue { print; } block after the while (<>) { ... } loop in which your code is wrapped. This is useful for reading through a list of files, making some sort of modification, and printing the results.
perl -pe 's/Paul/John/' email.txtOpen the file email.txt, loop through each line, replacing any instance of "Paul" with "John", and print every line (modified or not) to STDOUT- -i
- This one sometimes astounds people that such a thing is possible with so little typing. -i is used in conjunction with either -n or -p. It causes the files specified on the command line to be edited "in-place", meaning that while you're looping through the lines of the files, all print statements are directed back to the original files. (That goes for both explicit print s, as well as the print in the continue block added by -p.)
If you give -i a string, this string will be used to create a back-up copy of the original file. Like so:
perl -pi.bkp -e's/Paul/John/' email.txt msg.txtThe above opens email.txt, replaces each line's instance of "Paul" with "John", and prints the results back to email.txt. The original email.txt is saved as email.txt.bkp. The same is then done for msg.txtRemember that any of the command-line options listed here can also be given at the end of the shebang in non-one-liners. (But please do not start using -w in your real programs - use warnings; is still preferred because of its lexical scope and configurability).
- Data::Dumper
- The standard Data::Dumper module is very useful for examining exactly what is contained in your data structure (be it hash, array, or object (when we come to them) ). When you use this module, it exports one function, named Dumper . This function takes a reference to a data structure and returns a nicely formatted description of what that structure contains.
#!/usr/bin/env perl use strict; use warnings; use Data::Dumper; my @foo = (5..10); #add one element to the end of the array #do you see the error? $foo[@foo+1] = 'last'; print Dumper(\@foo);When run, this program shows you exactly what is inside @foo:
$VAR1 = [ 5, 6, 7, 8, 9, 10, undef, 'last' ];(I know we haven't covered references yet. For now, just accept my assertion that you create a reference by prepending the variable name with a backslash...)
- __DATA__ & <DATA>
- Perl uses the __DATA__ marker as a pseudo-datafile. You can use this marker to write quick tests which would involve finding a file name, opening that file, and reading from that file. If you just want to test a piece of code that requires a file to be read (but don't want to test the actual file opening and reading), place the data that would be in the input file under the __DATA__ marker. You can then read from this pseudo-file using <DATA>, without bothering to open an actual file:
#!/usr/bin/env perl use strict; use warnings; while (my $line = <DATA>) { chomp $line; print "Size of line $.: ", length $line, "\n"; } __DATA__ hello world 42 abcdeThe above program would print:
Size of line 1: 11 Size of line 2: 2 Size of line 3: 5- $.
- The $. variable keeps track of the line numbers of the file currently being processed via a while (<$fh>) { ... } loop. More explicitly, it is the number of the last line read of the last file read.
- __FILE__ & __LINE__
- These are two special markers that return, respectively, the name of the file Perl is currently executing, and the Line number where it resides. These can be used in your own debugging statements, to remind yourself where your outputs were in the source code:
print "On line " . __LINE__ . " of file " . __FILE__ . ", \$foo = $foo\n";Note that neither of these markers are variables, so they cannot be interpolated in a double-quoted string
- warn() & die()
- These are the most basic of all debugging techniques. warn() takes a list of strings, and prints them to STDERR. If the last element of the list does not end in a newline, warn() will also print the current filename and line number on which the warning occurred. Execution then proceeds as normal.
die() is identical to warn() , with one major exception - the program exits after printing the list of strings.
All debugging statements should make use of either warn() or die() rather than print() . This will insure you see your debugging output even if STDOUT has been redirected, and will give you the helpful clues of exactly where in your code the warning occurred.
![]() |
![]() |
![]() |
Dec 19, 2017 | open-perl-ide.sourceforge.net
This section explains how to use Open Perl IDE for debugging.
Important: Open Perl IDE is not able to debug any scripts, if it does not know a path to "perl.exe". If the PATH environment variable contains a valid location, then "perl.exe" will be detected automatically. Otherwise it is necessary to enter a valid location into
the "Preferences | General | Directories | Path to perl.exe" field.
There are two methods to debug a script:After execution is stopped, it is possible to analyse the actual state of the script by
- Set one or more breakpoints (as explained in section 5.1 Breakpoints) and run the script, which is executed until a breakpoint is reached.
- Choose Step Over from the Run Menu. After the script and all required modules are loaded and initalized, the execution stops on the first line of non-initialization code.
Furthermore, it is possible to set/delete breakpoints (see section 5.1 Breakpoints) or to continue/abort the execution of the script. The following table shows the different navigation possibilities:
- Viewing Console Output, see section 4.2 Compile and Run a script
- Evaluating some variables, see section 5.2 Variable Evaluation
- Viewing the list of loaded modules, see section 5.3 Other debug windows
- Viewing the callstack, see section 5.3 Other debug windows
Table: Debug Navigation
Name Shortcut Description Run F9 Start/Continue script execution until next breakpoint is reached. Step Over F8 Execute the current script line, not tracing into subroutines. Step Into F7 Execute the next command in the current script line, tracing into subroutines. Abort CTRL-F2 Request termination of debug session. Force Termination CTRL-ALT-F12 Immediately terminate debug session.
You should only use "Force Termination" if you see no other way to stop script execution. Dont't expect Open Perl IDE to work correctly after using forced termination !
If script execution has finished, then Open Perl IDE automatically switches back from debug mode to edit mode.
![]() |
![]() |
![]() |
Dec 19, 2017 | www.informit.com
"Complexity is the enemy, and our aim is to kill it." -Jan Baan
One of Perl's greatest strengths is its expressiveness and extreme conciseness. Complexity is the bane of software development: when a program grows beyond a certain size, it becomes much harder to test, maintain, read, or extend. Unfortunately, today's problems mean this is true for every program we need. Anything you can do to minimize the complexity of your program will pay handsome dividends.
The complexity of a program is a function of several factors:
- The number of distinct lexical tokens
- The number of characters
- The number of branches in which control can pass to a different point
- The number of distinct program objects in scope at any time
Whenever a language allows you to change some code to reduce any of these factors, you reduce complexity.
3.7.1 Lose the Temporary VariablesThe poster child for complexity is the temporary variable. Any time a language intrudes between you and the solution you visualize, it diminishes your ability to implement the solution. All languages do this to some degree; Perl less than most. 13 In most languages, you swap two variables a and b with the following algorithm:
Declare temp to be of the same type as a and b temp = a; a = b; b = temp;But most languages are not Perl:
($b, $a) = ($a, $b);Iterating over an array usually requires an index variable and a count of how many things are currently stored in the array:
int i; for (i = 0; i < count_lines; i++) { strcat (line[i], suffix); }Whereas in Perl, you have the foreach construct borrowed from the shell:
foreach my $line (@lines) { $line .= $suffix }And if you feel put out by having to type foreach instead of just for , you're in luck, because they're synonyms for each other; so just type for if you want (Perl can tell which one you mean).
Because functions can return lists, you no longer need to build special structures just to return multivalued data. Because Perl does reference-counting garbage collection, you can return variables from the subroutine in which they are created and know that they won't be trampled on, yet their storage will be released later when they're no longer in use. And because Perl doesn't have strong typing of scalars, you can fill a hierarchical data structure with heterogeneous values without having to construct a union datatype and some kind of type descriptor.
Because built-in functions take lists of arguments where it makes sense to do that, you can pass them the results of other functions without having to construct an iterative loop:
unlink grep /~$/, readdir DIR;And the map function lets you form a new list from an old one with no unnecessary temporary variables:
open PASSWD, '/etc/passwd' or die "passwd: $!\n"; my @usernames = map /^([^:]+)/, <PASSWD>; close PASSWD;Because Perl's arrays grow and shrink automatically and there are simple operators for inserting, modifying, or deleting array elements, you don't need to build linked lists and worry if you've got the traversal termination conditions right. And because Perl has the hash data type, you can quickly locate a particular chunk of information by key or find out whether a member of a set exists.
3.7.2 Scope Out the ProblemOf course, sometimes temporary variables are unavoidable. Whenever you create one though, be sure and do it in the innermost scope possible (in other words, within the most deeply nested set of braces containing all references to the variable).
Create variables in the innermost scope possible.For example, let's say somewhere in my program I am traversing my Netscape history file and want to save the URLs visited in the last 10 days in @URLs :
use Netscape::History; my $history = new Netscape::History; my (@URLs, $url); while (defined($url = $history->next_url() )) { push @URLs, $url if time - $url->last_visit_time < 10 * 24 * 3600; }This looks quite reasonable on the face of it, but what if later on in our program we create a variable called $history or $url ? We'd get the message
"my" variable $url masks earlier declaration in same scopewhich would cause us to search backward in the code to find exactly which one it's referring to. Note the clause " in same scope " -- if in the meantime you created a variable $url at a different scope, well, that may be the one you find when searching backward with a text editor, but it won't be the right one. You may have to check your indentation level to see the scope level.
This process could be time-consuming. And really, the problem is in the earlier code, which created the variables $history or $url with far too wide a scope to begin with. We can (as of perl 5.004) put the my declaration of $url right where it is first used in the while statement and thereby limit its scope to the while block. As for $history , we can wrap a bare block around all the code to limit the scope of those variables:
use Netscape::History; my @URLs; { my $history = new Netscape::History; while (defined(my $url = $history->next_url() )) { push @URLs, $url if time - $url->last_visit_time < 10 * 24 * 3600; } }If you want to create a constant value to use in several places, use constant.pm to make sure it can't be overwritten:
$PI = 3.1415926535897932384; use constant PI => 3.1415926535897932384; my $volume = 4/3 * PI * $radius ** 3; $PI = 3.0; # The 'Indiana maneuver' works! PI = 3.0; # But this does notIn response to the last statement, Perl returns the error message, " Can't modify constant item in scalar assignment ."
constant.pm creates a subroutine of that name which returns the value you've assigned to it, so trying to overwrite it is like trying to assign a value to a subroutine call. Although the absurdity of that may sound like sufficient explanation for how use constant works, in fact, the latest version of perl allows you to assign a value to a subroutine call, provided the result of the subroutine is a place where you could store the value. For example, the subroutine could return a scalar variable. The term for this feature is lvaluable subroutine . But since the results of the subroutines created by use constant aren't lvalues, lvaluable subroutines won't cause problems for them.
![]() |
![]() |
![]() |
Nov 01, 2000 | www.ibm.com
Bugs are as inevitable as death and taxes. Nevertheless, the following material should help you avoid the pitfalls of bugs.
... ... ...First let's simply make sure the bug is repeatable. We'll set an action on line 8 to print $line where the error occurred, and run the program.
perl -d ./buggy.pl buggy.pl
use Data::Dumpe
a 8 print 'The line variable is now ', Dumper $line
The Data::Dumper module loads so that the autoaction can use a nice output format. The autoaction is set to do a print statement every time line 8 is reached. Now let's watch the show.
![]() |
![]() |
![]() |
Dec 08, 2017 | www.thegeekstuff.com
Perl Debugger Tutorial: 10 Easy Steps to Debug Perl Program by Balakrishnan Mariyappan on May 19, 2010
Earlier we discussed the basics of how to write and execute a perl program using Perl Hello World Example .
In this article, Let us review how to debug a perl program / script using Perl debugger , which is similar to the gdb tool for debugging C code .
To debug a perl program, invoke the perl debugger using "perl -d" as shown below.
# perl -d ./perl_debugger.plTo understand the perl debugger commands in detail, let us create the following sample perl program (perl_debugger.pl).
$ cat perl_debugger.pl #!/usr/bin/perl -w # Script to list out the filenames (in the pwd) that contains specific pattern. #Enabling slurp mode $/=undef; # Function : get_pattern # Description : to get the pattern to be matched in files. sub get_pattern { my $pattern; print "Enter search string: "; chomp ($pattern = <> ); return $pattern; } # Function : find_files # Description : to get list of filenames that contains the input pattern. sub find_files { my $pattern = shift; my (@files,@list,$file); # using glob, obtaining the filenames, @files = <./*>; # taking out the filenames that contains pattern. @list = grep { $file = $_; open $FH,"$file"; @lines = <$FH>; $count = grep { /$pattern/ } @lines; $file if($count); } @files; return @list; } # to obtain the pattern from STDIN $pattern = get_pattern(); # to find-out the list of filenames which has the input pattern. @list = find_files($pattern); print join "\n",@list;1. Enter Perl Debugger# perl -d ./perl_debugger.pl
it prompts,
2. View specific lines or subroutine statements using (l)
DB<1>DB<1> l 10
10: my $pattern;DB<2> l get_pattern
11 {
12: my $pattern;
13: print "Enter search string: ";
14: chomp ($pattern = );
15: return $pattern;
16 }3. Set the breakpoint on get_pattern function using (b) DB<3> b find_files
4. Set the breakpoint on specific line using (b)DB<4> b 44
5. View the breakpoints using (L)DB<5> L
6. step by step execution using (s and n)
./perl_debugger.pl:
22: my $pattern = shift;
break if (1)
44: print join "\n",@list;
break if (1)DB<5> s
main::(./perl_debugger.pl:39): $pattern = get_pattern();DB<5> s
main::get_pattern(./perl_debugger.pl:12):
12: my $pattern;Option s and n does step by step execution of each statements. Option s steps into the subroutine. Option n executes the subroutine in a single step (stepping over it).
The s option does stepping into the subroutine but while n option which would execute the subroutine(stepping over it).
7. Continue till next breakpoint (or line number, or subroutine) using (c)DB<5> c
8. Continue down to the specific line number using (c)
Enter search string: perl
main::find_files(./perl_debugger.pl:22):
22: my $pattern = shift;DB<5> c 36
9. Print the value in the specific variable using (p)
main::find_files(./perl_debugger.pl:36):
36: return @list;DB<6> p $pattern
perlDB<7> c
main::(./perl_debugger.pl:44): print join "\n",@list;
DB<7> c
./perl_debugger.pl
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.After the last continue operation, the output gets printed on the stdout as "./perl_debugger.pl" since it matches the pattern "perl".
10. Get debug commands from the file (source)Perl debugger can get the debug command from the file and execute it. For example, create the file called "debug_cmds" with the perl debug commands as,
c
p $pattern
qNote that R is used to restart the operation(no need quit and start debugger again).
DB<7> R
DB<7> source debug_cmds
>> c
Enter search string: perl
./perl_debugger.pl
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.
>> p $pattern
perl
>> qNote : If you are relatively new to perl, refer to our previous article: 20 perl programming tips for beginners .
Summary of perl debugger commandsFollowing options can be used once you enter the perl debugger.
- h or h h – for help page
- c – to continue down from current execution till the breakpoint otherwise till the subroutine name or line number,
- p – to show the values of variables,
- b – to place the breakpoints,
- L – to see the breakpoints set,
- d – to delete the breakpoints,
- s – to step into the next line execution.
- n – to step over the next line execution, so if next line is subroutine call, it would execute subroutine but not descend into it for inspection.
- source file – to take the debug commands from the file.
- l subname – to see the execution statements available in a subroutine.
- q – to quit from the debugger mode.
![]() |
![]() |
![]() |
Dec 03, 2017 | my.safaribooksonline.com
Not for the fainthearted, if you want to see how a regular expression runs when used in a match or substitution, use the core
re
pragma with itsdebug
option:% perl -Mstrict -Mwarnings
use re qw(debug);
$_ = "cats=purr, dog=bark";
my %sound = /(\w+)=(\w+)/g;
^D
Compiling REx `(\w+)=(\w+)'
size 15 first at 4
1: OPEN1(3)
3: PLUS(5)
4: ALNUM(0)
5: CLOSE1(7)
7: EXACT <=>(9)
9: OPEN2(11)
11: PLUS(13)
12: ALNUM(0)
13: CLOSE2(15)
15: END(0)
floating `=' at 1..2147483647 (checking floating) stclass `ALNUM' plus
minlen 3
Guessing start of match, REx `(\w+)=(\w+)' against `cats=purr,
dog=bark'...
Found floating substr `=' at offset 4...
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx `(\w+)=(\w+)' against `cats=purr, dog=bark'
Setting an EVAL scope, savestack=3
0 <> <cats=purr, d> | 1: OPEN1
0 <> <cats=purr, d> | 3: PLUS
ALNUM can match 4 times out of 32767...
Setting an EVAL scope, savestack=3
4 <cats> <=purr, d> | 5: CLOSE1
4 <cats> <=purr, d> | 7: EXACT <=>
5 <cats=> <purr, d> | 9: OPEN2
5 <cats=> <purr, d> | 11: PLUSSetting an EVAL scope, savestack=3
9 <=purr> <, dog=b> | 13: CLOSE2
9 <=purr> <, dog=b> | 15: END
Match successful!
Guessing start of match, REx `(\w+)=(\w+)' against `, dog=bark'...
Found floating substr `=' at offset 5...
By STCLASS: moving 0 --> 2
Guessed: match at offset 2
Matching REx `(\w+)=(\w+)' against `dog=bark'
Setting an EVAL scope, savestack=3
11 <urr, > <dog=bar> | 1: OPEN1
11 <urr, > <dog=bar> | 3: PLUS
ALNUM can match 3 times out of 32767...
Setting an EVAL scope, savestack=3
14 <rr, dog> <=bark> | 5: CLOSE1
14 <rr, dog> <=bark> | 7: EXACT <=>
15 <rr, dog=> <bark> | 9: OPEN2
15 <rr, dog=> <bark> | 11: PLUS
ALNUM can match 4 times out of 32767...
Setting an EVAL scope, savestack=3
19 <rr, dog=bark> <> | 13: CLOSE2
19 <rr, dog=bark> <> | 15: END
Match successful!
Freeing REx: `(\w+)=(\w+)'
debugcolor
option instead ofdebug
, you'll get some form of highlighting or coloring in the output that'll make it prettier, if not more understandable
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC
Re^2: Strategies for maintenance of horrible code?
Actually, just writing the tests is often a damn fine way of finding bugs. No exactly what OP is after at the moment, but something that is at the forefront of my mind because I'm in the middle of writing a set of tests (in Perl :) for some XML processing C++ code and turning up a pile of bugs as I go.
However it does suggest another test avenue: write test harnesses for modules so that you can exercise them in isolation and better understand how they work. If the test harness ends up part of a regression test system so much the better.
DWIM is Perl's answer to Gödel
![]() |
![]() |
![]() |
Jul 12, 2006 | perlmonks.com
converter has asked for the wisdom of the Perl Monks concerning the following question:
For the past several months I've been busy rewriting the horrible Perl code left behind by my predecessor. His approach to development was "Write some code. If the code runs without revealing any of the damage it's done, ship it. If not, write some more code."
This code is so bad that when co-workers ask me what I'm working on, I tell them "The Madman's Diary." Yes, it would have been cheaper and faster to throw this code away and start over, but I wasn't given that option.
My latest assignment is the repair of a tangled mess of a show-stopper that was discovered in a product that was supposed to ship today. After adding an open() override that logs the arguments to open() and some quality time with the watch(1) utility observing changes to the files containing the data that are causing the problem, I've narrowed the list of suspects down to a couple in-house scripts and a few (probably altered) webmin modules.
Now that I know where to look, I'd like to identify as quickly as possible which details can be safely ignored. I plan to use Devel::DProf to produce an execution graph for reference and Tie::Watch to watch variables, but I wonder if there are other tools that I should look at. A utility or module that would allow me to incrementally build a profile with persistent notes would be wonderful.
Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.
eyepopslikeamosquito (Chancellor) on Jul 12, 2006 at 08:30 UTC
Re: Strategies for maintenance of horrible code?Yes, it would have been cheaper and faster to throw this code away and start overMaybe. For another point of view, see Joel Spolsky on not rewriting from scratch .I agree with adrianh . If a component is not broken, don't rewrite it. Rewrite a component when you find a number of bugs in it. But first write a regression test suite for the component. I've seen many folks over the years throw out old code, rewrite it ... and introduce a heap of new bugs in the process. If you come into a new company and introduce a swag of new bugs in previously working code, you will start to smell very badly.
See also:
- Swallowing an elephant in 10 easy steps
- Dealing with sloppy code
- Becoming familiar with a too-big codebase?
- Analyzing large Perl code base.
- What is the best way to add tests to existing code?
- Perl Medic: Transforming Legacy Code by Peter J. Scott
- Working Effectively with Legacy Code by Michael Feathers
GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC
Re^2: Strategies for maintenance of horrible code?Actually, just writing the tests is often a damn fine way of finding bugs. No exactly what OP is after at the moment, but something that is at the forefront of my mind because I'm in the middle of writing a set of tests (in Perl :) for some XML processing C++ code and turning up a pile of bugs as I go.
However it does suggest another test avenue: write test harnesses for modules so that you can exercise them in isolation and better understand how they work. If the test harness ends up part of a regression test system so much the better.
DWIM is Perl's answer to Gödeltinita (Parson) on Jul 12, 2006 at 12:28 UTC
Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch.uh oh. why does this remind me of perl 6? =)adrianh (Chancellor) on Jul 12, 2006 at 07:49 UTC
Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.I'd recommend reading Perl Medic and Working Effectively with Legacy Code (the latter isn't Perl specific - but is chock full of useful advice).
I would not spend any time fixing the code if it's not breaking (assuming you're not being paid to review/fix the code). However evil it may be - if it's doing it's job leave it alone.
Instead - every time you need to fix a bug or add some new functionality just test/refactor the bits of the evil code that are touched by the changes. I've found incrementally adding tests and refactoring to be much more effective than any sort of "big bang" fixing things for the sake of them approach :-)
If you are being paid to do a review/fix then Perl::Critic might give you some useful places to look.
webfiend (Vicar) on Jul 14, 2006 at 21:24 UTC
Definitely agree about the approach of sorting things out with gradual refactoring and tests as the need arises. The problem with the "Big Bang" approach is that you have the potential for a very long stretch of time where there are two forks of the code: ugly shipping code that will need to be fixed and refactored as bugs are reported, and pretty nonfunctioning code that will need to incorporate those fixes as they are uncovered, resulting in a perpetual loop of "it's not quite ready yet."Ovid (Cardinal) on Jul 12, 2006 at 10:37 UTC
Check out Suggestions for working with poor code and some of the replies.
Cheers,
OvidNew address of my CGI Course .
GrandFather (Sage) on Jul 12, 2006 at 07:39 UTC
Re: Strategies for maintenance of horrible code?What tools are you using already and on what platform? For a large range of "detail" debugging there is nothing like as good as an IDE with a good integrated debugger. For a higher level view of where things are going Devel::TraceCalls may be handy, although it's output can be rather voluminous.
DWIM is Perl's answer to GödelMoron (Curate) on Jul 12, 2006 at 12:13 UTC
Some basic CYA I can see:
1) Ensure there is sufficient functional and technical design documentation against which the routines can be tested.
2) (updated) Make sure there is a sufficiently detailed project plan to include tasks for: systems analysis, functional and technical design, test planning, test script writing (e.g. using Expect ), developing, unit-, integrated and functional testing, rework and implementation, to include a GANTT chart of the work done so far and by who to what % of completion, to avoid getting the blame for not meeting poorly conceived targets over which you had no control.
In response to formal testing against the plan, I find it a useful aid to bug-fixing to monitor execution with perl -d, setting breakpoints and examining variables to hunt down which line of code causes each failure.
-MFree your mind
aufflick (Deacon) on Jul 13, 2006 at 00:17 UTC
You might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.
The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also the UML::Sequence sounds intriguing - it pupports to generate a sequence diagram by monitoring code execution.
![]() |
![]() |
![]() |
Dec 03, 2017 | perldoc.perl.org
Tie::File
NAME
- NAME
- SYNOPSIS
- DESCRIPTION
- Public Methods
- Tying to an already-opened filehandle
- Deferred Writing
- CONCURRENT ACCESS TO FILES
- CAVEATS
- SUBCLASSING
- WHAT ABOUT DB_File?
- AUTHOR
- LICENSE
- WARRANTY
- THANKS
- TODO
Tie::File - Access the lines of a disk file via a Perl array
SYNOPSISDESCRIPTION
- # This file documents Tie::File version 0.98
- use Tie::File
- tie @array 'Tie::File' filename or die ...
- $array 13 ] = 'blah' # line 13 of the file is now 'blah'
- print $array 42 # display line 42 of the file
- $n_recs = @array # how many records are in the file?
- $#array -= # chop two records off the end
- for @array
- s/PERL/Perl/g # Replace PERL with Perl everywhere in the file
- # These are just like regular push, pop, unshift, shift, and splice
- # Except that they modify the file in the way you would expect
- push @array new recs ...
- my $r1 = pop @array
- unshift @array new recs ...
- my $r2 = shift @array
- @old_recs = splice @array new recs ...
- untie @array # all finished
Tie::File
represents a regular text file as a Perl array. Each element in the array corresponds to a record in the file. The first line of the file is element 0 of the array; the second line is element 1, and so on.The file is not loaded into memory, so this will work even for gigantic files.
Changes to the array are reflected in the file immediately.
Lazy people and beginners may now stop reading the manual.
recsep
What is a 'record'? By default, the meaning is the same as for the
<...>
operator: It's a string terminated by$/
, which is probably"\n"
. (Minor exception: on DOS and Win32 systems, a 'record' is a string terminated by"\r\n"
.) You may change the definition of "record" by supplying therecsep
option in thetie
call:
- tie @array 'Tie::File' $file recsep => 'es'
This says that records are delimited by the string
es
. If the file contained the following data:
- Curse these pesky flies !\
then the
@array
would appear to have four elements:
- "Curse th"
- "e p"
- "ky fli"
- "!\n"
An undefined value is not permitted as a record separator. Perl's special "paragraph mode" semantics (à la
$/ = ""
) are not emulated.Records read from the tied array do not have the record separator string on the end; this is to allow
- $array 17 ] .= "extra"
to work as expected.
(See autochomp , below.) Records stored into the array will have the record separator string appended before they are written to the file, if they don't have one already. For example, if the record separator string is
"\n"
, then the following two lines do exactly the same thing:
- $array 17 ] = "Cherry pie"
- $array 17 ] = "Cherry pie\n"
The result is that the contents of line 17 of the file will be replaced with "Cherry pie"; a newline character will separate line 17 from line 18. This means that this code will do nothing:
- chomp $array 17
Because the
chomp
ed value will have the separator reattached when it is written back to the file. There is no way to create a file whose trailing record separator string is missing.Inserting records that contain the record separator string is not supported by this module. It will probably produce a reasonable result, but what this result will be may change in a future version. Use 'splice' to insert records or to replace one record with several.
autochomp
Normally, array elements have the record separator removed, so that if the file contains the text
- Gold
- Frankincense
- Myrrh
the tied array will appear to contain
"Gold" "Frankincense" "Myrrh"
. If you setautochomp
to a false value, the record separator will not be removed. If the file above was tied with
- tie @gifts "Tie::File" $gifts autochomp =>
then the array
@gifts
would appear to contain"Gold\n" "Frankincense\n" "Myrrh\n"
, or (on Win32 systems)"Gold\r\n" "Frankincense\r\n" "Myrrh\r\n"
.mode
Normally, the specified file will be opened for read and write access, and will be created if it does not exist. (That is, the flags
O_RDWR | O_CREAT
are supplied in theopen
call.) If you want to change this, you may supply alternative flags in themode
option. See Fcntl for a listing of available flags. For example:
- # open the file if it exists, but fail if it does not exist
- use Fcntl 'O_RDWR'
- tie @array 'Tie::File' $file mode => O_RDWR
- # create the file if it does not exist
- use Fcntl 'O_RDWR' 'O_CREAT'
- tie @array 'Tie::File' $file mode => O_RDWR | O_CREAT
- # open an existing file in read-only mode
- use Fcntl 'O_RDONLY'
- tie @array 'Tie::File' $file mode => O_RDONLY
Opening the data file in write-only or append mode is not supported.
memory
This is an upper limit on the amount of memory that
Tie::File
will consume at any time while managing the file. This is used for two things: managing the read cache and managing the deferred write buffer .Records read in from the file are cached, to avoid having to re-read them repeatedly. If you read the same record twice, the first time it will be stored in memory, and the second time it will be fetched from the read cache . The amount of data in the read cache will not exceed the value you specified for
memory
. IfTie::File
wants to cache a new record, but the read cache is full, it will make room by expiring the least-recently visited records from the read cache.The default memory limit is 2Mib. You can adjust the maximum read cache size by supplying the
memory
option. The argument is the desired cache size, in bytes.
- # I have a lot of memory, so use a large cache to speed up access
- tie @array 'Tie::File' $file memory => 20_000_000
Setting the memory limit to 0 will inhibit caching; records will be fetched from disk every time you examine them.
The
memory
value is not an absolute or exact limit on the memory used.Tie::File
objects contains some structures besides the read cache and the deferred write buffer, whose sizes are not charged againstmemory
.The cache itself consumes about 310 bytes per cached record, so if your file has many short records, you may want to decrease the cache memory limit, or else the cache overhead may exceed the size of the cached data.
dw_size
(This is an advanced feature. Skip this section on first reading.)
If you use deferred writing (See Deferred Writing , below) then data you write into the array will not be written directly to the file; instead, it will be saved in the deferred write buffer to be written out later. Data in the deferred write buffer is also charged against the memory limit you set with the
memory
option.You may set the
dw_size
option to limit the amount of data that can be saved in the deferred write buffer. This limit may not exceed the total memory limit. For example, if you setdw_size
to 1000 andmemory
to 2500, that means that no more than 1000 bytes of deferred writes will be saved up. The space available for the read cache will vary, but it will always be at least 1500 bytes (if the deferred write buffer is full) and it could grow as large as 2500 bytes (if the deferred write buffer is empty.)If you don't specify a
Option Formatdw_size
, it defaults to the entire memory limit.Public Methods
- mode
is a synonym formode
.- recsep
is a synonym forrecsep
.- memory
is a synonym formemory
. You get the idea.The
tie
call returns an object, say$o
. You may call
- $rec = $o->FETCH $n
- $o->STORE $n $rec
to fetch or store the record at line
$n
, respectively; similarly the other tied array methods. (See perltie for details.) You may also call the following methods on this object:flock
- $o->flock MODE
will lock the tied file.
MODE
has the same meaning as the second argument to the Perl built-inflock
function; for exampleLOCK_SH
orLOCK_EX | LOCK_NB
. (These constants are provided by theuse Fcntl ':flock'
declaration.)
MODE
is optional; the default isLOCK_EX
.
Tie::File
maintains an internal table of the byte offset of each record it has seen in the file.When you use
flock
to lock the file,Tie::File
assumes that the read cache is no longer trustworthy, because another process might have modified the file since the last time it was read. Therefore, a successful call toflock
discards the contents of the read cache and the internal record offset table.
Tie::File
promises that the following sequence of operations will be safe:In particular,
Tie::File
will not read or write the file during thetie
call. (Exception: Usingmode => O_TRUNC
will, of course, erase the file during thetie
call. If you want to do this safely, then open the file withoutO_TRUNC
, lock the file, and use@array = ()
.)The best way to unlock a file is to discard the object and untie the array. It is probably unsafe to unlock the file without also untying it, because if you do, changes may remain unwritten inside the object. That is why there is no shortcut for unlocking. If you really want to unlock the file prematurely, you know what to do; if you don't know what to do, then don't do it.
All the usual warnings about file locking apply here. In particular, note that file locking in Perl is advisory , which means that holding a lock will not prevent anyone else from reading, writing, or erasing the file; it only prevents them from getting another lock at the same time. Locks are analogous to green traffic lights: If you have a green light, that does not prevent the idiot coming the other way from plowing into you sideways; it merely guarantees to you that the idiot does not also have a green light at the same time.
autochomp
- my $old_value = $o->autochomp # disable autochomp option
- my $old_value = $o->autochomp # enable autochomp option
- my $ac = $o->autochomp () # recover current value
See autochomp , above.
defer
,flush
,discard
, andautodefer
See Deferred Writing , below.
offset
- $off = $o->offset $n
This method returns the byte offset of the start of the
Tying to an already-opened filehandle$n
th record in the file. If there is no such record, it returns an undefined value.If
$fh
is a filehandle, such as is returned byIO::File
or one of the otherIO
modules, you may use:
- tie @array 'Tie::File' $fh ...
Similarly if you opened that handle
FH
with regularopen
orsysopen
, you may use:
- tie @array 'Tie::File' \ *FH ...
Handles that were opened write-only won't work. Handles that were opened read-only will work as long as you don't try to modify the array. Handles must be attached to seekable sources of data---that means no pipes or sockets. If
Tie::File
can detect that you supplied a non-seekable handle, thetie
call will throw an exception. (On Unix systems, it can detect this.)Note that Tie::File will only close any filehandles that it opened internally. If you passed it a filehandle as above, you "own" the filehandle, and are responsible for closing it after you have untied the @array.
Deferred Writing(This is an advanced feature. Skip this section on first reading.)
Normally, modifying a
Tie::File
array writes to the underlying file immediately. Every assignment like$a ] = ...
rewrites as much of the file as is necessary; typically, everything from line 3 through the end will need to be rewritten. This is the simplest and most transparent behavior. Performance even for large files is reasonably good.However, under some circumstances, this behavior may be excessively slow. For example, suppose you have a million-record file, and you want to do:
- for @FILE
- $_ = "> $_"
The first time through the loop, you will rewrite the entire file, from line 0 through the end. The second time through the loop, you will rewrite the entire file from line 1 through the end. The third time through the loop, you will rewrite the entire file from line 2 to the end. And so on.
If the performance in such cases is unacceptable, you may defer the actual writing, and then have it done all at once. The following loop will perform much better for large files:
If
Tie::File
's memory limit is large enough, all the writing will done in memory. Then, when you call->flush
, the entire file will be rewritten in a single pass.(Actually, the preceding discussion is something of a fib. You don't need to enable deferred writing to get good performance for this common case, because
Tie::File
will do it for you automatically unless you specifically tell it not to. See autodeferring , below.)Calling
->flush
returns the array to immediate-write mode. If you wish to discard the deferred writes, you may call->discard
instead of->flush
. Note that in some cases, some of the data will have been written already, and it will be too late for->discard
to discard all the changes. Support for->discard
may be withdrawn in a future version ofTie::File
.Deferred writes are cached in memory up to the limit specified by the
dw_size
option (see above). If the deferred-write buffer is full and you try to write still more deferred data, the buffer will be flushed. All buffered data will be written immediately, the buffer will be emptied, and the now-empty space will be used for future deferred writes.If the deferred-write buffer isn't yet full, but the total size of the buffer and the read cache would exceed the
memory
limit, the oldest records will be expired from the read cache until the total size is under the limit.
push
,pop
,shift
,unshift
, andsplice
cannot be deferred. When you perform one of these operations, any deferred data is written to the file and the operation is performed immediately. This may change in a future version.If you resize the array with deferred writing enabled, the file will be resized immediately, but deferred records will not be written. This has a surprising consequence:
Autodeferring@a = ...
erases the file immediately, but the writing of the actual data is deferred. This might be a bug. If it is a bug, it will be fixed in a future version.
Tie::File
tries to guess when deferred writing might be helpful, and to turn it on and off automatically.
- for @a
- $_ = "> $_"
In this example, only the first two assignments will be done immediately; after this, all the changes to the file will be deferred up to the user-specified memory limit.
You should usually be able to ignore this and just use the module without thinking about deferring. However, special applications may require fine control over which writes are deferred, or may require that all writes be immediate. To disable the autodeferment feature, use
- tied @o ->autodefer
or
- tie @array 'Tie::File' $file autodefer =>
Similarly,
CONCURRENT ACCESS TO FILES->autodefer
re-enables autodeferment, and->autodefer ()
recovers the current value of the autodefer setting.Caching and deferred writing are inappropriate if you want the same file to be accessed simultaneously from more than one process. Other optimizations performed internally by this module are also incompatible with concurrent access. A future version of this module will support a
concurrent =>
option that enables safe concurrent access.Previous versions of this documentation suggested using
CAVEATSmemory =>
for safe concurrent access. This was mistaken. Tie::File will not support safe concurrent access before version 0.96.(That's Latin for 'warnings'.)
SUBCLASSING
- Reasonable effort was made to make this module efficient. Nevertheless, changing the size of a record in the middle of a large file will always be fairly slow, because everything after the new record must be moved.
- The behavior of tied arrays is not precisely the same as for regular arrays. For example:
undef
-ing aTie::File
array element just blanks out the corresponding record in the file. When you read it back again, you'll get the empty string, so the supposedly-undef
'ed value will be defined. Similarly, if you haveautochomp
disabled, then
- # This DOES print "How unusual!" if 'autochomp' is disabled
- undef $a 10
- print "How unusual!\n" if $a 10
Because when
autochomp
is disabled,$a 10
will read back as"\n"
(or whatever the record separator string is.)There are other minor differences, particularly regarding
exists
anddelete
, but in general, the correspondence is extremely close.- I have supposed that since this module is concerned with file I/O, almost all normal use of it will be heavily I/O bound. This means that the time to maintain complicated data structures inside the module will be dominated by the time to actually perform the I/O. When there was an opportunity to spend CPU time to avoid doing I/O, I usually tried to take it.
- You might be tempted to think that deferred writing is like transactions, with
flush
ascommit
anddiscard
asrollback
, but it isn't, so don't.- There is a large memory overhead for each record offset and for each cache entry: about 310 bytes per cached data record, and about 21 bytes per offset table entry.
The per-record overhead will limit the maximum number of records you can access per file. Note that accessing the length of the array via
$x = scalar @tied_file
accesses all records and stores their offsets. The same forforeach @tied_file
, even if you exit the loop early.This version promises absolutely nothing about the internals, which may change without notice. A future version of the module will have a well-defined and stable subclassing API.
WHAT ABOUTDB_File
?People sometimes point out that DB_File will do something similar, and ask why
Tie::File
module is necessary.There are a number of reasons that you might prefer
AUTHORTie::File
. A list is available athttp://perl.plover.com/TieFile/why-not-DB_File
.Mark Jason Dominus
To contact the author, send email to:
mjd perl tiefile @plover com
To receive an announcement whenever a new version of this module is released, send a blank email message to
mjd perl tiefile subscribe @plover com
.The most recent version of this module, including documentation and any news of importance, will be available at
LICENSE
- http://perl.plover.com/TieFile/
Tie::File
version 0.96 is copyright (C) 2003 Mark Jason Dominus.This library is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
These terms are your choice of any of (1) the Perl Artistic Licence, or (2) version 2 of the GNU General Public License as published by the Free Software Foundation, or (3) any later version of the GNU General Public License.
This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this library program; it should be in the file
COPYING
. If not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USAFor licensing inquiries, contact the author at:
WARRANTY
- Mark Jason Dominus
- 255 S. Warnock St.
- Philadelphia, PA 19107
THANKS
Tie::File
version 0.98 comes with ABSOLUTELY NO WARRANTY. For details, see the license.
![]() |
![]() |
![]() |
Dec 01, 2017 | stackoverflow.com
AnonGeek ,Jun 20, 2012 at 20:37
I am trying to debug few regular expressions using:perl -Mre=debug file.plThe file.pl script has many regular expression. Some of them are repeated. Using above syntax, all the regex in file.pl are being debugged.
Is there a way to tell Perl to debug only a particular regex in a script?
I am familiar with YAPE::Regex module, but that is not what I require. So please don't suggest to use that.
Ehtesh Choudhury ,Jun 20, 2012 at 20:45
Why not just comment out the other regexes, or run just the particular regex on the command line, viaperl -e
? – Ehtesh Choudhury Jun 20 '12 at 20:45AnonGeek ,Jun 20, 2012 at 20:55
the script is very compilcated(12000 LOC). If I will comment out any of regex then the execution will fail..Also if I provide dummy values, then it will give unexpected results :( – AnonGeek Jun 20 '12 at 20:55Oleg V. Volkov ,Jun 20, 2012 at 20:41
As with many other pragmas, you can useno
to cancel previoususe
.use re 'debug'; $str=~/\d{3}/; no re 'debug'; $str=~/\d{3}/;Denis Ibaev ,Jun 20, 2012 at 20:48
As of 5.9.5 the directive use re 'debug' and its equivalents are lexically scoped, as the other directives are.
Use:
{ use re 'debug'; # Debugged regexp here. }AnonGeek ,Jun 20, 2012 at 21:10
Is this also supported in 5.8.8? I am putting it under a condition but it is enabling it globally for all regex. – AnonGeek Jun 20 '12 at 21:10Denis Ibaev ,Jun 21, 2012 at 5:24
No, since version 5.9.5. In 5.8.8 you need useno
statement. – Denis Ibaev Jun 21 '12 at 5:24
![]() |
![]() |
![]() |
Nov 30, 2017 | stackoverflow.com
I have this problem: I need to control the perl-debugger from an external script. By research I found out about various solutions, but I don't understand them. I failed to properly set up the RemotePort option (editing ".perldb"), which was the first I tried, and found no useful information on providing a filehandle from which the debugger would get its input (by somehow setting @cmdfhs) I found both options over here: http://search.cpan.org/~nwclark/perl-5.8.6/lib/perl5db.plIt would be nice if you could tell me how to provide the filehandle from which the debugger gets its input, or if you know a link where this is explained?
Casper ,Jun 28, 2015 at 21:53
Here's a simple example setting it up usingRemotePort
, which seemed easier to me:The trick to using
RemotePort
is that you have to have someone listening on the remote end BEFORE you launch the script to be debugged.As soon as you launch your script with
-d
Perl will attempt to connect toRemotePort
. So you have to make sure the initial connection succeeds by having someone listening there beforehand.Here I assume some Linux/Unix variant, which has the
netcat
utility installed. We usenetcat
to wait for incoming connections in this example, but you can use anything else you wish too which is able to create a service port and shuffle data between that and the current TTY:In terminal 1
# Use netcat to listen for incoming connections on port 9999 > nc -l -p 9999In terminal 2
# Start perl with -d and request a RemotePort connection > PERLDB_OPTS=RemotePort=127.0.0.1:9999 perl -d my_script.plAs soon as you do that in terminal 1 you will see something like this:
Loading DB routines from perl5db.pl version 1.39_10 Editor support available. Enter h or 'h h' for help, or 'man perldebug' for more help. main::(my_script.pl:4): DB<1>There you go..debug away.
Devel::Trepan is a gdb-like debugger. Although it has remote control, you can also run it at the outset with the option--command
which will "source" (in the gdb-sense) or run a series of debugger commands.To go into remote control, either start the debugger using the
--server
option or inside the debugger use the " server " command once inside the debugger.See Options for a list of options you can give at the outset.
![]() |
![]() |
![]() |
Nov 30, 2017 | stackoverflow.com
cbg ,Jul 7, 2014 at 11:36
my @char_array = split "", $s1;
@char_array
now contains all the characters of the$s1
string and it's possible to manipulate it, iterate over it or do whatever to it just like with any other array.You can yousplice
to insert elements at a given position of the array:echo -e 'hello\ndisk\ncaller' | perl -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," "); foreach(@F){print}' he l lo di s k ca l lerYou can use Data::Dumper for better visualization when working with arrays:
echo -n 'hello' | perl -MData::Dumper -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," ");print Dumper(\@F)' $VAR1 = [ 'h', 'e', ' ', 'l', ' ', 'l', 'o' ];
![]() |
![]() |
![]() |
Nov 30, 2017 | stackoverflow.com
Wakan Tanka ,Mar 20, 2015 at 13:17
I've come across following materials:
- Mastering Perl by brian d foy , chapter: Debugging Regular Expressions.
- Debugging regular expressions which mentions
re::debug
module for perlI've also try to use various another techniques:
but still did not get the point how to read their output. I've also found another modules used for debugging regular expressions here but I did not tried them yet, can you please explain how to read output of
use re 'debug'
or another command used for debugging regular expressions in perl?EDIT in reply to Borodin:
1st example:
perl -Mre=debug -e' "foobar"=~/(.)\1/' Compiling REx "(.)\1" Final program: 1: OPEN1 (3) 3: REG_ANY (4) 4: CLOSE1 (6) 6: REF1 (8) 8: END (0) minlen 1 Matching REx "(.)\1" against "foobar" 0 <> <foobar> | 1:OPEN1(3) 0 <> <foobar> | 3:REG_ANY(4) 1 <f> <oobar> | 4:CLOSE1(6) 1 <f> <oobar> | 6:REF1(8) failed... 1 <f> <oobar> | 1:OPEN1(3) 1 <f> <oobar> | 3:REG_ANY(4) 2 <fo> <obar> | 4:CLOSE1(6) 2 <fo> <obar> | 6:REF1(8) 3 <foo> <bar> | 8:END(0) Match successful! Freeing REx: "(.)\1"
- What does OPEN1, REG_ANY, CLOSE1 ... mean ?
- What numbers like 1 3 4 6 8 mean?
- What does number in braces OPEN1(3) mean?
- Which output should I look at, Compiling REx or Matching REx?
2nd example:
perl -Mre=debugcolor -e' "foobar"=~/(.*)\1/' Compiling REx "(.*)\1" Final program: 1: OPEN1 (3) 3: STAR (5) 4: REG_ANY (0) 5: CLOSE1 (7) 7: REF1 (9) 9: END (0) minlen 0 Matching REx "(.*)\1" against "foobar" 0 <foobar>| 1:OPEN1(3) 0 <foobar>| 3:STAR(5) REG_ANY can match 6 times out of 2147483647... 6 <foobar>| 5: CLOSE1(7) 6 <foobar>| 7: REF1(9) failed... 5 <foobar>| 5: CLOSE1(7) 5 <foobar>| 7: REF1(9) failed... 4 <foobar>| 5: CLOSE1(7) 4 <foobar>| 7: REF1(9) failed... 3 <foobar>| 5: CLOSE1(7) 3 <foobar>| 7: REF1(9) failed... 2 <foobar>| 5: CLOSE1(7) 2 <foobar>| 7: REF1(9) failed... 1 <foobar>| 5: CLOSE1(7) 1 <foobar>| 7: REF1(9) failed... 0 <foobar>| 5: CLOSE1(7) 0 <foobar>| 7: REF1(9) 0 <foobar>| 9: END(0) Match successful! Freeing REx: "(.*)\1"
- Why are numbers descending 6 5 4 3 ... in this example?
- What does failed keyword mean?
Borodin ,Mar 20, 2015 at 14:45
Asking how to use a debugger is very broad. Can you show us the pattern that you are trying to debug, and explain what you don't understand? – Borodin Mar 20 '15 at 14:45ThisSuitIsBlackNot ,Mar 20, 2015 at 15:38
When you runperl -Mre=debug
, you're using there
module; you can see the documentation by runningperldoc re
. The section on "debug mode" is a bit sparse, but ends with "See 'Debugging regular expressions' in perldebug for additional info."perldoc perldebug
is similarly short on details, but ends with "These matters are explored in some detail in 'Debugging regular expressions' in perldebguts." And now we have your answer . – ThisSuitIsBlackNot Mar 20 '15 at 15:38axblount ,Mar 20, 2015 at 20:25
Regular expressions define finite state machines 1 . The debugger is more or less showing you how the state machine is progressing as the string is consumed character by character."Compiling REx" is the listing of instructions for that regular expression. The number in parenthesis after each instruction is where to go once the step succeeds. In
/(.*)\1/
:1: OPEN1 (3) 3: STAR (5) 4: REG_ANY (0) 5: CLOSE1 (7)
STAR (5)
means computeSTAR
and once you succeed, go to instruction 5CLOSE1
."Matching REx" is the step-by-step execution of those instructions. The number on the left is the total number of characters that have been consumed so far. This number can go down if the matcher has to go backwards because something it tried didn't work.
To understand these instructions, it's important to understand how regular expressions "work." Finite state machines are usually visualized as a kind of flow chart. I have produced a crude one below for
/(.)\1/
. Because of the back reference to a capture group, I don't believe this regex is a strict finite state machine. The chart is useful none the less.Match +-------+ Anything +----------+ | Start +------------------+ State 1 | +---^---+ +--+---+---+ | | | | | |Matched same +-------------------------+ | character matched different | character +----+------+ | Success | +-----------+We start on
Start
. It's easy to advance to the first state, we just consume any one character (REG_ANY
). The only other thing that could happen is end of input. I haven't drawn that here. TheREG_ANY
instruction is wrapped in the capture group instructions.OPEN1
starts recording all matched characters into the first capture group.CLOSE1
stops recording characters to the first capture group.Once we consume a character, we sit on
State 1
and consume the next char. If it matches the previous char we move to success!REF1
is the instruction that attempts to match capture group #1. Otherwise, we failed and need to move back to theStart
to try again. Whenever the matcher says "failed..." it's telling you that something didn't work, so it's returning to an earlier state (that may or may not include 'unconsuming' characters).The example with
*
is more complicated.*
(which corresponds toSTAR
) tries to match the given pattern zero or more times, and it is greedy . That means it tries to match as many characters as it possibly can. Starting at the beginning of the string, it says "I can match up to 6 characters!" So, it matches all 6 characters ("foobar"
), closes the capture group, and tries to match"foobar"
again. That doesn't work! It tries again with 5, that doesn't work. And so on, until it tries to matching zero characters. That means the capture group is empty, matching the empty string always succeeds. So the match succeeds with\1 = ""
.I realize I've spent more time explaining regular expressions than I have Perl's regex debugger. But I think its output will become much more clear once you understand how regexes operate.
Here is a finite state machine simulator . You can enter a regex and see it executed. Unfortunately, it doesn't support back references.
1: I believe some of Perl's regular expression features push it beyond this definition but it's still useful to think about them this way.
> ,
The debug Iinformation contains description of the bytecode. Numbers denote the node indices in the op tree. Numbers in round brackets tell the engine to jump to a specific node upon match. The EXACT operator tells the regex engine to look for a literal string. REG_ANY means the . symbol. PLUS means the +. Code 0 is for the 'end' node. OPEN1 is a '(' symbol. CLOSE1 means ')'. STAR is a '*'. When the matcher reaches the end node, it returns a success code back to Perl, indicating that the entire regex has matched.See more details at http://perldoc.perl.org/perldebguts.html#Debugging-Regular-Expressions and a more conceptual http://perl.plover.com/Rx/paper/
![]() |
![]() |
![]() |
Nov 30, 2017 | stackoverflow.com
cbg ,Jul 7, 2014 at 11:36
my @char_array = split "", $s1;
@char_array
now contains all the characters of the$s1
string and it's possible to manipulate it, iterate over it or do whatever to it just like with any other array.You can yousplice
to insert elements at a given position of the array:echo -e 'hello\ndisk\ncaller' | perl -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," "); foreach(@F){print}' he l lo di s k ca l lerYou can use Data::Dumper for better visualization when working with arrays:
echo -n 'hello' | perl -MData::Dumper -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," ");print Dumper(\@F)' $VAR1 = [ 'h', 'e', ' ', 'l', ' ', 'l', 'o' ];
![]() |
![]() |
![]() |
Nov 30, 2017 | stackoverflow.com
Wakan Tanka ,Mar 20, 2015 at 13:17
I've come across following materials:
- Mastering Perl by brian d foy , chapter: Debugging Regular Expressions.
- Debugging regular expressions which mentions
re::debug
module for perlI've also try to use various another techniques:
but still did not get the point how to read their output. I've also found another modules used for debugging regular expressions here but I did not tried them yet, can you please explain how to read output of
use re 'debug'
or another command used for debugging regular expressions in perl?EDIT in reply to Borodin:
1st example:
perl -Mre=debug -e' "foobar"=~/(.)\1/' Compiling REx "(.)\1" Final program: 1: OPEN1 (3) 3: REG_ANY (4) 4: CLOSE1 (6) 6: REF1 (8) 8: END (0) minlen 1 Matching REx "(.)\1" against "foobar" 0 <> <foobar> | 1:OPEN1(3) 0 <> <foobar> | 3:REG_ANY(4) 1 <f> <oobar> | 4:CLOSE1(6) 1 <f> <oobar> | 6:REF1(8) failed... 1 <f> <oobar> | 1:OPEN1(3) 1 <f> <oobar> | 3:REG_ANY(4) 2 <fo> <obar> | 4:CLOSE1(6) 2 <fo> <obar> | 6:REF1(8) 3 <foo> <bar> | 8:END(0) Match successful! Freeing REx: "(.)\1"
- What does OPEN1, REG_ANY, CLOSE1 ... mean ?
- What numbers like 1 3 4 6 8 mean?
- What does number in braces OPEN1(3) mean?
- Which output should I look at, Compiling REx or Matching REx?
2nd example:
perl -Mre=debugcolor -e' "foobar"=~/(.*)\1/' Compiling REx "(.*)\1" Final program: 1: OPEN1 (3) 3: STAR (5) 4: REG_ANY (0) 5: CLOSE1 (7) 7: REF1 (9) 9: END (0) minlen 0 Matching REx "(.*)\1" against "foobar" 0 <foobar>| 1:OPEN1(3) 0 <foobar>| 3:STAR(5) REG_ANY can match 6 times out of 2147483647... 6 <foobar>| 5: CLOSE1(7) 6 <foobar>| 7: REF1(9) failed... 5 <foobar>| 5: CLOSE1(7) 5 <foobar>| 7: REF1(9) failed... 4 <foobar>| 5: CLOSE1(7) 4 <foobar>| 7: REF1(9) failed... 3 <foobar>| 5: CLOSE1(7) 3 <foobar>| 7: REF1(9) failed... 2 <foobar>| 5: CLOSE1(7) 2 <foobar>| 7: REF1(9) failed... 1 <foobar>| 5: CLOSE1(7) 1 <foobar>| 7: REF1(9) failed... 0 <foobar>| 5: CLOSE1(7) 0 <foobar>| 7: REF1(9) 0 <foobar>| 9: END(0) Match successful! Freeing REx: "(.*)\1"
- Why are numbers descending 6 5 4 3 ... in this example?
- What does failed keyword mean?
Borodin ,Mar 20, 2015 at 14:45
Asking how to use a debugger is very broad. Can you show us the pattern that you are trying to debug, and explain what you don't understand? – Borodin Mar 20 '15 at 14:45ThisSuitIsBlackNot ,Mar 20, 2015 at 15:38
When you runperl -Mre=debug
, you're using there
module; you can see the documentation by runningperldoc re
. The section on "debug mode" is a bit sparse, but ends with "See 'Debugging regular expressions' in perldebug for additional info."perldoc perldebug
is similarly short on details, but ends with "These matters are explored in some detail in 'Debugging regular expressions' in perldebguts." And now we have your answer . – ThisSuitIsBlackNot Mar 20 '15 at 15:38axblount ,Mar 20, 2015 at 20:25
Regular expressions define finite state machines 1 . The debugger is more or less showing you how the state machine is progressing as the string is consumed character by character."Compiling REx" is the listing of instructions for that regular expression. The number in parenthesis after each instruction is where to go once the step succeeds. In
/(.*)\1/
:1: OPEN1 (3) 3: STAR (5) 4: REG_ANY (0) 5: CLOSE1 (7)
STAR (5)
means computeSTAR
and once you succeed, go to instruction 5CLOSE1
."Matching REx" is the step-by-step execution of those instructions. The number on the left is the total number of characters that have been consumed so far. This number can go down if the matcher has to go backwards because something it tried didn't work.
To understand these instructions, it's important to understand how regular expressions "work." Finite state machines are usually visualized as a kind of flow chart. I have produced a crude one below for
/(.)\1/
. Because of the back reference to a capture group, I don't believe this regex is a strict finite state machine. The chart is useful none the less.Match +-------+ Anything +----------+ | Start +------------------+ State 1 | +---^---+ +--+---+---+ | | | | | |Matched same +-------------------------+ | character matched different | character +----+------+ | Success | +-----------+We start on
Start
. It's easy to advance to the first state, we just consume any one character (REG_ANY
). The only other thing that could happen is end of input. I haven't drawn that here. TheREG_ANY
instruction is wrapped in the capture group instructions.OPEN1
starts recording all matched characters into the first capture group.CLOSE1
stops recording characters to the first capture group.Once we consume a character, we sit on
State 1
and consume the next char. If it matches the previous char we move to success!REF1
is the instruction that attempts to match capture group #1. Otherwise, we failed and need to move back to theStart
to try again. Whenever the matcher says "failed..." it's telling you that something didn't work, so it's returning to an earlier state (that may or may not include 'unconsuming' characters).The example with
*
is more complicated.*
(which corresponds toSTAR
) tries to match the given pattern zero or more times, and it is greedy . That means it tries to match as many characters as it possibly can. Starting at the beginning of the string, it says "I can match up to 6 characters!" So, it matches all 6 characters ("foobar"
), closes the capture group, and tries to match"foobar"
again. That doesn't work! It tries again with 5, that doesn't work. And so on, until it tries to matching zero characters. That means the capture group is empty, matching the empty string always succeeds. So the match succeeds with\1 = ""
.I realize I've spent more time explaining regular expressions than I have Perl's regex debugger. But I think its output will become much more clear once you understand how regexes operate.
Here is a finite state machine simulator . You can enter a regex and see it executed. Unfortunately, it doesn't support back references.
1: I believe some of Perl's regular expression features push it beyond this definition but it's still useful to think about them this way.
> ,
The debug Iinformation contains description of the bytecode. Numbers denote the node indices in the op tree. Numbers in round brackets tell the engine to jump to a specific node upon match. The EXACT operator tells the regex engine to look for a literal string. REG_ANY means the . symbol. PLUS means the +. Code 0 is for the 'end' node. OPEN1 is a '(' symbol. CLOSE1 means ')'. STAR is a '*'. When the matcher reaches the end node, it returns a success code back to Perl, indicating that the entire regex has matched.See more details at http://perldoc.perl.org/perldebguts.html#Debugging-Regular-Expressions and a more conceptual http://perl.plover.com/Rx/paper/
![]() |
![]() |
![]() |
Nov 29, 2017 | stackoverflow.com
Alex ,Jun 21, 2009 at 12:45
How can I check that a variable has a specific value in Perl? Is there a command to stop a script's execution to look up some of it's variables?I wonder if I can use the Pythonic practice of inserting:
assert 0, (foo, bar)to debug scripts in a debuger-less way?
Telemachus ,Jun 21, 2009 at 12:58
A quick CPAN search suggests Carp::Assert .Sinan Ünür ,Jun 21, 2009 at 13:30
+1 for typing in more characters faster than I did. I am going to edit the URL to be version agnostic though. – Sinan Ünür Jun 21 '09 at 13:30Telemachus ,Jun 21, 2009 at 13:50
I was going to comment on your answer about the photo finish. As for the URL, I constantly forget that, so thanks. – Telemachus Jun 21 '09 at 13:50Sinan Ünür ,Jun 21, 2009 at 12:58
See Carp::Assert .zoul ,Jun 21, 2009 at 13:44
Smart::Comments are nice.RET ,Jun 22, 2009 at 3:28
Smart::Comments++ When used with the -ENV switch, it's a fantastic tool for this sort of thing. Much better than having to strip all the tests out before going to production, as someone else suggested.nik ,Jun 21, 2009 at 12:54
There is a script at PerlMonks that introduces a fast assert method.Speed is important since Perl is interpreted and any inline checks will impact performance (unlike simple C macros for example)
I am not sure if these things are going to be directly usable.
- there is Test::Harness in default installs. Here is a starter tutorial . The more recent module is TAP::Harness
- A slower version along the lines you talk is Sub::Assert
Ok! This is what i was looking for -- PDF Warning: Test-Tutorial.pdf . The
Test::Harness
is used for writing Perl module tests.Ape-inago ,Jun 21, 2009 at 13:51
$var_to_check =~ /sometest/ or die "bad variable!";I tend to throw things like this in my code, and later use a find and replace to get rid of them (in production code).
Also, ' eval ' can be used to run a section of code and capture errors and can be used to create exception handling functionality. If you are asserting that a value is not 0, perhaps you want to throw an exception and handle that case in a special way?
> ,
if ( $next_sunrise_time > 24*60*60 ) { warn( "assertion failed" ); } # Assert that the sun must rise in the next 24 hours.You can do this if you do not have access to Perl 5.9 which is required for Carp::Assert .
![]() |
![]() |
![]() |
Nov 29, 2017 | stackoverflow.com
Alex ,Jun 21, 2009 at 12:45
How can I check that a variable has a specific value in Perl? Is there a command to stop a script's execution to look up some of it's variables?I wonder if I can use the Pythonic practice of inserting:
assert 0, (foo, bar)to debug scripts in a debuger-less way?
Telemachus ,Jun 21, 2009 at 12:58
A quick CPAN search suggests Carp::Assert .Sinan Ünür ,Jun 21, 2009 at 13:30
+1 for typing in more characters faster than I did. I am going to edit the URL to be version agnostic though. – Sinan Ünür Jun 21 '09 at 13:30Telemachus ,Jun 21, 2009 at 13:50
I was going to comment on your answer about the photo finish. As for the URL, I constantly forget that, so thanks. – Telemachus Jun 21 '09 at 13:50Sinan Ünür ,Jun 21, 2009 at 12:58
See Carp::Assert .zoul ,Jun 21, 2009 at 13:44
Smart::Comments are nice.RET ,Jun 22, 2009 at 3:28
Smart::Comments++ When used with the -ENV switch, it's a fantastic tool for this sort of thing. Much better than having to strip all the tests out before going to production, as someone else suggested.nik ,Jun 21, 2009 at 12:54
There is a script at PerlMonks that introduces a fast assert method.Speed is important since Perl is interpreted and any inline checks will impact performance (unlike simple C macros for example)
I am not sure if these things are going to be directly usable.
- there is Test::Harness in default installs. Here is a starter tutorial . The more recent module is TAP::Harness
- A slower version along the lines you talk is Sub::Assert
Ok! This is what i was looking for -- PDF Warning: Test-Tutorial.pdf . The
Test::Harness
is used for writing Perl module tests.Ape-inago ,Jun 21, 2009 at 13:51
$var_to_check =~ /sometest/ or die "bad variable!";I tend to throw things like this in my code, and later use a find and replace to get rid of them (in production code).
Also, ' eval ' can be used to run a section of code and capture errors and can be used to create exception handling functionality. If you are asserting that a value is not 0, perhaps you want to throw an exception and handle that case in a special way?
> ,
if ( $next_sunrise_time > 24*60*60 ) { warn( "assertion failed" ); } # Assert that the sun must rise in the next 24 hours.You can do this if you do not have access to Perl 5.9 which is required for Carp::Assert .
![]() |
![]() |
![]() |
Aug 06, 2001 | perlmonks.com
So you find the Perl docs on modules a bit confusing? OK here is the world's simplest Perl module demonstrating all the salient features of Exporter and a script that uses this module. We also give a short rundown on @INC and finish with a note on using warnings and modules. Here is the module code. MyModule.pm package MyModule; use strict; use Exporter; use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS); $VERSION = 1.00; @ISA = qw(Exporter); @EXPORT = (); @EXPORT_OK = qw(func1 func2); %EXPORT_TAGS = ( DEFAULT => [qw(&func1)], Both => [qw(&func1 &func2)]); sub func1 { return reverse @_ } sub func2 { return map{ uc }@_ } 1; [download]First we get a namespace by declaring a package name. This helps ensure our module's functions and variables remain separate from any script that uses it.
Use strict is a very good idea for modules to restrict the use of global variables. See use strict warnings and diagnostics or die for more details.
We need to use the Exporter module to export our functions from the MyModule:: namespace into the main:: namespace to make them available to scripts that 'use' MyModule.
We pacify strict with the use vars declaration of some variables. We can use an 'our' declaration in 5.6+
We now set a $VERSION number and make Exporter part of MyModule using the @ISA. See perlboot for all the gory details on what @ISA is or just use it as shown.
@EXPORT contains a list of functions that we export by default, in this case nothing. Generally the less you export by default using @EXPORT the better. This avoids accidentally clashing with functions defined in the script using the module. If a script wants a function let it ask.
@EXPORT_OK contains a list of functions that we export on demand so we export &func1 &func2 only if specifically requested to. Use this in preference to just blindly exporting functions via @EXPORT. You can also export variables like $CONFIG provided they are globals not lexicals scoped with my (read declare them with our or use vars).
%EXPORT_TAGS. For convenience we define two sets of export tags. The ':DEFAULT' tag exports only &func1; the ':Both' tag exports both &func1 &func2. This hash stores labels pointing to array references. In this case the arrays are anonymous.
We need the 1; at the end because when a module loads Perl checks to see that the module returns a true value to ensure it loaded OK. You could put any true value at the end (see Code::Police ) but 1 is the convention.
MyScript.pl (A simple script to use MyModule) #!/usr/bin/perl -w use strict; # you may need to set @INC here (see below) my @list = qw (J u s t ~ A n o t h e r ~ P e r l ~ H a c k e r !); # case 1 # use MyModule; # print func1(@list),"\n"; # print func2(@list),"\n"; # case 2 # use MyModule qw(&func1); # print func1(@list),"\n"; # print MyModule::func2(@list),"\n"; # case 3 # use MyModule qw(:DEFAULT); # print func1(@list),"\n"; # print func2(@list),"\n"; # case 4 # use MyModule qw(:Both); # print func1(@list),"\n"; # print func2(@list),"\n"; [download]We use MyModule in MyScript.pl as shown. Uncomment the examples to see what happens. Just uncomment one at a time.
Case 1: Because our module exports nothing by default we get errors as &funct1 and &funct2 have not been exported thus do not exist in the main:: namespace of the script.
Case 2: This works OK. We ask our module to export the &func1 so we can use it. Although &func2 was not exported we reference it with its full package name so this works OK.
Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT.
Case 4: We specified the export of both our functions with the ':Both' thus this works.
A note on @INCWhen you issue a use MyModule; directive perl searchs the @INC array for a module with the correct name. @INC usually contains:
/perl/lib /perl/site/lib .The . directory (dot dir) is the current working directory. CORE modules are installed under perl/lib whereas non-CORE modules install under perl/site/lib. You can add directories to the module search path in @INC like this:
BEGIN { push @INC, '/my/dir' } # or BEGIN { unshift @INC, '/my/dir' } # or use lib '/my/dir'; [download]We need to use a BEGIN block to shift values into @INC at compile time as this is when perl checks for modules. If you wait until the script is comiled it is too late and perl will throw an exception saying it can't find MyModule in @INC... The difference between pushing a value and unshifting a value into @INC is that perl searches the @INC array for the module starting with the first dir in that array. Thus is you have a MyModule in /perl/lib/ and another in /perl/site/lib/ and another in ./ the one in /perl/lib will be found first and thus the one used. The use lib pragma effectively does the same as the BEGIN { unshift @INC, $dir } block - see perlman:lib:lib for full specifics.
What use Foo::Bar meansuse Foo::Bar does not mean look for a module called "Foo::Bar.pm" in the @INC directories. It means search @INC for a *subdir* called "Foo" and a *module* called "Bar.pm".
Now once we have "use'd" a module its functions are available via the fully specified &PACKAGE::FUNCTION syntax. When we say &Foo::Bar::some_func we are refering to the *package name* not the (dir::)file name that we used in the use. This allows you to have many package names in one use'd file. In practice the names are usually the same.
use Warnings;You should test your module with warnings enabled as this will pick up many subtle (and not so subtle :-) errors. You can activate warnings using the -w flag in the script you use to test the module. If you add use warnings to the module then your module will require Perl 5.6+ as this was not available before then. If you put $^W++ at the top of the module then you will globally enable warnings - this may break *other modules* a script may be using in addition to your module so is rather antisocial. An expert coder here called tye tests with warnings but does not include them directly in his/her modules.
Hope this explains how it works.
cheers
tachyon
UpdateFixed a typo and added a few comments. Thanks to John M. Dlugosz . Rewrote and restyled tute for compatibility with versions of Perl < 5.6 thanks to crazyinsomniac . Also thanks to tye for reminding me that $^W++ is globally scoped and a bit antisocial for a module.
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
John M. Dlugosz (Monsignor) on Aug 06, 2001 at 04:30 UTC
Re: Simple Module TutorialVery nice, getting everything into a short page like that. But, I have a few comments:
Are you sure you want to make $VERSION a float, rather than a v-string? And if so, illustrate the three-digit convention (e.g. 5.005_001 for version 5.5.1).
I'm also shocked that your pm file doesn't use strict !
I would also suggest adding a comment to the 1; line, saying that this means "loaded OK".
-- John
tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTC
Re: Re: Simple Module Tutorial
by tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTCThanks John I've updated the text a bit in line with your suggestions. Forgot the strict in the module! Oops it is back in its rightful place right at the top. I just used the simple $VERSION numbering because this is a simple tute :-) Here is an excerpt from the Exporter manpage for those interested.
Module Version Checking The Exporter module will convert an attempt to import a number from a module into a call to $module_name->require_version($value). This can be used to validate that the version of the module being used is greater than or equal to the required version. The Exporter module supplies a default require_version method which checks the value of $VERSION in the exporting module. Since the default require_version method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09. [download]cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC
Re: Re: Re: Simple Module Tutorial
by John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC Yea, I just posted a tutorial on VERSION.For compatibility with mixing decimals and v-strings, the built-in UNIVERSAL::require_version uses three decimal digits per part.
If you have $MyModule::VERSION= 1.12; (a decimal number) and do a use MyModule 1.20.1 qw/bar/ , it will tell you that the module 1.120 and you asked for 1.020, so that's OK. You expected 1.20 to be greater than 1.12, not-OK.
-- John
tye (Sage) on Aug 06, 2001 at 22:09 UTC
(tye)Re: Simple Module Tutorial$W++ will only give you run-time warnings and will affect other packages. Personally, I don't turn on warnings in modules that I write but I do make a point of testing them with warnings turned on (by putting "#!/usr/bin/perl -w" at the top of my test scripts).
- tye (but my friends call me "Tye")tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTC
Re: (tye)Re: Simple Module Tutorial
by tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTCThanks I changed it from the lexically scoped 'use warnings;' so that this is applicable to versions < 5.6. but as it adds little value to the tutorial and has the unwanted side effects you point out I have just deleted it - saves a few lines of dubious value. I'll add a note on testing with warnings when I have a moment.
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
johnnywang (Priest) on Aug 09, 2004 at 22:58 UTC
Re: Simple Module TutorialQuite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered. Another already mentioned point is not to use @EXPORT too much. So my basic module is like the following (I have a emacs function to output this):
package MyModule; use strict; use Exporter qw(import); our $VERSION = 1.00; our @ISA = qw(Exporter); our @EXPORT_OK = qw(func1 func2); our %EXPORT_TAGS = ( DEFAULT => [qw(func1)], Both => [qw(func1 func2)]); sub func1 { return reverse @_ } sub func2 { return map{ uc }@_ } 1; [download] For those emacs users, here's the simple script to generate the skeleton: (defun perl-new-module () "Generate a skeleton source for a traditional perl module." (interactive) (setq var (split-string (read-from-minibuffer "Enter module name (eg. Web::Test): "nil nil nil nil nil nil) " ")) (setq name (car var)) (insert (format "package %s;\n\n" name)) (insert "use strict;\n\n") (insert "use Exporter qw(import);\n") (insert "our @ISA = qw(Exporter);\n") (insert "our @EXPORT_OK = qw();\n") (insert "our %EXPORT_TAGS = ();\n") (insert "our $VERSION = 1.00; \n\n") (insert "\n\n\n\n\n\n") (insert "1;") (insert "\n") (previous-line 6) (end-of-line) ) [download]adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTC
Re^2: Simple Module Tutorial
by adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTCQuite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered.Some people prefer to use our rather than use vars (I'm one of them) - but the latter is not deprecated. Both do slightly different things and many people still prefer to use vars .
beretboy (Chaplain) on Aug 18, 2001 at 15:07 UTC
Re: Simple Module TutorialExellent tutorial ++! I have never understood the writing of modules till now
"Sanity is the playground of the unimaginative" -Unknown
Codmate (Novice) on Sep 06, 2001 at 20:18 UTC
Re: Re: Simple Module TutorialFantastic stuff - I just converted a full script into a module (and added some bits) and it worked 1st time - WITH NO DEBUGGING REQUIRED (and yes - I am using strict)!!! I was very scared of modules before but now feel like I could write a hundred. Thanks very much for this - invaluable tutorial for a newbie like me :))))))
by Codmate (Novice) on Sep 06, 2001 at 20:18 UTCJaap (Curate) on Jul 22, 2002 at 09:35 UTC
Re: Simple Module TutorialThis is a nice tutorial tachyon. Are you considering writing a more advanced tutorial on modules (combined with OO)?
Especially, what a GOOD module looks like. Should we use carp, dynaloader and what not?
gawatkins (Monsignor) on Apr 10, 2003 at 11:44 UTC
Re: Simple Module TutorialGreat Tutorial, It helped to clear up the muddy water created by my Perl Black Book .
Thanks again,
Greg W.
twotone (Beadle) on Oct 14, 2007 at 05:06 UTC
Re: Simple Module TutorialGreat summary of module basics!
Here's a little code I came up with to add my module location to @INC (in a cgi environment) by dynamically determining the document root for the script. It works on the remote apache server and when testing locally in windows. It might be of some interest:
BEGIN { # get doc root from %ENV # implicitly declare file root path if %ENV not fount my $doc_root = $ENV{DOCUMENT_ROOT} || 'C:/Users/User/Documents/website/sites/mysite'; # change \ to / $doc_root =~ s/\\/\//g; # add module folder location $doc_root .= "/cgi-bin/cms/"; # add module location to @INC push(@INC,$doc_root); } [download]bychan (Initiate) on Jan 28, 2008 at 08:45 UTC
Re^2: Simple Module TutorialThis tutorial is great. The only problem is, that I get the following result, if I comment out all the cases:
by bychan (Initiate) on Jan 28, 2008 at 08:45 UTC!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!Shouldn't there be some error messages or warnings?
chexmix (Hermit) on Aug 12, 2008 at 13:24 UTC
Re: Simple Module TutorialI like this post very much, but the following is opaque to me for some reason:
" Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT."
I confess I still don't know what is going on here, and am wondering if someone can help me out.
The words "is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT" seem inaccurate to me, since when I look up at the code for MyModule.pm, I see the line
%EXPORT_TAGS = ( DEFAULT => qw(&func1) ,
and not the line
%EXPORT_TAGS = ( DEFAULT => \@EXPORT,
Is the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?
Thanks. I feel I am blanking on something obvious here, but just can't see it this rainy morning.
tye (Sage) on Aug 14, 2008 at 15:41 UTC
Re^2: Simple Module Tutorial (DEFAULT)
by tye (Sage) on Aug 14, 2008 at 15:41 UTCIs the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?Yes. Exporter.pm wants :DEFAULT to match @EXPORT so the module is incorrect in trying to define its own meaning for :DEFAULT. Based on what you've quoted, it appears that Exporter.pm forces this issue, but the more important point is that you shouldn't set $EXPORT_TAGS{DEFAULT} yourself.
- tye
sg (Pilgrim) on Feb 05, 2011 at 23:13 UTC
Re: Simple Module TutorialThanks for the exposition; my inclination regarding a simple module is as follows:
MyModule.pm package MyModule; use strict; use warnings; use diagnostics; use Carp; our $VERSION = 1.08; sub see_me { my $foo = shift; print "\t\tDo you see this: $foo?\n"; } 1; __END__ last line of the module needs to be true; last line of the _file_ need not be true: 0; [download]The above module is exercised by the following script:
exercise_my_module.pl
#!/c/opt/perl/bin/perl use strict; use warnings; use diagnostics; use Carp; use MyModule 1.05; #use MyModule 1.10; # will fail MyModule::see_me( 8 ); __END__ [download]chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTC
Re^2: Simple Module TutorialThank you for this post! It has gotten me past the first barrier of writing my own module. Thanks again! - chanslor
by chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTCAnonymous Monk on Mar 09, 2010 at 12:21 UTC
Re: Simple Module TutorialExcellent representation of what seemed a tough nut to swallow. Thank you very much. Tanuj Bhargava
Anonymous Monk on Mar 07, 2011 at 15:12 UTC
Re: Simple Module TutorialThanks to the writer for your trouble. But everyone seems to get it except me. I have tried to duplicate your results.
I have:
- MyScript.pl under /storage/username/PERL_SCRIPTS/dev
- Test.pm under /storage/username/local/perl/site/lib/Test/Test.pm (I just replaced MyModule.pm with Test.pm).
The module code is exactly the same. In MyScript.pl I have added
use lib '/storage/username/local/perl/site/lib'; <p>and typed in the first two cases.</p> <code>perl MyScript.pl [download] gives: Undefined subroutine &main::func1 called at MyScript.pl line 10Line 10 is:
print func1(@list),"\n";after typing "use Test;"
What am I missing here? Also, is the BEGIN command supposed to be used in the Perl script? It gives syntax errors when I try to use it.
Thanks in advance,
Gideon
toolic (Bishop) on Mar 07, 2011 at 15:25 UTC
Re^2: Simple Module Tutorial"Test" is a poor choice of a module name because there is a Core module of the same name ( Test ) which is part of the standard Perl distribution. Furthermore, since you placed your .pm file under a directory named "Test", you would need to type use Test::Test; . I strongly recommend you change the name of your module to something more unique in order to avoid this naming collision.
by toolic (Bishop) on Mar 07, 2011 at 15:25 UTCAnonymous Monk on Mar 07, 2011 at 16:11 UTC
Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:11 UTCDear toolic,
Thank you for your reply. I now have changed all instances of "Test" with "MyModule" and also changed the name of the module. MyModule.pm is now situated on
/storage/username/local/perl/site/liband I use
use lib '/storage/username/local/perl/site/lib'; with the second case (as per the example): # case 2 use MyModule; print func1(@list),"\n"; print MyModule::func2(@list),"\n"; [download]but I still get the same error: Undefined subroutine &main::func1 called at MyScript line 15.
Just to make sure I copied the module exactly from the example but to no avail. Interestingly, when I comment out the print func1 part, the line after that produces correct output. I hope that someone could point out to me where I am at fault.
Best regards,
Gideon
toolic (Bishop) on Mar 07, 2011 at 16:34 UTC
Re^4: Simple Module Tutorial
by toolic (Bishop) on Mar 07, 2011 at 16:34 UTCAnonymous Monk on Mar 07, 2011 at 16:58 UTC
Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:58 UTCHi toolic
For some reason I can't reply to your latest post but thanks a mil! I did copy the module exactly but not the script. Somehow I mixed up case 1 and case 2. I expected case 1 not to work but case 2 but instead of coding
use MyModule qw(&func1);I simply used
use MyModule;Thanks a lot for pointing it out, it seems to work now. I have learned quite a bit.
Best regards,
Gideon
Anonymous Monk on Nov 18, 2014 at 00:13 UTC
Re: Simple Module TutorialThank you for this topic, it is very useful for a beginner. However i had a trouble with use of Module.
Anonymous Monk on Nov 18, 2014 at 01:06 UTC
Re^2: Simple Module TutorialYou did not post any code
by Anonymous Monk on Nov 18, 2014 at 01:06 UTCAnonymous Monk on Dec 16, 2014 at 23:03 UTC
Re^3: Simple Module Tutorial
by Anonymous Monk on Dec 16, 2014 at 23:03 UTC Hello, I used exactly the same code than quoted at Re: Simple Module Tutorial by johnnywang on Aug 09, 2004 at 22:58 UTC I don't get trouble with the code, but it is just that once i used the module i can't modify function in it and see the effects. It looks like library are not updated. How could we do that ?
- Comment on Simple Module Tutorial
- Select or Download Code
Replies are listed 'Best First'.
![]() |
![]() |
![]() |
Jun 03, 2016 | alvinalexander.com
Perl array FAQ: How can I test to see if a Perl array already contains a given value? (Also written as, How do I search an array with the Perl grep function?)
I use the Perl grep function to see if a Perl array contains a given entry. For instance, in this Perl code:
if ( grep { $_ eq $clientAddress} @ip_addresses ) { # the array already contains this ip address; skip it this time next; } else { # the array does not yet contain this ip address; add it push @ip_addresses, $clientAddress; }I'm testing to see if the Perl array "@ip_addresses" contains an entry given by the variable "$clientAddress".
Just use this Perl array search technique in an "if" clause, as shown, and then add whatever logic you want within your if and else statements. In this case, if the current IP address is not already in the array, I add it to the array in the "else" clause, but of course your logic will be unique.
An easier "Perl array contains" exampleIf it's easier to read without a variable in there, here's another example of this "Perl array contains" code:
if ( grep { $_ eq '192.168.1.100'} @ip_addresses )if you'd like more details, I didn't realize it, but I have another good example out here in my " Perl grep array tutorial ." (It's pretty bad when you can't find things on your own website.)
![]() |
![]() |
![]() |
Aug 06, 2001 | perlmonks.com
So you find the Perl docs on modules a bit confusing? OK here is the world's simplest Perl module demonstrating all the salient features of Exporter and a script that uses this module. We also give a short rundown on @INC and finish with a note on using warnings and modules. Here is the module code. MyModule.pm package MyModule; use strict; use Exporter; use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS); $VERSION = 1.00; @ISA = qw(Exporter); @EXPORT = (); @EXPORT_OK = qw(func1 func2); %EXPORT_TAGS = ( DEFAULT => [qw(&func1)], Both => [qw(&func1 &func2)]); sub func1 { return reverse @_ } sub func2 { return map{ uc }@_ } 1; [download]First we get a namespace by declaring a package name. This helps ensure our module's functions and variables remain separate from any script that uses it.
Use strict is a very good idea for modules to restrict the use of global variables. See use strict warnings and diagnostics or die for more details.
We need to use the Exporter module to export our functions from the MyModule:: namespace into the main:: namespace to make them available to scripts that 'use' MyModule.
We pacify strict with the use vars declaration of some variables. We can use an 'our' declaration in 5.6+
We now set a $VERSION number and make Exporter part of MyModule using the @ISA. See perlboot for all the gory details on what @ISA is or just use it as shown.
@EXPORT contains a list of functions that we export by default, in this case nothing. Generally the less you export by default using @EXPORT the better. This avoids accidentally clashing with functions defined in the script using the module. If a script wants a function let it ask.
@EXPORT_OK contains a list of functions that we export on demand so we export &func1 &func2 only if specifically requested to. Use this in preference to just blindly exporting functions via @EXPORT. You can also export variables like $CONFIG provided they are globals not lexicals scoped with my (read declare them with our or use vars).
%EXPORT_TAGS. For convenience we define two sets of export tags. The ':DEFAULT' tag exports only &func1; the ':Both' tag exports both &func1 &func2. This hash stores labels pointing to array references. In this case the arrays are anonymous.
We need the 1; at the end because when a module loads Perl checks to see that the module returns a true value to ensure it loaded OK. You could put any true value at the end (see Code::Police ) but 1 is the convention.
MyScript.pl (A simple script to use MyModule) #!/usr/bin/perl -w use strict; # you may need to set @INC here (see below) my @list = qw (J u s t ~ A n o t h e r ~ P e r l ~ H a c k e r !); # case 1 # use MyModule; # print func1(@list),"\n"; # print func2(@list),"\n"; # case 2 # use MyModule qw(&func1); # print func1(@list),"\n"; # print MyModule::func2(@list),"\n"; # case 3 # use MyModule qw(:DEFAULT); # print func1(@list),"\n"; # print func2(@list),"\n"; # case 4 # use MyModule qw(:Both); # print func1(@list),"\n"; # print func2(@list),"\n"; [download]We use MyModule in MyScript.pl as shown. Uncomment the examples to see what happens. Just uncomment one at a time.
Case 1: Because our module exports nothing by default we get errors as &funct1 and &funct2 have not been exported thus do not exist in the main:: namespace of the script.
Case 2: This works OK. We ask our module to export the &func1 so we can use it. Although &func2 was not exported we reference it with its full package name so this works OK.
Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT.
Case 4: We specified the export of both our functions with the ':Both' thus this works.
A note on @INCWhen you issue a use MyModule; directive perl searchs the @INC array for a module with the correct name. @INC usually contains:
/perl/lib /perl/site/lib .The . directory (dot dir) is the current working directory. CORE modules are installed under perl/lib whereas non-CORE modules install under perl/site/lib. You can add directories to the module search path in @INC like this:
BEGIN { push @INC, '/my/dir' } # or BEGIN { unshift @INC, '/my/dir' } # or use lib '/my/dir'; [download]We need to use a BEGIN block to shift values into @INC at compile time as this is when perl checks for modules. If you wait until the script is comiled it is too late and perl will throw an exception saying it can't find MyModule in @INC... The difference between pushing a value and unshifting a value into @INC is that perl searches the @INC array for the module starting with the first dir in that array. Thus is you have a MyModule in /perl/lib/ and another in /perl/site/lib/ and another in ./ the one in /perl/lib will be found first and thus the one used. The use lib pragma effectively does the same as the BEGIN { unshift @INC, $dir } block - see perlman:lib:lib for full specifics.
What use Foo::Bar meansuse Foo::Bar does not mean look for a module called "Foo::Bar.pm" in the @INC directories. It means search @INC for a *subdir* called "Foo" and a *module* called "Bar.pm".
Now once we have "use'd" a module its functions are available via the fully specified &PACKAGE::FUNCTION syntax. When we say &Foo::Bar::some_func we are refering to the *package name* not the (dir::)file name that we used in the use. This allows you to have many package names in one use'd file. In practice the names are usually the same.
use Warnings;You should test your module with warnings enabled as this will pick up many subtle (and not so subtle :-) errors. You can activate warnings using the -w flag in the script you use to test the module. If you add use warnings to the module then your module will require Perl 5.6+ as this was not available before then. If you put $^W++ at the top of the module then you will globally enable warnings - this may break *other modules* a script may be using in addition to your module so is rather antisocial. An expert coder here called tye tests with warnings but does not include them directly in his/her modules.
Hope this explains how it works.
cheers
tachyon
UpdateFixed a typo and added a few comments. Thanks to John M. Dlugosz . Rewrote and restyled tute for compatibility with versions of Perl < 5.6 thanks to crazyinsomniac . Also thanks to tye for reminding me that $^W++ is globally scoped and a bit antisocial for a module.
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
John M. Dlugosz (Monsignor) on Aug 06, 2001 at 04:30 UTC
Re: Simple Module TutorialVery nice, getting everything into a short page like that. But, I have a few comments:
Are you sure you want to make $VERSION a float, rather than a v-string? And if so, illustrate the three-digit convention (e.g. 5.005_001 for version 5.5.1).
I'm also shocked that your pm file doesn't use strict !
I would also suggest adding a comment to the 1; line, saying that this means "loaded OK".
-- John
tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTC
Re: Re: Simple Module Tutorial
by tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTCThanks John I've updated the text a bit in line with your suggestions. Forgot the strict in the module! Oops it is back in its rightful place right at the top. I just used the simple $VERSION numbering because this is a simple tute :-) Here is an excerpt from the Exporter manpage for those interested.
Module Version Checking The Exporter module will convert an attempt to import a number from a module into a call to $module_name->require_version($value). This can be used to validate that the version of the module being used is greater than or equal to the required version. The Exporter module supplies a default require_version method which checks the value of $VERSION in the exporting module. Since the default require_version method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09. [download]cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC
Re: Re: Re: Simple Module Tutorial
by John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC Yea, I just posted a tutorial on VERSION.For compatibility with mixing decimals and v-strings, the built-in UNIVERSAL::require_version uses three decimal digits per part.
If you have $MyModule::VERSION= 1.12; (a decimal number) and do a use MyModule 1.20.1 qw/bar/ , it will tell you that the module 1.120 and you asked for 1.020, so that's OK. You expected 1.20 to be greater than 1.12, not-OK.
-- John
tye (Sage) on Aug 06, 2001 at 22:09 UTC
(tye)Re: Simple Module Tutorial$W++ will only give you run-time warnings and will affect other packages. Personally, I don't turn on warnings in modules that I write but I do make a point of testing them with warnings turned on (by putting "#!/usr/bin/perl -w" at the top of my test scripts).
- tye (but my friends call me "Tye")tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTC
Re: (tye)Re: Simple Module Tutorial
by tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTCThanks I changed it from the lexically scoped 'use warnings;' so that this is applicable to versions < 5.6. but as it adds little value to the tutorial and has the unwanted side effects you point out I have just deleted it - saves a few lines of dubious value. I'll add a note on testing with warnings when I have a moment.
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
johnnywang (Priest) on Aug 09, 2004 at 22:58 UTC
Re: Simple Module TutorialQuite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered. Another already mentioned point is not to use @EXPORT too much. So my basic module is like the following (I have a emacs function to output this):
package MyModule; use strict; use Exporter qw(import); our $VERSION = 1.00; our @ISA = qw(Exporter); our @EXPORT_OK = qw(func1 func2); our %EXPORT_TAGS = ( DEFAULT => [qw(func1)], Both => [qw(func1 func2)]); sub func1 { return reverse @_ } sub func2 { return map{ uc }@_ } 1; [download] For those emacs users, here's the simple script to generate the skeleton: (defun perl-new-module () "Generate a skeleton source for a traditional perl module." (interactive) (setq var (split-string (read-from-minibuffer "Enter module name (eg. Web::Test): "nil nil nil nil nil nil) " ")) (setq name (car var)) (insert (format "package %s;\n\n" name)) (insert "use strict;\n\n") (insert "use Exporter qw(import);\n") (insert "our @ISA = qw(Exporter);\n") (insert "our @EXPORT_OK = qw();\n") (insert "our %EXPORT_TAGS = ();\n") (insert "our $VERSION = 1.00; \n\n") (insert "\n\n\n\n\n\n") (insert "1;") (insert "\n") (previous-line 6) (end-of-line) ) [download]adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTC
Re^2: Simple Module Tutorial
by adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTCQuite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered.Some people prefer to use our rather than use vars (I'm one of them) - but the latter is not deprecated. Both do slightly different things and many people still prefer to use vars .
beretboy (Chaplain) on Aug 18, 2001 at 15:07 UTC
Re: Simple Module TutorialExellent tutorial ++! I have never understood the writing of modules till now
"Sanity is the playground of the unimaginative" -Unknown
Codmate (Novice) on Sep 06, 2001 at 20:18 UTC
Re: Re: Simple Module TutorialFantastic stuff - I just converted a full script into a module (and added some bits) and it worked 1st time - WITH NO DEBUGGING REQUIRED (and yes - I am using strict)!!! I was very scared of modules before but now feel like I could write a hundred. Thanks very much for this - invaluable tutorial for a newbie like me :))))))
by Codmate (Novice) on Sep 06, 2001 at 20:18 UTCJaap (Curate) on Jul 22, 2002 at 09:35 UTC
Re: Simple Module TutorialThis is a nice tutorial tachyon. Are you considering writing a more advanced tutorial on modules (combined with OO)?
Especially, what a GOOD module looks like. Should we use carp, dynaloader and what not?
gawatkins (Monsignor) on Apr 10, 2003 at 11:44 UTC
Re: Simple Module TutorialGreat Tutorial, It helped to clear up the muddy water created by my Perl Black Book .
Thanks again,
Greg W.
twotone (Beadle) on Oct 14, 2007 at 05:06 UTC
Re: Simple Module TutorialGreat summary of module basics!
Here's a little code I came up with to add my module location to @INC (in a cgi environment) by dynamically determining the document root for the script. It works on the remote apache server and when testing locally in windows. It might be of some interest:
BEGIN { # get doc root from %ENV # implicitly declare file root path if %ENV not fount my $doc_root = $ENV{DOCUMENT_ROOT} || 'C:/Users/User/Documents/website/sites/mysite'; # change \ to / $doc_root =~ s/\\/\//g; # add module folder location $doc_root .= "/cgi-bin/cms/"; # add module location to @INC push(@INC,$doc_root); } [download]bychan (Initiate) on Jan 28, 2008 at 08:45 UTC
Re^2: Simple Module TutorialThis tutorial is great. The only problem is, that I get the following result, if I comment out all the cases:
by bychan (Initiate) on Jan 28, 2008 at 08:45 UTC!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!Shouldn't there be some error messages or warnings?
chexmix (Hermit) on Aug 12, 2008 at 13:24 UTC
Re: Simple Module TutorialI like this post very much, but the following is opaque to me for some reason:
" Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT."
I confess I still don't know what is going on here, and am wondering if someone can help me out.
The words "is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT" seem inaccurate to me, since when I look up at the code for MyModule.pm, I see the line
%EXPORT_TAGS = ( DEFAULT => qw(&func1) ,
and not the line
%EXPORT_TAGS = ( DEFAULT => \@EXPORT,
Is the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?
Thanks. I feel I am blanking on something obvious here, but just can't see it this rainy morning.
tye (Sage) on Aug 14, 2008 at 15:41 UTC
Re^2: Simple Module Tutorial (DEFAULT)
by tye (Sage) on Aug 14, 2008 at 15:41 UTCIs the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?Yes. Exporter.pm wants :DEFAULT to match @EXPORT so the module is incorrect in trying to define its own meaning for :DEFAULT. Based on what you've quoted, it appears that Exporter.pm forces this issue, but the more important point is that you shouldn't set $EXPORT_TAGS{DEFAULT} yourself.
- tye
sg (Pilgrim) on Feb 05, 2011 at 23:13 UTC
Re: Simple Module TutorialThanks for the exposition; my inclination regarding a simple module is as follows:
MyModule.pm package MyModule; use strict; use warnings; use diagnostics; use Carp; our $VERSION = 1.08; sub see_me { my $foo = shift; print "\t\tDo you see this: $foo?\n"; } 1; __END__ last line of the module needs to be true; last line of the _file_ need not be true: 0; [download]The above module is exercised by the following script:
exercise_my_module.pl
#!/c/opt/perl/bin/perl use strict; use warnings; use diagnostics; use Carp; use MyModule 1.05; #use MyModule 1.10; # will fail MyModule::see_me( 8 ); __END__ [download]chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTC
Re^2: Simple Module TutorialThank you for this post! It has gotten me past the first barrier of writing my own module. Thanks again! - chanslor
by chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTCAnonymous Monk on Mar 09, 2010 at 12:21 UTC
Re: Simple Module TutorialExcellent representation of what seemed a tough nut to swallow. Thank you very much. Tanuj Bhargava
Anonymous Monk on Mar 07, 2011 at 15:12 UTC
Re: Simple Module TutorialThanks to the writer for your trouble. But everyone seems to get it except me. I have tried to duplicate your results.
I have:
- MyScript.pl under /storage/username/PERL_SCRIPTS/dev
- Test.pm under /storage/username/local/perl/site/lib/Test/Test.pm (I just replaced MyModule.pm with Test.pm).
The module code is exactly the same. In MyScript.pl I have added
use lib '/storage/username/local/perl/site/lib'; <p>and typed in the first two cases.</p> <code>perl MyScript.pl [download] gives: Undefined subroutine &main::func1 called at MyScript.pl line 10Line 10 is:
print func1(@list),"\n";after typing "use Test;"
What am I missing here? Also, is the BEGIN command supposed to be used in the Perl script? It gives syntax errors when I try to use it.
Thanks in advance,
Gideon
toolic (Bishop) on Mar 07, 2011 at 15:25 UTC
Re^2: Simple Module Tutorial"Test" is a poor choice of a module name because there is a Core module of the same name ( Test ) which is part of the standard Perl distribution. Furthermore, since you placed your .pm file under a directory named "Test", you would need to type use Test::Test; . I strongly recommend you change the name of your module to something more unique in order to avoid this naming collision.
by toolic (Bishop) on Mar 07, 2011 at 15:25 UTCAnonymous Monk on Mar 07, 2011 at 16:11 UTC
Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:11 UTCDear toolic,
Thank you for your reply. I now have changed all instances of "Test" with "MyModule" and also changed the name of the module. MyModule.pm is now situated on
/storage/username/local/perl/site/liband I use
use lib '/storage/username/local/perl/site/lib'; with the second case (as per the example): # case 2 use MyModule; print func1(@list),"\n"; print MyModule::func2(@list),"\n"; [download]but I still get the same error: Undefined subroutine &main::func1 called at MyScript line 15.
Just to make sure I copied the module exactly from the example but to no avail. Interestingly, when I comment out the print func1 part, the line after that produces correct output. I hope that someone could point out to me where I am at fault.
Best regards,
Gideon
toolic (Bishop) on Mar 07, 2011 at 16:34 UTC
Re^4: Simple Module Tutorial
by toolic (Bishop) on Mar 07, 2011 at 16:34 UTCAnonymous Monk on Mar 07, 2011 at 16:58 UTC
Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:58 UTCHi toolic
For some reason I can't reply to your latest post but thanks a mil! I did copy the module exactly but not the script. Somehow I mixed up case 1 and case 2. I expected case 1 not to work but case 2 but instead of coding
use MyModule qw(&func1);I simply used
use MyModule;Thanks a lot for pointing it out, it seems to work now. I have learned quite a bit.
Best regards,
Gideon
Anonymous Monk on Nov 18, 2014 at 00:13 UTC
Re: Simple Module TutorialThank you for this topic, it is very useful for a beginner. However i had a trouble with use of Module.
Anonymous Monk on Nov 18, 2014 at 01:06 UTC
Re^2: Simple Module TutorialYou did not post any code
by Anonymous Monk on Nov 18, 2014 at 01:06 UTCAnonymous Monk on Dec 16, 2014 at 23:03 UTC
Re^3: Simple Module Tutorial
by Anonymous Monk on Dec 16, 2014 at 23:03 UTC Hello, I used exactly the same code than quoted at Re: Simple Module Tutorial by johnnywang on Aug 09, 2004 at 22:58 UTC I don't get trouble with the code, but it is just that once i used the module i can't modify function in it and see the effects. It looks like library are not updated. How could we do that ?
- Comment on Simple Module Tutorial
- Select or Download Code
Replies are listed 'Best First'.
![]() |
![]() |
![]() |
June 3, 2016 | alvinalexander.com
Perl array FAQ: How can I test to see if a Perl array already contains a given value? (Also written as, How do I search an array with the Perl grep function?)
I use the Perl grep function to see if a Perl array contains a given entry. For instance, in this Perl code:
if ( grep { $_ eq $clientAddress} @ip_addresses ) { # the array already contains this ip address; skip it this time next; } else { # the array does not yet contain this ip address; add it push @ip_addresses, $clientAddress; }I'm testing to see if the Perl array "@ip_addresses" contains an entry given by the variable "$clientAddress".
Just use this Perl array search technique in an "if" clause, as shown, and then add whatever logic you want within your if and else statements. In this case, if the current IP address is not already in the array, I add it to the array in the "else" clause, but of course your logic will be unique.
An easier "Perl array contains" exampleIf it's easier to read without a variable in there, here's another example of this "Perl array contains" code:
if ( grep { $_ eq '192.168.1.100'} @ip_addresses )if you'd like more details, I didn't realize it, but I have another good example out here in my " Perl grep array tutorial ." (It's pretty bad when you can't find things on your own website.)
![]() |
![]() |
![]() |
Nov 22, 2017 | stackoverflow.com
Speeddymon ,2 days ago
I've been reading up on dispatch tables and I get the general idea of how they work, but I'm having some trouble taking what I see online and applying the concept to some code I originally wrote as an ugly mess of if-elsif-else statements.I have options parsing configured by using
GetOpt::Long
, and in turn, those options set a value in the%OPTIONS
hash, depending on the option used.Taking the below code as an example... ( UPDATED WITH MORE DETAIL
use 5.008008; use strict; use warnings; use File::Basename qw(basename); use Getopt::Long qw(HelpMessage VersionMessage :config posix_default require_order no_ignore_case auto_version auto_help); my $EMPTY => q{}; sub usage { my $PROG = basename($0); print {*STDERR} $_ for @_; print {*STDERR} "Try $PROG --help for more information.\n"; exit(1); } sub process_args { my %OPTIONS; $OPTIONS{host} = $EMPTY; $OPTIONS{bash} = 0; $OPTIONS{nic} = 0; $OPTIONS{nicName} = $EMPTY; $OPTIONS{console} = 0; $OPTIONS{virtual} = 0; $OPTIONS{cmdb} = 0; $OPTIONS{policyid} = 0; $OPTIONS{showcompliant} = 0; $OPTIONS{backup} = 0; $OPTIONS{backuphistory} = 0; $OPTIONS{page} = $EMPTY; GetOptions ( 'host|h=s' => \$OPTIONS{host} , 'use-bash-script' => \$OPTIONS{bash} , 'remote-console|r!' => \$OPTIONS{console} , 'virtual-console|v!' => \$OPTIONS{virtual} , 'nic|n!' => \$OPTIONS{nic} , 'nic-name|m=s' => \$OPTIONS{nicName} , 'cmdb|d!' => \$OPTIONS{cmdb} , 'policy|p=i' => \$OPTIONS{policyid} , 'show-compliant|c!' => \$OPTIONS{showcompliant} , 'backup|b!' => \$OPTIONS{backup} , 'backup-history|s!' => \$OPTIONS{backuphistory} , 'page|g=s' => \$OPTIONS{page} , 'help' => sub { HelpMessage(-exitval => 0, -verbose ->1) }, 'version' => sub { VersionMessage() }, ) or usage; if ($OPTIONS{host} eq $EMPTY) { print {*STDERR} "ERROR: Must specify a host with -h flag\n"; HelpMessage; } sanity_check_options(\%OPTIONS); # Parse anything else on the command line and throw usage for (@ARGV) { warn "Unknown argument: $_\n"; HelpMessage; } return {%OPTIONS}; } sub sanity_check_options { my $OPTIONS = shift; if (($OPTIONS->{console}) and ($OPTIONS->{virtual})) { print "ERROR: Cannot use flags -r and -v together\n"; HelpMessage; } elsif (($OPTIONS->{console}) and ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flags -r and -d together\n"; HelpMessage; } elsif (($OPTIONS->{console}) and ($OPTIONS->{backup})) { print "ERROR: Cannot use flags -r and -b together\n"; HelpMessage; } elsif (($OPTIONS->{console}) and ($OPTIONS->{nic})) { print "ERROR: Cannot use flags -r and -n together\n"; HelpMessage; } if (($OPTIONS->{virtual}) and ($OPTIONS->{backup})) { print "ERROR: Cannot use flags -v and -b together\n"; HelpMessage; } elsif (($OPTIONS->{virtual}) and ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flags -v and -d together\n"; HelpMessage; } elsif (($OPTIONS->{virtual}) and ($OPTIONS->{nic})) { print "ERROR: Cannot use flags -v and -n together\n"; HelpMessage; } if (($OPTIONS->{backup}) and ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flags -b and -d together\n"; HelpMessage; } elsif (($OPTIONS->{backup}) and ($OPTIONS->{nic})) { print "ERROR: Cannot use flags -b and -n together\n"; HelpMessage; } if (($OPTIONS->{nic}) and ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flags -n and -d together\n"; HelpMessage; } if (($OPTIONS->{policyid} != 0) and not ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flag -p without also specifying -d\n"; HelpMessage; } if (($OPTIONS->{showcompliant}) and not ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flag -c without also specifying -d\n"; HelpMessage; } if (($OPTIONS->{backuphistory}) and not ($OPTIONS->{backup})) { print "ERROR: Cannot use flag -s without also specifying -b\n"; HelpMessage; } if (($OPTIONS->{nicName}) and not ($OPTIONS->{nic})) { print "ERROR: Cannot use flag -m without also specifying -n\n"; HelpMessage; } return %{$OPTIONS}; }I'd like to turn the above code into a dispatch table, but can't figure out how to do it.
Any help is appreciated.
Jim Garrison ,2 days ago
Are the sets of conflicting options always pairs? Can you have situations where optionsa
,b
, andc
cannot occur together but any two are OK? Before you can pick a representation you need to be sure your model can handle the logic you need in a general way. This is not an easy problem. – Jim Garrison 2 days agosimbabque ,yesterday
Don't use English, it's horribly slow and makes your code harder to read. – simbabque yesterdaySpeeddymon ,yesterday
Removed English module and changed$ARG
/@ARG
to$_
/@_
Added$EMPTY
as I forgot I had it defined globally. – Speeddymon yesterdaySpeeddymon ,yesterday
@JimGarrison -- you are correct. The if-elsif-else does not explicitly account for 3 options that conflict (though it does account for that implicitly) As an example, using-h
is required with all of the other options. But, using-h
,-r
,v
, all together is not allowed, while-h
,-r
, and-d
is allowed. – Speeddymon yesterdayikegami ,yesterday
Since the host must be provided, it should be an argument, not an option. – ikegami yesterdayzdim ,2 days ago
I am not sure how a dispatch table would help since you need to go through pair-wise combinations of specific possibilities, and thus cannot trigger a suitable action by one lookup.Here is another way to organize it
use List::MoreUtils 'firstval'; sub sanity_check_options { my ($OPTIONS, $opt_excl) = @_; # Check each of 'opt_excl' against all other for ConFLict my @excl = sort keys %$opt_excl; while (my $eo = shift @excl) { if (my $cfl = firstval { $OPTIONS->{$eo} and $OPTIONS->{$_} } @excl) { say "Can't use -$opt_excl->{$eo} and -$opt_excl->{$cfl} together"; HelpMessage(); last; } } # Go through specific checks on # policyid, showcompliant, backuphistory, and nicName ... return 1; # or some measure of whether there were errors } # Mutually exclusive options my %opt_excl = ( console => 'r', virtual => 'v', cmdb => 'c', backup => 'b', nic => 'n' ); sanity_check_options(\%OPTIONS, \%opt_excl);This checks all options listed in
%opt_excl
against each other for conflict, removing the segments ofelsif
involving the (five) options that are mutually exclusive. It uses List::MoreUtils::firstval . The few other specific invocations are best checked one by one.There is no use of returning
$OPTIONS
since it is passed as reference so any changes apply to the original structure (while it's not meant to be changed either). Perhaps you can keep track of whether there were errors and return that if it can be used in the caller, or just return1
.This addresses the long
elsif
chain as asked, and doesn't go into the rest of code. Here is one comment though: There is no need for{%OPTIONS}
, which copies the hash in order to create an anonymous one; just usereturn \%OPTIONS;
Comment on possible multiple conflicting options
This answer as it stands does not print all conflicting options that have been used if there are more than two, as raised by ikegami in comments; it does catch any conflicts so that the run is aborted.
The code is readily adjusted for this. Instead of the code in the
if
block either
- set a flag as a conflict is detected and break out of the loop, then print the list of those that must not be used with each other (
values %opt_excl
) or point at the following usage message- collect the conflicts as they are observed; print them after the loop
- or, see a different approach in ikegami's answer
However, one is expected to know of allowed invocations and any listing of conflicts is a courtesy to the forgetful user (or a debugging aid); a usage message is printed as well anyway.
Given the high number of conflicting options the usage message should contain a prominent note on this. Also consider that so many conflicting options may indicate a design flaw.
Finally, this code fully relies on the fact that this processing goes once per run and operates with a handful of options; thus it is not concerned with efficiency and freely uses ancillary data structures.
Speeddymon ,yesterday
Updated the question to clarify. – Speeddymon yesterdayzdim ,yesterday
@Speeddymon Thank you, updated. This brings together checks of those five options which can't go one with another. The remaining few I leave to be checked one by one; "encoding" one or two possibilities in some all-encompassing system would just increase complexity (and may end up less readable). – zdim yesterdayzdim ,yesterday
@Speeddymon Added the missing include,use List::MoreUtils 'firstval'
. Edited a little in the meanwhile, as well. – zdim yesterdaySpeeddymon ,yesterday
Thank you for the easy to follow example. I went with yours as it was the clearest and contained the least duplicate code. – Speeddymon yesterdayikegami ,23 hours ago
@Speeddymon, Apparently, it's not clear as you think since you didn't realize if doesn't work. It doesn't mention the error of using-r
and-c
together if-b
is also provided. And why is a hash being used at all? Wasteful and needlessly complex. – ikegami 23 hours agosimbabque ,yesterday
You can use a dispatch table if there are a lot of options. I would build that table programmatically. It might not be the best option here, but it works and the configuration is more readable than yourelsif
construct.use strict; use warnings; use Ref::Util::XS 'is_arrayref'; # or Ref::Util sub create_key { my $input = shift; # this would come from somewhere else, probably the Getopt config my @opts = qw( host bash nic nicName console virtual cmdb policyid showcompliant backup backuphistory page ); # this is to cover the configuration with easier syntax $input = { map { $_ => 1 } @{$input} } if is_arrayref($input); # options are always prefilled with false values return join q{}, map { $input->{$_} ? 1 : 0 } sort @opts; } my %forbidden_combinations = ( map { create_key( $_->[0] ) => $_->[1] } ( [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ], [ [qw( console cmdb )] => q{Cannot use flags -r and -d together} ], [ [qw( console backup )] => q{Cannot use flags -r and -b together} ], [ [qw( console nic )] => q{Cannot use flags -r and -n together} ], ) ); p %forbidden_combinations; # from Data::PrinterThe output of the
p
function is the dispatch table.{ 00101 "Cannot use flags -r and -v together", 00110 "Cannot use flags -r and -n together", 01100 "Cannot use flags -r and -d together", 10100 "Cannot use flags -r and -b together" }As you can see, we've sorted all the options ascii-betically to use them as keys. That way, you could in theory build all kinds of combinations like exclusive options.
Let's take a look at the configuration itself.
my %forbidden_combinations = ( map { create_key( $_->[0] ) => $_->[1] } ( [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ], # ... ) );We use a list of array references. Each entry is on one line and contains two pieces of information. Using the fat comma
=>
makes it easy to read. The first part, which is much like a key in a hash, is the combination. It's a list of fields that should not occur together. The second element in the array ref is the error message. I've removed all the recurring elements, like the newline, to make it easier to change how and where the error can be displayed.The
map
around this list of combination configuration runs the options through ourcreate_key
function, which translates it to a simple bitmap-style string. We assign all of it to a hash of that map and the error message.Inside
create_key
, we check if it was called with an array reference as its argument. If that's the case, the call was for building the table, and we convert it to a hash reference so we have a proper map to look stuff up in. We know that the%OPTIONS
always contains all the keys that exist, and that those are pre-filled with values that all evaluate to false . We can harness that convert the truthiness of those values to1
or0
, which then builds our key.We will see in a moment why that is useful.
Now how do we use this?
sub HelpMessage { exit; }; # as a placeholder # set up OPTIONS my %OPTIONS = ( host => q{}, bash => 0, nic => 0, nicName => q{}, console => 0, virtual => 0, cmdb => 0, policyid => 0, showcompliant => 0, backup => 0, backuphistory => 0, page => q{}, ); # read options with Getopt::Long ... $OPTIONS{console} = $OPTIONS{virtual} = 1; # ... and check for wrong invocations if ( exists $forbidden_combinations{ my $key = create_key($OPTIONS) } ) { warn "ERROR: $forbidden_combinations{$key}\n"; HelpMessage; }All we need to do now is get the
$OPTIONS
hash reference from Getopt::Long, and pass it through ourcreate_key
function to turn it into the map string. Then we can simply see if that keyexists
in our%forbidden_combinations
dispatch table and show the corresponding error message.
Advantages of this approach
If you want to add more parameters, all you need to do is include them in
@opts
. In a full implementation that would probably be auto-generated from the config for the Getopt call. The keys will change under the hood, but since that is abstracted away you don't have to care.Furthermore, this is easy to read. The
create_key
aside, the actual dispatch table syntax is quite concise and even has documentary character.Disadvantages of this approach
There is a lot of programmatic generation going on for just a single call. It's certainly not the most efficient way to do it.
To take this further, you can write functions that auto-generate entries for certain scenarios.
I suggest you take a look at the second chapter in Mark Jason Dominus' excellent book Higher-Order Perl , which is available for free as a PDF.
Speeddymon ,yesterday
Thank you for the detailed answer. I've updated the question to help clarify how the$OPTIONS
hash is setup. Can your example work within the bounds of what I have already, or should I rewrite the whole thing from scratch? – Speeddymon yesterdaysimbabque ,yesterday
@Speeddymon yeah, that should work. I see you've got%OPTIONS
, and it is always pre-set with values. That's going to be interesting. Let me try. – simbabque yesterdaySpeeddymon ,yesterday
Speaking of the HOP book... That was actually what I was using to try to learn and where I was having trouble in applying the concept to my code. :-) I couldn't find a PDF version before, so thank you for the link! – Speeddymon yesterdaysimbabque ,yesterday
@Speeddymon I've updated the answer and changed it to match your updated code. I suggest you read the diff first. What I don't like about it yet is that the possible keys are there twice, but that can be solved with some more trickery. I think that would blow up the answer even more, so I didn't do that. – simbabque yesterdayikegami ,23 hours ago
Doesn't detect the case when-r
,-v
and-b
are provided as an error. – ikegami 23 hours agoikegami ,2 days ago
You shouldn't be using elsif here because multiple condition could be true. And since multiple conditions could be true, a dispatch table can't be used. Your code can still be simplified greatly.my @errors; push @errors, "ERROR: Host must be provided\n" if !defined($OPTIONS{host}); my @conflicting = map { my ($opt, $flag) = @$_; $OPTIONS->{$opt} ? $flag : () } [ 'console', '-r' ], [ 'virtual', '-v' ], [ 'cmdb', '-d' ], [ 'backup', '-b' ], [ 'nic', '-n' ]; push @errors, "ERROR: Can only use one the following flags at a time: @conflicting\n" if @conflicting > 1; push @errors, "ERROR: Can't use flag -p without also specifying -d\n" if defined($OPTIONS->{policyid}) && !$OPTIONS->{cmdb}; push @errors, "ERROR: Can't use flag -c without also specifying -d\n" if $OPTIONS->{showcompliant} && !$OPTIONS->{cmdb}; push @errors, "ERROR: Can't use flag -s without also specifying -b\n" if $OPTIONS->{backuphistory} && !$OPTIONS->{backup}; push @errors, "ERROR: Can't use flag -m without also specifying -n\n" if defined($OPTIONS->{nicName}) && !$OPTIONS->{nic}; push @errors, "ERROR: Incorrect number of arguments\n" if @ARGV; usage(@errors) if @errors;Note that the above fixes numerous errors in your code.
Help vs Usage Error
--help
should provide the requested help to STDOUT, and shouldn't result in an error exit code.- Usage errors should be printed to STDERR, and should result in an error exit code.
Calling
HelpMessage
indifferently in both situations is therefore incorrect.Create the following sub named
usage
to use (without arguments) whenGetOptions
returns false, and with an error message when some other usage error occurs:use File::Basename qw( basename ); sub usage { my $prog = basename($0); print STDERR $_ for @_; print STDERR "Try '$prog --help' for more information.\n"; exit(1); }Keep using
HelpMessage
in response to--help
, but the defaults for the arguments are not appropriate for--help
. You should use the following:'help' => sub { HelpMessage( -exitval => 0, -verbose => 1 ) },Speeddymon ,yesterday
I wondered if it would be impossible because of multiple conditions being true, but based on other answers, it seems that it is possible to still build a table and compare... – Speeddymon yesterdayikegami ,yesterday
What are you talking about? No answer used a dispatch table. All the answers (including mine) used a (for
ormap
) loop that performs as many checks as there are conditions. The points of a dispatch table is to do a single check no matter how many conditions there are. Since all conditions can be true, you need to check all conditions, so a dispatch table is impossible by definition. (And that's without even mentioning that the value of a dispatch table should be a code reference or similar (something to dispatch to).) – ikegami yesterdayikegami ,yesterday
The difference between mine and the others is that mine avoids using an inefficient unordered hash and uses an efficient ordered list instead. (You could place the list in an array if you prefer.) – ikegami yesterdayikegami ,yesterday
Updated to match updated question. That fact that none of the other answers can be extended for your updated question proves my pointthat trying to put everything into one loop or table just makes things less flexible, longer and more complex. – ikegami yesterdaySpeeddymon ,yesterday
In response to the "help" tip --HelpMessage
is defined byGetOpt::Long
and reads from the PODs at the end of the file. – Speeddymon yesterday
![]() |
![]() |
![]() |
Nov 18, 2017 | www.tutorialspoint.com
Bitwise operator works on bits and perform bit by bit operation. Assume if $a = 60; and $b = 13; Now in binary format they will be as follows − $a = 0011 1100 $b = 0000 1101 ----------------- $a&$b = 0000 1100 $a|$b = 0011 1101 $a^$b = 0011 0001 ~$a = 1100 0011 There are following Bitwise operators supported by Perl language, assume if $a = 60; and $b = 13Example
S.No. Operator & Description 1 & Binary AND Operator copies a bit to the result if it exists in both operands.
Example − ($a & $b) will give 12 which is 0000 1100
2 | Binary OR Operator copies a bit if it exists in eather operand.
Example − ($a | $b) will give 61 which is 0011 1101
3 ^ Binary XOR Operator copies the bit if it is set in one operand but not both.
Example − ($a ^ $b) will give 49 which is 0011 0001
4 ~ Binary Ones Complement Operator is unary and has the efect of 'flipping' bits.
Example − (~$a ) will give -61 which is 1100 0011 in 2's complement form due to a signed binary number.
5 << Binary Left Shift Operator. The left operands value is moved left by the number of bits specified by the right operand.
Example − $a << 2 will give 240 which is 1111 0000
6 >> Binary Right Shift Operator. The left operands value is moved right by the number of bits specified by the right operand.
Example − $a >> 2 will give 15 which is 0000 1111
Try the following example to understand all the bitwise operators available in Perl. Copy and paste the following Perl program in test.pl file and execute this program.
#!/usr/local/bin/perl use integer; $a = 60; $b = 13; print "Value of \$a = $a and value of \$b = $b\n"; $c = $a & $b; print "Value of \$a & \$b = $c\n"; $c = $a | $b; print "Value of \$a | \$b = $c\n"; $c = $a ^ $b; print "Value of \$a ^ \$b = $c\n"; $c = ~$a; print "Value of ~\$a = $c\n"; $c = $a << 2; print "Value of \$a << 2 = $c\n"; $c = $a >> 2; print "Value of \$a >> 2 = $c\n";When the above code is executed, it produces the following result −
Value of $a = 60 and value of $b = 13 Value of $a & $b = 12 Value of $a | $b = 61 Value of $a ^ $b = 49 Value of ~$a = -61 Value of $a << 2 = 240 Value of $a >> 2 = 15
![]() |
![]() |
![]() |
Nov 22, 2017 | stackoverflow.com
Learn more up vote down vote favorite
Geo ,Jun 10, 2010 at 16:39
Let's say I have this list:my @list = qw(one two three four five);and I want to grab all the elements containing
o
. I'd have this:my @containing_o = grep { /o/ } @list;But what would I have to do to also receive an index, or to be able to access the index in
grep
's body?,
my @index_containing_o = grep { $list[$_] =~ /o/ } 0..$#list; # ==> (0,1,3) my %hash_of_containing_o = map { $list[$_]=~/o/?($list[$_]=>$_):() } 0..$#list # ==> ( 'one' => 0, 'two' => 1, 'four' => 3 )
![]() |
![]() |
![]() |
Nov 22, 2017 | alvinalexander.com
Perl grep array FAQ - How to search an array/list of strings By Alvin Alexander. Last updated: June 3 2016 Perl "grep array" FAQ: Can you demonstrate a Perl grep array example? (Related: Can you demonstrate how to search a Perl array?)
A very cool thing about Perl is that you can search lists (arrays) with the Perl grep function. This makes it very easy to find things in large lists -- without having to write your own Perl for/foreach loops.
A simple Perl grep array example (Perl array search)Here's a simple Perl array grep example. First I create a small string array (pizza toppings), and then search the Perl array for the string "pepper":
# create a perl list/array of strings @pizzas = qw(cheese pepperoni veggie sausage spinach garlic); # use the perl grep function to search the @pizzas list for the string "pepper" @results = grep /pepper/, @pizzas; # print the results print "@results\n";As you might guess from looking at the code, my
@results
Perl array prints the following output:pepperoniPerl grep array - case-insensitive searchingIf you're familiar with Perl regular expressions, you might also guess that it's very easy to make this Perl array search example case-insensitive using the standard
i
operator at the end of my search string.Here's what our Perl grep array example looks like with this change:
@results = grep /pepper/i, @pizzas;Perl grep array and regular expressions (regex)You can also use more complex Perl regular expressions (regex) in your array search. For instance, if for some reason you wanted to find all strings in your array that contain at least eight consecutive word characters, you could use this search pattern:
@results = grep /\w{8}/, @pizzas;That example results in the following output:
pepperoniPerl grep array - SummaryI hope this Perl grep array example (Perl array search example) has been helpful. For related Perl examples, see the Related block on this web page, or use the search form on this website. If you have any questions, or better yet, more Perl array search examples, feel free to use the Comments section below.
![]() |
![]() |
![]() |
Nov 22, 2017 | stackoverflow.com
Perl: Searching for item in an Array Ask Question up vote down vote favorite 1
Majic Johnson ,Apr 20, 2012 at 4:53
Given anarray @A
we want to check if theelement $B
is in it. One way is to say this:Foreach $element (@A){ if($element eq $B){ print "$B is in array A"; } }However when it gets to Perl, I am thinking always about the most elegant way. And this is what I am thinking: Is there a way to find out if array A contains B if we convert A to a variable string and use
index(@A,$B)=>0Is that possible?
cHao ,Apr 20, 2012 at 4:55
grep { $_ eq $B } @A
? – cHao Apr 20 '12 at 4:55daxim ,Apr 20, 2012 at 7:06
Related: stackoverflow.com/questions/7898499/ stackoverflow.com/questions/3086874/ – daxim Apr 20 '12 at 7:06Nikhil Jain ,Apr 20, 2012 at 5:49
There are many ways to find out whether the element is present in the array or not:
- Using foreach
foreach my $element (@a) { if($element eq $b) { # do something last; } }- Using Grep:
my $found = grep { $_ eq $b } @a;- Using List::Util module
use List::Util qw(first); my $found = first { $_ eq $b } @a;- Using Hash initialised by a Slice
my %check; @check{@a} = (); my $found = exists $check{$b};- Using Hash initialised by map
my %check = map { $_ => 1 } @a; my $found = $check{$b};pilcrow ,May 2, 2012 at 19:56
The List::Util::first() example is (potentially) subtly incorrect when searching for false values, since$found
will also evaluate false. (die unless $found
... oops!) List::MoreUtils::any does the right thing here. – pilcrow May 2 '12 at 19:56yazu ,Apr 20, 2012 at 4:56
use 5.10.1; $B ~~ @A and say '$B in @A';brian d foy ,Apr 20, 2012 at 13:07
You have to be very careful with this because this distributes the match over the elements. If @A has an array reference element that contains $B, this will still match even though $B isn't a top level element of @A. The smart match is fundamentally broken for this and many other reasons. – brian d foy Apr 20 '12 at 13:07obmib ,Apr 20, 2012 at 5:51
use List::AllUtils qw/ any /; print "\@A contains $B" if any { $B eq $_ } @A;bvr ,Apr 20, 2012 at 7:43
I would recommendfirst
in this case, as it does not have to traverse whole array. It can stop when item is found. – bvr Apr 20 '12 at 7:43brian d foy ,Apr 20, 2012 at 13:10
any can stop too because it needs only one element to be true. – brian d foy Apr 20 '12 at 13:10pilcrow ,May 3, 2012 at 1:38
Beware thatfirst
can also return a false value if it finds, e.g., "0", which would confound the example given in this answer.any
has the desired semantics. – pilcrow May 3 '12 at 1:38
![]() |
![]() |
![]() |
Nov 22, 2017 | stackoverflow.com
Speeddymon ,2 days ago
I've been reading up on dispatch tables and I get the general idea of how they work, but I'm having some trouble taking what I see online and applying the concept to some code I originally wrote as an ugly mess of if-elsif-else statements.I have options parsing configured by using
GetOpt::Long
, and in turn, those options set a value in the%OPTIONS
hash, depending on the option used.Taking the below code as an example... ( UPDATED WITH MORE DETAIL
use 5.008008; use strict; use warnings; use File::Basename qw(basename); use Getopt::Long qw(HelpMessage VersionMessage :config posix_default require_order no_ignore_case auto_version auto_help); my $EMPTY => q{}; sub usage { my $PROG = basename($0); print {*STDERR} $_ for @_; print {*STDERR} "Try $PROG --help for more information.\n"; exit(1); } sub process_args { my %OPTIONS; $OPTIONS{host} = $EMPTY; $OPTIONS{bash} = 0; $OPTIONS{nic} = 0; $OPTIONS{nicName} = $EMPTY; $OPTIONS{console} = 0; $OPTIONS{virtual} = 0; $OPTIONS{cmdb} = 0; $OPTIONS{policyid} = 0; $OPTIONS{showcompliant} = 0; $OPTIONS{backup} = 0; $OPTIONS{backuphistory} = 0; $OPTIONS{page} = $EMPTY; GetOptions ( 'host|h=s' => \$OPTIONS{host} , 'use-bash-script' => \$OPTIONS{bash} , 'remote-console|r!' => \$OPTIONS{console} , 'virtual-console|v!' => \$OPTIONS{virtual} , 'nic|n!' => \$OPTIONS{nic} , 'nic-name|m=s' => \$OPTIONS{nicName} , 'cmdb|d!' => \$OPTIONS{cmdb} , 'policy|p=i' => \$OPTIONS{policyid} , 'show-compliant|c!' => \$OPTIONS{showcompliant} , 'backup|b!' => \$OPTIONS{backup} , 'backup-history|s!' => \$OPTIONS{backuphistory} , 'page|g=s' => \$OPTIONS{page} , 'help' => sub { HelpMessage(-exitval => 0, -verbose ->1) }, 'version' => sub { VersionMessage() }, ) or usage; if ($OPTIONS{host} eq $EMPTY) { print {*STDERR} "ERROR: Must specify a host with -h flag\n"; HelpMessage; } sanity_check_options(\%OPTIONS); # Parse anything else on the command line and throw usage for (@ARGV) { warn "Unknown argument: $_\n"; HelpMessage; } return {%OPTIONS}; } sub sanity_check_options { my $OPTIONS = shift; if (($OPTIONS->{console}) and ($OPTIONS->{virtual})) { print "ERROR: Cannot use flags -r and -v together\n"; HelpMessage; } elsif (($OPTIONS->{console}) and ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flags -r and -d together\n"; HelpMessage; } elsif (($OPTIONS->{console}) and ($OPTIONS->{backup})) { print "ERROR: Cannot use flags -r and -b together\n"; HelpMessage; } elsif (($OPTIONS->{console}) and ($OPTIONS->{nic})) { print "ERROR: Cannot use flags -r and -n together\n"; HelpMessage; } if (($OPTIONS->{virtual}) and ($OPTIONS->{backup})) { print "ERROR: Cannot use flags -v and -b together\n"; HelpMessage; } elsif (($OPTIONS->{virtual}) and ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flags -v and -d together\n"; HelpMessage; } elsif (($OPTIONS->{virtual}) and ($OPTIONS->{nic})) { print "ERROR: Cannot use flags -v and -n together\n"; HelpMessage; } if (($OPTIONS->{backup}) and ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flags -b and -d together\n"; HelpMessage; } elsif (($OPTIONS->{backup}) and ($OPTIONS->{nic})) { print "ERROR: Cannot use flags -b and -n together\n"; HelpMessage; } if (($OPTIONS->{nic}) and ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flags -n and -d together\n"; HelpMessage; } if (($OPTIONS->{policyid} != 0) and not ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flag -p without also specifying -d\n"; HelpMessage; } if (($OPTIONS->{showcompliant}) and not ($OPTIONS->{cmdb})) { print "ERROR: Cannot use flag -c without also specifying -d\n"; HelpMessage; } if (($OPTIONS->{backuphistory}) and not ($OPTIONS->{backup})) { print "ERROR: Cannot use flag -s without also specifying -b\n"; HelpMessage; } if (($OPTIONS->{nicName}) and not ($OPTIONS->{nic})) { print "ERROR: Cannot use flag -m without also specifying -n\n"; HelpMessage; } return %{$OPTIONS}; }I'd like to turn the above code into a dispatch table, but can't figure out how to do it.
Any help is appreciated.
Jim Garrison ,2 days ago
Are the sets of conflicting options always pairs? Can you have situations where optionsa
,b
, andc
cannot occur together but any two are OK? Before you can pick a representation you need to be sure your model can handle the logic you need in a general way. This is not an easy problem. – Jim Garrison 2 days agosimbabque ,yesterday
Don't use English, it's horribly slow and makes your code harder to read. – simbabque yesterdaySpeeddymon ,yesterday
Removed English module and changed$ARG
/@ARG
to$_
/@_
Added$EMPTY
as I forgot I had it defined globally. – Speeddymon yesterdaySpeeddymon ,yesterday
@JimGarrison -- you are correct. The if-elsif-else does not explicitly account for 3 options that conflict (though it does account for that implicitly) As an example, using-h
is required with all of the other options. But, using-h
,-r
,v
, all together is not allowed, while-h
,-r
, and-d
is allowed. – Speeddymon yesterdayikegami ,yesterday
Since the host must be provided, it should be an argument, not an option. – ikegami yesterdayzdim ,2 days ago
I am not sure how a dispatch table would help since you need to go through pair-wise combinations of specific possibilities, and thus cannot trigger a suitable action by one lookup.Here is another way to organize it
use List::MoreUtils 'firstval'; sub sanity_check_options { my ($OPTIONS, $opt_excl) = @_; # Check each of 'opt_excl' against all other for ConFLict my @excl = sort keys %$opt_excl; while (my $eo = shift @excl) { if (my $cfl = firstval { $OPTIONS->{$eo} and $OPTIONS->{$_} } @excl) { say "Can't use -$opt_excl->{$eo} and -$opt_excl->{$cfl} together"; HelpMessage(); last; } } # Go through specific checks on # policyid, showcompliant, backuphistory, and nicName ... return 1; # or some measure of whether there were errors } # Mutually exclusive options my %opt_excl = ( console => 'r', virtual => 'v', cmdb => 'c', backup => 'b', nic => 'n' ); sanity_check_options(\%OPTIONS, \%opt_excl);This checks all options listed in
%opt_excl
against each other for conflict, removing the segments ofelsif
involving the (five) options that are mutually exclusive. It uses List::MoreUtils::firstval . The few other specific invocations are best checked one by one.There is no use of returning
$OPTIONS
since it is passed as reference so any changes apply to the original structure (while it's not meant to be changed either). Perhaps you can keep track of whether there were errors and return that if it can be used in the caller, or just return1
.This addresses the long
elsif
chain as asked, and doesn't go into the rest of code. Here is one comment though: There is no need for{%OPTIONS}
, which copies the hash in order to create an anonymous one; just usereturn \%OPTIONS;
Comment on possible multiple conflicting options
This answer as it stands does not print all conflicting options that have been used if there are more than two, as raised by ikegami in comments; it does catch any conflicts so that the run is aborted.
The code is readily adjusted for this. Instead of the code in the
if
block either
- set a flag as a conflict is detected and break out of the loop, then print the list of those that must not be used with each other (
values %opt_excl
) or point at the following usage message- collect the conflicts as they are observed; print them after the loop
- or, see a different approach in ikegami's answer
However, one is expected to know of allowed invocations and any listing of conflicts is a courtesy to the forgetful user (or a debugging aid); a usage message is printed as well anyway.
Given the high number of conflicting options the usage message should contain a prominent note on this. Also consider that so many conflicting options may indicate a design flaw.
Finally, this code fully relies on the fact that this processing goes once per run and operates with a handful of options; thus it is not concerned with efficiency and freely uses ancillary data structures.
Speeddymon ,yesterday
Updated the question to clarify. – Speeddymon yesterdayzdim ,yesterday
@Speeddymon Thank you, updated. This brings together checks of those five options which can't go one with another. The remaining few I leave to be checked one by one; "encoding" one or two possibilities in some all-encompassing system would just increase complexity (and may end up less readable). – zdim yesterdayzdim ,yesterday
@Speeddymon Added the missing include,use List::MoreUtils 'firstval'
. Edited a little in the meanwhile, as well. – zdim yesterdaySpeeddymon ,yesterday
Thank you for the easy to follow example. I went with yours as it was the clearest and contained the least duplicate code. – Speeddymon yesterdayikegami ,23 hours ago
@Speeddymon, Apparently, it's not clear as you think since you didn't realize if doesn't work. It doesn't mention the error of using-r
and-c
together if-b
is also provided. And why is a hash being used at all? Wasteful and needlessly complex. – ikegami 23 hours agosimbabque ,yesterday
You can use a dispatch table if there are a lot of options. I would build that table programmatically. It might not be the best option here, but it works and the configuration is more readable than yourelsif
construct.use strict; use warnings; use Ref::Util::XS 'is_arrayref'; # or Ref::Util sub create_key { my $input = shift; # this would come from somewhere else, probably the Getopt config my @opts = qw( host bash nic nicName console virtual cmdb policyid showcompliant backup backuphistory page ); # this is to cover the configuration with easier syntax $input = { map { $_ => 1 } @{$input} } if is_arrayref($input); # options are always prefilled with false values return join q{}, map { $input->{$_} ? 1 : 0 } sort @opts; } my %forbidden_combinations = ( map { create_key( $_->[0] ) => $_->[1] } ( [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ], [ [qw( console cmdb )] => q{Cannot use flags -r and -d together} ], [ [qw( console backup )] => q{Cannot use flags -r and -b together} ], [ [qw( console nic )] => q{Cannot use flags -r and -n together} ], ) ); p %forbidden_combinations; # from Data::PrinterThe output of the
p
function is the dispatch table.{ 00101 "Cannot use flags -r and -v together", 00110 "Cannot use flags -r and -n together", 01100 "Cannot use flags -r and -d together", 10100 "Cannot use flags -r and -b together" }As you can see, we've sorted all the options ascii-betically to use them as keys. That way, you could in theory build all kinds of combinations like exclusive options.
Let's take a look at the configuration itself.
my %forbidden_combinations = ( map { create_key( $_->[0] ) => $_->[1] } ( [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ], # ... ) );We use a list of array references. Each entry is on one line and contains two pieces of information. Using the fat comma
=>
makes it easy to read. The first part, which is much like a key in a hash, is the combination. It's a list of fields that should not occur together. The second element in the array ref is the error message. I've removed all the recurring elements, like the newline, to make it easier to change how and where the error can be displayed.The
map
around this list of combination configuration runs the options through ourcreate_key
function, which translates it to a simple bitmap-style string. We assign all of it to a hash of that map and the error message.Inside
create_key
, we check if it was called with an array reference as its argument. If that's the case, the call was for building the table, and we convert it to a hash reference so we have a proper map to look stuff up in. We know that the%OPTIONS
always contains all the keys that exist, and that those are pre-filled with values that all evaluate to false . We can harness that convert the truthiness of those values to1
or0
, which then builds our key.We will see in a moment why that is useful.
Now how do we use this?
sub HelpMessage { exit; }; # as a placeholder # set up OPTIONS my %OPTIONS = ( host => q{}, bash => 0, nic => 0, nicName => q{}, console => 0, virtual => 0, cmdb => 0, policyid => 0, showcompliant => 0, backup => 0, backuphistory => 0, page => q{}, ); # read options with Getopt::Long ... $OPTIONS{console} = $OPTIONS{virtual} = 1; # ... and check for wrong invocations if ( exists $forbidden_combinations{ my $key = create_key($OPTIONS) } ) { warn "ERROR: $forbidden_combinations{$key}\n"; HelpMessage; }All we need to do now is get the
$OPTIONS
hash reference from Getopt::Long, and pass it through ourcreate_key
function to turn it into the map string. Then we can simply see if that keyexists
in our%forbidden_combinations
dispatch table and show the corresponding error message.
Advantages of this approach
If you want to add more parameters, all you need to do is include them in
@opts
. In a full implementation that would probably be auto-generated from the config for the Getopt call. The keys will change under the hood, but since that is abstracted away you don't have to care.Furthermore, this is easy to read. The
create_key
aside, the actual dispatch table syntax is quite concise and even has documentary character.Disadvantages of this approach
There is a lot of programmatic generation going on for just a single call. It's certainly not the most efficient way to do it.
To take this further, you can write functions that auto-generate entries for certain scenarios.
I suggest you take a look at the second chapter in Mark Jason Dominus' excellent book Higher-Order Perl , which is available for free as a PDF.
Speeddymon ,yesterday
Thank you for the detailed answer. I've updated the question to help clarify how the$OPTIONS
hash is setup. Can your example work within the bounds of what I have already, or should I rewrite the whole thing from scratch? – Speeddymon yesterdaysimbabque ,yesterday
@Speeddymon yeah, that should work. I see you've got%OPTIONS
, and it is always pre-set with values. That's going to be interesting. Let me try. – simbabque yesterdaySpeeddymon ,yesterday
Speaking of the HOP book... That was actually what I was using to try to learn and where I was having trouble in applying the concept to my code. :-) I couldn't find a PDF version before, so thank you for the link! – Speeddymon yesterdaysimbabque ,yesterday
@Speeddymon I've updated the answer and changed it to match your updated code. I suggest you read the diff first. What I don't like about it yet is that the possible keys are there twice, but that can be solved with some more trickery. I think that would blow up the answer even more, so I didn't do that. – simbabque yesterdayikegami ,23 hours ago
Doesn't detect the case when-r
,-v
and-b
are provided as an error. – ikegami 23 hours agoikegami ,2 days ago
You shouldn't be using elsif here because multiple condition could be true. And since multiple conditions could be true, a dispatch table can't be used. Your code can still be simplified greatly.my @errors; push @errors, "ERROR: Host must be provided\n" if !defined($OPTIONS{host}); my @conflicting = map { my ($opt, $flag) = @$_; $OPTIONS->{$opt} ? $flag : () } [ 'console', '-r' ], [ 'virtual', '-v' ], [ 'cmdb', '-d' ], [ 'backup', '-b' ], [ 'nic', '-n' ]; push @errors, "ERROR: Can only use one the following flags at a time: @conflicting\n" if @conflicting > 1; push @errors, "ERROR: Can't use flag -p without also specifying -d\n" if defined($OPTIONS->{policyid}) && !$OPTIONS->{cmdb}; push @errors, "ERROR: Can't use flag -c without also specifying -d\n" if $OPTIONS->{showcompliant} && !$OPTIONS->{cmdb}; push @errors, "ERROR: Can't use flag -s without also specifying -b\n" if $OPTIONS->{backuphistory} && !$OPTIONS->{backup}; push @errors, "ERROR: Can't use flag -m without also specifying -n\n" if defined($OPTIONS->{nicName}) && !$OPTIONS->{nic}; push @errors, "ERROR: Incorrect number of arguments\n" if @ARGV; usage(@errors) if @errors;Note that the above fixes numerous errors in your code.
Help vs Usage Error
--help
should provide the requested help to STDOUT, and shouldn't result in an error exit code.- Usage errors should be printed to STDERR, and should result in an error exit code.
Calling
HelpMessage
indifferently in both situations is therefore incorrect.Create the following sub named
usage
to use (without arguments) whenGetOptions
returns false, and with an error message when some other usage error occurs:use File::Basename qw( basename ); sub usage { my $prog = basename($0); print STDERR $_ for @_; print STDERR "Try '$prog --help' for more information.\n"; exit(1); }Keep using
HelpMessage
in response to--help
, but the defaults for the arguments are not appropriate for--help
. You should use the following:'help' => sub { HelpMessage( -exitval => 0, -verbose => 1 ) },Speeddymon ,yesterday
I wondered if it would be impossible because of multiple conditions being true, but based on other answers, it seems that it is possible to still build a table and compare... – Speeddymon yesterdayikegami ,yesterday
What are you talking about? No answer used a dispatch table. All the answers (including mine) used a (for
ormap
) loop that performs as many checks as there are conditions. The points of a dispatch table is to do a single check no matter how many conditions there are. Since all conditions can be true, you need to check all conditions, so a dispatch table is impossible by definition. (And that's without even mentioning that the value of a dispatch table should be a code reference or similar (something to dispatch to).) – ikegami yesterdayikegami ,yesterday
The difference between mine and the others is that mine avoids using an inefficient unordered hash and uses an efficient ordered list instead. (You could place the list in an array if you prefer.) – ikegami yesterdayikegami ,yesterday
Updated to match updated question. That fact that none of the other answers can be extended for your updated question proves my pointthat trying to put everything into one loop or table just makes things less flexible, longer and more complex. – ikegami yesterdaySpeeddymon ,yesterday
In response to the "help" tip --HelpMessage
is defined byGetOpt::Long
and reads from the PODs at the end of the file. – Speeddymon yesterday
![]() |
![]() |
![]() |
Nov 18, 2017 | www.tutorialspoint.com
Bitwise operator works on bits and perform bit by bit operation. Assume if $a = 60; and $b = 13; Now in binary format they will be as follows − $a = 0011 1100 $b = 0000 1101 ----------------- $a&$b = 0000 1100 $a|$b = 0011 1101 $a^$b = 0011 0001 ~$a = 1100 0011 There are following Bitwise operators supported by Perl language, assume if $a = 60; and $b = 13Example
S.No. Operator & Description 1 & Binary AND Operator copies a bit to the result if it exists in both operands.
Example − ($a & $b) will give 12 which is 0000 1100
2 | Binary OR Operator copies a bit if it exists in eather operand.
Example − ($a | $b) will give 61 which is 0011 1101
3 ^ Binary XOR Operator copies the bit if it is set in one operand but not both.
Example − ($a ^ $b) will give 49 which is 0011 0001
4 ~ Binary Ones Complement Operator is unary and has the efect of 'flipping' bits.
Example − (~$a ) will give -61 which is 1100 0011 in 2's complement form due to a signed binary number.
5 << Binary Left Shift Operator. The left operands value is moved left by the number of bits specified by the right operand.
Example − $a << 2 will give 240 which is 1111 0000
6 >> Binary Right Shift Operator. The left operands value is moved right by the number of bits specified by the right operand.
Example − $a >> 2 will give 15 which is 0000 1111
Try the following example to understand all the bitwise operators available in Perl. Copy and paste the following Perl program in test.pl file and execute this program.
#!/usr/local/bin/perl use integer; $a = 60; $b = 13; print "Value of \$a = $a and value of \$b = $b\n"; $c = $a & $b; print "Value of \$a & \$b = $c\n"; $c = $a | $b; print "Value of \$a | \$b = $c\n"; $c = $a ^ $b; print "Value of \$a ^ \$b = $c\n"; $c = ~$a; print "Value of ~\$a = $c\n"; $c = $a << 2; print "Value of \$a << 2 = $c\n"; $c = $a >> 2; print "Value of \$a >> 2 = $c\n";When the above code is executed, it produces the following result −
Value of $a = 60 and value of $b = 13 Value of $a & $b = 12 Value of $a | $b = 61 Value of $a ^ $b = 49 Value of ~$a = -61 Value of $a << 2 = 240 Value of $a >> 2 = 15
![]() |
![]() |
![]() |
Nov 22, 2017 | alvinalexander.com
Perl grep array FAQ - How to search an array/list of strings By Alvin Alexander. Last updated: June 3 2016 Perl "grep array" FAQ: Can you demonstrate a Perl grep array example? (Related: Can you demonstrate how to search a Perl array?)
A very cool thing about Perl is that you can search lists (arrays) with the Perl grep function. This makes it very easy to find things in large lists -- without having to write your own Perl for/foreach loops.
A simple Perl grep array example (Perl array search)Here's a simple Perl array grep example. First I create a small string array (pizza toppings), and then search the Perl array for the string "pepper":
# create a perl list/array of strings @pizzas = qw(cheese pepperoni veggie sausage spinach garlic); # use the perl grep function to search the @pizzas list for the string "pepper" @results = grep /pepper/, @pizzas; # print the results print "@results\n";As you might guess from looking at the code, my
@results
Perl array prints the following output:pepperoniPerl grep array - case-insensitive searchingIf you're familiar with Perl regular expressions, you might also guess that it's very easy to make this Perl array search example case-insensitive using the standard
i
operator at the end of my search string.Here's what our Perl grep array example looks like with this change:
@results = grep /pepper/i, @pizzas;Perl grep array and regular expressions (regex)You can also use more complex Perl regular expressions (regex) in your array search. For instance, if for some reason you wanted to find all strings in your array that contain at least eight consecutive word characters, you could use this search pattern:
@results = grep /\w{8}/, @pizzas;That example results in the following output:
pepperoniPerl grep array - SummaryI hope this Perl grep array example (Perl array search example) has been helpful. For related Perl examples, see the Related block on this web page, or use the search form on this website. If you have any questions, or better yet, more Perl array search examples, feel free to use the Comments section below.
![]() |
![]() |
![]() |
Nov 22, 2017 | stackoverflow.com
Perl: Searching for item in an Array Ask Question up vote down vote favorite 1
Majic Johnson ,Apr 20, 2012 at 4:53
Given anarray @A
we want to check if theelement $B
is in it. One way is to say this:Foreach $element (@A){ if($element eq $B){ print "$B is in array A"; } }However when it gets to Perl, I am thinking always about the most elegant way. And this is what I am thinking: Is there a way to find out if array A contains B if we convert A to a variable string and use
index(@A,$B)=>0Is that possible?
cHao ,Apr 20, 2012 at 4:55
grep { $_ eq $B } @A
? – cHao Apr 20 '12 at 4:55daxim ,Apr 20, 2012 at 7:06
Related: stackoverflow.com/questions/7898499/ stackoverflow.com/questions/3086874/ – daxim Apr 20 '12 at 7:06Nikhil Jain ,Apr 20, 2012 at 5:49
There are many ways to find out whether the element is present in the array or not:
- Using foreach
foreach my $element (@a) { if($element eq $b) { # do something last; } }- Using Grep:
my $found = grep { $_ eq $b } @a;- Using List::Util module
use List::Util qw(first); my $found = first { $_ eq $b } @a;- Using Hash initialised by a Slice
my %check; @check{@a} = (); my $found = exists $check{$b};- Using Hash initialised by map
my %check = map { $_ => 1 } @a; my $found = $check{$b};pilcrow ,May 2, 2012 at 19:56
The List::Util::first() example is (potentially) subtly incorrect when searching for false values, since$found
will also evaluate false. (die unless $found
... oops!) List::MoreUtils::any does the right thing here. – pilcrow May 2 '12 at 19:56yazu ,Apr 20, 2012 at 4:56
use 5.10.1; $B ~~ @A and say '$B in @A';brian d foy ,Apr 20, 2012 at 13:07
You have to be very careful with this because this distributes the match over the elements. If @A has an array reference element that contains $B, this will still match even though $B isn't a top level element of @A. The smart match is fundamentally broken for this and many other reasons. – brian d foy Apr 20 '12 at 13:07obmib ,Apr 20, 2012 at 5:51
use List::AllUtils qw/ any /; print "\@A contains $B" if any { $B eq $_ } @A;bvr ,Apr 20, 2012 at 7:43
I would recommendfirst
in this case, as it does not have to traverse whole array. It can stop when item is found. – bvr Apr 20 '12 at 7:43brian d foy ,Apr 20, 2012 at 13:10
any can stop too because it needs only one element to be true. – brian d foy Apr 20 '12 at 13:10pilcrow ,May 3, 2012 at 1:38
Beware thatfirst
can also return a false value if it finds, e.g., "0", which would confound the example given in this answer.any
has the desired semantics. – pilcrow May 3 '12 at 1:38
![]() |
![]() |
![]() |
Nov 22, 2017 | stackoverflow.com
Learn more up vote down vote favorite
Geo ,Jun 10, 2010 at 16:39
Let's say I have this list:my @list = qw(one two three four five);and I want to grab all the elements containing
o
. I'd have this:my @containing_o = grep { /o/ } @list;But what would I have to do to also receive an index, or to be able to access the index in
grep
's body?,
my @index_containing_o = grep { $list[$_] =~ /o/ } 0..$#list; # ==> (0,1,3) my %hash_of_containing_o = map { $list[$_]=~/o/?($list[$_]=>$_):() } 0..$#list # ==> ( 'one' => 0, 'two' => 1, 'four' => 3 )
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
Discipulus (Monsignor) on Nov 16, 2017 at 09:04 UTC
Re: perl modulesHello codestroman and welcome to the monastery and to the wonderful world of Perl!
First of all, please, add <c> code tags </c> around your code and output.
Then be sure to have read the standard documentation: perlmod and perlnewmod
Infact a basic Perl module define a package and use Exporter to export functions in the using Perl program.
In my homenode i've collected a lot of links on about module creation
L*
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.thanos1983 (Priest) on Nov 16, 2017 at 09:17 UTC
Re: perl modulesHello codestroman
Just to add a minor suggestion here, to the full cover reply of fellow monk Discipulus . It will assist you a lot also to read Simple Module Tutorial
Update: Direct answer to your question can be found here How to add a relative directory to @INC with multiple possible solutions. I would strongly recommend to go through all the articles that all monks proposed.
Hope this helps, BR.
Seeking for Perl wisdom...on the process of learning...not there...yet!hippo (Abbot) on Nov 16, 2017 at 09:21 UTC
Re: perl modules (Can't locate in @INC)PLEASE HELP!!This is a monastery - a place of quite contemplation. The louder you shout the less wisdom shall you receive.
The error message Can't locate dog.pm in @INC is pretty explicit. Either your module file is not called dog.pm in which case, change it or else your file dog.pm is not in any of the directories listed in @INC in which case either move it to one of those directories or else change @INC with use lib .
I also see, despite the lack of formatting in your post that your module doesn't use any namespace. You should probably address that. Perhaps a solid read through Simple Module Tutorial would be a good idea?
Anonymous Monk on Nov 16, 2017 at 09:07 UTC
Re: perl modulesuse an absolute pathname in use lib
Anonymous Monk on Nov 16, 2017 at 15:16 UTC
Re: perl modulesWelcome to the language ... and, to the Monastery. The "simple module tutorial" listed above is a very good place to start. Like all languages of its kind, Perl looks at runtime for external modules in a prescribed list of places, in a specified order. You can affect this in several ways, as the tutorials describe. Please read them carefully.
In the Perl(-5) language, this list is stored in a pre-defined array variable called @INC and it is populated from a variety of sources: a base-list that is compiled directly into the Perl interpreter, the PERL5LIB environment-variable, use lib statements, and even direct modification of the variable itself. Perl searches this list from beginning to end and processes (only) the first matching file that it finds.
(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)
Corion (Pope) on Nov 16, 2017 at 15:23 UTC
Re^2: perl modules
by Corion (Pope) on Nov 16, 2017 at 15:23 UTC(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)Please no.
The word "pragma" has a special meaning in Perl, and it is highly confusing to claim that a Perl "keyword" would be a "pragma". use certainly is a keyword and nothing else.
If you mean to say something different, please describe in more words what you want to say.
![]() |
![]() |
![]() |
Nov 18, 2017 | www.youtube.com
The command line debugger that comes with perl is very powerful.
Not only does it allow us to debug script but it can be used as a REPL - a Read Eval Print Loop to explore the capabilities of the language. There are a few basic examples in this screencast.http://perlmaven.com/using-the-built-...
To see all the Perl tutorials visit http://perlmaven.com/perl-tutorial
About Perl Programming and Perl programmers.
In this screencast:
perl -d e 1
p - print scalar
x - print data structure
b subname - set breakpoint
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
nikmit has asked for the wisdom of the Perl Monks concerning the following question:
Dear monks,
I came across this behaviour in perl which I find unintuitive, was wondering what the use case scenario for it is or whether I have done something wrong to bring it about...
I had a statement checking for the existence of data like so return 0 unless keys %{$hashref->{$key}} and I failed to realise that $key may not always exist.
I would have expected to see an error if $href->{$key} is undefined and therefore not a reference, but instead $key was just added to the hash.
Example:
#!/usr/bin/perl -w #perl-5.22.3 use strict; my $href = { cat => {milk => 1}, dog => {bone => 1} }; if (keys %{$href->{cow}}) { print "noop\n"; } else { if (exists $href->{cow}) { print "holy cow\n"; } else { print "no cow\n"; } } [download]This prints 'holy cow'
Discipulus (Monsignor) on Nov 17, 2017 at 09:32 UTC
Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?Hello nikmit ,
it's not a case of autovivification?
It is explained in perlref and for more informations see Explaining Autovivication and Autovivification in perl and https://perlmaven.com/autovivification
On CPAN there is a pragma to disable it if unwanted.
PS exists fixes your snippet: if (exists $href->{cow} and keys %{$href->{cow}}) { # no cow
L*
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.1nickt (Prior) on Nov 17, 2017 at 13:54 UTC
Re^2: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?
by 1nickt (Prior) on Nov 17, 2017 at 13:54 UTCCare must be used with exists as it will indeed autovivify intermediate hashes:
use strict; use warnings; use feature 'say'; use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1; my $href = { cat => { milk => 1 }, dog => { bone => 1 }, }; say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow'; say Dumper $href __END__ [download] Output: no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'cow' => {}, # uh-oh 'dog' => { 'bone' => 1 } }; [download] So you would have to either use exists on all levels of the structure as haukex suggested : use strict; use warnings; use feature 'say'; use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1; my $href = { cat => { milk => 1 }, dog => { bone => 1 }, }; say exists $href->{'cow'} && exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow'; say Dumper $href __END__ [download] Output: no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'dog' => { 'bone' => 1 } }; [download] ... or use autovivification : use strict; use warnings; use feature 'say'; use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1; my $href = { cat => { milk => 1 }, dog => { bone => 1 }, }; no autovivification; say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow'; say Dumper $href __END__ [download] Output: no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'dog' => { 'bone' => 1 } }; [download] Note that autovivification.pm has effect lexically: use strict; use warnings; use feature 'say'; use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1; my $href = { cat => { milk => 1 }, dog => { bone => 1 }, }; { no autovivification; say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow'; say Dumper $href } say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'still no cow'; say Dumper $href; __END__ [download] no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'dog' => { 'bone' => 1 } }; still no cow $VAR1 = { 'cat' => { 'milk' => 1 }, 'cow' => {}, # uh-oh 'dog' => { 'bone' => 1 } }; [download]
The way forward always starts with a minimal test.nikmit (Sexton) on Nov 17, 2017 at 10:15 UTC
Re^2: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?Thanks - no autovivification will become a permanent presence for me, next to use strict .
by nikmit (Sexton) on Nov 17, 2017 at 10:15 UTChaukex (Monsignor) on Nov 17, 2017 at 09:42 UTC
Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?This is "autovivification" and was just discussed the other day, see the replies in the thread Array dereference in foreach() , including the ones deeper down in the thread.
Use exists to check if a hash key exists. As described in its documentation, if you have multi-level data structures (hashes of hashes), you need to check every level. Update: Discipulus just updated to show an example.
Eily (Parson) on Nov 17, 2017 at 10:01 UTC
Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?FYI, while keys %hash returns the number of keys in scalar context, you can also use the hash itself, the value will be false if the hash is empty and true otherwise (actually 0 when empty, and information on the content otherwise). So if (exists $href->{cow} and %{ $href->{cow} }) . Unlike the keys version, scalar %{ $href->{cow} } will not create a new hash (autovivify) if the cow key doesn't exist, but die instead (at least if you forgot to check if the key exists, you'll get an error in the right place).
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
http://meteoalarm.eu ) is the official website from European national weather services that gives out warnings in extreme weather situations. It has been a while ago that i wrote a perl module for processing this information. ( Weather warnings from www.meteoalarm.eu ). The website is still on but has changed since. That made some changes necessary. I wrote the module only for informational purposes and it is not meant to use it for anything critical. Here is the code: #!/usr/bin/perl # # package Meteoalarm; use strict; use warnings; use Carp; use LWP; use HTML::Entities; use HTML::TreeBuilder; use utf8; binmode STDOUT, ":encoding(UTF-8)"; our $VERSION = "0.06"; sub new { my $class = shift; my $self = {}; my %passed_params = @_; $self->{'user_agent'} = _make_user_agent( $passed_params{'user_agent'} ); bless( $self, $class ); return $self; } sub countries { my $self = shift; my %passed_params = @_; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = _make_country_url( $passed_params{day}, $passed_params{type} ); my $content = _fetch_content( $url, $self->{'user_agent'} ); my $country_warnings = _parse_country_warnings($content); return $country_warnings; } sub regions { my ($self) = shift; my %passed_params = @_; my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } croak "Invalid country_code: $passed_params{country_code}" unless $passed_params{country_code}; my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $country_codes{ $passed_params{country_code} } . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); my $region_warnings = _parse_region_warnings($content); return $region_warnings; } sub details { my $self = shift; my %passed_params = @_; my %country_codes = ( 'AT' => 10, 'BA' => 10, 'BE' => 801, 'BG' => 28, 'CH' => 319, 'CY' => 1, 'CZ' => 14, 'DE' => 808, 'DK' => 8, 'EE' => 805, 'ES' => 831, 'FI' => 813, 'FR' => 94, 'GR' => 16, 'HR' => 806, 'HU' => 7, 'IE' => 804, 'IL' => 803, 'IS' => 11, 'IT' => 20, 'LT' => 801, 'LU' => 2, 'LV' => 804, 'MD' => 37, 'ME' => 3, 'MK' => 6, 'MT' => 1, 'NL' => 807, 'NO' => 814, 'PL' => 802, 'PT' => 26, 'RO' => 42, 'RS' => 11, 'SE' => 813, 'SI' => 801, 'SK' => 16, 'UK' => 16 ); my ( $region, $code ) = $passed_params{region_code} =~ /^([ABCDEFGHILMNPRSU][A-Z])(\d\d\ d)/; $code =~ s /^0//; croak "Invalid region_code: $passed_params{region_code}" unless ( $country_codes{$region} and ( $code <= $country_codes{$region} ) ); my $details; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $passed_params{region_code} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); $details = _parse_details($content); return $details; } sub codes { my $self = shift; my @codes; my @countries_short; if (@_) { @countries_short = @_; } else { @countries_short = qw(AT BA BE BG CH CY CZ DE DK EE ES FI FR GR HR HU IE IL IS IT LT LU LV MD ME MK MT NL NO PL PT RO RS SE SI SK UK); } my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); foreach my $country_short (@countries_short) { my $url = 'http://meteoalarm.eu/en_UK/' . '0' . '/' . '0' . '/' . $country_codes{$country_short} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); push @codes, _parse_codes($content); } return @codes; } sub _make_country_url { my ( $day, $type ) = @_; my $url = 'http://meteoalarm.eu/en_UK/' . $day . '/' . $type . '/EU-Europe .html'; return $url; } sub _fetch_content { my ( $url, $user_agent ) = @_; my $ua = LWP::UserAgent->new; $ua->agent($user_agent); my $res = $ua->request( HTTP::Request->new( GET => $url ) ); croak " Can't fetch http://meteoalarm.eu: $res->status_line \n" unless ( $res->is_success ); return $res->decoded_content; } sub _parse_country_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down( _tag => q{td}, class => qr/^col[12]$/ ) ; for my $cell (@cells) { my @src; my $div = $cell->look_down( _tag => q{div} ); my $id = $div->id; my $alt = $div->attr(q{alt}); $data{$id}{fullname} = $alt; my @weather_events = $div->look_down( _tag => 'span', class => qr{warn awt} ); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'div', class => qr{tendenz awt nt l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt nt l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_region_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down(_tag=>qr{div}, id=>qr{area}); for my $cell (@cells) { $cell->id =~ /area_([A-Z][A-Z]\d+)/; my $id = $1; my $fullname = $cell->look_down(_tag=>'span',id=>'cname')->as_text ; my $div = $cell->look_down( _tag => q{div} ); $data{$id}{fullname} = $fullname; my @weather_events = $div->look_down(_tag=> 'span', class=>qr{warnflag warn2}); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'span', class => qr{tendenz awt\d l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt\d l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_weather_events { my $events = shift; my %weather_to_text = ( # lower case for consistency 1 => 'wind', 2 => 'snow/ice', 3 => 'thunderstorm', 4 => 'fog', 5 => 'extreme high temperature', 6 => 'extreme low temperature', 7 => 'coastal event', 8 => 'forestfire', 9 => 'avalanches', 10 => 'rain', 11 => 'unnamed', 12 => 'flood', 13 => 'rainflood' ); my %literal_warnings; for my $event (@$events) { #print $event->{class}, "\n"; $event->{class} =~ /warn\d* awt l(\d+) t(\d+)/; my $warn_level = $1; my $weather = $2; $literal_warnings{ $weather_to_text{$weather} } = $warn_level; } return \%literal_warnings; } sub _parse_details { my $content = shift; my (%data); my $p = HTML::TreeBuilder->new_from_content( decode_entities $cont ent); $data{fullname} = $p->look_down( _tag => q{h1} )->as_text; if ( $p->look_down( _tag => q{div}, class => q{warnbox awt nt l l1} ) ) { $data{warnings} = 'no warnings'; } else { my @warnboxes = $p->look_down( _tag => q{div}, class => qr/warnbox awt/ ); for my $warnbox (@warnboxes) { my ($as_txt); my @info_divs = $warnbox->look_down( _tag => q{div}, class => q{info} ); $as_txt = $info_divs[0]->as_text; my ( $from, $until ) = $as_txt =~ /valid from (.*) Until ( .*)$/; $as_txt = $info_divs[1]->as_text; my ( $warning, $level ) = $as_txt =~ /(.+?)\s+Awareness Level:\s+(.*)/; $warning =~ s/s$//; my $text = $warnbox->look_down( _tag => q{div}, class => q{text} )->as_text; $data{warnings}{ lc $warning } = { #lower case for constistency level => $level, from => $from, until => $until, text => $text, }; } } return \%data; } sub _parse_codes { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); #my @cells = $p->look_down( _tag => 'div', class => 'flags' ); my @cells = $p->look_down( _tag => qr{a} ); for my $cell (@cells) { if ( $cell->attr('xlink:href') ) { if ( $cell->attr('xlink:href') =~ /\/([A-Z][A-Z]\d+)-(.+?) .html/ ) { my $code = $1; my $fullname = $2; $data{$fullname} = $code; } } } return \%data; } sub _make_user_agent { my $ua = shift; $ua = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Fire fox/55.0' unless ($ua); return $ua; } sub _extract_details_fullname { my $content = shift; my $region; if ( $content =~ /<h1>Weather warnings: (.+?)<\/h1>/ ) { $region = $1; decode_entities($region); if ( $region =~ /.??<.*<\/a>/ ) { $region =~ s/.??<.*<\/a>//; } } else { carp "Can't get region name\n"; } return $region; } 1; __END__ =head1 NAME B<Meteoalarm> - OO Interface for meteoalarm.eu =head1 SYNOPSIS This Module gets weather warnings from meteoalarm.eu. For further reading of terms and conditions see http://meteoalarm.eu/t erms.php?lang=en_UK use Meteoalarm; my $meteo = Meteoalarm->new( 'user_agent' => 'Meteobot 0.001' ); my $countries = $meteo -> countries ('type' => 'all', 'day' => 'today' ); foreach my $country_code (sort keys %{$countries}){ print "Country: $countries->{$country_code}->{'fullname'}\n"; print "Tendency = $countries->{$country_code}->{tendency}\n" if ( $countries->{$country_code}->{'tendency'}); if (keys %{$countries->{$country_code}->{'warnings'}}){ foreach my $warning (keys %{$countries->{$country_code}->{'warning s'}}){ print "Event: $warning, severity: $countries->{$country_co de}->{'warnings'}->{$warning}\n"; } } else {print "No Warnings\n";} } my $regions = $meteo->regions( 'country_code' => 'PT', 'day' => 'today ', 'type' => 'all' ); foreach my $code ( sort keys %{$regions} ) { print "Region : $regions->{$code}->{'fullname'}: region_code = $co de\n" if ( keys %{ $regions->{$code}->{'warnings'} } ); print "Tendency = $regions->{$code}->{tendency}\n" if ( $regions-> {$code}->{'tendency'}); foreach my $type ( keys %{ $regions->{$code}->{'warnings'} } ) { print "$type Severity: $regions->{$code}->{'warnings'}->{$type}\n"; } } my $details = $meteo->details( 'region_code' => 'UK010', 'day' => 'tod ay'); my $name = $details->{'fullname'}; print "$name\n"; if ( $details->{warnings} eq 'no warnings' ) { print $details->{warnings}, "\n"; } else { foreach my $warning ( keys %{ $details->{'warnings'} } ) { print "$warning\n"; foreach my $detail ( keys %{ $details->{'warnings'}->{$warning } } ) { print "$detail: $details->{'warnings'}->{$warning}->{$deta il}\n"; } } } my $codes = $meteo->codes('FR'); my @codes = $meteo->codes(); foreach my $code (@codes) { foreach my $region ( sort keys %{$code} ) { print "Region name: $region, region code: $code->{$region}\n"; } } =head1 DESCRIPTION $meteo -> countries returns hashref of warnings for all countries. $meteo -> regions returns hashref of warnings for all regions in a spe cified country $meteo -> details returns hashref of detailled warnings for a specifie d region $meteo -> codes returns arrayref of hash of name and region code of a country =head1 METHODS =head1 new( ) creates a new meteoalarm object =head2 Optional Arguments: new( 'user_agent' => 'Meteobot 0.001'); changes the user agent string =head1 my $country = $meteo -> countries (); =head2 Optional Arguments: 'day' => 'today' || 'tomorrow' if day is not defined, default value is today 'type' => 'all' || 'wind' || 'snow' || 'ice' || 'snow/ice' || 'snow' || 'ice' || 'thunderstorm' || 'fog' || 'extreme high temperature' || 'extreme low temperature' || 'coastal event' || 'fores tfire' || 'avalanches' || 'rain' if type is not defined, default type is all =head1 $regions = $meteo -> regions ('country_code' => 'DE'); country_code is a 2 letter abbreviation =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $details = $meteo->details ('region_code' => 'ES005'); region_code consits of 2 letters for the country and 3 digits =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $code = $meteo -> codes (); Returns arrayref of hash for region names and codes for all countries =head2 Optional Arguments $code = $meteo -> codes ('PL'); Countrycode for a specific country =cut
1 direct reply -- Read more / Contribute by walto
on Sep 23, 2017 at 00:50
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
# This string has a mixture of ASCII, UTF-8, 2 byte wide, and 4 byte # wide characters my $crazy = "Hello\x{26c4}".encode("utf-8","\x{26f0}"). "\x{10102}\x{2fa1b}"; # Now the string only has ASCII and UTF-8 characters my $sane = safeString($crazy); # testString($crazy) returns 7 # testString($sane) returns 3 # length($sane) returns 19 # trueLength($sane) returns 9 my $snowman = safeSubstr($crazy,5,1); ######################################## # safeString($string) # return a safe version of the string sub safeString { my ($string) = @_; return "" unless defined($string); my $t = testString($string); return $string if $t <= 3; return encode("utf-8",$string) if $t <= 5; # The string has both UTF-8 and wide characters so it needs # tender-loving care my @s = unpack('C*',$string); my @r; for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { push @r, $s[$i]; $i++; } elsif ($s[$i] > 255) { # encode a wide character push @r,unpack("C*",encode("utf-8",chr($s[$i]))); $i++; } else { # copy all the utf-8 bytes $n = _charBytes($i,@s) - 1; map { push @r, $s[$i+$_] } 0..$n; $i += $n + 1; } } return pack("C*",@r); } ######################################## # safeSubstr($string,$pos,$n) # return a safe substring (treats utf-8 sequences as a single # character) sub safeSubstr { my ($string,$pos,$n) = @_; $s = safeString($string); my $p = 0; my $rPos = 0; my $rEnd = -1; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $i++; } elsif ($s[$i] > 255) { $i++; } else { $i += _charBytes($i,@s); } $p++; $rPos = $i if $p == $pos; $rEnd = $i-1 if $p == $pos + $n; } $rEnd = scalar(@s) - 1 if $rEnd < 0; return "" if $rPos > $rEnd; my @r; map { push @r, $s[$_] } $rPos..$rEnd; return pack("C*",@r); } ######################################## # testString($string) # returns information about the characters in the string # # The 1, 2, and 4 bits of the result are for ASCII, UTF-8, and # wide characters respectively. If multiple bits are set, # characters of each type appear in the string. If the result is: # <= 1 simple ASCII string # <= 3 simple UTF-8 string # >3 && <= 5 mixed ASCII & wide characters # >= 6 mixed UTF-8 & wide characters sub testString { my ($s) = @_; return undef unless defined($s); my $r = 0; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $r |= 1; $i++; } elsif ($s[$i] > 255) { $r |= 4; $i++; } else { $r |= 2; $i += _charBytes($i,@s); } } return $r; } ######################################## # trueLength($string) # returns the number of UTF-8 characters in a string sub trueLength { my ($s) = @_; return unless defined($s); my $len = 0; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $i++; } elsif ($s[$i] > 255) { $i++; } else { $i += _charBytes($i,@s); } $len++; } return $len; } ######################################## # String support routines sub _charBytes { my $n = shift(@_); my $len = scalar(@_); if ($_[$n] < 128) { return 1; } elsif ($_[$n] > 65535) { return 4; } elsif ($_[$n] > 255) { return 2; } elsif (($_[$n] & 0xFC) == 0xFC) { return min(6,$len); } elsif (($_[$n] & 0xF8) == 0xF8) { return min(5,$len); } elsif (($_[$n] & 0xF0) == 0xF0) { return min(4,$len); } elsif (($_[$n] & 0xE0) == 0xE0) { return min(3,$len); } elsif (($_[$n] & 0xC0) == 0xC0) { return min(2,$len); } else { return 1; } }
2 direct replies -- Read more / Contribute by tdlewis77
on Aug 25, 2017 at 13:07
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
1 direct reply -- Read more / Contribute by erichansen1836
on Oct 08, 2017 at 11:13TOPIC: FAST!! Random Access Indexed, Relational Flat File Databases, Indexed by external Perl SDBM databases of key/value pairs tied to program "in memory" hash tables, where the Key in the Key/Value Pair is one or more fields and/or partial fields concatenated together (separated by a delimiter such as a pipe "|") and contained within the Flat File records for you to arbitrarily seek to a single record or a sorted/related group of records within your database.
Since it has been over 2 years ago since I first posted about this TOPIC I discovered, I wanted to alert the Perl community to the original thread where you can find Perl source code now for examples of how to implement Joint Database Technology/Methodology. Inparticular the King James Bible Navigator software DEMO I posted which used FlatFile/SDBM for its database. I have made this a native Windows GUI application (TreeView/RichEdit COMBO interface) to demonstrate how to show your end-users a summary of the information of the data contained within a database, and allow them to drill down to a small amount of specific information (e.g. verses within a single book/chapter) for actual viewing (and retrieving from the database). The TreeView Double Click Event was originally written to random access the first verse within a chapter, then sequentially access the remaining verses within a chapter - performing a READ for each verse. I posted a separate modified TreeView Double Click Event for you to insert into the Application which reads an entire chapter in one (1) giant READ, breaking out the individual verses (into an array) using the UNPACK statement. -- Eric
Joint Database Technology: http://www.perlmonks.org/?node_id=1121222
![]() |
![]() |
![]() |
Nov 17, 2017 | stackoverflow.com
Ask Question up vote down vote favorite 1
brian d foy ,Nov 26, 2016 at 4:32
Running under moar (2016.10)Consider this code that constructs a set and tests for membership:
my $num_set = set( < 1 2 3 4 > ); say "set: ", $num_set.perl; say "4 is in set: ", 4 ∈ $num_set; say "IntStr 4 is in set: ", IntStr.new(4, "Four") ∈ $num_set; say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4; say "5 is in set: ", 5 ∈ $num_set;A straight
4
is not in the set, but the IntStr version is:set: set(IntStr.new(4, "4"),IntStr.new(1, "1"),IntStr.new(2, "2"),IntStr.new(3, "3")) 4 is in set: False IntStr 4 is in set: True IntStr(4,...) is 4: True 5 is in set: FalseI think most people aren't going to expect this, but the
∈
docs doesn't say anything about how this might work. I don't have this problem if I don't use the quote words (i.e.set( 1, 2, 3, 4)
).timotimo ,Nov 26, 2016 at 5:47
You took a wrong turn in the middle. The important part is whatnqp::existskey
is called with: thek.WHICH
. This method is there for value types, i.e. immutable types where the value - rather than identity - defines if two things are supposed to be the same thing (even if created twice). It returns a string representation of an object's value that is equal for two things that are supposed to be equal. For<1>.WHICH
you getIntStr|1
and for1.WHICH
you get justInt|1
.brian d foy ,Nov 26, 2016 at 6:18
Ah, okay. I can see a lot of pain for regular people trying to debug these things. – brian d foy Nov 26 '16 at 6:18smls ,Nov 26, 2016 at 14:46
As explained in the Set documentation, sets compare object identity, same as the===
operator:Within a Set, every element is guaranteed to be unique (in the sense that no two elements would compare positively with the === operator)
The identity of an object is defined by the .WHICH method, as timotimo elaborates in his answer.
brian d foy ,Nov 26, 2016 at 21:28
That's not really clear from that statement. That's talking about which elements are in the set. Beyond that, even if you choose to compare with ===, you have to know how other things are stored. This is the sort of info that should show up next to the Set operators. – brian d foy Nov 26 '16 at 21:28brian d foy ,Nov 26, 2016 at 23:09
Indeed, I think I've found a bug. The qw docs says this should be true:< a b 137 > eqv ( 'a', 'b', '137' )
, but in the same version of Rakudo Star I get false. It's different object types on each side. – brian d foy Nov 26 '16 at 23:09brian d foy ,Nov 26, 2016 at 23:16
Despite all this, your answer was the A-ha! moment that led me to look at the right thing. Thanks for all of your help. – brian d foy Nov 26 '16 at 23:16raiph ,Nov 27, 2016 at 4:50
Write your list of numbers using commasAs you mention in your answer, your code works if you write your numbers as a simple comma separated list rather than using the
<...>
construct.Here's why:
4 ∈ set 1, 2, 3, 4 # TrueA bare numeric literal in code like the
4
to the left of∈
constructs a single value with a numeric type. (In this case the type is Int, an integer.) If aset
constructor receives a list of similar literals on the right then everything works out fine.<1 2 3 4>
produces a list of "dual values"The various
<...>
"quote words" constructs turn the list of whitespace separated literal elements within the angle brackets into an output list of values.The foundational variant (
qw<...>
) outputs nothing but strings. Using it for your use case doesn't work:4 ∈ set qw<1 2 3 4> # FalseThe
4
on the left constructs a single numeric value, typeInt
. In the meantime theset
constructor receives a list of strings, typeStr
:('1','2','3','4')
. The∈
operator doesn't find anInt
in the set because all the values areStr
s so returnsFalse
.Moving along, the huffmanized
<...>
variant outputsStr
s unless an element is recognized as a number. If an element is recognized as a number then the output value is a "dual value". For example a1
becomes an IntStr .According to the doc "an IntStr can be used interchangeably where one might use a Str or an Int". But can it?
Your scenario is a case in point. While
1 ∈ set 1,2,3
and<1> ∈ set <1 2 3>
both work,1 ∈ set <1 2 3>
and<1> ∈ set 1, 2, 3
both returnFalse
.So it seems the
∈
operator isn't living up to the quoted doc's claim of dual value interchangeabilityThis may already be recognized as a bug in the
∈
set operation and/or other operations. Even if not, this sharp "dual value" edge of the<...>
list constructor may eventually be viewed as sufficiently painful that Perl 6 needs to change.brian d foy ,Nov 26, 2016 at 23:29
I think this is a bug, but not in the set stuff. The other answers were very helpful in sorting out what was important and what wasn't.I used the angle-brackets form of the quote words . The quote words form is supposed to be equivalent to the quoting version (that is, True under
eqv
). Here's the doc example:<a b c> eqv ('a', 'b', 'c')But, when I try this with a word that is all digits, this is broken:
$ perl6 > < a b 137 > eqv ( 'a', 'b', '137' ) FalseBut, the other forms work:
> qw/ a b 137 / eqv ( 'a', 'b', '137' ) True > Q:w/ a b 137 / eqv ( 'a', 'b', '137' ) TrueThe angle-bracket word quoting uses IntStr :
> my @n = < a b 137 > [a b 137] > @n.perl ["a", "b", IntStr.new(137, "137")]Without the word quoting, the digits word comes out as [Str]:
> ( 'a', 'b', '137' ).perl ("a", "b", "137") > ( 'a', 'b', '137' )[*-1].perl "137" > ( 'a', 'b', '137' )[*-1].WHAT (Str) > my @n = ( 'a', 'b', '137' ); [a b 137] > @n[*-1].WHAT (Str)You typically see these sorts of errors when there are two code paths to get to a final result instead of shared code that converges to one path very early. That's what I would look for if I wanted to track this down (but, I need to work on the book!)
This does highlight, though, that you have to be very careful about sets. Even if this bug was fixed, there are other, non-buggy ways that
eqv
can fail. I would have still failed because 4 as Int is not "4" as Str . I think this level of attention to data types in unperly in it's DWIMery. It's certainly something I'd have to explain very carefully in a classroom and still watch everyone mess up on it.For what it's worth, I think the results of
gist
tend to be misleading in their oversimplification, and sometimes the results ofperl
aren't rich enough (e.g. hidingStr
which forces me to.WHAT
). The more I use those, the less useful I find them.But, knowing that I messed up before I even started would have saved me from that code spelunking that ended up meaning nothing!
Christoph ,Nov 26, 2016 at 23:55
Could you clarify what you consider the bug to be? As far as I can tell, this is all by design: (a)<...>
goes through&val
, which returns allomorphs if possible (b) set membership is defined in terms of identity, which distinguishes between allomorphs and their corresponding value types; so I would not classify it as a bug, but 'broken' by design; or phrased another way, it's just the WAT that comes with this particular DWIM – Christoph Nov 26 '16 at 23:55Brad Gilbert ,Nov 26, 2016 at 23:59
This was intentionally added, and is part of the testsuite . ( I can't seem to find anywhere that tests for< >
being equivalent toq:w:v< >
and<< >>
/" "
being equivalent toqq:ww:v<< >>
) – Brad Gilbert Nov 26 '16 at 23:59brian d foy ,Nov 27, 2016 at 0:02
The docs say the two lists should be eqv, and they are not. If they are not meant to be equivalent, the docs need to change. Nothing in docs.perl6.org/language/quoting#Word_quoting:_qw mentions any of this stuff. – brian d foy Nov 27 '16 at 0:02Christoph ,Nov 27, 2016 at 0:17
The documentation seems to be just wrong here,<...>
does not correspond toqw(...)
, butqw:v(...)
. Cf S02 for the description of the adverb and this test that Brad was <del>looking for</del> already linked to – Christoph Nov 27 '16 at 0:17Christoph ,Nov 27, 2016 at 0:45
or perhaps not outright wrong, but rather 'just' misleading:<...>
is indeed a:w
form, and the given example code does compare equal according toeqv
– Christoph Nov 27 '16 at 0:45dwarring ,Nov 27, 2016 at 18:33
Just to add to the other answers and point out a consistancy here between sets and object hashes .An object hash is declared as
my %object-hash{Any}
. This effectively hashes on objects.WHICH
method, which is similar to how sets distinguish individual members.Substituting the set with an object hash:
my %obj-hash{Any}; %obj-hash< 1 2 3 4 > = Any; say "hash: ", %obj-hash.keys.perl; say "4 is in hash: ", %obj-hash{4}:exists; say "IntStr 4 is in hash: ", %obj-hash{ IntStr.new(4, "Four") }:exists; say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4; say "5 is in hash: ", %obj-hash{5}:exists;gives similar results to your original example:
hash: (IntStr.new(4, "4"), IntStr.new(1, "1"), IntStr.new(2, "2"), IntStr.new(3, "3")).Seq 4 is in hash: False IntStr 4 is in hash: True IntStr(4,...) is 4: True 5 is in hash: Falsebrian d foy ,Nov 29, 2016 at 21:52
Oh, this is going to suck hard while teaching a class. – brian d foy Nov 29 '16 at 21:52dwarring ,Nov 30, 2016 at 4:21
I agree its not great, as it is. – dwarring Nov 30 '16 at 4:21dwarring ,Nov 30, 2016 at 18:26
Have raised an RT rt.perl.org/Ticket/Display.html?id=130222 – dwarring Nov 30 '16 at 18:26
![]() |
![]() |
![]() |
Nov 17, 2017 | www.youtube.com
Published on Oct 17, 2015
For details visit: http://perlmaven.com/introducing-test... Category Science & Technology License Standard YouTube License
![]() |
![]() |
![]() |
Nov 17, 2017 | www.youtube.com
Published on Jun 21, 2017
In which I detail the Perl 6 elements that have most changed my Perl 5 coding, and share the Perl 5 techniques I have adopted.
I eat, sleep, live, and breathe Perl!
Consultant and Contract Programmer Frequent PerlMongers speaker Dedicated Shakespeare theater-goer Armchair Mathematician Author of Blue_Tiger, a tool for modernizing Perl.
36 years coding 22 years Perl 16 years Married 15 YAPCs 7 Hackathons 3 PerlWhirls Perl interests: Refactoring, Perl Idioms / Micropatterns, RosettaCode, and Perl 6.
![]() |
![]() |
![]() |
Nov 17, 2017 | stackoverflow.com
Ask Question up vote down vote favorite
Toren ,Jan 12, 2011 at 14:50
I have an attribute (32 bits-long), that each bit responsible to specific functionality. Perl script I'm writing should turn on 4th bit, but save previous definitions of other bits.I use in my program:
Sub BitOperationOnAttr { my $a=""; MyGetFunc( $a); $a |= 0x00000008; MySetFunc( $a); }** MyGetFunc/ MySetFunc my own functions that know read/fix value.
Questions:
- if usage of
$a |= 0x00000008;
is right ?- how extract hex value by Regular Expression from string I have : For example:
"Attribute: Somestring: value (8 long (0x8))"
Michael Carman ,Jan 12, 2011 at 16:13
Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13toolic ,Jan 12, 2011 at 16:47
Same question asked on PerlMonks: perlmonks.org/?node_id=881892 – toolic Jan 12 '11 at 16:47psmears ,Jan 12, 2011 at 15:00
- if usage of $a |= 0x00000008; is right ?
Yes, this is fine.
- how extract hex value by Regular Expression from string I have : For example:
"Attribute: Somestring: value (8 long (0x8))"
I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:
if ($string =~ m/0x([0-9a-fA-F]+)/) { $value = hex($1); } else { # string didn't match }should work.
Michael Carman ,Jan 12, 2011 at 16:32
Perl provides several ways for dealing with binary data:
- The bitwise operators
&
,|
, and~
.- The
pack
andunpack
functions.- The
vec
function.Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:
my $mask = 1 << 3; # 0x0008 $value |= $mask; # set bit $value &= ~$mask; # clear bit if ($value & $mask) # check bit
vec
is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:vec($value, 3, 1) = 1; # set bit vec($value, 3, 1) = 0; # clear bit if (vec($value, 3, 1)) # check bit
pack
andunpack
are better suited for working with things like C structs or endianness.sdaau ,Jul 15, 2014 at 5:01
I upvoted, but there is something very important missing:vec
operates on a string!If we use a number; say:
$val=5;
printf("b%08b",$val);
(this gives
b00000101
) -- then one can see that the "check bit" syntax, say:
for($ix=7;$ix>=0;$ix--) {
print vec($val, $ix, 1);
};
print "\n";
will not work (it gives
00110101
, which is not the same number). The correct is to convert the number to ASCII char, i.e.
print vec(sprintf("%c", $val), $ix, 1);
.
![]() |
![]() |
![]() |
Nov 17, 2017 | stackoverflow.com
confused ,2 days ago
I have a string of epoch seconds "1510652305" which when i convert to normal time on unix command line using`date -d @1510652305`i get Tue Nov 14 15:08:25 IST 2017
But when i tried it in perl using something like this
use POSIX qw(strftime); use Time::Local; use Time::localtime; $kickoff_time=1510652305; $kickoff_time=ctime($kickoff_time);i get
Thu Jan 1 05:30:00 1970
How can i achieve the result i am getting in linux in perl?
Thanks!!
mwp ,2 days ago
Don't overthink it!my $kickoff_time = localtime 1510652305; say $kickoff_time; # Tue Nov 14 15:08:25 2017If you absolutely, positively need the timezone in there:
use POSIX qw{strftime}; my $kickoff_time = strftime '%a %b %e %H:%M:%S %Z %Y', localtime 1510652305; say $kickoff_time; # Tue Nov 14 15:08:25 IST 2017Note that this is locale-dependent.
confused ,yesterday
We have to use localtime to convert in time from epoch seconds and gmtime to convert in time from normal seconds i got it now....Thanks!! – confused yesterdaymob ,yesterday
Still confused. Bothlocaltime
andgmtime
expect the input to be epoch seconds. – mob yesterdaymwp ,4 hours ago
Exactly. To expound,localtime()
takes the epoch and returns a string (or date parts array) representing the time in your local timezone;gmtime()
takes the epoch and returns a string (or date parts array) representing the time in UTC. – mwp 4 hours ago,
I would recommend usingTime::Piece
for this job - it's core in perl.#!/usr/bin/env perl use strict; use warnings; use Time::Piece; my $t = localtime ( 1510652305 ); print $t;It'll print default format, or you can use formatted using
strftime
.
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
Meteoalarm.eu ( http://meteoalarm.eu ) is the official website from European national weather services that gives out warnings in extreme weather situations. It has been a while ago that i wrote a perl module for processing this information. ( Weather warnings from www.meteoalarm.eu ). The website is still on but has changed since. That made some changes necessary. I wrote the module only for informational purposes and it is not meant to use it for anything critical. Here is the code: #!/usr/bin/perl # # package Meteoalarm; use strict; use warnings; use Carp; use LWP; use HTML::Entities; use HTML::TreeBuilder; use utf8; binmode STDOUT, ":encoding(UTF-8)"; our $VERSION = "0.06"; sub new { my $class = shift; my $self = {}; my %passed_params = @_; $self->{'user_agent'} = _make_user_agent( $passed_params{'user_agent'} ); bless( $self, $class ); return $self; } sub countries { my $self = shift; my %passed_params = @_; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = _make_country_url( $passed_params{day}, $passed_params{type} ); my $content = _fetch_content( $url, $self->{'user_agent'} ); my $country_warnings = _parse_country_warnings($content); return $country_warnings; } sub regions { my ($self) = shift; my %passed_params = @_; my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } croak "Invalid country_code: $passed_params{country_code}" unless $passed_params{country_code}; my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $country_codes{ $passed_params{country_code} } . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); my $region_warnings = _parse_region_warnings($content); return $region_warnings; } sub details { my $self = shift; my %passed_params = @_; my %country_codes = ( 'AT' => 10, 'BA' => 10, 'BE' => 801, 'BG' => 28, 'CH' => 319, 'CY' => 1, 'CZ' => 14, 'DE' => 808, 'DK' => 8, 'EE' => 805, 'ES' => 831, 'FI' => 813, 'FR' => 94, 'GR' => 16, 'HR' => 806, 'HU' => 7, 'IE' => 804, 'IL' => 803, 'IS' => 11, 'IT' => 20, 'LT' => 801, 'LU' => 2, 'LV' => 804, 'MD' => 37, 'ME' => 3, 'MK' => 6, 'MT' => 1, 'NL' => 807, 'NO' => 814, 'PL' => 802, 'PT' => 26, 'RO' => 42, 'RS' => 11, 'SE' => 813, 'SI' => 801, 'SK' => 16, 'UK' => 16 ); my ( $region, $code ) = $passed_params{region_code} =~ /^([ABCDEFGHILMNPRSU][A-Z])(\d\d\ d)/; $code =~ s /^0//; croak "Invalid region_code: $passed_params{region_code}" unless ( $country_codes{$region} and ( $code <= $country_codes{$region} ) ); my $details; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $passed_params{region_code} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); $details = _parse_details($content); return $details; } sub codes { my $self = shift; my @codes; my @countries_short; if (@_) { @countries_short = @_; } else { @countries_short = qw(AT BA BE BG CH CY CZ DE DK EE ES FI FR GR HR HU IE IL IS IT LT LU LV MD ME MK MT NL NO PL PT RO RS SE SI SK UK); } my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); foreach my $country_short (@countries_short) { my $url = 'http://meteoalarm.eu/en_UK/' . '0' . '/' . '0' . '/' . $country_codes{$country_short} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); push @codes, _parse_codes($content); } return @codes; } sub _make_country_url { my ( $day, $type ) = @_; my $url = 'http://meteoalarm.eu/en_UK/' . $day . '/' . $type . '/EU-Europe .html'; return $url; } sub _fetch_content { my ( $url, $user_agent ) = @_; my $ua = LWP::UserAgent->new; $ua->agent($user_agent); my $res = $ua->request( HTTP::Request->new( GET => $url ) ); croak " Can't fetch http://meteoalarm.eu: $res->status_line \n" unless ( $res->is_success ); return $res->decoded_content; } sub _parse_country_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down( _tag => q{td}, class => qr/^col[12]$/ ) ; for my $cell (@cells) { my @src; my $div = $cell->look_down( _tag => q{div} ); my $id = $div->id; my $alt = $div->attr(q{alt}); $data{$id}{fullname} = $alt; my @weather_events = $div->look_down( _tag => 'span', class => qr{warn awt} ); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'div', class => qr{tendenz awt nt l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt nt l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_region_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down(_tag=>qr{div}, id=>qr{area}); for my $cell (@cells) { $cell->id =~ /area_([A-Z][A-Z]\d+)/; my $id = $1; my $fullname = $cell->look_down(_tag=>'span',id=>'cname')->as_text ; my $div = $cell->look_down( _tag => q{div} ); $data{$id}{fullname} = $fullname; my @weather_events = $div->look_down(_tag=> 'span', class=>qr{warnflag warn2}); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'span', class => qr{tendenz awt\d l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt\d l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_weather_events { my $events = shift; my %weather_to_text = ( # lower case for consistency 1 => 'wind', 2 => 'snow/ice', 3 => 'thunderstorm', 4 => 'fog', 5 => 'extreme high temperature', 6 => 'extreme low temperature', 7 => 'coastal event', 8 => 'forestfire', 9 => 'avalanches', 10 => 'rain', 11 => 'unnamed', 12 => 'flood', 13 => 'rainflood' ); my %literal_warnings; for my $event (@$events) { #print $event->{class}, "\n"; $event->{class} =~ /warn\d* awt l(\d+) t(\d+)/; my $warn_level = $1; my $weather = $2; $literal_warnings{ $weather_to_text{$weather} } = $warn_level; } return \%literal_warnings; } sub _parse_details { my $content = shift; my (%data); my $p = HTML::TreeBuilder->new_from_content( decode_entities $cont ent); $data{fullname} = $p->look_down( _tag => q{h1} )->as_text; if ( $p->look_down( _tag => q{div}, class => q{warnbox awt nt l l1} ) ) { $data{warnings} = 'no warnings'; } else { my @warnboxes = $p->look_down( _tag => q{div}, class => qr/warnbox awt/ ); for my $warnbox (@warnboxes) { my ($as_txt); my @info_divs = $warnbox->look_down( _tag => q{div}, class => q{info} ); $as_txt = $info_divs[0]->as_text; my ( $from, $until ) = $as_txt =~ /valid from (.*) Until ( .*)$/; $as_txt = $info_divs[1]->as_text; my ( $warning, $level ) = $as_txt =~ /(.+?)\s+Awareness Level:\s+(.*)/; $warning =~ s/s$//; my $text = $warnbox->look_down( _tag => q{div}, class => q{text} )->as_text; $data{warnings}{ lc $warning } = { #lower case for constistency level => $level, from => $from, until => $until, text => $text, }; } } return \%data; } sub _parse_codes { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); #my @cells = $p->look_down( _tag => 'div', class => 'flags' ); my @cells = $p->look_down( _tag => qr{a} ); for my $cell (@cells) { if ( $cell->attr('xlink:href') ) { if ( $cell->attr('xlink:href') =~ /\/([A-Z][A-Z]\d+)-(.+?) .html/ ) { my $code = $1; my $fullname = $2; $data{$fullname} = $code; } } } return \%data; } sub _make_user_agent { my $ua = shift; $ua = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Fire fox/55.0' unless ($ua); return $ua; } sub _extract_details_fullname { my $content = shift; my $region; if ( $content =~ /<h1>Weather warnings: (.+?)<\/h1>/ ) { $region = $1; decode_entities($region); if ( $region =~ /.??<.*<\/a>/ ) { $region =~ s/.??<.*<\/a>//; } } else { carp "Can't get region name\n"; } return $region; } 1; __END__ =head1 NAME B<Meteoalarm> - OO Interface for meteoalarm.eu =head1 SYNOPSIS This Module gets weather warnings from meteoalarm.eu. For further reading of terms and conditions see http://meteoalarm.eu/t erms.php?lang=en_UK use Meteoalarm; my $meteo = Meteoalarm->new( 'user_agent' => 'Meteobot 0.001' ); my $countries = $meteo -> countries ('type' => 'all', 'day' => 'today' ); foreach my $country_code (sort keys %{$countries}){ print "Country: $countries->{$country_code}->{'fullname'}\n"; print "Tendency = $countries->{$country_code}->{tendency}\n" if ( $countries->{$country_code}->{'tendency'}); if (keys %{$countries->{$country_code}->{'warnings'}}){ foreach my $warning (keys %{$countries->{$country_code}->{'warning s'}}){ print "Event: $warning, severity: $countries->{$country_co de}->{'warnings'}->{$warning}\n"; } } else {print "No Warnings\n";} } my $regions = $meteo->regions( 'country_code' => 'PT', 'day' => 'today ', 'type' => 'all' ); foreach my $code ( sort keys %{$regions} ) { print "Region : $regions->{$code}->{'fullname'}: region_code = $co de\n" if ( keys %{ $regions->{$code}->{'warnings'} } ); print "Tendency = $regions->{$code}->{tendency}\n" if ( $regions-> {$code}->{'tendency'}); foreach my $type ( keys %{ $regions->{$code}->{'warnings'} } ) { print "$type Severity: $regions->{$code}->{'warnings'}->{$type}\n"; } } my $details = $meteo->details( 'region_code' => 'UK010', 'day' => 'tod ay'); my $name = $details->{'fullname'}; print "$name\n"; if ( $details->{warnings} eq 'no warnings' ) { print $details->{warnings}, "\n"; } else { foreach my $warning ( keys %{ $details->{'warnings'} } ) { print "$warning\n"; foreach my $detail ( keys %{ $details->{'warnings'}->{$warning } } ) { print "$detail: $details->{'warnings'}->{$warning}->{$deta il}\n"; } } } my $codes = $meteo->codes('FR'); my @codes = $meteo->codes(); foreach my $code (@codes) { foreach my $region ( sort keys %{$code} ) { print "Region name: $region, region code: $code->{$region}\n"; } } =head1 DESCRIPTION $meteo -> countries returns hashref of warnings for all countries. $meteo -> regions returns hashref of warnings for all regions in a spe cified country $meteo -> details returns hashref of detailled warnings for a specifie d region $meteo -> codes returns arrayref of hash of name and region code of a country =head1 METHODS =head1 new( ) creates a new meteoalarm object =head2 Optional Arguments: new( 'user_agent' => 'Meteobot 0.001'); changes the user agent string =head1 my $country = $meteo -> countries (); =head2 Optional Arguments: 'day' => 'today' || 'tomorrow' if day is not defined, default value is today 'type' => 'all' || 'wind' || 'snow' || 'ice' || 'snow/ice' || 'snow' || 'ice' || 'thunderstorm' || 'fog' || 'extreme high temperature' || 'extreme low temperature' || 'coastal event' || 'fores tfire' || 'avalanches' || 'rain' if type is not defined, default type is all =head1 $regions = $meteo -> regions ('country_code' => 'DE'); country_code is a 2 letter abbreviation =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $details = $meteo->details ('region_code' => 'ES005'); region_code consits of 2 letters for the country and 3 digits =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $code = $meteo -> codes (); Returns arrayref of hash for region names and codes for all countries =head2 Optional Arguments $code = $meteo -> codes ('PL'); Countrycode for a specific country =cut
1 direct reply -- Read more / Contribute by walto
on Sep 23, 2017 at 00:50
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
Dealing with data that comes from webpages can be really complicated. There is likely to be a combination of ASCII, UTF-8, and wide characters in the data returned and you cannot depend on the website to tell you what type of content is being returned. The routines safeString, safeSubstr, testString, and trueLength can be used to easily manipulate these strings. Pass any string to safeString and you will never get a wide character warning from print. Use safeSubstr to extract complete UTF-8 characters sequences from a string. Use testString to tell you what's really in the string. Use trueLength to find out how many characters wide the output will be. # This string has a mixture of ASCII, UTF-8, 2 byte wide, and 4 byte # wide characters my $crazy = "Hello\x{26c4}".encode("utf-8","\x{26f0}"). "\x{10102}\x{2fa1b}"; # Now the string only has ASCII and UTF-8 characters my $sane = safeString($crazy); # testString($crazy) returns 7 # testString($sane) returns 3 # length($sane) returns 19 # trueLength($sane) returns 9 my $snowman = safeSubstr($crazy,5,1); ######################################## # safeString($string) # return a safe version of the string sub safeString { my ($string) = @_; return "" unless defined($string); my $t = testString($string); return $string if $t <= 3; return encode("utf-8",$string) if $t <= 5; # The string has both UTF-8 and wide characters so it needs # tender-loving care my @s = unpack('C*',$string); my @r; for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { push @r, $s[$i]; $i++; } elsif ($s[$i] > 255) { # encode a wide character push @r,unpack("C*",encode("utf-8",chr($s[$i]))); $i++; } else { # copy all the utf-8 bytes $n = _charBytes($i,@s) - 1; map { push @r, $s[$i+$_] } 0..$n; $i += $n + 1; } } return pack("C*",@r); } ######################################## # safeSubstr($string,$pos,$n) # return a safe substring (treats utf-8 sequences as a single # character) sub safeSubstr { my ($string,$pos,$n) = @_; $s = safeString($string); my $p = 0; my $rPos = 0; my $rEnd = -1; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $i++; } elsif ($s[$i] > 255) { $i++; } else { $i += _charBytes($i,@s); } $p++; $rPos = $i if $p == $pos; $rEnd = $i-1 if $p == $pos + $n; } $rEnd = scalar(@s) - 1 if $rEnd < 0; return "" if $rPos > $rEnd; my @r; map { push @r, $s[$_] } $rPos..$rEnd; return pack("C*",@r); } ######################################## # testString($string) # returns information about the characters in the string # # The 1, 2, and 4 bits of the result are for ASCII, UTF-8, and # wide characters respectively. If multiple bits are set, # characters of each type appear in the string. If the result is: # <= 1 simple ASCII string # <= 3 simple UTF-8 string # >3 && <= 5 mixed ASCII & wide characters # >= 6 mixed UTF-8 & wide characters sub testString { my ($s) = @_; return undef unless defined($s); my $r = 0; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $r |= 1; $i++; } elsif ($s[$i] > 255) { $r |= 4; $i++; } else { $r |= 2; $i += _charBytes($i,@s); } } return $r; } ######################################## # trueLength($string) # returns the number of UTF-8 characters in a string sub trueLength { my ($s) = @_; return unless defined($s); my $len = 0; my @s = unpack('C*',$s); for (my $i = 0; $i < scalar(@s);) { if ($s[$i] < 128) { $i++; } elsif ($s[$i] > 255) { $i++; } else { $i += _charBytes($i,@s); } $len++; } return $len; } ######################################## # String support routines sub _charBytes { my $n = shift(@_); my $len = scalar(@_); if ($_[$n] < 128) { return 1; } elsif ($_[$n] > 65535) { return 4; } elsif ($_[$n] > 255) { return 2; } elsif (($_[$n] & 0xFC) == 0xFC) { return min(6,$len); } elsif (($_[$n] & 0xF8) == 0xF8) { return min(5,$len); } elsif (($_[$n] & 0xF0) == 0xF0) { return min(4,$len); } elsif (($_[$n] & 0xE0) == 0xE0) { return min(3,$len); } elsif (($_[$n] & 0xC0) == 0xC0) { return min(2,$len); } else { return 1; } }
2 direct replies -- Read more / Contribute by tdlewis77
on Aug 25, 2017 at 13:07
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
1 direct reply -- Read more / Contribute by erichansen1836
on Oct 08, 2017 at 11:13TOPIC: FAST!! Random Access Indexed, Relational Flat File Databases, Indexed by external Perl SDBM databases of key/value pairs tied to program "in memory" hash tables, where the Key in the Key/Value Pair is one or more fields and/or partial fields concatenated together (separated by a delimiter such as a pipe "|") and contained within the Flat File records for you to arbitrarily seek to a single record or a sorted/related group of records within your database.
Since it has been over 2 years ago since I first posted about this TOPIC I discovered, I wanted to alert the Perl community to the original thread where you can find Perl source code now for examples of how to implement Joint Database Technology/Methodology. Inparticular the King James Bible Navigator software DEMO I posted which used FlatFile/SDBM for its database. I have made this a native Windows GUI application (TreeView/RichEdit COMBO interface) to demonstrate how to show your end-users a summary of the information of the data contained within a database, and allow them to drill down to a small amount of specific information (e.g. verses within a single book/chapter) for actual viewing (and retrieving from the database). The TreeView Double Click Event was originally written to random access the first verse within a chapter, then sequentially access the remaining verses within a chapter - performing a READ for each verse. I posted a separate modified TreeView Double Click Event for you to insert into the Application which reads an entire chapter in one (1) giant READ, breaking out the individual verses (into an array) using the UNPACK statement. -- Eric
Joint Database Technology: http://www.perlmonks.org/?node_id=1121222
![]() |
![]() |
![]() |
Nov 17, 2017 | stackoverflow.com
Ask Question up vote down vote favorite 1
brian d foy ,Nov 26, 2016 at 4:32
Running under moar (2016.10)Consider this code that constructs a set and tests for membership:
my $num_set = set( < 1 2 3 4 > ); say "set: ", $num_set.perl; say "4 is in set: ", 4 ∈ $num_set; say "IntStr 4 is in set: ", IntStr.new(4, "Four") ∈ $num_set; say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4; say "5 is in set: ", 5 ∈ $num_set;A straight
4
is not in the set, but the IntStr version is:set: set(IntStr.new(4, "4"),IntStr.new(1, "1"),IntStr.new(2, "2"),IntStr.new(3, "3")) 4 is in set: False IntStr 4 is in set: True IntStr(4,...) is 4: True 5 is in set: FalseI think most people aren't going to expect this, but the
∈
docs doesn't say anything about how this might work. I don't have this problem if I don't use the quote words (i.e.set( 1, 2, 3, 4)
).timotimo ,Nov 26, 2016 at 5:47
You took a wrong turn in the middle. The important part is whatnqp::existskey
is called with: thek.WHICH
. This method is there for value types, i.e. immutable types where the value - rather than identity - defines if two things are supposed to be the same thing (even if created twice). It returns a string representation of an object's value that is equal for two things that are supposed to be equal. For<1>.WHICH
you getIntStr|1
and for1.WHICH
you get justInt|1
.brian d foy ,Nov 26, 2016 at 6:18
Ah, okay. I can see a lot of pain for regular people trying to debug these things. – brian d foy Nov 26 '16 at 6:18smls ,Nov 26, 2016 at 14:46
As explained in the Set documentation, sets compare object identity, same as the===
operator:Within a Set, every element is guaranteed to be unique (in the sense that no two elements would compare positively with the === operator)
The identity of an object is defined by the .WHICH method, as timotimo elaborates in his answer.
brian d foy ,Nov 26, 2016 at 21:28
That's not really clear from that statement. That's talking about which elements are in the set. Beyond that, even if you choose to compare with ===, you have to know how other things are stored. This is the sort of info that should show up next to the Set operators. – brian d foy Nov 26 '16 at 21:28brian d foy ,Nov 26, 2016 at 23:09
Indeed, I think I've found a bug. The qw docs says this should be true:< a b 137 > eqv ( 'a', 'b', '137' )
, but in the same version of Rakudo Star I get false. It's different object types on each side. – brian d foy Nov 26 '16 at 23:09brian d foy ,Nov 26, 2016 at 23:16
Despite all this, your answer was the A-ha! moment that led me to look at the right thing. Thanks for all of your help. – brian d foy Nov 26 '16 at 23:16raiph ,Nov 27, 2016 at 4:50
Write your list of numbers using commasAs you mention in your answer, your code works if you write your numbers as a simple comma separated list rather than using the
<...>
construct.Here's why:
4 ∈ set 1, 2, 3, 4 # TrueA bare numeric literal in code like the
4
to the left of∈
constructs a single value with a numeric type. (In this case the type is Int, an integer.) If aset
constructor receives a list of similar literals on the right then everything works out fine.<1 2 3 4>
produces a list of "dual values"The various
<...>
"quote words" constructs turn the list of whitespace separated literal elements within the angle brackets into an output list of values.The foundational variant (
qw<...>
) outputs nothing but strings. Using it for your use case doesn't work:4 ∈ set qw<1 2 3 4> # FalseThe
4
on the left constructs a single numeric value, typeInt
. In the meantime theset
constructor receives a list of strings, typeStr
:('1','2','3','4')
. The∈
operator doesn't find anInt
in the set because all the values areStr
s so returnsFalse
.Moving along, the huffmanized
<...>
variant outputsStr
s unless an element is recognized as a number. If an element is recognized as a number then the output value is a "dual value". For example a1
becomes an IntStr .According to the doc "an IntStr can be used interchangeably where one might use a Str or an Int". But can it?
Your scenario is a case in point. While
1 ∈ set 1,2,3
and<1> ∈ set <1 2 3>
both work,1 ∈ set <1 2 3>
and<1> ∈ set 1, 2, 3
both returnFalse
.So it seems the
∈
operator isn't living up to the quoted doc's claim of dual value interchangeabilityThis may already be recognized as a bug in the
∈
set operation and/or other operations. Even if not, this sharp "dual value" edge of the<...>
list constructor may eventually be viewed as sufficiently painful that Perl 6 needs to change.brian d foy ,Nov 26, 2016 at 23:29
I think this is a bug, but not in the set stuff. The other answers were very helpful in sorting out what was important and what wasn't.I used the angle-brackets form of the quote words . The quote words form is supposed to be equivalent to the quoting version (that is, True under
eqv
). Here's the doc example:<a b c> eqv ('a', 'b', 'c')But, when I try this with a word that is all digits, this is broken:
$ perl6 > < a b 137 > eqv ( 'a', 'b', '137' ) FalseBut, the other forms work:
> qw/ a b 137 / eqv ( 'a', 'b', '137' ) True > Q:w/ a b 137 / eqv ( 'a', 'b', '137' ) TrueThe angle-bracket word quoting uses IntStr :
> my @n = < a b 137 > [a b 137] > @n.perl ["a", "b", IntStr.new(137, "137")]Without the word quoting, the digits word comes out as [Str]:
> ( 'a', 'b', '137' ).perl ("a", "b", "137") > ( 'a', 'b', '137' )[*-1].perl "137" > ( 'a', 'b', '137' )[*-1].WHAT (Str) > my @n = ( 'a', 'b', '137' ); [a b 137] > @n[*-1].WHAT (Str)You typically see these sorts of errors when there are two code paths to get to a final result instead of shared code that converges to one path very early. That's what I would look for if I wanted to track this down (but, I need to work on the book!)
This does highlight, though, that you have to be very careful about sets. Even if this bug was fixed, there are other, non-buggy ways that
eqv
can fail. I would have still failed because 4 as Int is not "4" as Str . I think this level of attention to data types in unperly in it's DWIMery. It's certainly something I'd have to explain very carefully in a classroom and still watch everyone mess up on it.For what it's worth, I think the results of
gist
tend to be misleading in their oversimplification, and sometimes the results ofperl
aren't rich enough (e.g. hidingStr
which forces me to.WHAT
). The more I use those, the less useful I find them.But, knowing that I messed up before I even started would have saved me from that code spelunking that ended up meaning nothing!
Christoph ,Nov 26, 2016 at 23:55
Could you clarify what you consider the bug to be? As far as I can tell, this is all by design: (a)<...>
goes through&val
, which returns allomorphs if possible (b) set membership is defined in terms of identity, which distinguishes between allomorphs and their corresponding value types; so I would not classify it as a bug, but 'broken' by design; or phrased another way, it's just the WAT that comes with this particular DWIM – Christoph Nov 26 '16 at 23:55Brad Gilbert ,Nov 26, 2016 at 23:59
This was intentionally added, and is part of the testsuite . ( I can't seem to find anywhere that tests for< >
being equivalent toq:w:v< >
and<< >>
/" "
being equivalent toqq:ww:v<< >>
) – Brad Gilbert Nov 26 '16 at 23:59brian d foy ,Nov 27, 2016 at 0:02
The docs say the two lists should be eqv, and they are not. If they are not meant to be equivalent, the docs need to change. Nothing in docs.perl6.org/language/quoting#Word_quoting:_qw mentions any of this stuff. – brian d foy Nov 27 '16 at 0:02Christoph ,Nov 27, 2016 at 0:17
The documentation seems to be just wrong here,<...>
does not correspond toqw(...)
, butqw:v(...)
. Cf S02 for the description of the adverb and this test that Brad was <del>looking for</del> already linked to – Christoph Nov 27 '16 at 0:17Christoph ,Nov 27, 2016 at 0:45
or perhaps not outright wrong, but rather 'just' misleading:<...>
is indeed a:w
form, and the given example code does compare equal according toeqv
– Christoph Nov 27 '16 at 0:45dwarring ,Nov 27, 2016 at 18:33
Just to add to the other answers and point out a consistancy here between sets and object hashes .An object hash is declared as
my %object-hash{Any}
. This effectively hashes on objects.WHICH
method, which is similar to how sets distinguish individual members.Substituting the set with an object hash:
my %obj-hash{Any}; %obj-hash< 1 2 3 4 > = Any; say "hash: ", %obj-hash.keys.perl; say "4 is in hash: ", %obj-hash{4}:exists; say "IntStr 4 is in hash: ", %obj-hash{ IntStr.new(4, "Four") }:exists; say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4; say "5 is in hash: ", %obj-hash{5}:exists;gives similar results to your original example:
hash: (IntStr.new(4, "4"), IntStr.new(1, "1"), IntStr.new(2, "2"), IntStr.new(3, "3")).Seq 4 is in hash: False IntStr 4 is in hash: True IntStr(4,...) is 4: True 5 is in hash: Falsebrian d foy ,Nov 29, 2016 at 21:52
Oh, this is going to suck hard while teaching a class. – brian d foy Nov 29 '16 at 21:52dwarring ,Nov 30, 2016 at 4:21
I agree its not great, as it is. – dwarring Nov 30 '16 at 4:21dwarring ,Nov 30, 2016 at 18:26
Have raised an RT rt.perl.org/Ticket/Display.html?id=130222 – dwarring Nov 30 '16 at 18:26
![]() |
![]() |
![]() |
Nov 17, 2017 | www.youtube.com
Published on Oct 17, 2015
For details visit: http://perlmaven.com/introducing-test... Category Science & Technology License Standard YouTube License
![]() |
![]() |
![]() |
Nov 17, 2017 | www.youtube.com
Published on Jun 21, 2017
In which I detail the Perl 6 elements that have most changed my Perl 5 coding, and share the Perl 5 techniques I have adopted.
I eat, sleep, live, and breathe Perl!
Consultant and Contract Programmer Frequent PerlMongers speaker Dedicated Shakespeare theater-goer Armchair Mathematician Author of Blue_Tiger, a tool for modernizing Perl.
36 years coding 22 years Perl 16 years Married 15 YAPCs 7 Hackathons 3 PerlWhirls Perl interests: Refactoring, Perl Idioms / Micropatterns, RosettaCode, and Perl 6.
![]() |
![]() |
![]() |
Nov 17, 2017 | stackoverflow.com
Ask Question up vote down vote favorite
Toren ,Jan 12, 2011 at 14:50
I have an attribute (32 bits-long), that each bit responsible to specific functionality. Perl script I'm writing should turn on 4th bit, but save previous definitions of other bits.I use in my program:
Sub BitOperationOnAttr { my $a=""; MyGetFunc( $a); $a |= 0x00000008; MySetFunc( $a); }** MyGetFunc/ MySetFunc my own functions that know read/fix value.
Questions:
- if usage of
$a |= 0x00000008;
is right ?- how extract hex value by Regular Expression from string I have : For example:
"Attribute: Somestring: value (8 long (0x8))"
Michael Carman ,Jan 12, 2011 at 16:13
Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13toolic ,Jan 12, 2011 at 16:47
Same question asked on PerlMonks: perlmonks.org/?node_id=881892 – toolic Jan 12 '11 at 16:47psmears ,Jan 12, 2011 at 15:00
- if usage of $a |= 0x00000008; is right ?
Yes, this is fine.
- how extract hex value by Regular Expression from string I have : For example:
"Attribute: Somestring: value (8 long (0x8))"
I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:
if ($string =~ m/0x([0-9a-fA-F]+)/) { $value = hex($1); } else { # string didn't match }should work.
Michael Carman ,Jan 12, 2011 at 16:32
Perl provides several ways for dealing with binary data:
- The bitwise operators
&
,|
, and~
.- The
pack
andunpack
functions.- The
vec
function.Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:
my $mask = 1 << 3; # 0x0008 $value |= $mask; # set bit $value &= ~$mask; # clear bit if ($value & $mask) # check bit
vec
is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:vec($value, 3, 1) = 1; # set bit vec($value, 3, 1) = 0; # clear bit if (vec($value, 3, 1)) # check bit
pack
andunpack
are better suited for working with things like C structs or endianness.sdaau ,Jul 15, 2014 at 5:01
I upvoted, but there is something very important missing:vec
operates on a string!If we use a number; say:
$val=5;
printf("b%08b",$val);
(this gives
b00000101
) -- then one can see that the "check bit" syntax, say:
for($ix=7;$ix>=0;$ix--) {
print vec($val, $ix, 1);
};
print "\n";
will not work (it gives
00110101
, which is not the same number). The correct is to convert the number to ASCII char, i.e.
print vec(sprintf("%c", $val), $ix, 1);
.
![]() |
![]() |
![]() |
Nov 17, 2017 | stackoverflow.com
confused ,2 days ago
I have a string of epoch seconds "1510652305" which when i convert to normal time on unix command line using`date -d @1510652305`i get Tue Nov 14 15:08:25 IST 2017
But when i tried it in perl using something like this
use POSIX qw(strftime); use Time::Local; use Time::localtime; $kickoff_time=1510652305; $kickoff_time=ctime($kickoff_time);i get
Thu Jan 1 05:30:00 1970
How can i achieve the result i am getting in linux in perl?
Thanks!!
mwp ,2 days ago
Don't overthink it!my $kickoff_time = localtime 1510652305; say $kickoff_time; # Tue Nov 14 15:08:25 2017If you absolutely, positively need the timezone in there:
use POSIX qw{strftime}; my $kickoff_time = strftime '%a %b %e %H:%M:%S %Z %Y', localtime 1510652305; say $kickoff_time; # Tue Nov 14 15:08:25 IST 2017Note that this is locale-dependent.
confused ,yesterday
We have to use localtime to convert in time from epoch seconds and gmtime to convert in time from normal seconds i got it now....Thanks!! – confused yesterdaymob ,yesterday
Still confused. Bothlocaltime
andgmtime
expect the input to be epoch seconds. – mob yesterdaymwp ,4 hours ago
Exactly. To expound,localtime()
takes the epoch and returns a string (or date parts array) representing the time in your local timezone;gmtime()
takes the epoch and returns a string (or date parts array) representing the time in UTC. – mwp 4 hours ago,
I would recommend usingTime::Piece
for this job - it's core in perl.#!/usr/bin/env perl use strict; use warnings; use Time::Piece; my $t = localtime ( 1510652305 ); print $t;It'll print default format, or you can use formatted using
strftime
.
![]() |
![]() |
![]() |
Nov 17, 2017 | perlmonks.com
Discipulus (Monsignor) on Nov 16, 2017 at 09:04 UTC
Re: perl modulesHello codestroman and welcome to the monastery and to the wonderful world of Perl!
First of all, please, add <c> code tags </c> around your code and output.
Then be sure to have read the standard documentation: perlmod and perlnewmod
Infact a basic perl module define a package and use Exporter to export functions in the using perl program.
In my homenode i've collected a lot of links on about module creation
L*
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.thanos1983 (Priest) on Nov 16, 2017 at 09:17 UTC
Re: perl modulesHello codestroman
Just to add a minor suggestion here, to the full cover reply of fellow monk Discipulus . It will assist you a lot also to read Simple Module Tutorial
Update: Direct answer to your question can be found here How to add a relative directory to @INC with multiple possible solutions. I would strongly recommend to go through all the articles that all monks proposed.
Hope this helps, BR.
Seeking for Perl wisdom...on the process of learning...not there...yet!hippo (Abbot) on Nov 16, 2017 at 09:21 UTC
Re: perl modules (Can't locate in @INC)PLEASE HELP!!This is a monastery - a place of quite contemplation. The louder you shout the less wisdom shall you receive.
The error message Can't locate dog.pm in @INC is pretty explicit. Either your module file is not called dog.pm in which case, change it or else your file dog.pm is not in any of the directories listed in @INC in which case either move it to one of those directories or else change @INC with use lib .
I also see, despite the lack of formatting in your post that your module doesn't use any namespace. You should probably address that. Perhaps a solid read through Simple Module Tutorial would be a good idea?
Anonymous Monk on Nov 16, 2017 at 09:07 UTC
Re: perl modulesuse an absolute pathname in use lib
Anonymous Monk on Nov 16, 2017 at 15:16 UTC
Re: perl modulesWelcome to the language ... and, to the Monastery. The "simple module tutorial" listed above is a very good place to start. Like all languages of its kind, Perl looks at runtime for external modules in a prescribed list of places, in a specified order. You can affect this in several ways, as the tutorials describe. Please read them carefully.
In the Perl(-5) language, this list is stored in a pre-defined array variable called @INC and it is populated from a variety of sources: a base-list that is compiled directly into the Perl interpreter, the PERL5LIB environment-variable, use lib statements, and even direct modification of the variable itself. Perl searches this list from beginning to end and processes (only) the first matching file that it finds.
(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)
Corion (Pope) on Nov 16, 2017 at 15:23 UTC
Re^2: perl modules
by Corion (Pope) on Nov 16, 2017 at 15:23 UTC(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)Please no.
The word "pragma" has a special meaning in Perl, and it is highly confusing to claim that a Perl "keyword" would be a "pragma". use certainly is a keyword and nothing else.
If you mean to say something different, please describe in more words what you want to say.
![]() |
![]() |
![]() |
Nov 17, 2017 | www.youtube.com
Perl comes with a very powerful built-in command line debugger. In this screencast you can see basics how to use it.
For blog entries and for more screencasts see http://perlmaven.com/
About Perl Programming and Perl programmers.
For the blog entry of this screencast visit
http://perlmaven.com/using-the-built-...Debugger commands used:
q - quit,
h - help,
p - print,
s - step in,
n - step over,
r - step out,
T - stack trace
l - listing codeThe Padre project can be found here: http://padre.perlide.org/
The book mentioned was Pro Perl Debugging: http://www.apress.com/9781590594544
If you are interested an on-site Perl training contact me http://szabgab.com/contact.html
![]() |
![]() |
![]() |
Nov 17, 2017 | www.amazon.com
Regex Modifiers
Several modifiers change the behavior of the regular expression operators. These modifiers appear at the end of the match, substitution, and qr// operators. For example, here's how to enable case-insensitive matching:
my $pet = 'ELLie' ; like $pet, qr /Ellie/, 'Nice puppy!' ; like $pet, qr /Ellie/i, 'shift key br0ken' ; The first like() will fail because the strings contain different letters. The second like() will pass, because the /i modifier causes the regex to ignore case distinctions. and are effectively equivalent in the second regex due to the modifier.
You may also embed regex modifiers within a pattern:
my $find_a_cat = qr /(?<feline>(?i)cat)/; The (?i) syntax enables case-insensitive matching only for its enclosing group -- in this case, the named capture. You may use multiple modifiers with this form. Disable specific modifiers by preceding them with the minus character ( ):
my $find_a_rational = qr /(?<number>(?-i)Rat)/;
... ... ...The /e modifier lets you write arbitrary code on the right side of a substitution operation. If the match succeeds, the regex engine will use the return value of that code as the substitution value. The earlier global substitution example could be simpler with code like the following:
# appease the Mitchell estate $sequel =~ {Scarlett( O 'Hara)?} { ' Mauve ' . defined $1 ? ' Midway ' : '' }ge; Each additional occurrence of the /e modifier will cause another evaluation of the result of the expression, though only Perl golfers use anything beyond /ee
![]() |
![]() |
![]() |
Nov 16, 2017 | stackoverflow.com
Andrew Newby, Nov 10 at 11:33
I am trying to use Net::FTP ( http://search.cpan.org/~shay/libnet-3.10/lib/Net/FTP.pm ) to upload a file to a remote server. I have:use Net::FTP; my $ftp = Net::FTP->new("example.com", Debug => 1) or die "Cannot connect to example.com: $@"; $ftp->login("username",'xxxx') or die "Cannot login ", $ftp->message; $ftp->pasv(); $ftp->binary(); $ftp->cwd("/web/example.com/public_html/cgi-bin/links/admin/IMPORT") or die "Cannot change working directory ", $ftp->message; print "Currently in: " . $ftp->pwd(). "\n"; $ftp->put("/home/chambres/web/example.com/public_html/cgi-bin/links/admin/org.csv") or die "Cannot upload ", $ftp->message; $ftp->quit;However, when I run it I get:
Net::FTP>>> Net::FTP(3.05) Net::FTP>>> Exporter(5.72) Net::FTP>>> Net::Cmd(3.05) Net::FTP>>> IO::Socket::SSL(2.024) Net::FTP>>> IO::Socket::IP(0.37) Net::FTP>>> IO::Socket(1.38) Net::FTP>>> IO::Handle(1.35) Net::FTP=GLOB(0x182e348)<<< 220 (vsFTPd 3.0.3) Net::FTP=GLOB(0x182e348)>>> USER username Net::FTP=GLOB(0x182e348)<<< 331 Please specify the password. Net::FTP=GLOB(0x182e348)>>> PASS .... Net::FTP=GLOB(0x182e348)<<< 230 Login successful. Net::FTP=GLOB(0x182e348)>>> EPSV Net::FTP=GLOB(0x182e348)<<< 229 Entering Extended Passive Mode (|||12065|) Net::FTP=GLOB(0x182e348)>>> TYPE I Net::FTP=GLOB(0x182e348)<<< 200 Switching to Binary mode. Net::FTP=GLOB(0x182e348)>>> CWD /web/example.com/public_html/cgi-bin/links/admin/IMPORT Net::FTP=GLOB(0x182e348)<<< 250 Directory successfully changed. Net::FTP=GLOB(0x182e348)>>> PWD Net::FTP=GLOB(0x182e348)<<< 257 "/web/example.com/public_html/cgi-bin/links/admin/IMPORT" is the current directory Currently in: /web/example.com/public_html/cgi-bin/links/admin/IMPORT Net::FTP=GLOB(0x182e348)>>> PORT 139,162,208,252,155,199 Net::FTP=GLOB(0x182e348)<<< 200 PORT command successful. Consider using PASV. Net::FTP=GLOB(0x182e348)>>> FEAT Net::FTP=GLOB(0x182e348)<<< 211-Features: Net::FTP=GLOB(0x182e348)<<< EPRT Net::FTP=GLOB(0x182e348)<<< EPSV Net::FTP=GLOB(0x182e348)<<< MDTM Net::FTP=GLOB(0x182e348)<<< PASV Net::FTP=GLOB(0x182e348)<<< REST STREAM Net::FTP=GLOB(0x182e348)<<< SIZE Net::FTP=GLOB(0x182e348)<<< TVFS Net::FTP=GLOB(0x182e348)<<< 211 End Net::FTP=GLOB(0x182e348)>>> HELP ALLO Net::FTP=GLOB(0x182e348)<<< 214-The following commands are recognized. Net::FTP=GLOB(0x182e348)<<< ABOR ACCT ALLO APPE CDUP CWD DELE EPRT EPSV FEAT HELP LIST MDTM MKD Net::FTP=GLOB(0x182e348)<<< MODE NLST NOOP OPTS PASS PASV PORT PWD QUIT REIN REST RETR RMD RNFR Net::FTP=GLOB(0x182e348)<<< RNTO SITE SIZE SMNT STAT STOR STOU STRU SYST TYPE USER XCUP XCWD XMKD Net::FTP=GLOB(0x182e348)<<< XPWD XRMD Net::FTP=GLOB(0x182e348)<<< 214 Help OK. Net::FTP=GLOB(0x182e348)>>> ALLO 37954326 Net::FTP=GLOB(0x182e348)<<< 202 ALLO command ignored. Net::FTP=GLOB(0x182e348)>>> STOR org.csv Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection. <h1>Software error:</h1> <pre>Cannot upload Failed to establish connection. </pre> <p> For help, please send mail to this site's webmaster, giving this error message and the time and date of the error. </p> [Fri Nov 10 10:57:33 2017] export-csv-other-sites.cgi: Cannot upload Failed to establish connection.It seems to work up until the
put()
command. Any ideas as to what is going on?Gerhard Barnard, Nov 10 at 11:36
huh?$ftp->put("/home/chambres/web/example.com/public_html/cgi-
That seems incomplete. – Gerhard Barnard Nov 10 at 11:36Gerhard Barnard, Nov 10 at 11:38
Secondly, it is not connecting. It tells you that twice Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection. and then again <pre>Cannot upload Failed to establish connection – Gerhard Barnard Nov 10 at 11:38Andrew Newby, Nov 10 at 11:41
@GerhardBarnard - I know that :) The weird part, is that it says it IS connected:Currently in: /web/example.com/public_html/cgi-bin/links/admin/IMPORT
. – Andrew Newby Nov 10 at 11:41Andrew Newby, Nov 10 at 11:42
"That seems incomplete." - what seems incomplete? – Andrew Newby Nov 10 at 11:42Gerhard Barnard, Nov 10 at 11:43
I suspect it is not keeping the connection open. can you also fix the code? it seems incomplete.$ftp->put("/home/chambres/web/example.com/public_html/cgi-
– Gerhard Barnard Nov 10 at 11:43Net::FTP=GLOB(0x182e348)>>> PORT 139,162,208,252,155,199 Net::FTP=GLOB(0x182e348)<<< 200 PORT command successful. Consider using PASV.FTP uses a control connection for the command and data connections for each data transfer. With the PORT command your local system is instructing the server to connect to the given IP address (139.162.208.252) and port (39879=155*256+199). Connecting from outside to some arbitrary port on your system will not work if you are behind a firewall or some NAT or if there is a firewall configured on your system. In these cases it might work to use the passive mode where the client opens a connection to the server and not the server a connection to the client.
Net::FTP=GLOB(0x182e348)>>> STOR org.csv Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection.It looks like the server could not connect to your system in order to create a connection to transfer the data. Probably a firewall or NAT involved. Try passive mode.
It looks like that you tried to use passive mode already:
$ftp->pasv(); ... Net::FTP=GLOB(0x182e348)>>> EPSV Net::FTP=GLOB(0x182e348)<<< 229 Entering Extended Passive Mode (|||12065|)Only you did it the wrong way. The command above just sends the PASV/EPSV command to the server but does not change which mode gets used for the next data transfer. To cite from the documentation :
If for some reason you want to have complete control over the data connection, this includes generating it and calling the response method when required, then the user can use these methods to do so.
However calling these methods only affects the use of the methods above that can return a data connection. They have no effect on methods get, put, put_unique and those that do not require data connections.To instead enable passive mode in connection with
put
,get
etc usepassive
notpasv
:$ftp->passive(1);
![]() |
![]() |
![]() |
Nov 16, 2017 | perlmonks.com
likbez
// is an abbreviation for m// (be careful of context). But // is can be replaced by (almost?) any delimiter, by using m or s or tr.You make a very good point. Now I started to understand why they put description of tr, which is actually a function into this strange place
http://perldoc.perl.org/perlop.html#Quote-Like-OperatorsStrings with arbitrary delimiters after tr, m, s, etc are a special, additional type of literals. Each with its own rules. And those rules are different from rules that exist for single quoted strings, or double quoted strings or regex (three most popular types of literals in Perl).For example, the treatment of backslash in "tr literal" is different from single quoted strings:
"A single-quoted, literal string. A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated."This means that in Perl there is a dozen or so of different types of literals, each with its own idiosyncratic rules. Which create confusion even for long type Perl users as they tend to forget detail of constructs they use rarely and extrapolate them from more often used constructs.
For example, in my case, I was burned by the fact that "m literals" allows interpolation of variables, but "tr literals" do not. And even created a test case to study this behavior :-)
In other words, the nature of those "context-dependent-literals" (on the level of lexical scanner they are all literals) is completely defined not by delimiters they are using (which are arbitrary), but by the operator used before it. If there none, m is assumed.
This "design decision" (in retrospect this is a design decision, although in reality it was "absence of design decition" situation ;-) adds unnecessary complexity to the language and several new (and completely unnecessary) types of bugs.
This "design decision" is also poorly documented and for typical "possible blunders" (for tr that would be usage of "[","$","@" without preceding backslash) there is no warnings.
This trick of putting tr description into http://perldoc.perl.org/perlop.html that I mentioned before now can be viewed as an attempt to hide this additional complexity. It might be beneficial to revise the docs along the lines I proposed.
In reality in Perl q, qq, qr, m, s, tr are functions each of which accepts (and interpret) a specific, unique type of "context-dependent-literal" as the argument. That's the reality of this, pretty unique, situation with the language, as I see it.
Quote-Like-Operators shows 2 interesting examples with tr: tr[aeiouy][yuoiea] or tr(+\-*/)/ABCD/. [download]The second variant look like a perversion for me. I never thought that this is possible. I thought that the "arbitrary delimiter" is "catched" after the operator and after that they should be uniform within the operator ;-).And the first is not without problems either: if you "extrapolate" your skills with regex into tr you can write instead of tr[aeiouy][yuoiea] obviously incorrect< code>tr/ aeiouy /] yuoiea / that will work fine as long as strings are of equal length.
![]() |
![]() |
![]() |
Nov 11, 2017 | stackoverflow.com
Rotch Miller, Nov 11 at 6:48
I have following query in Perl regarding the accessing of file handlers.Consider the following code snippet which describes the exact scenario.
Main.pl#!/usr/bin/perl -w use warnings; use strict; use strict 'refs'; use File::Basename; use Fcntl ':flock'; use feature qw/say switch/; use File::Spec::Functions; use File::Find; require( "/home/rxa3kor/Mastering_Perl/sample.pm" ); our $LOGFILE = "sample"; open( LOGFILE, ">$LOGFILE" ) or die "__ERROR: can't open file\n'", $LOGFILE, "'!\n"; flock( LOGFILE, LOCK_EX ); print LOGFILE ( "Tool Start\n" ); &sample::func(); flock( LOGFILE, LOCK_UN ); close( LOGFILE );sample.pm#!/usr/bin/perl -w package sample; sub func() { print $main::LOGFILE ( "Printing in subroutine\n" ); }when I execute the above said code snippet I am getting the following error.
print() on unopened filehandle Mastering at /home/rxa3kor/Mastering_Perl/sample.pm line 6.
Th error is because the filehandle
LOGFILE
is not visible undersample.pm
module.How this concept can be implemented?
I want to open a file in
Main.pl
and I need this file handle to be accessible in different Perl modules.Dave Cross ,Nov 11 at 6:54
I don't think this is the code you are using. This code doesn't compile. You are missing a semicolon at the end of theuse File::Find
line. And once I fix that, I get another problem as you are not loadingsample.pm
in your main program. Please don't waste our time by posting sample code where we have to fix simple errors like that. – Dave Cross Nov 11 at 6:54Dave Cross ,Nov 11 at 6:55
Two more errors.sample.pm
does not return a true value. And the filename is different between this sample code and the error message that you quote. – Dave Cross Nov 11 at 6:55Rotch Miller ,Nov 11 at 7:01
Basically i wanted to know whether we can open a file under main.pl and i need this file handle to be accessible in different Perl modules. – Rotch Miller Nov 11 at 7:01DavidO ,Nov 11 at 7:11
This is unrelated to the problem that you are asking about, but what do you think will happen when you open your logfile in'>'
mode, and then discover you're unable to obtain an exclusive lock because someone else has it locked? – DavidO Nov 11 at 7:11DavidO ,Nov 11 at 7:13
Hint: Clobber-output mode will clobber the output file before you've obtained a lock. This means if someone else already had the file opened with a lock, you just clobbered them. – DavidO Nov 11 at 7:13Dave Cross ,Nov 11 at 7:19
The reason why you're seeing this error is that$main::LOGFILE
refers to the scalar variable$LOGFILE
which contains the filename,sample
. The filehandle,LOGFILE
, is a completely different variable. And here we see the dangers of having two variables of different types (scalar vs filehandle) with the same name.Bareword filehandles (the ones in capital letters with no sigil attached, the type you are using) are slightly strange variables. They don't need a sigil, so you shouldn't use one. So the simplest fix is to just remove the
$
.sub func() { print main::LOGFILE ("Printing in subroutine\n"); }But using global variables like this is a terrible idea. It will quickly lead to your code turning into an unmaintainable mess.
Far better to use a lexical filehandle and to pass that into your subroutine.
our $LOGFILE="sample"; open( my $log_fh, ">$LOGFILE" ) or die "__ERROR: can't open file\n'",$LOGFILE,"'!\n"; flock( $log_fh, LOCK_EX ); print $log_fh ("Tool Start\n"); &sample::func($log_fh); flock( $log_fh, LOCK_UN ); close( $log_fh );And in
sample.pm
:sub func { my ($fh) = @_; print $fh ("Printing in subroutine\n"); }Note that as I'm now passing a parameter to
func()
. I've removed the prototype saying that it takes no parameters (although the fact that you were calling it with&
turns off parameter checking!)A few other points.
- You don't need both
-w
anduse warnings
. Remove the-w
.- You don't need both
use strict
anduse strict 'refs'
. Remove the latter.- Modules with all lower-case names are reserved for special Perl features called pragmas . Don't name your modules like that.
- There's no need for
$LOGFILE
to be a package variable (defined withour
). Just make it a lexical (defined withmy
).- There is no reason to call subroutines with
&
(and, in fact, it has a couple of downsides that will confused you).- Don't define subroutines with prototypes unless you know what they are for.
- No need for a shebang line in modules.
- Use
strict
andwarnings
in modules.I'd write your code like this:
# main.pl use warnings; use strict; use File::Basename; # Not used. Remove? use Fcntl ':flock'; # Not user. Remove? use feature qw/say switch/; use File::Spec::Functions; # Not user. Remove? use File::Find; # Not user. Remove? use Sample; my $LOGFILE = 'sample'; # Lexical filehandle. Three-arg version of open() open( my $log_fh, '>', $LOGFILE ) or die "__ERROR: can't open file\n'$LOGFILE'!\n"; flock( $log_fh, LOCK_EX ); print $log_fh ("Tool Start\n"); sample::func($log_fh); flock( $log_fh, LOCK_UN ); close( $log_fh );And...
package Sample; use strict; use warnings; sub func { my ($fh) = @_; print $fh ("Printing in subroutine\n"); } 1;Rotch Miller ,Nov 11 at 7:29
Is there any method where we can avoid passing the file handler to a subroutine ? I need to directly access the file handler in the perl module which is present in main,pl. Reason for this requirement is because i may have different Perl modules and different subroutines inside each modules, every time i need to pass the file handlers to each of these subroutines in Perl module. Another difficulty will be always subroutine need not be called from main.pl file, subroutine defined in a *.pm file may call other subroutine which is defined in another *.pm module. – Rotch Miller Nov 11 at 7:29Dave Cross ,Nov 11 at 7:32
@RotchMiller: My answer already tells you how to do that. But I think it's a very bad idea. – Dave Cross Nov 11 at 7:32Rotch Miller ,Nov 11 at 7:54
Main underlying problem is the way how the file handler's can be made visible in the subroutine of different Perl modules. Like how we have to export a scalar variables from one *.pm module to any perl modules using the EXPORTER, similar concept for file handlers would be good. – Rotch Miller Nov 11 at 7:54Dave Cross ,Nov 11 at 7:59
@RotchMiller: Exporter works fine for filehandles. Obviously not if they're lexical variables. But for package variables and bareword filehandles, there's no problem. – Dave Cross Nov 11 at 7:59Dave Cross ,Nov 11 at 8:03
@RotchMiller The traditional way to make a variable visible within subroutines in many different modules is to pass it in as a parameter. But if you want to ignore seventy years of good software engineering practice - feel free :-) – Dave Cross Nov 11 at 8:03> ,
You've got an extremely detailed analysis from Dave Cross .Here I'd like to offer a way to cleanly provide a log file for all modules to write to.
Introduce a module that performs the writes to a log file in a sub; load it by all modules that need that. In that sub open the log file to append, using state filehandle which thus stays open across the calls. Then the modules write by invoking this sub, and this can be initiated by a call from
main
.The logger module
package LogAll; use warnings; use strict; use feature qw(say state); use Carp qw(croak); use Exporter qw(import); our @EXPORT_OK = qw(write_log); sub write_log { state $fh = do { # initialize; stays open across calls my $log = 'LOG_FILE.txt'; open my $afh, '>>', $log or croak "Can't open $log: $!"; $afh; }; say $fh $_ for @_; } 1;Two other modules, that need to log, are virtually the same for this example; here is one
package Mod1; use warnings; use strict; use Exporter qw(import); use LogAll qw(write_log); our @EXPORT_OK = qw(f1); sub f1 { write_log(__PACKAGE__ . ": @_"); } 1;The main
use warnings; use strict; use LogAll qw(write_log); use Mod1 qw(f1); use Mod2 qw(f2); write_log('START'); f1("hi from " . __PACKAGE__); f2("another " . __PACKAGE__);A run results in the file
LOG_FILE.txt
START Mod1: hi from main Mod2: another mainI print
START
for a demo but the file need not be opened frommain
.Please develop the printer module further as suitable. For example, and a way for the file name to be passed optionally so that
main
can name the log (by varying type and number of arguments), and add a way to close the log controllably,
![]() |
![]() |
![]() |
Nov 16, 2017 | perlmonks.com
davido (Archbishop) on Nov 16, 2017 at 05:46 UTC
Re: Generating a range of Unicode charactersCheck out perlop Auto-increment and Auto-decrement for an explanation.
The thing to consider here is that the .. range operator leverages the semantics provided by ++ (auto-increment). The documentation for auto-increment says this:
The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^ a-zA-Z * 0-9 *\z/ , the increment is done as a string, preserving each character within its range, with carry:
print ++($foo = "99"); # prints "100" print ++($foo = "a0"); # prints "a1" print ++($foo = "Az"); # prints "Ba" print ++($foo = "zz"); # prints "aaa" [download]The components of the range you are trying to construct do not meet the criteria for Perl's built-in autoincrement behavior.
However, if you're using Perl 5.26 or newer, and enable unicode_strings you can use the following, as documented in perlop Range Operators .
use charnames "greek"; my @greek_small = map { chr } (ord("\N{alpha}") .. ord("\N{omega}")); [download]Or forgo the \N{charname} lookups and just use the actual ordinal values:
my @chars = map {chr} $ord_first .. $ord_last; [download]
Dave
Your Mother (Chancellor) on Nov 16, 2017 at 06:13 UTC
Re: Generating a range of Unicode charactersIs this what you're after?
perl -CSD -le 'print chr for 0xDF .. 0x0101' [download]Update: I hadn't read all the way down davido 's post. He is making the same suggestion already at the end.
![]() |
![]() |
![]() |
Nov 12, 2017 | stackoverflow.com
pleriche, Nov 12 at 9:52
I have an HTML file containing a 2-column table which I want to parse in order to extract pairs of strings representing the columns. The page layout of the HTML (white space, new lines) is arbitrary, hence I can't parse the file line by line.I recall that you can parse such a thing by slurping the whole file into a string and operating on the entire string, which I'm finding a bit more challenging. I'm trying things like the following:
#!/usr/bin/perl open(FILE, "Glossary") || die "Couldn't open file\n"; @lines = <FILE>; close(FILE); $data = join(' ', @lines); while ($data =~ /<tr>.*(<td>.*<\/td>).*(<td>.*<\/td>).*<\/tr>/g) { print $1, ":", $2, "\n"; }which gives a
null
output. Here's a section of the input file:<table class="wikitable"> <tr> <td><b>Term</b> </td> <td><b>Meaning</b> </td></tr> <tr> <td><span id="0-Day">0-Day</span> </td> <td> <p>See <a href="#Zero_Day">Zero Day</a>. </p> </td>Can someone help me out?
Borodin ,Nov 12 at 21:20
UseHTML::TableExtract
– Borodin Nov 12 at 21:20zdim ,Nov 12 at 21:46
To correct my early comment (removed), while I recommend HTML::TreeBuilder for general parsing of HTML (and there are others), here you indeed wantHTML::TableExtract
. And you do not want to use regex – zdim Nov 12 at 21:46Dave Cross ,2 days ago
You can't parse HTML with a regex – Dave Cross 2 days agoMiguel Prz ,Nov 12 at 10:03
There is a HTML::TableExtract module in CPAN, which simplifies the problem you are trying to solve:use strict; use warnings; use HTML::TableExtract qw(tree); my $te = HTML::TableExtract->new( headers => qw(Term Meaning) ); my $html_file = "Glossary"; $te->parse_file($html_file); my $table = $te->first_table_found; # ...pleriche ,yesterday
Thank you and I'm sure TableExtract is the better way of doing it, but the object of my question was to improve my understanding of how to use regular expressions since they're so central to Perl. Adding gs to the regexpr as someone suggested (since deleted) was the leg-up I needed. – pleriche yesterdayMiguel Prz ,yesterday
I see your point, and it's really important build a solid knowlegment on regexpr. But, like other people have said, it's not a goot idea apply regexpr to parsing html documents – Miguel Prz yesterday,
You already have answers explaining why you shouldn't parse HTML with regexes. And you really shouldn't. But you've asked for an explanation of why your code doesn't work. So here goes...You have two problems in your code. One stops it working and the other stops it working as you expect.
Firstly, you are using
.
in your regex to match any character. But.
doesn't match any character. It matches any character except a newline. And you have newlines in your string. You fix that by adding the/s
option to your match operator (so it has/gs
instead of/s
).With that fix in place, you get a result from your code. Using your test data, I see:
<td><b>Term</b> </td>:<td><b>Meaning</b> </td>Which is correct. But looking at your test data, I wondered why I wasn't getting two results - because of the
/g
. I soon realised it was because your test data is missing the closing</td>
. When I added that, I got this result:<td><span id="0-Day">0-Day</span> </td>:<td> <p>See <a href="#Zero_Day">Zero Day</a>. </p> </td>Ok. It's now finding the second result. But what has happened to the first one? That's the second error in your code.
You have
.*
a few times in your regex. That means "zero or more of any character". But it's the "or more" that is a problem here. By default, Perl regex qualifiers (*
or+
) are greedy. That means they will use up as much of the string as possible. And the first.*
in your regex is eating up a lot of your string. All of it up to the second<tr>
in fact.The solution to that is to make the
.*
non-greedy. And you do that by adding?
to the end. So you can replace all of the.*
with.*?
. Having done that, I get this output:<td><b>Term</b> </td>:<td><b>Meaning</b> </td> <td><span id="0-Day">0-Day</span> </td>:<td> <p>See <a href="#Zero_Day">Zero Day</a>. </p> </td>Which seems correct to me.
So, to summarise:
- By default,
.
doesn't match newlines. To do that, you need/s
.- Beware of greedy qualifiers.
![]() |
![]() |
![]() |
Nov 11, 2017 | stackoverflow.com
newbie ,Nov 11 at 0:27
I have a read-only perl file with a huge hash defined in it. Is there anyway for me to read this perl file and dump out the hash contents?this is basic structure of the hash within the file.
%hash_name = { -files => [ '<some_path>', ], -dirs => [ '<some_path>', '<some_path>', '<some_path>', '<some_path>', '<some_path>', ], };Davy M ,Nov 11 at 0:30
can you not cat the file and redirect it into a one that does have write permissions?cat perl_file_name > new_perl_file_name
– Davy M Nov 11 at 0:30newbie ,Nov 11 at 0:34
yes I did consider that but will go with that approach only if there is no other way to dump the hash without creating a new file. – newbie Nov 11 at 0:34zdim ,Nov 11 at 2:59
@newbie Thank you, and to repeat the question: Does this file have other Perl code or just this hash? Also, is the hash undeclared (just%hash_name
), as you show it, or is it "lexical," so withmy
such as:my %hash_name
? – zdim Nov 11 at 2:59zdim ,Nov 11 at 3:12
@newbie What you show is invalid in Perl: the%
in%hash_name
indicates that the variable is a hash , but{ .. }
form a hash reference , which is a scalar variable (not a hash). So it should be either%hash_name = ( .. )
or it's$hashref_name = { .. }
– zdim Nov 11 at 3:12Schwern ,Nov 11 at 6:59
Note this is an insecure way to store data. The data file must be evaluated as perl code. Any arbitrary code could be in the file. In addition, the data file can only be read by Perl programs. Instead, use JSON or similar data format. JSON::MaybeXS can convert between JSON and Perl. – Schwern Nov 11 at 6:59zdim ,Nov 11 at 3:40
Ideally you'd copy the file so that you can edit it, then turn it into a module so to use it nicely.But if for some reason this isn't feasible here are your options.
If that hash is the only thing in the file , "load" it using do † and assign to a hash
use warnings; use strict; my $file = './read_this.pl'; # the file has *only* that one hash my %hash = do $file;This form of
do
executes the file (runs it as a script), returning the last expression that is evaluated. With only the hash in the file that last expression is the hash definition, precisely what you need.If the hash is undeclared , so a global variable (or declared with
our
), then declare asour
a hash with the same name in your program and again load the file withdo
our %hash_name; # same name as in the file do $file; # file has "%hash" or "our %hash" (not "my %hash")Here we "pick up" the hash that is evaluated as
do
runs the file by virtues of ourIf the hash is "lexical" , declared as
my %hash
(as it should be!) ... well, this is bad. Then you need to parse the text of the file so to extract lines with the hash. This is in general very hard to do, as it amounts to parsing Perl. (A hash can be built usingmap
, returned from a sub as a reference or a flat list ...) Once that is done youeval
the variable which contains the text defining that hash.However, if you know how the hash is built, as you imply, with no
()
anywhere insideuse warnings; use strict; my $file = './read_this.pl'; my $content = do { # "slurp" the file -- read it into a variable local $/; open my $fh, '<', $file or die "Can't open $file: $!"; <$fh>; }; my ($hash_text) = $content =~ /\%hash_name\s*=\s*(\(.*?\)/s; my %hash = eval $hash_text;This simple shot leaves out a lot, assuming squarely that the hash is as shown. Also note that this form of eval carries real and serious security risks.
† Files are also loaded using require . Apart from it doing a lot more than
do
, the important thing here is that even if it runs multiple timesrequire
still loads that file only once . This matters for modules in the first place, which shouldn't be loaded multiple times, and use indeed usesrequire
.On the other hand,
do
does it every time, what makes it suitable for loading files to be used as data, which presumably should be read every time. This is the recommended method. Note thatrequire
itself usesdo
to actually load the file.Thanks to Schwern for a comment.
Schwern ,Nov 11 at 4:31
do
will always load the file.require
will only load it once. Since you want to get data from the file, it's recommended to usedo
. Else the second or third time anything in that process loads the file they'll end up with1
. – Schwern Nov 11 at 4:31zdim ,Nov 11 at 4:41
@Schwern Right, thank you for the comment. I wanted to avoid excessive explanation thus I simply usedo
. (I still mentionrequire
since it is feasible that the data is loaded once.) But it is good to state this, thank you -- I am adding the comment. – zdim Nov 11 at 4:41Schwern ,Nov 11 at 4:57
It's bad practice to userequire
because a future person maintaining the code may alsorequire
the same file elsewhere (not even in the same code file, it's per process) and not realize it has already been required. I'd suggest instead explaining whydo
is the right thing to do here instead ofrequire
, it's a necessary complexity. – Schwern Nov 11 at 4:57zdim ,Nov 11 at 5:40
@Schwern A good point, thank you. Adjusted the post. – zdim Nov 11 at 5:40zdim ,Nov 13 at 2:15
@DavyM Thank you for kind words. You are right, and there are so many such questions that it even seems more common (than otherwise) when people start out. All that we can do is to keep pointing it out, and write it always in full code examples. – zdim Nov 13 at 2:15
![]() |
![]() |
![]() |
Nov 13, 2017 | stackoverflow.com
brian d foy ,Nov 13 at 2:34
Thepack
function puts one or more things together in a single string. It represents things as octets (bytes) in a way that it can unpack reliably in some other program. That program might be far away (like, the distance to Mars far away). It doesn't matter if it starts as something human readable or not. That's not the point.Consider some task where you have a numeric ID that's up to about 65,000 and a string that might be up to six characters.
print pack 'S A6', 137, $ARGV[0];It's easier to see what this is doing if you run it through a hex dumper as you run it:
$ perl pack.pl Snoopy | hexdump -C 00000000 89 00 53 6e 6f 6f 70 79 |..Snoopy|The first column counts the position in the output so ignore that. Then the first two octets represent the
S
(short, 'word', whatever, but two octets) format. I gave it the number 137 and it stored that as 0x8900. Then it stored 'Snoopy' in the next six octets.Now try it with a shorter name:
$ perl test.pl Linus | hexdump -C 00000000 89 00 4c 69 6e 75 73 20 |..Linus |Now there's a space character at the end (0x20). The packed data still has six octets. Try it with a longer name:
$ perl test.pl 'Peppermint Patty' | hexdump -C 00000000 89 00 50 65 70 70 65 72 |..Pepper|Now it truncates the string to fit the six available spaces.
Consider the case where you immediately send this through a socket or some other way of communicating with something else. The thing on the other side knows it's going to get eight octets. It also knows that the first two will be the short and the next six will be the name. Suppose the other side stored that it
$tidy_little_package
. It gets the separate values by unpacking them:my( $id, $name ) = unpack 'S A6', $tidy_little_package;That's the idea. You can represent many values of different types in a binary format that's completely reversible. You send that packed string wherever it needs to be used.
I have many more examples of
pack
in Learning Perl and Programming Perl .
![]() |
![]() |
![]() |
Nov 16, 2017 | stackoverflow.com
Grep Two Dimensional Array Ask Question up vote down vote favorite
Taranasaur ,yesterday
Since this is not a question directly covered here, thought best I ask and answer it.I had an issue where I wanted to add a node name to a list only if the same node doesn't already exist. The array was built using:
push (@fin_nodes, [$node, $hindex, $e->{$hip}->{FREQ}]);So given when given array (@fin_nodes) that looks like:
$VAR1 = [ 'first-node', '4', 3 ]; $VAR2 = [ 'second-node', '1', 3 ]; $VAR3 = [ 'another-node', '1', 5 ]; $VAR4 = [ 'some-node', '0', 5 ];To do a grep on this the following works:
my @match = grep { grep { $_ =~ $node } @$_ } @fin_nodes;So given a $node "second-node" the above statement will return @match as:
$VAR1 = [ 'second-node', '1', 3 ];Sobrique ,yesterday
Why not use a hash instead? – Sobrique yesterdayysth ,yesterday
when dumping an array, do Data::Dumper::Dumper(\@array), not ...(@array). if passed a list, Dumper dumps each element individually, which is not what you want here – ysth yesterday,
I would say "don't" and instead:my %fin_nodes; $fin_nodes{$node} = [$hindex, $e->{$hip}->{FREQ}]);And then you can simply
if ($fin_nodes{$node}) {
Failing that though - you don't need to grep every element, as your node name is always first.
So:
my @matches = grep { $_ -> [0] eq $node } @fin_nodes;
eq
is probably a better choice than=~
here, because the latter will substring match. (And worse, can potentially do some quite unexpected things if you've metacharacters in there, since you're not quoting or escaping them)E.g. in your example - if you look for a node called
"node"
you'll get multiple hits.Note - if you're only looking for one match, you can do something like:
my ( $first_match ) = grep { $_ -> [0] eq $node } @fin_nodes;This will just get you the first result, and the rest will be discarded. (Which isn't too efficient, because
grep
will continue to iterate the whole list).Taranasaur ,yesterday
Your last statement was on point, I only needed one match. Then before pushing a node onto fin_nodes this was enough: "if (!$first_match)" – Taranasaur yesterdayBorodin ,yesterday
@Taranasaur: I think you missed the point of Sobrique's answer. A hash is by far the better choice for this, and you can simply write$fin_nodes{$node} //= [ $hindex, $e->{$hip}{FREQ} ]
and avoid the need for any explicit test altogether. – Borodin yesterdayTaranasaur ,yesterday
@Borodin, no I do get Sobrique's point. The fin_nodes array is being used for a simple list function that another method is already using quite happily in my program. I will at some point go back and create a hash as there might be more attributes I'll need to include in that array/hash – Taranasaur yesterdayysth ,yesterday
"because the latter will substring match" assuming no regex metacharacters; if there are any, it will be even worse – ysth yesterdaySobrique ,yesterday
Good point @ysth I will add that. – Sobrique yesterday
![]() |
![]() |
![]() |
Nov 16, 2017 | perlmonks.com
davido (Archbishop) on Nov 16, 2017 at 05:46 UTC
Re: Generating a range of Unicode charactersCheck out perlop Auto-increment and Auto-decrement for an explanation.
The thing to consider here is that the .. range operator leverages the semantics provided by ++ (auto-increment). The documentation for auto-increment says this:
The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^ a-zA-Z * 0-9 *\z/ , the increment is done as a string, preserving each character within its range, with carry:
print ++($foo = "99"); # prints "100" print ++($foo = "a0"); # prints "a1" print ++($foo = "Az"); # prints "Ba" print ++($foo = "zz"); # prints "aaa" [download]The components of the range you are trying to construct do not meet the criteria for Perl's built-in autoincrement behavior.
However, if you're using Perl 5.26 or newer, and enable unicode_strings you can use the following, as documented in perlop Range Operators .
use charnames "greek"; my @greek_small = map { chr } (ord("\N{alpha}") .. ord("\N{omega}")); [download]Or forgo the \N{charname} lookups and just use the actual ordinal values:
my @chars = map {chr} $ord_first .. $ord_last; [download]
Dave
Your Mother (Chancellor) on Nov 16, 2017 at 06:13 UTC
Re: Generating a range of Unicode charactersIs this what you're after?
perl -CSD -le 'print chr for 0xDF .. 0x0101' [download]Update: I hadn't read all the way down davido 's post. He is making the same suggestion already at the end.
![]() |
![]() |
![]() |
Nov 12, 2017 | stackoverflow.com
pleriche, Nov 12 at 9:52
I have an HTML file containing a 2-column table which I want to parse in order to extract pairs of strings representing the columns. The page layout of the HTML (white space, new lines) is arbitrary, hence I can't parse the file line by line.I recall that you can parse such a thing by slurping the whole file into a string and operating on the entire string, which I'm finding a bit more challenging. I'm trying things like the following:
#!/usr/bin/perl open(FILE, "Glossary") || die "Couldn't open file\n"; @lines = <FILE>; close(FILE); $data = join(' ', @lines); while ($data =~ /<tr>.*(<td>.*<\/td>).*(<td>.*<\/td>).*<\/tr>/g) { print $1, ":", $2, "\n"; }which gives a
null
output. Here's a section of the input file:<table class="wikitable"> <tr> <td><b>Term</b> </td> <td><b>Meaning</b> </td></tr> <tr> <td><span id="0-Day">0-Day</span> </td> <td> <p>See <a href="#Zero_Day">Zero Day</a>. </p> </td>Can someone help me out?
Borodin ,Nov 12 at 21:20
UseHTML::TableExtract
– Borodin Nov 12 at 21:20zdim ,Nov 12 at 21:46
To correct my early comment (removed), while I recommend HTML::TreeBuilder for general parsing of HTML (and there are others), here you indeed wantHTML::TableExtract
. And you do not want to use regex – zdim Nov 12 at 21:46Dave Cross ,2 days ago
You can't parse HTML with a regex – Dave Cross 2 days agoMiguel Prz ,Nov 12 at 10:03
There is a HTML::TableExtract module in CPAN, which simplifies the problem you are trying to solve:use strict; use warnings; use HTML::TableExtract qw(tree); my $te = HTML::TableExtract->new( headers => qw(Term Meaning) ); my $html_file = "Glossary"; $te->parse_file($html_file); my $table = $te->first_table_found; # ...pleriche ,yesterday
Thank you and I'm sure TableExtract is the better way of doing it, but the object of my question was to improve my understanding of how to use regular expressions since they're so central to Perl. Adding gs to the regexpr as someone suggested (since deleted) was the leg-up I needed. – pleriche yesterdayMiguel Prz ,yesterday
I see your point, and it's really important build a solid knowlegment on regexpr. But, like other people have said, it's not a goot idea apply regexpr to parsing html documents – Miguel Prz yesterday,
You already have answers explaining why you shouldn't parse HTML with regexes. And you really shouldn't. But you've asked for an explanation of why your code doesn't work. So here goes...You have two problems in your code. One stops it working and the other stops it working as you expect.
Firstly, you are using
.
in your regex to match any character. But.
doesn't match any character. It matches any character except a newline. And you have newlines in your string. You fix that by adding the/s
option to your match operator (so it has/gs
instead of/s
).With that fix in place, you get a result from your code. Using your test data, I see:
<td><b>Term</b> </td>:<td><b>Meaning</b> </td>Which is correct. But looking at your test data, I wondered why I wasn't getting two results - because of the
/g
. I soon realised it was because your test data is missing the closing</td>
. When I added that, I got this result:<td><span id="0-Day">0-Day</span> </td>:<td> <p>See <a href="#Zero_Day">Zero Day</a>. </p> </td>Ok. It's now finding the second result. But what has happened to the first one? That's the second error in your code.
You have
.*
a few times in your regex. That means "zero or more of any character". But it's the "or more" that is a problem here. By default, Perl regex qualifiers (*
or+
) are greedy. That means they will use up as much of the string as possible. And the first.*
in your regex is eating up a lot of your string. All of it up to the second<tr>
in fact.The solution to that is to make the
.*
non-greedy. And you do that by adding?
to the end. So you can replace all of the.*
with.*?
. Having done that, I get this output:<td><b>Term</b> </td>:<td><b>Meaning</b> </td> <td><span id="0-Day">0-Day</span> </td>:<td> <p>See <a href="#Zero_Day">Zero Day</a>. </p> </td>Which seems correct to me.
So, to summarise:
- By default,
.
doesn't match newlines. To do that, you need/s
.- Beware of greedy qualifiers.
![]() |
![]() |
![]() |
Nov 11, 2017 | stackoverflow.com
newbie ,Nov 11 at 0:27
I have a read-only perl file with a huge hash defined in it. Is there anyway for me to read this perl file and dump out the hash contents?this is basic structure of the hash within the file.
%hash_name = { -files => [ '<some_path>', ], -dirs => [ '<some_path>', '<some_path>', '<some_path>', '<some_path>', '<some_path>', ], };Davy M ,Nov 11 at 0:30
can you not cat the file and redirect it into a one that does have write permissions?cat perl_file_name > new_perl_file_name
– Davy M Nov 11 at 0:30newbie ,Nov 11 at 0:34
yes I did consider that but will go with that approach only if there is no other way to dump the hash without creating a new file. – newbie Nov 11 at 0:34zdim ,Nov 11 at 2:59
@newbie Thank you, and to repeat the question: Does this file have other Perl code or just this hash? Also, is the hash undeclared (just%hash_name
), as you show it, or is it "lexical," so withmy
such as:my %hash_name
? – zdim Nov 11 at 2:59zdim ,Nov 11 at 3:12
@newbie What you show is invalid in Perl: the%
in%hash_name
indicates that the variable is a hash , but{ .. }
form a hash reference , which is a scalar variable (not a hash). So it should be either%hash_name = ( .. )
or it's$hashref_name = { .. }
– zdim Nov 11 at 3:12Schwern ,Nov 11 at 6:59
Note this is an insecure way to store data. The data file must be evaluated as perl code. Any arbitrary code could be in the file. In addition, the data file can only be read by Perl programs. Instead, use JSON or similar data format. JSON::MaybeXS can convert between JSON and Perl. – Schwern Nov 11 at 6:59zdim ,Nov 11 at 3:40
Ideally you'd copy the file so that you can edit it, then turn it into a module so to use it nicely.But if for some reason this isn't feasible here are your options.
If that hash is the only thing in the file , "load" it using do † and assign to a hash
use warnings; use strict; my $file = './read_this.pl'; # the file has *only* that one hash my %hash = do $file;This form of
do
executes the file (runs it as a script), returning the last expression that is evaluated. With only the hash in the file that last expression is the hash definition, precisely what you need.If the hash is undeclared , so a global variable (or declared with
our
), then declare asour
a hash with the same name in your program and again load the file withdo
our %hash_name; # same name as in the file do $file; # file has "%hash" or "our %hash" (not "my %hash")Here we "pick up" the hash that is evaluated as
do
runs the file by virtues of ourIf the hash is "lexical" , declared as
my %hash
(as it should be!) ... well, this is bad. Then you need to parse the text of the file so to extract lines with the hash. This is in general very hard to do, as it amounts to parsing Perl. (A hash can be built usingmap
, returned from a sub as a reference or a flat list ...) Once that is done youeval
the variable which contains the text defining that hash.However, if you know how the hash is built, as you imply, with no
()
anywhere insideuse warnings; use strict; my $file = './read_this.pl'; my $content = do { # "slurp" the file -- read it into a variable local $/; open my $fh, '<', $file or die "Can't open $file: $!"; <$fh>; }; my ($hash_text) = $content =~ /\%hash_name\s*=\s*(\(.*?\)/s; my %hash = eval $hash_text;This simple shot leaves out a lot, assuming squarely that the hash is as shown. Also note that this form of eval carries real and serious security risks.
† Files are also loaded using require . Apart from it doing a lot more than
do
, the important thing here is that even if it runs multiple timesrequire
still loads that file only once . This matters for modules in the first place, which shouldn't be loaded multiple times, and use indeed usesrequire
.On the other hand,
do
does it every time, what makes it suitable for loading files to be used as data, which presumably should be read every time. This is the recommended method. Note thatrequire
itself usesdo
to actually load the file.Thanks to Schwern for a comment.
Schwern ,Nov 11 at 4:31
do
will always load the file.require
will only load it once. Since you want to get data from the file, it's recommended to usedo
. Else the second or third time anything in that process loads the file they'll end up with1
. – Schwern Nov 11 at 4:31zdim ,Nov 11 at 4:41
@Schwern Right, thank you for the comment. I wanted to avoid excessive explanation thus I simply usedo
. (I still mentionrequire
since it is feasible that the data is loaded once.) But it is good to state this, thank you -- I am adding the comment. – zdim Nov 11 at 4:41Schwern ,Nov 11 at 4:57
It's bad practice to userequire
because a future person maintaining the code may alsorequire
the same file elsewhere (not even in the same code file, it's per process) and not realize it has already been required. I'd suggest instead explaining whydo
is the right thing to do here instead ofrequire
, it's a necessary complexity. – Schwern Nov 11 at 4:57zdim ,Nov 11 at 5:40
@Schwern A good point, thank you. Adjusted the post. – zdim Nov 11 at 5:40zdim ,Nov 13 at 2:15
@DavyM Thank you for kind words. You are right, and there are so many such questions that it even seems more common (than otherwise) when people start out. All that we can do is to keep pointing it out, and write it always in full code examples. – zdim Nov 13 at 2:15
![]() |
![]() |
![]() |
Nov 13, 2017 | stackoverflow.com
brian d foy ,Nov 13 at 2:34
Thepack
function puts one or more things together in a single string. It represents things as octets (bytes) in a way that it can unpack reliably in some other program. That program might be far away (like, the distance to Mars far away). It doesn't matter if it starts as something human readable or not. That's not the point.Consider some task where you have a numeric ID that's up to about 65,000 and a string that might be up to six characters.
print pack 'S A6', 137, $ARGV[0];It's easier to see what this is doing if you run it through a hex dumper as you run it:
$ perl pack.pl Snoopy | hexdump -C 00000000 89 00 53 6e 6f 6f 70 79 |..Snoopy|The first column counts the position in the output so ignore that. Then the first two octets represent the
S
(short, 'word', whatever, but two octets) format. I gave it the number 137 and it stored that as 0x8900. Then it stored 'Snoopy' in the next six octets.Now try it with a shorter name:
$ perl test.pl Linus | hexdump -C 00000000 89 00 4c 69 6e 75 73 20 |..Linus |Now there's a space character at the end (0x20). The packed data still has six octets. Try it with a longer name:
$ perl test.pl 'Peppermint Patty' | hexdump -C 00000000 89 00 50 65 70 70 65 72 |..Pepper|Now it truncates the string to fit the six available spaces.
Consider the case where you immediately send this through a socket or some other way of communicating with something else. The thing on the other side knows it's going to get eight octets. It also knows that the first two will be the short and the next six will be the name. Suppose the other side stored that it
$tidy_little_package
. It gets the separate values by unpacking them:my( $id, $name ) = unpack 'S A6', $tidy_little_package;That's the idea. You can represent many values of different types in a binary format that's completely reversible. You send that packed string wherever it needs to be used.
I have many more examples of
pack
in Learning Perl and Programming Perl .
![]() |
![]() |
![]() |
Nov 16, 2017 | stackoverflow.com
Grep Two Dimensional Array Ask Question up vote down vote favorite
Taranasaur ,yesterday
Since this is not a question directly covered here, thought best I ask and answer it.I had an issue where I wanted to add a node name to a list only if the same node doesn't already exist. The array was built using:
push (@fin_nodes, [$node, $hindex, $e->{$hip}->{FREQ}]);So given when given array (@fin_nodes) that looks like:
$VAR1 = [ 'first-node', '4', 3 ]; $VAR2 = [ 'second-node', '1', 3 ]; $VAR3 = [ 'another-node', '1', 5 ]; $VAR4 = [ 'some-node', '0', 5 ];To do a grep on this the following works:
my @match = grep { grep { $_ =~ $node } @$_ } @fin_nodes;So given a $node "second-node" the above statement will return @match as:
$VAR1 = [ 'second-node', '1', 3 ];Sobrique ,yesterday
Why not use a hash instead? – Sobrique yesterdayysth ,yesterday
when dumping an array, do Data::Dumper::Dumper(\@array), not ...(@array). if passed a list, Dumper dumps each element individually, which is not what you want here – ysth yesterday,
I would say "don't" and instead:my %fin_nodes; $fin_nodes{$node} = [$hindex, $e->{$hip}->{FREQ}]);And then you can simply
if ($fin_nodes{$node}) {
Failing that though - you don't need to grep every element, as your node name is always first.
So:
my @matches = grep { $_ -> [0] eq $node } @fin_nodes;
eq
is probably a better choice than=~
here, because the latter will substring match. (And worse, can potentially do some quite unexpected things if you've metacharacters in there, since you're not quoting or escaping them)E.g. in your example - if you look for a node called
"node"
you'll get multiple hits.Note - if you're only looking for one match, you can do something like:
my ( $first_match ) = grep { $_ -> [0] eq $node } @fin_nodes;This will just get you the first result, and the rest will be discarded. (Which isn't too efficient, because
grep
will continue to iterate the whole list).Taranasaur ,yesterday
Your last statement was on point, I only needed one match. Then before pushing a node onto fin_nodes this was enough: "if (!$first_match)" – Taranasaur yesterdayBorodin ,yesterday
@Taranasaur: I think you missed the point of Sobrique's answer. A hash is by far the better choice for this, and you can simply write$fin_nodes{$node} //= [ $hindex, $e->{$hip}{FREQ} ]
and avoid the need for any explicit test altogether. – Borodin yesterdayTaranasaur ,yesterday
@Borodin, no I do get Sobrique's point. The fin_nodes array is being used for a simple list function that another method is already using quite happily in my program. I will at some point go back and create a hash as there might be more attributes I'll need to include in that array/hash – Taranasaur yesterdayysth ,yesterday
"because the latter will substring match" assuming no regex metacharacters; if there are any, it will be even worse – ysth yesterdaySobrique ,yesterday
Good point @ysth I will add that. – Sobrique yesterday
![]() |
![]() |
![]() |
Nov 16, 2017 | perlmonks.com
on Feb 09, 2015 at 13:21 UTC ( # 1116049 = perlquestion : print w/replies , xml ) Need Help?? kzwix has asked for the wisdom of the Perl Monks concerning the following question:
![]() |
![]() |
![]() |
Nov 16, 2017 | stackoverflow.com
,
There are no arrays in your code. And there are no method calls in your code.Your hash is defined incorrectly. You cannot embed hashes inside other hashes. You need to use hash references. Like this:
my %data = ( 'a' => { x => 'Hello', y => 'World' }, 'b' => { x => 'Foo', y => 'Bar' } );Note, I'm using
{ ... }
to define your inner hashes, not( ... )
.That still gives us an error though.
Type of arg 1 to main::p must be hash (not hash element) at passhash line 20, near "})"
If that's unclear, we can always try adding
use diagnostics
to get more details of the error:(F) This function requires the argument in that position to be of a certain type. Arrays must be @NAME or @{EXPR}. Hashes must be %NAME or %{EXPR}. No implicit dereferencing is allowed--use the {EXPR} forms as an explicit dereference. See perlref.
Parameter type definitions come from prototypes. Your prototype is
\%
. People often think that means a hash reference. It doesn't. It means, "give me a real hash in this position and I'll take a reference to it and pass that reference to the subroutine".(See, this is why people say that prototypes shouldn't be used in Perl - they often don't do what you think they do.)
You're not passing a hash. You're passing a hash reference. You can fix it by dereferencing the hash in the subroutine call.
p(%{$data{a}});But that's a really silly idea. Take a hash reference and turn it into a hash, so that Perl can take its reference to pass it into a subroutine.
What you really want to do is to change the prototype to just
$
so the subroutine accepts a hash reference. You can then check that you have a hash reference usingref
.But that's still overkill. People advise against using Perl prototypes for very good reasons. Just remove it
> ,
Your definition of the structure is wrong. Inner hashes need to use{}
, not()
.my %data = ( a => { x => 'Hello', y => 'World' }, b => { x => 'Foo', y => 'Bar' } );Also, to get a single hash element, use
$data{'a'}
(or even$data{a}
), not%data{'a'}
.Moreover, see Why are Perl 5's function prototypes bad? on why not to use prototypes. After correcting the syntax as above, the code works even without the prototype. If you really need the prototype, use
%
, not\%
. But you clearly don't know exactly what purpose prototypes serve, so don't use them.
![]() |
![]() |
![]() |
Nov 16, 2017 | stackoverflow.com
newguy, 2 days ago
I have a function in perl that returns a list. It is my understanding that when foo() is assigned to list a copy is made:sub foo() { return `ping 127.0.0.1` } my @list = foo();That
@list
then needs to be transferred to another list like@oldlist = @list;
and another copy is made. So I was thinking can I just make a reference from the returned list likemy $listref = \foo();
and then I can assign that reference, but that doesn't work.The function I'm working with runs a command that returns a pretty big list (the ping command is just for example purposes) and I have call it often so I want to minimize the copies if possible. what is a good way to deal with that?
zdim ,2 days ago
Make an anonymous array reference of the list that is returnedmy $listref = [ foo() ];But, can you not return an arrayref to start with? That is better in general, too.
What you attempted "takes a reference of a list" ... what one cannot do in the literal sense; lists are "elusive" things , while a reference can be taken
By using the backslash operator on a variable, subroutine, or value.
and a "list" isn't either (with a subroutine we need syntax
\&sub_name
)However, with the
\
operator a reference is taken, either to all elements of the list if in list contextmy @ref_of_LIST = \( 1,2,3 ); #--> @ref_of_LIST: (\1, \2, \3)or to a scalar if in scalar context, which is what happens in your attempt. Since your sub returns a list of values, they are evaluated by the comma operator and discarded, one by one, until the last one. The reference is then taken of that scalar
my $ref_of_LIST = \( 1,2,3 ); #--> $ref_of_LIST: \3As it happens, all this applies without parens as well, with
\foo()
.newguy ,2 days ago
I don't know how to return an array ref from a command that returns a list. Would it be acceptable to do it asreturn [`ping 1.2.3.4`];
– newguy 2 days agozdim ,2 days ago
@newguy Yes, that would be a fine way to do it. Another is to store the command's return in an array variable (say,@ary
) -- if you need it elsewhere in the sub -- and thenreturn \@ary;
– zdim 2 days agonewguy ,2 days ago
Ok thanks. Wouldn't the @ary way create a copy though – newguy 2 days agozdim ,2 days ago
@newguy For one, those elements must be stored somewhere, either anonymously by[ .. ]
or associated with a named variable by@ary = ..
. I don't know whether yet an extra copy is made in order to construct an array, but I'd expect that it isn't When youreturn \@ary
no new copies are made. I would expect that they are about the same. – zdim 2 days agozdim ,2 days ago
@newguy I added an explanation of what happens with\foo()
– zdim 2 days ago
![]() |
![]() |
![]() |
Nov 16, 2017 | stackoverflow.com
The match operator in scalar context evaluates to a boolean that indicates whether the match succeeded or not.
my $success = $user =~ /(\d+)/;The match operator in list context returns the captured strings (or
1
if there are no captures) on success and an empty list on error.my ($num) = $user =~ /(\d+)/;You used the former, but you want the latter. That gives you the following (after a few other small fixes):
sub next_level { my ($user) = @_; my ($num) = $user =~ /(\d+)\z/; $user =~ s/\d+\z//g; $user .= ++$num; return $user; }But that approach is complicated and inefficient. Simpler solution:
sub next_level { my ($user) = @_; $user =~ s/(\d+)\z/ $1 + 1 /e; return $user; }
![]() |
![]() |
![]() |
Nov 16, 2017 | stackoverflow.com
sampath, yesterday
I am trying to remove the old files in a dir if the count is more than 3 over SSHKindly suggest how to resolve the issue.
Please refer the code snippet
#!/usr/bin/perl use strict; use warnings; my $HOME="/opt/app/latest"; my $LIBS="${HOME}/libs"; my $LIBS_BACKUP_DIR="${HOME}/libs_backups"; my $a; my $b; my $c; my $d; my $command =qq(sudo /bin/su - jenkins -c "ssh username\@server 'my $a=ls ${LIBS_BACKUP_DIR} | wc -l;my $b=`$a`;if ($b > 3); { print " Found More than 3 back up files , removing older files..";my $c=ls -tr ${LIBS_BACKUP_DIR} | head -1;my $d=`$c`;print "Old file name $d";}else { print "No of back up files are less then 3 .";} '"); print "$command\n"; system($command);output:
sudo /bin/su - jenkins -c "ssh username@server 'my ; =ls /opt/app/latest/libs_backups | wc -l;my ; =``;if ( > 3); { print " Found More than 3 back up files , removing older files..";my ; =ls -tr /opt/app/latest/libs_backups | head -1;my ; =``;print "Old file name ";}else { print "No of back up files are less then 3 .";} '" Found: -c: line 0: unexpected EOF while looking for matching `'' Found: -c: line 1: syntax error: unexpected end of file
janh ,yesterday
Are you trying to execute parts of your local perl script in an ssh session on a remote server? That will not work. – janh yesterdaysimbabque ,yesterday
Look into Object::Remote. Here is a good talk by the author from the German Perl Workshop 2014. It will essentially let you write Perl code locally, and execute it completely on a remote machine. It doesn't even matter what Perl version you have there. – simbabque yesterdaysimbabque ,yesterday
You should also not use$a
and$b
. They are reserved global variables forsort
. – simbabque yesterdayChris Turner ,yesterday
Why are you sudoing when your command is running on an entirely different server? – Chris Turner yesterdayshawnhcorey ,yesterday
Never putsudo
orsu
in a script. This is security breach. Instead run the script assudo
orsu
. – shawnhcorey yesterdayIf you have three levels of escaping, you're bound to get it wrong if you do it manually. Use String::ShellQuote'sshell_quote
instead.Furthermore, avoid generating code. You're bound to get it wrong! Pass the necessary information using arguments, the environment or some other channel of communication instead.
There were numerous errors in the interior Perl script on top of the fact that you tried to execute a Perl script without actually invoking
perl
!#!/usr/bin/perl use strict; use warnings; use String::ShellQuote qw( shell_quote ); my $HOME = "/opt/app/latest"; my $LIBS = "$HOME/libs"; my $LIBS_BACKUP_DIR = "$HOME/libs_backups"; my $perl_script = <<'__EOI__'; use strict; use warnings; use String::ShellQuote qw( shell_quote ); my ($LIBS_BACKUP_DIR) = @ARGV; my $cmd = shell_quote("ls", "-tr", "--", $LIBS_BACKUP_DIR); chomp( my @files = `$cmd` ); if (@files > 3) { print "Found more than 3 back up files. Removing older files...\n"; print "$_\n" for @files; } else { print "Found three or fewer backup files.\n"; } __EOI__ my $remote_cmd = shell_quote("perl", "-e", $perl_script, "--", $LIBS_BACKUP_DIR); my $ssh_cmd = shell_quote("ssh", 'username@server', "--", $remote_cmd); my $local_cmd = shell_quote("sudo", "su", "-c", $ssh_ccmd); system($local_cmd);
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
Suggestions for working with poor code by Ovid (Cardinal)
on May 10, 2001 at 01:34 UTC ( # 79261 = perlmeditation : print w/replies , xml ) Need Help??
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
Re^2: Swallowing an elephant in 10 easy steps
by ELISHEVA (Prior) on Aug 13, 2009 at 18:27 UTCThe time drivers are the overall quality of the design, ease of access to code and database schemas, and the size of the system: the number of database tables, the complexity of the type/class system(s), the amount of code, and the number of features in whatever subsystem you explore in step 10. Rather than an average, I'll take the most recent example, Perl Monks.
The Perl Monks website has 83 data tables, two main type hierarchies (nodetypes and perl classes), a core engine of about 12K and about 600 additional code units spread throughout the database. Documentation is scattered and mostly out of date.
The initial architecture seems solid but its features have been used inconsistently over time. Accessing the schema and code samples is slow because there is no tarball to download - it has to be done through the web interface or manually cut and pasted into files off line. The database/class assessment (1-4) took about 16 hours. Steps 5-7 took about 30 hours. Steps 8-10 took about 24 hours. All told that is 70 hours, including writing up documentation and formatting it with HTML.
However, I always like to leave myself some breathing space. If I were contracting to learn a system that size, I'd want 90 hours and an opportunity to reassess time schedules after the initial code walk through was complete. If a system is very poorly designed this process takes somewhat longer.
A crucial element in controlling time is controlling the amount of detail needed to gain understanding. It is easy to lose sight of the forest for the trees. That is why I advise stopping and moving onto the next phase once your categories give a place to most design elements and the categories work together to tell story. That is also why I recommend backtracking as needed. Sometimes we make mistakes about which details really matter and which can be temporarily blackboxed. Knowing I can backtrack lets me err on the side of black boxing.
The other element affecting time is, of course, the skill of the analyst or developer. I have the advantage that I have worked both at the coding and the architecture level of software. I doubt I could work that fast if I didn't know how to read code fluently and trace the flow of data through code. Having been exposed to many different system designs over the years also helps - architectural strategies leave telltale footprints and experience helps me pick up on those quickly.
However one can also learn these skills by doing. The more you practice scanning, categorizing and tracing through code and data the better you get at it. It will take longer, but the steps are designed to build on themselves and are, in a way, self-teaching. That is why you can't just do the 10 steps in parallel as jdporter jokingly suggests below.
However some theoretical context and a naturally open mind definitely helps: if you think that database tables should always have a one-to-one relationship with classes you will be very very confused by a system where that isn't true. If I had to delegate this work to someone else I probably would work up a set of reading materials on different design strategies that have been used in the past 30 years. Alternatively or in addition, I might pair an analyst with a programmer so that they could learn from each other (with neither having priority!)
Best, beth
Update: expanded description of the PerlMonks system so that it addresses all of the time drivers mentioned in the first paragaph.
Update: fixed miscalculation of time
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
dave0 (Friar) on Apr 15, 2005 at 15:32 UTC
Re: Analyzing large Perl code base.Having recently done this on a fairly large codebase that grew organically (no design, no refactoring) over the course of four years, I feel your pain.
Writing a testsuite, on any level, is nearly essential for this. If you're rewriting an existing module, you'll need to ensure it's compatible with the old one, and the only sane way to do that is to test. If the old code is monolithic, it might be difficult to test individual units, but don't let that stop you from testing at a higher level.
B::Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph-creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.
Pretty much the same tactic was used on the actual code. Print it out, annotate it away from the computer, and then sit down with the notes to implement the refactoring. If your codebase is huge (mine was about 4-5k lines in several .pl and .pm files, and was still manageable) you might not want to do this, though.
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
Re: Strategies for maintenance of horrible code?
by aufflick (Deacon) on Jul 13, 2006 at 00:17 UTCYou might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also the UML::Sequence sounds intriguing - it pupports to generate a sequence diagram by monitoring code execution.
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
Generating documentation from Perl code (not just POD) by aufflick (Deacon)
on Jul 11, 2006 at 05:15 UTC ( # 560312 = perlquestion : print w/replies , xml ) Need Help?? aufflick has asked for the wisdom of the Perl Monks concerning the following question:
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
knobunc (Pilgrim) on May 10, 2001 at 18:19 UTC
Re: Suggestions for working with poor codeVery cool node.
With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work. However, I have a make rule for todo that searches for all of the lines with TODO in them and prints them out. So a usage of a TODO:
if ($whatever) { # TODO - Finish code to take over the world }[download]Becomes:
To Do List
Dir/file.pl 132: Finish code to take over the world [download]When run through the following (ugly, suboptimal, but working) code in Tools/todo.sh :
#/bin/sh echo 'To Do List' find . -type f | xargs grep -n TODO | perl -ne \ '($file, $line, $rest) += split /:/, $_, 3; $file =~ s|^./||; $rest =~ s|.*?TODO.*?[-\s:]+||; $rest =~ s|"[.;,]\s*$||; $rest =~ s|\\n||g; print "$file $line: \u$rest\n"' | sort | uniq | grep -v '.#' | grep -v Makefile | grep -v CVS[download]Which I call from my Makefile:
todo: Tools/todo.sh [download]Kinda ugly, but it lets me put the TODO statements where I actually need to do the work.
So I can proof out a block of code by writing narrative comments with TODO at the start of the line (behind comment characters of course).
Then fill in the code later and not worry about missing a piece. Also since the TODOs are where the stuff needs to be filled in, I have lots of context around the issue and don't need to write as much as I would if they were at the top of the file. Plus anyone without something to do in the group can just type make todo and add some code. Finally, it is easier to add a TODO right where you need it, than bop up to the top of the file and then have to find where you were back in the code.
-ben
![]() |
![]() |
![]() |
Nov 15, 2017 | my.safaribooksonline.com
Debugging is just an extreme case of dynamic analysis. Third-party code can be extremely convoluted (so can your own code, of course, but you don't usually think of it that way because you're familiar with it; you knew it when it was just a subroutine); sometimes you just can't tell how part of the code fits in, or whether it's called at all. The code is laid out in some arrangement that makes no sense; if only you could see where the program would actually go when it was run.
Well, you can, using Perl's built-in debugger. Even though you're not actually trying to find a bug, the code-tracing ability of the debugger is perfect for the job.
This isn't the place for a full treatment of the debugger (you can see more detail in [ SCOTT01 ]), but fortunately you don't need a full treatment; a subset of the commands is enough for what you need to do. (Using the debugger is like getting in a fight; it's usually over very quickly without using many of the fancy moves you trained for.)
-d
command-line flag; either edit the program to add-d
to the shebang line, or run the program by invoking Perl explicitly:% perl -d program argument argument...
Make sure that the perl in your path is the same one in the shebang line of program or you'll go crazy if there are differences between the two perls.
Basic Debugger Commands
- h h h Brief verbose help (verbose brief help prior to 5.8.0)
- b subroutine Set breakpoint at first executable statement of subroutine
- b line Set breakpoint for line line
- b place condition Set breakpoint for place (either line or subroutine) but trigger it only when the Perl expression condition is true
- с Continue until end of program or breakpoint
- с line Continue until line line, end of program, or earlier breakpoint
- x expression Examine the value of a variable or expression
- n Execute current statement, skipping over any subroutines called from it
- s Execute next Perl statement, going into a subroutine called from the current statement if necessary
- 1 List source code from current line
- r Execute statements until return from current subroutine, end of program, or earlier breakpoint
- T Display stack trace
- q Quit
Armed with these commands, we can go code spelunking. Suppose you are debugging a program containing the following code fragment:
77 for my $url (@url_queue)
78 {
79 my $res = $ua->request($url);
80 summarize($res->content);
81 }and you know that whenever the program gets to the URL
http://www.perlmedic.com/fnord.html
something strange happens in thesummarize()
subroutine. You'd like to check the HTTP::Response object to see if there were any redirects you didn't know about. You start the program under the debugger and type:DB<1> b 80 $url =~ /fnord/
DB<2>The program will run until it has fetched the URL you're interested in, at which point you can examine the response object -- here's an example of what it might look like:
Perl 5.8.0 and later will give you a stack trace anyway if you run a program under the debugger and some code triggers a warning. But suppose you are either running under an earlier perl, or you'd really like to have a debugger prompt at the point the warning was about to happen.
You can combine two advanced features of Perl to do this: pseudo-signal handlers, and programmatic debugger control .
A signal handler is a subroutine you can tell Perl to execute whenever your program receives a signal. For instance, when the user interrupts your program by pressing Control-C, that works by sending an INT signal to your program, which interprets it by default as an instruction to stop executing.
There are two pseudo-signals, called
__WARN__
and__DIE__
. They aren't real signals, but Perl "generates" them whenever it's told to issue a warning or to die, respectively. You can supply code to be run in those events by inserting a subroutine reference in the%SIG
hash (see perlvar ) as follows:$SIG{__WARN__} = sub { print "Ouch, I'm bad" };
(Try it on some code that generates a warning.)
The next piece of the solution is that the debugger can be controlled from within your program; the variable
$single
in the special package DB determines what Perl does at each statement: 0 means keep going, and 1 or 2 mean give a user prompt. 1 So setting$DB::single
to 1 in a pseudo-signal handler will give us a debugger prompt at just the point we wanted.1 . The difference between the two values is that a 1 causes the debugger to act as though the last
n
ors
command the user typed wass
, whereas a 2 is equivalent to ann
. When you type an empty command in the debugger (just hit Return), it repeats whatever the lastn
ors
command was.Putting the pieces together, you can start running the program under the debugger and give the commands:
DB<1> $SIG{__WARN__} = sub { warn @_; $DB::single = 1 }
DB<2>Now the program will breakpoint where it was about to issue a warning, and you can issue a
T
command to see a stack trace, examine data, or do anything else you want. 2 The warning is still printed first.2 . Under some circumstances, the breakpoint might not occur at the actual place of warning: The current routine might return if the statement triggering the warning is the last one being executed in that routine.
Unfortunately, no
__DIE__
pseudo-signal handler will return control to the debugger (evidently death is considered too pressing an engagement to be interrupted). However, you can get a stack trace by calling theconfess()
function in the Carp module:DB<1> use Carp
DB<2> $SIG{__DIE__} = sub { confess (@_) }The output will look something like this:
DB<3>
Insufficient privilege to launch preemptive strike at wargames line
109.
main::__ANON__[(eval 17)[/usr/lib/perl5/5.6.1/
perl5db.pl:1521]:2]('Insufficient privilege to launch preemptive
strike at wargames line 109.^J') called at wargames line 121
main::preemptive('Strike=HASH(0x82069d4)') called at wargames
line 109
main::make_strike('ICBM=HASH(0x820692c)') called at wargames
line 74
main::icbm('Silo_ND') called at wargames line 32
main::wmd('ICBM') called at wargames line 22
main::strike() called at wargames line 11
main::menu() called at wargames line 5
Debugged program terminated. Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.I've often found it amusing that the debugger refers to the program at this point as "debugged."
![]() |
![]() |
![]() |
Nov 15, 2017 | perlmonks.com
Nov 16, 2017 at 02:50 UTC ( # 1203542 = perlquestion : print w/replies , xml ) Need Help??likbez has asked for the wisdom of the Perl Monks concerning the following question:
Looks like in tr function a scalar variable is accepted as the first argument, but is not compiled properly into set of characters
use strict; use warnings; my $str1 = 'abcde'; my $str2 = 'eda'; my $diff1 = 0; eval "\$diff1=\$str1=~tr/$str2//"; print "diff1: $diff1\n"; $ perl foo.pl diff1: 3This produces in perl 5, version 26:
Test 1: strait set diff1=0, diff2=3
Test 2: complement set diff1=5, diff2=2Obviously only the second result in both tests is correct. Looks like only explicitly given first set is correctly compiled. Is this a feature or a bug ?
Athanasius (Chancellor) on Nov 16, 2017 at 03:08 UTC
Re: Strange behaviour of tr function in case the set1 is supplied by a variableHello likbez ,
The transliteration operator tr/SEARCHLIST/REPLACEMENTLIST/ does not interpolate its SEARCHLIST , so in your first example the search list is simply the literal characters , , , , . See Quote and Quote like Operators .
Hope that helps,
Athanasius < contra mundum Iustus alius egestas vitae, eros Piratica,roboticus (Chancellor) on Nov 16, 2017 at 03:08 UTC
Re: Strange behaviour of tr function in case the set1 is supplied by a variablelikbez :
Feature, per the tr docs
Characters may be literals or any of the escape sequences accepted in double-quoted strings. But there is no interpolation, so "$" and "@" are treated as literals.
A hyphen at the beginning or end, or preceded by a backslash is considered a literal. Escape sequence details are in the table near the beginning of this section.
So if you want to use a string to specify the values in a tr statement, you'll probably have to do it via a string eval:
$ cat foo.pl use strict; use warnings;
my $str1 = 'abcde';
my $str2 = 'eda';
my $diff1 = 0;
eval "\$diff1=\$str1=~tr/$str2//";
print "diff1: $diff1\n";
perl foo.pl diff1: 3... roboticus
When your only tool is a hammer, all problems look like your thumb.
Anonymous Monk on Nov 16, 2017 at 03:09 UTC
Re: Strange behaviour of tr function in case the set1 is supplied by a variableLooks like in tr function a scalar variable is accepted as the fist argument, but is not compiled properly into set of characters
:)
you're guessing how tr /// works, you're guessing it works like s/// or m///, but you can't guess , it doesn't work like that, it doesn't interpolate variables, read perldoc -f tr for the details
likbez !!! on Nov 16, 2017 at 04:41 UTC
Re^2: Strange behaviour of tr function in case the set1 is supplied by a variableyou're guessing how tr/// works, you're guessing it works like s/// or m///, but you can't guess , it doesn't work like that, it doesn't interpolate variables, read perldoc -f tr for the detailsHouston, we have a problem ;-)First of all that limits tr area of applicability.
The second, it's not that I am guessing, I just (wrongly) extrapolated regex behavior on tr, as people more often use regex then tr. Funny, but searching my old code and comments in it is clear that I remembered (probably discovered the hard way, not by reading the documentation ;-) this nuance several years ago. Not now. Completely forgotten. Erased from memory. And that tells you something about Perl complexity (actually tr is not that frequently used by most programmers, especially for counting characters).
And that's a real situation, that we face with Perl in other areas too (and not only with Perl): Perl exceeds typical human memory capacity to hold the information about the language. That's why we need "crutches" like strict.
You simply can't remember all the nuances of more then a dozen of string-related built-in functions, can you? You probably can (and should) for index/rindex and substr , but that's about it.
So here are two problems here:
1. Are / / strings uniformly interpreted in the language, or there is a "gotcha" because they are differently interpreted by tr (essentially as a single quoted strings) and regex (as double quoted strings) ?
2. If so, what is the quality of warnings about this gotcha? There is no warning issued, if you use strict and warnings. BTW, it looks like $ can be escaped:
main::(-e:1): 0
DB<5> $_='\$bba\$'
DB<6> tr/\$/?/
DB<7> print $_
\?bba\?
[download]Right now there is zero warnings issued with use strict and use warnings enabled. Looks like this idea of using =~ for tr was not so good, after all. Regular syntax like tr(set1, set2) would be much better. But it's to late to change and now we need warnings to be implemented.
likbez !!! on Nov 16, 2017 at 03:10 UTC
Re: Strange behaviour of tr function in case the set1 is supplied by a variableWith eval statement works correctly. So it looks like $ is treated by tr as a regular symbol and no warnings are issued.
$statement='$diff1=$str1'."=~tr/$str2//;";
eval($statement);
print "With eval: diff1=$diff1\n";
[download]that will produce:
With eval: diff1=3ww (Archbishop) on Nov 16, 2017 at 03:16 UTC
Re: Strange behaviour of tr function in case the set1 is supplied by a variableSame results in AS 5.24 under Win7x64.
Suspected problem might have arisen from lack of strict, warnings. Wrong, same results BUT using both remains a generally good idea.
Also wondered if compiling (with qr/.../ ) might change the outcome. Wrong again, albeit with variant (erroneous) output.
Correct me if I'm wrong, guessing that "strait" is a typo or personal shortening of "straight."
Update: Now that I've seen earlier replies... ouch, pounding forehead into brick wall!