One-liners 102

More one-line Perl scripts

Teodor Zlatanov
Published on March 12, 2003

This article, as regular readers may have guessed, is the sequel to "One-liners 101," which appeared in a previous installment of "Cultured Perl." The earlier article is required background for the material here, so please read it before you continue.

The goal of this article, as with its predecessor, is to show legible and reusable code, not necessarily the shortest or most efficient version of a program. With that in mind, let's get to the code!

Tom Christiansen's list

Tom Christiansen posted a list of one-liners on Usenet years ago, and that list is still interesting and useful for any Perl programmer. We will look at the more complex one-liners from the list; the full list is available in the file tomc.txt (see Related topics to download this file). The list overlaps slightly with the "One-liners 101" article, and I will try to point out those intersections.

Awk is commonly used for basic tasks such as breaking up text into fields; Perl excels at text manipulation by design. Thus, we come to our first one-liner, intended to add two columns in the text input to the script.

Listing 1: Like awk?

    # add first and penultimate columns
    # NOTE the equivalent awk script:
    # awk '{i = NF - 1; print $1 + $i}'
    perl -lane 'print $F[0] + $F[-2]'

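So what does it do? The magic is in the switches. The -n switch wraps the script in a loop over the input, and the -a switch splits each input line on whitespace into the @F array; the -e switch supplies the statement that goes inside that loop. The -l switch strips the newline from each input line and appends one to everything printed. Leaving the -l line-ending handling aside, the code actually produced is: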

Listing 2: The full Monty

    while (<>)
    {
     @F = split(' ');
     print $F[0] + $F[-2]; # offset -2 means "2nd to last element of the array"
    }

Another common task is to print the contents of a file between two markers or between two line numbers.

Listing 3: Printing a range of lines

    # 1. just lines 15 to 17
    perl -ne 'print if 15 .. 17'

    # 2. just lines NOT between line 10 and 20
    perl -ne 'print unless 10 .. 20'

    # 3. lines between START and END
    perl -ne 'print if /^START$/ .. /^END$/'

    # 4. lines NOT between START and END
    perl -ne 'print unless /^START$/ .. /^END$/'

A problem with the first one-liner in Listing 3 is that it will go through the whole file, even if the necessary range has already been covered. For the third one-liner, reading the whole file is not wasted effort, because it prints the lines between every pair of START and END markers (the marker lines included). If there are eight sets of START/END markers, the third one-liner will print the lines inside all eight sets.

Preventing the inefficiency of the first one-liner is easy: just use the $. variable, which holds the current line number. Start printing once $. reaches 15, and exit as soon as line 17 has been printed.

Listing 4: Printing a numeric range of lines more efficiently

    # just lines 15 to 17, efficiently
    perl -ne 'print if $. >= 15; exit if $. >= 17;'

Enough printing, let's do some editing. Needless to say, if you are experimenting with one-liners, especially ones intended to modify data, you should keep backups. You wouldn't be the first programmer to think a minor modification couldn't possibly make a difference to a one-liner program; just don't make that assumption while editing the Sendmail configuration or your mailbox.

Listing 5: In-place editing

    # 1. in-place edit of *.c files changing all foo to bar
    perl -p -i.bak -e 's/\bfoo\b/bar/g' *.c

    # 2. delete first 10 lines
    perl -i.old -ne 'print unless 1 .. 10' foo.txt

    # 3. change all the isolated oldvar occurrences to newvar
    perl -i.old -pe 's{\boldvar\b}{newvar}g' *.[chy]

    # 4. increment all numbers found in these files
    perl -i.tiny -pe 's/(\d+)/ 1 + $1 /ge' file1 file2 ....

    # 5. delete the lines from START through END, keeping everything else
    perl -i.old -ne 'print unless /^START$/ .. /^END$/' foo.txt

    # 6. binary edit (careful!)
    perl -i.bak -pe 's/Mozilla/Slopoke/g' /usr/local/bin/netscape

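Why does 1 .. 10 specify line numbers 1 through 10? Read the "perldoc perlop" manual page. Basically, in scalar context the .. operator is a "flip-flop" that becomes true when its left operand is true and stays true until its right operand is true. When an operand is a constant, it is compared with $., the current input line number. Thus, 1 .. 10 is shorthand for ($. == 1) .. ($. == 10), evaluated on each pass through the loop generated by the -n switch (see "perldoc perlrun" and Listing 2 for an example of that loop).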

The magic of the -i switch is that it replaces each file in @ARGV with the version produced by the script's output on that file. Thus, the -i switch makes Perl into an editing text filter. Do not forget to use the backup option to the -i switch: following the -i with an extension (as in -i.bak) keeps a copy of the original file under its name plus that extension.

Note how the -p and -n switches are used. The -n switch is used when you want to print data explicitly. The -p switch implicitly inserts a print $_ statement in the loop produced by the -n switch. Thus, the -p switch is better for full processing of a file, while the -n switch is better for selective file processing, where only specific data needs to be printed.

Examples of in-place editing can also be found in the "One-liners 101" article.

Reversing the contents of a file is not a common task, but the following one-liners show that the -n and -p switches are not always the best choice when processing an entire file.

Listing 6: Reversal of files' fortunes

    # 1. command-line that reverses the whole input by lines
    # (printing the lines in reverse order)
    perl -e 'print reverse <>' file1 file2 file3 ....

    # 2. command-line that shows each line with its characters backwards
    perl -nle 'print scalar reverse $_' file1 file2 file3 ....

    # 3. find palindromes in the /usr/dict/words dictionary file
    perl -lne '$_ = lc $_; print if $_ eq reverse' /usr/dict/words

    # 4. command-line that reverses all the bytes in a file
    perl -0777e 'print scalar reverse <>' f1 f2 f3 ...

    # 5. command-line that prints the paragraphs in reverse order
    # (the text inside each paragraph is unchanged)
    perl -00 -e 'print reverse <>' file1 file2 file3 ....

The -0 (zero) flag is very useful if you want to read a full paragraph or a full file into a single string: -00 selects paragraph mode, and -0777 reads each file whole. (Followed by any other octal character code, it makes that character the input record separator, so you can use a special character as a marker.) Be careful when reading a full file in one command ( -0777 ), because a very large file can exhaust your memory. If you need to read the contents of a file backwards (for instance, to analyze a log in reverse order), use the CPAN module File::ReadBackwards. Also see "One-liners 101," which shows an example of log analysis with File::ReadBackwards.

The first and second scripts in Listing 6 look similar, but they do completely different things. The difference lies in context. The first script reads <> in list context, so reverse receives the list of all lines and returns them in the opposite order. In the second script, -n reads <> in scalar context, one line at a time, and scalar reverse turns the characters of that one line around.

The third script, the palindrome detector, did not originally have the $_ = lc $_; segment. I added that to catch palindromes like "Bob," which are not identical backwards unless case is ignored.

My addition can be written as $_ = lc; as well, but explicitly stating the subject of the lc() function makes the one-liner more legible, in my opinion.

Paul Joslin's list

Paul Joslin was kind enough to send me some of his one-liners for this article.

Listing 7: Rewrite with a random number

    # replace string XYZ with a random number less than 611 in these files
    perl -i.bak -pe "s/XYZ/int rand(611)/e" f1 f2 f3

This is a filter that replaces XYZ with a random number less than 611 (that number is arbitrarily chosen). Remember that the rand() function returns a random number greater than or equal to 0 and less than its argument, so int rand(611) yields an integer from 0 to 610.

Note that XYZ will be replaced by a freshly generated random number every time, because the /e modifier evaluates "int rand(611)" anew for each substitution. Without the /g modifier, only the first XYZ on each line is replaced.

Listing 8: Revealing the files' base nature

    # 1. Run basename on contents of file
    perl -pe "s@.*/@@gio" INDEX

    # 2. Run dirname on contents of file
    perl -pe 's@^(.*/)[^/]+@$1\n@' INDEX

    # 3. Run basename on contents of file
    perl -MFile::Basename -lne 'print basename $_' INDEX

    # 4. Run dirname on contents of file (-l restores the newline that dirname drops)
    perl -MFile::Basename -lne 'print dirname $_' INDEX

One-liners 1 and 2 came from Paul, while 3 and 4 were my rewrites of them with the File::Basename module. Their purpose is simple, but any system administrator will find these one-liners useful.

Listing 9: Moving or renaming, it's all the same in UNIX

    # 1. write command to mv dirs XYZ_asd to Asd
    # (you may have to preface each '!' with a '\' depending on your shell)
    ls | perl -pe 's!([^_]+)_(.)(.*)!mv $1_$2$3 \u$2\E$3!gio'

    # 2. Write a shell script to move input from xyz to Xyz
    ls | perl -ne 'chomp; printf "mv $_ %s\n", ucfirst $_;'

For regular users or system administrators, renaming files based on a pattern is a very common task. The scripts above do not rename anything themselves; they print mv commands that you can review and then feed to a shell. The first drops the file name portion up to and including the _ character and uppercases the first letter of what remains; the second changes each file name so that its first letter is uppercased according to the Perl ucfirst() function.

There is a UNIX utility called "mmv" by Vladimir Lanin that may also be of interest. It allows you to rename files based on simple patterns, and it's surprisingly powerful. See the Related topics section for a link to this utility.

Some of mine

The following is not a one-liner, but it's a pretty useful script that started as a one-liner. It is similar to Listing 7 in that it replaces a fixed string, but the trick is that the replacement itself for the fixed string becomes the fixed string the next time.

The idea came from a newsgroup posting a long time ago, but I haven't been able to find the original version. The script is useful in case you need to replace one IP address with another in all your system files -- for instance, if your default router has changed. The script includes $0 (in UNIX, usually the name of the script) in the list of files to rewrite, so the address it searches for is itself updated on every run.

As a one-liner it ultimately proved too complex, and the messages regarding what is about to be executed are necessary when system files are going to be modified.

Listing 10: Replace one IP address with another one

    #!/usr/bin/perl -w

    use Regexp::Common qw/net/; # provides the regular expressions for IP matching

    my $replacement = shift @ARGV; # get the new IP address

    die "You must provide $0 with a replacement string for the IP 111.111.111.111"
     unless $replacement;

    # we require that $replacement be JUST a valid IP address
    die "Invalid IP address provided: [$replacement]"
     unless $replacement =~ m/^$RE{net}{IPv4}$/;

    # replace the string in each file
    foreach my $file ($0, qw[/etc/hosts /etc/defaultrouter /etc/ethers], @ARGV)
    {
     # note that we know $replacement is a valid IP address, so this is
     # not a dangerous invocation
     # (the dots in the pattern are unescaped and match any character; escaping
     # them would keep the script from matching its own copy of the address)
     my $command = "perl -p -i.bak -e 's/111.111.111.111/$replacement/g' $file";

     print "Executing [$command]\n";
     system($command);
    }

Note the use of the Regexp::Common module, an indispensable resource for any Perl programmer today. Without Regexp::Common, you will be wasting a lot of time trying to match a number or other common patterns manually, and you're likely to get it wrong.

Conclusion

Thanks to Paul Joslin for sending me his list of one-liners. And in the spirit of conciseness that one-liners inspire, I'll refer you to "One-liners 101" for some closing thoughts on one-line Perl scripts.