Softpanorama

Introduction to Perl 5.10 for Unix System Administrators

(Perl 5.10 without excessive complexity)

by Dr Nikolai Bezroukov


4.3. HERE string literals and DATA filehandle

Version 0.90


HERE Literals

Suppose you have several lines (say a dozen) that you want to put into some variable (or just print) without any changes, except that variables found in the text should be interpolated. This can be accomplished with so-called HERE documents. A HERE document in Perl is just another notation for a quoted string literal. The type of quotes around the tag that delimits the HERE document determines the type of the HERE literal. In the example below this is a double-quoted string, so interpolation of variables is allowed:

$lang="Perl";
print <<"EOF";
Dear $lang Language Designer,
   It's not the first language that I have learned,
   and I am used to not expecting much from the compiler/interpreter.
   But in this case I was really upset when I discovered
   how difficult it is to catch a syntax error in $lang
   and how fuzzy the $lang diagnostics are.

Yours truly,
Joe User
EOF

Please note that EOF is a closing tag chosen by the programmer: it tells Perl to treat the next line that consists of exactly this tag as the end of the string literal, which usually spans several lines.

Attention: There should be no space at the start of the line that contains the closing tag, and nothing else on that line.
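The effect of the quotes around the tag can be seen in a minimal sketch like the one below. The variable names and the sample text are made up for illustration. A double-quoted tag (or a bare tag such as <<EOF, which behaves the same way) interpolates variables, while a single-quoted tag takes the text literally:

```perl
use strict;
use warnings;

my $lang = "Perl";

# Double-quoted tag: $lang is interpolated.
my $interpolated = <<"EOF";
Dear $lang user
EOF

# Single-quoted tag: the text is taken literally, $lang stays as is.
my $literal = <<'EOF';
Dear $lang user
EOF

print $interpolated;
print $literal;
```

The single-quoted form is handy for text that contains many $ or @ characters, such as shell or awk fragments.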

This is a pretty old and very useful mechanism for incorporating input streams of data into scripts. It was probably first used in OS/360 job control language (JCL) and was later adopted by Unix shells. As with many such old mechanisms, the position of the end tag is crucial. Generally the following rules apply:

  1. There must be no space between << and the end tag, unless the tag is quoted. Space between = and << is harmless.
  2. The statement that contains the << string literal operator is a regular Perl statement: the rest of the statement, including the terminating semicolon, goes on the line where << appears, not after the closing tag.
  3. You can't (easily) have any space in front of the end tag. If you want to indent the text in the here document, you can do this:
#2345678901234567890 
($VAR =<<"EOF") =~ s/^\s+//gm; 
   your text 
   goes here 
EOF

But the EOF mark still needs to start in the very first column. If you really want it to be indented, you'll have to include the spaces in the end tag too, but this is ugly and error-prone:

($quote = <<'   EOF') =~ s/^\s+//gm;
   ... ... ....
   EOF
# ^____  note spaces
 
$quote =~ s/\s*--/\n--/;
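The s/^\s+//gm idiom above removes all leading whitespace, so any relative indentation inside the text is lost. A possible alternative, sketched below, strips only the indentation common to all lines. The helper name dedent and the sample usage text are hypothetical, not part of Perl:

```perl
use strict;
use warnings;

# Hypothetical helper: remove the smallest common indentation from every
# non-blank line, keeping relative indentation intact.
sub dedent {
    my ($text) = @_;
    my $min;
    for my $line (split /\n/, $text) {
        next unless $line =~ /\S/;               # ignore blank lines
        my ($indent) = $line =~ /^([ \t]*)/;
        $min = length $indent if !defined $min || length($indent) < $min;
    }
    $min = 0 unless defined $min;
    $text =~ s/^[ \t]{0,$min}//gm;               # strip at most $min characters
    return $text;
}

my $usage = dedent(<<"EOF");
    Usage: supercopy -a[lr] file
       -a -- all
       -l -- long format
EOF

print $usage;
```

The closing tag still has to start in the first column. Perl 5.26 and later also provide the <<~ form, which allows an indented closing tag, but that is outside the Perl 5.10 scope of this chapter.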

The classic Perl idiom that uses this construct is producing a help screen for a utility in case you discover that either the number of parameters, or some parameter is wrong. The main advantage is that you can now use special characters within the print statement without fear of confusing the print function about where the string ends.

die <<"EOF";
   Usage: $my_util -a[lr]  file 
      -a -- all
      -l -- long format
      -r -- recursive
EOF

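Since the tag is written as "EOF" in double quotes, interpolation happens and $my_util will be replaced with its value. Also please note that the end marker on the closing line is not quoted. Quoting it there would prevent Perl from recognizing the end of the literal, which results in a syntax error.

Besides double-quoted tags and single-quoted tags (which suppress interpolation) there is a third type, the backticked end tag. It is probably the most interesting, as it permits a simple and convenient way to generate and then execute a shell script in Perl. For example: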

use FileHandle;
my $fd = FileHandle->new("/Scripts/test.sh") or die "Cannot open /Scripts/test.sh: $!";
my @my_script = <$fd>; close($fd);
$line =<<`SCRIPT`;
   @my_script
SCRIPT

Here we first read the script /Scripts/test.sh from a file into the array @my_script and then execute it using the <<`SCRIPT` construct. The script can also be generated on the fly and stored in a Perl variable or array.

The output of the script is captured in the variable $line.
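Generating the script on the fly might look like the sketch below. It assumes a Unix system with /bin/sh; the directory list and the variable names are purely illustrative:

```perl
use strict;
use warnings;

# Hypothetical list of directories to check.
my @dirs = ('/', '/tmp');

# Build a small shell script in a Perl variable, one command per line.
my $script = join '', map {
    sprintf "test -d '%s' && echo '%s exists'\n", $_, $_
} @dirs;

# The backticked tag interpolates $script and runs the result in the shell.
my $output = <<`SCRIPT`;
$script
SCRIPT

print $output;
```

Because the text is interpolated before the shell sees it, any value inserted into the script should come from a trusted source.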

DATA filehandle
 __DATA__ and __END__ tokens

The DATA filehandle provides an elegant and simple mechanism to include a small datafile directly in the script. This method is very similar to the one used in JCL on System/360 computers in the old days of punched cards.

Often simple scripts use just one small datafile. In this case you can attach this datafile to the code, and thus process that datafile 'in place'. That permits copying just one file to deploy the utility.

ATTENTION: You do not need to open the DATA filehandle. It is already opened for you and you can use it immediately.

For example, the example above that generates a simple help screen for a utility can be rewritten using __DATA__ filehandle:

$my_util="supercopy";
while ($line = <DATA>) { print $line; }
__DATA__
 Usage $my_util -a[lr]  file
   -a -- all
   -l -- long format
   -r -- recursive

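Warning: text read from DATA is not interpolated, so the example above prints $my_util literally rather than its value.

A special token __END__ can also be used with the DATA special filehandle. In the main script the two tokens behave the same way. The difference shows up in files loaded with require or use: text after __END__ there is not accessible, while text after __DATA__ can be read via the DATA filehandle of the package that was in effect at that point.

Only the first __DATA__ (or __END__) token in a file counts: it marks the logical end of the code, and everything after it, including any later __DATA__ lines, is just data. However, each file can have its own __DATA__ section, so when a program consists of several packages kept in separate module files, each package can have its own DATA filehandle. That can be useful if you have data that you want to bundle with your program and treat as though it were in a file, but you don't want it to be in a different file.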

Use the __DATA__ or __END__ tokens after your program code to mark the start of a data block, which can be read inside your program or module from the DATA filehandle.

Use __DATA__ within a module:

while (<DATA>) {
    # process the line
}
__DATA__
# your data goes here

Similarly, use __END__ within the main program file:

while (<main::DATA>) {
   # process the line
}
__END__
# your data goes here

Both __DATA__ and __END__ indicate the logical end of a module or script before the physical end of file is reached. Text after __DATA__ or __END__ can be read through the per-package DATA filehandle. For example, take the hypothetical module Primes. Text after __DATA__ in Primes.pm can be read from the Primes::DATA filehandle.

__END__ behaves as a synonym for __DATA__ in the main package. Text after __END__ tokens in modules is inaccessible.

Those facilities allow writing self-contained programs that include a small dataset in the same file as the script itself. Without those facilities you need to keep the data in separate files. Often this is used to include documentation with the program. Sometimes it's configuration data or test data that the program was developed with, left lying about in case it ever needs to be recreated.
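As a sketch of the configuration-data case, the script below reads key = value pairs from its own DATA section into a hash. The setting names and values are invented for illustration:

```perl
use strict;
use warnings;

my %config;
while (my $line = <DATA>) {
    chomp $line;
    next if $line =~ /^\s*(?:#|$)/;     # skip comments and blank lines
    # Accept only lines of the form "key = value"; ignore anything else.
    my ($key, $value) = $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ or next;
    $config{$key} = $value;
}

print "$_ = $config{$_}\n" for sort keys %config;

__DATA__
# hypothetical settings bundled with the script
host = localhost
port = 8080
```

DATA is an ordinary filehandle, so it is read sequentially; once the loop reaches end of file there is nothing more to read from it.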

Another trick is to use DATA to find out the current program's or module's size or last modification date. On most systems, the $0 variable contains the name of the running script as it was invoked. On systems where $0 is not usable, you could try the DATA filehandle instead. This can be used to pull in the size, modification date, etc. Put the special token __DATA__ at the end of the file (and maybe a warning not to delete it), and the DATA filehandle will be open on the script file itself.

use POSIX qw(strftime);

$raw_time = (stat(DATA))[9];
$size     = -s DATA;
$kilosize = int($size / 1024) . 'k';

print "<P>Script size is $kilosize\n";
print strftime("<P>Last script update: %c (%Z)\n", localtime($raw_time));

__DATA__
DO NOT REMOVE THE PRECEDING LINE.
Everything else in this file will be ignored.

Macrovariables __FILE__ and __LINE__

The special tokens __FILE__ and __LINE__ are often used in conjunction with input files, including special filehandles like DATA. It's important to understand that they are not variables but compile-time tokens (similar to macros) that are replaced with their values during compilation.

They give information that is similar to the caller() function, but there is a difference: the caller() built-in function reports the place from which the currently executing subroutine was called (one level up the call stack), not the place where the statement itself is located. Below is a test that illustrates this fact:

print __FILE__, " ", __LINE__,
"@{[caller()]}\n";

ATTENTION: Since they are not scalar variables, they are not interpolated inside double quotes. Use concatenation or a list of print arguments instead.
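Both points can be checked with a short sketch. The subroutine name report and the variable names are hypothetical:

```perl
use strict;
use warnings;

sub report {
    my (undef, $file, $line) = caller();   # where report() was CALLED from
    return ($line, __LINE__);              # caller's line vs. this very line
}

my ($call_line, $sub_line) = report();

my $literal  = "at __FILE__ line __LINE__";                 # tokens stay plain text
my $concat   = "at " . __FILE__ . " line " . __LINE__;      # tokens are expanded
my $embedded = "at @{[ __FILE__ ]} line @{[ __LINE__ ]}";   # expanded inside a string

print "report() called from line $call_line, executing at line $sub_line\n";
print "$literal\n$concat\n$embedded\n";
```

The @{[ ... ]} construct interpolates the value of an arbitrary expression, which is why it also works for these tokens.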

A Sample Script

A typical example of HERE document usage is the generation of HTML in CGI scripts. Perl lets a quoted string run over multiple lines within a single print statement; only the closing quote and semicolon tell Perl the statement is finished. For example, using a regular double-quoted literal, a simple HTML page can be generated in the following way:

$language="Perl";
print "
<title>A Hello $language Page</title>
<h1>A Hello $language Page</h1>
<hr>
<p>Hello $language.
<p>It's not the first language that I have learned,
and I am used to not expecting much from the compiler.
<p>But I was pretty upset when I discovered
how difficult it is to catch a syntax error in $language
and how fuzzy the $language diagnostics are.
<p>Yours truly
<br>Joe User
";

HERE documents provide a more convenient way to do the same thing. They interpolate variables like a regular double-quoted string, but double quotes inside the text do not need to be escaped:

   
print <<"EOF";
<title>A Hello $language Page</title>
<h1>A Hello $language Page</h1>
<hr>
<p>Hello $language.
<p>It's not the first language that I have learned,
and I am used to not expecting much from the compiler.
<p>But I was pretty upset when I discovered
how difficult it is to catch a syntax error in $language
and how fuzzy the $language diagnostics are.
<p>Yours truly
<br>Joe User
EOF

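__DATA__ is even more convenient, but text read from DATA is not interpolated. If interpolation of variables is required, it has to be done explicitly, for example by reading the whole text and passing it to the eval function wrapped in a HERE document. Use this only with data you control, because eval will also execute any Perl code embedded in the text:

... ... ...
   $record_delim=$/; $/=undef;   # $/ (not $\) is the input record separator
   $text = <DATA>;               # with $/ undefined this reads the whole DATA section
   $/=$record_delim;
   # evaluate the text as the body of a double-quoted HERE document
   $line = eval "<<\"END_OF_TEXT\";\n$text\nEND_OF_TEXT\n";
   ... ... ...
__DATA__
<title>A Hello $language Page</title>
<h1>A Hello $language Page</h1>
<hr>
<p>Hello $language.
<p>It's not the first language that I have learned,
and I am used to not expecting much from the compiler.
<p>But I was pretty upset when I discovered
how difficult it is to catch a syntax error in $language
and how fuzzy the $language diagnostics are.
<p>Yours truly
<br>Joe User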
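When the text does not need the full power of eval, a more restrictive alternative is to substitute only known placeholders from a lookup table. The sketch below is one possible approach, not the method used above. The helper name fill_template is hypothetical, and a single-quoted HERE document stands in for text that would normally be read from DATA:

```perl
use strict;
use warnings;

# Hypothetical helper: replace $name placeholders using a lookup table only.
# Unknown placeholders are left untouched.
sub fill_template {
    my ($text, $vars) = @_;
    $text =~ s/\$(\w+)/exists $vars->{$1} ? $vars->{$1} : "\$$1"/ge;
    return $text;
}

# A single-quoted HERE document stands in for text read from <DATA>.
my $template = <<'EOF';
<title>A Hello $language Page</title>
<p>Hello $language. Left alone: $unknown
EOF

my $page = fill_template($template, { language => 'Perl' });
print $page;
```

Since no code from the text is ever executed, this variant can be used with text that comes from less trusted sources.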


Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.


Last modified: August, 24, 2020