Softpanorama
May the source be with you, but remember the KISS principle ;-)
Often Perl claims to be efficient in the scarce resource of the programmer's time. It isn't often that people tune scripts for optimum performance. Are there a few tips you can give to new Perl programmers on how to squeak out a little better runtime performance?
LR: 'Script' ne 'program' again. When dealing with upward-scalable data sets, performance becomes important. I proposed a tutorial to Perl Conference 4.0 this year on this subject, but unfortunately it wasn't accepted.
An Interview with HP's Larry Rosler
[Nov 14, 2000] Infinite lists in Perl ... data out of this data structure, and it might never run ... the stream computes the data just as they're needed ... is more like a linked list, which means that it ...
www.plover.com/~mjd/perl/Stream/article
[Nov 11, 2000] The OutRider Computing Journal: Creating a Log Class in Perl
"One recurrent theme in my job as a database administrator/assistant systems administrator/systems analyst is the need to keep track of what happened on the systems while I wasn't watching. What did the cron job do last night? What did all those spooler daemons do while I was at lunch? In other words, logging. It bothered me that there was a lack of simple tools for doing such a simple, redundant job. So, I set out to build some myself. My systems programming tool of choice is Perl, so that is the language I chose for the project. This journey took me out of my normal routine of straight-line Perl programming and dumped me in the land of Modules and Object Oriented Perl. I'm glad to say it didn't overwhelm me and in fact I found it rather easy to write."
"My first order of business was to take my old standby logging routines and objectify them. I had several concise routines that I would either import into the main package through a use statement or just simply copy/paste depending on my mood and what I was doing. They consisted of four routines: start_logging, stop_logging, restart_logging and log."
"This was quick and dirty code that, while it did the job, was not very simple to use. For example, if I needed to redirect the output of a sub-process to the log file I would have to say stop_logging(), then run that process and redirect its output to the log, then restart_logging() again. It was rather clumsy and difficult to document. So, I set about to rewrite the routines in an object oriented manner. I followed the 'Three little rules' as formulated by Larry Wall in the perlobj(1) man page and restated by Damian Conway in his book "Object Oriented Perl"..."
[Nov 1, 2000] www.perl.com - Critique of the Perl 6 RFC Process
[Oct. 14, 2000] The Linux Gurus: Automating FTP: Part Two
A few weeks ago I wrote an article on how to automate FTP via the .netrc file. I received lots of excellent feedback concerning the article, most of it telling me about other ways to automate FTP. I had aimed that article at the non-programming audience. I wanted to demonstrate a very simple way to automate your FTP logins and to automatically perform certain FTP tasks. Because the volume of mail was so high I decided to write a second part to the article to detail two other ways to automate FTP. For very simple tasks the .netrc file is probably the easiest way to go. However, for more complex tasks, or tasks needing greater error checking and flexibility, you will probably want to use one of the methods outlined below. This article is not meant to be a definitive reference on any of the material covered; its primary goal is to familiarize you with the methods and give you further references to learn more. Remember, doing is the best way to learn.
Using Perl's Net::FTP
I have to admit, I am a PERL bigot. I love PERL and do much of my programming in PERL. Now, notice I said bigot and not god: just because I like it does not mean I am good at it. I would say I write the world's worst PERL code. If you're wanting to do FTP with PERL then the Net::FTP module should be your first choice. Net::FTP is part of PERL's libnet package and is probably already installed on your system. In the really off chance that it is not, you can head over to CPAN and get it along with easy to understand installation instructions. Net::FTP allows us to use familiar FTP commands via PERL's object oriented syntax. In order to use Net::FTP we simply have to place a use Net::FTP statement at the start of our program and make a Net::FTP object. Let us say we wanted to upload the file dailyreport.txt to someserver.com into the directory /reports every day under the username of someuser with a password of foo. This is how we would do it with Net::FTP:
#!/usr/bin/perl
use warnings;
use Net::FTP;

# Connect, log in, change to the target directory and upload the file.
my $ftp = Net::FTP->new("ftp.someserver.com") or die "Could not connect: $@\n";
$ftp->login("someuser","foo");
$ftp->cwd("/reports");
$ftp->ascii;
$ftp->put("dailyreport.txt");
$ftp->quit;

Could it get any easier than this? The general syntax is ftpobject->ftpcommand(parameters). You will notice that all of the statements that do the actual work are done through standard FTP commands, and that is the beauty of Net::FTP: there is nothing new to learn. If you know rudimentary PERL and FTP you can use Net::FTP. Here is a short table with some of the Net::FTP methods.
Method                          Use
login($username,$password)      Will log you into the ftp server. Use anonymous without a password to login anonymously
cwd($dir)                       Changes the current directory on the server
cdup()                          Changes the current directory on the server to one directory level above the current directory
pwd()                           Returns the current working directory
ls()                            Gets the current directory contents
get($remotename, $localname)    Retrieves file $remotename and stores it in $localname
put($localname, $remotename)    Stores file $localname to the ftp server as $remotename
delete($filename)               Removes the file $filename from the server
rmdir($dirname)                 Removes the directory $dirname from the server
type($type)                     Switch to either binary or ascii transfer mode
ascii()                         Switch to ascii transfer mode
binary()                        Switch to binary transfer mode
quit()                          Terminate the FTP connection

Pretty simple stuff. This is not all there is to Net::FTP; you can find more information by reading the Net::FTP man page - man Net::FTP
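For the "greater error checking" mentioned at the start of the article, each of these calls can be checked as it is made. The following is only a minimal sketch, not code from the article: upload_file is a hypothetical helper name, and it takes an already-connected object so that nothing is assumed about how the connection was opened. The methods it calls (login, cwd, ascii, put, quit, and message, which returns the server's last reply) are Net::FTP's own.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Net::FTP): log in, change directory,
# upload one file in ascii mode, and die with the server's reply on failure.
# $ftp is expected to be a connected Net::FTP object, for example:
#   use Net::FTP;
#   my $ftp = Net::FTP->new("ftp.someserver.com")
#       or die "Could not connect: $@\n";
sub upload_file {
    my ($ftp, %arg) = @_;

    $ftp->login($arg{user}, $arg{password})
        or die "Login failed: " . $ftp->message;
    $ftp->cwd($arg{dir})
        or die "Cannot change to $arg{dir}: " . $ftp->message;
    $ftp->ascii;

    # put() returns the remote file name on success.
    my $remote = $ftp->put($arg{file})
        or die "Upload of $arg{file} failed: " . $ftp->message;

    $ftp->quit;
    return $remote;
}
```

With a real connection this would be called as upload_file($ftp, user => "someuser", password => "foo", dir => "/reports", file => "dailyreport.txt"). Dying on the first failed step keeps a bad login or a missing directory from being silently followed by an upload attempt.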
Ncftp
Another simple choice we have is to use ncftp which once again is probably already installed on your system. Ncftp gives us a simple command line based interface to ftp. Note that ncftp can also be used as a shell just as regular ftp can. The two commands we are mainly concerned with are ncftpget and ncftpput. As you can probably guess these two commands are used to send and retrieve files from a specified ftp server. We will look at ncftpget first.
Ncftpget allows us to retrieve files from a remote server with a single command line. Try it out, get the Sim City 3000 demo from Loki with this command:
ncftpget ftp://ftp.lokigames.com/pub/demos/sc3u/sc3u-demo-x86.run
If you run the above command you should connect to the ftp server and begin downloading the Sim City 3000 demo and get a status line telling you how much you have downloaded and an estimated time left. Be forewarned though, this demo is quite large (170+MB) so you may not want to execute this unless you have good bandwidth (or a lot of time!). This is all well and good, but what if we want to log in as a specific user. Well, that is easy, with command line switches. For example the command:
ncftpget -u someuser -p yourpassword ftp.someserver.net . '/pub/README.txt'
This would log in to ftp.someserver.net with a username of someuser and a password of yourpassword and retrieve the file /pub/README.txt from that server. You can get a complete list of command line options by typing ncftpget by itself on the command line (and then pressing enter of course). You can also read the ncftpget man page for even more detailed information.
Well, after having briefly covered ncftpget we can move on to ncftpput which (big surprise) works exactly the same way. So, to put a file called README.txt to someserver.net with our username and password we could use:
ncftpput -a -u someuser -p yourpassword ftp.someserver.net '/pub' 'README.txt'
Notice the -a switch to tell ftp that this is an ascii file. Also note that the remote directory (/pub) comes first, followed by the local file to upload. The interface into ncftpput is almost identical and the command line switches are pretty much the same. Once again, type ncftpput by itself to get a list of switches and you can consult the ncftpput man page with man ncftpput.
The End
I hope this helps someone out. As always in Linux, there are many other ways to perform this same task. You can check out Scurry, a neat package written by Doug Muth. To quote the README file that comes with it: "This is a Perl script designed to automate FTP transfers, and transfers using Secure Copy (scp) which comes with ssh versions 1 and 2." It looks to be a pretty slick package although I have not had time to fully try it out. You may also want to check out Expect, which is a tool to automate nearly any task. Again, to quote the Expect website: "Expect is a tool for automating interactive applications such as telnet, ftp, passwd, fsck, rlogin, tip, etc. Expect really makes this stuff trivial. Expect is also useful for testing these same applications. And by adding Tk, you can also wrap interactive applications in X11 GUIs."
[Oct. 14, 2000] Apache Today - E-Commerce Solutions Template-Driven Pages, Part 2
Perl is ideally suited to reading in entire files, doing a bit of processing, and then printing them out again. I have for years now used the same basic function for reproducing templates in Perl. The function looks like this:
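sub parse_template
{
    my ($template, %subs) = @_;

    # Report a missing template on the page, then return nothing.
    my $fh;
    unless (open($fh, '<', $template))
    {
        print "I tried to load $template<br>\n";
        return '';
    }

    # Slurp the whole file: with $/ undefined, <$fh> reads to end of file.
    my $text;
    {
        local $/;
        $text = <$fh>;
    }
    close($fh);

    # Replace every %%key%% with its value; \Q...\E protects any
    # regular expression metacharacters in the key.
    foreach my $sub (sort keys %subs)
    {
        $text =~ s/%%\Q$sub\E%%/$subs{$sub}/g;
    }
    return $text;
}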
There are a few points to note about this function before we look at how best to use it. First and foremost, you'll notice that we load the entire template file into memory. This is because we want to process the file in its entirety. The second point is that we don't actually print the template from within the function, instead, we return the translated text to the caller. This is just in case we want to use the template for something other than an active HTML page generated by a CGI script. We could use the same function to introduce templates into a static HTML file, whilst still allowing us to reproduce and parse the template in the process.
The third point is just a small nicety. If the file that's been selected doesn't exist, we print a little message to say that there's been an error. We could equally return nothing, but I prefer to be able to spot the problem. In production systems, I've actually used an SSI type error message, and also taken the time to mail an error message to the webmaster to highlight a possible problem.
Now for the important part. The second half of the function actually processes the template so that we can embed elements into the templates that can be replaced on the fly. We replace strings of the form %%string%% by using a hash which we supply to the function. The key of the hash is the string, and the value is the replacement text. For example, take the simple template:
<title>%%title%%</title>
Using the function above we can print out the template using:
print parse_template('template','title' => 'This is the title text');
This will produce the desired:
<title>This is the title text</title>
You can create as many templates as you like, and have as many different replacement strings as you like. It'll also replace the same string a number of times, useful if you want the page title, and the title displayed within the page to be the same.
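The replacement step can also be tried on a string held in memory, without a template file. The sketch below is a variation under stated assumptions rather than the article's function: fill_template is a hypothetical name, it does the replacement in a single pass with the /e modifier instead of looping over the keys, and it assumes that keys consist of word characters only.

```perl
use strict;
use warnings;

# Hypothetical helper: the same %%key%% replacement, applied to a string.
sub fill_template {
    my ($text, %subs) = @_;

    # One pass over the text: each %%key%% is looked up in %subs;
    # markers with no matching key are left unchanged.
    $text =~ s/%%(\w+)%%/exists $subs{$1} ? $subs{$1} : "%%$1%%"/ge;
    return $text;
}

# The same key may appear several times, e.g. in the title and in a heading.
my $page = fill_template(
    '<title>%%title%%</title><h1>%%title%%</h1>',
    title => 'This is the title text',
);
```

Because the text is scanned only once, a replacement value that itself happens to contain a %%marker%% is not expanded again, which the key-by-key loop does not guarantee.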
There is of course a little problem with this, in that in order for this to work, you need to have a different set of templates that support the %%text%% construct. So, the final trick is to change the way in which we search for the matching string that we want to replace. Instead of using %%text%%, you use a standard SSI construct, using a comment to encapsulate the text to be replaced. For example, we could have a template with:
<font size=+2><b><!--#include perltext=title --></b></font>
Now if you use the template as an SSI include in another document, the 'replacement' text will be ignored, because the SSI system will treat it as a comment. But when parsed by an updated version of our function, the 'title' gets replaced with the desired text.
All you have to do is modify the function to replace the quoted string. I've included the full version of that function below:
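sub parse_template
{
    my ($template, %subs) = @_;

    # Report a missing template on the page, then return nothing.
    my $fh;
    unless (open($fh, '<', $template))
    {
        print "I tried to load $template<br>\n";
        return '';
    }

    # Slurp the whole file: with $/ undefined, <$fh> reads to end of file.
    my $text;
    {
        local $/;
        $text = <$fh>;
    }
    close($fh);

    foreach my $sub (sort keys %subs)
    {
        # Build the SSI-style marker and escape it for use in a pattern.
        my $search = quotemeta('<!--#include perltext=' . $sub . ' -->');
        $text =~ s/$search/$subs{$sub}/g;
    }
    return $text;
}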
I've used quotemeta here to make sure that the whole string is suitable for use as a search string in the regular expression.
[Oct. 07, 2000] www.perl.com - Ilya Regularly Expresses by Joe Johnston. Interview with Dr. Ilya Zakharevich, known as a long-time contributor to the Perl5 Porters mailing list and for being a regular expression wizard extraordinaire (regex enthusiasts are IMHO very suspect ;-)
Many people know your extensive work with Perl's regular expressions. What is the most common misunderstanding new programmers have about this pattern-matching language?
IZ: I do not remember. For me, the beginner stage was so long ago, and I try to avoid questions on c.l.p.misc which many posters have enough expertise to answer. Let me guess.
Perl's regular expressions are modeled (eventually) after command-line parameters to grep and other similar utilities. In the command-line world, everything is a string. Bingo: Perl regular expressions look like strings. (Let us forget for a moment that operators qq() etc. were introduced to make strings look like regular expressions ... .)
We have a language with binary operators (for example, `|', `{4}', or `' - this was concatenation), unary operators (`[]', `[^]', `(?!)', '+' - both postfix and aroundfix), grouping (`(?:)'), keywords (`\w', `^'), ternary ('{3,7}'), naming (`()') etc. All of this is packed into a string. No wonder that even inherently unreadable languages like Tcl or Lisp start looking like Dr. Seuss compared to regular expressions.
Additionally, newcomers do not understand that one needs to break a regular expression into tokens (not mentioning how to do it!), all these rules about what is special when backslashed, what is special when not backslashed and so on. To add insult to injury, m in m// is optional, but s in s/// is not, //x would require you to go into "gory details," some switches in //ioxmsg apply to regular expressions, some to the operator in which the regular expression appears, print /foo/, 'bar' is applied to $_, but split /foo/, 'bar' is not etc., etc., etc.
//x was introduced as a clever hack around the problem of "packing a language into a string," but it went only a small part of the way to make things more maintainable. Languages like SNOBOL introduced COBOL-style patterns, which swing into the opposite end of the scale: things become less readable due to the sheer size of patterns for any nontrivial task.
Regular expressions are extremely powerful tools, they are the functional-programming oasis inside a procedural language, and, when well understood, they map wonderfully into the problem domain they address. Making them into eye candy is not impossible, but requires a lot of work (and probably significant changes in the current mindsets).
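Two of the points in this answer can be seen in a few lines. The following is an illustrative sketch only; the date pattern and the variable names are made up for the example and do not come from the interview. It shows one pattern written packed and then tokenized with //x, and the difference between a bare match, which is applied to $_, and split, which is applied to its second argument.

```perl
use strict;
use warnings;

# The same date pattern twice: first packed into a string-like form...
my $terse = qr/^\s*(\d{4})-(\d{2})-(\d{2})\s*$/;

# ...then split into tokens with //x, which ignores whitespace and
# allows comments inside the pattern.
my $spaced = qr/
    ^ \s*
    (\d{4}) -     # year
    (\d{2}) -     # month
    (\d{2})       # day
    \s* $
/x;

my @from_terse  = ' 2000-10-07 ' =~ $terse;
my @from_spaced = ' 2000-10-07 ' =~ $spaced;

# A bare match (the m is optional) is applied to $_ ...
$_ = 'some foo here';
my @match_args = (/foo/, 'bar');       # the match succeeds and yields 1

# ...while split applies the pattern to its second argument.
my @split_parts = split /foo/, 'bar';  # 'bar' has no "foo" in it
```

Both patterns capture the same three fields; the //x form is longer but each token can be read, and commented, on its own.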
For those of us who use the Beast Emacs, you have provided the outstanding cperl-mode syntax highlighter and indenting system. What was so bad about the traditional perl-mode that made you want an alternative?
IZ: Again, I do not remember the details. But I did not invent the alternative, I just adopted an existing branch. Here's my attempt to reconstruct how it did happen (but it may be a false memory): At the time I grabbed cperl-mode.el v1.3 (by Bob Olson) from gnu.emacs.source, perl-mode.el was handling about 30% of constructs, while cperl-mode.el was handling 60%. Additionally, electric constructs were decreasing the irritation factor a lot. This was what I started with. Bob named and coded cperl-mode.el similarly to the difference between c-mode.el and cc-mode.el.
Being locked into Emacs, being used to the (extremely high) standards of good DOS programmers' editors, and having a very low irritation threshold for bookkeeping-related repetitive tasks got me some minimal experience with Emacs Lisp (I needed several years to make my Emacs config tolerable). So when facing a problem with the existing cperl-mode.el, I would try to fix it instead of working around it.
While not time-efficient, this was bringing this warm fuzzy feeling of improving the universe instead of just increasing its entropy. So it went and went, with additional fuel supplied by annoyed/pleased/patchy users around the world.
What first attracted you to Perl?
IZ: Oh, this is easy to answer: command.com. If all you have are scissors, everything starts looking like a nail. So you learn to deal with everything using your scissors. I remember my impression when I printed out the documentation for 4DOS/4OS2: Wow, these guys thought of everything! I may replace half of the tiny utilities I need with this one program!
Then I saw a "go" script for running LaTeX/BibTeX until successful completion. It required an additional program, perl.exe, which was not exactly tiny (around 200K), but obviously demonstrated quite enough bang for a K. The manpage for this program had a few kilolines, was very well written, so was easy to grasp. Browsing "Programming Perl" did not hurt, either. (It took a lot of time to understand that the title is a false advertisement). Using this program for intelligent format conversion between bibliographical databases and BibTeX proved to be a success, including a chain of regular expressions like:
elsif (/^\s*No\.\s+([-\d\/]+(\s*\([-\/\d]+\))?)\s+(pp\.\s*)?([-\d]+)(\s*pp\.?)?(\s*\((\d{4})\))\s*$/i) {
elsif (/^\s*(pp\.\s*)?(([ivxlcd]+\+)?([-\d]+)|((\w)\d+-+\6\d+))(\s*pp\.?)?\s*$/) {
Then there was the year I was trying to make a math-editing widget based on a beefed up TK's text widget. With all the work for "typesetting" components of formula delegated by the widget to TCL callbacks, TCL turned out to be not an answer.
Could you describe in more detail what additional text-handling primitives you would like to see included with Perl? What string munging operations are absent that really ought to be included in Perl's core?
The problem: Perl's text-handling abilities do not scale well. This has two faces, both invisible as long as you confine yourself to simple tasks. The first face is not that Perl lacks some "operations;" it is not that some "words" are missing, whole "word classes" are not present. Imagine the expressive power of a language without adjectives.
In Perl text-handling equals string-handling. But there is more in a text than the sequence of characters. You see a text of a program - you can see boundaries of blocks, etc.; you see an English text, you can see word boundaries and sentence boundaries, etc. With the exception of the word boundaries, all these "distinctive features" become very hard to recognize by a "local inspection of a sequence of characters near an offset" - unless you agree to use a heuristic which works only from time to time. But a lot of problems require recognition of the relative position of a substring w.r.t. these "distinctive features".
Remember those "abstract algorithms" books and lessons? You can solve the problems "straightforwardly," or you can do it "smartly." Typically, "straightforward" algorithms are easy to code, but they do not scale well. Smart algorithms start by an appropriate preprocessing step. You organize your data first. The particular ways to do this may be quite different: you sort the data, or keep an "index" of some kind "into your data," you hash things appropriately, you balance some trees, and so on. The algorithms use the initial data together with such an "index."
Perl provides a few primitives to work with strings, which are quite enough to code any "straightforward" algorithm. What about "smart" ones? You need preprocessing. Typically, digging out the info is easy with Perl, but how would you store what you dug? The information should be kept "off band," for example, in an array or hash of offsets into the string.
Now modify the string a little bit, say, perform some s()() substitutions, or cut-and-paste with substr(). What happens with your "off band" information? It went out of sync. You need to update your annotating structures. Do not even think about doing s()()g, since you do not have enough info about the changes after the fact. You need to do your s()() one-by-one - but while s()()g is quite optimized, a series of s()() is not - and you get stuck again into the land of badly scaling algorithms.
(Strictly speaking, for this particular example s()()eg could save you - as well as code-embedded-into-a-regular-expression, but this was only a simple illustration of why off-band data is not appropriate for many algorithms. Please be lenient with this example!)
Even if no modification is done, using off-band data is very awkward: how to check what are the attributes of the character at offset 2001 when there are many different attributes, each marking a large subset of the string?
That was the problem, and the solution supported by many text-processing systems is to have "in-band annotations", which are recognized by the editing primitives, and easily queryable. Perl allows exactly one item of in-band data for strings: pos(), which is respected by regular expressions. But it is not preserved by string-editing operations, or even by $s1 = $s2!
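Both halves of this complaint are easy to reproduce. The snippet below is a small illustration with made-up strings and variable names, not code from the interview: it records comma offsets "off band" in an array, shows that they point at the wrong characters after one edit, and then shows that pos() does not survive a plain assignment.

```perl
use strict;
use warnings;

# Off-band annotation: remember the offset of every comma in the text.
my $text = 'a,b,c';
my @commas;
while ($text =~ /,/g) {
    push @commas, $-[0];          # $-[0] is the start offset of the last match
}

# One small edit: insert two characters at the front of the string.
substr($text, 0, 0, '>>');

# The recorded offsets were not updated, so they are now stale.
my $stale = substr($text, $commas[0], 1);

# In-band data: pos() is set by a //g match on the variable...
my $s2 = 'foo bar';
$s2 =~ /foo/g;
my $pos_original = pos($s2);

# ...but it is not carried over when the string is copied.
my $s1 = $s2;
my $pos_copy = pos($s1);
```

After the insert, $commas[0] still holds 1, which now indexes one of the inserted characters instead of a comma; and pos($s1) is undefined even though pos($s2) is 3.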
"In-band" data comes in several "kinds". A particular "kind" describes:
[[LABEL DELIM0] KEYWORD [DELIM1 VAR1 SEP VAR2 ... DELIM2]
[DELIM4 EXPR DELIM4] [DELIM5 BODY DELIM6]]
- with some parts possibly missing, so the internal structure is a tree).
Different answers lead to a zoo of intuitively different kinds of markup, each kind useful for some categories of problems. You can mark "gaps between" characters, or you can mark characters themselves. The markup may "name" a position ("the first __END__ in a Perl program"), or cover a subset of the string ("show in red", "is a link to this URL", or "inside comment"). Since the kind of the markup defines what happens when the string is modified, the system can support self-consistency of the markup "automatically" (in exceptionally complicated cases one may need to register a callback or two).
The second face of the problem is not with the expressive power of Perl, but with the implementation. Perl has a very rigid rule: a string must be stored in a consecutive sequence of bytes. Remove a character in the middle of the string, and all the chars after it (or before it) should be moved. As I said, s()()g has some optimizations which allow doing such movements "in one pass", but what if your problem cannot be reduced to one pass of s()()g? Then each of the tiny modifications you do one-at-a-time may require a huge relocation - or maybe even copying of the whole string. This is why a lot of algorithms for text manipulation require a "split buffer" implementation, when several chunks of the string may be stored (transparently!) at unrelated addresses.
Such "split-buffer" strings may look incredibly hard to implement, as in "all the innards of Perl should be changed", but it is not. Just store "split strings" similarly to tie()d data. The FETCH (actually, the low-level MAGIC-read method) would "glue" all the chunks into one - and would remove the MAGIC - before the actual read is performed; and now no part of Perl requires any change. Now four or five primitives for text-handling may be changed to recognize the associated tie()d structures - and act without gluing chunks together. We may even do it in arbitrarily small steps, one opcode at a time.
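The idea can be imitated at the Perl level with an ordinary tied scalar. What follows is a toy model under stated assumptions, not the interpreter-level change described above: SplitString and its insert method are hypothetical names, the chunks live in a plain Perl array, and FETCH collapses the chunks into one rather than removing any MAGIC.

```perl
use strict;
use warnings;

package SplitString;    # hypothetical class name, for illustration only

# The string is kept as a list of chunks instead of one buffer.
sub TIESCALAR {
    my ($class, $text) = @_;
    return bless { chunks => [ defined $text ? $text : '' ] }, $class;
}

# Any ordinary read glues the chunks together first, so code that knows
# nothing about chunks still sees a normal string.
sub FETCH {
    my ($self) = @_;
    my $glued = join '', @{ $self->{chunks} };
    $self->{chunks} = [$glued];
    return $glued;
}

sub STORE {
    my ($self, $text) = @_;
    $self->{chunks} = [$text];
}

# A chunk-aware primitive: insert text at an offset by splitting only the
# chunk that contains the offset; the other chunks are not touched.
sub insert {
    my ($self, $offset, $text) = @_;
    my $chunks = $self->{chunks};
    for my $i (0 .. $#$chunks) {
        my $len = length $chunks->[$i];
        if ($offset <= $len) {
            splice @$chunks, $i, 1,
                substr($chunks->[$i], 0, $offset),
                $text,
                substr($chunks->[$i], $offset);
            return;
        }
        $offset -= $len;
    }
    push @$chunks, $text;    # offset past the end: append
}

package main;

tie my $s, 'SplitString', 'hello world';
tied($s)->insert(5, ',');    # stored as three chunks, nothing glued yet
my $glued = $s;              # an ordinary read triggers FETCH and glues
```

The division of labour mirrors the description: the chunk-aware operation works on the pieces directly, while every other reader goes through FETCH and never notices that the string was split.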
Another important performance improvement needed for many algorithms would be the copy-on-write, when several variables may refer to the same buffer in memory, or different parts of the same buffer - with suitable semantics for what to do when one of these variables is modified. (In fact the core of this is already implemented in one of my patches!) Together with other benefits, this would solve the performance problems of $& and friends, as well as would make m/foo/; $& = 'bar'; equivalent to s/foo/bar/. Having copy-on-write substrings may be slightly more patch-intensive than copy-on-write strings, though. The complication: currently the buffers are required to be 0-terminated (so that they may be used with the system APIs). It is hard to make 'b' as in substr('abc',1,1) refer to the same buffer (containing "abc\0") as 'abc'. The solution may be to remove this requirement, and have two low-level string access APIs, SvPV() and SvPVz(), so that SvPVz() may perform the actual copying (as in copy-on-write) and the appending of \0 - but only when needed!
**** http://www.oreilly.com/catalog/perlsysadm/chapter/ch09.html Chapter 9 of "Perl for System Administration" online. Chapter 9 is an in-depth look at one of the more common system administrator's duties: sifting through log files.
The chapter covers everything from basic syslog, text only log files to Microsoft NT, binary log files and how to interpret them using Perl. David N. Blank-Edelman does more than just explain how to grok the files, he addresses several other problems, such as log file rotation and stateful vs stateless data. There is also a very detailed section on log file analysis. He covers several different algorithms for analyzing the logs and turning them into useful data. Also, he addresses the use of databases in the logfile analysis process.
Low cost alternative Perl conference for the East Coast (Pittsburgh)
web.oreilly.com -- News -- Why Learn CGI
www.perl.com - Report on the Perl 6 Announcement
Bit::Vector 5.8 Steffen Beyer <sb at engelschall.com> - July 28th 2000, 08:51 EST
Bit::Vector is a (stand-alone) C library and an object-oriented Perl module (with overloaded operators) which allows you to handle bit vectors, sets (of integers), "big integer arithmetic", and boolean matrices (all of arbitrary size) very efficiently.
Changes: This release works with Perl 5.6.0. It adds a method and an overloaded operator for exponentiation, and Copy() now accepts vectors of any size and truncates or fills up (according to sign) as necessary.
**** Perl for System Administration Chapter 9 Log Files
It's been said that if you work on any program long enough, that program will eventually be able to send electronic mail. It doesn't matter what the original purpose of the program was (if you can still remember)--if you develop it long enough, some day that program will send its first piece of email.
From the vantage point of a systems or network administrator, this means there are lots and lots of programs out there generating mail daily. Mail filters like procmail can help us with this deluge by sorting through the mail stream. But sometimes it is more effective to write sophisticated programs to actually read the mail for us. For example, we might write a program to analyze unsolicited commercial email (spam) or one that keeps long-term statistics based on daily diagnostic email from a server.
Isil's Home Welcome to Isil's home !
[Jul 23, 2000] Roth Consulting Perl Package Repository -- a very interesting collection of Win32 packages
[Jul 9, 2000] IBM developerWorks: Dare to script tree-based XML with Perl
"Parsing an XML document into tree structures makes it possible to operate on the tree structure of the data. Find out how to use the functions for accessing and manipulating the document tree, and follow a sample stock-trading application that uses Perl, DOM, XML, and a database to evaluate trading rules."
[Jul 3, 2000] Linux Magazine: An Object Lesson in Perl
"Objects provide encapsulation (to control access to data), abstract data types (to let the data more closely model the real world), and inheritance (to reuse operations that are similar but have some variation)."
[Jul 2, 2000] IBM DeveloperWorks Linux Features Perl: Small observations about the big picture Simple techniques to increase reliability and maintainability in Perl code by Teodor Zlatanov
Getting the job done in Perl is easy. The language was designed to make simple tasks easy, and hard tasks possible. But the built-in simplicity of the language can become a trap. Programmers are by nature averse to documenting or designing the architecture of their programs. The excitement of writing pure code lies in the direct connection to the machine, telling it exactly what to do. Teodor Zlatanov presents techniques to improve the reliability and maintainability of Perl programs through increasing clarity of the code. His tips are intended for the beginner or intermediate Perl programmer, with a stronger emphasis on establishing good standards rather than on changing particular coding styles.
IBM developerWorks: Parsing with Perl modules (Apr 30, 2000)
Web Techniques: Programming With Perl - Web Access Logs with DBI (Apr 16, 2000)
WDVL.com: The Perl You Need to Know Special: Introduction to mod_perl (Apr 11, 2000)
dotcomma: MySQL Database Interaction with Perl (Jan 29, 2000)
LinuxFocus: Perl part I (Sep 19, 1999)
DBGUI - Sybase database interface -- The current (DBI/DBD) version is 2.1.8 (released Apr 01 2000). The current (SybPerl) version is 1.6.8 (released December 14 1999). From the program description:
DBGUI is a complete X graphical database interface that can -
- perform any SQL command
- save the SQL results to a file
- perform incremental or standard searches on the SQL results
- keep a configurable history of _all_ SQL commands and parameters
- run on any UNIX or UNIX-like machine (Linux)
- sort (normal, numerical and reverse) on any column of the SQL results
- print the SQL results to a printer
- quick command line clear and restore for easy command line generation/pasting
- "clone" the results to a new display window for comparisons etc.
- utilize the DBI/DBD (or SybPerl) libraries or isql/sqsh binaries for the queries
- maintain four complete configuration "snapshots" for easy retrieval
- reload the last set of parameters on startup
- interactively enable and disable three different command lines for execution
- display the column data type in each column header
- display the column width in each column header
- solicit and quickly popup a list of the system datatypes. (SybPerl Version)
- indicate a busy/idle condition with a colored indicator (red/green)
- display the date/time of the last command execution in title bar
- load a specified checkpoint file on startup for pre-defined menu histories
- display the checkpoint file path in the title bar
- color code header info in multiple result-set data.
- quickly sum up any numerical data column.
- more probably....
[June 06, 2000] www.perl.com - ANSI Standard Perl An Interview with HP's Larry Rosler (Larry Rosler was both editor of the draft standard and chairman of the Language Subcommittee for X3J11. He helped put 'ANSI' in front of C.)
I was taught by several "hardware" guys who got into programming somewhat reluctantly. Do you feel that novice programmers lose an important perspective without knowing what goes on under the hood?
LR: I taught the first course on C at Bell Labs, using a draft of K&R, which helped vet the exercises. The students were hardware engineers who were being induced to learn programming. They found C (which is 'portable assembly language') much to their liking. Essentials such as pointers are very clear if you have a machine model in mind.
Perl is at a higher level of abstraction, so the machine model isn't as necessary at first. But when you get to complex data structures, which require references (which are like pointers in C, but much safer), a grounding in addressing becomes useful.
In an ideal world, a student would first learn an abstract assembly language such as MIX (see Knuth, Vol. 1), do some useful exercises, then take on a higher-level language with the machine model in the back of the head.
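As a minimal sketch of the references mentioned in the answer above: the names (@ports, %hosts) and the documentation-range address are hypothetical and do not come from the interview. The example only shows how a reference behaves like a pointer and how it lets one data structure hold another.

```perl
use strict;
use warnings;

# Hypothetical data, not from the interview.
my @ports = (80, 443);

# A reference records where @ports lives, much like a C pointer,
# but it cannot be made to point at arbitrary memory.
my $ports_ref = \@ports;

# Changes made through the reference are visible in the original array.
push @$ports_ref, 8080;

# References let a hash value hold a whole array or another hash.
my %hosts = (
    web => { ip => '192.0.2.10', ports => $ports_ref },
);

my $port_count  = scalar @{ $hosts{web}{ports} };
my $second_port = $hosts{web}{ports}[1];

print "web listens on $port_count ports; the second is $second_port\n";
```

The nested lookup $hosts{web}{ports}[1] follows two references in a row, which is where a mental model of addressing tends to help.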
When did you run into Perl? What did you think of the language the first time you saw it?
LR: I am a relative latecomer to Perl. I was experimenting with CGI programming using shell scripts (!) because they were better for rapid prototyping than C. Soon I discovered that Perl offered advantages similar to the shell, but was much more expressive (particularly in the manipulation of data structures) and much faster to execute.
Because of my familiarity with Unix commands such as 'sed', which made heavy use of regular expressions, and because of my C experience, I was quite comfortable with Perl syntax. The hardest adjustment was to learn to write code with as few Perl operations as possible, because of the costs of dispatching each instruction. The Benchmark module became the most important tool I used to learn how to write efficient (and hence, sometimes elegant) Perl. I also learned a lot from the newsgroup comp.lang.perl.misc, to which I eventually became able to contribute.
Often Perl claims to be efficient in the scarce resource of the programmer's time. It isn't often that people tune scripts for optimum performance. Are there a few tips you can give to new Perl programmers on how to squeeze out a little better runtime performance?
LR: 'Script' ne 'program' again. When dealing with upward-scalable data sets, performance becomes important. I proposed a tutorial to Perl Conference 4.0 this year on this subject, but unfortunately it wasn't accepted.
- 0. Don't optimize prematurely!
- 1. Don't use an external command where Perl can do a task internally.
- 2. Refine the data structures and algorithms. References: The Practice of Programming (Kernighan and Pike), Programming Pearls (Bentley), Mastering Algorithms with Perl (Orwant et al.).
- 3. After these are optimal, identify the remaining hot spots. (Judicious use of the time() and times() functions, or the Benchmark module.)
- 4. Try to improve the hot spots by using Perl functions such as map() and grep() instead of explicit loops; making regexes more explicit to minimize backtracking; caching intermediate results to avoid unnecessary recalculation, ...
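Tips 3 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the interview: the workload (squaring a list), the function names, and the Fibonacci cache are all hypothetical. Which variant wins depends on the Perl version and the data, so the point is to measure with Benchmark's cmpthese() rather than to guess.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical workload: square every element of a list.
my @data = (1 .. 10_000);

# Variant A: explicit loop that pushes each result.
sub squares_loop {
    my @out;
    for my $n (@data) {
        push @out, $n * $n;
    }
    return \@out;
}

# Variant B: the same transformation with the built-in map().
sub squares_map {
    return [ map { $_ * $_ } @data ];
}

# Tip 4, caching: remember intermediate results so that each
# Fibonacci number is computed only once.
my %fib_cache;
sub fib {
    my ($n) = @_;
    return $fib_cache{$n} if exists $fib_cache{$n};
    my $value = $n < 2 ? $n : fib($n - 1) + fib($n - 2);
    return $fib_cache{$n} = $value;
}

# Tip 3, measuring: run each variant for at least one CPU second
# and print a comparison table.
cmpthese(-1, {
    explicit_loop => \&squares_loop,
    map_builtin   => \&squares_map,
});

print "fib(30) = ", fib(30), "\n";
```

A negative count tells cmpthese() to run each variant for that many CPU seconds instead of a fixed number of iterations, which avoids the "too few iterations" warning on fast code.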
There's certainly no ANSI Perl. Does Perl need the same kind of official standardization that C got?
LR: I believe that it does, in order to increase its acceptability. Many organizations either cannot or will not endorse the use of unstandardized languages in their business-critical activities.
The current situation with Perl is better than it was with the other two languages I mentioned. Perl has one official open source for its implementation, whereas the others had multiple proprietary implementations, leading to different semantics for many language features. But this single 'official' Perl semantics has never been adequately characterized independent of the implementation, so is subject to arbitrary change.
Building on quicksand is acceptable for 'scripts' of limited longevity and applicability. It is not acceptable for 'programs' of significant commercial value. I think the lack of a firm, stable, well-defined foundation is the major inhibitor for the continuing commercial evolution of Perl. Of the major contributors to Perl, Ilya Zakharevich is most outspoken in his view that Perl is not (yet?) a 'programming' language!
I'm curious how one would standardize Perl when the language changes so quickly and committees move so slowly. Consider that three years ago, Perl was not threaded. Now, threads are standard, but their interface may change in the near future. Mr. Zakharevich continues to pull new regex constructs from the head of Zeus. Even more striking, Perl supports Unicode. Is there some way to stage the standardization so that it isn't painfully out-of-date? Would standardization necessarily slow down Perl development?
LR: Sometimes standardization speeds up development, by forcing evaluation and convergence on a specified way of doing things. Sometimes features are characterized and implemented during standardization (wide-character types for C, for example; the Standard Template Library and many other features for C++).
One way to view it is re Samuel Johnson's famous mot: "When a man knows he is to be hanged in a fortnight, it concentrates his mind wonderfully."
Back to Java for a moment. Because they are both "web technologies", Perl and Java are often seen as competitors. There have been attempts, like Larry Wall's JPL, to provide better integration between these two beasties. What sort of utility do you see in such a marriage?
LR: Hard for me to say. Java forces OO programming from the beginning, and I have never needed to write an object in any program. This may be for me a conceptual hurdle that I cannot (and need not) overcome.
C++ provides sweetened C syntax and semantics (particularly toward bits and bytes), and OO if you want it. This can all be done with the efficiency of C.
Perl provides higher-level syntax and semantics (particularly toward strings), and OO if you want it. I know how to write efficient Perl when I have to.
Java, in my opinion, fills a much-needed gap between those two approaches. :-)
Speaking of competitors, Python is making a big splash this year. Although it seems to satisfy many of the same itches Perl does, its proponents point to its cleaner syntax and more traditional OO implementation as making it a "better Perl". What are your thoughts on Python?
LR: Whatever improvements Python may offer are not sufficient to give it a critical mass of programmers and programming support relative to Perl. There is an order-of-magnitude difference in this metric (programmers, modules, books, ...), and I don't think significant inroads are being made. And, as I said, to me OO is a big yawner.
Have you had the chance to read Mr. Conway's _Object Oriented Perl_? I found I learned more about general Perl from it than OO techniques (which by no means is to say the book is inadequate in the latter department).
LR: Yes, I have gotten to about the middle of it. It is a fine book. But it still didn't convince me about the necessity of objects. All I see is the performance-damaging complexity of the interfaces.
A Fresh Look at Efficient Perl Sorting
One Minute Perl Book Reviews -- a pretty entertaining test ;-). If a book fails the number-of-elements-in-an-array test, it is highly suspect. I am not sure about the value of the other tests (most of these topics are probably absent from any introductory book). Note that the original text was not even spell-checked ;-). For the actual marks of a number of books, including Medinets' book, see the paper.
Seems like everyone is writing a Perl book. The most disturbing part is that they're being written by people who have nothing to do with Perl. How to decide what's crap and what's not?
Worry no more! After many mirth-filled hours of flipping through many an awful Perl book, I have come up with a simple one-minute litmus test to determine if the book you're holding is worth the tree it's printed on.
The Perl Book Litmus Test
Remember, the point of this test is to find bad books and there can only be negative results with this test. A book which passes all the tests put forth here CAN STILL SUCK.
Flip to the index. Look up the following tidbits and answer the questions.
- localtime
- Does it state that it returns the number of years since 1900? Does it mention that when used in scalar context it returns a nicely formatted date?
- srand
- Check how it uses srand(). Does it warn you to call it only once in a given program? (If srand is never mentioned, that's okay)
- Number of elements in an array
- Does it say that an array will return its number of elements in scalar context, or does it use $num = $#array + 1; instead?
- flock
- Does it discuss and use flock instead of lockfiles? (i.e., setting some .lock file instead of using flock()) It's okay if file locking is never discussed at all.
- Portable Constants
- When performing flocking, socket operations or sysopens, does it use the constants defined by Perl, or does it define its own unportable constants? (It's okay if the book never has to use these constants at all.)
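For reference, the idioms the litmus test looks for can be sketched as below. This is a hedged illustration, not text from the review: the variable names and the temporary file are hypothetical, and File::Temp is used only so the example can run anywhere without touching a real log file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_WRONLY O_CREAT O_APPEND);
use File::Temp qw(tempfile);

# localtime in list context: the year field counts years since 1900.
my $year      = (localtime)[5];
my $full_year = $year + 1900;

# localtime in scalar context: a ready-made, human-readable date string.
my $stamp = localtime;

# An array in scalar context yields its element count directly.
my @array = qw(alpha beta gamma);
my $num   = @array;

# flock with the portable constants from Fcntl, on a throwaway file
# (hypothetical stand-in for a shared log).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh or die "close failed: $!";

sysopen(my $log, $path, O_WRONLY | O_APPEND | O_CREAT)
    or die "cannot open $path: $!";
flock($log, LOCK_EX) or die "cannot lock $path: $!";
print {$log} "$stamp: $num elements\n";
close $log or die "close failed: $!";    # closing flushes and releases the lock

print "year $full_year, $num elements, logged at $stamp\n";
```

Importing LOCK_EX and the O_* flags from Fcntl, rather than hard-coding numbers such as 2, is what keeps the code portable across platforms where those values differ.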
Linux Today XML.com Processing XML with Perl
Linux Today Datamation Perl professionals - IT Salary Tracker
"For the third installation in our four-week series on open source professionals (see Apache server professionals and Linux professionals), we take a look at Perl jobs. Though it doesn't command the highest salaries, knowledge of the Perl scripting language--the mortar of countless Web sites--is by far the most sought after open source skill. Our sister site, dice.com, an Internet-based job board for IT professionals, listed 3,365 Perl positions in January 2000. That's more than three times the number of listings for Apache, Linux, and Sendmail pros added together."
"Demand for Perl has done nothing but expand, and the outlook for future job growth is very good. Dice.com stats show a 200% increase in Perl jobs nationwide since September 1999. Perl pros in New York are cashing in more than other cities, garnering an average annual salary of $84,000 and a contract wage of $80 an hour. However, Silicon Valley has by far the greatest number of opportunities with 1,115 job postings, almost a third of the Perl listings nationwide. And if you're wondering about which job titles are especially hot, Web developers and Webmasters are America's most wanted Perl pros, with 793 availabilities. Software engineers and applications programmers are sizzling, too."
[March 26, 2000] Linux Gazette GIMP-Perl GIMP Scripting for the Rest of Us
[Feb 26, 2000] Linux Today Perl.com Beginning Perl Ten Perl Myths
[Feb 20, 2000] pftp -- pFtp is an FTP client written in Perl. It uses the Perl/Tk and Libnet libraries, both available from the CPAN FTP site. Download: pFtp 0.05
[Feb 20, 2000] The Perl Archive File Downloading Page 1
[Feb 20, 2000] Script Profile - Perl Utilities File management
A source code distribution is available.
[http://www.eprotect.com/stas/TULARC/works/faq_manager/index.html]
[Feb 09, 2000] Some assorted links:
[Jan 29, 2000] dotcomma: MySQL Database Interaction with Perl
"Perl uses many different methods to interact with databases, and below, I am outlining how to use just some of the methods."
[Jan 28, 2000] DDJ EXAMINING PERLDAP by Troy Neeriemer
Netscape's PerLDAP is an important tool for both programmers and administrators because it provides a mechanism for accessing directory information from Perl. Troy presents a high-level overview of PerLDAP, along with details of how you can use it. Additional resources include perldap.txt (listings) and perldap.zip (source code).
[Jan 28, 2000] DDJ FULL-TEXT SEARCHING IN PERL by Tim Kientzle
Full-text search engines are popular these days, and not just on web sites. Tim shows how you can build a fast full-text search capability using Perl's built-in database support. Additional resources include perlsrch.txt (listings) and perlsrch.zip (source code).
[Jan. 5, 2000] Linux Magazine October 1999 FEATURES Uncultured Perl
Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Site uses AdSense so you need to be aware of Google privacy policy. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Last modified: August 15, 2009