Softpanorama (slightly skeptical) Open Source Software Educational Society
May the source be with you, but remember the KISS principle ;-)
Perl for Unix System Administrators
This is a very limited effort to help Unix sysadmins learn
Perl. It is based on my FDU lectures to CS students. We discuss mainly "simple Perl",
and the site tries to counter the drive toward "excessive complexity" that is dominant on many
Perl-related sites and in many publications. It can also be used to prepare for the
Certified Internet Web Professional Exam. See also
Introduction to Perl for Unix system administrators.
IMHO the main advantage of using a powerful,
complex language like Perl is the ability to write simple programs. Perhaps
the world has gone overboard on this object-oriented thing.
You do not need many of the tricks used in lower-level languages, as Perl itself provides
high-level primitives for the task. This page is linked to several sub-pages.
The most important among them are:
All languages have quirks, and all inflict a lot of pain before one can adapt
to them. Once learned, the quirks become incorporated into your understanding of
the language. But there is no royal road to mastering a language. The more different
one's background is, the more one needs to suffer. Generally any user of a
new programming language needs to suffer a lot ;-)
When mastering a new language, you first face a level of "cognitive overload"
until the quirks of the language become easily handled by your unconscious mind.
At that point, all of a sudden the quirky interaction becomes a "standard" way
of performing the task. For example, regular expression syntax seems to be
a weird base for serious programs: fraught with pitfalls, a big semantic mess as
a result of outgrowing its primary purpose. On the other hand, in skilled hands
it's a very powerful tool.
One early sign of adaptation to Perl idiosyncrasies is when you start to put
$ on all scalar variables automatically. The next step is to overcome the notational
difficulty of using two different operators ("==" for numbers
and eq for strings) for comparison -- the source of many
subtle errors for novices, much like the accidental use of assignment instead
of comparison in an if statement in C (as in
if (a=1)... ). Before that happens,
please be wary of using complex constructs -- diagnostics in Perl are really bad.
Some Perl authors revel in complexity; I would call them complexity junkies. A classic example of this "killing readers
with obscurity" approach is the article
Understanding
and Using Iterators. Actually Perl has weak support for iterators, as it lacks
co-routine support. But you would never learn that from the paper, where trivial examples
are presented using obscure, overcomplicated code that reminds me of
The International Obfuscated
C Code Contest. Please don't follow them. Try to write simple, transparent
code.
There is also an OO-enthusiast flavor of complexity junkies, with Damian Conway
as the most prominent representative. Their advice should not be taken at face value.
Please remember the KISS principle and try to write simple Perl scripts without
overly complex regular expressions or fancy idioms.
Some Perl gurus' pathological preoccupation with idioms is not healthy. Although
definitely gifted authors,
Randal L. Schwartz
and Tom Christiansen are a little too preoccupied with this fancy art.
Fancy idioms are bad for novices and can contain subtle limitations or side effects
that can bite even seasoned Perl programmers.
Generally the problems mentioned above are more fundamental than the trivial
"abstraction is the enemy of convenience". It is more that a badly chosen
notational abstraction at one level can inhibit innovative notational
abstraction at others.
All in all, Perl is a great language. But even the sun has dark spots...
Dr. Nikolai Bezroukov
Notes:
- These pages are written by people for whom English is not a
native language. Some grammar and spelling errors
should be expected.
- This is a Spartan WHYFF (We Help You For Free) site. It
cannot replace the best teachers and the best books.
- The site contains some obsolete pages as it develops like a
living tree... Some links on older pages are broken. Please
try to use Google, Open directory, etc. to find a replacement link
(see HOWTO search the WEB for details).
We would appreciate it if you could mail us a correct link.
Perl 20th year anniversary
There are some tools that look like you will never replace them. One of those
(for me) is grep. It does what it does very well (remarks about the shortcomings
of regexen in general aside). It works reasonably well with Unicode/UTF-8 (a great
opportunity to Fail Miserably for any tool, viz. a2ps).
Yet, the other day I read about
ack, which claims to be "better than grep,
a search tool for programmers". Woo. Better than grep? In what way?
The ack homepage lists the top ten reasons
why one should use it instead of grep. Actually, it's thirteen reasons but then
some are dupes. So I'd say "about ten reasons". Let's look at them in order.
- It's blazingly fast because it only searches the stuff you want searched.
Wait, how does it know what I want? A
DWIM-Interface at last? Not
quite. First off, ack is faster than grep for simple
searches. Here's an example:
$ time ack 1Jsztn-000647-SL exim_main.log >/dev/null
real 0m3.463s
user 0m3.280s
sys 0m0.180s
$ time grep -F 1Jsztn-000647-SL exim_main.log >/dev/null
real 0m14.957s
user 0m14.770s
sys 0m0.160s
Two notes: first, yes, the file was in the page cache before I ran ack; second,
I even made it easy for grep by telling it explicitly I was looking
for a fixed string (not that it helped much, the same command without -F
was faster by about 0.1s). Oh and for completeness, the exim logfile I searched
has about two million lines and is 250M. I've run those tests ten times for
each, the times shown above are typical.
So yes, for simple searches, ack is faster than grep. Let's try
with a more complicated pattern, then. This time, let's use the pattern
(klausman|gentoo) on the same file. Note that we have to use -E
for grep to use extended regexen, which ack in turn does not
need, since it (almost) always uses them. Here, grep takes its sweet
time: 3:56, nearly four minutes. In contrast, ack accomplished the
same task in 49 seconds (all times averaged over ten runs, then rounded to integer
seconds).
As for the "being clever" side of speed, see below, points 5 and 6.
- ack is pure Perl, so it runs on Windows just fine.
This isn't relevant to me, since I don't use Windows for anything where I
might need grep. That said, it might be a killer feature for others.
- The standalone version uses no non-standard modules, so you can put it in
your ~/bin without fear.
Ok, this is not so much a feature as a hard criterion. If I needed extra
modules for the whole thing to run, that'd be a deal breaker. I already have
tons of libraries, I don't need more undergrowth around my dependency tree.
- Searches recursively through directories by default, while ignoring .svn,
CVS and other VCS directories.
This is a feature, yet one that wouldn't pry me away from grep: -r
is there (though it distinctly feels like an afterthought). Since ack
ignores a certain set of files and directories, its recursive capabilities were
there from the start, making it feel more seamless.
- ack ignores most of the crap you don't want to search
To be precise:
- VCS directories
- blib, the Perl build directory
- backup files like foo~ and #foo#
- binary files, core dumps, etc.
Most of the time, I don't want to search those (and have to exclude them
with grep -v from find results). Of course, this ignore-mode
can be switched off with ack (-u). All that said, it sure
makes command lines shorter (and easier to read and construct). Also, this is
the first spot where ack's Perl-centricism shows. I don't mind, even though
I prefer that other language with P.
- Ignoring .svn directories means that ack is faster than grep for searching
through trees.
Dupe. See Point 5
- Lets you specify file types to search, as in --perl or --nohtml.
While at first glance, this may seem limited,
ack comes with a plethora
of definitions (45 if I counted correctly), so it's not as perl-centric as it
may seem from the example. This feature saves command-line space (if there's
such a thing), since it avoids wild find-constructs. The docs mention that
--perl also checks the shebang line of files that don't have a suffix,
but make no mention of the other "shipped" file type recognizers doing so.
- File-filtering capabilities usable without searching with ack -f. This lets
you create lists of files of a given type.
This mostly is a consequence of the feature above. Even if it weren't there,
you could simply search for "."
- Color highlighting of search results.
While I've looked upon color in shells as kinda childish for a while, I wouldn't
want to miss syntax highlighting in vim, colors for ls (if they're not as sucky
as the defaults we had for years) or match highlighting for grep. It's really
neat to see that yes, the pattern you grepped for indeed matches what you think
it does. Especially during evolutionary construction of command lines and shell
scripts.
- Uses real Perl regular expressions, not a GNU subset
Again, this doesn't bother me much. I use egrep/grep -E
all the time, anyway. And I'm no Perl programmer, so I don't get withdrawal
symptoms every time I use another regex engine.
- Allows you to specify output using Perl's special variables
This sounds neat, yet I don't really have a use case for it. Also,
my perl-fu is weak, so I probably won't use it anyway. Still, might be a killer
feature for you.
The docs have an example:
ack '(Mr|Mr?s)\. (Smith|Jones)' --output='$&'
- Many command-line switches are the same as in GNU grep:
Specifically mentioned are -w, -c and -l. It's
always nice if you don't have to look up all the flags every time.
- Command name is 25% fewer characters to type! Save days of free-time! Heck,
it's 50% shorter compared to grep -r
Okay, now we have proof that not only the ack webmaster can't count,
he's also making up reasons for fun. Works for me.
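Points 4 and 5 above (recursive search that skips VCS directories, backup files and binaries) can be sketched in a few lines. What follows is a minimal Python illustration of the idea, not ack's actual implementation: the ignore list, the "undecodable means binary" rule and the names search_tree and is_backup are assumptions made for this example.

```python
import os
import re

# Assumed ignore list for illustration; ack's real defaults are longer.
IGNORED_DIRS = {".svn", "CVS", ".git", "blib"}


def is_backup(name):
    """Return True for editor backup files such as foo~ and #foo#."""
    return name.endswith("~") or (name.startswith("#") and name.endswith("#"))


def search_tree(root, pattern):
    """Yield (path, line_number, line) for every matching line under root."""
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORED_DIRS]
        for name in sorted(filenames):
            if is_backup(name):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    for number, line in enumerate(handle, start=1):
                        if regex.search(line):
                            yield path, number, line.rstrip("\n")
            except (UnicodeDecodeError, OSError):
                # Assumption: a file that does not decode as UTF-8 is binary; skip it.
                continue


if __name__ == "__main__":
    for path, number, line in search_tree(".", r"(klausman|gentoo)"):
        print(f"{path}:{number}:{line}")
```

Pruning the directory list before descending is what makes the "faster on trees" claim plausible: the ignored directories are never read at all, instead of being searched and filtered afterwards.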
Bottom line: yes, ack is an exciting new tool which partly replaces
grep. That said, a drop-in replacement it ain't. While the standalone version of
ack needs nothing but a perl interpreter and its standard modules, for embedded
systems that may not work out (vs. the binary with no deps beside a libc). This
might also be an issue if you need grep early on during boot and /usr (where
your perl resides) isn't mounted yet. Also, default behaviour is divergent enough
that it might yield nasty surprises if you just drop in ack instead of grep. Still,
I recommend giving ack a try if you ever use grep on the command
line. If you're a coder who often needs to search through working copies/checkouts,
even more so.
Update
I've written
a followup on this, including some tips for day-to-day usage (and an explanation
of grep's sucky performance).
Comments
René "Necoro" Neumann writes (in German, translation by me):
Stumbled across your blog entry about "ack" today. I tried it and found it
to be cool :). So I created two ebuilds for it:
Just wanted to let you know (there is no comment function on your blog).
Youtube has educational potential
YouTube
I work in the NYTimes.com feeds team. We handle retrieving, parsing and
transforming incoming feeds from whatever strange proprietary format our
partners choose to give us into something that our CMS can digest. As you
can imagine, we deal with a huge amount of text processing. To handle all
of these transformations as efficiently as possible we rely heavily on the
magic of Perl. Recently, as feeds become more and more important, we have
begun to feel pains caused by past impromptu segments of inefficient code
written to meet quick, episodic deadlines. A situation that we are especially
prone to as a fast moving news organization.
I am a relatively new employee
here at NYTimes.com and one of my responsibilities is to create tools to
help ensure the integrity and scalability of our code. To this end, I would
like to introduce you to The New York Times Perl Profiler, or
Devel::NYTProf.
The purpose of this tool is to allow developers to easily profile Perl code
line-by-line with minimal computational overhead and highly visual output.
With only one additional command, developers can generate robust color-coded
HTML reports that include some useful statistics about their Perl program.
Here is the typical usage:
perl -d:NYTProf myslowcode.pl
nytprofhtml
See? It's easy! nytprofhtml
is an implementation of the included reporting interface (Devel::NYTProf::Reader).
If you don’t want HTML reports, you can implement your own format with relative
ease. If you create something cool, be sure to let me know via CPAN patch
request or open@nytimes.com. Detailed
instructions can be found in the documentation and source code on CPAN.
You can see sample screen shots of the HTML report’s
index page and a
single module report.
Similar tools exist to profile Perl code.
Devel::DProf is
the ubiquitous profiler, but it only collects information about subroutine
calls. Because of this limitation, it's not all that helpful in finding that
elusive broken regex in a 75-line subroutine of regex transforms.
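To make the difference between subroutine-level and line-level profiling concrete, here is a small sketch in Python using the standard sys.settrace hook. It only counts how often each line of one function runs. It does not show how Devel::NYTProf works internally, and the names profile_lines and slow_transform are made up for the illustration.

```python
import sys
from collections import Counter


def profile_lines(func, *args, **kwargs):
    """Run func and count how many times each of its source lines executes.

    Keys of the returned Counter are line offsets from the 'def' line.
    """
    hits = Counter()
    target = func.__code__

    def local_tracer(frame, event, arg):
        if event == "line":
            hits[frame.f_lineno - target.co_firstlineno] += 1
        return local_tracer

    def global_tracer(frame, event, arg):
        # Only trace frames that execute the target function's own code.
        if event == "call" and frame.f_code is target:
            return local_tracer
        return None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)
    return result, hits


def slow_transform(words):
    out = []                   # offset 1: runs once
    for w in words:            # offset 2: loop header
        out.append(w.upper())  # offset 3: runs once per word
    return out                 # offset 4


if __name__ == "__main__":
    _, counts = profile_lines(slow_transform, ["a", "b", "c"])
    for offset in sorted(counts):
        print(f"line +{offset}: {counts[offset]} hit(s)")
```

A subroutine-level profiler would report a single figure for slow_transform as a whole. The per-line counts are what let you single out one expensive statement inside a long routine.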
Devel::FastProf
is another per-line profiler, however I found its output difficult to coerce
into HTML. It also doesn’t support non-Linux systems (we need at least Solaris
and Ubuntu/Linux support).
Devel::NYTProf is available
as a distribution on the CPAN.
You may install by typing “install Devel::NYTProf” in the ‘cpan’ command-line
application, or manually
by downloading
the tarball from CPAN.
We were able to reduce the long runtime on one particular application
by 20% (about a minute) after the very first test run of our profiler. We
hope that you will find our tool as useful as we have. Of course, any comments
and suggestions are welcome!
www.bbc.co.uk/blogs/radiolabs
Like most organisations the BBC has its own technical ecosystem; the BBC's
is pretty much restricted to Perl and static files. This means that the vast
majority of the BBC's website is statically published - in other words HTML
is created internally and FTP'ed to the web servers. There are then a range
of Perl scripts that are used to provide additional functionality and interactivity.
While there are some advantages to this ecosystem there are also some obvious
disadvantages and a couple of implications, including an effective hard limit
on the number of files you can save in a single directory (many older, but still
commonly used, filesystems just scan through every file in a directory to find
a particular filename so performance rapidly degrades with thousands, or tens
of thousands, of files in one directory), the inherent complexity of keeping
the links between pages up to date and valid, and the sheer number of static
files that would need to be generated to deliver the sort of aggregation pages
we wanted to create when we
launched /programmes, let alone our plans for /music and personalisation.
What we wanted was a dynamic publishing solution - in other words the ability
to render webpages on the fly, when a user requests them. Now obviously there
are already a number of existing frameworks out there that provide the sort
of functionality that we needed, however none that provided the functionality
and that could be run on the BBC servers. So we (the Audio and Music bit of
Future Media and Technology - but more specifically Paul,
Duncan, Michael and Jamie) embarked on building a
Model-view-controller (MVC) framework in Perl.
For applications that run internally we use
Ruby on Rails. Because we enjoy using it, it's fast to develop with, straightforward
to use and because we use it (i.e. to reduce knowledge transfer and
training requirements) we decided to follow the same design patterns and coding
conventions used in Rails when we built our MVC framework. Yes, that's right:
we've built Perl on Rails.
This isn't quite as insane as it might appear. Remember that we have some
rather specific non-functional requirements. We need to use Perl, there are
restrictions on which libraries can and can't be installed on the live environment
and we needed a framework that could handle significant load. What we've built
ticks all those boxes. Our benchmarking figures point to significantly better
performance than Ruby on Rails (at least for the applications we are building),
it can live in the BBC technical ecosystem and it provides a familiar API to
our web development and software engineering teams with a nice clean separation
of duties with rendering completely separated from models and controllers.
Using this framework we have launched
/programmes. And because the pages are generated dynamically we can aggregate
and slice and dice the content in
interesting
ways. Nor do we have to subdivide our pages into arbitrary directories
on the web server - the BBC broadcasts about 1,400 programmes a day which means
if we created a single static file for each episode we would start to run into
performance problems within a couple of weeks.
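The arithmetic behind "a couple of weeks", and the kind of arbitrary subdivision that static publishing would have forced, can be sketched as follows. This is an illustration in Python, not anything the BBC describes using: the 20,000-file threshold, the MD5-prefix scheme, the example filename and the function names are all assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def shard_path(filename, levels=2, width=2):
    """Map a filename to a nested directory path derived from its MD5 hash.

    With levels=2 and width=2 the files spread over 256 * 256 directories,
    so no single directory has to hold every file.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(*parts, filename)


def days_until(limit, files_per_day=1400):
    """Whole days before one flat directory reaches `limit` files."""
    return limit // files_per_day


if __name__ == "__main__":
    # Hypothetical episode filename, used only to show the shape of the result.
    print(shard_path("episode-0001.html"))
    # Assumed threshold of 20,000 files for a slow, linearly scanned directory.
    print(days_until(20000), "days")
```

At 1,400 files a day, a flat directory would pass 19,600 files after 14 days, which matches the article's estimate if degradation sets in somewhere in the tens of thousands.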
Now since we've gone to the effort of building this framework and because
it can be used to deliver great, modern web products we want to use it elsewhere.
As I've written about
elsewhere we are working on building an enhanced
music site built around a
MusicBrainz spine. But that's just my department - what about the rest of
the BBC?
In general the BBC's Software Engineering community is pretty good at sharing
code. If one team has something that might be useful elsewhere then there's
no problem in installing it and using it elsewhere. What we're not so good at
is coordinating our effort so that we can all contribute to the same code base
- in short we don't really have an open source mentality between teams - we're
more
cathedral and less bazaar even if we freely let each other into our cathedrals.
With the Perl on Rails framework I was keen to adopt a much more open source
model - and actively encouraged other teams around the BBC to contribute code
- and that's pretty much what we've done. In the few weeks since the
programmes beta launch
JSON and
YAML views have been written - due to go live next month. Integration with
the BBC's centrally managed controlled vocabulary - to provide accurate term
extraction and therefore programme aggregation by subject, person or place -
is well underway and should be with us in the new year. And finally the
iPlayer team are building the next
generation iPlayer browse using the framework. All this activity is great news.
With multiple teams contributing code (rather than forking it) everyone benefits
from faster development cycles, less bureaucracy and enhanced functionality.
Comments
======
- At 12:37 AM on 01 Dec 2007,
Anonymous Perl Lover wrote:
Any reason U didn't use Catalyst, Maypole, Combust, CGI::Application,
CGI::Prototype, or any of the dozens of other perl MVC frameworks?
Catalyst was around long before Ruby on Rails (possibly before the Ruby
language for that matter), but never made the kind of headlines RoR gets.
The Ruby community seems to be much better at mobilizing.
Actually, I think it's the Perl community's TMTOWTDI lifestyle. In Ruby,
for small things *maybe* U use Camping, but you'll probably use Rails and
for everything else you'll definitely use Rails. There are some others,
but only the developers of them use them. In Perl, literally everyone writes
their own.
Inferior languages like Java and C# rose up real quick and stayed there--keep
getting bigger even--because they limit their users' choices. Perl stayed
in the background and is now dying because it believes in offering as many
choices as possible. That's why Perl 666 is going to be more limiting. As
U can tell from my subtle gibe that the next version of Perl is evil, I
prefer choices. But developers like me are a dying breed.
Developers now-a-days need cookie cutter, copy&paste code. When's the
Perl on Rails book going to be released? Probably around the time the Catalyst
one is. Or the CGI::Application one.
Bleh. I wrote way too much. U can't even put this up now, it's too long.
I didn't realize I was so annoyed by the one jillion perl MVC web frameworks
and how they're just one tiny example of why perl is dead.
- At 01:20 AM
on 01 Dec 2007, Anon wrote:
> The Ruby community seems to be much better at
> mobilizing.
I really think that first video demo of RoR using Textmate is what had
a large effect. Before that, I don't remember seeing hardly any videos of
development happening right in front of your eyes.
You watched the video thinking, "wow! it's so fast and easy! I'm gonna
get in on that!". When, in reality, any good programmer using any good environment
can make a software look good like that (if they practice a bit beforehand).
As an aside, anyone know of a video demo podcast for Catalyst?
- At 07:14 AM
on 01 Dec 2007,
Dave Cross wrote:
Others have already commented that you seem to be reinventing the wheel
here. No-one seems to have mentioned the Perl MVC framework which (in my
opinion) is most like Rails - it's called Jifty (http://jifty.org/).
But there are already parts of the BBC who are using Catalyst with great
success. Why didn't you ask around a bit before embarking on what was presumably
not a trivial task?
- At
09:52 AM on 01 Dec 2007,
Raips wrote:
How about BBC doing same as New York Times did?
http://www.linux.com/feature/120359
http://open.nytimes.com/
- Perl 5 by Example
Online Perl Book. 22 chapters with appendixes.
- Beginning Perl
Very complete (and completely free) Perl Beginners book, both HTML and downloadable
(PDF).
- Practical mod_perl This free
perl book is available in html or pdf versions, so you can view the perl book
online or download this free book.
- Extreme Perl Extreme
Perl is a book about Extreme Programming using the programming language Perl.
This free Perl ebook is available in HTML, PDF, or A4 PDF.
- Learning Perl the Hard Way
Learning Perl the Hard Way is a free book available under the GNU Free Documentation
License. This free perl ebook can be downloaded in pdf or gzipped postscript
format.
- Web Client Programming
with Perl Free Online Perl Book
- The Perl Reference Guide
The guide contains a concise description of all Perl 5 statements, functions,
variables and lots of other useful information.
- Perl Reference
Guide & Perl Pocket Reference (PDF Link) Short Perl reference book
in pdf form.
- CGI Programming on the World
Wide Web This is an out-of-print book from 1996 that is available
from O'Reilly.
- Beginning
Perl for Bioinformatics (Sample Chapter) GenBank (Genetic Sequence
Data Bank) is a rapidly growing international repository of known genetic sequences
from a variety of organisms. Its use is central to modern biology and to bioinformatics.
- CGI Programming
with Perl, 2nd Edition (Sample Chapter) Security.
- Advanced
Perl Programming (Sample Chapter) Chapter 1: Data References and Anonymous
Storage
- Programming Web Services with Perl (Sample Chapter) One chapter on
SOAP.
- O'Reilly Sample Chapters
Quite a few sample chapters from perl books are indexed here (some have already
been linked to individually).
Wendy is a Perl framework for Web site and service development. It works
with mod_perl 2 and PostgreSQL. Built with security and performance in mind,
Wendy supports DB servers clustering, separate read- and write- DB back-ends,
data cache with memcached, templates cache, etc.
Release focus: Initial freshmeat announcement
My favorite (so far) programming language was
born 20 years ago. It’s been loved and hated. It’s been
praised and damned. It’s been complimented and criticized. But all
that doesn’t matter. What matters is that it has been helping people all
over the world to solve problems. Tricky, boring, annoying problems.
It provided enough power to build enterprise grade applications, while still
being easy and flexible enough to be the super-glue of many systems.
I’m sure Perl will still be with us in another 20 years. I wish it
to be as useful in that time, as it is now.
Thanks, respect, and best wishes to everyone who created and supported Perl,
its community and tools all these years. Happy birthday!
pixconv.pl is a Perl script to rename (yyyymmdd_nnn.ext), (auto-)rotate, resize,
scale, grayscale, watermark, borderize, and optimize digital images.
Release focus: Major feature enhancements
Changes:
-b/-B border and -C border color options were added along with a -m match images
orientation (landscape or portrait) option. EXIF manipulation was fixed. A -R
resize option was added for correctly resizing portrait images. Handling of
images with whitespace in their filename was fixed
Author:
Iain Lea
Three free Perl e-books
- "Learning Perl the Hard Way"
http://www.greenteapress.com/perl/
Free eBook: "Learning Perl the Hard Way" by Allen B. Downey is designed for
programmers who do not know Perl. Open source book available under the GNU Free
Documentation License. Users can distribute, copy and modify the content.
- "Extreme Perl"
http://www.extremeperl.org/bk/home
Free eBook: "Extreme Perl" by Robert Nagler. Covers extreme programming (an
approach to software development that emphasizes business results and involves
rapid iteration, code writing and continuous testing), release planning,
iteration planning, pair programming, tracking, acceptance testing, coding
style, logistics, test-driven design, continuous design, unit testing,
refactoring and SMOP.
- "Beginning Perl"
http://learn.perl.org/library/beginning_perl/
Free eBook: "Beginning Perl" by Simon Cozens. Fourteen-chapter book covers
simple values, lists and hashes, loops and decisions, regular expressions, files
and data, references, subroutines, running and debugging in Perl, modules,
object-oriented Perl, CGI, databases and more.
An indexer of a file tree, written in Perl. It looks limited to HTML files but can
probably be extended to other types.
About: Kazi is a simple content management system.
It takes a directory tree populated with HTML files,
and builds a menu of it. It can be extended with modules and
customized with templates.
Host Grapher is a very simple collection of Perl scripts that provide graphical
display of CPU, memory, process, disk, and network information for a system.
There are clients for Windows, Linux, FreeBSD, SunOS, AIX and Tru64. No socket
will be opened on the client, nor will SNMP be used for obtaining the data.
Perltidy is a Perl script indenter and beautifier. By default it approximately
follows the suggestions in perlstyle(1), but the style can be adjusted with
command line parameters. Perltidy can also write
syntax-colored HTML output.
Release focus: Minor feature enhancements
XHTML Family Tree Generator is a CGI Perl script together with some Perl
modules that will create views of a family tree. Data can be stored in a database
or in a data file. The data file is either a simple text (CSV), an Excel, or
GEDCOM file listing the family members, parents, and other details. It is possible
to show a tree of ancestors and descendants for any person, showing any number
of generations. Other facilities are provided for showing email directories,
birthday reminders, facehall, and more. It has a simple configuration, makes
heavy use of CGI (and other CPAN modules), generates valid XHTML, and has support
for Unicode and multiple languages.
Release focus: N/A
Changes:
Romanian language support has been added, and the code has been cleaned up.
Sman is "The Searcher for Man Pages", an enhanced version of "apropos" and
"man -k". Sman adds several key abilities over its predecessors, including stemming
and support for complex boolean text searches such as "(linux and kernel) or
(mach and microkernel)". It shows results in a ranked order, optionally with
a summary of the manpage with the searched text highlighted. Searches may be
applied to the manpage section, title, body, or filename. The complete contents
of the man page are indexed. A prebuilt index is used to perform fast searches.
PodBrowser is a documentation browser for Perl. It can be used to view the
documentation for Perl's builtin functions, its "perldoc" pages, pragmatic modules,
and the default and user-installed modules. It supports bookmarks, printing,
and integration with the CPAN search site.
With Config::General you can read and write config files and access the parsed
contents from a hash structure. The format of config files supported by Config::General
is inspired by the Apache config format (and is 100% compatible with Apache
configs). It also supports some enhancements such as here-documents, C-style
comments, and multiline options.
Release focus: Major bugfixes
Changes:
The variable interpolation code has been rewritten. This fixes two bugs. More
checks were added for invalid structures. More tests for variable interpolation
were added to "make test".
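As a rough illustration of what "access the parsed contents from a hash structure" means for an Apache-style file, here is a minimal sketch in Python. It is not Config::General's API or its parsing rules: it handles only whitespace-separated "key value" lines, # comments and named or unnamed blocks, and the function name parse_config is an assumption.

```python
import re


def parse_config(text):
    """Parse a small Apache-style config into nested dicts (illustrative only)."""
    root = {}
    stack = [root]  # innermost open block is always stack[-1]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        closing = re.fullmatch(r"</(\w+)>", line)
        opening = re.fullmatch(r"<(\w+)(?:\s+(.+?))?>", line)
        if closing:
            # Note: the closing tag's name is not checked against the opener.
            if len(stack) == 1:
                raise ValueError(f"unexpected closing tag: {line}")
            stack.pop()
        elif opening:
            block, name = opening.group(1), opening.group(2)
            section = stack[-1].setdefault(block, {})
            if name is not None:
                section = section.setdefault(name, {})
            stack.append(section)
        else:
            fields = line.split(None, 1)
            stack[-1][fields[0]] = fields[1] if len(fields) > 1 else ""
    if len(stack) != 1:
        raise ValueError("unclosed block at end of input")
    return root


if __name__ == "__main__":
    sample = """
    ServerName example.org
    <Directory /var/www>
        Options Indexes
    </Directory>
    """
    config = parse_config(sample)
    print(config["Directory"]["/var/www"]["Options"])
```

Nested blocks become nested dictionaries keyed first by block type and then by block name, so a lookup reads much like the path through the config file.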
rshall
Runs commands on multiple remote hosts simultaneously. (Perl)
View the README
Download version 11.0 -
gzipped tarball, 9 KB
Last update: November 2005
autosync
Copies files to remote hosts based on a configuration file. (Perl)
View the README
Download version 1.4 -
gzipped tarball, 5 KB
Last update: April 2007
"In February, ActiveState
released a free version of its flagship Komodo IDE called Komodo Edit, and that
release was a prelude to going open source. Open Komodo is only a subset of Edit,
though. "
September 6, 2007
Komodo Spawns New Open Source IDE Project
By
Sean Michael Kerner
Development tools vendor ActiveState is opening up parts of its Komodo IDE
in a new effort called Open Komodo.
Komodo is a Mozilla Framework-based application that uses Mozilla's XUL (XML-based
User Interface Language), which is Mozilla's language for creating its user
interface.
The Open Komodo effort will take code from ActiveState's freely available,
but not open source, Komodo Edit product and use it as a base for the new open
source IDE. The aim is to create a community and a project that will help Web
developers to more easily create modern Web-based applications.
"This is our first entry into managing an open source project," Shane Caraveo,
Komodo Dev Lead, told Internetnews.com. "We want to start with a tight
focus on what we want to accomplish and that focus is supporting the Open Web
with a development environment."
Caraveo explained that back in February, ActiveState released a free version
of its flagship Komodo IDE called Komodo Edit, and that release was a prelude
to going open source. Open Komodo is only a subset of Edit, though.
"We're focusing first strictly on Web development," Caraveo said. "So some
of the language support for backend dynamic languages will not be available
as open source. They will still be available for free in Edit and possibly as
extensions to Open Komodo."
The idea behind creating a fully open source IDE for Web development has
been percolating for over a year at ActiveState, according to Caraveo. He said
there are also a lot of people in the Mozilla community that have been discussing
the creation of an IDE.
"I feel there is no need for them to start from nothing, which is a large
investment," Caraveo said. "Since we were a couple months from having everything
done, I felt it was a good time to announce, so we can start to talk with people
in the community about Komodo from a standpoint that they are willing to work
with."
A build of the Open Komodo code base that actually works is expected by late
October or early November. That build according to Caraveo will look and work
much like Komodo Edit does now.
"We want to be sure that people have something they can play with and actually
use immediately, even if it is not the product we want in the end," Caraveo
said.
The longer-term project is something called Komodo Snapdragon. The intention
of Snapdragon is to provide a top-quality IDE for Web development that focuses
on open technologies, such as AJAX, HTML/XML, JavaScript and more.
"We want to provide tight integration into other Firefox-based development
tools as well," Caraveo explained. "This would target Web 2.0 applications,
and next-generation Rich Internet Applications."
With many IDEs already out in a crowded marketplace for development tools,
Open Komodo's use of Mozilla's XUL (pronounced "zule") may well be its key differentiator.
"A XUL-based application uses all the same technologies that you would use
to develop an advanced Web site today," Caraveo said.
"This includes XML, CSS and JavaScript. This type of platform allows people
who can develop Web sites to develop applications. So, I would say that this
is an IDE that Web developers can easily modify, hack, build, extend, without
having to learn new languages and technologies."
Being open and accessible are critical to the success of Open Komodo; in
fact Caraveo noted that the No. 1 success factor is community involvement.
"If Snapdragon is only an ActiveState project, then it has not succeeded
in the way we want it to."
See also Manning Minimal Perl
Only on author site
Jun 26 1996 (Dan Connolly)
Sorry for the length, but I felt inspired tonight...
In article <TMB.96Jun17182...@best.best.com> t...@best.com (.) writes:
>
> In article <Pine.SUN.3.93.960617173341.9643A-100...@blackhole.dimensional.com>
> Kirk Haines <oshcn...@dimensional.com> writes:
>
> > Well it's probably just my stupidity
> > (and that of everyone else who works here) but I've got about 50 Perl
> > scripts that do god knows what, and the people who wrote them left,
> > and we are experiencing excruciating pain.
>
> And that is not a situation in the least bit related to Perl. That is the
> fault of whoever wrote those scripts [...]
> Of course it is _related_ to Perl. Yes, you can write better or worse
> Perl code.
> In fact, one way management can bring about good coding styles without
> examining each and every line of code is by choosing tools and languages
> that enforce some aspects of good coding styles. Perl isn't one
> of those languages.
Short form: (1) there's a tension between early detection of faults and rapid
prototyping, and perl and python are at very different points on the spectrum.
(2) it's more the community around a language than the language itself that
influences code quality. (3) For my purposes, perl will continue to be a work-horse
tool, but I'll be using Java more for things that I would have used python or
Modula-3 for, and I hope the industry uses Java for things that it has been
using C++ for.
Long form:
(1) Traditional perl programming is a black art, but a darned useful craft
as well. The semantics are very powerful, and the syntactic features combine
in amazingly powerful ways. But you definitely have enough rope to hang yourself;
not enough to hang the machine or crash all the time, like the way you can corrupt
the runtime in C by writing past the end of an array or calling free() twice.
But like C, you can introduce subtle logic bugs by using = where you meant ==.
And failing to check return values results in a program that nearly always reports
successful completion, whether it really succeeded or not.
I like studying and learning programming languages, and I found it more difficult
to build the necessary intuitions to read and write traditional perl programs
than to build intuitions for any language I learned previously, and nearly every
language I learned since.
I learned perl "from a master" -- Tom Christiansen was in the next office,
and he painstakingly (if not patiently :-) answered my many frustrated questions.
Previous to learning perl, I had learned a dozen or so languages without much
difficulty (here in roughly the order I learned them):
Extended BASIC (Radio Shack Color Computer)
    learned from a book, disassembled the interpreter
6809 assembler
    learned from a book with a friend, and from disassembling LOTS of stuff
Basic09
    learned from the manuals, with help from BBS folks
Logo
    learned in a store one afternoon, reading a book
Pascal
    learned one summer from a college professor
C
    read a book one weekend
COBOL
    learned at a summer job
shell
    misc. hacking in school
LISP
    read some books, hacked on TI lisp machine in a class
prolog
    programming languages course
Modula-2
    programming languages course
Ada
    programming languages course
Learning assembly after basic was tough: "Where are the variables? Geez..
rebooting the machine all the time is a pain. I wish this thing had automatic
string handling." And I'm not sure I ever grokked Ada's rendezvous stuff completely.
And I learned COBOL in a strictly monkey-see-monkey-do manner. It was months
before I found a manual.
None of those were particularly unexpected difficulties. But after an initial
taste of perl, it looked really easy and powerful, and I was frustrated when
the first few real programs I tried to write had bugs that I just could not
figure out at all.
Really learning to use regexps was well worth the effort, but things like
"surprise! <FILE> works completely different in an array context!"
was an experience I don't care to repeat. I can't remember the exact program
that drove me batty, but it was related to:
$x = <FILE>; # reads one line
@x = <FILE>; # reads whole file, split into lines
but the idiom I used that created an array context wasn't as transparent
as @x -- it was something like grep() or chop(). Ah yes, I think it was chop().
Who would have guessed that
chop(<XXX>);
would read the whole file?
Perl is full of short-hand idioms that are so useful that knowledgeable perl
programmers would feel awkward writing them out long-hand, and yet they can
throw newbies for a loop. For example, the work-horse idiom:
while(<>){
    ...;
}
is short for:
while( ($_ = <STDIN>) gt ''){
    ...;
}
roughly speaking; that is, ignoring the tremendously useful feature of <>
which processes files mentioned on the command line (aka @ARGV) ala traditional
unix filters, which would take me about 10 or 20 lines to write out longhand,
and about an hour to get just right. Ah! and I forgot to mention that <XXX>
is an idiom for reading one line from a file... and lines are delimited by the
magic $/, and ...
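For readers who want to see what the idiom hides, here is a rough longhand sketch of while(<>) in present-day Perl. The sub name for_each_input_line and its callback interface are illustrative assumptions, not something from the original post, and the real <> also maintains $ARGV and $. which this sketch leaves out.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback for every line of every named file,
# falling back to standard input when no files are given (as <> does
# when @ARGV is empty).
sub for_each_input_line {
    my ($callback, @files) = @_;
    @files = ('-') unless @files;            # "-" stands for standard input
    for my $file (@files) {
        my $fh;
        if ($file eq '-') {
            $fh = \*STDIN;
        }
        elsif (!open($fh, '<', $file)) {
            warn "Can't open $file: $!\n";   # <> also warns and moves on
            next;
        }
        # Test with defined(): a final line of "0" with no newline is a
        # false value but still a line.
        while (defined(my $line = <$fh>)) {
            $callback->($line);
        }
        close $fh unless $file eq '-';
    }
    return;
}
```

A call such as for_each_input_line(sub { print $_[0] }, @ARGV) then behaves much like a minimal cat.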
The point is that even as of several years ago, perl is a highly-evolved,
highly idiomatic language and tool, based on zillions of person-years of use
in unix system administration. The vast majority of text-processing/system management
tasks that folks might want to hack up a script to tackle can be developed quickly,
expressed succinctly, and run efficiently in perl.
The first crack usually looks like:
while(<>){
    if(/X-Diagnostic: (.*)/){
        print "diagnostic: $1\n";
    }
}
and it usually works great the first time you try it. Then you add a few
wrinkles, and before you know it, the task you set out to do is solved.
Taking that piece of code that solves a particular problem, and software-engineering
it usually takes about 10x longer than it took to develop in the first place
(as these tasks are often personal and transient, it's rarely worth the trouble
anyway).
The author of the hack is generally in a position to restrict the inputs
to reasonable stuff (eliminating the need to deal with corner cases) and check
the output by hand (eliminating the need to document and report errors in typical
engineering fashion).
This is very much in contrast with other languages, where the cost of solving
the immediate problem may be significantly higher, but the result is much more
likely to have good software engineering characteristics, such that it's useful
to other folks or other projects with little added effort.
For example, Olin Shivers described his experience writing ML programs: they
are a royal pain to get through the type checker, but once they compile, they
are often bug-free.
Python isn't that far along the quick-and-dirty vs. slow-and-clean spectrum,
but it's in that direction.
Contrast the work-horse example above with a loose translation to python:
import sys
while 1:
    line = sys.stdin.readline()
    if not line: break
    ...
incorporating the @ARGV parts of <> would expand it to something like:
import sys
for f in sys.argv[1:]:
    infile = open(f)
    while 1:
        line = infile.readline()
        if not line: break
        ...
Python doesn't have special syntax for this sort of thing. So the python
code is more verbose and less idiomatic -- easier to grok for the newbie, but
harder to "pattern match," or recognize as a common idiom for the seasoned programmer.
For an example of the stylistic slants of the two languages, consider error/exception
conditions. As a rule, in perl, errors are reported as particular return values,
whereas in python, they signal exceptions. So in the error case, a perl code
fragment will run merrily along, while a python code fragment will trap out.
In many text-processing tasks, running merrily along is just what you want.
But when you hand that code to your friend, and he presents it with some input
that you never considered, python is a lot more likely to let your friend know
that the program needs to be enhanced to handle the new situation.
I've seen exception idioms in perl, but they involve die and eval. The runtime
libraries don't die on errors, as a rule, and eval is a pretty hairy way to
do something as mundane as error handling.
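As an illustration of the idiom the post alludes to, the sketch below pairs die with the block form of eval. The function names checked_divide and try_divide are made up for this example; the block form shown here is lighter than evaluating a string of code.

```perl
use strict;
use warnings;

# Hypothetical function that signals failure by dying rather than by
# returning a special value.
sub checked_divide {
    my ($num, $den) = @_;
    die "division by zero\n" if $den == 0;
    return $num / $den;
}

# The block eval traps the die; the trailing 1 makes success unambiguous
# even when the computed result itself is false.
sub try_divide {
    my ($num, $den) = @_;
    my $result;
    my $ok = eval { $result = checked_divide($num, $den); 1 };
    return "ok: $result" if $ok;
    chomp(my $error = $@ || 'unknown error');
    return "error: $error";
}
```

The caller decides what to do with the failure, which is the "trap out" behaviour the post attributes to Python, obtained here by explicit opt-in.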
Next, consider naming and scope. By default, perl variables are global, so
you almost never have to declare them. Local variables have dynamic scope by
default (ala early lisp systems) and traditional statically scoped variables
are a perl5 innovation.
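The difference between the two kinds of scope can be seen in a few lines. This is a minimal sketch with invented names ($level, show_level, with_local, with_my): local gives a global a temporary value that called subs also see, while my creates a new variable that called subs cannot see.

```perl
use strict;
use warnings;

our $level = 'global';           # package (global) variable

sub show_level { return $level }

sub with_local {
    local $level = 'dynamic';    # temporarily replaces the global's value
    return show_level();         # the callee sees 'dynamic'
}

sub with_my {
    my $level = 'lexical';       # a new variable, visible only in this block
    return show_level();         # the callee still sees the global
}
```

Here with_local() returns 'dynamic' and with_my() returns 'global'; once with_local() has returned, the global is back to its old value.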
On the flip side, python variables are local by default, so you almost never
have to worry about the variable clobbering problem. (python has some
semantic gotchas of its own here for the folks who have intuitions about traditional
static scoping)
So far, I have discussed mostly the intrinsic aspects of a language that
vary along the quick-and-dirty vs. slow-and-clean spectrum.
But the point of this article is that:
(2) The community around a language -- i.e. the conventional wisdom, history,
documentation, and available source code -- has a lot more influence on the
quality of code developed in a given language than the intrinsic aspects of
the language itself.
For example, it's perfectly possible to write clean, well structured programs
in Fortran. But the bulk of traditional fortran has no comments or indentation,
and lots of GOTOs, global variables, and aliased variables. The mindset behind
fortran was that hand-optimization was superior to machine-optimization -- a
mindset left over from assembler, and popularized by bad compilers.
COBOL has some really bad features (e.g. lack of local variables) that make
writing good programs hard, but don't come close to explaining the astoundingly
uninspired programming techniques I've seen employed in some business/database
apps. Stuff like writing 12 paragraphs (subroutines, or functions
to the modern world) -- one for each month of the year, with 12 sets of variables
jan-X, jan-Y, feb-X, feb-Y, etc., rather than using loops and arrays, which
DO exist in COBOL.
Perl, as a language, is evolving faster than the perl development community.
Perl5 in strict mode is a reasonable modern object-oriented programming language.
But there are ZILLIONs of perl programmers, and from what I can see, about 2%
of them bought into the new facilities. The rest of them are still happily getting
their jobs done writing perl4 code -- myself included.
Perl was useful and widely deployed before the OOP "paradigm-shift" hit the
industry. And a community with that much momentum doesn't turn on a dime.
In contrast, python started from scratch after some earlier languages, and
had the benefit of looking back at REXX, icon, and perl, as well as C++ and
-- most importantly -- Modula-3. So documentation encouraged some pretty modern
concepts like objects and modules while the python development community was
still young.
As a result, consider the namespace of functions in the two systems: the
languages have roughly equivalent support: python has modules, and perl has
packages. But you might not know that from looking at most of the code you see
on the net: traditional perl folks rarely use the $package'var stuff, while
python folks use it routinely. The perl5 movement is quickly changing this,
but until recently, perl programmers used the vast majority of perl's facilities
without ever considering packages, while python programmers run into the concept
of modules in the early tutorials.
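For comparison, this is roughly what the package facility looks like in Perl 5 syntax, where the separator is written :: rather than the older apostrophe. The package name Counter and its contents are hypothetical, chosen only to show a variable and a sub living in their own namespace.

```perl
use strict;
use warnings;

package Counter;                 # hypothetical package name
our $count = 0;                  # fully qualified name: $Counter::count
sub bump { return ++$count }

package main;

# From outside the package, both names are reached through the prefix.
Counter::bump() for 1 .. 3;
print "count is $Counter::count\n";
```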
For me, the bottom line is that I do a lot of quick-and-dirty stuff, and
I'm comfortable with perl4's idioms, so I use it a lot. I have dabbled in perl5,
but I'm not yet comfortable with its OOP idioms.
I prefer the feel and syntax of python, but the "strictness" often gets in
the way, and I end up switching to perl in order to finish the task before leaving
for the day.
When I want to write "correct" programs, neither is good enough. I want lots
more help from the machine, like static typechecking. And sad to say, when I
want to write code that other folks will use, I choose C.
As much as the industry adopted C++, I find it frightening. It requires all
the priestly knowledge and incantations of perl with none of the rapid-prototyping
benefits, gives no more safety guarantees than C, and has never been specified
to my satisfaction.
Modula-3 was more fun to learn than I had had in years. The precision, simplicity,
and discipline employed in the design of the language and libraries is refreshing
and results in a system with amazing complexity management characteristics.
I have high hopes for Java. I will miss a few of Modula-3's really novel
features. The way interfaces, generics, exceptions, partial revelations, structural
typing + brands come together is fantastic. But Java has threads, exceptions,
and garbage collection, combined with more hype than C++ ever had.
I'm afraid that the portion of the space of problems for which I might have
looked to python and Modula-3 has been covered -- by perl for quick-and-dirty
tasks, and by Java for more engineered stuff. And both perl and Java seem more
economical than python and Modula-3.
Dan
--
Daniel W. Connolly "We believe in the interconnectedness of all things"
Research Scientist, MIT/W3C PGP: EDF8 A8E4 F3BB 0F3C FD1B 7BE0 716C FF21
<conno...@w3.org> http://www.w3.org/pub/WWW/People/Connolly/
DocPerl provides a Web-based interface to Perl's Plain Old Documentation (POD).
It is a graphical, easy-to-use interface to POD, automatically listing all installed
modules on the local host, and any other nominated directories containing Perl
files. DocPerl can also display a summary of the APIs defined by files and the
code of those files. It can search the POD documentation for module names and
for functions defined in modules.
Release focus: Minor bugfixes
Changes:
This release includes fixes for many minor bugs, including the removal of a
configuration option that should not have been removed, and many JavaScript
issues. The code has been tidied up.
Perl Dev Kit 7.0 released...
The Perl Dev Kit (PDK) provides essential tools for building self-contained,
easily deployable executables for Windows, Mac OS X, Linux, Solaris, AIX, and
HP-UX. The comprehensive feature set includes a graphical debugger and code
coverage and hotspot analyzer, as well as tools for building sophisticated Perl-based
filters and easily converting useful VBScript code to Perl.
Release focus: Major feature enhancements
Changes:
A coverage and hotspot analyzer tool, PerlCov, was added for better code performance
and reliability. PerlApp was improved with more sophisticated module wrapping
to improve executable performance. By popular demand, PDK support has been extended
to Mac OS X. New native 64-bit support was added for Windows (x64), Linux (x64),
and Solaris (Sparc). New Solaris and AIX GUIs were added.
Author:
Activator
On this page, I will post aids and tools that Perl provides which allow you
to more efficiently debug your Perl code. I will post updates as we cover material
necessary for understanding the tools mentioned.
CGI::Dump
Dump is one of the functions exported in CGI.pm's :standard
set. Its functionality is similar to that of Data::Dumper.
Rather than pretty-printing a complex data structure, however, this function
pretty-prints all of the parameters passed to your CGI script. That is to
say that when called, it generates an HTML list of each parameter's name
and value, so that you can see exactly what parameters were passed to your
script. Don't forget that you must print the return value of this function
- it doesn't do any printing on its own.
use CGI qw/:standard/;
print Dump;
Benchmark
- As you know by now, one of Perl's mottos is "There's More Than One Way
To Do It" (TMTOWTDI ©). This is usually a Good Thing, but can occasionally
lead to confusion. One of the most common forms of confusion that Perl's
versatility causes is wondering which of multiple ways one should use to
get the job done most quickly.
Analyzing two or more chunks of code to see how they compare time-wise
is known as "Benchmarking". Perl provides a standard module that will Benchmark
your code for you. It is named, unsurprisingly, Benchmark.
Benchmark provides several helpful subroutines, but the most
common is called cmpthese(). This subroutine takes two arguments:
The number of iterations to run each method, and a hashref containing the
code blocks (subroutines) you want to compare, keyed by a label for each
block. It will run each subroutine the number of times specified, and then
print out statistics telling you how they compare.
For example, my solution to ICA5 contained
three different ways of creating a two dimensional array. Which one of these
ways is "best"? Let's have Benchmark tell us:
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark 'cmpthese';
sub explicit {
    my @two_d = ([ ('x') x 10 ],
                 [ ('x') x 10 ],
                 [ ('x') x 10 ],
                 [ ('x') x 10 ],
                 [ ('x') x 10 ]);
}
sub new_per_loop {
    my @two_d;
    for (0..4){
        my @inner = ('x') x 10;
        push @two_d, \@inner;
    }
}
sub anon_ref_per_loop {
    my @two_d;
    for (0..4){
        push @two_d, [ ('x') x 10 ];
    }
}
sub nested {
    my @two_d;
    for my $i (0..4){
        for my $j (0..9){
            $two_d[$i][$j] = 'x';
        }
    }
}
cmpthese (10_000, {
    'Explicit' => \&explicit,
    'New Array Per Loop' => \&new_per_loop,
    'Anon. Ref Per Loop' => \&anon_ref_per_loop,
    'Nested Loops' => \&nested,
    }
);
The above code will print out the following statistics (numbers may be slightly
off, of course):
Benchmark: timing 10000 iterations of Anon. Ref Per Loop, Explicit, Nested Loops, New Array Per Loop...
Anon. Ref Per Loop: 2 wallclock secs ( 1.53 usr + 0.00 sys = 1.53 CPU) @ 6535.95/s (n=10000)
Explicit: 1 wallclock secs ( 1.24 usr + 0.00 sys = 1.24 CPU) @ 8064.52/s (n=10000)
Nested Loops: 4 wallclock secs ( 4.01 usr + 0.00 sys = 4.01 CPU) @ 2493.77/s (n=10000)
New Array Per Loop: 2 wallclock secs ( 1.76 usr + 0.00 sys = 1.76 CPU) @ 5681.82/s (n=10000)
                     Rate Nested Loops New Array Per Loop Anon. Ref Per Loop Explicit
Nested Loops       2494/s           --               -56%               -62%     -69%
New Array Per Loop 5682/s         128%                 --               -13%     -30%
Anon. Ref Per Loop 6536/s         162%                15%                 --     -19%
Explicit           8065/s         223%                42%                23%       --
The benchmark first tells us how many iterations of which subroutines
it's running. It then tells us how long each method took to run the given
number of iterations. Finally, it prints out the statistics table, sorted
from slowest to fastest. The Rate column tells us how many
iterations each subroutine was able to perform per second. The remaining
columns tell us how fast each method was in comparison to each of the other
methods. (For example, 'Explicit' was 223% faster than 'Nested Loops', while
'New Array Per Loop' is 13% slower than 'Anon. Ref Per Loop'). From the
above, we can see that 'Explicit' is by far the fastest of the four methods.
It is, however, only 23% faster than 'Anon. Ref Per Loop', which requires far
less typing and is much more easily maintainable (if your boss suddenly
tells you he'd rather have the two-d array be 20x17, and each cell init'ed
to 'X' rather than 'x', which of the two would you rather had been used?).
You can, of course, read more about this module, and see its other options,
by reading: perldoc Benchmark
- Command-line options
- Perl provides several command-line options which make it possible to
write very quick and very useful "one-liners". For more information on all
the options available, refer to
perldoc perlrun
-e
- This option takes a string and evaluates the Perl code within. This
is the primary means of executing a one-liner.
perl -e'print qq{Hello World\n};'
(On Windows, you may have to use double quotes rather than single. Either
way, it's probably better to use q// and qq// within your one-liner,
rather than remembering to escape the quotes.)
-l
- This option has two distinct effects that work in conjunction. First,
it sets $\ (the output record separator) to the current value of $/
(the input record separator). In effect, this means that every print
statement will automatically have a newline appended. Secondly, when used
with -n or -p, it auto-chomps each line read by the implicit loop, saving
you the typing necessary to do it.
perl -lne '$_ .= q{testing}; print;'
The above would automatically chomp $_, and then add the newline back
on at the print statement, so that "testing" appears on the same line
as the entered string.
-w
- This is the standard way to enable warnings in your one liners.
This saves you from having to type
use warnings;
-M
- This option automatically loads a given module, as if you had written use Module; in your code.
perl -MData::Dumper -le'my @foo=(1..10); print Dumper(\@foo);'
-n
- This disturbingly powerful option wraps your entire one-liner in
a while (<>) { ... } loop. That is, your one-liner will
be executed once for each line of each file specified on the command
line, each time setting $_ to the current line and $. to the current line
number.
perl -ne 'print if /^\d/' foo.txt beta.txt
The above one-line of code would loop through foo.txt and beta.txt,
printing out all the lines that start with a digit. ($_ is assigned
via the implicit while (<>) loop, and both print and m//
operate on $_ if an explicit argument isn't given).
-p
- This is essentially the same thing as -n, except that
it places a continue { print; } block after the while
(<>) { ... } loop in which your code is wrapped. This is useful
for reading through a list of files, making some sort of modification,
and printing the results.
perl -pe 's/Paul/John/' email.txt
This opens the file email.txt, loops through each line, replaces the first
instance of "Paul" on each line with "John", and prints every line (modified
or not) to STDOUT.
-i
- People are sometimes astounded that such a thing is possible
with so little typing. -i is used in conjunction with either -n or -p.
It causes the files specified on the command line to be edited "in-place",
meaning that while you're looping through the lines of the files, all
print statements are directed back to the original files. (That goes
for both explicit prints and the print
in the continue block added by -p.)
If you give -i a string, this string will be used as the suffix of a back-up
copy of the original file. Like so:
perl -pi.bkp -e's/Paul/John/' email.txt msg.txt
The above opens email.txt, replaces the first instance of "Paul" on each line with
"John", and prints the results back to email.txt. The original email.txt
is saved as email.txt.bkp. The same is then done for msg.txt.
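Inside a script, the same in-place behaviour is available through the special variable $^I, which holds the value of the -i switch. The following is a sketch only; the sub name replace_in_place and its argument order are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the first $from on each line with $to,
# editing every file in place and keeping a ".bkp" copy, much like
#   perl -pi.bkp -e 's/Paul/John/' files...
sub replace_in_place {
    my ($from, $to, @files) = @_;
    return unless @files;                    # never fall back to STDIN
    local ($^I, @ARGV) = ('.bkp', @files);   # $^I holds the -i suffix
    while (defined(my $line = <>)) {
        $line =~ s/\Q$from\E/$to/;           # \Q...\E: treat $from literally
        print $line;                         # goes to the file being rewritten
    }
    return;
}
```

Because $^I and @ARGV are localized, the rest of the program keeps its own command-line arguments and its normal STDOUT once the sub returns.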
Remember that any of the command-line options listed here can also be
given at the end of the shebang in non-oneliners. (But please do not start
using -w in your real programs - use warnings; is still preferred
because of its lexical scope and configurability).
Data::Dumper
- The standard Data::Dumper module is very useful for examining exactly
what is contained in your data structure (be it hash, array, or object (when
we come to them)). When you use this module, it exports one
function, named Dumper. This function takes a reference to
a data structure and returns a nicely formatted description of what that
structure contains.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my @foo = (5..10);
#add one element to the end of the array
#do you see the error?
$foo[@foo+1] = 'last';
print Dumper(\@foo);
When run, this program shows you exactly what is inside @foo:
$VAR1 = [
5,
6,
7,
8,
9,
10,
undef,
'last'
];
(I know we haven't covered references yet. For now, just accept my assertion
that you create a reference by prepending the variable name with a backslash...)
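The undef in that dump is the clue: @foo in numeric context is the element count, which is already one past the last index, so @foo+1 skips a slot. A possible corrected version is sketched below; the two $Data::Dumper:: settings are optional configuration variables of the module, added here only to make the output more compact and stable.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent   = 1;   # simpler, fixed-width indentation
$Data::Dumper::Sortkeys = 1;   # stable key order when dumping hashes

my @foo = (5 .. 10);

# Append one element; equivalent to $foo[@foo] = 'last'
# or $foo[$#foo + 1] = 'last'.
push @foo, 'last';

my $dump = Dumper(\@foo);
print $dump;
```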
__DATA__ & <DATA>
- Perl uses the __DATA__ marker as a pseudo-datafile. You can use this
marker to write quick tests which would otherwise involve finding a file name, opening
that file, and reading from that file. If you just want to test a piece
of code that requires a file to be read (but don't want to test the actual
file opening and reading), place the data that would be in the input file
under the __DATA__ marker. You can then read from this pseudo-file using
<DATA>, without bothering to open an actual file:
#!/usr/bin/env perl
use strict;
use warnings;
while (my $line = <DATA>) {
chomp $line;
print "Size of line $.: ", length $line, "\n";
}
__DATA__
hello world
42
abcde
The above program would print:
Size of line 1: 11
Size of line 2: 2
Size of line 3: 5
$.
- The $. variable keeps track of the line numbers of the
file currently being processed via a while (<$fh>) { ... }
loop. More explicitly, it is the number of the last line read from the last
filehandle read.
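One consequence worth seeing in code: when several files are read through <>, $. keeps counting across them unless the ARGV handle is closed at the end of each file. The sketch below uses that documented idiom; the sub name numbered_lines and its output format are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: label each line of the given files with the file
# name and a line number that restarts at 1 for every file.
sub numbered_lines {
    my (@files) = @_;
    my @out;
    return @out unless @files;    # avoid reading STDIN by accident
    local @ARGV = @files;
    while (defined(my $line = <>)) {
        chomp $line;
        push @out, "$ARGV:$.: $line";
    }
    continue {
        close ARGV if eof;        # eof without parentheses: end of this file;
                                  # closing ARGV resets $.
    }
    return @out;
}
```

Without the continue block, the first line of the second file would be numbered one higher than the last line of the first.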
__FILE__ & __LINE__
- These are two special markers that return, respectively, the name of
the file Perl is currently executing, and the line number at which the marker appears.
These can be used in your own debugging statements, to remind yourself where
your outputs were in the source code:
print "On line " . __LINE__ . " of file " . __FILE__ . ", \$foo = $foo\n";
Note that neither of these markers is a variable, so they cannot be interpolated
in a double-quoted string.
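If typing the markers at every call site becomes tedious, the built-in caller function reports the file and line of whoever called the current sub. A minimal sketch, with debug_message as an invented helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: format a debug message tagged with the caller's
# file and line, so call sites need not spell out __FILE__ and __LINE__.
sub debug_message {
    my ($text) = @_;
    my (undef, $file, $line) = caller;   # package, file, line of the call
    return "[$file:$line] $text";
}

my $foo = 42;
my $line_of_call = __LINE__ + 1;         # the call is on the next line
my $message = debug_message("\$foo = $foo");
print STDERR "$message\n";
```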
warn() & die()
- These are the most basic of all debugging techniques. warn()
takes a list of strings, and prints them to STDERR. If the last element
of the list does not end in a newline, warn() will also print
the current filename and line number on which the warning occurred. Execution
then proceeds as normal.
die() is identical to warn(),
with one major exception - the program exits after printing the list of
strings.
All debugging statements should make use of either warn()
or die() rather than print(). This will ensure
you see your debugging output even if STDOUT has been redirected, and will
give you the helpful clues of exactly where in your code the warning occurred.
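The effect of the trailing newline can be checked directly. In this sketch the warnings are captured through the $SIG{__WARN__} hook instead of going to STDERR, purely so they can be inspected; the variable names are illustrative.

```perl
use strict;
use warnings;

my @captured;
{
    # Collect warnings instead of printing them, for this demonstration only.
    local $SIG{__WARN__} = sub { push @captured, $_[0] };

    warn "no newline: Perl appends the location";
    warn "with newline: printed exactly as written\n";
}

# Inside a block eval, die leaves its message in $@ instead of ending
# the program.
my $survived    = eval { die "fatal problem\n"; 1 };
my $die_message = $@;

print STDERR $_ for @captured;
```

The first captured message ends in something like " at script.pl line 9." while the second is unchanged.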
OK, this is starting to look ugly. Like a regex match, we can pull that apart
with a trailing x:
s/
(
^ # either beginning of line
| # or
(?<=,) # a single comma to the left
)
.*? # as few characters as possible
(
(?=,) # a single comma to the right
| # or
$ # end of string
)
/XXX/gx;
That's much easier to read (relatively speaking).
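To try the commented pattern out, it can be wrapped in a small sub and compared with a plainer split/join version. Both sub names are made up for this sketch, and the comparison is only claimed for records whose fields are all non-empty, such as the sample string used later in the text.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the commented substitution shown above.
sub mask_fields {
    my ($record) = @_;
    $record =~ s/
        (
          ^         # either beginning of line
          |         # or
          (?<=,)    # a single comma to the left
        )
        .*?         # as few characters as possible
        (
          (?=,)     # a single comma to the right
          |         # or
          $         # end of string
        )
    /XXX/gx;
    return $record;
}

# The same result for non-empty fields, without any lookaround.
sub mask_fields_split {
    my ($record) = @_;
    my @fields = split /,/, $record, -1;
    return join ',', ('XXX') x @fields;
}
```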
Like a regular expression match, we can use an alternate delimiter for the
left and right sides of the substitution:
$_ = "hello";
s%ell%ipp%; # $_ is now "hippo"
The rules are a bit complicated, but it works precisely the way Larry Wall wanted
it to work. If the delimiter chosen is not one of the special
characters that begins a pair, then we use the character twice more to both
separate the pattern from the replacement and to terminate the replacement,
as the example above showed.
However, if we use the beginning character of a paired character set (parentheses,
curly braces, square brackets, or even less-than and greater-than), we close
off the pattern with the corresponding closing character. Then,
we get to pick another delimiter all over again, using the same rules. For example,
these all do the same thing:
s/ell/ipp/;
s%ell%ipp%;
s;ell;ipp;; # don't do this!
s#ell#ipp#; # one of my favorites
s[ell]#ipp#; # [] for pattern, # for replacement
s[ell][ipp]; # [] for both pattern and replacement
s<ell><ipp>; # <> for both pattern and replacement
s{ell}(ipp); # {} for pattern, () for replacement
No matter what the closing delimiter might be for either the pattern or the
replacement, we can include the character literally by preceding it with a backslash:
$_ = "hello";
s/ell/i\/n/; # $_ is now "hi/no";
s/\/no/res/; # $_ is now "hires";
To avoid backslashing, pick a distinct delimiter:
$_ = "hello";
s%ell%i/n%; # $_ is now "hi/no";
s%/no%res%; # $_ is now "hires";
Conveniently, if a paired character is used, the pairs may be nested without
invoking any backslashes:
$_ = "aaa,bbb,ccc,ddd,eee,fff,ggg";
s((^|(?<=,)).*?((?=,)|$))(XXX)g; # replace all fields with XXX
Note that even though the pattern contains closing parentheses, they are all
paired with opening parentheses, so the pattern ends at the right place.
The right side of the substitution operation is generally treated as if it
were a double-quoted string: variable interpolation and backslash interpretation
are performed directly:
$replacement = "ipp";
$_ = "hello";
s/ell/$replacement/; # $_ is now "hippo"
The left side of a substitution is also treated as if it were
a double-quoted string (with a few exceptions), and this interpolation happens
before the result is evaluated as a regular expression:
$pattern = "ell";
$replacement = "ipp";
$_ = "hello";
s/$pattern/$replacement/; # $_ is now "hippo"
Using this form of pattern, Perl is forced to compile the regular expression
at runtime. If this happens in a loop, Perl may need to recompile the regular
expression repeatedly, causing a slowdown. We can give Perl a hint that the
pattern is really a regular expression by using a regular expression literal:
$pattern = qr/ell/;
$replacement = "ipp";
$_ = "hello";
s/$pattern/$replacement/; # $_ is now "hippo"
The qr operation creates a Regexp object, which interpolates into
the pattern with minimal fuss and maximal speed.
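A typical place where this pays off is a loop that applies several patterns to many lines. The sketch below compiles each pattern once with qr and reuses the compiled objects; the sub name count_matches and the word-boundary, case-insensitive matching are choices made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: count how many lines contain each word.
# Every pattern is compiled once with qr// and reused for every line.
sub count_matches {
    my ($lines, @words) = @_;
    my %compiled = map { $_ => qr/\b\Q$_\E\b/i } @words;
    my %count    = map { $_ => 0 } @words;
    for my $line (@$lines) {
        for my $word (@words) {
            $count{$word}++ if $line =~ $compiled{$word};
        }
    }
    return \%count;
}
```

For instance, count_matches(\@lines, 'Paul', 'John') returns a hash reference keyed by the two names.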
A useful topic, especially about attachment sending.
If you need to send emails from a host but don't want to run
sendmail, this tech tip explains how to use Perl
to send emails. This procedure can be used on a host such as a Sun Fire V120
server running the Solaris 9 OS.
- Overview
- Sending Simple Email
- Required Perl Modules
- Simple Email Script: email.pl
- Sending Email Attachments
- Required Perl Modules
- The Email Attachment Perl Script
- About the Author
CGI/Perl script for uploading files
Here's a small Perl script that I have used for uploading files to a web server.
The location can be changed; right now it saves the files to
/tmp/upload1.
#!/usr/bin/perl
use CGI;
my $query = CGI->new;
print $query->header();
# Expects the client to send the file to be uploaded in an input field
# named "file"
my $filename = $query->param("file");
# Keep only the base name before building the path, so that a directory
# part supplied by the client cannot redirect the write
$filename =~ s/.*[\/\\](.*)/$1/;
my $fpath1 = "/tmp/upload1/$filename";
open(UPLOADFILE, ">", $fpath1) || die "Cannot open file $fpath1: $!";
my $upload_filehandle = $query->upload("file");
my $buf;
while (read($upload_filehandle, $buf, 1024)) {
    print UPLOADFILE $buf;
}
close UPLOADFILE;
# This has been tested on Solaris only
# Can be used to transfer binary files also
# For Windows, binmode may be needed
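Since the file name comes from the client, it is worth cleaning it more strictly before using it in a path. The following is one possible sketch, not part of the original script: the sub name safe_upload_name and the set of allowed characters are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a client-supplied file name to a safe base
# name, or return undef if nothing usable is left.
sub safe_upload_name {
    my ($client_name) = @_;
    return unless defined $client_name;
    (my $base = $client_name) =~ s{.*[/\\]}{}s;   # drop any directory part
    $base =~ s/[^A-Za-z0-9._-]/_/g;               # replace unexpected characters
    $base =~ s/^\.+//;                            # no leading dots ("..", ".profile")
    return length $base ? $base : undef;
}
```

The upload script could then refuse the request whenever the helper returns undef instead of opening a file.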
The table of contents, two sample chapters, and the index from Data Munging
with Perl are available in PDF format. You need Adobe's free Acrobat Reader
software to view it. You may download Acrobat Reader
here.
Download the Table of Contents
Download Chapter 2
Download Chapter 3
Download the Index
... ... ...
Source code from Data Munging with Perl is contained
in either a single ZIP file, or a Unix gzipped and tarred file archive.
Free unzip programs for most platforms are available at
Info-Zip.
Download the source code:
cross_src.zip (44 Kb)
or
cross_src.tar.gz (19 Kb)
Playing Chomp
by Gábor Szabó
Abstract
Though some of us might think so, chomp is not only a Perl function. It is also
the name of a NIM-like Combinatorial Game that was unsolved until recently.
It has a solution and implementation in Maple and I am writing an implementation
in Perl for educational and research purposes.
Introduction
When I went to high school in the early 1980s in Budapest, Hungary, I used to
play a game with a classmate that we called eating chocolate. We did not
really play it, as we knew that there was a winning strategy for the player
who moved first, but we tried to find a mathematical description of that winning
strategy. To that end I wrote several programs to compute the winning positions,
but we did not get any results.
A few years later I bought a book called "Dienes Professzor Játékai" [DIENES]
in Hungarian translation, but until recently I had looked at only a couple of
pages of it.
Then about a year ago I decided it was time to learn how to create and upload
a module to CPAN. As the explanation of how to get accepted in PAUSE
was rather discouraging, I decided to play it safe and start with a module
that probably no one else wants to develop but which would be nice to have on
CPAN: Games::NIM. I planned to develop the module to play the game and to calculate
the winning positions for NIM, and later to extend it to Chocolate. To my surprise
I got the access and uploaded version 0.01 in December 2001, and then it got
stuck at that version.
Now when I thought about attending YAPC::Europe::2002 I decided to renew the
work on Games::NIM and proposed a talk about complexity in algorithms in
connection with that module and another module called Array::Unique.
When the proposal got accepted I suddenly discovered that I did not have much to
say about the subject and had to work really hard in order to give you something
worthwhile. So I started to work on Games::NIM again and read the book of Dienes
[DIENES] about games and another very useful one called
"Mastering Algorithms with Perl" [ALGORITHMS]. I suddenly
discovered that the game I knew as the chocolate eating game is actually known as
Chomp and that it is still basically unsolved. It all sounded very encouraging.
Welcome to the log4perl project page.
Log::Log4perl is a Perl port
of the widely popular log4j
logging package.
Logging beats a debugger if you want to know what's going
on in your code during runtime. However, traditional logging packages are too
static and generate a flood of log messages in your log files that won't help
you.
Log::Log4perl is different.
It allows you to control the amount of logging messages generated very effectively.
You can bump up the logging level of certain components in your software, using
powerful inheritance techniques. You can redirect the additional logging messages
to an entirely different output (append to a file, send by email etc.) -- and
everything without modifying a single line of source code.
Further reading
[Mar 25, 2006]
Beginning Perl
now available as an eBook from Perl.com. This is a very good intro book!
This is a great idea that might change the way UNIX is perceived (C-written
somewhat archaic system with non-uniform set of obscure command line utilities)
and used.
Perl/Linux is a Linux distribution where all programs are written in Perl.
The only compiled code in this Perl/Linux system is the Linux Kernel (not currently
built with this project), Perl, and uClibc.
About: Ryan's In/Out Board (formerly known as Whosin) is a simple
and quick Perl-driven Web-based in/out board for use on intranets and extranets.
Users can change their status by clicking their name or calling the script with
a name parameter, allowing for desktop shortcuts which give single click "check-in/out"
links. Custom and/or default comments can be added to their status. No database
system is required, you just need a Web server and Perl. A script to check all
staff out is also provided, which is handy if called as an overnight cron job.
It uses the Date::EzDate Perl module.
Changes: A few people were having problems with data files not being
written to. This version will print read/write errors to the browser if it encounters
them. It does not fix any read/write issues similar to the ones people were
experiencing, because there's nothing to fix as such. Those errors were related
to filesystem permissions and thus beyond the realm of the script.
About: otl is intended to convert a text file to a HTML or XHTML file.
It is different than many other text-to-HTML programs in that the input format
(by default a simple highly readable plain text format) can be customized by
the user, and the output format (by default XHTML) can be user-defined. It can
process complex structures such as ordered and unordered lists (nested or not),
and add custom "headers" and "footers" to documents. The conversion utilizes
Perl regex, adding quite a bit of flexibility and power to the conversion process.
Since both the syntax of the source file and of the output can be readily customized,
otl in theory can be used for many types of conversions. The package also includes
tag-remove, a script for stripping HTML/XHTML-ish tags from documents.
Changes: The "chempretty" script has been removed and replaced with
a more general script, "otlsub". With otlsub, you can perform a set of search/replace
operations on a set of files using a Perl regex for matching. otlsub supports
recursion, allowing you to descend through a directory tree and process all
files matching a filename pattern. otlsub automatically adjusts references to
local files in hyperlinks depending on directory depth. New otl features include
a --descend option (recursive descent through all subdirectories) and various
other minor modifications.
[Feb 28, 2006] Visual Python (Python), and Visual Perl (Perl) integrate
with Visual Studio 2005
Perl isn't the last, best programming language
you'll ever use for every task. (Perl itself is a C program, you know.) Sometimes
other languages do things better. Take logic programming--Prolog handles relationships
and rules amazingly well, if you take the time to learn it. Robert Pratte shows
how to take advantage of this from Perl.
[Perl.com]
Kendrew Lau taught HTML development to business students. Grading web pages
by hand was tedious--but Perl came to the rescue. Here's how Perl and HTML parsing
modules helped make teaching fun again.
[Perl.com]
Programmers often use flat files when storing
small amounts of data, such as short-lived caching
information. For example, for one project I was working on, I
needed to store the unique IP address of each visitor and
the time the entry occurred. I used flat files for this task because
it was not very data intensive, and the information was cleared every
15 minutes.
When doing something like this, you can
take two different approaches. You can create a file for each visitor
(which is what I did, as I needed to store extra information), something
that I like to call flat-files, or you can use a single file for all
entries.
When creating many different files you will
need to ensure that you have a unique filename for each
file, otherwise files will start to overlap after some time. You can
use the Digest::SHA1 module to generate a 160-bit signature from random
data (only in incredibly rare cases will two signatures be the same),
although there are a number of different ways to do this. Once you generate
the unique name you can start to create the flat file.
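A minimal sketch of generating such a name follows. It swaps in the core Digest::SHA module, whose sha1_hex function produces the same 160-bit SHA-1 digest as Digest::SHA1. The helper name unique_filename, the seed made of the time, the process ID and a random number, and the ".dat" suffix are all assumptions made for illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical helper: build a file name from a 160-bit SHA-1 digest
# (40 hex digits) of the current time, the process ID and a random number.
sub unique_filename {
    my ($dir) = @_;
    my $seed = join '-', time, $$, rand();
    return "$dir/" . sha1_hex($seed) . ".dat";
}

my $unique_filename = unique_filename('/tmp');
print "$unique_filename\n";
```

A collision is extremely unlikely but not impossible, so a cautious program could still test for an existing file before writing.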
# Open file for write only or die.
open(FH, "> $unique_filename") or die("Error: $!");
# Lock the file.
flock(FH, 2);
# Save the remote ip address, a null, and then the time.
print FH $ENV{REMOTE_ADDR}, "\0", time;
# Close the file and release lock or die.
close(FH) or die("Error: $!");
Now this takes care of saving the data in
flat-files. Retrieving data from a simple structure like this is very
simple.
# We open the file for reading only or die.
open(FH, "$unique_filename") or die("Error: $!");
# Read the first line from open file.
$line = <FH>;
# Close the file or die.
close(FH) or die("Error: $!");
# Separate the data using split.
($remote_addr, $create_time) = split(/\0/, $line);
In this example, $ENV{REMOTE_ADDR} and
the time since the epoch are saved in the $unique_filename file. Be careful
to watch for security risks when using a variable in an open (for more
information read the perlsec man page or view it online at http://www.perl.com/pub/doc/manual/html/pod/perlsec.html).
Using the same fundamental ideas you can create much more complex data
structures within flat-files.
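The write and read steps can also be sketched with the three-argument form of open, which keeps mode characters in a file name from being interpreted, and with the symbolic lock constants from Fcntl instead of the bare number 2. The helper names save_entry and load_entry, the temporary directory, and the fallback address 192.0.2.1 are assumptions made for this illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);            # LOCK_EX and LOCK_SH instead of 2 and 1
use File::Temp qw(tempdir);

# Write one record: remote address, a null byte, then the epoch time.
# As in the example above, the file is truncated by open before the lock is taken.
sub save_entry {
    my ($file, $remote_addr) = @_;
    open(my $fh, '>', $file) or die "Error opening $file: $!";
    flock($fh, LOCK_EX)      or die "Error locking $file: $!";
    print {$fh} $remote_addr, "\0", time;
    close($fh)               or die "Error closing $file: $!";
}

# Read the record back and return (remote address, creation time).
sub load_entry {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Error opening $file: $!";
    flock($fh, LOCK_SH)      or die "Error locking $file: $!";
    my $line = <$fh>;
    close($fh)               or die "Error closing $file: $!";
    return split /\0/, $line;
}

my $dir  = tempdir(CLEANUP => 1);      # stand-in for the real storage directory
my $file = "$dir/entry.dat";           # stand-in for $unique_filename
my $addr = defined $ENV{REMOTE_ADDR} ? $ENV{REMOTE_ADDR} : '192.0.2.1';

save_entry($file, $addr);
my ($remote_addr, $create_time) = load_entry($file);
print "$remote_addr created at $create_time\n";
```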
As I mentioned earlier, the other way of
using flat files is to create one larger file for all entries. Retrieving
data from this kind of flat file database can be slower as data increases,
so only use this if it presents something beneficial to your programs.
You've been warned! The basic ideas for using this type of flat file
database are virtually the same as for flat-files.
Rather than opening the file for writing
as we did in the flat-files example, we have to open the file for appending,
because overwriting data will not help us in this example. We must also
separate each entry by a delimiter (I will use the newline character),
and we no longer need to use $unique_filename in open because the filename
will be static.
# Open file for append or die.
open(FH, ">> ./cache.db") or die("Error: $!");
# Lock the file.
flock(FH, 2);
# Save the unique id, a null, remote ip address, a null,
# and then the time since epoch.
print FH $unique_id, "\0", $ENV{REMOTE_ADDR}, "\0", time,
"\n";
# Close the file and release lock or die.
close(FH) or die("Error: $!");
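Retrieval from the single-file variant can be sketched as a linear scan, which also shows why lookups slow down as the file grows. The helper names append_entry and find_entry, the sample IDs and addresses, and the temporary location of cache.db are assumptions made for this illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Append one record: unique id, remote address and time, null-separated,
# with a newline closing the entry.
sub append_entry {
    my ($db, $unique_id, $remote_addr, $when) = @_;
    open(my $fh, '>>', $db) or die "Error opening $db: $!";
    flock($fh, LOCK_EX)     or die "Error locking $db: $!";
    print {$fh} join("\0", $unique_id, $remote_addr, $when), "\n";
    close($fh)              or die "Error closing $db: $!";
}

# Scan the file line by line until the wanted id is found.
# Returns (remote address, creation time), or an empty list if there is no match.
sub find_entry {
    my ($db, $wanted_id) = @_;
    open(my $fh, '<', $db) or die "Error opening $db: $!";
    flock($fh, LOCK_SH)    or die "Error locking $db: $!";
    my @found;
    while (my $line = <$fh>) {
        chomp $line;
        my ($id, $remote_addr, $create_time) = split /\0/, $line;
        if ($id eq $wanted_id) {
            @found = ($remote_addr, $create_time);
            last;
        }
    }
    close($fh) or die "Error closing $db: $!";
    return @found;
}

my $dir = tempdir(CLEANUP => 1);
my $db  = "$dir/cache.db";
append_entry($db, 'id-1', '192.0.2.1', 1000000000);
append_entry($db, 'id-2', '192.0.2.2', 1000000060);

my ($remote_addr, $create_time) = find_entry($db, 'id-2');
print "$remote_addr created at $create_time\n";
```

Every lookup reads the file from the start, so the cost grows with the number of entries, whereas the one-file-per-visitor layout opens the right file directly.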