A new competitor seemed to emerge out of the woodwork every month or so.
The first thing I would do, after checking to see if they had a live online
demo, was look at their job listings. After a couple years of this I could
tell which companies to worry about and which not to. The more of an IT
flavor the job descriptions had, the less dangerous the company was.
The safest kind were the ones that wanted
Oracle experience. You never had to worry about those.
You were also safe if they said they wanted C++ or Java developers.
If they wanted Perl or Python programmers,
that would be a bit frightening-- that's starting to sound like a company
where the technical side, at least, is run by real hackers.
If I had ever seen a job posting looking
for Lisp hackers, I would have been really worried.
-- Paul Graham co-founder, Viaweb
This is the fourth page of an ongoing series of pages covering scripting language
topics for the beginning programmer (others cover
Unix shells,
TCL and
Perl).
Python is a language influenced by Perl but incorporating European ideas about
programming languages, more specifically those of Niklaus Wirth. Python's core syntax
and certain aspects of its philosophy are directly inherited from
ABC. The first version of Python did not introduce any new ideas and by and large
was just a cleaner Perl. It was released in 1991, the same year as Linux. Wikipedia
has a short article about Python history:
History of Python - Wikipedia,
the free encyclopedia
But starting with version 2.2 it added support for coroutines in a platform-independent
way (via the generators concept inspired by
Icon) and became a class of its own. I think that this makes Python in certain
areas a better scripting language than the other members of the "most popular troika"
(Perl, PHP and JavaScript). The availability of coroutines favorably distinguishes Python
from Perl.
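To make the generator point concrete, here is a minimal sketch written for Python 3. The names `countdown` and `running_average` are illustrative only and do not come from any library. The first function is a plain generator; the second relies on `send()`, which arrived later (in Python 2.5), to behave like a simple coroutine that keeps its state between calls.

```python
def countdown(n):
    """Plain generator: execution is suspended at each yield."""
    while n > 0:
        yield n
        n -= 1


def running_average():
    """Coroutine-style generator: receives values through send()."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # pause here until the caller sends a value
        total += value
        count += 1
        average = total / count


print(list(countdown(3)))

avg = running_average()
next(avg)                       # advance to the first yield to "prime" the coroutine
results = [avg.send(x) for x in (10, 20, 60)]
print(results)
```

The local variables `total` and `count` survive between `send()` calls without any class or global state, which is the property that makes this style feel like a paradigm rather than a single feature.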
Another important feature of Python is that it enjoys the support of Google (which
employs Python's creator,
Guido van Rossum) and
Microsoft (IronPython and support in Visual Studio and other tools like Expression
Web). Currently Python is shipped as a standard component only with Linux and
FreeBSD. Neither Solaris, nor AIX, nor HP-UX includes Python by default (but they do
include Perl).
Now Ruby competes with Python in this area, and programmers who value the coroutine
paradigm of software development (and it really is a paradigm, not a language feature)
can try both languages and compare the level of integration and power provided (my
impression is that Ruby is slightly better in this area).
Still, Python enjoys the support of Google, and there is still no similar sponsor for
Ruby. Look at the history of JavaScript, which survived as an abandoned language
but paid a heavy price both in the speed of development of the language and in popularity
:-(.
One of the main reasons that Python is popular as a first language is that it
is more or less forgiving and, what is more important, more helpful about syntax
errors (due to its complex lexical structure and syntax, Perl is as horrible in that
respect as one can get -- a real nightmare).
Python also has another innovative aspect (the new is the well-forgotten old: FORTRAN
4 used indentation to distinguish between labels and statements ;-): it
uses indentation to determine the nesting of statements. Multiline statements are marked
like a multiline UNIX command: with a backslash. Closing several blocks at once becomes
just a matter of moving the next statement the appropriate amount to the left.
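A small sketch of both points, written for Python 3 (the function `classify` is a made-up example): nesting is expressed purely by indentation, one dedent closes several blocks at once, and a trailing backslash continues a statement onto the next line.

```python
def classify(numbers):
    """Count small even, large even, and odd numbers."""
    small_even = large_even = odd = 0
    for n in numbers:
        if n % 2 == 0:
            if n < 10:
                small_even += 1
            else:
                large_even += 1
        else:
            odd += 1
    # Dedenting back to the function body closes the if/else and the for loop at once.
    return small_even, large_even, odd


# A backslash at the end of a line continues the statement on the next line.
total = 1 + 2 + \
        3 + 4

# Inside parentheses, brackets or braces no backslash is needed.
also_total = (1 + 2 +
              3 + 4)

print(classify([2, 4, 12, 7]), total, also_total)
```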
Another attractive feature is clean and powerful I/O statements (although Python
lacks the power of the elegant Perl open statement).
Finally, Python scales quite well from learning tool to professional instrument.
One can learn it at a basic procedural level and then learn coroutine-based programming
in a relatively short period of time (only Modula permits the same flexibility).
Although Python is a scripting language with an application area similar
to Perl's, the similarities seem to end quickly. They're not overly similar in implementation
details, nor even remotely similar in syntax. Their extension mechanisms also take
entirely different directions. Perl's motto seems to be TIMTOWTDI with an
attitude of "whatever gets the job done", whereas Python seems to follow the practice
of KISS, preferring simplicity and consistency in design to flexibility.
We all understand that nowadays the best language does not necessarily win. Fortunately there
are several positive signs for Python:
Jython is one of the few scripting languages that integrates well with Java.
That makes it ideal for rapid development and for easily maintained programs.
Python has a more or less clean interface with C and C++. Modules which do
need speed can be replaced later by C++.
Python has an extensive library of modules. Among them:
pickle - object serialization
re - regular expressions
struct - access to C structs
ConfigParser - parse config files
getopt - parse command-line options
socket (also threading, SocketServer)
cgi (also CGIHTTPServer, urllib)
urlparse - parse a URL into (scheme,
netloc, path, parameters, query, fragment)
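A brief tour of a few of the modules listed above, as a sketch for Python 3. Note that the list uses the Python 2 names; in Python 3 `urlparse` lives in `urllib.parse` (and `ConfigParser` and `SocketServer` were renamed to `configparser` and `socketserver`). The sample data and the `www.example.com` URL are placeholders.

```python
import pickle
import re
import struct
from urllib.parse import urlparse   # urlparse.urlparse in Python 2

# pickle: serialize an object to bytes and restore it.
record = {"name": "demo", "sizes": [1, 2, 3]}
restored = pickle.loads(pickle.dumps(record))

# re: extract the numeric fields from a version string.
version = [int(part) for part in re.findall(r"\d+", "Python 2.7.18")]

# struct: pack two unsigned 16-bit integers in big-endian order, then unpack them.
packed = struct.pack(">HH", 1, 258)
unpacked = struct.unpack(">HH", packed)

# urlparse: split a URL into its six components.
parts = urlparse("http://www.example.com/docs/index.html;v=1?lang=en#top")

print(restored == record, version, packed, unpacked)
print(parts.scheme, parts.netloc, parts.path, parts.params, parts.query, parts.fragment)
```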
Indentation style in Python is actually an interesting innovation that very few
people understand. But in making this revolutionary step -- relegating block brackets
to the lexical level of blank space and comments -- Python failed to make a necessary
adjustment and include a pretty printer in the interpreter (with the possibility of
creating a pretty-printed program from comments). Such a pretty printer actually needs
to understand two things: the format of comments that suggest indenting, like
//{ <label>
//} <label>
and the current number of spaces in a tab, like pragma tab = 8. The interesting
possibility is that in the pretty-printed program those comments can be removed and,
after reading the pretty-printed program into the editor, reinstated automatically.
Such a feature would be extremely cool and can be implemented in any scriptable editor
like THE or Emacs (note: if you like the keybindings and programmability of
Emacs, but don't like its size, try
JED, which also
has a pretty good Python mode).
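The comment-driven pretty printer described above can be sketched in a few lines. This is only an illustration of the idea, not an existing tool: the function name `reindent` is hypothetical, and the markers are assumed to be written as Python comments (`#{ label` and `#} label`) rather than the `//` form shown above.

```python
def reindent(lines, tab=4):
    """Re-indent flat source text using '#{ label' / '#} label' marker comments."""
    depth = 0
    open_labels = []
    result = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#{"):
            # Opening marker: remember the label and indent what follows.
            open_labels.append(line[2:].strip())
            depth += 1
            continue
        if line.startswith("#}"):
            # Closing marker: the label must match the innermost open block.
            label = line[2:].strip()
            if not open_labels or open_labels[-1] != label:
                raise ValueError("unexpected block close: %r" % label)
            open_labels.pop()
            depth -= 1
            continue
        # Ordinary line: markers are dropped, indentation is regenerated.
        result.append(" " * (tab * depth) + line)
    if open_labels:
        raise ValueError("unclosed block: %r" % open_labels[-1])
    return result


flat = [
    "for x in range(2):",
    "#{ loop",
    "if x:",
    "#{ check",
    "print(x)",
    "#} check",
    "#} loop",
    "print('done')",
]
pretty = reindent(flat)
print("\n".join(pretty))
```

The labels give the same safety net as a labeled block close: a mismatched or missing marker is reported instead of silently producing wrongly nested code.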
Python actually encourages a programmer to use a decent editor, but we knew that
already, right? The main benefit I see in syntactic indenting is that it narrows
down the possible range of coding styles. If you think about it, most of the (dare
I say) splintering of C/C++/Java coding styles is due to the placement of the {
and } symbols. This acts against readability for other developers. Since there are
no braces, there are no style wars over where to put the braces, and that is a very
important advantage for any environment with several outsized egos.
An often overlooked advantage of this Python feature is not only that the language
saved two symbols, important for any scripting language, for other uses, but also that
such a solution automatically leads to more compact programs (in terms of the number
of lines): deleting curly brackets usually reduces the number of lines in a C or Perl
program by 20% or more.
My impression is that few people understand that the C solution for blocks
({ } blocks) was pretty weak in comparison with its prototype language (PL/1): it
does not permit a nice labeled closing of blocks like
A:do ... end A;
in PL/1. IMHO the introduction of a pretty printer as a standard feature of both
a compiler and a GUI environment is long overdue, and here Python can make its mark.
Moreover, such an approach might help to compensate somewhat for all those OO excesses
that lengthen a program two or three times in comparison with a pure procedural
language like PL/1, Ada or Modula-2.
Here are some companies that are using Python in commercial apps (borrowed from
a
Slashdot post; now you need to add Google to the list):
Advanced Management Solutions Inc.
AMS provides the AMS REALTIME suite of enterprise software for project management,
resource management, cost management and timesheets. The Python language
engine is embedded in AMS REALTIME as a means of extending the products,
and also as a way of enabling custom behavior and company-specific business
rules to be supported.
CWI CWI, Python's home, has used
Python in, among other things, GrINS, a 20,000 line authoring environment
for transportable hypermedia presentations, and a 5,000 line multimedia
teleconferencing tool, as well as many many smaller programs. See the collection
of multi-media project papers.
Digital Creations A long-time sponsor
of the PSA, Digital Creations develops with Python - and also makes some
of their Python software available for free!
ILU ILU (it's spelled Inter-Language
Unification but it's pronounced eye-loo) is a (very) CORBA-ish multi-language
object interface system. It has bindings for Common Lisp, C++, ANSI C, Modula-3
and Python.
Infoseek Ultraseek Server, Infoseek's
commercial search engine product, is implemented as an elaborate multi-threaded
Python program with the primitive indexing and search operations performed
by a built-in module. Most of the program is written in Python, and both
a built-in spider and HTTP server can be customized with additional Python
code. The program contains over 11,000 lines of Python code, and the user
interface is implemented with over 17,000 lines of Python-scripted HTML
templates. Try it out on the Python.Org web search page or download an evaluation
copy from Infoseek Software.
eGroups.com (previously Findmail)
A comprehensive public archive of Internet mailing lists, implemented in
pure Python. Latest statistics from Scott Hassan: 180,000 lines of Python
doing everything from a 100% dynamic website to all email delivery, pumping
out 200 messages/second on a single 400 MHz Pentium!
LLNL A group at the Lawrence Livermore
National Laboratories is basing a new numerical engineering environment
on Python, replacing a home-grown scripting language of ten-year standing.
Paul Dubois is a central figure in that effort.
NASA Johnson Space Center uses Python
in its Integrated Planning System as the standard scripting language. Efforts
are underway to develop a modular collection of tools for assisting shuttle
pre-mission planning and to replace older tools written in PERL and shell
dialects. Python will also be installed in the new Mission Control Center
to perform auxiliary processing integrated with a user interface shell.
Ongoing developments include an automated grammar based system whereby C++
libraries may be interfaced directly to Python via compiler techniques.
This technology can be extended to other languages in the future.
IV Image Systems AB IV Image Systems
uses Python for many projects, including a satellite image production system
for the Swedish Meteorological and Hydrological Institute (SMHI) (see next
entry). This system receives raw data from several weather satellites, and
produces images for many purposes, including the satellite images used for
the presentation of the daily weather on Swedish TV 4. For more information,
contact Goran Bondeson.
Swedish Meteorological and Hydrological
Institute SMHI is the home of the Swedish civilian weather, hydrological
and oceanographic services. Its Python-based remote sensing software for
automatic product generation, using NOAA and Meteosat data, provides information
to bench forecasters, objective analysis schemes, and commercial interests
such as the media. At SMHI's Research & Development Unit, a Python-based
"Radar Analysis and Visualization Environment"
Written in Python. The last version is LFM 2.3 dated May 2011. Codebase is
below 10K lines.
Lfm is a curses-based file manager for the Unix console written in Python
21 May 2011
Python 2.5 or later is required now. PowerCLI was added, an advanced
command line interface with completion, persistent history, variable
substitution, and many other useful features.
Persistent history in all forms was added. Lots of improvements were
made and bugs were fixed
A couple of you make donations each month (out of about a thousand of you
reading the text each week). Tragedy of the commons and all that... but if some
more of you would donate a few bucks, that would be great support of the author.
In a community spirit (and with permission of my publisher), I am making
my book available to the Python community. Minor corrections can be made to
later printings, and at the least errata noted on this website. Email me at
<mertz@gnosis.cx> .
A few caveats:
(1) This stuff is copyrighted by AW (except the code samples which are released
to the public domain). Feel free to use this material personally; but no permission
is given for further distribution beyond your personal use.
(2) The book is provided in "smart ASCII" format. This is converted to print
(and maybe to fancier electronic formats) by automated scripts (txt->LaTeX->PDF
for the printed version).
As a highly sophisticated "digital rights management" system, those scripts
are not themselves made readily available. :-)
Developing applications for the Linux desktop typically requires some type
of graphical user interface (GUI) framework to build on. Options include GTK+
for the GNOME desktop and Qt for the K Desktop Environment (KDE). Both platforms
offer everything a developer needs to build a GUI application, including libraries
and layout tools to create the windows users see. This article shows you how
to build desktop productivity applications based on the screenlets widget toolkit
(see Resources for a link).
A number of existing applications would fit in the desktop productivity category,
including GNOME Do and Tomboy. These applications typically allow users to interact
with them directly from the desktop through either a special key combination
or by dragging and dropping from another application such as Mozilla Firefox.
Tomboy functions as a desktop note-taking tool that supports dropping text from
other windows.
You need to install a few things to get started developing screenlets. First,
install the screenlets package using either the Ubuntu Software Center or the
command line. In the Ubuntu Software Center, type screenlets in
the Search box. You should see two options for the main package
and a separate installation for the documentation.
Python and Ubuntu
You program screenlets using Python. The basic installation of Ubuntu 10.04
has Python version 2.6 installed, as many utilities depend on it. You may need
additional libraries depending on your application's requirements. For the purpose
of this article, I installed and tested everything on Ubuntu version 10.04.
Next, download the test screenlet's source from the screenlets.org site.
The test screenlet resides in the src/share/screenlets/Test folder and uses
Cairo and GTK, which you also need to install. The entire source code for the
test program is in the TestScreenlet.py file. Open this file in your favorite
editor to see the basic structure of a screenlet.
Python is highly object oriented and as such uses the class
keyword to define an object. In this example, the class is named TestScreenlet
and has a number of methods defined. In TestScreenlet.py, note the following
code at line 42:
def __init__(self, **keyword_args):
Python uses the leading and trailing double underscore (__) notation
to identify system functions with predefined behaviors. In this case, the
__init__ function is for all intents and purposes the constructor
for the class and contains any number of initialization steps to be executed
on the creation of a new instance of the object. By convention, the first argument
of every instance method is a reference to the current instance of the class and
is named self. This convention makes it easy to use self
to reference methods and properties of the instance it is in:
self.theme_name = "default"
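The role of `__init__` and `self` can be seen in isolation with a short sketch that does not need the screenlets package. The `Widget` class and its `describe` method are hypothetical stand-ins, not part of the screenlets API.

```python
class Widget(object):
    """Hypothetical stand-in for a screenlet; not part of the screenlets API."""

    def __init__(self, **keyword_args):
        # Runs automatically when Widget(...) is called.
        self.theme_name = keyword_args.get("theme_name", "default")
        self.width = keyword_args.get("width", 100)

    def describe(self):
        # self refers to the instance the method was called on.
        return "%s theme, %d px wide" % (self.theme_name, self.width)


w = Widget(width=180)
print(w.describe())
```

Calling `w.describe()` passes `w` as `self` implicitly, which is why the method can read the attributes that `__init__` stored on the instance.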
The screenlets framework defines several naming conventions and standards, as
outlined on screenlets.org's developer's page (see Resources
for a link). There's a link to the source code for the screenlets package along
with the application programming interface (API) documentation. Looking at the
code also gives you insight into what each function does with the calling arguments
and what it returns.
The basic components of a screenlet include an icon file, the source code
file, and a themes folder. The themes folder contains additional folders for
different themes. You'll find a sample template at screenlets.org with the required
files and folders to help you get started.
For this first example, use the template provided to create a basic "Hello
World" application. The code for this basic application is shown in Listing
1.
#!/usr/bin/env python
import screenlets

class HelloWorldScreenlet(screenlets.Screenlet):
    __name__ = 'HelloWorld'
    __version__ = '0.1'
    __author__ = 'John Doe'
    __desc__ = 'Simple Hello World Screenlet'

    def __init__(self, **kwargs):
        # Customize the width and height.
        screenlets.Screenlet.__init__(self, width=180, height=50, **kwargs)

    def on_draw(self, ctx):
        # Change the color to white and fill the screenlet.
        ctx.set_source_rgb(255, 255, 255)
        self.draw_rectangle(ctx, 0, 0, self.width, self.height)
        # Change the color to black and write the message.
        ctx.set_source_rgb(0, 0, 0)
        text = 'Hello World!'
        self.draw_text(ctx, text, 10, 10, "Sans 9" , 20, self.width)

if __name__ == "__main__":
    import screenlets.session
    screenlets.session.create_session(HelloWorldScreenlet)
Each application must import the screenlets framework and create a new session.
There are a few other minimal requirements, including any initialization steps
along with a basic draw function to present the widget on screen. The TestScreenlet.py
example has an __init__ method that initializes the object. In
this case, you see a single line with a call to the screenlet's __init__
method, which sets the initial width and height of the window to be created
for this application.
The only other function you need for this application is the on_draw
method. This routine sets the background color of the box to white and draws
a rectangle with the dimensions defined earlier. It sets the text color to black
and the source text to "Hello World!" and then draws the text. ...
One nice thing about writing screenlets is the ability to reuse code from
other applications. Code reuse opens a world of possibilities with the wide
range of open source projects based on the Python language. Every screenlet
has the same basic structure but with more methods defined to handle different
behaviors. Listing 2 shows a sample application named TimeTrackerScreenlet.
This example introduces a few more concepts that you need to understand before
you start building anything useful. All screenlet applications have the ability
to respond to specific user actions or events such as mouse clicks or drag-and-drop
operations. In this example, the mouse down event is used as a trigger to change
the state of your icon. When the screenlet runs, the start.png image is displayed.
Clicking the image changes it to stop.png and records the time started in
self.started. Clicking the stop image changes the image back to
start.png and displays the amount of time elapsed since the first start image
was clicked.
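The start/stop behavior described above can be sketched without the screenlets framework. This is not the article's Listing 2; the `TimeTracker` class below is a hypothetical stand-in that only mirrors the described toggle between start.png and stop.png, and it is written so that the clock value can be injected.

```python
import time


class TimeTracker(object):
    """Hypothetical start/stop toggle; it only mirrors the behavior described above."""

    def __init__(self):
        self.image = "start.png"
        self.started = None
        self.elapsed = None

    def on_mouse_down(self, now=None):
        # 'now' can be passed in explicitly, which makes the logic easy to test.
        now = time.time() if now is None else now
        if self.started is None:
            # First click: remember the start time and show the stop image.
            self.started = now
            self.image = "stop.png"
        else:
            # Second click: compute the elapsed time and show the start image again.
            self.elapsed = now - self.started
            self.started = None
            self.image = "start.png"
        return self.image


tracker = TimeTracker()
tracker.on_mouse_down(now=100.0)
tracker.on_mouse_down(now=160.5)
print(tracker.image, tracker.elapsed)
```

Because the object lives as long as the widget does, `self.started` is all the state the toggle needs between clicks.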
Responding to events is another key capability that makes it possible to
build any number of different applications. Although this example only uses
the mouse_down event, you can use the same approach for other events
generated either by the screenlets framework or by a system event such as a
timer. The second concept introduced here is persistent state. Because your
application is running continuously, waiting for an event to trigger some action,
it is able to keep track of items in memory, such as the time the start image
was clicked. You could also save information to disk for later retrieval, if
necessary.
Now that you have the general idea behind developing screenlets, let's put
it all together. Most users these days use a Really Simple Syndication (RSS) reader
to read blogs and news feeds. For this last example, you're going to build a
configurable screenlet that monitors specific feeds for keywords and displays
any hits in a text box. The results will be clickable links to open the post
in your default Web browser. Listing 3 shows the source code for the RSS Search
screenlet.
#!/usr/bin/env python
from screenlets.options import StringOption, IntOption, ListOption
import xml.dom.minidom
import webbrowser
import screenlets
import urllib2
import gobject
import pango
import cairo

class RSSSearchScreenlet(screenlets.Screenlet):
    __name__ = 'RSSSearch'
    __version__ = '0.1'
    __author__ = 'John Doe'
    __desc__ = 'An RSS search screenlet.'

    topic = 'Windows Phone 7'
    feeds = ['http://www.engadget.com/rss.xml',
             'http://feeds.gawker.com/gizmodo/full']
    interval = 10

    __items = []
    __mousesel = 0
    __selected = None

    def __init__(self, **kwargs):
        # Customize the width and height.
        screenlets.Screenlet.__init__(self, width=250, height=300, **kwargs)
        self.y = 25

    def on_init(self):
        # Add options.
        self.add_options_group('Search Options',
                               'RSS feeds to search and topic to search for.')
        self.add_option(StringOption('Search Options',
                                     'topic',
                                     self.topic,
                                     'Topic',
                                     'Topic to search feeds for.'))
        self.add_option(ListOption('Search Options',
                                   'feeds',
                                   self.feeds,
                                   'RSS Feeds',
                                   'A list of feeds to search for a topic.'))
        self.add_option(IntOption('Search Options',
                                  'interval',
                                  self.interval,
                                  'Update Interval',
                                  'How frequently to update (in seconds)'))
        self.update()

    def update(self):
        """Search selected feeds and update results."""
        self.__items = []
        # Go through each feed.
        for feed_url in self.feeds:
            # Load the raw feed and find all item elements.
            raw = urllib2.urlopen(feed_url).read()
            dom = xml.dom.minidom.parseString(raw)
            items = dom.getElementsByTagName('item')
            for item in items:
                # Find the title and make sure it matches the topic.
                title = item.getElementsByTagName('title')[0].firstChild.data
                if self.topic.lower() not in title.lower():
                    continue
                # Shorten the title to 30 characters.
                if len(title) > 30:
                    title = title[:27] + '...'
                # Find the link and save the item.
                link = item.getElementsByTagName('link')[0].firstChild.data
                self.__items.append((title, link))
        self.redraw_canvas()
        # Set to update again after self.interval.
        self.__timeout = gobject.timeout_add(self.interval * 1000, self.update)

    def on_draw(self, ctx):
        """Called every time the screenlet is drawn to the screen."""
        # Draw the background (a gradient).
        gradient = cairo.LinearGradient(0, self.height * 2, 0, 0)
        gradient.add_color_stop_rgba(1, 1, 1, 1, 1)
        gradient.add_color_stop_rgba(0.7, 1, 1, 1, 0.75)
        ctx.set_source(gradient)
        self.draw_rectangle_advanced(ctx, 0, 0, self.width - 20,
                                     self.height - 20,
                                     rounded_angles=(5, 5, 5, 5),
                                     fill=True, border_size=1,
                                     border_color=(0, 0, 0, 0.25),
                                     shadow_size=10,
                                     shadow_color=(0, 0, 0, 0.25))
        # Make sure we have a pango layout initialized and updated.
        if self.p_layout is None:
            self.p_layout = ctx.create_layout()
        else:
            ctx.update_layout(self.p_layout)
        # Configure fonts.
        p_fdesc = pango.FontDescription()
        p_fdesc.set_family("Free Sans")
        p_fdesc.set_size(10 * pango.SCALE)
        self.p_layout.set_font_description(p_fdesc)
        # Display our text.
        pos = [20, 20]
        ctx.set_source_rgb(0, 0, 0)
        x = 0
        self.__selected = None
        for item in self.__items:
            ctx.save()
            ctx.translate(*pos)
            # Find if the current item is under the mouse.
            if self.__mousesel == x and self.mouse_is_over:
                ctx.set_source_rgb(0, 0, 0.5)
                self.__selected = item[1]
            else:
                ctx.set_source_rgb(0, 0, 0)
            # Bold markup, as the accompanying text describes.
            self.p_layout.set_markup('<b>%s</b>' % item[0])
            ctx.show_layout(self.p_layout)
            pos[1] += 20
            ctx.restore()
            x += 1

    def on_draw_shape(self, ctx):
        ctx.rectangle(0, 0, self.width, self.height)
        ctx.fill()

    def on_mouse_move(self, event):
        """Called whenever the mouse moves over the screenlet."""
        x = event.x / self.scale
        y = event.y / self.scale
        self.__mousesel = int((y - 10) / 20) - 1
        self.redraw_canvas()

    def on_mouse_down(self, event):
        """Called when the mouse is clicked."""
        if self.__selected and self.mouse_is_over:
            webbrowser.open_new(self.__selected)

if __name__ == "__main__":
    import screenlets.session
    screenlets.session.create_session(RSSSearchScreenlet)
Building on the concepts of the first two examples, this screenlet uses a
number of new concepts, including the config page. In the on_init
routine, three options are added for the user to specify: a list of RSS feeds
to track, a topic of interest to search for, and an update interval. The update
routine then uses all of these when it runs.
Python is a great language for this type of task. The standard library includes
everything you need to load the Extensible Markup Language (XML) from an RSS
feed into a searchable list. In Python, this takes just three lines of code:
raw = urllib2.urlopen(feed_url).read()
dom = xml.dom.minidom.parseString(raw)
items = dom.getElementsByTagName('item')
The libraries used in these three lines include urllib2 and
xml. In the first line, the entire contents found at the
feed_url address are read into the string raw. Next, because you know
that this string contains XML, you use the Python XML library dom.minidom.parseString
method to create a document object made up of node objects.
Finally, you create a list of element objects corresponding to the individual
XML elements named item. You can then iterate over this list to
search for your target topic. Python has a very elegant way of iterating over
a list of items using the for keyword, as in this code snippet:
for item in items:
    # Find the title and make sure it matches the topic.
    title = item.getElementsByTagName('title')[0].firstChild.data
    if self.topic.lower() not in title.lower(): continue
Each item matching your criteria is added to the currently displayed list,
which is associated with this instance of the screenlet. Using this approach
makes it possible to have multiple instances of the same screenlet running,
each configured to search for different topics. The final part of the update
function redraws the text with the updated list and fires off a new update timer
based on the interval on the config page. By default, the timer fires every
10 seconds, although you could change that to anything you want. The timer mechanism
comes from the gobject library, which is a part of the GTK framework.
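The parse-and-filter steps described above can be tried without the screenlets framework or a network connection. The sketch below is written for Python 3 and feeds `xml.dom.minidom` a small inline document instead of calling `urllib2` (which became `urllib.request` in Python 3). The `search_feed` function, the `SAMPLE_FEED` text and the example.com links are all made up for illustration.

```python
import xml.dom.minidom

# A tiny stand-in for the text a real feed URL would return.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<item><title>Windows Phone 7 review: a long look at the new platform</title>
<link>http://example.com/wp7</link></item>
<item><title>Something unrelated</title>
<link>http://example.com/other</link></item>
</channel></rss>"""


def search_feed(raw, topic, max_len=30):
    """Return (title, link) pairs for items whose title mentions topic."""
    matches = []
    dom = xml.dom.minidom.parseString(raw)
    for item in dom.getElementsByTagName('item'):
        title = item.getElementsByTagName('title')[0].firstChild.data
        # Case-insensitive substring match, as in the listing.
        if topic.lower() not in title.lower():
            continue
        # Shorten long titles so that they fit in max_len characters.
        if len(title) > max_len:
            title = title[:max_len - 3] + '...'
        link = item.getElementsByTagName('link')[0].firstChild.data
        matches.append((title, link))
    return matches


hits = search_feed(SAMPLE_FEED, 'windows phone 7')
print(hits)
```

Keeping the search in a plain function that takes the raw text makes it straightforward to exercise separately from the drawing and timer code.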
This application expands the on_draw method quite heavily to
accommodate your new functionality. Both the Cairo and Pango libraries make
it possible to create some of the effects used in the text window. Using a gradient
gives the background of the widget a nice look along with rounded angles and
semi-transparency. Using Pango for layout adds a number of functions for saving
and restoring the current context easily. It also provides a way to generate
scalable fonts based on the current size of the screenlet.
The trickiest part in the on_draw method is handling when a
user hovers over an item in the list. Using the for keyword,
you iterate over the items in the screenlet to see whether the user is hovering
over that particular item. If so, you set the selected property and change the
color to provide visual feedback. You also use a bit of markup to set the link
property to bold—probably not the most elegant or efficient way to deal with
the problem, but it works. When a user clicks one of the links in the box, a
Web browser is launched with the target URL. You can see this functionality
in the on_mouse_down function. Python and its libraries make it
possible to launch the default web browser to display the desired page with
a single line of code. Figure 2 shows an example of this screenlet.
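The hover handling boils down to converting a mouse position into a row index and then looking up that row's link. A minimal sketch of that arithmetic follows, using the same constants as the listing (rows 20 pixels apart, a 10 pixel offset); the function names `row_under_mouse` and `selected_link` are hypothetical and the example.com links are placeholders.

```python
def row_under_mouse(y, scale=1.0, row_height=20, offset=10):
    """Map a mouse y coordinate to a row index, mirroring the arithmetic in on_mouse_move."""
    return int((y / scale - offset) / row_height) - 1


def selected_link(items, y, scale=1.0):
    """Return the link of the (title, link) item under the mouse, or None."""
    index = row_under_mouse(y, scale)
    if 0 <= index < len(items):
        return items[index][1]
    return None


items = [("First post", "http://example.com/1"),
         ("Second post", "http://example.com/2")]
print(selected_link(items, 35), selected_link(items, 55), selected_link(items, 200))
```

Dividing by the scale first means the same index is found however the user has resized the screenlet.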
Such polls mainly reflect what the industry is using, not so much the quality of
the language. In another poll the best Linux distribution is Ubuntu, which is
probably the most primitive among the major distributions available.
According to
Linux
Journal readers, Python is both the best programming language and
the best scripting language out there. This year, more than 12,000
developers weighed in on what tools are helping them work and play as part
of the
Linux Journal's 2010 Readers' Choice Award - and it came as no surprise
to those of us at ActiveState that Python came out on top as both the
Best Scripting Language (beating out PHP, bash, PERL and Ruby)
- and for the second straight year, Python also won as the Best Programming
Language, once again edging out C++, Java, C and Perl for the honors.
At ActiveState,
we continue to see a steady stream of
ActivePython Community Edition downloads, more enterprise deployments of
ActivePython Business
Edition, and a steady increase in the number of enterprise-ready Python
packages in our PyPM Index that
are being used by our customers over a wide range of verticals including high-tech,
financial services, healthcare, and aerospace companies. Python has matured
into an enterprise-class programming language that continues to nurture its
scripting world roots. We're happy to see Python get the recognition that
it so justly deserves!
"I'm not entirely sure why Python has never caught on with me as a language
to use on a regular basis. Certainly, one of the things that always bugs me is the
lack of good integrated documentation support like POD (although apparently reStructured
Text is slowly becoming that), but that's not the whole story. I suspect a lot is
just that I'm very familiar with Perl and with its standard and supporting library,
and it takes me longer to do anything in Python. But the language just feels slightly
more awkward, and I never have gotten comfortable with the way that it uses exceptions
for all error reporting. "
On the subject of C program indentation: In My Egotistical Opinion, most
people's C programs should be indented six feet downward and covered with
dirt.
— Blair P. Houghton
Introduction
Around the beginning of April, 2001, I finally decided to do something about
the feeling I'd had for some time that I'd like to learn a few new programming
languages. I started by looking at Python. These are my notes on the process.
Non-religious comments are welcome. Please don't send me advocacy.
I chose Python as a language to try (over a few other choices like Objective
Caml or Common Lisp) mostly because it's less of a departure from the languages
that I'm already comfortable with. In particular, it's really quite a bit like
Perl. I picked this time to start since I had an idea for an initial program
to try writing in Python, a program that I probably would normally write in
Perl. I needed a program to help me manage releases of the various software
packages that I maintain, something to put a new version on an ftp site, update
a series of web pages, generate a change log in a nice form for the web, and
a few other similar things.
I started by reading the Python
tutorial off www.python.org, pretty
much straight through. I did keep an interactive Python process running while
I did, but I didn't type in many of the examples; the results were explained
quite well in the tutorial, and I generally don't need to do things myself to
understand them. The tutorial is exceptionally well-written; after finishing
reading it straight through (which took me an evening) and skimming the library
reference, I felt I had a pretty good grasp on the language.
Things that immediately jumped out at me that I liked a lot:
The syntax. It really is very clean and uncluttered, and that makes
Python code easy to read. It also appealed to my sense of aesthetics. I've
gotten pretty tired of braces, which I find visually ugly, and the use of
indentation and a simple colon at the end of the introductory syntactic
element was very refreshing and nice.
The coherence of the language. It feels very designed, by someone who
has a good attention to detail and who appreciates making like things look
similar. I found that I could predict fairly well how new things would behave
based on both what I knew about old things and just on general intuition.
The breadth of the standard libraries. I was quite pleased by this.
As a system administrator, most of the code that I write is glue code that
requires interacting with a bunch of different network protocols, external
programs, and various system services, and one of the problems I often have
with new languages is their inability to play well with the rest of Unix
or happily dive into network programming when needed. Python doesn't seem
to have this problem.
There were a few things that I immediately didn't like, after having just
read the tutorial:
Triple-quoted strings. That syntax is just ugly. It's particularly ugly
when used for function doc strings. I think I'd even prefer here-docs, which
have horrible syntax problems of their own.
String quoting. I found it rather confusing, particularly in combination
with the r'' quoting (mostly for regular expressions). I'm
used to shell quoting, where one set of quotes interpolates and the other
makes everything literal except for backslashes and the quote character.
I can deal with the lack of interpolation (it does simplify things), but
r'' quoting is almost like single quoting in the shell except
it isn't and having "" and '' be interchangeable
was very odd.
Exceptions. I have a love/hate relationship with exceptions. On one
hand, I really do enjoy, when writing code just for myself, not having to
bother checking all the system calls and still getting a reasonable error
message when things fail. On the other hand, throwing exceptions at users
really bugs me. Java does that, and I've seen it confuse the heck out of
the poor user who has utterly no idea what the backtrace means or what to
do with it. The nice thing about checking oneself is that one has complete
control over the appearance, and it's hard to do that with an exception
system like Python's without catching every exception and then trying to
produce some suitable generic error message. I think this is one of these
things that I'll just get used to.
There were also a couple of things that I immediately missed from other languages:
POD. I've gotten really used to routinely documenting all of my Perl
scripts with a POD manual at the end of the script, which I can then easily
convert into plain text for --help, into a manual page, or into HTML for
the web. I'm going to really miss this in Python. I may just keep using
POD anyway; it has so many advantages over most other mechanisms for documenting
simple programs that I've seen.
Safe program execution. It looks like the standard ways of spawning
a program from inside Python are os.system() and os.popen() and that both
of those invoke a shell. Perl lets you force the interpreter to never go
through a shell, and I much prefer that for safety reasons. I have this
nagging feeling that I'm going to have to fake my own versions of those
standard routines, which I hate doing. Ick.
2001-04-08
Over the next few days, I started reading the language manual straight through,
as well as poking around more parts of the language reference and writing some
code. I started with a function to find the RCS keywords and version string
in a file and from that extract the version and the last modified date (things
that would need to be modified on the web page for that program). I really had
a lot of fun with this.
The Python standard documentation is excellent. I mean truly superb.
I can't really compare it to Perl (the other language that has truly excellent
standard documentation), since I know Perl so well that I can't evaluate its
documentation from the perspective of the beginner, but Python's tutorial eased
me into the language beautifully and the language manual is well-written, understandable,
and enjoyable to read. The library reference is well-organized and internally
consistent, and I never had much trouble finding things. And they're available
in info format as well as web pages, which is a major advantage for me; info
is easier for me to read straight through, and web pages are easier for me to
browse.
The language proved rather fun to write. Regex handling is a bit clunky since
it's not a language built-in, but I was expecting that and I don't really mind
it. The syntax is fun, and XEmacs python-mode does an excellent job handling
highlighting and indentation. I was able to put together that little function
and wrap a test around it fairly quickly (in a couple of hours while on the
train, taking a lot of breaks to absorb the language reference manual or poke
around in the library reference for the best way of doing something).
That's where I am at the moment. More as I find time to do more....
2001-05-04
I've finished my first Python program, after having gotten distracted by
a variety of other things. It wasn't the program I originally started writing,
since the problem of releasing a new version of a software package ended up
being more complicated than I expected. (In particular, generating the documentation
looks like it's going to be tricky.) I did get the code to extract version numbers
and dates written, though, and then for another project (automatically generating
man pages from scripts with embedded POD when installing them into our site-wide
software installation) I needed that same code. So I wrote that program in Python
and tested it and it works fine.
The lack of a way to safely execute a program without going through the shell
is really bothering me. It was also the source of one of the three bugs in the
first pass at my first program; I passed a multiword string to pod2man and forgot
to protect it from the shell. What I'm currently doing is still fragile in the
presence of single quotes in the string, which is another reason why I much
prefer Perl's safe system() function. I feel like I must be missing something;
something that fundamental couldn't possibly fail to be present in a scripting
language.
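For readers hitting the same problem today: Python 2.4 and later ship the subprocess module, which takes the command as a list of arguments and does not involve a shell unless asked to. A minimal sketch, written for Python 3.7 or later; the helper name and the title string are made up for illustration, and the Python interpreter stands in for a real tool such as pod2man so that the example runs anywhere:

```python
import subprocess
import sys


def run_without_shell(argv):
    """Run argv directly (no shell involved) and return its stdout as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# A multiword string containing a single quote: the kind of argument that
# breaks naive shell quoting. It is passed as one list element, untouched.
title = "User's Guide to Local Tools"

# sys.executable stands in for an external program; the child simply
# prints back the first argument it received.
echo_script = "import sys; print(sys.argv[1])"
output = run_without_shell([sys.executable, "-c", echo_script, title])
```

Because each list element reaches the child process as exactly one argument, no quoting or escaping is needed.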
A second bug in that program highlights another significant difference from
Perl that I'm finding a little strange to deal with, namely the lack of equivalence
between numbers and strings. My program had a dictionary of section titles,
keyed by the section numbers, and I was using the plain number as the dictionary
key. When I tried to look up a title in the dictionary, however, I used as the
key a string taken from the end of the output filename, and 1 didn't match "1".
It took me a while to track that down. (Admittedly, the problem was really laziness
on my part; given the existence of such section numbers as "1m" and "3f", I
should have used strings as the dictionary keys in the first place.)
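The mismatch is easy to reproduce. In this small sketch the section titles and the file name are invented for illustration:

```python
# Hypothetical man page section titles, keyed by integer section number.
titles_by_number = {1: "User Commands", 3: "Library Functions"}

# The section taken from the end of a file name is a string, not an integer.
suffix = "example.1".rsplit(".", 1)[1]

found_by_string = suffix in titles_by_number      # "1" never equals 1
found_by_int = int(suffix) in titles_by_number    # explicit conversion works

# String keys avoid the conversion and also cope with "1m" or "3f".
titles_by_name = {"1": "User Commands", "1m": "Maintenance Commands"}
title = titles_by_name[suffix]
```

Unlike Perl, Python never converts between "1" and 1 implicitly, so the first lookup quietly fails instead of finding the entry.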
The third bug, for the record, was attempting to use a Perl-like construct
to read a file (while line = file.readline():). I see that Python
2.1 has the solution I really want in the form of xreadlines, but
in the meantime that was easy enough to recode into a test and a break
in the middle of the loop.
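Both forms of the loop can be sketched as follows. The function names are made up, and io.StringIO stands in for a real file; iterating directly over a file object has been available since Python 2.2:

```python
import io


def read_with_break(handle):
    """Read line by line using a test and a break in the middle of the loop."""
    lines = []
    while True:
        line = handle.readline()
        if not line:  # readline() returns an empty string at end of file
            break
        lines.append(line.rstrip("\n"))
    return lines


def read_by_iteration(handle):
    """Same result, using the file object itself as an iterator."""
    return [line.rstrip("\n") for line in handle]


sample = "first\nsecond\nthird\n"
via_break = read_with_break(io.StringIO(sample))
via_iteration = read_by_iteration(io.StringIO(sample))
```

The Perl-style `while line = file.readline():` fails because assignment is a statement in Python, not an expression.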
The lack of a standard documentation format like Perl's POD is bothering
me and I'm not sure what to do about it. I want to put the documentation (preferably
in POD, but I'm willing to learn something else that's reasonably simple) into
the same file as the script so that it gets updated when the script does and
doesn't get lost in the directory. This apparently is just an unsolved problem,
unless I'm missing some great link to an embedded documentation technique (and
I quite possibly am). Current best idea is to put a long triple-quoted string
at the end of my script containing POD. Ugh.
I took a brief look at the standard getopt library (although I didn't end
up using it), and was a little disappointed; one of the features that I really
liked about Perl's Getopt::Long was its ability to just stuff either the arguments
to options or boolean values into variables directly, without needing something
like the long case statement that's a standard feature of main() in many C programs.
Looks like Python's getopt is much closer to C's, and requires something quite
a bit like that case statement.
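The case-statement style that getopt calls for looks roughly like this. The option names (-v/--verbose and -o/--output) and the function name are invented for this sketch:

```python
import getopt


def parse_args(argv):
    """Parse hypothetical options -v/--verbose and -o/--output FILE."""
    verbose = False
    output = None
    # "vo:" means -v takes no argument and -o takes one; likewise "output=".
    opts, remaining = getopt.getopt(argv, "vo:", ["verbose", "output="])
    for name, value in opts:
        # This if/elif chain plays the role of the C switch statement.
        if name in ("-v", "--verbose"):
            verbose = True
        elif name in ("-o", "--output"):
            output = value
    return verbose, output, remaining


parsed = parse_args(["-v", "--output", "out.txt", "input.pod"])
```

Each option has to be mapped to its variable by hand, which is the bookkeeping that Getopt::Long does automatically.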
Oh, and while the documentation is still excellent, I've started noticing
a gap in it when it comes to the core language (not the standard library; the
documentation there is great). The language reference manual is an excellent
reference manual, complete with clear syntax descriptions, but is a little much
if one just wants to figure out how to do something. I wasn't sure of the syntax
of the while statement, and the language reference was a little heavier than
was helpful. I find myself returning to the tutorial to find things like this,
and it has about the right level of explanation, but the problem with that is
that the tutorial is laid out as a tutorial and isn't as easy to use as a reference.
(For example, the while statement isn't listed in the table of contents, because
it was introduced in an earlier section with a more general title.)
I need to get the info pages installed on my desktop machine so that I can
look things up in the index easily; right now, I'm still using the documentation
on the web.
2001-11-13
I've unfortunately not had very much time to work on this, as one can tell
from the date.
Aahz pointed out a way to execute a program without going through the shell,
namely os.spawnv(). That works, although the documentation is extremely poor.
(Even in Python 2.1, it refers me to the Visual C++ Runtime Library documentation
for information on what spawnv does, which is of course absurd.) At least the
magic constants that it needs are relatively intuitive. Unfortunately, spawnv
doesn't search the user's PATH for a command, and there's nothing like spawnvp.
Sigh.
There's really no excuse for this being quite this hard. Executing a command
without going through the shell is an extremely basic function that should be
easily available in any scripting language without jumping through these sorts
of hoops.
But this at least gave me a bit of experience in writing some more Python
(a function to search the PATH to find a command), and the syntax is still very
nice and convenient. I'm bouncing all over the tutorial and library reference
to remember how to do things, but usually my first guesses are right.
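A PATH search of this kind might look like the following sketch. It is not the author's actual function; the name find_in_path and its optional path parameter are assumptions, and the code is written for Python 3:

```python
import os
import sys


def find_in_path(command, path=None):
    """Return the full path of the first executable named command, or None."""
    if path is None:
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or os.curdir, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# Usage: look for the running interpreter in its own directory.
python_dir, python_name = os.path.split(sys.executable)
located = find_in_path(python_name, python_dir)
```

Current Python versions make the helper unnecessary: shutil.which() performs this search, and os.spawnvp() is available on Unix.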
I see that Debian doesn't have the info pages, only the HTML documentation.
That's rather annoying, but workable. I now have the HTML documentation for
Python 2.1 on local disk on my laptop.
2002-07-20
I've now written a couple of real Python programs (in addition to the simple
little thing to generate man pages by running pod2man). You can find them (cvs2xhtml
and cl2xhtml) with my web
tools. They're not particularly pretty, but they work, and I now have some
more experience writing simple procedural Python code. I still haven't done
anything interesting with objects. Comments on the code are welcome. Don't expect
too much.
There are a few other documentation methods for Python, but they seem primarily
aimed at documenting modules and objects rather than documenting scripts. Pydoc
in particular looks like it would be nice for API documentation but doesn't
really do anything for end-user program documentation. Accordingly, I've given
up for the time being on finding a more "native" approach and am just documenting
my Python programs the way that I document most things, by writing embedded
POD. I've yet to find a better documentation method; everything else seems to
either be far too complicated and author-unfriendly to really write directly
in (like DocBook) or can't generate Unix man pages, which I consider to be a
requirement.
The Python documentation remains excellent, if scattered. I've sometimes
spent a lot of time searching through the documentation to find the right module
to do something, and questions of basic syntax are fairly hard to resolve (the
tutorial is readable but not organized as a reference, and the language reference
is too dense to provide a quick answer).
2004-03-03
My first major Python application is complete and working (although I'm not
yet using it as much as I want to be using it). That's
Tasker, a web-based
to-do list manager written as a Python CGI script that calls a Python module.
I've now dealt with the Python module building tools, which are quite nice
(nicer in some ways than Perl's Makefile.PL system with some more built-in functionality,
although less mature in a few ways). Python's handling of the local module library
is clearly less mature than Perl's, and Debian's Python packages don't handle
locally installed modules nearly as well as they should, but overall it was
a rather positive experience. Built-in support for generating RPMs is very interesting,
since eventually I'd like to provide .deb and RPM packages for all my software.
I played with some OO design for this application and ended up being fairly
happy with how Python handled things. I'm not very happy with my object layout,
but that's my problem, not Python's. The object system definitely feels far
smoother and more comfortable to me than Perl's, although I can still write
OO code faster in Perl because I'm more familiar with it. There's none of the
$self hash nonsense for instance variables, though, which is quite nice.
The CGI modules for Python, and in particular the cgitb module for displaying
exceptions nicely in the browser while debugging CGI applications, are absolutely
excellent. I was highly impressed, and other than some confusion about the best
way to retrieve POST data that was resolved after reading the documentation
more closely, I found those modules very easy to work with. The cgitb module is a
beautiful, beautiful thing and by itself makes me want to use Python for all
future CGI programming.
I still get caught all the time by the lack of interchangeability of strings
and numbers and I feel like I'm casting things all the time. I appreciate some
of the benefits of stronger typing, but this one seems to get in my way more
often than it helps.
I'm also still really annoyed at the lack of good documentation for the parts
of the language that aren't considered part of the library. If I want documentation
on how print works, I have only the tutorial and the detailed language standard,
the former of which is not organized for reference and the latter of which is
far too hard to understand. This is a gaping hole in the documentation that
I really wish someone would fix. Thankfully, it only affects a small handful
of things, like control flow constructs and the print statement, so I don't
hit this very often, but whenever I do it's extremely frustrating.
I've given up on documentation for scripts and am just including a large
POD section at the end of the script, since this seems to be the only option
that will generate good man pages and good web pages. I'm not sure what to do
about documentation for the module; there seem to be a variety of different
proposals but nothing that I can really just use.
Oh, and one last point on documentation: the distutils documentation needs
some work. Thankfully I found some really good additional documentation on the
PyPI web site that explained a lot more about how to write a setup.py script.
2010-03-25
Six years later, I still find Python an interesting language, but I never
got sufficiently absorbed by it for it to be part of my standard toolkit.
I've subsequently gotten some additional experience with extending Python
through incorporating an extension written by Thomas Kula into the
remctl distribution.
The C interface is relatively nice and more comfortable than Perl, particularly
since it doesn't involve a pseudo-C that is run through a preprocessor. It's
a bit more comfortable to read and write.
Python's installation facilities, on the other hand, are poor. The distutils
equivalent of Perl's ExtUtils::MakeMaker is considerably worse, despite ExtUtils::MakeMaker
being old and crufty and strange. (I haven't compared it with Module::Build.)
The interface is vaguely similar, but I had to apply all sorts of hacks to get
the Python extension to build properly inside a Debian packaging framework,
and integrating it with a larger package requires doing Autoconf substitution
on a ton of different files. It was somewhat easier to avoid embedding RPATH
into the module, but I'd still much rather work with Perl's facilities.
Similarly, while the test suite code has some interesting features (I'm using
the core unittest framework), it's clearly inferior to Perl's Test::More support
library and TAP protocol. I'm, of course, a known fan of Perl's TAP testing
protocol (I even wrote
my own implementation
in C), but that's because it's well-designed, full-featured, and very useful.
The Python unittest framework, by comparison, is awkward to use, has significantly
inferior reporting capabilities, makes it harder to understand what test failed
and isolate the failure, and requires a lot of digging around to understand
how it works. I do like the use of decorators to handle skipping tests, and
there are some interesting OO ideas around test setup and teardown, but the
whole thing is more awkward than it should be.
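For readers who have not seen the unittest features mentioned here, a minimal sketch of setup, teardown, and a skip decorator follows. The test case and its fixture are hypothetical, and the code is written for Python 3:

```python
import io
import unittest


class ScratchListTest(unittest.TestCase):
    """Hypothetical test case; the fixture is just a small list."""

    def setUp(self):
        # Runs before every test method.
        self.items = [3, 1, 2]

    def tearDown(self):
        # Runs after every test method, even if the test failed.
        self.items = None

    def test_sorting(self):
        self.assertEqual(sorted(self.items), [1, 2, 3])

    @unittest.skip("feature not written yet")
    def test_future_feature(self):
        self.fail("never reached because the test is skipped")


def run_suite():
    """Run the test case quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ScratchListTest)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=2)
    return runner.run(suite)


result = run_suite()
```

The result object records which tests failed or were skipped, but turning that into a readable report is left to the runner, which is part of the reporting complaint above.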
I'm not entirely sure why Python has never caught on with me as a language
to use on a regular basis. Certainly, one of the things that always bugs me
is the lack of good integrated documentation support like POD (although apparently
reStructuredText is slowly becoming that), but that's not the whole story.
I suspect a lot is just that I'm very familiar with Perl and with its standard
and supporting library, and it takes me longer to do anything in Python. But
the language just feels slightly more awkward, and I never have gotten comfortable
with the way that it uses exceptions for all error reporting.
I may get lured back into it again at some point, though, since Python 3.0
seems to have some very interesting features and it remains popular with people
who know lots of programming languages. I want to give it another serious look
with a few more test projects at some point in the future.
What do Python 2.x programmers need to know about Python 3?
With the latest major Python release, creator Guido van Rossum saw the
opportunity to tidy up his famous scripting language. What is different about
Python 3.0? In this article, I offer some highlights for Python programmers
who are thinking about making the switch to 3.x.
There can be many reasons why you might need a client/server application.
For a simple example, purchasing for a small retail chain might need up to the
minute stock levels on a central server. The point-of-sale application in the
stores would then need to post inventory transactions to the central server
in real-time.
This application can easily be coded in Python with performance levels of
thousands of transactions per second on a desktop PC. Simple sample programs
for the server and client sides are listed below, with discussions following.
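The article's own listings are not reproduced on this page. As a rough stand-in, here is a minimal sketch of the idea using the standard socketserver module, written for Python 3. The line protocol (item name followed by a signed quantity), the in-memory stock table, and all names are assumptions made for this example, not the original article's code:

```python
import socket
import socketserver
import threading

# Hypothetical in-memory stock table; a real system would use a database.
STOCK = {"widget": 100}
STOCK_LOCK = threading.Lock()


class InventoryHandler(socketserver.StreamRequestHandler):
    """Handle lines of the form 'ITEM DELTA' and reply with the new level."""

    def handle(self):
        for raw in self.rfile:
            item, delta = raw.decode("ascii").split()
            with STOCK_LOCK:
                STOCK[item] = STOCK.get(item, 0) + int(delta)
                level = STOCK[item]
            self.wfile.write(f"{item} {level}\n".encode("ascii"))


def post_transaction(address, item, delta):
    """Client side: send one transaction, return the reported stock level."""
    with socket.create_connection(address) as conn:
        conn.sendall(f"{item} {delta}\n".encode("ascii"))
        with conn.makefile("r", encoding="ascii") as reader:
            reply = reader.readline()
    return int(reply.split()[1])


def demo():
    """Start a server on a free local port and post one sale of 3 widgets."""
    # Port 0 asks the operating system for any free port.
    with socketserver.ThreadingTCPServer(("127.0.0.1", 0), InventoryHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        level = post_transaction(server.server_address, "widget", -3)
        server.shutdown()
    return level
```

Calling demo() runs the server and one client transaction in the same process. A point-of-sale client would instead call post_transaction() against the central server's address. The sketch makes no attempt to measure or reproduce the throughput figure quoted above.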
About:
cfv is a utility to both test and create .sfv (Simple File Verify), .csv, .crc,
.md5(sfv style), md5sum, BSD md5, sha1sum, and .torrent checksum verification
files. It also includes test-only support for .par and .par2 files. These files
are commonly used to ensure the correct retrieval or storage of data.
Release focus: Major bugfixes
Changes:
Help output is printed to stdout under non-error conditions. A mmap file descriptor
leak in Python 2.4.2 was worked around. The different module layout of BitTorrent
5.x is supported. A "struct integer overflow masking is deprecated" warning
was fixed. The --private_torrent flag was added. A bug was worked around in
64-bit Python version 2.5 and later which causes checksums of files larger than
4GB to be incorrectly calculated when using mmap.
BitRock Web Stacks provide you with the easiest way to install and run the
LAMP platform in a variety of Linux distributions. BitRock Web Stacks are free
to download and use under the terms of the Apache License 2.0. To learn more
about our licensing policies, click
here.
You can find up-to-date WAMP,
LAMP and
MAMP stacks at the
BitNami open source website. In addition
to those, you will find freely available application stacks for popular open
source software such as Joomla!,
Drupal,
Mediawiki and
Roller. Just like BitRock Web
Stacks, they include everything you need to run the software and come packaged
in a fast, easy to use installer.
BitRock Web Stacks contain several open source tools and libraries. Please
be sure that you read and comply with all of the applicable
licenses.
If you are a MySQL Network subscriber (or would like to purchase a subscription)
and want to use a version of LAMPStack that contains the MySQL Certified binaries,
please send an email to sales@bitrock.com.
For further information, including supported platforms, component versions,
documentation, and support, please visit our
solutions
section.
This is a project to produce an efficient way of filling a large area of
screen space with terminals. This is done by splitting the window into a resizeable
grid of terminals. As such, you can produce very flexible arrangements of
terminals for different tasks.
Read me
Terminator 0.8.1
by Chris Jones <cmsj@tenshu.net>
This is a little python script to give me lots of terminals in a single window,
saving me valuable laptop screen space otherwise wasted on window decorations
and not quite being able to fill the screen with terminals.
Right now it will open a single window with one terminal and it will (to some
degree) mirror the settings of your default gnome-terminal profile in gconf.
Eventually this will be extended and improved to offer profile selection per-terminal,
configuration thereof and the ability to alter the number of terminals and save
meta-profiles.
You can create more terminals by right clicking on one and choosing to split
it vertically or horizontally. You can get rid of a terminal by right clicking
on it and choosing Close. ctrl-shift-o and ctrl-shift-e will also effect the
splitting.
ctrl-shift-n and ctrl-shift-p will shift focus to the next/previous terminal
respectively, ctrl-shift-w will close the current terminal, and ctrl-shift-q
will close the current window.
It's quite shamelessly based on code in the vte-demo.py from the vte widget
package, and on the gedit terminal plugin (which was fantastically useful).
vte-demo.py is not my code and is copyright its original author. While it does
not contain any specific licensing information in it, the VTE package appears
to be licenced under LGPL v2.
The gedit terminal plugin is part of the gedit-plugins package, which is licenced
under GPL v2 or later.
I am thus licensing Terminator as GPL v2 only.
Cristian Grada provided the icon under the same licence.
Python and the Programmer
A Conversation with Bruce Eckel, Part I
by Bill Venners
Jun 2, 2003
Summary
Bruce Eckel talks with Bill Venners about why he feels Python is "about
him," how minimizing clutter improves productivity, and the relationship
between backwards compatibility and programmer pain.
Bruce Eckel wrote the best-selling books Thinking in C++ and Thinking in Java, but for the past several years he's preferred to think
in Python. Two years ago, Eckel gave a keynote address at the 9th International
Python Conference entitled "Why I love Python." He presented ten reasons he
loves programming in Python in "top ten list" style, starting with ten and ending
with one.
In this interview, which is being published in weekly installments, I ask
Bruce Eckel about each of these ten points. In this installment, Bruce Eckel
explains why he feels Python is "about him," how minimizing clutter improves
productivity, and the relationship between backwards compatibility and programmer
pain.
Bill Venners: In the introduction to your "Why I Love Python" keynote,
you said what you love the most is "Python is about you." How is Python
about you?
Bruce Eckel: With every other language I've had to deal with, it's
always felt like the designers were saying, "Yes, we're trying to make your
life easier with this language, but these other things are more important."
With Python, it has always felt like the designers were saying, "We're trying
to make your life easier, and that's it. Making your life easier is the thing
that we're not compromising on."
For example, the designers of C++ certainly attempted to make the programmer's
life easier, but always made compromises for performance and backwards compatibility.
If you ever had a complaint about the way C++ worked, the answer was performance
and backwards compatibility.
Bill Venners: What compromises do you see in Java? James Gosling did
try to make programmers more productive by eliminating memory bugs.
Bruce Eckel: Sure. I also think that Java's consistency of error handling
helped programmer productivity. C++ introduced exception handling, but that
was just one of many ways to handle errors in C++. At one time, I thought that
Java's checked exceptions were helpful, but I've modified my view on that. (See
Resources.)
It seems the compromise in Java is marketing. They had to rush Java out to
market. If they had taken a little more time and implemented design by contract,
or even just assertions, or any number of other features, it would have been
better for the programmer. If they had done design and code reviews, they would
have found all sorts of silliness. And I suppose the way Java is marketed is
probably what rubs me the wrong way about it. We can say, "Oh, but we don't
like this feature," and the answer is, "Yes, but, marketing dictates that it
be this way."
Maybe the compromises in C++ were for marketing reasons too. Although choosing
to be efficient and backwards compatible with C was done to sell C++ to techies,
it was still to sell it to somebody.
I feel Python was designed for the person who is actually doing the programming,
to maximize their productivity. And that just makes me feel warm and fuzzy all
over. I feel nobody is going to be telling me, "Oh yeah, you have to jump through
all these hoops for one reason or another." When you have the experience of
really being able to be as productive as possible, then you start to get pissed
off at other languages. You think, "Gee, I've been wasting my time with these
other languages."
Number 10: Reduced Clutter
Bill Venners: In your keynote, you gave ten reasons you love Python.
Number ten was reduced clutter. What did you mean by reduced clutter?
Bruce Eckel: They say you can hold seven plus or minus two pieces
of information in your mind. I can't remember how to open files in Java. I've
written chapters on it. I've done it a bunch of times, but it's too many steps.
And when I actually analyze it, I realize these are just silly design decisions
that they made. Even if they insisted on using the Decorator pattern in
java.io, they should have had a convenience constructor for opening
files simply. Because we open files all the time, but nobody can remember how.
It is too much information to hold in your mind.
The other issue is the effect of an interruption. If you are really deep
into doing something and you have an interruption, it's quite a number of minutes
before you can get back into that deeply focused state. With programming, imagine
you're flowing along. You're thinking, "I know this, and I know this, and I
know this," and you are putting things together. And then all of a sudden you
run into something like, "I have to open a file and read in the lines." All
the clutter in the code you have to write to do that in Java can interrupt the
flow of your work.
Another number that used to be bandied about is that programmers can produce
an average of ten working lines of code per day. Say I open up a file and read
in all the lines. In Java, I've probably already used up my ten working lines
of code for that day. In Python, I can do it in one line. I can say, "for
line in file('filename').readlines():," and then I'm ready to process
the lines. And I can remember that one liner off the top of my head, so I can
just really flow with that.
Python's minimal clutter also helps when I'm reading somebody else's code.
I'm not tripping over verbose syntax and idioms. "Oh I see. Opening the file.
Reading the lines." I can grok it. It's very similar to the design patterns
in that you have a much denser form of communication. Also, because blocks are
denoted by indentation in Python, indentation is uniform in Python programs.
And indentation is meaningful to us as readers. So because we have consistent
code formatting, I can read somebody else's code and I'm not constantly tripping
over, "Oh, I see. They're putting their curly braces here or there." I don't
have to think about that.
Number 9: Not Backwards Compatible in Exchange for Pain
Bill Venners: In your keynote, your ninth reason for loving Python
was, "Not backwards compatible in exchange for pain." Could you speak a bit
about that?
Bruce Eckel: That's primarily directed at C++. To some degree you
could say it refers to Java because Java was derived primarily from C++. But
C++ in particular was backwards compatible with C, and that justified lots of
language issues. On one hand, that backwards compatibility was a great benefit,
because C programmers could easily migrate to C++. It was a comfortable place
for C programmers to go. But on the other hand, all the features that were compromised
for backwards compatibility were the great drawback of C++.
Python isn't backwards compatible with anything, except itself. But even
so, the Python designers have actually modified some fundamental things in order
to fix the language in places they decided were broken. I've always heard from
Sun that backwards compatibility is job one. And so even though stuff is broken
in Java, they're not going to fix it, because they don't want to risk breaking
code. Not breaking code always sounds good, but it also means we're going to
be in pain as programmers.
One fundamental change they made in Python, for example, was "type class
unification." In earlier versions, some of Python's primitive types were not
first class objects with first class characteristics. Numbers, for example,
were special cases like they are in Java. But that's been modified so now I
can inherit from integer if I want to. Or I can inherit from the modified dictionary
class. That couldn't be done before. After a while it began to be clear that
it was a mistake, so they fixed it.
Now in C++ or Java, they'd say, "Oh well, too bad." But in Python, they looked
at two issues. One, they were not breaking anybody's existing world, because
anyone could simply choose to not upgrade. I think that could be an attitude
taken by Java as well. And two, it seemed relatively easy to fix the broken
code, and the improvement seemed worth the code-fixing work. I find that attitude
so refreshing, compared to the languages I'd used before where they said, "Oh,
it's broken. We made a mistake, but you'll have to live with it. You'll have
to live with our mistakes."
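The type/class unification Eckel describes, which arrived in Python 2.2, is easy to see in a few lines. The class names below are invented for illustration, and the sketch is written for Python 3:

```python
class Celsius(int):
    """An integer subclass with one extra method."""

    def to_fahrenheit(self):
        return self * 9 // 5 + 32


class DefaultingDict(dict):
    """A dictionary subclass that supplies a fallback for missing keys."""

    def __missing__(self, key):
        # Called by dict's item lookup when the key is absent.
        return "unknown"


boiling = Celsius(100)
fahrenheit = boiling.to_fahrenheit()

colors = DefaultingDict(apple="red")
known = colors["apple"]
fallback = colors["kiwi"]
```

Instances of both classes still behave as ordinary integers and dictionaries wherever those are expected.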
Next Week
Come back Monday, June 9 for Part I of a conversation with Java's creator
James Gosling. I am now staggering the publication of several interviews at
once, to give the reader variety. The next installment of this interview with
Bruce Eckel will appear on Monday, June 23. If you'd like to receive a brief
weekly email announcing new articles at Artima.com, please subscribe to the
Artima Newsletter.
Over several versions, Python has hugely enhanced its "laziness." For several
versions, we have had generators defined with the yield statement
in a function body. But along the way we also got the itertools
module to combine and create various types of iterators. We have the
iter() built-in function to turn many sequence-like objects into iterators.
With Python 2.4, we got generator expressions, and with 2.5 we will get
enhanced generators that make writing coroutines easier. Moreover, more and
more Python objects have become iterators or iterator-like; for example, what
used to require the .xreadlines() method or before that the
xreadlines module, is now simply the default behavior of
open() to read files.
Similarly, looping through a dict lazily used to require the
.iterkeys() method; now it is just the default behavior of for key in
dct. Functions like xrange() are a bit "special"
in being generator-like, but neither quite a real iterator (no
.next() method), nor a realized list like range() returns.
However, enumerate() returns a true iterator, and usually does
what you had earlier wanted xrange() for. And itertools.count()
is another lazy call that does almost the same thing as xrange(),
but as a full-fledged iterator.
Python is strongly moving towards lazily constructing sequence-like objects;
and overall this is an excellent direction. Lazy pseudo-sequences both save
memory space and speed up operations (especially when dealing with very large
sequence-like "things").
The problem is that Python still has a schizoaffective condition when it
comes to deciding what the differences and similarities between "hard" sequences
and iterators are. The troublesome part of this is that it really violates Python's
idea of "duck typing": the ability to use a given object for a purpose just
as long as it has the right behaviors, but not necessarily any inheritance or
type restriction. The various things that are iterators or iterator-like sometimes
act sequence-like, but other times do not; conversely, sequences often act iterator-like,
but not always. Outside of those steeped in Python arcana, what does what is
not obvious.
The main point of similarity is that everything that is sequence- or iterator-like
lets you loop over it, whether using a for loop, a list comprehension,
or a generator expression. Past that, divergences occur. The most important
of these differences is that sequences can be indexed, and directly sliced,
while iterators cannot. In fact, indexing into a sequence is probably the most
common thing you ever do with a sequence -- why on earth does it fall down so
badly on iterators? For example:
>>> import itertools
>>> r = range(10)
>>> i = iter(r)
>>> x = xrange(10)
>>> g = itertools.takewhile(lambda n: n<10, itertools.count())
#...etc...
For all of these, you can use for n in thing. In fact, if you
"concretize" any of them with list(thing), you wind up with exactly
the same result. But if you wish to obtain a specific item -- or a slice of
a few items -- you need to start caring about the exact type of thing.
For example:
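The difference can be sketched as follows. This is my own illustration, not the article's listing, and it is written for Python 3 (an assumption; there range() is already lazy and xrange() is gone, so a plain list stands in for the "hard" sequence).

```python
import itertools

r = list(range(10))      # a concrete ("hard") sequence
i = iter(r)              # a plain iterator over the same items
g = itertools.takewhile(lambda n: n < 10, itertools.count())

print(r[4], r[2:5])      # indexing and slicing work on the list

failures = []
for thing in (i, g):
    try:
        thing[4]         # iterators do not support subscripting
    except TypeError as err:
        failures.append(type(thing).__name__)
        print(type(thing).__name__, '->', err)

# Looping still works for all of them, with identical results.
same = list(r) == list(i) == list(g)
print(same)
```

Both iterators reject the subscript with a TypeError, yet all three objects produce the same list when "concretized".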
With enough contortions, you can get an item for every type of sequence/iterator.
One way is to loop until you get there. Another hackish combination might be
something like:
The pre-call to itertools.tee() preserves the original iterator.
For a slice, you might use the itertools.islice() function, wrapped
up in contortions.
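One possible shape for these contortions is sketched below in Python 3. The helper names nth() and peek_slice() are hypothetical, not from the article; only itertools.tee() and itertools.islice() are real library calls.

```python
import itertools

def nth(iterator, n):
    """Return item n plus an iterator that still yields every item."""
    keep, probe = itertools.tee(iterator)     # split before peeking
    item = next(itertools.islice(probe, n, None))
    return item, keep

def peek_slice(iterator, start, stop):
    """Return items start..stop-1 plus an iterator that is still intact."""
    keep, probe = itertools.tee(iterator)
    return list(itertools.islice(probe, start, stop)), keep

it = iter(range(10))
item, it = nth(it, 4)
chunk, it = peek_slice(it, 2, 5)
rest = list(it)                               # nothing was lost
print(item, chunk, rest)
```

Note that the caller has to rebind its variable to the iterator handed back by tee(); the one passed in should not be used again. That bookkeeping is exactly the kind of effort the text complains about.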
So with some effort, you can coax an object to behave like both a sequence
and an iterator. But this much effort should really not be necessary;
indexing and slicing should "just work" whether a concrete sequence or an iterator
is involved.
Notice that the Indexable class wrapper is still not as flexible
as might be desirable. The main problem is that we create a new copy of the
iterator every time. A better approach would be to cache the head of the sequence
when we slice it, then use that cached head for future access of elements already
examined. Of course, there is a trade-off between memory used and the speed
penalty of running through the iterator. Nonetheless, the best thing would be
if Python itself would do all of this "behind the scenes" -- the behavior might
be fine-tuned somehow by "power users," but average programmers should not have
to think about any of this.
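The caching approach described above might look roughly like this. It is a sketch under my own assumptions (Python 3, a hypothetical class name CachedIndexable), not the Indexable class from the article, whose listing is not reproduced here.

```python
import itertools

class CachedIndexable(object):
    """Hypothetical wrapper: indexing and slicing over any iterable,
    remembering every item already pulled from the source."""

    def __init__(self, iterable):
        self._source = iter(iterable)
        self._head = []                  # items consumed so far

    def _fill(self, count):
        """Grow the cache to `count` items; None means exhaust the source."""
        if count is None:
            self._head.extend(self._source)
        elif count > len(self._head):
            missing = count - len(self._head)
            self._head.extend(itertools.islice(self._source, missing))

    def __getitem__(self, index):
        if isinstance(index, slice):
            bounds = (index.start, index.stop, index.step)
            open_ended = index.stop is None or any(
                b is not None and b < 0 for b in bounds)
            self._fill(None if open_ended else index.stop)
        elif index < 0:
            self._fill(None)             # negative index needs the full length
        else:
            self._fill(index + 1)
        return self._head[index]         # IndexError if the source ran out

    def __iter__(self):
        position = 0
        while True:
            self._fill(position + 1)
            if position >= len(self._head):
                return
            yield self._head[position]
            position += 1

numbers = CachedIndexable(itertools.count())   # an endless iterator
print(numbers[4], numbers[2:5], numbers[1])
print(len(numbers._head))                      # only five items were pulled
```

The trade-off mentioned in the text is visible here: the cache grows with the highest index touched, and an open-ended slice or negative index on an endless source would never return.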
In the next installment in this series, I'll discuss accessing methods using
attribute syntax.
It's clear that Python is under pressure from Ruby :-)
It's hard to believe Python is more than 15 years old already. While that
may seem old for a programming
language, in the case of Python it means the language is mature. In spite
of its age, the newest versions of Python are powerful, providing everything
you would expect from a modern programming language.
This article provides a rundown of the new and important features of Python
2.5. I assume that you're familiar with Python and aren't looking for an introductory
tutorial, although in some
cases I do introduce some of the material, such as generators.
[Sep 30, 2006]
Python 2.5
Release We are pleased to announce the release of Python 2.5 (FINAL),
the final, production release of Python 2.5, on September 19th, 2006.
conditional expressions look pretty ugly.
Some standard Python objects now support Ruby-style syntax using the 'with'
statement. File objects are one example:
with open('/etc/passwd', 'r') as f:
    for line in f:
        print line
        ... more processing code ...
Why one should stray from mainstream C-style is unclear to me. When developing computer language
syntax, natural language imitation should not be the priority. Also, being different
for the sake of being different is so very early 90s.
[01 Feb 2000] Python columnist Evelyn Mitchell brings you a quick reference
and learning tools for newbies who want to get to know the language. Print it,
keep it close at hand, and get down to programming!
Microsoft has shipped the release candidate for IronPython
1.0 on its CodePlex community source site.
In a July 25 blog post, S. "Soma" Somasegar, corporate vice president of
Microsoft's developer division, praised the team for getting to a release candidate
for a dynamic language that runs on the Microsoft CLI (Common Language Infrastructure).
Microsoft designed the CLI to support a variety of programming languages. Indeed,
"one of the great features of the .Net framework is the Common Language Infrastructure,"
Somasegar said.
"IronPython is a project that implements the dynamic object-oriented Python
language on top of the CLI," Somasegar said. IronPython is both well-integrated
with the .Net Framework and is a true implementation of the Python language,
he said.
And ".Net integration means that this rich programming framework is available
to Python developers and that they can interoperate with other .Net languages
and tools," Somasegar said. "All of Python's dynamic features like an
interactive interpreter, dynamically modifying objects and even metaclasses
are available. IronPython also leverages the CLI to achieve good performance,
running up to 1.5 times faster than the standard C-based Python implementation
on the standard Pystone benchmark."
Moreover, the download of the release candidate for IronPython 1.0 "includes
a tutorial which gives .Net programmers a great way to get started with Python
and Python programmers a great way to get started with .Net," Somasegar said.
Somasegar said he finds it "exciting to see that the Visual Studio
SDK [software development kit] team has used the IronPython project as a chance
to show language developers how they can build support for their language into
Visual Studio. They have created a sample, with source, that shows some
of the basics required for integrating into the IDE including the project
system, debugger, interactive console, IntelliSense and even the Windows forms
designer."
IronPython is the creation of Jim Hugunin, a developer on the Microsoft
CLR (Common Language Runtime) team. Hugunin joined Microsoft in 2004.
In a statement written in July 2004, Hugunin said: "My plan was to do a little
work and then write a short pithy article called, 'Why .Net is a terrible platform
for dynamic languages.' My plans changed when I found the CLR to be an excellent
target for the highly dynamic Python language. Since then I've spent much of
my spare time working on the development of IronPython."
However, Hugunin said he grew frustrated with the slow pace of progress
he could make by working on the project only in his spare time, so he decided
to join Microsoft.
IronPython is governed by Microsoft's
Shared Source license.
Python, the open source scripting language, has grown tremendously popular in
the last five years—and with good reason. Python boasts a sophisticated object
model that wise developers can exploit in ways that Java, C++, and C# developers
can only dream of.
This article is the first in a two-part series that will dig deep to explore
the fascinating new-style Python object model, which was introduced in Python
2.2 and improved in 2.3 and 2.4. The object model and type system are very dynamic
and allow quite a few interesting tricks. In this article I will describe the
object model and type system; explore various entities; explain the life cycle
of an object; and introduce some of the countless ways to modify and customize
almost everything you thought immutable at runtime.
The Python Object Model
Python's objects are basically a bunch of attributes. These attributes include
the type of the object, fields, methods, and base classes. Attributes are also
objects, accessible through their containing objects.
The built-in dir() function is your best friend when it comes to exploring
python objects. It is designed for interactive use and, thereby, returns a list
of attributes that the implementers of the dir function thought would be relevant
for interactive exploration. This output, however, is just a subset of all the
attributes of the object. The code sample below shows the dir function in action.
It turns out that the integer 5 has many attributes that seem like mathematical
operations on integers.
The function foo has many attributes too. The most important one is __call__
which means it is a callable type. You do want to call your functions, don't
you?
Next I'll define a class called 'A' with two methods, __init__ and dump, and
an instance field 'x' and also an instance 'a' of this class. The dir function
shows that the class's attributes include the methods and the instance has all
the class attributes as well as the instance field.
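The three explorations described above can be sketched like this. The article's own listings are not reproduced here, so the function foo and class A below are reconstructions from the prose, written for Python 3 (my assumption).

```python
def foo():
    pass

class A(object):
    def __init__(self):
        self.x = 3              # instance field

    def dump(self):
        print(self.x)

a = A()

# The integer 5 exposes arithmetic operations as attributes.
print([name for name in dir(5) if name in ('__add__', '__mul__', '__neg__')])

# A function is callable because it has a __call__ attribute.
print('__call__' in dir(foo))

# The class lists its methods; only the instance also lists the field x.
print('dump' in dir(A), 'x' in dir(A))
print('dump' in dir(a), 'x' in dir(a))
```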
The Python Type System
Python has many types, many more than you find in most languages (at least explicitly).
This means that the interpreter has a lot of information at runtime and the
programmer can take advantage of it by manipulating types at runtime. Most types
are defined in the types module, which is shown in the code immediately below.
Types come in various flavors: There are built-in types, new-style classes (derived
from object), and old-style classes (pre Python 2.2). I will not discuss old-style
classes since they are frowned upon by everybody and exist only for backward
compatibility.
Python's type system is object-oriented. Every type (including built-in types)
is derived (directly or indirectly) from object. Another interesting fact is
that types, classes and functions are all first-class citizens and have a type
themselves. Before I delve down into some juicy demonstrations let me introduce
the built-in function 'type'. This function returns the type of any object (and
also serves as a type factory). Most of these types are listed in the types
module, and some of them have a short name. Below I've unleashed the 'type'
function on several objects: None, integer, list, the object type, type itself,
and even the 'types' module. As you can see the type of all types (list type,
object, and type itself) is 'type' or in its full name types.TypeType (no kidding,
that's the name of the type).
What is the type of classes and instances? Well, classes are types of course,
so their type is always 'type' (regardless of inheritance). The type of class
instances is their class.
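A short sketch of 'type' applied to the objects named above, plus a class and an instance. It is written for Python 3, which is my assumption: there the alias types.TypeType no longer exists and the type of all types is shown simply as <class 'type'>.

```python
import types

class A(object):
    pass

a = A()

# Built-ins, the two core types, a module, a class and an instance.
for obj in (None, 5, [], object, type, types, A, a):
    print(repr(obj)[:40], '->', type(obj))
```

The class A reports 'type' as its type regardless of what it inherits from, while the instance a reports A.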
It's time for the scary part—a vicious cycle: 'type' is the type of object,
but object is the base class of type. Come again? 'type' is the type of object,
but object is the base class of type. That's right—circular dependency. 'object'
is a 'type' and 'type' is an 'object'.
How can it be? Well, since the core entities in Python are not implemented themselves
in Python (there is PyPy but that's another story) this is not really an issue.
The 'object' and 'type' are not really implemented in terms of each other.
The one important thing to take home from this is that types are objects
and are therefore subject to all the ramifications thereof. I'll discuss those
ramifications very shortly.
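The cycle can be checked directly from the interpreter. A minimal sketch in Python 3 (the behavior is the same for new-style classes in Python 2):

```python
# 'type' is the type of object ...
print(type(object) is type)

# ... but object is the base class of type.
print(type.__bases__)

# So each one is an instance of the other, and type is its own type.
print(isinstance(object, type), isinstance(type, object))
print(type(type) is type)

# object itself sits at the root and has no bases at all.
print(object.__bases__)
```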
Instances, Classes, Class Factories, and Metaclasses
When I talk about instances I mean object instances of a class derived from
object (or the object class itself). A class is a type, but as you recall it
is also an object (of type 'type'). This allows classes to be created and manipulated
at runtime. This code demonstrates how to create a class at runtime and instantiate
it.
def init_method(self, x, y):
    self.x = x
    self.y = y

def dumpSum_method(self):
    print self.x + self.y

D = type('DynamicClass',
         (object,),
         {'__init__': init_method, 'dumpSum': dumpSum_method})
d = D(3, 4)
d.dumpSum()
As you can see I created two functions (init_method and dumpSum_method) and
then invoked the ubiquitous 'type' function as a class factory to create a class
called 'DynamicClass,' which is derived from 'object' and has two methods (one
is the __init__ constructor).
It is pretty simple to create the functions themselves on the fly too. Note
that the methods I attached to the class are regular functions that can be called
directly (provided their self-argument has x and y members, similar to C++ template
arguments).
Functions, Methods and other Callables
Python enjoys a plethora of callable objects. Callable objects are function-like
objects that can be invoked by calling their () operator. Callable objects include
plain functions (module-level), methods (bound, unbound, static, and class methods)
and any other object that has a __call__ function attribute (either in its own
dictionary, via one of its ancestors, or through a descriptor).
It's truly complicated so the bottom line is to remember that all these flavors
of callables eventually boil down to a plain function. For example, in the code
below the class A defines a method named 'foo' that can be accessed in three ways:
- through an instance, in which case it is a bound method (bound implicitly to its instance)
- through the class A itself, in which case it is an unbound method (the instance
  must be supplied explicitly)
- directly from A's dictionary, in which case it is a plain function (but
  you must still call it with an instance of A).
So, all methods are actually functions but the runtime assigns different types
depending on how you access them.
class A(object):
    def foo(self):
        print 'I am foo'

>>> a = A()
>>> a.foo
<bound method A.foo of <__main__.A object at 0x00A13EB0>>
>>> A.foo
<unbound method A.foo>
>>> A.__dict__['foo']
<function foo at 0x00A0A3F0>
>>> a.foo()
I am foo
>>> A.foo(a)
I am foo
>>> A.__dict__['foo'](a)
I am foo
Let's talk about static methods and class methods. Static methods are very simple.
They are similar to static methods in Java/C++/C#. They are scoped by their
class but they don't have a special first argument like instance methods or
class methods do; they act just like a regular function (you must provide all
the arguments since they can't access any instance fields). Static methods are
not so useful in Python because regular module-level functions are already scoped
by their module and they are the natural mapping to static methods in Java/C++/C#.
Class methods are an exotic animal. Their first argument is the class itself
(traditionally named cls) and they are used primarily in esoteric scenarios.
The staticmethod() and classmethod() built-ins actually return a wrapper around the original function
object. In the code that follows, note that the static method may be accessed
either through an instance or through a class. The class method takes the class
(cls) as its first argument, but it is invoked directly through the class, with no
explicit class argument. This is different from an unbound method, where you
have to provide an instance explicitly as the first argument.
class A(object):
    def foo():
        print 'I am foo'
    def foo2(cls):
        print 'I am foo2', cls
    def foo3(self):
        print 'I am foo3', self
    foo=staticmethod(foo)
    foo2=classmethod(foo2)

>>> a = A()
>>> a.foo()
I am foo
>>> A.foo()
I am foo
>>> A.foo2()
I am foo2 <class '__main__.A'>
>>> a.foo3()
I am foo3 <__main__.A object at 0x00A1AA10>
Note that classes are callable objects by themselves and operate as instance
factories. When you "call" a class you get an instance of that class as a result.
A different kind of callable object is an object that has a __call__ method.
If you want to pass around a function-like object with its context intact, __call__
can be a good thing.
Listing 1
features a simple 'add' function that can be replaced with a caching adder class
that stores results of previous calculations. First, notice that the test function
expects a function-like object called 'add' and it just invokes it as a function.
The 'test' function is called twice—once with a simple function and a second
time with the caching adder instance. Continuations in Python can also be implemented
using __call__ but that's another article.
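Listing 1 is not reproduced on this page, so here is a sketch of what it describes, under my own assumptions: Python 3, a hypothetical CachingAdder class, and a driver named exercise() standing in for the article's 'test' function.

```python
def add(a, b):
    return a + b

class CachingAdder(object):
    """Callable object that remembers the sums it has already computed."""

    def __init__(self):
        self.cache = {}
        self.hits = 0

    def __call__(self, a, b):
        key = (a, b)
        if key in self.cache:
            self.hits += 1               # answered from the cache
        else:
            self.cache[key] = a + b
        return self.cache[key]

def exercise(add):
    # Only assumes that 'add' can be invoked like a function.
    return [add(1, 2), add(1, 2), add(3, 4)]

plain_result = exercise(add)             # a simple function
caching_adder = CachingAdder()
cached_result = exercise(caching_adder)  # an instance with __call__
print(plain_result, cached_result, caching_adder.hits)
```

The driver cannot tell the two apart, but the instance carries its context (the cache and the hit counter) along with it.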
Metaclasses
Metaclasses are a concept that doesn't exist in today's mainstream programming
languages. A metaclass is a class whose instances are classes. You already encountered
a metaclass in this article called 'type'. When you invoke 'type' with a class
name, a base-classes tuple, and an attribute dictionary, it creates
a new user-defined class of the specified type. So the __class__ attribute of
every class always contains its metaclass (normally 'type').
That's nice, but what can you do with a metaclass? It turns out, you can
do plenty. Metaclasses allow you to control everything about the class that
will be created: name, base classes, methods, and fields. How is it different
from simply defining any class you want or even creating a class dynamically
on the fly? Well, it allows you to intercept the creation of classes that are
predefined as in aspect-oriented programming. This is a killer feature that
I'll be discussing in a follow-up to this article.
After a class is defined, the interpreter looks for a meta-class. If it finds
one it invokes its __init__ method with the class instance and the meta-class
gets a stab at modifying it (or returning a completely different class). The
interpreter will use the class object returned from the meta-class to create
instances of this class.
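As an illustration of this interception, here is a minimal sketch with a hypothetical metaclass named Registering. It uses the Python 3 spelling (the metaclass keyword in the class statement), which is my assumption; the article itself targets Python 2.

```python
class Registering(type):
    """Hypothetical metaclass that records every class it creates."""
    registry = []

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        Registering.registry.append(name)
        cls.created_by = 'Registering'    # modify the new class in place

class Plugin(metaclass=Registering):
    pass

class CsvPlugin(Plugin):                  # the metaclass is inherited
    pass

print(Registering.registry)
print(type(CsvPlugin).__name__, CsvPlugin.created_by)
```

Neither Plugin nor CsvPlugin contains any registration code; the metaclass gets its stab at each class as soon as the class statement finishes.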
So, how do you stick a custom metaclass on a class (new-style classes only)?
Either you declare a __metaclass__ field or one of your ancestors has a __metaclass__
field. The inheritance method is intriguing because Python allows multiple inheritance.
If you inherit from two classes that have custom metaclasses you are in for
a treat—one of the metaclasses must derive from another. The actual metaclass
of your class will be the most derived metaclass:
class M1(type): pass
class M2(M1): pass
class C2(object): __metaclass__=M2
class C1(object): __metaclass__=M1
class C3(C1, C2): pass

classes = [C1, C2, C3]
for c in classes:
    print c, c.__class__
    print '------------'
Output:
<class '__main__.C1'> <class '__main__.M1'>
------------
<class '__main__.C2'> <class '__main__.M2'>
------------
<class '__main__.C3'> <class '__main__.M2'>
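For readers on Python 3, where a __metaclass__ attribute in the class body has no effect, the same experiment can be sketched with the metaclass keyword. The unrelated metaclass M3 and the classes X and Bad are my additions, included to show what happens when neither metaclass derives from the other.

```python
class M1(type): pass
class M2(M1): pass
class M3(type): pass             # unrelated to M1 and M2

class C1(metaclass=M1): pass
class C2(metaclass=M2): pass
class C3(C1, C2): pass           # M2 derives from M1, so M2 wins

class X(metaclass=M3): pass

try:
    class Bad(C1, X): pass       # neither M1 nor M3 derives from the other
    conflict = None
except TypeError as err:
    conflict = str(err)

print(type(C3).__name__)
print(conflict)
```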
Day In The Life of a Python Object
To get a feel for all the dynamics involved in using Python objects let's track
a plain object (no tricks) starting from its class definition, through its class
instantiation, access its attributes, and see it to its demise. Later on I'll
introduce the hooks that allow you to control and modify this workflow.
The best way to go about it is with a monstrous simulation.
Listing 2
contains a simulation of a bunch of monsters chasing and eating some poor person.
There are three classes involved: a base Monster class, a MurderousHorror class
that inherits from the Monster base class, and a Person class that gets to be
the victim. I will concentrate on the MurderousHorror class and its instances.
Class Definition
MurderousHorror inherits the 'frighten' and 'eat' methods from Monster and adds
a 'chase' method and a 'speed' field. The 'hungry_monsters' class field stores
a list of all the hungry monsters and is always available through the class,
base class, or instance (Monster.hungry_monsters, MurderousHorror.hungry_monsters,
or m1.hungry_monsters). In the code below you can see (via the handy 'dir' function)
the MurderousHorror class and its m1 instance. Note that methods such as 'eat,'
'frighten,' and 'chase' appear in both, but instance fields such as 'hungry'
and 'speed' appear only in m1. The reason is that instance methods can be accessed
through the class as unbound methods, but instance fields can be accessed only
through an instance.
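Listing 2 is not reproduced on this page. The sketch below is a guess at its general shape, based only on the names mentioned in the prose (Monster, MurderousHorror, hungry_monsters, frighten, eat, chase, speed, hungry); the method bodies and the public() helper are hypothetical, and the code is written for Python 3.

```python
class Monster(object):
    hungry_monsters = []                 # class field shared by all monsters

    def __init__(self):
        self.hungry = True               # instance field
        Monster.hungry_monsters.append(self)

    def frighten(self, person):
        return 'Boo, %s!' % person

    def eat(self, person):
        self.hungry = False
        Monster.hungry_monsters.remove(self)

class MurderousHorror(Monster):
    def __init__(self, speed=1):
        Monster.__init__(self)           # base __init__ is called explicitly
        self.speed = speed               # instance field

    def chase(self, person):
        return 'chasing %s at speed %d' % (person, self.speed)

def public(names):
    """Drop the double-underscore names to keep the output short."""
    return [n for n in names if not n.startswith('_')]

m1 = MurderousHorror(speed=3)
print(public(dir(MurderousHorror)))
print(public(dir(m1)))
print(Monster.hungry_monsters is m1.hungry_monsters)
```

The methods and the class field show up for both the class and the instance, while 'hungry' and 'speed' show up only for m1.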
class NoInit(object):
    def foo(self):
        self.x = 5
    def bar(self):
        print self.x

if __name__ == '__main__':
    ni = NoInit()
    assert(not ni.__dict__.has_key('x'))
    try:
        ni.bar()
    except AttributeError, e:
        print e
    ni.foo()
    assert(ni.__dict__.has_key('x'))
    ni.bar()
Output:
'NoInit' object has no attribute 'x'
5
Object Instantiation and Initialization
Instantiation in Python is a two-phase process. First, __new__ is called with
the class as its first argument, followed by the rest of the arguments, and should
return an uninitialized instance of the class. Afterward, __init__ is called
with the instance as first argument. (You can read more about __new__ in the
Python reference manual.)
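The two phases can be made visible by logging both calls. A minimal sketch in Python 3, with a hypothetical Point class and a calls list used purely for illustration:

```python
calls = []

class Point(object):
    def __new__(cls, x, y):
        # Phase 1: receives the class plus the constructor arguments.
        calls.append(('__new__', cls.__name__, x, y))
        return super().__new__(cls)      # an uninitialized instance

    def __init__(self, x, y):
        # Phase 2: receives the instance created by __new__.
        calls.append(('__init__', x, y))
        self.x, self.y = x, y

p = Point(3, 4)
print(calls)
```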
When a MurderousHorror is instantiated __init__ is the first method called.
__init__ is similar to a constructor in C++/Java/C#. The instance calls the
Monster base class's __init__ and initializes its speed field. The difference
between Python and C++/Java/C# is that in Python there is no notion of a parameter-less
default constructor, which, in other languages, is automatically generated for
every class that doesn't have one. Also, there is no automatic call to the base
class' default __init__ if the derived class doesn't call it explicitly. This
is quite understandable since no default __init__ is generated.
In C++/Java/C# you declare instance variables in the class body. In Python
you define them inside a method by explicitly specifying 'self.SomeAttribute'.
So, if there is no __init__ method to a class it means its instances have no
instance fields initially. That's right. It doesn't HAVE any instance fields.
Not even uninitialized instance fields.
The previous code sample (above) is a perfect example of this phenomenon.
The NoInit class has no __init__ method. The x field is created (put into its
__dict__) only when foo() is called. When the program calls ni.bar() immediately
after instantiation the 'x' attribute is not there yet, so I get an 'AttributeError'
exception. Because my code is robust, fault tolerant, and self healing (in carefully
staged toy programs), it bravely recovers and continues to the horizon by calling
foo(), thus creating the 'x' attribute, and ni.bar() can print 5 successfully.
Note that in Python __init__ is not much more than a regular method. It is
called indeed on instantiation, but you are free to call it again after initialization
and you may call other __init__ methods on the same object from the original
__init__. This last capability is also available in C#, where it is called constructor
chaining. It is useful when you have multiple constructors that share common
initialization, which is also one of the constructors/initializers. In this
case you don't need to define another special method that contains the common
code and call it from all the constructors/initializers; you can just call the
shared constructor/initializer directly from all of them.
Attribute Access
An attribute is an object that can be accessed from its host using the dot notation.
There is no difference at the attribute access level between methods and fields.
Methods are first-class citizens in Python. When you invoke a method of an object,
the method object is looked up first using the same mechanism as a non-callable
field. Then the () operator is applied to the returned object. This example
demonstrates this two-step process:
class A(object):
    def foo(self):
        print 3

if __name__ == '__main__':
    a = A()
    f = a.foo
    print f
    print f.im_self
    a.foo()
    f()
Output:
<bound method A.foo of <__main__.A object at 0x00A03EB0>>
<__main__.A object at 0x00A03EB0>
3
3
The code retrieves the a.foo bound method object and assigns it to a local variable
'f'. 'f' is a bound method object, which means its im_self attribute points
to the instance to which it is bound. Finally, a.foo is invoked through the
instance (a.foo()) and by calling f directly with identical results. Assigning
bound methods to local variables is a well known optimization technique due
to the high cost of attribute lookup. If you have a piece of Python code that
seems to perform under the weather there is a good chance you can find a tight
loop that does a lot of redundant lookups. I will talk later about all the ways
you can customize the attribute access process and why it is so costly.
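The optimization mentioned above can be sketched as follows, in Python 3 with hypothetical function names. The timings are printed without any expected values, since the size of the gain depends heavily on the interpreter version and may be small on recent ones.

```python
import timeit

def squares_lookup(n):
    result = []
    for i in range(n):
        result.append(i * i)     # 'append' is looked up on every pass
    return result

def squares_hoisted(n):
    result = []
    append = result.append       # bound method looked up once
    for i in range(n):
        append(i * i)
    return result

for func in (squares_lookup, squares_hoisted):
    seconds = timeit.timeit(lambda: func(1000), number=200)
    print(func.__name__, round(seconds, 4))
```

Both functions build the same list; the second one just moves the attribute lookup out of the tight loop.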
Destruction
The __del__ method is called when an instance is about to be destroyed (its
reference count reaches 0). It is not guaranteed that the method will ever be
called in situations such as circular references between objects or references
to the object in an exception. Also the implementation of __del__ may create
a new reference to its instance so it will not be destroyed after all. Even
when everything is simple and __del__ is called, there is no telling when it
will actually be called due to the nature of the garbage collector. The bottom
line is if you need to free some scarce resource attached to an object do it
explicitly when you are done using it and don't wait for __del__.
A try-finally block is a popular choice for releasing resources since it guarantees
the resource will be released even in the face of exceptions. The last reason
not to use __del__ is that its interaction with the 'del' statement
may confuse programmers. 'del' simply removes a name and decrements the reference count by 1;
it doesn't call '__del__' directly or cause the object to be magically destroyed. In the
next code sample I use the sys.getrefcount() function to determine the reference
count to an object before and after calling 'del'. Note that I subtract 1 from
sys.getrefcount() result because it also counts the temporary reference to its
own argument.
import sys

class A(object):
    def __del__(self):
        print "That's it for me"

if __name__ == '__main__':
    a = A()
    b = a
    print sys.getrefcount(a)-1
    del b
    print sys.getrefcount(a)-1
Output:
2
1
That's it for me
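The explicit-release advice given earlier in this section can be sketched with try-finally. The Connection class below is hypothetical, a stand-in for any scarce resource with its own close() method, and the code is written for Python 3.

```python
class Connection(object):
    """Hypothetical scarce resource with an explicit close() method."""

    def __init__(self):
        self.is_open = True

    def send(self, data):
        if not data:
            raise ValueError('nothing to send')

    def close(self):
        self.is_open = False

conn = Connection()
try:
    try:
        conn.send('')            # raises ValueError
    finally:
        conn.close()             # runs whether or not send() raised
except ValueError as err:
    print('failed:', err)

print(conn.is_open)
```

The resource is released at a known point, without relying on when (or whether) __del__ runs.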
Hacking Python
Let the games begin. In this section I will explore different ways to customize
attribute access. The topics include the __getattribute__ hook, descriptors,
and properties.
- __getattr__, __setattr__ and __getattribute__
These special methods control attribute access to class instances. The standard
algorithm for attribute lookup returns an attribute from the instance dictionary
or one of its base class's dictionaries (descriptors will be described in the
next section). They are supposed to return an attribute object or raise AttributeError
exception. If you define some of these methods in your class they will be called
upon during attribute access under some conditions.
__getattr__ and __setattr__ work with old-style and new-style classes.
__getattr__ is called to get attributes that cannot be found using the
standard attribute lookup algorithm.
__setattr__ is called for setting the value of any attribute. This asymmetry
is necessary to allow adding new attributes to instances.
__getattribute__ works for new-style classes only. It is called to get
any attribute (existing or non-existing). __getattribute__ has precedence
over __getattr__, so if you define both, __getattr__ will not be called
(unless __getattribute__ raises AttributeError exception).
Listing 3
is an interactive example. It is designed to allow you to play around with it
and comment out various functions to see the effect. It introduces the class
A with a single 'x' attribute. It has __getattr__, __setattr__, and __getattribute__
methods. __getattribute__ and __setattr__ simply forward any attribute access
to the default (lookup or set value in dictionary). __getattr__ always returns
7. The main program starts by assigning 6 to the non-existing attribute 'y'
(happens via __setattr__) and then prints the preexisting 'x', the newly created
'y', and the still non-existent 'z'. 'x' and 'y' exist now, so they are accessible
via __getattribute__. 'z' doesn't exist so __getattribute__ fails and __getattr__
gets called and returns 7. (Author's Note: This is contrary to the documentation.
The documentation claims if __getattribute__ is defined, __getattr__ will never
be called, but this is not the actual behavior.)
Descriptors
A descriptor is an object that implements one or more of three methods: __get__, __set__, and
__delete__. If you put such a descriptor in the __dict__ of a class then
whenever the attribute with the name of the descriptor is accessed one of the
special methods is executed according to the access type (__get__ for read,
__set__ for write, and __delete__ for delete). This simple indirection
scheme allows total control over attribute access.
The following code sample shows a silly write-only descriptor used to store
passwords. Its value may not be read nor deleted (it throws AttributeError exception).
Of course the descriptor object itself and the password can be accessed directly
through A.__dict__['password'].
class WriteOnlyDescriptor(object):
    def __init__(self):
        self.store = {}
    def __get__(self, obj, objtype=None):
        raise AttributeError
    def __set__(self, obj, val):
        self.store[obj] = val
    def __delete__(self, obj):
        raise AttributeError

class A(object):
    password = WriteOnlyDescriptor()

if __name__ == '__main__':
    a = A()
    try:
        print a.password
    except AttributeError, e:
        print e.__doc__
    a.password = 'secret'
    print A.__dict__['password'].store[a]
Descriptors with both __get__ and __set__ methods are called data descriptors.
In general, data descriptors take lookup precedence over instance dictionaries,
which take precedence over non-data descriptors. If you try to assign a value
to a non-data descriptor attribute the new value will simply replace the descriptor.
However, if you try to assign a value to a data descriptor the __set__ method
of the descriptor will be called.
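The precedence rules can be checked with two toy descriptors. This is a sketch in Python 3; the class names DataDesc and NonDataDesc are hypothetical.

```python
class DataDesc(object):
    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        obj.__dict__['_seen'] = value    # intercept the write

class NonDataDesc(object):
    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'

class A(object):
    d = DataDesc()
    n = NonDataDesc()

a = A()
before = (a.d, a.n)

a.d = 'new'                  # routed to DataDesc.__set__
a.n = 'new'                  # lands in a.__dict__ and hides the descriptor
a.__dict__['d'] = 'hidden'   # even a direct dict entry cannot beat a data descriptor
after = (a.d, a.n)

print(before)
print(after)
```

Note that the assignment to a.n replaces the descriptor only as seen through this instance; the NonDataDesc object is still in A's dictionary and still serves other instances.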
Properties
Properties are managed attributes. When you define a property you can provide
get, set, and del functions as well as a doc string. When the attribute is accessed
the corresponding functions are called. This sounds a lot like descriptors and
indeed it is mostly a syntactic sugar for a common case.
This final code sample is another version of the silly password store using
properties. The __password field is "private." Class A has a 'password' property
that, when accessed as in 'a.password,' invokes the getPassword or setPassword
methods. Because the getPassword method raises the AttributeError exception,
the only way to get to the actual value of the __password attribute is by circumventing
the Python fake privacy mechanism. This is done by prefixing the attribute name
with an underscore and the class name: a._A__password. How is it different from
descriptors? It is less powerful and flexible but more pleasing to the eye.
You must define an external descriptor class with descriptors. This means you
can use the same descriptor for different classes and also that you can replace
regular attributes with descriptors at runtime.
Properties are more cohesive. The get, set functions are usually methods of
the same class that contain the property definition. For programmers coming
from languages such as C# or Delphi, Properties will make them feel right at
home (too bad Java is still sticking to its verbose java beans).
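The property-based password store is not reproduced on this page. The sketch below rebuilds it from the description (a "private" __password field, getPassword and setPassword methods, a 'password' property) and is written for Python 3; the error message and doc string are my own.

```python
class A(object):
    def __init__(self):
        self.__password = None            # name-mangled to _A__password

    def getPassword(self):
        raise AttributeError('password is write-only')

    def setPassword(self, value):
        self.__password = value

    password = property(getPassword, setPassword, doc='write-only password')

a = A()
a.password = 'secret'                     # calls setPassword

try:
    a.password                            # calls getPassword, which refuses
    readable = True
except AttributeError:
    readable = False

print(readable, a._A__password)           # the mangled name still works
```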
Python's Richness a Mixed Blessing
There are many mechanisms to control attribute access at runtime starting with
just dynamic replacement of an attribute in the __dict__ at runtime. Other methods
include the __getattr__/__setattr__, descriptors, and finally properties. This
richness is a mixed blessing. It gives you a lot of choice, which is good because
you can choose whatever is appropriate to your case. But, it is also bad because
you HAVE to choose even if you just choose to ignore it. The assumption, for
better or worse, is that people who work at this level should be able to handle
the mental load.
In my next article, I will pick up where I've left off. I'll begin by contrasting
metaclasses with decorators, then explore the Python execution model, and explain
how to examine stack frames at runtime. Finally, I'll demonstrate how to augment
the Python language itself using these techniques. I'll introduce a private
access checking feature that can be enforced at runtime.
Gigi Sayfan is a software developer working on CELL applications for
Sony Playstation3. He specializes in cross-platform object-oriented programming
in C/C++/C#/Python with emphasis on large-scale distributed systems.
Meld is a GNOME 2 visual diff and merge tool.
It integrates especially well with CVS. The diff viewer lets you edit files
in place (diffs update dynamically), and a middle column shows detailed changes
and allows merges. The margins show location of changes for easy browsing, and
it also features a tabbed interface that allows you to open many diffs at once.
This resource is provided so that people can
use CWM, find out what it does (documentation used to be sparse), and perhaps
even contribute to its development.
built-in sets - the sets module, introduced in 2.3, has now
been implemented in C, and the set and frozenset types are available
as built-in types (PEP
218)
unification of integers and long integers - an operation
that would return a number too big for an integer will automatically
return a long integer. (PEP
237)
generator expressions - generator expressions are similar
to a list comprehension, but instead of creating the entire list of
results they create a generator that returns the results one by one.
This allows for efficient handling of very large lists. (PEP
289)
reversed() - a new builtin that takes a sequence and returns
an iterator that loops over the elements of the sequence in reverse
order (PEP 322)
new sort() keyword arguments - sort() now accepts keyword
arguments cmp, key and reverse
sorted() - a new builtin sorted() acts like an in-place list.sort()
but can be used in expressions, as it returns a copy of the sequence,
sorted.
string methods - strings gained an rsplit() method, and the
string methods ljust(), rjust() and center() accept an argument to specify
the fill character.
eval() now accepts any form of object that acts as a mapping
as its argument for locals, rather than only accepting a dictionary.
There's all sorts of new and shiny evil possible thanks to this little
change.
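A few of the additions listed above can be tried out in a short sketch. It is written for Python 3, where the cmp argument to sort() no longer exists, so only key and reverse are shown. The sample data is invented for illustration.

```python
words = ["pear", "Apple", "fig", "banana"]

# Generator expression: values are produced one by one, no intermediate list.
total_length = sum(len(w) for w in words)

# sorted() returns a new list and leaves the input untouched;
# key= and reverse= mirror the list.sort() keyword arguments.
by_length_desc = sorted(words, key=len, reverse=True)

# reversed() returns an iterator over the sequence in reverse order.
backwards = list(reversed(words))

# set and frozenset are built-in types, no import needed.
vowels = frozenset("aeiou")
letters = set("banana")
common = letters & vowels

# rsplit() splits from the right; rjust() accepts a fill character.
head, tail = "a.b.c".rsplit(".", 1)
padded = "7".rjust(3, "0")

print(total_length, by_length_desc, backwards, common, head, tail, padded)
```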
If you need more speed than native Python provides, you can always write
code in C and wrap it so it is callable from Python. The wrapping is really
easy to do, once you have understood the general concepts involved in it.
The product I currently work on has about 10000 lines of C code (crypto
and networking) which is used this way, and it works perfectly. For more
information about extending Python with C, see:
Dive Into Python is a free Python book for experienced programmers.
You can read the
book online, or download
it in a variety of formats. It is also available in
multiple languages.
This book is still being written. The first three chapters are a solid overview
of Python programming. Chapters covering
HTML processing,
XML processing, and
unit testing are complete, and a chapter covering
regression testing is in progress. This is not a teaser site for some larger
work for sale; all new content will be published here, for free, as soon as
it’s ready. You can
read the revision
history to see what’s new. Updated 28 July 2002
Wing IDE for Python
A Python IDE that includes a source browser and an editor. The editor supports
folding.
Designed to maximize programmer productivity, Wing IDE and Python can reduce
development and maintenance costs by 50 to 90 percent of that seen with languages
such as C, C++, Java, VB, and Perl.
Whether you are building dynamic web sites, desktop applications, or complex
enterprise solutions, Wing IDE and Python provide you with a fast, scalable,
and portable development platform that lets you concentrate on building application-specific
functionality.
Wing and Python are easy to learn and integrate well with other tools and
your existing non-Python code base.
Key features of Wing IDE include:
Networked graphical debugger,
supports both stand-alone application development and remote debugging of
externally launched code (like web CGIs and servlets).
Source code analyzer and browser,
offers high-level graphical inspection of source code structure, making
it easier to understand, redesign, and maintain code.
Powerful source editor, provides
syntax highlighting for many languages, auto-completion, auto-indent,
indentation analysis and conversion, keyboard macros, structural code folding,
and customization of key bindings. An optional emacs personality
module is included.
Project manager, organizes
your code and speeds access to your files.
Wing IDE comes in two product levels: Wing IDE Standard and Wing IDE
Lite, a scaled down version available for non-commercial use only. For a
detailed listing of features found in each product level, see the
product
feature list.
What's New in Python
2.2 -- generators are a very interesting feature of Python 2.2; a generator is essentially
a limited form of co-routine.
Generators are another new feature, one that
interacts with the introduction of iterators.
You're doubtless familiar with how function
calls work in Python or C. When you call a function, it gets a private namespace
where its local variables are created. When the function reaches a
return statement, the
local variables are destroyed and the resulting value is returned to the caller.
A later call to the same function will get a fresh new set of local variables.
But, what if the local variables weren't thrown away on exiting a function?
What if you could later resume the function where it left off? This is
what generators provide; they can be thought of as resumable functions.
Here's the simplest example of a generator function:
def generate_ints(N):
    for i in range(N):
        yield i
A new keyword,
yield, was introduced for
generators. Any function containing a yield
statement is a generator function; this is detected by Python's bytecode compiler
which compiles the function specially as a result. Because a new keyword was
introduced, generators must be explicitly enabled in a module by including a
from __future__ import generators
statement near the top of the module's source code. In Python 2.3 this statement
will become unnecessary.
When you call a generator function, it
doesn't return a single value; instead it returns a generator object that supports
the iterator protocol. On executing the yield
statement, the generator outputs the value of
i, similar to a
return statement. The
big difference between yield
and a return statement
is that on reaching a yield
the generator's state of execution is suspended and local variables are preserved.
On the next call to the generator's .next()
method, the function will resume executing immediately after the
yield statement. (For complicated
reasons, the yield
statement isn't allowed inside the try
block of a try...finally
statement; read
PEP 255 for a full explanation of the interaction between
yield and exceptions.)
Here's a sample usage of the
generate_ints generator:
>>> gen = generate_ints(3)
>>> gen
<generator object at 0x8117f90>
>>> gen.next()
0
>>> gen.next()
1
>>> gen.next()
2
>>> gen.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in generate_ints
StopIteration
You could equally write
for i in generate_ints(5),
or a,b,c = generate_ints(3).
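The session above uses the Python 2.2 spelling gen.next(). In Python 3 the method is named __next__ and is normally reached through the built-in next() function, and no __future__ import is needed. A minimal sketch of the same session in that spelling:

```python
def generate_ints(n):
    for i in range(n):
        yield i


gen = generate_ints(3)
first = next(gen)
second = next(gen)
third = next(gen)

# A fourth call finds the generator exhausted and raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

# The for loop and tuple unpacking consume the iterator protocol directly.
collected = [i for i in generate_ints(5)]
a, b, c = generate_ints(3)

print(first, second, third, exhausted, collected, (a, b, c))
```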
Inside a generator function, the
return statement can only
be used without a value, and signals the end of the procession of values; afterwards
the generator cannot return any further values.
return with a value, such
as return 5,
is a syntax error inside a generator function. The end of the generator's results
can also be indicated by raising StopIteration
manually, or by just letting the flow of execution fall off the bottom of the
function.
You could achieve the effect of generators
manually by writing your own class and storing all the local variables of the
generator as instance variables. For example, returning a list of integers could
be done by setting self.count
to 0, and having the next()
method increment self.count
and return it. However, for a moderately complicated generator, writing a corresponding
class would be much messier. Lib/test/test_generators.py contains a number of
more interesting examples. The simplest one implements an in-order traversal
of a tree using generators recursively.
# A recursive generator that generates Tree leaves in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x
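To try the traversal, a tree type is needed. The Tree class below is a hypothetical minimal node invented for this sketch; it is not the class used in Lib/test/test_generators.py. The generator itself is the same as in the text.

```python
class Tree:
    """Hypothetical minimal binary tree node, invented for this sketch."""

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def inorder(t):
    # An empty subtree (None) is falsy, so the recursion stops there.
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x


#       2
#      / \
#     1   4
#        / \
#       3   5
root = Tree(2, Tree(1), Tree(4, Tree(3), Tree(5)))
labels = list(inorder(root))
print(labels)
```

In Python 3.3 and later, each inner for loop could also be written as a single "yield from inorder(...)" statement; the explicit loops are kept here to match the original.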
Two other examples in Lib/test/test_generators.py
produce solutions for the N-Queens problem (placing N queens on an NxN chess board
so that no queen threatens another) and the Knight's Tour (a route that takes
a knight to every square of an NxN chessboard without visiting any square twice).
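To make the earlier remark about hand-written iterator classes concrete, here is a sketch of a class that mimics generate_ints by keeping its loop counter in self.count. The class name IntIterator is invented for illustration, and the method is spelled __next__ as in Python 3 (Python 2.2 used next()).

```python
class IntIterator:
    """Hand-written equivalent of generate_ints: state lives in attributes."""

    def __init__(self, n):
        self.n = n
        self.count = 0  # plays the role of the generator's local variable i

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.n:
            raise StopIteration
        value = self.count
        self.count += 1
        return value


def generate_ints(n):
    for i in range(n):
        yield i


from_class = list(IntIterator(3))
from_generator = list(generate_ints(3))
print(from_class, from_generator)
```

Even for this trivial case the class needs three methods and explicit bookkeeping, which is the point the text makes about more complicated generators.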
The idea of generators comes from other programming
languages, especially Icon (http://www.cs.arizona.edu/icon/),
where the idea of generators is central. In Icon, every expression and function
call behaves like a generator. One example from ``An Overview of the Icon Programming
Language'' at http://www.cs.arizona.edu/icon/docs/ipd266.htm
gives an idea of what this looks like:
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
In Icon the
find() function returns the
indexes at which the substring ``or'' is found: 3, 23, 33. In the
if statement,
i is first assigned
a value of 3, but 3 is less than 5, so the comparison fails, and Icon retries
it with the second value of 23. 23 is greater than 5, so the comparison now
succeeds, and the code prints the value 23 to the screen.
Python doesn't go nearly as far as Icon in adopting
generators as a central concept. Generators are considered a new part of the
core Python language, but learning or using them isn't compulsory; if they don't
solve any problems that you have, feel free to ignore them. One novel feature
of Python's interface as compared to Icon's is that a generator's state is represented
as a concrete object (the iterator) that can be passed around to other functions
or stored in a data structure.
Written by Neil Schemenauer, Tim Peters,
Magnus Lie Hetland. Implemented mostly by Neil Schemenauer and Tim Peters,
with other fixes from the Python Labs crew.
You also have my Python vote. Python is not only a powerful language
on its own, but it's also a good glue language, pulling C++ together to
speed up particular modules, or using Jython for a Java perspective (IMHO,
I like programming in Jython better than Java itself :)
Python is also good for server-side scripting as well as operating system
scripts. Not to mention that Python has lots of support including,
Twisted Matrix
[twistedmatrix.com], which is an event-based framework for internet applications
that include a web server, a telnet server, a multiplayer RPG engine, a
generic client and server for remote object access, and APIs for creating
new protocols and services.
If you want GUI design, you have wxWindows, wxPython for wxGTK, Tk, PyQt,
PyGTK, etc.
I would tell your boss to go with Python; you'll end up using it down the
road anyway ;)
I can definitely speak for the elegance of Python; I've used it to build
some pretty large-scale AI projects, among about a zillion other things.
If you're looking for the weirder OO features like operator overloading,
you'll find them in Python, but the calisthenics you have to go through
to do it might make you think twice about using it.
The only real drawback to Python is the execution
speed. One of the AI projects I previously mentioned worked
great (and I completed it probably 10 times faster than I would have had
I been using C/C++), but it ran very very slowly. The cause of
the problem is clear -- no static typing, reference-count garbage collection,
a stack-based VM (I ain't knockin' it! It's just hard to optimize)...
If you are planning to do anything compute-intensive, maybe Python is
not the right choice. It's always possible to break out to C/C++ with the
Python API (to do your compute-intensive tasks), but there are drawbacks:
over time, the Python C-API will probably drift, and you'll have to keep
tweaking your native code to keep it working. Also, the inner workings of
Python, especially refcounts, can be boggling and the source of bugs and
memory leaks that can be fantastically hard to track down.
If you consider Python, then great. Just keep these points in mind.
It does sound like your boss was trying to tell you to use Python, unless
it needs to be blindingly fast (as in "operating system" or "3d shooter").
I'd like to second that recommendation and the arguments already posted.
To respond to Cliff's question: The one thing that every new language
should have is that "significant whitespace" look-ma-no-braces syntax used
by Python - or at least the option of not having to spend your life typing
semicolons.
In fact, you could probably retrofit the common C and C++ compilers to
accept significant whitespace. This would mean the Linux kernel people would
be reading:
static int proc_sel(struct task_struct *p, int which, int who):
    if(p->pid):
        switch (which):
            case PRIO_PROCESS:
                if (!who && p == current):
                    return 1
                return(p->pid == who)
            case PRIO_PGRP:
                if (!who):
                    who = current->pgrp
                return(p->pgrp == who)
            case PRIO_USER:
                if (!who):
                    who = current->uid
                return(p->uid == who)
    return 0
(Modified from /linux/kernel/sys.c (C) Linus Torvalds)
Of course, there is the problem of finding a new name for this sort of
C without violating somebody's copyright... is "Clean C" still available?
Pylint is a static type checker for Python (compare with
PyChecker 0.4) David Jeske and Scott Hassan proved that it is possible to
do this. They are working on a type inference engine that understands the Python
language and can detect type errors and violations. Adding type checking to
Python without
changing the language will ease the maintenance of a large Python project with
lots of developers. Please remember that eGroups before it was bought by Yahoo
was a huge Python project (more than 180,000 lines of Python doing everything
from a 100% dynamic website to all email delivery, pumping out 200 messages/second
on a single 400 MHz Pentium!)
The Python Refactoring Browser, helping Pythonistas everywhere glide over
the gory details of refactoring their code. Watch him extract jumbled code into
well ordered classes. Gasp, as he renames all occurrences of a method. Thank
You Bicycle Repair Man!
Python 9- Interview with Bruce Eckel 2001-05-08 (0:25:09) After mastering
the complexities of C++ and Java -and making them easy to grasp for thousands
of programmers- Bruce Eckel has moved on to Python. He describes the
language and its strengths in light of his experience with other languages and
tools. Is a new best-seller in the works? Thinking in Python? Interview
and video production by Kirby Angell. [466]
This is a brief introduction to
Python for Lisp programmers.
Basically, Python can be seen as a dialect of Lisp with "traditional" syntax
(what Lisp people call "infix" or "m-lisp" syntax). One message on comp.lang.python
said "I never understood why LISP was a good idea until I started playing with
python." Python supports all of Lisp's essential features except
macros, and you don't miss macros all
that much because it does have eval, and operator overloading, so you can create
custom languages that way. (Although it wasn't my intent, Python programmers
have told me this page has helped them learn Lisp.)
I looked into Python because I was considering
translating the
code for the Russell & Norvig
AI textbook
from Lisp to Java so that I could have (1) portable GUI demos, (2) portable
http/ftp/html libraries, (3) free development environments on all major platforms,
and (4) a syntax for students and professors who are afraid of parentheses.
But writing all that Java seemed much too daunting, and I think that students
who used Java would suffer by not having access to an interactive environment
to try things out. Then I discovered
JPython, a version of Python
that is neatly integrated into Java, giving us access to the Java GUIs. Of course,
Python already has web libraries, so JPython can use either those or Java's.
My conclusion
Python is an excellent language for my intended
use. It is a good language for many of the applications that one would use Lisp
as a rapid prototyping environment for. The three main drawbacks are (1) execution
time is slow, (2) there is very little compile-time error analysis, even less
than Lisp, and (3) Python isn't called "Java", which is a requirement in its
own right for some of my audience. I need to determine if JPython is close enough
for them.
Python can be seen as either a practical (better
libraries) version of Scheme, or as a cleaned-up (no $@&%) version of Perl.
While Perl's philosophy is TIMTOWTDI (there's more than one way to do it), Python
tries to provide a minimal subset that people will tend to use in the same way.
One of Python's controversial features, using indentation level rather than
begin/end or {/}, was driven by this philosophy: since there are no braces,
there are no style wars over where to put the braces. Interestingly, Lisp has
exactly the same philosophy on this point: everyone uses emacs to indent their
code. If you deleted the parens on control structure special forms, Lisp and
Python programs would look quite similar.
Python has the philosophy of making sensible
compromises that make the easy things very easy, and don't preclude too many
hard things. In my opinion it does a very good job. The easy things are easy,
the harder things are progressively harder, and you tend not to notice the inconsistencies.
Lisp has the philosophy of making fewer compromises: of providing a very powerful
and totally consistent core. This can make Lisp harder to learn because you
operate at a higher level of abstraction right from the start and because you
need to understand what you're doing, rather than just relying on what feels
or looks nice. But it also means that in Lisp it is easier to add levels of
abstraction and complexity; Lisp makes the very hard things not too hard.
Zope (the Z Object Publishing Environment) is an application
server that is gaining in popularity. But what is it? What's an application
server, anyway? How does all this compare with nice familiar paradigms like
CGI? More importantly, is Zope a fad, or is it here to stay?
In a nutshell, here's what you get from Zope:
Database integration
Template-based page production
User authentication
Selective permissions so that certain users can only
update specific parts of your site
A nice object-oriented paradigm for the distribution
of custom code
Management of persistent objects and sessions
And you get all this in an easily installed, easily maintained
package. But the most significantly new thing about Zope is how it encourages
you to look at your Web site differently than you do now. Let's talk about that
a little before we get down to brass tacks.
(Apr 27, 2001, 18:01 UTC) (Posted by
mhall)
Zope 2.3.2 has been released with minor bugfixes
from the last beta release of this version.
searchEnterpriseLinux:
Are there situations in which it's better
to use Perl than Python, or vice versa?
van Rossum:
They're both competing for the same niche, in some cases.
If all you need to do is simple text processing, you might use Perl. Python,
much more than Perl, encourages clean coding habits to make it easy for
other people to follow what you are doing. I've seen a lot of people who
were developing in Perl moving to Python because it's easier to use. Where
Python wins is when you have users who are not very sophisticated but have
to write some code. An example would be in educational settings, where you
have students who have no prior programming experience. Also, Python seems
to work better when you have to write a large system, working together in
a large group of developers, and when you want the lifetime of your program
to be long.
searchEnterpriseLinux:
In what situations would people use Java
instead of Python?
van Rossum:
Java and Python have quite different characteristics. Java
is a sophisticated language. It takes a longer time to learn Java than Python.
With Java, you have to get used to the compile-edit-run cycle of software
development. Python is a lighter-weight language. There is less to learn
to get started. You can hit the ground running when you have only a few
days of Python training under your belt. Python and Java can be used in
the same project. Python would be used for higher level control of an application,
and Java would be used for implementing the lower level that needs to run
relatively efficiently. Another place where Python is used is as an extension
language. An example of that is Object Domain, a UML editing tool written
in Java. It uses Python as an extension language.
searchEnterpriseLinux:
Will use of Python in developing applications
increase?
van Rossum:
It's going to increase dramatically. There's a definite
need for a language that's as easy to use as Python. Python is used to teach
students how to program, so that is creating a base of people who know and
like the language.
searchEnterpriseLinux:
Are there technology developments that
will benefit Linux developers?
van Rossum:
Python developer tools are becoming available that work
well in the Linux world. Until a year ago, the Python developer was stuck
with using Emacs or VI to edit source code. Both of those editors have support
for Python, but they're just editors. Now there are tools like Wing IDE,
which is a development tool that is written in Python. Komodo is more ambitious
and less finished at the moment, and it also supports program development
with Perl. There is a Swedish company called Secret Labs which has had
Windows development tools for Python, but which is now focusing on the Linux
market. As more development environments become available for Python, we'll
see a lot more users on the Linux platform.
Dave Warner wrote an excellent article on XML-RPC and Python
here, using Meerkat as an example of a real XML-RPC server
Another
article about XML-RPC
According to Jeff Walsh's InfoWorld article, Microsoft is planning to open up
their operating system using XML-RPC. Such a protocol could be deployed quickly
in other operating systems that support HTTP, ranging from Perl scripts running
on Linux, to Quark publishing systems running on Macs, to relational databases
running on mainframe systems. It could put Windows at the center of a new kind
of web, one built of logic, storing objects and methods, not just pages and
graphics.
People interested in XML-RPC might be interested in checking it out using
KDE. Since KDE 2.0, a DCOP XML-RPC bridge has been included, allowing easy access
to a wide range of the desktop's APIs.
I work for a
company (Technisys)
which created, several years ago, an RPC tool called "tmgen". This tool
is built as a layer on top of rpcgen, adding session cookie handling, SSL
support, a stateless server, handling of enumerated values with associated
long and short descriptions, and many other things. It is, in fact, an application
server built on top of RPC.
This baby has been running for many years in the most important banks
and credit card companies here in Argentina (yes, you know the brands,
but I'm not sure I can tell you which ones =) ).
The "tmgen" tool reads a ".def" file that defines the datatypes, and ".trn"
files which have the code of the different "transactions". Having read those
files, it automatically generates the server source (including the rpcgen
input source).
I was asked to make it possible for the clients to be programmed in the
Java language. I evaluated several possibilities, one of them using a Java
client for RPC. This required us to go for a proprietary solution; besides,
being in control of both sides, it looked silly to be tied to a protocol.
Another possibility would have been to modify tmgen to create an RMI server.
But the best was to create an XML server (IMO). I then evaluated SOAP and
XML-RPC. SOAP seemed very nice, but XML-RPC was a *direct* mapping of the
semantics and types of our existing solution. The benefits of SOAP were
a drawback in this case; we just wanted to have strings, structs, ints and
floats.
So, now it's working. It takes a definition of the structs, the methods
and which parameters they get, and it creates code (using the
gnome-xml library, which
I recommend). The automatically generated code works as a standalone inetd
webserver which reads an XML-RPC query from the net, parses the query, loads
it into the generated C structures, runs the "transaction", and creates a response
from the reply structures. The final result was that all those old C, RPC-only
programs started to have an XML interface.
I added the
Helma RPC-XML
client and voila, we had a Java client. So I must say that my experience
in this legacy system with XML-RPC was great.
Talking about new systems, I think that XML-RPC does the wrong thing,
by defining markup for the types instead of marking the semantic of the
data.
I've used XML-RPC before, written my own version of
it, and used SOAP extensively (and even posted a few patches to the IBM
SOAP for Java implementation). My conclusion has been that while RPC is
slow, XML-RPC (and its variants) are necessarily slower. The idea itself
is good, and it may be useful for so-called "web services" where there is
a strict client-server relationship between communicating machines and there
are few RPCs bouncing around the network. However, the overhead of a full
XML parser, plus a protocol implementation, plus a marshaller and demarshaller
to and from XML (especially since no language really provides this) is a
big problem for XML-RPC and its kin.
According to Dave Winer... "Conceptually,
there's no difference between a local procedure call and a remote one, but
they are implemented differently, perform differently (RPC is much slower)
and therefore are used for different things."
Bzzt! A local procedure call
can pass references, pointers, etc. A remote procedure call is not just
slower, but it also limits what kind of data you can reasonably expect to
pass in the parameters. A pointer to a database driver, for instance, or
a function pointer (for callbacks) are entirely different beasts!
From the
slashdot header: "It's deliberately minimalist but nevertheless quite powerful,
offering a way for the vast majority of RPC applications that can get by
on passing around boolean/integer/float/string datatypes to do their thing
in a way that is lightweight and easy to understand and monitor"
Lightweight?
Simple to understand and monitor? First of all, XML parsers are NEVER lightweight,
especially if you want to do SOAP and use things like namespaces. Second
of all, if you're doing only booleans/integers/floats/strings as the above
suggests, then you're fine. More complex datatype marshalling? Ouch! I'm
going to leave XML-RPC, SOAP, and its kin behind for a while and see what
happens. Until then, caveat earlius adopterus!
I am a big fan of distributed computing, heck I wrote
an
article about it on K5, and have always wondered what the XML-RPC payoff
is.
From what I can tell, XML-RPC is a way to replace the binary protocols that
current distributed systems use (e.g. Corba's IIOP or DCOM's ORPC) with
an XML based one. So when an object needs to perform a remote method call,
instead of just sending its arguments in a compact efficient binary packet,
it builds an XML string which has to be parsed on the other end. So with
XML RPC, remote method calls now need the addition of an XML parser to their
current bag of tricks.
On the surface it seems that this makes it easier to perform distributed
computing since any shmuck can use an XML parser and call a few functions.
But it means that an extra layer of abstraction has been added to operations
that should be performed rather quickly for the dubious benefit of compatibility
across platforms (which is yet to be realized) which seems to be more jumping
on the XML hype bandwagon than reality. My biggest issue is that for XML-RPC
to support things that are the biggest issues of distributed computing (e.g.
keeping track of state) would add so much bloat to the XML parsing, string
building, etc process for making a remote call as to make it unfeasible.
Anyone see any errors in this thinking?
Grabel's Law
2 is not equal to 3 - not even for very large values of 2.
Yes. It's true that as a consequence of its text-based nature, communication
via XML is decidedly more bandwidth-intensive than any binary counterpart.
The problem, though, lies in translation of that binary format to and from
different machines, architectures, languages, and implementations.
I currently do quite a bit of work using SOAP, which is similar to XML-RPC,
but a little less well-developed. It's a no-brainer. If I'm using Java,
Perl, C, C++ or even Python, it's relatively easy to make calls between
these different languages. I don't have to worry about the endianness of
the data I'm working with. I don't have to worry about learning some new
data encoding scheme (XML is very well-defined, and very ubiquitous; almost
every language has a translator). Communicating between a language like
Perl, which has no typing, and Java, which has more strict typing is a no
brainer, because the data structures are defined by a well-documented, human
readable schema, and when I look at the data I'm sending, I can see the
raw information.
Bandwidth concerns might have been paramount two years ago, but the world
in which XML-RPC and SOAP are being used has already shifted to broadband.
Now, human clarity and complete interoperability, as well as the ease of
use of porting XML-RPC (or SOAP) constructs to another language (since it's
just XML text) make it a much more efficient model in terms of programmer
time.
Yes, from a strictly bandwidth concern, CORBA or DCOM beat XML hands down,
but when you remove that consideration, it's not that big a deal. Couple
it with existing protocols (I've seen SOAP implementations via HTTP, FTP,
and even SMTP), and the opportunity to grow via existing infrastructure
and well-known technologies, and you just have an easier (and thus, I would
argue, more open) model to work with.
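The trade-off debated in the comments above (self-describing XML text versus a compact binary packet) can be seen with Python's standard xmlrpc.client module, which exposes the marshalling step without any network traffic. This is only a sketch: the method name demo.echo is hypothetical, and the struct layout is an ad hoc stand-in for a binary protocol, not IIOP or ORPC.

```python
import struct
import xmlrpc.client

# Four simple values of the kinds XML-RPC handles directly.
params = (7, 2.5, "hello", True)

# Marshal a call to a hypothetical method into the XML-RPC wire format.
request = xmlrpc.client.dumps(params, methodname="demo.echo")

# Unmarshal it again, as the receiving side would.
decoded_params, decoded_method = xmlrpc.client.loads(request)

# Ad hoc binary packing of the same values (network byte order):
# 4-byte int, 8-byte double, 5-byte string, 1-byte bool.
binary = struct.pack("!id5s?", 7, 2.5, b"hello", True)

print(request)
print("XML bytes:", len(request.encode("utf-8")))
print("binary bytes:", len(binary))
```

The XML form is many times larger, but it carries its own type tags and can be read by eye, while the binary form is only meaningful to a receiver that already knows the exact layout and byte order.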
"PythonWorks Pro, entirely developed
in Python, contains an editor, project manager, a deployment tool, an integrated
browser, debugger, and a proprietary interface design tool. PythonWorks Pro
allows the developer to easily create Python applications using the built-in
tools."
Stackless Python: An interview with creator Christian Tismer
At first brush, Stackless Python might seem like a minor fork to CPython. In
terms of coding, Stackless makes just a few changes to the actual Python C code
(and redefines "truth"). The concept that Christian Tismer (the creator of Stackless
Python) introduces with Stackless is quite profound, however. It is the concept
of "continuations" (and a way to program them in Python).
To attempt to explain it in the simplest terms, a continuation
is a representation, at a particular point in a program, of everything the program
is capable of doing subsequently. A continuation is a potential that depends
on initial conditions. Rather than loop in a traditional way, it is possible
to invoke the same continuation recursively with different initial conditions.
One broad claim I have read is that continuations, in a theoretical sense, are
more fundamental and underlie every other control structure. Don't worry
if these ideas cause your brain to melt; that is a normal reaction.
Reading Tismer's background article in the
Resources is a good start for further
understanding. Pursuing his references is a good way to continue from there.
But for now, let's talk with Tismer at a more general level:
Mertz: Exactly what is Stackless Python? Is there
something a beginner can get his or her mind around that explains what is
different about Stackless?
Tismer: Stackless Python is a Python implementation
that does not save state on the C stack. It does have stacks -- as many
as you want -- but these are Python stacks.
The C stack cannot be modified in a clean way from a language
like C, unless you do it in the expected order. It imposes a big obligation
on you: You will come back, exactly here, exactly in the reverse way as
you went off.
"Normal" programmers do not see this as a restriction
in the first place. They have to learn to push their minds onto stacks from
the outset. There is nothing bad about stacks, and usually their imposed
execution order is the way to go, but that does not mean that we have to
wait for one such stack sequence to complete before we can run a different
one.
Programmers realize this when they have to do non-blocking
calls and callbacks. Suddenly the stack is in the way, we must use threads,
or explicitly store state in objects, or build explicit, switchable stacks,
and so on. The aim of Stackless is to deliver the programmer from these
problems.
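The "explicitly store state in objects" problem Tismer mentions can be sketched in ordinary Python. This is not Stackless code: it uses a plain class and a standard generator, and the names CountdownState and countdown are made up for the illustration.

```python
class CountdownState:
    """Callback style: progress must live in an object between calls."""

    def __init__(self, start):
        self.remaining = start

    def on_ready(self):
        # Called repeatedly by some event loop; returns None when finished.
        if self.remaining == 0:
            return None
        self.remaining -= 1
        return self.remaining + 1


def countdown(start):
    """Generator style: the local variable and loop position simply persist."""
    while start > 0:
        yield start
        start -= 1


state = CountdownState(3)
from_callbacks = []
while (value := state.on_ready()) is not None:
    from_callbacks.append(value)

from_generator = list(countdown(3))
print(from_callbacks, from_generator)
```

Both versions produce the same values, but in the first the programmer has moved the "stack" into an object by hand, which is the chore the interview is describing.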
Mertz: The goal of Stackless is to be 100% binary
compatible with CPython. Is it?
Tismer: Stackless is 100% binary compatible at
the moment. That means: You install Python 1.5.2, you replace Python15.dll
with mine, and everything still works, including every extension module.
It is not a goal, it was a demand, since I didn't want to take care about
all the extensions.
Mertz: Stackless Python has been absolutely fascinating
to read about for me. Like most earthbound programmers, I have trouble getting
my mind wholly around it, but that is part of what makes it so interesting.
Tismer: Well, I'm earthbound, too, and you might
imagine how difficult it was to implement such a thing, without any idea
what a continuation is and what it should look like in Python. Getting myself
into doing something that I wasn't able to think was my big challenge. After
it's done, it is easy to think, also to redesign. But of those six months
of full-time work, I guess five were spent goggling into my screen and banging
my head onto the keyboard.
Continuations are hard to sell. Coroutines and generators,
and especially microthreads are easier. All of the above can be implemented
without having explicit continuations. But when you have continuations already,
you find that the step to these other structures is quite small, and continuations
are the way to go. So I'm going to change my marketing strategy and not
try any longer to sell the continuations, but their outcome. Continuations
will still be there for those who can see the light.
Mertz: There is a joke about American engineers
and French engineers. The American team brings a prototype to the French
team. The French team's response is: "Well, it works fine in practice; but
how will it hold up in theory?" I think the joke is probably meant to poke
fun at a "French" style, but in my own mind I completely identify with the
"French" reaction. Bracketing any specific national stereotypes in the joke,
it is my identification in it that draws me to Stackless. CPython works
in practice, but Stackless works in theory! (In other words, the abstract
purity of continuations is more interesting to me personally than is the
context switch speedups of microthreads, for example).
Tismer: My feeling is a bit similar. After realizing
that CPython can be implemented without the C stack involved, I was sure
that it must be implemented this way; everything else looks insane
to me. CPython already pays for the overhead of frame objects, but it throws
all their freedom away by tying them to the C stack. I felt I had to liberate
Python. :-)
I started the project in May 1999. Sam Rushing was playing
with a hardware coroutine implementation, and a discussion on Python-dev
began. Such a stack copying hack would never make it into Python, that was
clear. But a portable, clean implementation of coroutines would, possibly.
Unfortunately, this is impossible. Steve Majewski gave up five years ago,
after he realized that he could not solve this problem without completely
rewriting Python.
That was the challenge. I had to find out. Either it is
possible, and I would implement it; or it is not, and I would prove the
impossibility. Not much later, after first thoughts and attempts, Sam told
me about call/cc and how powerful it was. At this time, I had no idea in
what way they could be more powerful than coroutines, but I believed him
and implemented them; after six or seven times, always a complete rewrite,
I understood more.
Ultimately I wanted to create threads at blinding speed,
but my primary intent was to find out how far I can reach at all.
Mertz: On the practical side, just what performance
improvements is Stackless likely to have? How great are these improvements
in the current implementation? How much more is possible with tweaking?
What specific sorts of applications are most likely to benefit from Stackless?
Tismer: With the current implementation, there
is no large advantage for Stackless over the traditional calling scheme.
Normal Python starts a recursion to a new interpreter. Stackless unwinds
up to a dispatcher and starts an interpreter from there. This is nearly
the same. Real improvements are there for implementations of coroutines
and threads. They need to be simulated by classes, or to be real threads
in Standard Python, while they can be implemented much more directly with
Stackless.
Much more improvement of the core doesn't seem possible
without dramatic changes to the opcode set. But a re-implementation, with
more built-in support for continuations et al., can improve the speed of
these quite a lot.
Specific applications that might benefit greatly are possibly
Swarm simulations, or multiuser games with very many actors performing tiny
tasks. One example is the EVE game (see
Resources below), which is under
development, using Stackless Python.
Mertz: What do you think about incorporating Stackless
into the CPython trunk? Is Stackless just as good as an available branch,
or does something get better if it becomes the core version?
Tismer: There are arguments for and against it.
Against: As long as I'm sitting on the Stackless implementation, it is mine,
and I do not need to discuss the hows and whys. But at the same time, I'm
struggling (and don't manage) to keep up with CVS. Better to have other
people doing this.
Other Python users, who aren't necessarily interested
in kinky stuff, won't recognize Stackless at all; just the fact that it
happens to be faster, and that the maximum recursion level now is an option
and not a hardware limit. And there is another promise for every user: There
will be pickleable execution states. That means you can save your program
while it is running, send it to a friend, and continue running it.
Finally, I'm all for it, provided that all my stuff makes
it into the core; at the same time, I do not want to see a half-baked solution,
as has been proposed several times.
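For contrast, here is a small sketch of how the two promises above look in standard CPython (not Stackless). It assumes a current CPython 3 interpreter, where the recursion limit is a safety setting in front of the C stack and a generator's execution state cannot be pickled. The helper names are hypothetical.

```python
import pickle
import sys


def depth(n):
    """Plain recursive function; each call nests one level deeper."""
    return 0 if n == 0 else 1 + depth(n - 1)


def hits_recursion_limit(limit=500):
    """Lower the limit temporarily and check that deep recursion is refused."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        depth(limit * 10)
        return False
    except RecursionError:
        return True
    finally:
        sys.setrecursionlimit(old)


def can_pickle_running_generator():
    """Try to save a half-finished computation, as the interview imagines."""
    gen = (i for i in range(3))
    next(gen)  # the generator now holds live execution state
    try:
        pickle.dumps(gen)
        return True
    except TypeError:
        return False


print(hits_recursion_limit(), can_pickle_running_generator())
```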
Mertz: Any thoughts on future directions for Stackless?
Anything new and different expected down the pipeline? Stackless still suffers
from some recursions. Will they vanish?
Tismer: Pickling support will be partially implemented.
This will be working first for microthreads since they provide the cleanest
abstraction at the moment. They are living in a "clean room" where the remaining
recursion problem doesn't exist. My final goal is to remove all interpreter
recursion from Python. Some parts of Stackless still have recursions, especially
all the predefined __xxx__ methods of objects. This is very hard
to finalize since we need to change quite a few things, add new opcodes,
unroll certain internal calling sequences, and so on.
Python's fearless leader and self-proclaimed "Benevolent Dictator
for Life" (BDFL), Guido van Rossum, recently wrote an open letter to the community,
proclaiming Python's decision to move the project to new auspices in the Open
Source realm.
"Python is growing rapidly. In order to take it to the
next level, I've moved with my core development group to a new employer, BeOpen.com,"
van Rossum wrote. "BeOpen.com is a startup company with a focus on Open Source
communities, and an interest in facilitating next-generation application development.
It is a natural fit for Python."
BeOpen.com develops online communities for Open Source application
users, developers and interested parties through its network of portal Web sites.
It also offers support and custom development for Open Source applications specific
to major corporations and governmental entities.
Van Rossum created Python in the early 1990s at CWI (the National
Research Institute for Mathematics and Computer Science in the Netherlands)
in Amsterdam. In 1995, he moved to the United States, and lives in Reston, Virginia.
In the past five years, van Rossum has worked as a researcher
at the Corporation for National Research Initiatives (CNRI). He also is technical
director of the Python Consortium, a CNRI-hosted international consortium that
promotes Python use and development.
Van Rossum has headed up Python's development for the past
decade. He said he expects to remain in the position for at least another 10
years and although he "is not worried about getting hit by a bus," he also has
announced that he soon will be married and the union between Python and BeOpen.com
will give him some time for a honeymoon.
At BeOpen.com, van Rossum is the director of a new development
team recently dubbed PythonLabs. The team includes three of his colleagues from
CNRI: Fred Drake, Jeremy Hylton, and Barry Warsaw. "Another familiar face will
join us shortly: Tim Peters. We have our own
Web site
where you can read more about us, our plans and our activities. We've also posted
a FAQ there specifically about PythonLabs, our transition to BeOpen.com, and
what it means for the Python community," he said.
As open source projects go, Python, the "very
high level language" developed by Guido Van Rossum 10 years ago, is a prime
example of the "scratch your own itch" design philosophy.
Van Rossum, a 44-year-old developer who spent
much of his collegiate and post-collegiate years working with instructional
software languages such as ABC and Pascal, freely admits that it was his annoyance
with these languages' real-world performance that drove him to create Python.
"The history of Python came out of the frustration
I had with ABC when it wasn't being used for teaching but for day-to-day ad
hoc programming," says Van Rossum, who in 1999 received an "Excellence in Programming"
award from the Dr. Dobb's Journal for his Python work.
Nearly a decade after releasing the first
version of Python, Van Rossum is currently looking at how to bridge the gap
between languages that are easy for non-technical users to grasp and languages
that are capable of performing industrial-strength computational tasks. Recently,
he and other Python developers have joined together to form Computer Programming
for Everybody, or CP4E, a project that is currently seeking funds from DARPA,
the modern day descendant of the organization that helped build the early Internet.
According to Van Rossum, CP4E will explore the way non-technical users engage
in computing, in a way that will gain new insights into smoothing the human/machine
interaction.
A lot of continuation-based research was done in the Standard ML project at AT&T
and Princeton in the late 80's/early 90's. Andrew Appel (who wrote the native-code
generator for this particular SML compiler) has a book,
Compiling with Continuations, which goes into this stuff. Continuation-passing
is odd at first but pretty easy to code up in a functional language.
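As a rough illustration of continuation-passing style, here is a factorial written so that it never returns a value to its caller directly; instead it hands the result to a function k that represents "the rest of the computation". This is a generic textbook sketch in Python, not code from Appel's book or from SML.

```python
def factorial_cps(n, k):
    """Continuation-passing factorial: k receives the final result."""
    if n == 0:
        return k(1)
    # Build a new continuation that multiplies by n, then continues with k.
    return factorial_cps(n - 1, lambda result: k(n * result))


def identity(x):
    """The trivial continuation: just hand the value back."""
    return x


print(factorial_cps(5, identity))
```

Python does not eliminate tail calls, so this form still consumes stack depth for large n; in a functional language with proper tail calls it would not.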
I would like to have a language that matches Perl in power. From what I know
of it, which comes mostly from reading the docs, Ruby is the closest thing I've
seen yet. It seems to look more to Perl for its syntax, which means it has implicit
arguments for functions, magical character prefixes in certain spots, and cryptic
names for some globals such as $$. It's also a much younger language; the interpreter
will become painted into a corner by growing complexity. This tendency is hard to
avoid; look at Perl or Zope for examples of the complexity problems that can result.
The road to hell is paved with good intentions.
Microthreads are useful when you want to program many
behaviors happening simultaneously. Simulations and games often want to
model the simultaneous and independent behavior of many people, many businesses,
many monsters, many physical objects, many spaceships, and so forth. With
microthreads, you can code these behaviors as
Python functions. You
will still need to think a teeny bit about the fact that context switching
occurs between threads, hence this documentation. The microthread package
uses
Stackless Python.
Microthreads switch faster and use much less memory than
OS threads. You can run thousands of microthreads simultaneously. Additionally,
the microthread library includes a rich set of objects for inter-thread
communication, synchronization, and execution control.
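The idea of many small behaviors taking turns inside one OS thread can be sketched with standard Python generators. This is not the microthread package's API (which requires Stackless Python); the names monster and run_round_robin are invented for the illustration.

```python
from collections import deque


def monster(name, steps, log):
    """One 'behavior': does a little work, then gives up control."""
    for i in range(steps):
        log.append(f"{name} step {i}")
        yield  # voluntary context switch back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in turn until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished; do not reschedule it
        ready.append(task)


log = []
run_round_robin([monster("orc", 2, log), monster("bat", 3, log)])
print(log)
```

Each task costs one generator object rather than an OS thread, which is why thousands of them are practical; the price is that a task which never yields starves all the others.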
However, along with the usual density of technical achievements -- including
stackless Python, Python inside Java, three-dimensional Python,
tuning Python for type safety, and more -- Python displays a number of symptoms
of growing business respectability.
Will Ware dove into the Python core to allow microthreads
(threads of operation which work in a single system thread). This is similar
in spirit to some of the work done by Christian Tismer for his "Stackless Python",
and there are rumors that the former will be adapted to the latter. http://world.std.com/~wware/uthread.html
It's easy to understand why I use Python. It's flexible,
easy, and powerful in a way that cannot be matched by other mainstream languages.
If you use Python, then you know what I am talking about. If you do not,
you will eventually; and then you'll know what I am talking about. I've
asked myself why I promote Python so much. I have never been such a vocal
advocate for other languages I've liked--even XML, my bread and butter.
There's a selfish component and an ethical one.
Selfishly speaking, I want to live in a world where most
software is written in a decent programming language. Java is decent,
and I don't mind it. Therefore I don't begrudge its success. But I consider
it a proprietary language surrounded by a re-invent-the-wheel culture.
I do not consider Perl decent for reasons that will become
clear, and I do hope that Python takes most of its popularity. I
refuse to become proficient in "indecent" languages. That means that much
of the software out there in the "open-source" world is in fact closed to
me. In an emergency, I could hold my nose and dive in, but I would not do
so to scratch the proverbial itch.
...
Python's inventor is involved in a project to make the
Python language and tools accessible to children. The project is called
Computer Programming for Everybody (CP4E). I like that title enough to consider
it the name of a movement, a goal, a dream: Computer Programming for Everybody
(CP4E).
I am not entirely naive: Computer programming is hard.
It is precisely because it is hard that there is no excuse for adding
artificial obstacles like modern languages rooted in the idioms of dead
languages, and adding syntaxes so complex that humans cannot keep them in
their head.
Give Python a try. I think you'll like it. The language itself is very simple
(see also discussion of the braces problem
below). All the power is found in the extension modules. Learning the language basics
should take only a week or so. I think the ease of learning is also a powerful
argument in favor of Python. I've been programming in Perl off and on for several years
now, but I cannot claim to be a Perl guru: the language is so huge and so baroque,
a real PL/1-style language, which I like too ;-). IMHO nobody has any chance
of mastering Perl in a week ;-).
Many dismiss Python because it uses indentation to denote blocks
rather than a familiar C-like syntax. I think that's a big mistake.
Indentation is not a burden because the most natural form of a program is an indented
one. In languages like PL/1 and C a pretty printer is a must. But if you consistently
use a pretty printer, why waste two useful symbols ("{" and "}") on what can be done
with comments? Indentation can then always be generated automatically
from special comments. The problem is that in those languages this is an optional feature: you don't
have to use it, though it is highly recommended. I personally use "//-{" and "//-}"
as substitutes.
In Python, adding block-delimiting comments is the optional feature and the use
of whitespace is the mandatory one. Python reversed the priority because whitespace
is a better visual aid in understanding the flow of a program. If you use comments
to mark the beginning and end of each block together with automatic indentation, then it is the indentation
that really matters. And that's great. It's easy to miss delimiters like "{" and
"}" when looking at a page of code.
The choice in Python was in favor of the pretty printer, and I tend to support it
on usability grounds. Languages such as Perl and Tcl made their design
decisions based on familiarity with C rather than usability. They simply borrowed
the successful C-based notation of using curly braces for blocks. And they miss
some earlier PL/1 extensions like closing several blocks with one delimiter. Here
in Python one gets this nice feature for free.
See an interesting Slashdot discussion below that sheds some additional light
on the situation.
The syntactic indenting in Python is
actually extremely cool. Ok, so maybe it means that emacs is the only editor
it makes sense programming with - but we knew that already right? The main
benefit I see to syntactic indenting is that it narrows down the possible
range of coding styles. If you think about it, most of the (dare I say)
splintering of C/C++/Java coding styles is due to the placement of the {
and } symbols. This acts against readability for other developers. It's
also less typing.
If you like the keybindings and programmability of Emacs,
but don't like its size, try
JED, which
also has a pretty good Python mode.
That's something that seriously bugs
me about Python. I have tried over the years to not be a syntax bigot, but
Perl loses in this respect. I saw the quip, "Perl -- the only programming
language that looks the same both before and after RSA encryption," and
it makes perfect sense. Dilbert's joke about a rat dancing on his keyboard
and writing a web browser could probably come true in Perl.
Python is a big, big win. But the required indentation just stupefies me.
I feel like I'm writing a FORTRAN 77 program, indenting to the correct place
so that the compiler/interpreter can find the right column of the punched
card...
I'm not a big fan of 1950's cars, clothes, music, or programming languages.
A really useful tool (written in Python or Perl or awk or whatever)
would read in a "Braced Python" file, remove the {}'s and redo the indentation
inside them, and then write out a syntactically-correct file. (For bonus
points, play some 50's tunes while it's converting.)
This would aid the transition from free-form languages like C++ (and FORTRAN
95...). It's enough to introduce new ideas; don't beat the user over the
head with personal indentation preferences at the same time.
A really useful tool (written in
Python or Perl or awk or whatever) would read in a "Braced Python" file,
remove the {}'s and redo the indentation inside them, and then write out
a syntactically-correct file.
such a tool comes with the standard python distribution
(look for pindent in the Tools directory).
(there's also something called corbascript, which is pretty
much python with braces...)
real python programmers tend to use python-aware editors,
but that's another story...
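To show what the requested "Braced Python" converter might look like, here is a toy sketch. It is not pindent (which, as far as I know, works from block-closing comments such as "# end if" rather than braces), and the function name unbrace and the dialect it accepts are assumptions: a block opens on a line ending in "{" and closes on a line containing only "}".

```python
def unbrace(source, indent="    "):
    """Convert a toy 'braced Python' dialect into indented Python.

    Assumes '{' only ever ends a block header and '}' sits alone on a line,
    so dict or set literals split across lines would confuse it.
    """
    depth = 0
    out = []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            out.append("")
        elif line == "}":
            depth -= 1  # close the current block; emit nothing
        elif line.endswith("{"):
            out.append(indent * depth + line[:-1].rstrip() + ":")
            depth += 1  # following lines belong to the new block
        else:
            out.append(indent * depth + line)
    return "\n".join(out) + "\n"


braced = "def sign(n) {\nif n > 0 {\nreturn 1\n}\nreturn 0\n}\n"
converted = unbrace(braced)
print(converted)
```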
That's completely false. Perl certainly
has strong OO syntax. Perl, in fact, has much stronger OO than Python. A
couple of reasons are that in Perl you have privacy, and also because in
Perl it's much easier to generate metaclasses.
I write python all day long every day. I used to do same for perl. This
statement is pretty darn funny. As my coworker jeffrey (with a derisive
snort) would say "OO is all about virtual private methods and late bindings"
(then bust up laughing).
From what I can tell, they plan on revamping pretty
much all of the syntax...and it won't be backwards compatible in the
end release. (They do plan on working slowly up to it, AFAIK.)
Nyet, tovarisch. The "backwards compatibility" issue in
Python 3000 is most likely to bite extension writers, who code to Python's
C API; most code written in Python will see very little, if any, compatibility
problems.
I don't recall any of the specifics, but maybe
they will be integrating the most sought after change....no more tabs
and whitespace! ;)
No way, no how, not ever! The use of indentation to indicate
block structure is a core feature, there by explicit design, to increase
readability (you probably indent your C/C++, Java, or Perl to match the
block structure already, right? so just lose the "noise" characters, and
let the parser read the indentation). The issue has been repeatedly raised
on comp.lang.python, mostly by people who haven't actually used Python for
anything significant. As for the dreaded whitespace-eating nanoviruses,
those have been discussed extensively on the newsgroup, as well.
Sorry, but no. This is simply too much for a number of reasons...
1) positionally dependent languages are simply silly.
2) my programmers work in perl and C++ as well, there is no reason to inflict
a radically different syntax system on them
3) for those of us with long programming histories, those {} characters
are important visual cues in the code we read, I have no intention of turning
off reflexes I learned over the last 20 years or so just to satisfy someone's
syntax egotism.
Your argument falls apart when you
realize that Python is designed to be a language that is for newbies, is
for oldies, is here to stay and will create careers for Python programmers.
Answers: 1) positionally dependent languages are simply silly.
No paradigm is "simply silly." It either works, or it doesn't. In the context
of Python, positional dependence helps define the language. Have you ever
taken over someone else's code? This enforces easy to read, easy to maintain
code.
2) my programmers work in perl and C++ as well, there is no reason to
inflict a radically different syntax system on them
Not much faith in your programmers. I see that if they know Perl well, then
there may be no need to use Python. However, I was given the choice of learning
a modern scripting language. I already knew Bourne shell and awk. After much
research, I decided on Python. My C++ is strong and Python was an obvious
fit for my skills and style. I am still happy with the decision.
3) for those of us with long programming histories, those {} characters
are important visual cues in the code we read, I have no intention of turning
off reflexes I learned over the last 20 years or so just to satisfy someone's
syntax egotism.
These "visual cues" are learned. Just as positional dependence is learned.
It's just a different paradigm, not necessarily better or worse.(better)
:)
ActiveState, the leader in open source programming
tools, announces the release of
Komodo 1.0, the first Mozilla application by a third party. Komodo is a
Perl and Python integrated development environment for programming using the
Mozilla application framework. A full-featured, multi-language IDE, its timesaving
features include integrated online help and an interactive remote debugger.
Komodo also includes a one-of-a-kind regular expression toolkit for
one of the most difficult technologies used in scripting languages.
"Komodo is the first commercial grade IDE for Perl and Python, and it's cross-platform
as well", said Dick Hardt, Founder & CEO, ActiveState. "Mozilla's component
oriented framework will allow us to easily add support for additional languages
and features throughout the coming months."
"Mozilla is ideally suited for cross-platform Web-based development. This makes
it a natural fit for ActiveState's powerful new IDE, Komodo," said Mitchell
Baker, chief lizard wrangler at mozilla.org. "Perl, Python and JavaScript developers
will find Komodo a wonderful tool for using Mozilla for Web services development."
Komodo features:
Regular Expression toolkit
Auto completion and call tips
Integrated online help
Rich language–aware code editor
Interactive remote debugging
Code folding
Visible source code that is customizable and
extensible
"The combination of Perl, Python and Mozilla allowed us to build a great IDE
for rapid application development," said Dr. David Ascher, Komodo Project Lead.
"I'm quite pleased with the feedback from beta-testers, who said that Komodo
is saving them time, which is our primary goal."
"ActiveState's release of the first application built on Mozilla is a watershed
event for the open source movement," said Tim O'Reilly, Founder & CEO of O'Reilly
& Associates. "It demonstrates that there's more to Mozilla than the next generation
Netscape browser. More importantly, it provides the web-enabled IDE that makes
cross-platform development with open source languages like Perl and Python accessible
to more than the hacker elite."
Komodo is available with ASPN Komodo at $295. ASPN Komodo delivers the Komodo
IDE and all updates, plus online, searchable access to cookbooks, technical
references, sample code, and more. An educational license is free for those
learning to program through
ASPN Open.
The 1.0 release supports Windows. Linux support and Komodo XSLT are available
as pre-release software.
Wing IDE for Python
Python IDE that includes a source browser and editor. The editor supports
folding.
Designed to maximize programmer productivity, Wing IDE and Python can reduce
development and maintenance costs by 50 to 90 percent of that seen with languages
such as C, C++, Java, VB, and Perl.
Whether you are building dynamic web sites, desktop applications, or complex
enterprise solutions, Wing IDE and Python provide you with a fast, scalable,
and portable development platform that lets you concentrate on building application-specific
functionality.
Wing and Python are easy to learn and integrate well with other tools and
your existing non-Python code base.
Key features of Wing IDE include:
Networked graphical debugger,
supports both stand-alone application development and remote debugging of
externally launched code (like web CGIs and servlets).
Source code analyzer and browser,
offers high-level graphical inspection of source code structure, making
it easier to understand, redesign, and maintain code.
Powerful source editor, provides
syntax highlighting for many languages, auto-completion, auto-indent,
indentation analysis and conversion, keyboard macros, structural code folding,
and customization of key bindings. An optional emacs personality
module is included.
Project manager, organizes
your code and speeds access to your files.
Wing IDE comes in two product levels: Wing IDE Standard and Wing IDE
Lite, a scaled down version available for non-commercial use only. For a
detailed listing of features found in each product level, see the
product
feature list.
Python is a really clean and elegant interpreted
OO
language. I don't think many folks will argue that
Perl is the same.
Well, Python was OO from the ground up and Perl was not, and it shows. But
as far as elegance is concerned, it is rather in the eye of the beholder.
The fact that there are many compact ways to solve the same problem
was a goal of Perl, and it is hard to imagine a language which achieves
this to a higher degree. In that sense, you have to grant that Perl does
what it sets out to do brilliantly.
I think of Perl as a radical language and Python as a rather conservative one.
Of course, Python is highly innovative, but it takes conventional notions
of what is needed in a language to efficiently support a programming project
and pares away everything that is unnecessary baggage. Perl, at least my
take on it, rejects at the outset that there is any one way to support a
programming project and encourages the programmer to adopt whatever stance
works to solve the problem.
The interesting thing is that for me the common juxtaposition of Perl and
Python makes more sense than you'd think at the outset. I've tried my hand
at each for projects, and I don't find I'm particularly more productive
in one than the other. Perhaps Perl has a slight edge because I've been
using it longer, and most of the projects I've tried are typical of the
classic Perl problem space of Practical Extraction and Reporting.
I don't really find one language to be more maintainable than the other;
perhaps Python restricts certain bad habits that Perl permits, but in reality
efficient and sensible problem decomposition and program organization are
rather more important. Perhaps the flavor of bad programming varies somewhat
between the languages -- bad Perl is dense and indigestible, bad Python
is bland and undifferentiated.
"It is rather difficult
to find short, simple examples of coroutines
which illustrate the
importance of the idea; the most useful coroutine
applications
generally are quite lengthy"
Knuth [The Art of Computer Programming]
Coroutines are a classic programming-in-the-large methodology. In
languages that do not support coroutines directly, threads are probably a possible
workaround. Coroutines and threads may be used for the same purpose, but they are
definitely not the same. Coroutines share one processor, with explicit transfer of control
(at the level of the coroutine primitives), whereas threads may be
executed in parallel or participate in time-sharing (without explicit control of
transfer). One problem with coroutines is that blocking I/O causes
all the coroutines to block. See for example
http://starship.python.net/crew/aahz/
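The "explicit transfer of control" that distinguishes coroutines from threads can be sketched with a standard Python generator and its send() method. The names consumer and producer are illustrative only.

```python
def consumer(received):
    """Coroutine: waits for a value, records its double, waits again."""
    while True:
        item = yield  # control goes back to whoever called send()
        received.append(item * 2)


def producer(items, sink):
    """Ordinary function that hands control to the coroutine per item."""
    next(sink)  # prime the coroutine: run it up to its first yield
    for item in items:
        sink.send(item)  # explicit transfer of control, no scheduler involved
    sink.close()


received = []
producer([1, 2, 3], consumer(received))
print(received)
```

Nothing runs in parallel here: at any moment exactly one of the two routines is executing, so a blocking call inside either one would stall both.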
[Alan spends the weekend upgrading an old laptop, so he can look at
this stuff in the comfort of the garden, which was bathed in glorious
sunshine for the entirety of the Irish Bank Holiday weekend B-). He
also prints and reads the writings of Christian Tismer[1], David
Mertz[2,3] and the Timbot[4,5], in relation to generators,
coroutines, continuations, pseudo/weightless/micro-threads, etc].
Steven Taschuk wrote:
[
Lots of great stuff which illustrates the problems of calling from
generator/coroutine code to functional code and vice versa.
]
I now understand. Thanks for taking the time to explain.
Thanks to your clear examples, I now picture coroutines in the
following way (which I hope is correct :-).
1. The "traditional" function/method call paradigm builds a stack
frame every time a function/method is called. Since this stack frame
holds the references to the local variables for the function/method,
it must be destroyed and collected if memory is not to be "leaked".
The only time when the stack frame can be destroyed is when the
references it contains are no longer required: i.e. after the
instance of the function/method call has finished.
2. The only method by which the "traditional" function call can
invoke another "traditional" function/method call is to call it
"recursively", thus causing the construction of another stack frame.
There is no mechanism for it to "substitute" the stack frame of the
called function "in place of" its own stack frame. (Although I
believe that Stackless can do this, after much rewriting and
reinvention of truth :-).
3. Because all "traditional" function/method calls, involve the
creation and eventual destruction of a stack frame, I label the space
in which they operate "Linear stack space".
4. There is another space, which I call "Generator space". When a
call is made into "Generator space", a stack frame is not
constructed: at least not every time. Instead, a resumable and
persistent stack frame is "resumed": this "resumable stack frame" was
created once, sometime in the past: it is termed, in current python
terminology, the generator-iterator.
5. When the code in "Generator space" (i.e. the generator-iterator)
is called, it resumes immediately after where it last 'yield'ed, and
continues until it 'yield's again, or returns or excepts. When it
'yield's, two things happen. A: The resumable stack frame is
"frozen", so that it can later be resumed again. and B: A python
object, which may be None, is transferred back to the caller.
6. For any call from "Linear Stack space" into "Generator space",
there is a crossover point, P. When the called code in "Generator
space" finishes executing, it can only enter back into "Linear stack
space" through point P: it cannot exit through any other point.
(According to the current python model).
7. If any calls are made from "Generator space" into "Linear stack
space", this leads to the creation of a stack frame, which must be
destroyed. If the called function/method in "Linear stack space" then
calls back into "Generator space", and the "Linear space" function is
not allowed to exit naturally, this is essentially unbounded recursion,
which will lead eventually to a blown stack. (The non-functioning
Producer/Consumer example illustrates this: the code in "Generator
space" could be made to work by "traditionally" calling the write and
read functions, but this mutual recursion would eventually blow the
"Linear space" stack).
8. When a stack frame in "Generator space" 'yield's, it is possible
to adopt a convention for the 'yield'ed value: the "dispatcher"
convention. Under this convention, the value 'yield'ed can be a
reference to another resumable stack frame in Generator space. A
special "dispatch"-function is coded to recognise these references,
and to immediately call back into "Generator space". The newly
resumed stack frame is still limited by the same constraints as the
original call into "Generator space": it must eventually return to
"Linear stack space" through point P, in this case the dispatcher
function.
9. A complex network of interacting "resumable stack frames", or
generators, can mutually 'yield' control to each other, assuming the
cooperation of the dispatcher function. Execution continually
"bounces" back and forth between the dispatch function (Point P, in
"Linear stack space") and various 'yield'ed resumable states in
"Generator space".
10. A network of interacting generator-iterators could potentially
execute much faster than an equivalent series of "traditional"
function/method calls in "Linear stack space", since there is no
overhead associated with con/destruction of stack frames, no parsing of
parameters, etc. Such a network of interacting generators is called
"coroutines", and requires the presence of a dispatcher function: the
dispatcher function must be explicitly created by the programmer, as
python currently exists.
10a: refinement: In fact python generators are really
"semi"-coroutines, because of the requirement to keep 'yield'ing to a
dispatcher function. In a language which implemented
"full"-coroutines (maybe a __future__ version of python?), the
dispatcher would not be required (at least explicitly), and resumable
states could transfer control directly to any other resumable state
for which they hold a reference.
11. The efficiency benefits of generators can be also realised by
*not* adopting the dispatcher convention. Instead, a generator can
simply 'yield' the series of values it has been coded to generate:
the nodes in a data structure, the numbers in a computed series, etc.
This can lead to more natural code, particularly for the calling
function which utilises the series of values 'yield'ed: it is mostly
unaware that there is a resumable state involved in the generation of
the values. Simple example:
def squares(n):
    for i in xrange(n):
        # Yield the number together with its square.
        yield i, i*i

# Instantiate a generator-iterator
sqgen = squares(100)

# We can do the following because the generator-iterator
# conveniently implements the iterator protocol.
for i, sq in sqgen:
    print "The square of %d is %d" % (i, sq)
12. It is possible to avoid the "mutual" recursion problem of calling
from "Generator space" into "Linear stack space", by the following
actions
o Moving all "traditional" function/method calls required from
"Linear stack space" into "Generator space".
o By expanding the "dispatcher" convention to allow generators to
yield special values representing a function call or a return value.
The dispatcher function must then explicitly manage a call stack.
13. It is possible to model ultra-lightweight threads by representing
each thread as a simple generator which steps through the states
required to implement the functionality required of the "thread". The
"execution pointer" of such threads is advanced simply by calling the
".next()" method of the "thread"s generator-iterator. (Incidentally,
as well as being highly efficient in terms of resource consumption,
these ultra-lightweight threads offer much finer control of
thread-prioritisation, thread-creation, destruction and "collection",
etc.)
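The dispatcher convention of points 8, 9 and 12 can be sketched as follows. This is only an illustration, written in present-day Python 3 syntax (next(gen) where the post writes gen.next()), and not code from the thread: the names dispatch, producer, consumer and the peers dictionary are invented for the example. Each generator yields a reference to the generator-iterator that should run next, and the dispatcher loop plays the role of the crossover point P.

```python
def dispatch(start):
    """Resume generator-iterators one at a time (hypothetical dispatcher).

    Whatever a generator yields is taken to be the next generator to run.
    """
    current = start
    while current is not None:
        try:
            current = next(current)      # resume; the yielded value is the next target
        except StopIteration:
            current = None               # a coroutine returned: stop the whole network


def producer(items, channel, peers):
    for item in items:
        channel.append(item)
        yield peers["consumer"]          # hand control to the consumer
    channel.append(None)                 # sentinel: nothing more to produce
    yield peers["consumer"]


def consumer(channel, peers, results):
    while True:
        item = channel.pop(0)
        if item is None:
            return                       # ends the run via StopIteration
        results.append(item * item)
        yield peers["producer"]          # hand control back to the producer


channel, results, peers = [], [], {}
peers["producer"] = producer([1, 2, 3], channel, peers)
peers["consumer"] = consumer(channel, peers, results)
dispatch(peers["producer"])
print(results)
```

Because every transfer goes back through dispatch, the ordinary call stack does not grow, however many times the two generators hand control to each other.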
Now that I think I've got my head around it, I think I'm going to try
my hand at an example or two, which I will post to the newsgroup
(might be a few weeks, I'm kinda busy right now). The two main
interesting examples that come to mind are
1. An ultra-lightweight thread implementation of an
asyncore/medusa/twisted style socket server.
2. A coroutine based version of something that is normally
(relatively) resource-intensive: a series of SAX2 callbacks to a
chain of ContentHandlers/Filters.
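The ultra-lightweight threads of point 13 might look roughly like the sketch below. It is not the example promised in the post; it uses Python 3 syntax, and worker and run_round_robin are hypothetical names chosen for the illustration. Each "thread" is a generator that yields whenever it is willing to give up control, and the scheduler advances each one in turn by calling next() on it.

```python
from collections import deque


def worker(name, steps, log):
    """A 'thread' body: do one unit of work, then yield control."""
    for step in range(steps):
        log.append((name, step))
        yield


def run_round_robin(threads):
    """Advance each generator in turn until all of them have finished."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)                 # advance the "execution pointer"
        except StopIteration:
            continue                     # finished: do not reschedule
        ready.append(thread)             # still alive: back of the queue


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

The scheduling policy lives entirely in run_round_robin, which is where the finer control over prioritisation and destruction mentioned in point 13 would be implemented.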
Lastly, I think I'm beginning to "get" continuations. Although the
examples given in Christian Tismer's paper "Continuations and
Stackless Python"[1] are difficult to follow without the definition
of a continuation object (which seems to require intimate familiarity
with the Stackless implementation), I think I understand the general
principle.
And it's a principle I'm not really interested in pursuing, because I
can't see that it will make me any more productive, or my programs
more efficient (than they could be using the "coroutines" and
"weightless-threads" described above). And I can imagine that
continuation code could be a little brain-bending to write (thus
making me *less* productive %-), as this example of a while loop
(which I sort of understand) shows:
def looptest(n):
    this = continuation.current()
    k = this.update(n)
    if k:
        this(k-1)
    else:
        del this.link
But I can see the power inherent in the continuations paradigm.
Again, many thanks to all who responded to my original post.
Reading material.
[1] http://www.stackless.com/spcpaper.pdf
[2] http://www-106.ibm.com/developerworks/linux/library/l-pygen.html
[3] http://www-106.ibm.com/developerworks/library/l-pythrd.html
[4] http://tinyurl.com/dbyn
[5] http://www.tismer.com/research/stackless/coroutines.tim.peters.html
kind regards,
--
alan kennedy
"John Roth" <johnroth@ameritech.net> wrote in message
news:uv94kkq5ajbae3@news.supernews.com...
>
> "Rocco Moretti" <roccomoretti@netscape.net> wrote
> > process of set attributes -> call processing function -> repeat is an
> > awkward way of achieving what you really want to do - continue the
> > generator with given data. I'd agree with what Guido said on
> > Python-Dev: there is no advantage in this scheme over passing a
> > mutable dummy object when the generator is created (you could even
> > call it __self__ if you really wanted to ...).
> I thoroughly agree.
Me too.
> with some bemusement, because, for me at least, the most
> intuitive thing to do is make yield a built in function rather than
> a statement.
But then we would need another keyword (possible, of course) to tell
the compiler to change the generated bytecode. It would also break
the parallel between 'return something' and 'yield something'. The
only difference in execution is that yield does less -- by *not*
deleting the execution frame.
> That way, it can return a value, which is what
> was passed in on the reinvocation.
One of the design features of generators is that resumption is
extremely quick precisely because the argument processing and function
setup steps are bypassed. Just restore the execution frame and go on
with the next statement [bytecode actually].
Another design feature is that they are meant to work seamlessly with
for statements, with only one explicit call (to produce the iterator
that 'for' needs). In this context, passing in additional values is
not possible.
Another problem is with multiple yields. Consider the following
hypothetical generator with the proposed yield() 'function' (see next
comment for why the '' marks):
def genf(a):
    b=yield(a)
    b,c=yield(a+b)
    yield(a+b+c)

gen = genf(1)
Then the documentation must say: on the first call to gen.next(), pass
one arg; on the next, pass two; finally, don't pass any. Not good,
methinks.
> The trouble with the whole thing is that conceptually, "yield" has two
> values: one that it returns to the caller, and one that it
> gets from the caller. It's a two way pipe.
(You mean, 'would be'.) Pipes and such usually have explicit read and
write methods that make the order of operations clear. The yield()
proposal hijacks function notation to squash a read and write
together -- with the order switched at the two ends.
To explain: y=f(x) usually means 'set y to a value depending on
(calculated from) x' -- this is the meaning of 'function'.
Operationally, a function call means to send x to process f to
initialize the corresponding parameter, suspend operation while f
operates, and set y to the value returned by f. The combined
next(arg)/yield() idea warps and shuffles these semantics.
Let p = pipe or whatever: Then (omitting suspend) y = yield(x) could
mean
p.write(x); y=p.read()
where (contrary to the implication of function notation) the value
read cannot depend on x since it is calculated at the other end before
the write.
The corresponding x = gen.next(y) then means
x=p.read(); p.write(y)
which makes the order of value passing the opposite of what a function
call means. The only reason to write y is for a possible future call
to gen.next(). I think it much better to separate the read and write
and pair the writing of y with the reading of the x that functionally
depends on the value written.
One could reverse the meaning of x = gen.next(y) to be
p.write(y); x=p.read()
but the x read would still not depend on the y written since the
latter would have to be set aside and ignored until the predetermined
x was sent back. IE, y=yield(x) would have to mean and be implemented
as
<hidden-slot> = p.read(); p.write(x); y=<hidden-slot>
In either case, there is a synchronization problem in that the values
sent to the generator are shifted by one call. The last gen.next()
call must send a dummy that will be ignored. On the other hand, if,
for instance, there is one yield function which depends on the
variable reset by the yield function, then that variable must be
separately initialized in the initial genf() call.
So it arguably makes just as much sense to initialize via genf() with
a mutable with paired write/read, send/receive, or set/get methods or
the equivalent operations (as with dicts). One can then make explicit
calls to pass data in either direction without fake 'function' calls.
If one pairs 'yield None' with 'dummy=gen.next()' or 'for dummy in
genf(mutable):', one can even do all value passing, in both
directions, via the two-way channel or mutable and use 'yield'
strictly for suspend/resume synchronization of the coroutines.
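The alternative described above, a mutable object handed to the generator when it is created, can be sketched like this. The sketch is in Python 3 syntax, and Mailbox, running_total and the attribute names are made up for the illustration; they do not appear in the thread. All values travel through the mailbox, and yield carries no data at all.

```python
class Mailbox:
    """Hypothetical two-way channel shared by the caller and the generator."""

    def __init__(self):
        self.to_gen = None       # written by the caller, read by the generator
        self.from_gen = None     # written by the generator, read by the caller


def running_total(box):
    total = 0
    while True:
        total += box.to_gen      # explicit read of the value sent in
        box.from_gen = total     # explicit write of the reply
        yield None               # used purely for suspend/resume


box = Mailbox()
gen = running_total(box)
totals = []
for value in [5, 10, 20]:
    box.to_gen = value           # write ...
    next(gen)                    # ... resume the generator ...
    totals.append(box.from_gen)  # ... and read the reply that depends on it
print(totals)
```

Each reply depends on the value written just before the resumption, so the one-call shift discussed above does not arise in this arrangement.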
--
This proposal comes close to asking for what the original Stackless did
with continuations. These allowed some mind-boggling code. Almost
too fancy for what Python is meant to be 8-).
Terry J. Reedy
Major revision: more details about exceptions, return vs StopIteration, and
interactions with try/except/finally; more Q&A; and a BDFL Pronouncement. The
reference implementation appears solid and works as described here in all
respects, so I expect this will be the last major revision (and so also last
full posting) of this PEP.
The output below is in ndiff format (see Tools/scripts/ndiff.py in your Python
distribution). Just the new text can be seen in HTML form here:
http://python.sf.net/peps/pep-0255.html
"Feature discussions" should take place primarily on the Python Iterators list:
mailto:python-iterators@lists.sourceforge.net
Implementation discussions may wander in and out of Python-Dev too.
PEP: 255
Title: Simple Generators
- Version: $Revision: 1.3 $
? ^
+ Version: $Revision: 1.12 $
? ^^
Author: nas@python.ca (Neil Schemenauer),
tim.one@home.com (Tim Peters),
magnus@hetland.org (Magnus Lie Hetland)
Discussion-To: python-iterators@lists.sourceforge.net
Status: Draft
Type: Standards Track
Requires: 234
Created: 18-May-2001
Python-Version: 2.2
- Post-History: 14-Jun-2001
+ Post-History: 14-Jun-2001, 23-Jun-2001
? +++++++++++++
Abstract
This PEP introduces the concept of generators to Python, as well
as a new statement used in conjunction with them, the "yield"
statement.
Motivation
When a producer function has a hard enough job that it requires
maintaining state between values produced, most programming languages
offer no pleasant and efficient solution beyond adding a callback
function to the producer's argument list, to be called with each value
produced.
For example, tokenize.py in the standard library takes this approach:
the caller must pass a "tokeneater" function to tokenize(), called
whenever tokenize() finds the next token. This allows tokenize to be
coded in a natural way, but programs calling tokenize are typically
convoluted by the need to remember between callbacks which token(s)
were seen last. The tokeneater function in tabnanny.py is a good
example of that, maintaining a state machine in global variables, to
remember across callbacks what it has already seen and what it hopes to
see next. This was difficult to get working correctly, and is still
difficult for people to understand. Unfortunately, that's typical of
this approach.
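A small illustration of the callback style criticised above, in Python 3 syntax. It does not use the real tokenize module: scan_words and DuplicateFinder are hypothetical stand-ins for a producer and a "tokeneater". The point is that the producer is easy to write, while the consumer has to park its state somewhere between callbacks.

```python
def scan_words(text, callback):
    """Producer: natural to write, it simply calls back for every word."""
    for word in text.split():
        callback(word)


class DuplicateFinder:
    """Consumer: must remember between callbacks what it saw last."""

    def __init__(self):
        self.previous = None
        self.duplicates = []

    def __call__(self, word):
        if word == self.previous:
            self.duplicates.append(word)
        self.previous = word     # state carried over to the next callback


finder = DuplicateFinder()
scan_words("the the quick brown fox fox jumps", finder)
print(finder.duplicates)
```

Even in this tiny case the consumer's logic is split between __init__ and __call__ instead of reading as one loop with local variables.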
An alternative would have been for tokenize to produce an entire parse
of the Python program at once, in a large list. Then tokenize clients
could be written in a natural way, using local variables and local
control flow (such as loops and nested if statements) to keep track of
their state. But this isn't practical: programs can be very large, so
no a priori bound can be placed on the memory needed to materialize the
whole parse; and some tokenize clients only want to see whether
something specific appears early in the program (e.g., a future
statement, or, as is done in IDLE, just the first indented statement),
and then parsing the whole program first is a severe waste of time.
Another alternative would be to make tokenize an iterator[1],
delivering the next token whenever its .next() method is invoked. This
is pleasant for the caller in the same way a large list of results
would be, but without the memory and "what if I want to get out early?"
drawbacks. However, this shifts the burden on tokenize to remember
*its* state between .next() invocations, and the reader need only
glance at tokenize.tokenize_loop() to realize what a horrid chore that
would be. Or picture a recursive algorithm for producing the nodes of
a general tree structure: to cast that into an iterator framework
requires removing the recursion manually and maintaining the state of
the traversal by hand.
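To make the cost of the iterator alternative concrete, here is a sketch of a hand-written in-order tree iterator in Python 3 syntax (where the method is spelled __next__ rather than .next()). The representation of a tree as nested (left, label, right) tuples and the class name InorderIterator are assumptions made only for this example.

```python
class InorderIterator:
    """In-order traversal with the recursion removed by hand."""

    def __init__(self, root):
        self.stack = []              # explicit stack replaces the call stack
        self._descend(root)

    def _descend(self, node):
        # Push the node and its chain of left children.
        while node is not None:
            self.stack.append(node)
            node = node[0]

    def __iter__(self):
        return self

    def __next__(self):
        if not self.stack:
            raise StopIteration
        left, label, right = self.stack.pop()
        self._descend(right)         # prepare the state for the following call
        return label


tree = ((None, "A", None), "B", ((None, "C", None), "D", None))
labels = list(InorderIterator(tree))
print(labels)
```

All of the traversal state that a recursive function would keep implicitly has to be stored on the object and restored on every call.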
A fourth option is to run the producer and consumer in separate
threads. This allows both to maintain their states in natural ways,
and so is pleasant for both. Indeed, Demo/threads/Generator.py in the
Python source distribution provides a usable synchronized-communication
class for doing that in a general way. This doesn't work on platforms
without threads, though, and is very slow on platforms that do
(compared to what is achievable without threads).
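The thread-based option can be sketched with the standard threading and queue modules. This is not the Demo/threads/Generator.py class mentioned above, just a minimal Python 3 illustration with invented names (produce, consume, _DONE). Both sides keep their state in ordinary local variables, at the price of a thread and a synchronised queue.

```python
import queue
import threading

_DONE = object()                     # sentinel marking the end of the stream


def produce(numbers, channel):
    """Producer thread: natural loop, blocks when the queue is full."""
    for n in numbers:
        channel.put(n * n)
    channel.put(_DONE)


def consume(channel):
    """Consumer: natural loop, blocks when the queue is empty."""
    results = []
    while True:
        item = channel.get()
        if item is _DONE:
            return results
        results.append(item)


channel = queue.Queue(maxsize=1)     # size 1 forces strict hand-over
producer_thread = threading.Thread(target=produce, args=(range(5), channel))
producer_thread.start()
squares = consume(channel)
producer_thread.join()
print(squares)
```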
A final option is to use the Stackless[2][3] variant implementation of
Python instead, which supports lightweight coroutines. This has much
the same programmatic benefits as the thread option, but is much more
efficient. However, Stackless is a controversial rethinking of the
Python core, and it may not be possible for Jython to implement the
same semantics. This PEP isn't the place to debate that, so suffice it
to say here that generators provide a useful subset of Stackless
functionality in a way that fits easily into the current CPython
implementation, and is believed to be relatively straightforward for
other Python implementations.
That exhausts the current alternatives. Some other high-level
languages provide pleasant solutions, notably iterators in Sather[4],
which were inspired by iterators in CLU; and generators in Icon[5], a
novel language where every expression "is a generator". There are
differences among these, but the basic idea is the same: provide a
kind of function that can return an intermediate result ("the next
value") to its caller, but maintaining the function's local state so
that the function can be resumed again right where it left off. A
very simple example:
def fib():
    a, b = 0, 1
    while 1:
        yield b
        a, b = b, a+b
When fib() is first invoked, it sets a to 0 and b to 1, then yields b
back to its caller. The caller sees 1. When fib is resumed, from its
point of view the yield statement is really the same as, say, a print
statement: fib continues after the yield with all local state intact.
a and b then become 1 and 1, and fib loops back to the yield, yielding
1 to its invoker. And so on. From fib's point of view it's just
delivering a sequence of results, as if via callback. But from its
caller's point of view, the fib invocation is an iterable object that
can be resumed at will. As in the thread approach, this allows both
sides to be coded in the most natural ways; but unlike the thread
approach, this can be done efficiently and on all platforms. Indeed,
resuming a generator should be no more expensive than a function call.
The same kind of approach applies to many producer/consumer functions.
For example, tokenize.py could yield the next token instead of invoking
a callback function with it as argument, and tokenize clients could
iterate over the tokens in a natural way: a Python generator is a kind
of Python iterator[1], but of an especially powerful kind.
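The producer/consumer pattern described above can be illustrated with a toy stand-in for a tokenizer, written in Python 3 syntax. The functions words, tracked and first_line_with are hypothetical and have nothing to do with the real tokenize.py. The client iterates in a natural way and can stop early, in which case the rest of the input is never read.

```python
def words(lines):
    """Toy 'tokenizer': yield (line number, word) pairs one at a time."""
    for line_number, line in enumerate(lines, start=1):
        for word in line.split():
            yield line_number, word


def first_line_with(target, lines):
    """Client: plain local control flow, may stop before the input ends."""
    for line_number, word in words(lines):
        if word == target:
            return line_number
    return None


consumed = []


def tracked(lines):
    """Record which input lines were actually pulled by the tokenizer."""
    for line in lines:
        consumed.append(line)
        yield line


source = ["import sys", "def main():", "    return 0"]
found = first_line_with("def", tracked(source))
print(found, consumed)
```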
- Specification
+ Specification: Yield
? ++++++++
A new statement is introduced:
yield_stmt: "yield" expression_list
"yield" is a new keyword, so a future statement[8] is needed to phase
- this in. [XXX spell this out]
+ this in. [XXX spell this out -- but new keywords have ripple effects
+ across tools too, and it's not clear this can be forced into the future
+ framework at all -- it's not even clear that Python's parser alone can
+ be taught to swing both ways based on a future stmt]
The yield statement may only be used inside functions. A function that
- contains a yield statement is called a generator function.
+ contains a yield statement is called a generator function. A generator
? +++++++++++++
+ function is an ordinary function object in all respects, but has the
+ new CO_GENERATOR flag set in the code object's co_flags member.
When a generator function is called, the actual arguments are bound to
function-local formal argument names in the usual way, but no code in
the body of the function is executed. Instead a generator-iterator
object is returned; this conforms to the iterator protocol[6], so in
particular can be used in for-loops in a natural way. Note that when
the intent is clear from context, the unqualified name "generator" may
be used to refer either to a generator-function or a generator-
iterator.
Each time the .next() method of a generator-iterator is invoked, the
code in the body of the generator-function is executed until a yield
or return statement (see below) is encountered, or until the end of
the body is reached.
If a yield statement is encountered, the state of the function is
frozen, and the value of expression_list is returned to .next()'s
caller. By "frozen" we mean that all local state is retained,
including the current bindings of local variables, the instruction
pointer, and the internal evaluation stack: enough information is
saved so that the next time .next() is invoked, the function can
proceed exactly as if the yield statement were just another external
call.
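The calling behaviour specified above can be observed directly. The sketch below uses Python 3 syntax, so next(gen) stands in for the gen.next() of the PEP text; countdown and the events list are names invented for the illustration.

```python
events = []


def countdown(n):
    events.append("body started")
    while n > 0:
        yield n                      # state is frozen here between resumptions
        n -= 1
    events.append("body finished")


gen = countdown(2)                   # binds n, but runs no code in the body
started_on_call = list(events)       # snapshot taken before the first resumption

first = next(gen)                    # runs up to the first yield
second = next(gen)                   # resumes with n intact, then yields again

try:
    next(gen)                        # falls off the end of the body
    exhausted = False
except StopIteration:
    exhausted = True

print(started_on_call, first, second, exhausted, events)
```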
+ Restriction: A yield statement is not allowed in the try clause of a
+ try/finally construct. The difficulty is that there's no guarantee
+ the generator will ever be resumed, hence no guarantee that the finally
+ block will ever get executed; that's too much a violation of finally's
+ purpose to bear.
+
+
+ Specification: Return
+
A generator function can also contain return statements of the form:
"return"
Note that an expression_list is not allowed on return statements
in the body of a generator (although, of course, they may appear in
the bodies of non-generator functions nested within the generator).
- When a return statement is encountered, nothing is returned, but a
+ When a return statement is encountered, control proceeds as in any
+ function return, executing the appropriate finally clauses (if any
- StopIteration exception is raised, signalling that the iterator is
? ------------
+ exist). Then a StopIteration exception is raised, signalling that the
? ++++++++++++++++
- exhausted. The same is true if control flows off the end of the
+ iterator is exhausted. A StopIteration exception is also raised if
+ control flows off the end of the generator without an explicit return.
+
- function. Note that return means "I'm done, and have nothing
? -----------
+ Note that return means "I'm done, and have nothing interesting to
? +++++++++++++++
- interesting to return", for both generator functions and non-generator
? ---------------
+ return", for both generator functions and non-generator functions.
? +++++++++++
- functions.
+
+ Note that return isn't always equivalent to raising StopIteration: the
+ difference lies in how enclosing try/except constructs are treated.
+ For example,
+
+ >>> def f1():
+ ...     try:
+ ...         return
+ ...     except:
+ ...         yield 1
+ >>> print list(f1())
+ []
+
+ because, as in any function, return simply exits, but
+
+ >>> def f2():
+ ...     try:
+ ...         raise StopIteration
+ ...     except:
+ ...         yield 42
+ >>> print list(f2())
+ [42]
+
+ because StopIteration is captured by a bare "except", as is any
+ exception.
+
+
+ Specification: Generators and Exception Propagation
+
+ If an unhandled exception-- including, but not limited to,
+ StopIteration --is raised by, or passes through, a generator function,
+ then the exception is passed on to the caller in the usual way, and
+ subsequent attempts to resume the generator function raise
+ StopIteration. In other words, an unhandled exception terminates a
+ generator's useful life.
+
+ Example (not idiomatic but to illustrate the point):
+
+ >>> def f():
+ ...     return 1/0
+ >>> def g():
+ ...     yield f()  # the zero division exception propagates
+ ...     yield 42   # and we'll never get here
+ >>> k = g()
+ >>> k.next()
+ Traceback (most recent call last):
+   File "<stdin>", line 1, in ?
+   File "<stdin>", line 2, in g
+   File "<stdin>", line 2, in f
+ ZeroDivisionError: integer division or modulo by zero
+ >>> k.next()  # and the generator cannot be resumed
+ Traceback (most recent call last):
+   File "<stdin>", line 1, in ?
+ StopIteration
+ >>>
+
+
+ Specification: Try/Except/Finally
+
+ As noted earlier, yield is not allowed in the try clause of a try/
+ finally construct. A consequence is that generators should allocate
+ critical resources with great care. There is no restriction on yield
+ otherwise appearing in finally clauses, except clauses, or in the try
+ clause of a try/except construct:
+
+ >>> def f():
+ ...     try:
+ ...         yield 1
+ ...         try:
+ ...             yield 2
+ ...             1/0
+ ...             yield 3  # never get here
+ ...         except ZeroDivisionError:
+ ...             yield 4
+ ...             yield 5
+ ...             raise
+ ...         except:
+ ...             yield 6
+ ...         yield 7     # the "raise" above stops this
+ ...     except:
+ ...         yield 8
+ ...     yield 9
+ ...     try:
+ ...         x = 12
+ ...     finally:
+ ...         yield 10
+ ...     yield 11
+ >>> print list(f())
+ [1, 2, 4, 5, 8, 9, 10, 11]
+ >>>
Example
# A binary tree class.
class Tree:

    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right

    def __repr__(self, level=0, indent=" "):
        s = level*indent + `self.label`
        if self.left:
            s = s + "\n" + self.left.__repr__(level+1, indent)
        if self.right:
            s = s + "\n" + self.right.__repr__(level+1, indent)
        return s

    def __iter__(self):
        return inorder(self)

# Create a Tree from a list.
def tree(list):
    n = len(list)
    if n == 0:
        return []
    i = n / 2
    return Tree(list[i], tree(list[:i]), tree(list[i+1:]))

# A recursive generator that generates Tree labels in in-order.
def inorder(t):
    if t:
        for x in inorder(t.left):
            yield x
        yield t.label
        for x in inorder(t.right):
            yield x

# Show it off: create a tree.
t = tree("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
# Print the nodes of the tree in in-order.
for x in t:
    print x,
print

# A non-recursive generator.
def inorder(node):
    stack = []
    while node:
        while node.left:
            stack.append(node)
            node = node.left
        yield node.label
        while not node.right:
            try:
                node = stack.pop()
            except IndexError:
                return
            yield node.label
        node = node.right

# Exercise the non-recursive generator.
for x in t:
    print x,
print
+ Both output blocks display:
+
+ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
+
Q & A
+ Q. Why not a new keyword instead of reusing "def"?
+
+ A. See BDFL Pronouncements section below.
+
- Q. Why a new keyword? Why not a builtin function instead?
+ Q. Why a new keyword for "yield"? Why not a builtin function instead?
? ++++++++++++
A. Control flow is much better expressed via keyword in Python, and
yield is a control construct. It's also believed that efficient
implementation in Jython requires that the compiler be able to
determine potential suspension points at compile-time, and a new
- keyword makes that easy.
+ keyword makes that easy. The CPython reference implementation also
+ exploits it heavily, to detect which functions *are* generator-
+ functions (although a new keyword in place of "def" would solve that
+ for CPython -- but people asking the "why a new keyword?" question
+ don't want any new keyword).
+
+ Q: Then why not some other special syntax without a new keyword? For
+ example, one of these instead of "yield 3":
+
+ return 3 and continue
+ return and continue 3
+ return generating 3
+ continue return 3
+ return >> , 3
+ from generator return 3
+ return >> 3
+ return << 3
+ >> 3
+ << 3
+
+ A: Did I miss one <wink>? Out of hundreds of messages, I counted two
+ suggesting such an alternative, and extracted the above from them.
+ It would be nice not to need a new keyword, but nicer to make yield
+ very clear -- I don't want to have to *deduce* that a yield is
+ occurring from making sense of a previously senseless sequence of
+ keywords or operators. Still, if this attracts enough interest,
+ proponents should settle on a single consensus suggestion, and Guido
+ will Pronounce on it.
+
+ Q. Why allow "return" at all? Why not force termination to be spelled
+ "raise StopIteration"?
+
+ A. The mechanics of StopIteration are low-level details, much like the
+ mechanics of IndexError in Python 2.1: the implementation needs to
+ do *something* well-defined under the covers, and Python exposes
+ these mechanisms for advanced users. That's not an argument for
+ forcing everyone to work at that level, though. "return" means "I'm
+ done" in any kind of function, and that's easy to explain and to use.
+ Note that "return" isn't always equivalent to "raise StopIteration"
+ in try/except construct, either (see the "Specification: Return"
+ section).
+
+ Q. Then why not allow an expression on "return" too?
+
+ A. Perhaps we will someday. In Icon, "return expr" means both "I'm
+ done", and "but I have one final useful value to return too, and
+ this is it". At the start, and in the absence of compelling uses
+ for "return expr", it's simply cleaner to use "yield" exclusively
+ for delivering values.
+
+
+ BDFL Pronouncements
+
+ Issue: Introduce another new keyword (say, "gen" or "generator") in
+ place of "def", or otherwise alter the syntax, to distinguish
+ generator-functions from non-generator functions.
+
+ Con: In practice (how you think about them), generators *are*
+ functions, but with the twist that they're resumable. The mechanics of
+ how they're set up is a comparatively minor technical issue, and
+ introducing a new keyword would unhelpfully overemphasize the
+ mechanics of how generators get started (a vital but tiny part of a
+ generator's life).
+
+ Pro: In reality (how you think about them), generator-functions are
+ actually factory functions that produce generator-iterators as if by
+ magic. In this respect they're radically different from non-generator
+ functions, acting more like a constructor than a function, so reusing
+ "def" is at best confusing. A "yield" statement buried in the body is
+ not enough warning that the semantics are so different.
+
+ BDFL: "def" it stays. No argument on either side is totally
+ convincing, so I have consulted my language designer's intuition. It
+ tells me that the syntax proposed in the PEP is exactly right - not too
+ hot, not too cold. But, like the Oracle at Delphi in Greek mythology,
+ it doesn't tell me why, so I don't have a rebuttal for the arguments
+ against the PEP syntax. The best I can come up with (apart from
+ agreeing with the rebuttals ... already made) is "FUD". If this had
+ been part of the language from day one, I very much doubt it would have
+ made Andrew Kuchling's "Python Warts" page.
Reference Implementation
- A preliminary patch against the CVS Python source is available[7].
+ The current implementation, in a preliminary state (no docs and no
+ focused tests), is part of Python's CVS development tree[9].
+ Using this requires that you build Python from source.
+
+ This was derived from an earlier patch by Neil Schemenauer[7].
Footnotes and References
[1] PEP 234, http://python.sf.net/peps/pep-0234.html
[2] http://www.stackless.com/
[3] PEP 219, http://python.sf.net/peps/pep-0219.html
[4] "Iteration Abstraction in Sather"
Murer , Omohundro, Stoutamire and Szyperski
http://www.icsi.berkeley.edu/~sather/Publications/toplas.html
[5] http://www.cs.arizona.edu/icon/
[6] The concept of iterators is described in PEP 234
http://python.sf.net/peps/pep-0234.html
[7] http://python.ca/nas/python/generator.diff
[8] http://python.sf.net/peps/pep-0236.html
+ [9] To experiment with this implementation, check out Python from CVS
+ according to the instructions at
+ http://sf.net/cvs/?group_id=5470
Copyright
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
End:
[Neel Krishnaswami]
> ...
> I've been looking at Icon, and it occurs to me that if coroutines and
> generators were available at the Python level, it might yield a way of
> doing string processing that is more "Pythonic" than regexps.
Yes, and you can find much more about this in the very early days of the
String SIG archives.
> Regexps are nice because when you have a pattern that they can
> directly represent, then you can simply specify a pattern and then you
> don't have to worry about the tedious bookkeeping of looping over the
> string and keeping track of the state variable.
>
> However, when you need to match a pattern that a regexp can't match,
> then suddenly you need to break the string into tokens with regexps
> and then loop over the pieces and keep track of a bunch of state
> variables that don't necessarily correspond to the pieces you are
> actually interested in.
>
> This is unpleasant, because a) clumsy bookkeeping is bad, and b)
> there's two different sorts of syntax needed to do basically
> similar tasks.
The latter is one of the arguments Griswold gave in the paper that
introduced Icon, contrasting Icon's uniform approach to SNOBOL4's distinct
pattern and procedural sublanguages.
I don't think he'd make the same argument today! Icon is an idiosyncratic
delight, but most string pattern matching was in fact easier (to write,
modify, or read) in SNOBOL4. Once you start using Icon for complex pattern
matching, you'll soon discover that it's very hard to go back to old code
and disentangle "the pattern" from "the processing" -- *because* everything
looks the same, and it's all intermixed. The distinct sublanguages in
SNOBOL4 enforced a clean separation; and that was more often an aid than a
burden.
Have noted before that writing string matching code in Icon feels a lot like
writing in an assembly language for SNOBOL4. Of course the latter didn't
have Icon's power in other areas (e.g. it's at least possible to program
structural pattern-matchers in Icon, using generators to match e.g. tree
patterns; SNOBOL4's patterns started & ended with strings).
> If we could compose generators just like functions, then the bookkeeping
> can be abstracted away and the same approach will work for arbitrarily
> complicated parsing tasks.
The real advantage of regexps is speed, & that's probably why they'll
always be much more popular. SNOBOL4 and Icon didn't define "bal" builtins
because you couldn't code that pattern yourself <wink>. bal is beyond a
regexp's abilities, but it's still a simple kind of pattern, and just runs
too darned slow if implemented via the recursive definition as run with the
general backtracking machinery.
> So I think it would be nice if these lovely toys were available at
> the Python level. Is this a possibility?
I've been bugging Guido about generators since '91, but for the billion and
one uses other than string matching. Anything is possible -- except that
I'll ever stop bugging him <wink>.
> (I'd love to be able to define coroutines in Python like so:
>
> def evens(z):
>     for elt in z:
>         if elt % 2 == 0:
>             suspend(elt)
How do you intend to resume it?
> It would probably require some rethinking of Python's iteration
> protocol though.)
That's not a hangup; Guido already knows how to do it cleanly. Like any Big
New Feature there are real questions about costs vs benefits, and so far
this stuff has been widely viewed as "too esoteric". I think it's natural
as breathing, to the point that a *non*-resumable function strikes me as
never inhaling <wink>.
For now, there's a working implementation of a generator protocol (somewhat
more flexible than Icon's) in the source distribution, under
Demo/threads/Generator.py. I also posted a general (thread-based) coroutine
implementation a bit over 5 years ago. Building these on Christian's
stackless Python instead should run much faster.
BTW, generators are much easier to implement than coroutines -- the former
don't require threads or stacklessness or continuations under the covers,
just some relatively straightforward changes to Python's current
implementation (Steven Majewski noted this 5 years ago). This is why
generators work in all ports of Icon, but coroutines are an optional feature
that's supported only if a platform guru has taken the time to write the
platform-dependent context-switching assembly code Icon coexps require.
degeneratoredly y'rs - tim
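For comparison, the evens idea quoted above can be written with the yield statement that Python later adopted for generators. This is a minimal sketch rather than anything posted in the thread. It also answers the resume question: each next() call resumes the function right after the yield where it last stopped.

```python
def evens(values):
    """Yield the even elements of any iterable, one at a time."""
    for elt in values:
        if elt % 2 == 0:
            yield elt  # suspend here; the following next() call resumes after this line


gen = evens([1, 2, 3, 4, 5, 6])
first = next(gen)   # runs the body until the first yield
rest = list(gen)    # keeps resuming until the loop is exhausted
print(first, rest)
```

Because the generator object holds the suspended frame, no changes to the caller are needed beyond using it in a for loop or calling next().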
[Dec 26, 2001] Nice post with an explanation of coroutines for a person
outside the academic community
Aaron Watters wrote:
>
> James C. Phillips wrote:
> >
> > Daniel Larsson wrote:
> > > Coroutines are still rare in newer languages. I'm sure there
> > > are more, but I know only of Modula-2 and BETA.
> >
> > For us non-computer-science-types, what is a coroutine?
> > I've never heard this term before.
> >
> > -Jim
>
> I've never really used them, but as I recall they are like
> subroutines, except that more than one coroutine can be
> active at once and one coroutine can explicitly give control
> to another or something... I personally don't understand why
> you need programming language features to emulate this kind of
> behaviour, but I'm probably ill informed and wrong.
>
> Maybe some expert can tell me: is there anything you can do
> with coroutines that can't be emulated directly with instances
> of classes in Python or M3? Please educate... -- Aaron Watters
I would hardly describe myself as an expert in coroutines, but
anyway...
Suppose you have a binary tree class:
class Node:
    def __init__(self, elem, left=None, right=None):
        self.elem = elem
        self.left, self.right = left, right
class BinTree:
    def __init__(self):
        self.root = None
    def scan(self, node):
        if node:
            self.scan(node.left)
            suspend node.elem # Suspend coroutine and return value
            self.scan(node.right)
    coroutine traverse(self):
        self.scan(self.root)
        return None # Terminate coroutine
A coroutine has its own stack. When a coroutine is called, the
caller's stack is saved, and the execution is moved to the callee's
stack. When a coroutine suspends, stacks are reversed again.
In the above case, each call to traverse will do one inorder step
in the tree and return the value at that node. Here's how to use
it to print the contents of a binary tree:
tree = BinTree()
... # init tree
while 1:
    elem = tree.traverse()
    if elem == None: break
    print elem
This could certainly be implemented in Python without coroutines,
and I don't even know if this is even a good example of the
benefits of coroutines. Anyway, some problems where you would like
to use threads, might be easier to solve with coroutines, since
you don't have any synchronization problems (you have explicit
control of when switching between threads).
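The hypothetical suspend/coroutine syntax in this post never became Python, but the same in-order traversal can be sketched with generators. The version below assumes Python 3.3 or later (for yield from); the root parameter on BinTree is an addition made only to keep the example short.

```python
class Node:
    def __init__(self, elem, left=None, right=None):
        self.elem = elem
        self.left, self.right = left, right


class BinTree:
    def __init__(self, root=None):   # `root` argument is an illustrative addition
        self.root = root

    def scan(self, node):
        if node:
            yield from self.scan(node.left)    # delegate to the left subtree
            yield node.elem                    # suspend and hand one element to the caller
            yield from self.scan(node.right)   # then continue with the right subtree

    def traverse(self):
        return self.scan(self.root)            # a generator object; nothing runs yet


tree = BinTree(Node(2, Node(1), Node(3)))
elems = [elem for elem in tree.traverse()]
print(elems)
```

The for loop replaces the explicit "while 1 ... if elem == None: break" protocol, since the end of the generator signals termination.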
Hope I didn't confuse too many people out there...
--
Daniel Larsson, ABB Industrial Systems AB
Howdy,
Andrew Cooke wrote:
>
> In article <000001bf768e$48e40580$45a0143f@tim>,
> "Tim Peters" <tim_one@email.msn.com> wrote:
> > def traverse_post(self):
> >     for child in self.left, self.right:
> >         if child is not None:
> >             suspend child.traverse_post()
> >     suspend self.data
>
> That finally hammered home to me just what continuations are all about.
> Does anyone have something similarly elegant that shows a coroutine?
> I've just had a look at the stackless python documentation and
> coroutines seem to be described as something coming out of a single
> detailed example. Is there a simpler definition? (I did look back
> through some posts on Deja, but there was nothing recent that seemed to
> explain what a coroutine actually is - sorry if I've missed something
> obvious).
What did you read, actually?
Here are some pointers:
Homepage: http://www.tismer.com/research/stackless/
IPC8 paper: http://www.tismer.com/research/stackless/spcpaper.htm
Slides: http://www.tismer.com/research/stackless/spc-sheets.ppt
The latter gives you a bit of understanding what a continuation is.
Tim Peters on coroutines can be found here:
http://www.tismer.com/research/stackless/coroutines.tim.peters.html
More on coroutines by Sam Rushing:
http://www.nightmare.com/~rushing/copython/index.html
On Scheme, continuations, generators and coroutines:
http://www.cs.rice.edu/~dorai/t-y-scheme/
Revised Scheme 5 report:
http://www.swiss.ai.mit.edu/~jaffer/r5rs_toc.html
The following books are also highly recommended:
- Andrew W. Appel, Compiling with Continuations, Cambridge University
Press, 1992
- Daniel P. Friedman, Mitchell Wand, and Christopher T. Haynes,
Essentials of Programming Languages, MIT Press, 1993
- Christopher T. Haynes, Daniel P. Friedman, and Mitchell Wand,
Continuations and Coroutines, Computer Languages, 11(3/4): 143-153,
1986.
- Strachey and Wadsworth, Continuations: A mathematical semantics which
can deal with full jumps. Technical monograph PRG-11, Programming
Research Group, Oxford, 1974.
I don't think this is easy stuff at all, and you can't expect
to find a simple answer by skimming a couple of web pages.
It took me a lot of time to understand a fair part of this,
and this is also a showstopper to get this stuff to be used.
> Also, comp.lang.lisp is currently dissing continuations. Would that be
> because the alternative is to pass the code that will process the node
> into the iterator as a (first class) function? Obviously, in this case,
> yes, but is that true in general (are there examples where continuations
> or coroutines make something possible that really is tricky to do in
> other ways)?
An iterator is quite an easy thing, and it can be implemented
without continuations rather easily. Continuations cover problems
of another order of magnitude. It is due to too simple examples
that everybody thinks that coroutines and generators are the
only target. Continuations can express any kind of control flow,
and they can model iterators, coroutines and generators easily.
The opposite is not true!
I know this isn't enough to convince you, but at the time I have
no chance. I need to build some very good demo applications
which use continuations without exposing them to the user,
and this is admittedly difficult.
ciao - chris
--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Düppelstr. 31 : *Starship* http://starship.python.net
12163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
We add a new special form, (coroutine (lambda (v)
<body>)) that evaluates to a coroutine object.
Name a coroutine the same way you name an ordinary function;
e.g.
(define Coroutine-1 (coroutine (lambda (v) ...)))
The body of a coroutine is a non-terminating expression
using standard Scheme constructs and one additional form. This is (resume
<co> <val>), where <co> refers to any other coroutine defined in
the same way.
The parameter v is passed the first time the coroutine
is called or resumed.
Suppose coroutine A executes (resume B x).
If this is the first time B has been called, x is passed
to v.
Otherwise, B was halted by having called (resume .. y).
x is returned as the value of that call to resume.
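The resume rule just described can be imitated in Python with generators and a small dispatcher. This is only a sketch under assumptions: the run function, the name-to-body dictionary and the ping/pong bodies are all hypothetical, and a transfer is expressed by yielding a (target, value) pair instead of calling a resume form.

```python
def run(bodies, start, value, max_transfers=10):
    """Dispatch symmetric transfers between generator-based coroutines."""
    live = {}          # name -> generator that has already been started
    current = start
    for _ in range(max_transfers):
        try:
            if current not in live:
                # First resume: the value becomes the parameter v.
                live[current] = bodies[current](value)
                request = next(live[current])
            else:
                # Later resume: the value is the result of the pending transfer.
                request = live[current].send(value)
        except StopIteration:
            return
        current, value = request


log = []

def ping(v):
    while True:
        log.append(("ping", v))
        v = yield ("pong", v + 1)   # transfer to pong, passing v + 1

def pong(v):
    while True:
        log.append(("pong", v))
        v = yield ("ping", v * 2)   # transfer back to ping, passing v * 2

run({"ping": ping, "pong": pong}, "ping", 1, max_transfers=4)
print(log)
```

The max_transfers bound exists only because the bodies are non-terminating, as the definition requires.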
This PEP discusses changes required to core Python in order to
efficiently support generators, microthreads and coroutines. It is
related to PEP 220, which describes how Python should be extended
to support these facilities. The focus of this PEP is strictly on
the changes required to allow these extensions to work.
While these PEPs are based on Christian Tismer's Stackless[1]
implementation, they do not regard Stackless as a reference
implementation. Stackless (with an extension module) implements
continuations, and from continuations one can implement
coroutines, microthreads (as has been done by Will Ware[2]) and
generators. But in more than a year, no one has found any other
productive use of continuations, so there seems to be no demand
for their support.
However, Stackless support for continuations is a relatively minor
piece of the implementation, so one might regard it as "a"
reference implementation (rather than "the" reference
implementation).
Background
Generators and coroutines have been implemented in a number of
languages in a number of ways. Indeed, Tim Peters has done pure
Python implementations of generators[3] and coroutines[4] using
threads (and a thread-based coroutine implementation exists for
Java). However, the horrendous overhead of a thread-based
implementation severely limits the usefulness of this approach.
Microthreads (a.k.a "green" or "user" threads) and coroutines
involve transfers of control that are difficult to accommodate in
a language implementation based on a single stack. (Generators can
be done on a single stack, but they can also be regarded as a very
simple case of coroutines.)
Real threads allocate a full-sized stack for each thread of
control, and this is the major source of overhead. However,
coroutines and microthreads can be implemented in Python in a way
that involves almost no overhead. This PEP, therefore, offers a
way for making Python able to realistically manage thousands of
separate "threads" of activity (vs. today's limit of perhaps dozens
of separate threads of activity).
Another justification for this PEP (explored in PEP 220) is that
coroutines and generators often allow a more direct expression of
an algorithm than is possible in today's Python.
Discussion
The first thing to note is that, while Python mingles
interpreter data (normal C stack usage) with Python data (the
state of the interpreted program) on the stack, the two are
logically separate. They just happen to use the same stack.
A real thread gets something approaching a process-sized stack
because the implementation has no way of knowing how much stack
space the thread will require. The stack space required for an
individual frame is likely to be reasonable, but stack switching
is an arcane and non-portable process, not supported by C.
Once Python stops putting Python data on the C stack, however,
stack switching becomes easy.
The fundamental approach of the PEP is based on these two
ideas. First, separate C's stack usage from Python's stack
usage. Secondly, associate with each frame enough stack space to
handle that frame's execution.
In the normal usage, Stackless Python has a normal stack
structure, except that it is broken into chunks. But in the
presence of a coroutine / microthread extension, this same
mechanism supports a stack with a tree structure. That is, an
extension can support transfers of control between frames outside
the normal "call / return" path.
Problems
The major difficulty with this approach is C calling Python. The
problem is that the C stack now holds a nested execution of the
byte-code interpreter. In that situation, a coroutine /
microthread extension cannot be permitted to transfer control to a
frame in a different invocation of the byte-code interpreter. If a
frame were to complete and exit back to C from the wrong
interpreter, the C stack could be trashed.
The ideal solution is to create a mechanism where nested
executions of the byte code interpreter are never needed. The easy
solution is for the coroutine / microthread extension(s) to
recognize the situation and refuse to allow transfers outside the
current invocation.
We can categorize code that involves C calling Python into two
camps: Python's implementation, and C extensions. And hopefully we
can offer a compromise: Python's internal usage (and C extension
writers who want to go to the effort) will no longer use a nested
invocation of the interpreter. Extensions which do not go to the
effort will still be safe, but will not play well with coroutines
/ microthreads.
Generally, when a recursive call is transformed into a loop, a bit
of extra bookkeeping is required. The loop will need to keep its
own "stack" of arguments and results since the real stack can now
only hold the most recent. The code will be more verbose, because
it's not quite as obvious when we're done. While Stackless is not
implemented this way, it has to deal with the same issues.
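As a small illustration of that bookkeeping, here is the same computation written recursively and as a loop with its own explicit stack. The nested-list "tree" and both function names are made up for this sketch; it has nothing to do with Stackless's actual code.

```python
def total_recursive(tree):
    """Sum the integers in a nested list, using the interpreter's call stack."""
    if isinstance(tree, int):
        return tree
    return sum(total_recursive(child) for child in tree)


def total_loop(tree):
    """Same result, but pending work lives in an explicit list-based stack."""
    stack = [tree]      # arguments still waiting to be processed
    total = 0           # accumulated result
    while stack:        # "are we done?" is now an explicit test
        item = stack.pop()
        if isinstance(item, int):
            total += item
        else:
            stack.extend(item)
    return total


small = [1, [2, [3, 4]], 5]
deep = 1
for _ in range(5000):   # nesting far deeper than the default recursion limit
    deep = [deep, 1]
print(total_recursive(small), total_loop(small), total_loop(deep))
```

The loop version handles the deeply nested input because its depth is limited by heap memory rather than by the call stack.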
In normal Python, PyEval_EvalCode is used to build a frame and
execute it. Stackless Python introduces the concept of a
FrameDispatcher. Like PyEval_EvalCode, it executes one frame. But
the interpreter may signal the FrameDispatcher that a new frame
has been swapped in, and the new frame should be executed. When a
frame completes, the FrameDispatcher follows the back pointer to
resume the "calling" frame.
So Stackless transforms recursions into a loop, but it is not the
FrameDispatcher that manages the frames. This is done by the
interpreter (or an extension that knows what it's doing).
The general idea is that where C code needs to execute Python
code, it creates a frame for the Python code, setting its back
pointer to the current frame. Then it swaps in the frame, signals
the FrameDispatcher and gets out of the way. The C stack is now
clean - the Python code can transfer control to any other frame
(if an extension gives it the means to do so).
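A toy model of this dispatching idea can be written in Python itself. The sketch below is an assumption-laden analogy, not the Stackless implementation: Frame and dispatch are hypothetical names, a frame's code is a generator, and "calling" means yielding the callee so that the dispatcher loop, rather than a nested interpreter call, runs it.

```python
class Frame:
    """A unit of execution plus a back pointer to whoever 'called' it."""
    def __init__(self, body, back=None):
        self.body = body    # a generator holding the suspended code
        self.back = back    # frame to resume when this one completes


def dispatch(frame):
    """Run frames in a flat loop; the Python call stack never grows."""
    value = None
    while frame is not None:
        try:
            callee = frame.body.send(value)   # run until the frame asks for a call
        except StopIteration as stop:
            value = stop.value                # frame completed: pass its result ...
            frame = frame.back                # ... to the frame found via the back pointer
            continue
        frame = Frame(callee, back=frame)     # swap in the new frame
        value = None
    return value


def fact(n):
    if n <= 1:
        return 1
    return n * (yield fact(n - 1))    # "call" by yielding the callee to the dispatcher


def depth(n):
    if n == 0:
        return 0
    return 1 + (yield depth(n - 1))


result = dispatch(Frame(fact(5)))
print(result, dispatch(Frame(depth(20000))))
```

Since every frame is a heap object, a chain 20000 calls deep runs without touching the recursion limit, which mirrors the point about recursion depth being bounded by memory instead of the C stack.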
In the vanilla case, this magic can be hidden from the programmer
(even, in most cases, from the Python-internals programmer). Many
situations present another level of difficulty, however.
The map builtin function involves two obstacles to this
approach. It cannot simply construct a frame and get out of the
way, not just because there's a loop involved, but each pass
through the loop requires some "post" processing. In order to play
well with others, Stackless constructs a frame object for map
itself.
Most recursions of the interpreter are not this complex, but
fairly frequently, some "post" operations are required. Stackless
does not fix these situations because of the amount of code changes
required. Instead, Stackless prohibits transfers out of a nested
interpreter. While not ideal (and sometimes puzzling), this
limitation is hardly crippling.
Advantages
For normal Python, the advantage to this approach is that C stack
usage becomes much smaller and more predictable. Unbounded
recursion in Python code becomes a memory error, instead of a
stack error (and thus, in non-Cupertino operating systems,
something that can be recovered from). The price, of course, is
the added complexity that comes from transforming recursions of
the byte-code interpreter loop into a higher order loop (and the
attendant bookkeeping involved).
The big advantage comes from realizing that the Python stack is
really a tree, and the frame dispatcher can transfer control
freely between leaf nodes of the tree, thus allowing things like
microthreads and coroutines.
Date: 1998/07/24
Since this has come up a couple of times, I finally dug thru
my files and found a few traces of this experiment and put
the files on <http://galen.med.virginia.edu/~sdm7g/Python/CO>
BTW: I considered the experiment a partial failure, because it
would not work to implement full-coroutines ( because ceval.c
is recursive and some of the state is implicit in the C calling
stack. ) but I believe it did successfully implement semi-coroutines
( i.e. Icon-like generators ) -- which seems to be all that
some folks want.
The mods were done to python-1.0.2
I haven't looked at how hard this would be to update to 1.5.1
--- README ---
There were experimental mods to python-1.0.2 to make frameobjects
into resumable continuations. ( put here for the curious and just
in case anyone wants to try a similar experiment with a more
current release. )
the files modified were frameobject.h and ceval.c
A new opcode was added for SUSPEND, but no new suspend statement
was added to the parser. I used a python disassembler/assembler
to change specific RETURN opcodes into SUSPENDs.
( various .py files here were hacked in that manner and used for testing. )
co.txt & cont.semantics were the beginning of some notes on
various thread control issues.
py-co is part of the mailing list correspondence between me, Tim and Guido
that started me on that experiment.
More notes on this later when I find time.
- Steve Majewski <sdm7g@Virginia.EDU>
Python Microthreads
Will Ware, Christian Tismer, Just van Rossum, Mike Fletcher
Microthreads are useful when you want to program many behaviors happening
simultaneously. Simulations and games often want to model the simultaneous and
independent behavior of many people, many businesses, many monsters, many
physical objects, many spaceships, and so forth. With microthreads, you can
code these behaviors as Python functions.
You will still need to think a teeny bit about the fact that context switching
occurs between threads, hence this documentation. The microthread package uses
Stackless Python.
Microthreads switch faster and use much less memory than OS threads. You
can run thousands of microthreads simultaneously. Additionally, the
microthread library includes a rich set of objects for inter-thread
communication, synchronization, and execution control.
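The flavor of this programming style can be approximated with ordinary generators and a round-robin scheduler. This is a hedged sketch, not the microthread package's API (which is not shown here): scheduler and monster are hypothetical names, and a bare yield plays the role of a context switch.

```python
from collections import deque


def scheduler(tasks):
    """Run generator-based tasks round-robin until all have finished."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)            # run the task up to its next yield
        except StopIteration:
            continue              # task finished; drop it
        ready.append(task)        # otherwise queue it for another turn


def monster(name, steps, log):
    """One independent behavior; each yield hands control back to the scheduler."""
    for step in range(steps):
        log.append((name, step))
        yield


log = []
scheduler([monster("orc", 2, log), monster("troll", 3, log)])
print(log)
```

Each task costs one suspended generator frame instead of an OS thread, which is why thousands of them are practical.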
Some sort of resolution to Stackless Python seems likely for 2.1. Guido is
inclined to take a solution for 90% of the problems: "I still think that the
current Stackless implementation is too complex, and that continuations aren't
worth the insanity they seem to require (or cause :-), but that microthreads
and coroutines *are* worth having and that something not completely unlike
Stackless will be one day the way to get there..." He then went on to post a
strawman API for microthreads:
http://www.python.org/pipermail/python-dev/2000-November/017216.html
Christian Tismer agreed with him that continuations aren't really
necessary. "I'm happy to toss continuations for core Python, if we can find
the right building blocks for coro/gen/uthreads. I think Guido comes quite
near this, already."
http://www.python.org/pipermail/python-dev/2000-November/017252.html
Traditional pattern matching languages, like SNOBOL4,
represent patterns as an abstract data type. A special and specific pattern
evaluation routine traverses both the pattern structure and the subject text,
applying evaluation rules as indicated by the pattern, and advancing or
regressing depending on their outcome.
A co-expression can be thought of as an independent,
encapsulated thread-like context, where the results of the expression can be
picked off one at a time. Let us consider an example: suppose you are writing
a program that generates code, and you need something that will generate names
for you. This expression will generate names:
"name" || seq()
(seq produces an infinite sequence of integers, by default
starting at 1.) Of course, an expression exists at one point in the code; we
need to separate the evaluation of the expression from its textual position in
the program. We do this by creating a co-expression:
c := create ("name" || seq())
Now wherever a label name is desired, it can be obtained by activating
the co-expression c:
tempvar_name := @c
After a co-expression has produced all its results, further
evaluation with @ will fail. The ^ operator produces
a new co-expression with the same expression as its argument, but `rewound' to
the beginning.
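A rough Python analogue of this Icon example uses a generator object in place of the co-expression. The mapping is approximate and is an assumption of this sketch: make_names is a hypothetical helper standing in for create, next() stands in for @, and calling the helper again stands in for ^.

```python
import itertools


def make_names():
    """Return a fresh generator of name1, name2, ... (the role of `create`)."""
    return ("name" + str(n) for n in itertools.count(1))


c = make_names()
first = next(c)          # like @c
second = next(c)         # each activation picks off the next result

c2 = make_names()        # like ^c: same expression, rewound to the beginning
restart = next(c2)

finite = iter(["only"])
next(finite)
after_end = next(finite, None)   # an exhausted generator yields no more results
print(first, second, restart, after_end)
```

Where Icon's activation fails on an exhausted co-expression, Python raises StopIteration unless next() is given a default, as in the last call.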
A while ago there was a (crossposted) thread about pipes: most of the
discussion was about Perl, but it caused me to repost my redirect.py modules (
to show how *easy* the problem was in Python! ;-), and also to think more about
the utility of having a more general 2-way interface.
The tostring/tolines functions in redirect are able to package the printed
output from a function and return it as a python object. This gives some of the
capability of unix pipes without having to actually pipe thru the OS - the
non-trivial semantic difference being that redirect does not produce any
concurrency - the inner function must run to completion before the redirecting
wrapper can return the collected output to the calling function.
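The original redirect.py is not reproduced here, so the following is only a guess at what tostring and tolines might look like in modern Python, built on contextlib.redirect_stdout. The signatures are assumptions; report is a made-up function used for the demonstration.

```python
import contextlib
import io


def tostring(func, *args, **kwargs):
    """Call func and return everything it printed as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)    # must run to completion before we can return
    return buffer.getvalue()


def tolines(func, *args, **kwargs):
    """Same as tostring, but split into a list of lines."""
    return tostring(func, *args, **kwargs).splitlines()


def report(n):
    for i in range(n):
        print("line", i)


text = tostring(report, 2)
lines = tolines(report, 2)
print(repr(text), lines)
```

As the post notes, nothing here is concurrent: the consumer sees no output until the producer has finished.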
Actual concurrency of independent threads is not required, though: coroutines
would be sufficient since the read/write is a blocking operation that could be
an implicit transfer of control.
I know Guido is not as enamored of coroutines as old Icon-ers like Tim and I.
He has (more than once, I think) stated the view that anything you could do with
coroutines, you can do with classes. This view is certainly true from a
theoretical POV - complementarity of the object=method+data and closure=function+environment
and all that, and coroutines are just a subset of continuations, etc., etc. But
the difference is that saving of state into an object's instance variables has to
be explicitly programmed, where coroutines and continuations are implicit. ( And
being able to turn a pair of read/writes in a pair of functions into a coroutine
return/continue, mediated transparently by a file-like python object would seem
to be a potential big win for code-reuse. )
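The explicit-versus-implicit distinction is easy to see by writing one small iterator both ways. Both versions below are illustrative sketches with made-up names; the generator form uses the yield statement Python gained later.

```python
class CountdownClass:
    """State saved by hand in instance variables."""
    def __init__(self, n):
        self.n = n
        self.finished = False     # explicit "where was I?" flag

    def __iter__(self):
        return self

    def __next__(self):
        if self.n > 0:
            value = self.n
            self.n -= 1
            return value
        if not self.finished:
            self.finished = True
            return "liftoff"
        raise StopIteration


def countdown_gen(n):
    """State kept implicitly: the suspended frame remembers n and the position."""
    while n > 0:
        yield n
        n -= 1
    yield "liftoff"


from_class = list(CountdownClass(3))
from_gen = list(countdown_gen(3))
print(from_class, from_gen)
```

The class needs a flag to remember which phase it is in, while the generator's position in the code is itself the state.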
It is possible to program a pair of classes that cooperate with each other,
but there is no way to make a class that can link two arbitrary functions by
their read/write methods. The flow of control is not symmetrical: one of the
functions must RETURN before the other can continue.
The fact that Python frames are (heap allocated) objects and not just pushed
on the stack, means that they can persist whether that
function is active or not. In fact, an exception passes you a link to a whole
bunch of frameobjects in its traceback object.
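This persistence is still observable in current CPython. The short sketch below walks a traceback and reads a local variable out of a frame whose function has already exited; the function names are invented for the example.

```python
def inner():
    secret = 42
    raise ValueError("boom")


def outer():
    inner()


def frames_of_failure():
    """Return the function names in a traceback and a local from the last frame."""
    try:
        outer()
    except ValueError as exc:
        tb = exc.__traceback__        # linked list of traceback entries
    names, innermost = [], None
    while tb is not None:
        names.append(tb.tb_frame.f_code.co_name)
        innermost = tb.tb_frame       # frame object outlives the call
        tb = tb.tb_next
    return names, innermost.f_locals["secret"]


names, secret = frames_of_failure()
print(names, secret)
```

The frame for inner is no longer executing, yet its locals are intact because the frame is a heap object kept alive by the traceback.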
A frame object appears to contain all of the necessary state to serve as a
continuation except that the interpreter's stackpointer and blockpointer are not
preserved, and when a stack frame is unwound by an exception, the stack is
emptied by popping and DECREF-ing the contents.
It looks to me that it would not be difficult to give Python continuations: if
a stackpointer and blockpointer are added to the frameobject, all that is
missing is an operation to package a frameobject-continuation, and a 'resume'
operation ( a change to ceval.c ) that takes a frame/continuation object instead
of creating a new frameobject.
I am far from a complete understanding of ceval and what happens to frame and
blocks in Python, so I may be missing some obvious
problem ( And all the details of *exactly* how to do this aren't completely
clear to me yet, either ).
Comments ?
What I initially wanted was some sort of cooperative coroutine control
between two functions mediated by a file-like object that used its read/write
methods to alternate control on the two functions. Except for inet servers, most
of the cases where I might want threads seem to actually fit a blocking
coroutine model better than parallel threads, and in the cases where it doesn't,
I'm quite happy to use unix processes for threading.
I would like to have threads on dos/mac, but I don't know if any of the portable
(on unix) threads packages work on those platforms, or how much work it would be
to get really portable threads written in C to work in Python. But since the
easiest method of adding coroutine control appears to be the more general case
of adding continuations to Python, I would think that this also gives a possible
method of adding preemptive threads to the Python interpreter in a portable
manner by coding it into the virtual machine rather than the real machine.
Comments ?
[ Encouraging or _discouraging_ comments invited! If I'm blind to some fatal
flaw or don't understand what's going on in ceval, I'd be happy to hear about
it before I actually waste time in trying to implement it! ]
-- Steve Majewski (804-982-0831) <sdm7g@Virginia.EDU> --
-- UVA Department of Molecular Physiology and Biological Physics --
-- Box 449 Health Science Center Charlottesville,VA 22908 --
[ "Cognitive Science is where Philosophy goes when it dies ... if it hasn't been
good!" - Jerry Fodor ]
Co-routines are of particular use in file processing and
simulation applications.
In file processing, co-routines enable the programmer to
separate records or characters in time, rather than in space (i.e. physical
file position), and view his program in terms of data flow from module to
module, rather than flow of control.
In simulation, co-routines allow a more natural modelling of
a set of co-operating processes. It should be stressed that although
co-routines share a number of properties with asynchronous processes
(preservation of local variables etc.), which make modelling easy, co-routines
are not separate processes, and the user must still manage flow of control
between them.
For a formal definition of a co-routine and a full
explanation of the fundamental concepts, the reader is referred to the
technical literature (Knuth, CACM etc.).
Preserving State:
The current state of execution of a function can be largely
described by its program instruction counter (IC) and its stack frame and
pointer. The IC gives the location where execution is taking place, and the
stack frame provides the values of all arguments and other storage local to
the function. If the IC, stack frame, and stack pointer are preserved when a
function is suspended, the function may be resumed at the exact point of
suspension by restoring these values.
In regular functions, the current state of the function is
lost when the function returns; the only way the state may be preserved when
transferring to another function is to call the other function as a subroutine
of the first. Then, of course, the state of the subroutine must be lost when
it returns to resume execution in its caller.
In a like manner, one can conceive of a group of functions,
like those that constitute a program, which can only preserve the group state
by calling other groups of functions (programs) as sub-programs. The state of
a sub-program, as with the state of a called function, vanishes when the
sub-program returns to its caller. The states of both the caller and callee
cannot be easily preserved.
Co-routines, on the other hand, are groups of functions
which have been given a mechanism for preserving their states before
transferring to other co-routines. Transfers among co-routines do not involve
the regular caller/callee hierarchy, and the states of all co-routines may be
thought of as existing concurrently.
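In CPython, a suspended generator makes this preserved state directly visible. The sketch below inspects the frame of a paused generator; gi_frame, f_locals and f_lasti are CPython introspection attributes, and counter is a made-up example function.

```python
def counter():
    n = 0
    while True:
        n += 1
        yield n


gen = counter()
next(gen)
next(gen)

saved_locals = dict(gen.gi_frame.f_locals)   # local storage kept while suspended
saved_position = gen.gi_frame.f_lasti        # bytecode offset, the analogue of the IC

third = next(gen)                            # resumes exactly where it stopped
print(saved_locals, saved_position, third)
```

The saved offset and locals correspond to the instruction counter and stack frame described above; resuming simply continues from them.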
We've spent the last several pages on almost
microscopic details of process behavior. Rather than continue our descent into
the murky depths, we'll revert to a higher-level view of processes.
Earlier in this chapter, we covered ways of
controlling multiple simultaneous jobs within an interactive login session;
now we'll consider multiple process control within shell programs. When two
(or more) processes are explicitly programmed to run simultaneously and
possibly communicate with each other, we call them
coroutines.
This is actually nothing new: a pipeline is an
example of coroutines. The shell's pipeline construct encapsulates a fairly
sophisticated set of rules about how processes interact with each other. If we
take a closer look at these rules, we'll be better able to understand other
ways of handling coroutines, most of which turn out to be simpler than
pipelines.
When you invoke a simple pipeline, say
ls | more, the shell invokes a series of UNIX
primitive operations, a.k.a. system calls. In
effect, the shell tells UNIX to do the following things; in case you're
interested, we include in parentheses the actual system call used at each
step:
Create two subprocesses, which we'll call P1 and P2 (the fork system call).
Set up I/O between the processes so that P1's standard output feeds into P2's standard input (pipe).
Start /bin/ls in process P1 (exec).
Start /bin/more in process P2 (exec).
Wait for both processes to finish (wait).
You can probably imagine how the above steps
change when the pipeline involves more than two processes.
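The steps above can be sketched in Python with the subprocess module, which performs the fork, pipe and exec calls internally on UNIX. The pipeline helper is a hypothetical name, and the demonstration substitutes two tiny Python one-liners for ls and more so that it runs without a terminal.

```python
import subprocess
import sys


def pipeline(producer_cmd, consumer_cmd):
    """Run producer | consumer and return the consumer's output as bytes."""
    # Create P1 with its standard output connected to a new pipe.
    p1 = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    # Create P2 with its standard input reading from that pipe.
    p2 = subprocess.Popen(consumer_cmd, stdin=p1.stdout, stdout=subprocess.PIPE)
    p1.stdout.close()             # parent drops its copy so P1 sees a closed pipe if P2 exits
    output = p2.communicate()[0]  # wait for P2 and collect what it wrote
    p1.wait()                     # wait for P1 as well
    return output


producer = [sys.executable, "-c", "print('hello')"]
consumer = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]
result = pipeline(producer, consumer)
print(result)
```

Extending this to three or more processes means chaining each new process's stdin to the previous one's stdout in the same way.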
Now let's make things simpler. We'll see how to
get multiple processes to run at the same time if the processes do not need to
communicate. For example, we want the processes dave
and bob to run as coroutines, without
communication, in a shell script. Our initial solution would be this:
dave &
bob
Assume for the moment that
bob is the last command in the script. The above
will work, but only if dave finishes first. If dave is still running when the script
finishes, then it becomes an orphan, i.e., it enters
one of the "funny states" we mentioned earlier in this chapter. Never mind the
details of orphanhood; just believe that you don't want this to happen, and if
it does, you may need to use the "runaway process" method of stopping it,
discussed earlier in this chapter.
There is a way of making sure the script
doesn't finish before dave does: the built-in
command wait. Without arguments,
wait simply waits until all background jobs
have finished. So to make sure the above code behaves properly, we would add
wait, like this:
dave &
bob
wait
Here, if bob
finishes first, the parent shell will wait for dave
to finish before finishing itself.
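For comparison, here is a minimal Python sketch of the same "start one job in the background, run the other, then wait" pattern. The name run_both and the two stand-in commands are assumptions for illustration; dave and bob in the text are arbitrary programs.

```python
import subprocess
import sys

def run_both(dave_argv, bob_argv):
    """Run dave in the background and bob in the foreground, then wait for dave."""
    dave = subprocess.Popen(dave_argv)                 # dave &
    bob_status = subprocess.run(bob_argv).returncode   # bob
    dave_status = dave.wait()                          # wait
    return dave_status, bob_status

# Hypothetical stand-ins: dave runs longer than bob and exits with status 3.
DAVE = [sys.executable, "-c", "import sys, time; time.sleep(0.2); sys.exit(3)"]
BOB = [sys.executable, "-c", "pass"]

if __name__ == "__main__":
    print(run_both(DAVE, BOB))
```

Because the parent calls wait() before returning, the background job cannot be left behind as an orphan even when the foreground job finishes first.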
If your script has more than one background
job and you need to wait for specific ones to finish, you can give
wait the same type of job argument (with a
percent sign) as you would use with kill,
fg, or bg.
However, you will probably find that
wait without arguments suffices for all
coroutines you will ever program. Situations in which you would need to wait
for specific background jobs are quite complex and beyond the scope of this
book.
In fact, you may be wondering why you would
ever need to program coroutines that don't communicate with each other. For
example, why not just run bob after
dave in the usual way? What advantage is there
in running the two jobs simultaneously?
If you are running on a computer with one
processor (CPU), then there is a performance advantage, but only if you have
the bgnice option turned off (see
Chapter 3, Customizing Your Environment), and even then only in certain
situations.
Roughly speaking, you can characterize a
process in terms of how it uses system resources in three ways: whether it
is CPU intensive (e.g., does lots of number
crunching), I/O intensive (does a lot of reading
or writing to the disk), or interactive (requires
user intervention).
We already know from
Chapter 1 that it makes no sense to run an interactive job in the
background. But apart from that, the more two or more processes differ with
respect to these three criteria, the more advantage there is in running them
simultaneously. For example, a number-crunching statistical calculation
would do well when running at the same time as a long, I/O-intensive
database query.
On the other hand, if two processes use
resources in similar ways, it may even be less efficient to run them at the
same time than it would be to run them sequentially. Why? Basically, because
under such circumstances, the operating system often has to "time-slice" the
resource(s) in contention.
For example, if both processes are "disk
hogs," the operating system may enter a mode where it constantly switches
control of the disk back and forth between the two competing processes; the
system ends up spending at least as much time doing the switching as it does
on the processes themselves.
This phenomenon is known as thrashing; at its most
severe, it can cause a system to come to a virtual standstill. Thrashing is
a common problem; system administrators and operating system designers both
spend lots of time trying to minimize it.
But if you have a computer with multiple CPUs
(such as a Pyramid, Sequent, or Sun MP), you should be less concerned about
thrashing. Furthermore, coroutines can provide dramatic increases in speed
on this type of machine, which is often called a
parallel computer; analogously, breaking up a process into coroutines
is sometimes called parallelizing the job.
Normally, when you start a background job on
a multiple-CPU machine, the computer will assign it to the next available
processor. This means that the two jobs are actually, not just
metaphorically, running at the same time.
In this case, the running time of the
coroutines is essentially equal to that of the longest-running job plus a
bit of overhead, instead of the sum of the run times of all processes
(although if the CPUs all share a common disk drive, the possibility of
I/O-related thrashing still exists). In the best case (all jobs having the
same run time and no I/O contention) you get a speedup factor equal to the
number of jobs.
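The arithmetic behind that claim can be written down directly. The sketch below assumes one CPU per job and no contention for the disk; the function name ideal_speedup and the overhead parameter are illustrative, not taken from the text.

```python
def ideal_speedup(run_times, overhead=0.0):
    """Ratio of sequential to parallel running time for independent jobs.

    Assumes every job gets its own CPU and that the jobs do not compete
    for any other resource.
    """
    sequential = sum(run_times)            # jobs run one after another
    parallel = max(run_times) + overhead   # longest job plus a bit of overhead
    return sequential / parallel

if __name__ == "__main__":
    print(ideal_speedup([10, 10, 10, 10]))  # four equal jobs
    print(ideal_speedup([30, 10]))          # one job dominates
```

With four equal jobs the ratio is the number of jobs; when one job dominates, the gain shrinks toward 1.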
Parallelizing a program is often not easy;
there are several subtle issues involved and there's plenty of room for
error. Nevertheless, it's worthwhile to know how to parallelize a shell
script whether or not you have a parallel machine, especially since such
machines are becoming more and more common.
We'll show how to do this, and give you an
idea of some of the problems involved, by means of a simple task whose
solution is amenable to parallelization.
Task 8.3
Write a utility that allows you to make
multiple copies of a file at the same time.
We'll call this script
mcp. The command mcp filename dest1 dest2 ... should copy
filename to all of the destinations given. The
code for this should be fairly obvious:
file=$1
shift
for dest in "$@"; do
cp "$file" "$dest"
done
Now let's say we have a parallel computer
and we want this command to run as fast as possible.
To parallelize this script,
it's a simple matter of firing off the cp
commands in the background and adding a wait
at the end:
file=$1
shift
for dest in "$@"; do
cp "$file" "$dest" &
done
wait
Simple, right? Well, there is one little
problem: what happens if the user specifies duplicate destinations? If
you're lucky, the file just gets copied to the same place twice.
Otherwise, the identical cp commands will
interfere with each other, possibly resulting in a file that contains two
interspersed copies of the original file. In contrast, if you give the
regular cp command two arguments that point to
the same file, it will print an error message and do nothing.
To fix this problem, we would have to write
code that checks the argument list for duplicates. Although this isn't too
hard to do (see the exercises at the end of this chapter), the time it
takes that code to run might offset any gain in speed from
parallelization; furthermore, the code that does the checking detracts
from the simple elegance of the script.
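As a rough illustration of what the duplicate check costs, here is a Python sketch of a parallel mcp that removes repeated destinations before copying. It uses threads rather than background processes, which is a substitution made for brevity; the function name mcp mirrors the script in the text, and the decision to raise an error when the source is also a destination is an assumption.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor

def mcp(source, *destinations):
    """Copy source to every distinct destination concurrently.

    Returns the list of destination paths actually written.
    """
    # Compare resolved paths so that two spellings of one file count once;
    # dict.fromkeys drops duplicates while keeping the original order.
    unique = list(dict.fromkeys(os.path.realpath(d) for d in destinations))
    if os.path.realpath(source) in unique:
        raise ValueError("source and destination are the same file")
    if not unique:
        return []
    with ThreadPoolExecutor(max_workers=len(unique)) as pool:
        # list() forces every copy to finish and re-raises any copy error.
        list(pool.map(lambda dest: shutil.copyfile(source, dest), unique))
    return unique
```

The check is a single pass over the argument list, so for any file of real size its cost is small next to the copies themselves.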
As you can see, even a seemingly trivial
parallelization task has problems resulting from multiple processes having
concurrent access to a given system resource (a file in this case). Such
problems, known as concurrency control issues,
become much more difficult as the complexity of the application increases.
Complex concurrent programs often have much more code for handling the
special cases than for the actual job the program is supposed to do!
Therefore it shouldn't surprise you that
much research has been and is being done on parallelization, the ultimate
goal being to devise a tool that parallelizes code automatically. (Such
tools do exist; they usually work in the confines of some narrow subset of
the problem.) Even if you don't have access to a multiple-CPU machine,
parallelizing a shell script is an interesting exercise that should
acquaint you with some of the issues that surround coroutines.
Now that we have seen how to program
coroutines that don't communicate with each other, we'll build on that
foundation and discuss how to get them to communicate, in a more
sophisticated way than with a pipeline. The Korn shell has a set of features
that allow programmers to set up two-way communication between coroutines.
These features aren't included in most Bourne shells.
If you start a background process by
appending |& to a command instead of
&, the Korn shell will set up a special two-way
pipeline between the parent shell and the new background process.
read -p in the parent shell reads a line of the
background process' standard output; similarly,
print -p in the parent shell feeds into the standard input of the
background process.
Figure 8.2 shows how this works.
This scheme has some intriguing
possibilities. Notice the following things: first, the parent shell
communicates with the background process independently of its own standard
input and output. Second, the background process need not have any idea that
a shell script is communicating with it in this manner. This means that the
background process can be any pre-existing program that uses its standard
input and output in normal ways.
Here's a task that shows a simple example:
Task 8.4
You would like to have an online
calculator, but the existing UNIX utility dc(1)
uses Reverse Polish Notation (RPN), a la
Hewlett-Packard calculators. You'd rather have one that works like the
$3.95 model you got with that magazine subscription. Write a calculator
program that accepts standard algebraic notation.
The objective here is to write the program
without re-implementing the calculation engine that
dc already has; in other words, to write a program that translates
algebraic notation to RPN and passes the translated line to
dc to do the actual calculation. [12]
[12] The utility bc(1)
actually provides similar functionality.
We'll assume that the function
alg2rpn, which does the translation, already
exists: given a line of algebraic notation as argument, it prints the RPN
equivalent on the standard output. If we have this, then the calculator
program (which we'll call adc) is very simple:
dc |&
while read line'?adc> '; do
print -p "$(alg2rpn $line)"
read -p answer
print " = $answer"
done
The first line of this code starts
dc as a coroutine with a two-way pipe. Then the
while loop prompts the user for a line and
reads it until the user types [CTRL-D] for
end-of-input. The loop body converts the line to RPN, passes it to
dc through the pipe, reads
dc's answer, and prints it after an equal sign. For example:
Actually, as you may have noticed, it's not
entirely necessary to have a two-way pipe with dc.
You could do it with a standard pipe and let dc
do its own output, like this:
{ while read line'?adc> '; do
print "$(alg2rpn $line)"
done
} | dc
The only difference from the above is the
lack of equal sign before each answer is printed.
But: what if you wanted to make a fancy
graphical user interface (GUI), like the xcalc
program that comes with many X Window System installations? Then, clearly,
dc's own output would not be satisfactory, and
you would need full control of your own standard output in the parent
process. The user interface would have to capture dc's
output and display it in the window properly. The two-way pipe is an
excellent solution to this problem: just imagine that, instead of
print " = $answer ", there is a call to a
routine that displays the answer in the "readout" section of the
calculator window.
All of this suggests that the two-way pipe
scheme is great for writing shell scripts that interpose a software layer
between the user (or some other program) and an existing program that uses
standard input and output. In particular, it's great for writing new
interfaces to old, standard UNIX programs that expect line-at-a-time,
character-based user input and output. The new interfaces could be GUIs,
or they could be network interface programs that talk to users over links
to remote machines. In other words, the Korn shell's two-way pipe
construct is designed to help develop very up-to-date software!
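The same two-way arrangement can be sketched in Python with subprocess. The class name Coprocess and its methods print_p and read_p are hypothetical names chosen to echo the Korn shell commands, and the child here is a tiny stand-in program that adds the integers on each input line, not dc itself.

```python
import subprocess
import sys

class Coprocess:
    """A background child connected to the parent by a two-way pipe."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(argv, stdin=subprocess.PIPE,
                                     stdout=subprocess.PIPE,
                                     text=True, bufsize=1)

    def print_p(self, line):
        """Feed one line into the child's standard input."""
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()

    def read_p(self):
        """Read one line of the child's standard output."""
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        """Send end-of-input, wait for the child, and return its exit status."""
        self.proc.stdin.close()
        status = self.proc.wait()
        self.proc.stdout.close()
        return status

# Hypothetical line-at-a-time child: prints the sum of the integers on each line.
_CHILD_SOURCE = (
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    print(sum(int(t) for t in line.split()), flush=True)\n"
)
SUMMER = [sys.executable, "-u", "-c", _CHILD_SOURCE]

if __name__ == "__main__":
    calc = Coprocess(SUMMER)
    calc.print_p("1 2 3")
    print(" =", calc.read_p())
    calc.close()
```

The child has to flush its output after every line; a program that buffers its output when writing to a pipe would leave read_p waiting.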
Before we leave the subject of coroutines,
we'll complete the circle by showing how the two-way pipe construct compares
to regular pipelines. As you may have been able to figure out by now, it is
possible to program a standard pipeline by using |&
with print -p.
This has the advantage of reserving the
parent shell's standard output for other use. The disadvantage is that the
child process' standard output is directed to the two-way pipe: if the
parent process doesn't read it with read -p,
then it's effectively lost.
A multiphase algorithm in which the phases are linked by
temporary files (or arrays) can be reduced to a single-pass algorithm using
coroutines.
Coroutines are defined and described in detail in Knuth
(Volume I) and most other modern books on algorithmics. Under IRIX, you can
write coroutines in a natural way using any one of three models:
The UNIX model of forked processes that communicate using
pipes.
The POSIX threads model using POSIX message queues to
communicate.
The MPI (or PVM) model of cooperating processes
exchanging messages.
Coroutines are subroutines, with neither the
caller nor the callee being "in charge". Instead, they allow
program-controlled interleaving of instructions generated by both. Suppose A
calls B. Then B wants to allow A to perform some more computation. B can
"resume A", which then runs until it "resumes B". Then A can execute until it
needs data from B, which might produce part of that data, and resume A, to
examine or compute with the part produced so far. Coroutines have been
exploited in the past in compilers, where the "parser" asks the "lexer" to run
until the lexer has to stop (say at end of line). The lexer then resumes the
parser to process that line's data, and is itself resumed to continue reading
input characters. The text also shows an example of a tree-comparison problem
solved logically by coroutines. Their advantage is that the cooperative
behavior allows the "high-level" program to terminate the computation early,
before the companion routine "completes" its assigned task. I have also used
them to simulate parallel computation, when I want to build my own control
over the task scheduling process.
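Python generators give a convenient way to try out the tree-comparison idea. The sketch below is one common formulation (often called the "same fringe" problem), not necessarily the one in the text referred to above; trees are represented as nested tuples, and the names fringe and same_fringe are illustrative.

```python
from itertools import zip_longest

def fringe(tree):
    """Yield the leaves of a nested-tuple tree from left to right, one at a time."""
    if isinstance(tree, tuple):
        for child in tree:
            yield from fringe(child)
    else:
        yield tree

def same_fringe(a, b):
    """Return True if both trees have the same leaves in the same order."""
    missing = object()  # marks the case where one tree runs out of leaves first
    for x, y in zip_longest(fringe(a), fringe(b), fillvalue=missing):
        if x != y:
            return False  # stop early; neither traversal runs to completion
    return True

if __name__ == "__main__":
    print(same_fringe((1, (2, 3)), ((1, 2), 3)))
    print(same_fringe((1, 2), (1, 3)))
```

Each traversal is suspended after producing one leaf, so the comparison can stop at the first mismatch without walking the rest of either tree.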
As an interesting exercise, the text shows
how coroutines can be simulated in C, using C's "setjmp()" and "longjmp()"
library procedures. These procedures are intended for use in setting
exception-handler routines. However, they have the property that they create
concrete realizations of a "stopped" task -- an instruction counter, along
with a variable reference context is stored when a setjmp occurs, and is
resumed when a longjmp to the saved item is performed. The longjmp(Buf,
Return) causes the setjmp(Buf) to return (again), this time returning value
Return, instead of the 0 setjmp(Buf) returns when it is called.
A while ago there was a (crossposted) thread about
pipes: most of the discussion was about Perl, but it caused me to repost my
redirect.py modules ( to show how *easy* the problem was in Python! ;-),
and also to think more about the utility of having a more general 2-way
interface.
The tostring/tolines functions in redirect are able to
package the printed output from a function and return it as a python object.
This gives some of the capability of unix pipes without having to actually
pipe thru the OS - the non-trivial semantic difference being that redirect
does not produce any concurrency - the inner function must run to completion
before the redirecting wrapper can return the collected output to the calling
function.
Actual concurrency of independent threads is not required,
though: coroutines would be sufficient since the read/write is a blocking
operation that could be an implicit transfer of control.
I know Guido is not as enamored of coroutines as old Icon-ers
like Tim and I. He has (more than once, I think) stated the view that anything
you could do with coroutines, you can do with classes. This view is certainly
true from a theoretical POV - complementarity of the object=method+data and
closure=function+environment and all that, and coroutines are just a subset of
continuations, etc., etc. But the difference is that saving of state into an
object's instance variables has to be explicitly programmed, where coroutines
and continuations are implicit. ( And being able to turn a pair of read/writes
in a pair of functions into a coroutine return/continue, mediated
transparently by a file-like python object would seem to be a potential big
win for code-reuse. )
It is possible to program a pair of classes that cooperate
with each other, but there is no way to make a class that can link two
arbitrary functions by their read/write methods. The flow of control is not
symmetrical: one of the functions must RETURN before the other can
continue.
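Generators, which were added to Python after this message was written, cover part of this wish. The sketch below is an illustration under that assumption, not the redirect.py module mentioned above: a hypothetical file-like class, GeneratorReader, whose readline() transfers control to a producer until the producer yields its next line.

```python
class GeneratorReader:
    """File-like object whose readline() resumes a producer generator."""

    def __init__(self, producer):
        self._producer = producer

    def readline(self):
        # Each read runs the producer up to its next yield; "" means end of input.
        return next(self._producer, "")

    def __iter__(self):
        return iter(self.readline, "")

def produce(log):
    """Producer: 'writes' a line by yielding it, then waits to be resumed."""
    for n in range(3):
        log.append(f"produce {n}")
        yield f"line {n}\n"

def consume(infile, log):
    """Consumer: an ordinary function that only knows it was given a file."""
    for line in infile:
        log.append(f"consume {line.strip()}")

if __name__ == "__main__":
    events = []
    consume(GeneratorReader(produce(events)), events)
    print(events)
```

The consumer is unmodified file-reading code, but the producer still has to be written as a generator, so the arrangement is not fully symmetrical in the sense the message asks for.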
The fact that Python frames are (heap allocated) objects and
not just pushed on the stack, means that they can persist whether that
function is active or not. In fact, an exception passes you a link to a whole
bunch of frameobjects in its traceback object.
A frame object appears to contain all of the necessary state to
serve as a continuation except that the interpreter's stackpointer and
blockpointer are not preserved, and when a stack frame is unwound by an
exception, the stack is emptied by popping and DECREF-ing the contents.
It looks to me that it would not be difficult to give Python
continuations: if a stackpointer and blockpointer are added to the frameobject,
all that is missing is an operation to package a frameobject-continuation, and
a 'resume' operation ( a change to ceval.c ) that takes a frame/continuation
object instead of creating a new frameobject.
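In present-day Python, a suspended generator shows the kind of persistent frame discussed here: the generator object keeps its frame alive between resumptions and exposes it as gi_frame. The short sketch below only inspects that frame; the helper name suspended_locals is an assumption, and nothing here implements full continuations.

```python
def counter():
    """Generator whose local variable survives between resumptions."""
    total = 0
    while True:
        total += 1
        yield total

def suspended_locals(gen):
    """Return a copy of the local variables held in a generator's frame.

    Returns None once the generator has finished and released its frame.
    """
    frame = gen.gi_frame
    return dict(frame.f_locals) if frame is not None else None

if __name__ == "__main__":
    gen = counter()
    next(gen)
    next(gen)
    print(suspended_locals(gen))
    gen.close()
    print(suspended_locals(gen))
```

The frame holds the locals and the position to resume from, which is the state the message identifies as needed for a resume operation.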
I am far from a complete understanding of ceval and what
happens to frame and blocks in Python, so I may be missing some obvious
problem ( And all the details of *exactly* how to do this aren't completely
clear to me yet, either ).
Comments ?
What I initially wanted was some sort of cooperative
coroutine control between two functions mediated by a file-like object that
used its read/write methods to alternate control on the two functions. Except
for inet servers, most of the cases where I might want threads seem to
actually fit a blocking coroutine
model better than parallel threads, and in the cases where it doesn't, I'm
quite happy to use unix processes for threading. I would like to have threads
on dos/mac, but I don't know if any of the portable (on unix) threads packages
work on those platforms, or how much work it would be to get really portable
threads written in C to work in Python. But since the easiest method of adding
coroutine control appears to be the more general case of adding continuations
to Python, I would think that this also gives a possible method of adding
preemptive threads to the Python interpreter in a portable manner by coding it
into the virtual machine rather than the real machine.
Comments ?
[ Encouraging or _discouraging_ comments invited! If I'm blind
to some fatal flaw or don't understand what's going on in
ceval, I'd be happy to hear about it before I actually
waste time in trying to implement it! ]
-- Steve Majewski (804-982-0831) <sdm7g@Virginia.EDU> --
-- UVA Department of Molecular Physiology and Biological Physics --
-- Box 449 Health Science Center Charlottesville,VA 22908 --
[ "Cognitive Science is where Philosophy goes when it dies ...
if it hasn't been good!" - Jerry Fodor ]
atbowler@thinkage.on.ca (Alan Bowler) writes in a.f.c:
>In article <6i72h2$hhh$1@strato.ultra.net> jmfbahxx@ma.ultranet.com writes:
>>
>>Another neat thing that was used was the notion of co-routines.
>>Perhaps somebody more qualified would talk about this?
>You mean "co-operative multitasking with threads" :-)
Well, there are coroutines that take many instructions to switch between and
there are coroutines that take one instruction. Barbara was talking about the
latter. The definition of a coroutine as I heard it is that the two code paths
use the same mechanism to transfer control back and forth. Of the many styles
of subroutine calls on the PDP-10, JSP ac,addr is the fastest, as it's the
only one that doesn't require a memory store. Its ISP is something like:
ac = PC
PC = effective address [addr in the usual case]
The subroutine return, of course, is:
JRST (ac)
Here, the effective address is the contents of the register.
The coroutine instruction combined the two:
JSP ac,(ac)
This essentially exchanged the PC with ac.
There's one big catch here - there can't be unanswered pushes or pops in
either piece of code, this is one reason why many people equate coroutines
with context switches and the exchange of many registers.
I have two good examples to describe. I'll put the second one in a separate
posting.
I wrote the client side of TOPS-10 TELNET for the Arpanet that ran at Carnegie
Mellon, Harvard, and Rutgers. Telnet has several multi character sequences and
they can be split across message boundaries. TOPS-10 made
it easiest for us to get a message at interrupt level, and step through each
octet in sequence, calling the TTY input code as necessary. However, parsing
the Telnet options is more easily done by code that can call a
subroutine to fetch one character at a time.
So I compromised. The network side looked something like:
prcmsg: PUSHJ P,SAVE1 ;Save P1
MOVE P1,LAST_PC(F) ;Get PC where telnet left off
PUSHJ P,GET_CH ;Get next byte from message
JRST DONE ;None left
JSP P1,(P1) ;call telnet processor
JRST LOOP
DONE: MOVEM P1,LAST_PC(F)
POPJ P,
Not too exciting.
The telnet side looked something like:
PRCTEL: CAIN T1,IAC ;Telnet escape?
JRST NORMCH ;just text
JSP P1,(P1) ;get next character
CAIGE T1,MINTEL ;command in range 240 - 255
JRST BAD ;Out of range
JRST DISPTBL-MINTEL(T1) ; Dispatch
...
WONT: JSP P1,(P1) ;Get option code
...
The nice thing about all this was that the telnet processor had no idea that
some of its coroutine calls actually resulted in dismissing an interrupt.
Another way of looking at this code is to see a state machine where the PC is
the state variable.
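The same structure can be sketched with a Python generator, where the suspension point plays the role of the saved PC. This is a simplified illustration, not a translation of the assembly above: it handles only the escaped data byte and the option-carrying commands WILL, WONT, DO and DONT (251 to 254), ignores subnegotiation, and the names telnet_parser and feed are assumptions.

```python
IAC = 255  # Telnet "Interpret As Command" escape byte

def telnet_parser(text, commands):
    """Coroutine that receives one byte per send(); its state is where it is suspended."""
    while True:
        byte = yield                 # get next character
        if byte != IAC:
            text.append(byte)        # just text
            continue
        command = yield              # get the command byte
        if command == IAC:
            text.append(IAC)         # a doubled IAC is a literal data byte 255
        elif 251 <= command <= 254:
            option = yield           # get the option code
            commands.append((command, option))
        else:
            commands.append((command, None))

def feed(parser, message):
    """Network side: step through each byte of one message."""
    for byte in message:
        parser.send(byte)

if __name__ == "__main__":
    text, commands = bytearray(), []
    parser = telnet_parser(text, commands)
    next(parser)  # run the coroutine up to its first yield
    # A WONT command (255, 252, 1) split across three messages.
    feed(parser, b"hi\xff")
    feed(parser, b"\xfc")
    feed(parser, b"\x01ok")
    print(bytes(text), commands)
```

The parser is written as if it could fetch one character at a time, and it neither knows nor cares that a message boundary fell in the middle of a command.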
A decade later when I was at Alliant, I fielded a phone call from a customer
with a Macintosh machine that was sometimes having trouble with its Telnet
link. The customer had managed to trace it to telnet commands
split between two TCP messages. I really tried to be sympathetic, but I was
firm that Alliant's system, while perhaps not being Mac-friendly, was
compliant with the protocols and that I was sure a Macintosh should
be able to handle split telnet commands.
Other architectures have coroutine instructions too. On the PDP-11:
JSR Rx,@Rx
The Intel 860 _almost_ has one:
calli r1
(Calli is like jsp r1,ea in that the return pc is stored in r1.) However,
the i860 manual says this is a no-no.
I should know if Alpha has one, but I just don't do enough assembly language
here.
--
<> Eric (Ric) Werme <> The above is unlikely to contain <>
<> ROT-13 addresses: <> official claims or policies of <>
<> <jrezr@mx3.qrp.pbz> <> Digital Equipment Corp. <>
<> <jrezr@plorecbegny.arg> <> http://www.cyberportal.net/werme <>
A simple and powerful coroutine scheme has been offered in
Modula-2 by N. Wirth. The two basic operations (exported by the
SYSTEM module) of Modula-2 are:
NEWPROCESS creates a new coroutine with a
stack starting at addr with size size. The coroutine
starts execution by calling the parameterless global procedure proc.
A handle to the new coroutine is returned in new.
The coroutine facilities of Modula-2 allow
multi-threaded programs to be constructed. In such programs, several
threads may be at various stages of execution at the same time. These threads
are quasi-concurrent. That is, only one thread is actually active at
any one time, but by interleaving the execution of the various threads all may
progress apparently in parallel. The use of coroutines allows certain unique
forms of program organization which are rather under-utilized in current
practice, probably since few languages support coroutine primitives. In
particular, coroutines form a natural foundation for simulation programs.
Program threads are sometimes also known as lightweight processes,
since they provide some of the functionality of processes, but are many, many
orders of magnitude less costly in execution time.
Execution of each coroutine is explicitly suspended by
transferring control to another coroutine. Each coroutine has its own
activation stack at runtime, and these stacks are explicitly created and
initialized by a call to the procedure Coroutines.NEWPROCESS.
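Quasi-concurrent execution of this kind can be imitated in Python with generators and a small scheduler. The sketch is an analogy only: generators do not have separate stacks in the Modula-2 sense, each yield hands control back to the scheduler rather than naming the coroutine to run next, and the names worker and run_round_robin are illustrative.

```python
from collections import deque

def worker(name, steps, log):
    """A thread of control that does one step of work, then gives up control."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # explicit suspension point

def run_round_robin(tasks):
    """Resume each unfinished task in turn until all have run to completion."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # only one task is ever active at a time
        except StopIteration:
            continue            # task finished; do not reschedule it
        ready.append(task)

if __name__ == "__main__":
    log = []
    run_round_robin([worker("a", 2, log), worker("b", 3, log)])
    print(log)
```

The interleaving is fully determined by where each task yields, which is what makes this style attractive for simulation programs.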
Programs which do not use the coroutines library, so-called
single-stack programs have little need to perform stack overflow
testing. Typically, several hundred megabytes of virtual memory are available
for expansion of the stack segment of such programs, although it is usual for
a process size limit to be exceeded well before this. Programs which use the
coroutines library have a separate stack for each coroutine, suggesting the
prudent use of stack overflow testing. The facilities provided for this are
also available for single-stack programs, although the default continues to be
for stack overflow testing to be disabled.
Perhaps the best tutorial introduction to the language. It has clear and correct
explanations, and covers some fairly advanced topics. The book is an updated Common
Lisp version of the 1984 edition published by Harper and Row Publishers.
[MK&BM]
Code from the book, including several implementation-specific versions of the
SDRAW and DTRACE tools described therein, is also available on
the author's web site.