Softpanorama
May the source be with you, but remember the KISS principle ;-)
CGI is a very flexible and powerful protocol, and it scales much better than most Web developers assume. CGI may not be a fancy technology, but it is simple and you can do almost anything with it.
The most common tool for writing CGI scripts is Perl, so most CGI scripts you can find on the Web are written in this language.
Essentially, all web applications do pretty much the same things:
Learn when a browser is better than a GUI application, and when a CGI is best of all
Level: Introductory
Peter Seebach (seebs@plethora.net),
Freelance author, Plethora.net
27 Feb 2007
Writing local Web applications can be quick, easy, and efficient for solving specific Intranet problems. Understand why a Web browser is sometimes a better interface than a GUI application, and when a CGI script may be the simplest and most elegant solution.
The vast majority of Web sites you visit are presumably open to Internet access, but many companies have found that Intranet development has its place. However, you can take this further -- you can develop perfectly functional Web applications that will never send so much as a single packet over a network interface. Experienced Web developers sometimes find themselves struggling to learn a GUI toolkit when a simple CGI script would serve their needs perfectly well.
A local-only Web application is, if anything, much simpler than one intended for general use. You can easily set browser requirements, and server performance is very unlikely to be an issue. Simple applications, using standard CGI form widgets and the like, can be written in a fraction of the time required for development of self-contained applications. Applications that are built around forms or data manipulation are often excellent candidates for implementation as a trivial Web service.
In many cases, a custom application like this can provide an elegant and simple solution to a very specific problem. I once wrote a picture browser that did nothing but browse a single directory full of files from a camera and let me file the pictures into categories. This took maybe twenty lines of Perl, and the result was much faster than a more general solution, to say nothing of how long it would have taken to write a GUI application to do the same thing.
What has the Web browser done for us, anyway?
You might ask, quite reasonably, Why would anyone bother? What does the Web browser do that another application can't do? The answer is obvious: nothing. But then, what do high-level languages do that you can't do in machine code? Nothing, really. The advantage of using a Web browser as your interface is that all the hard code has been written already. You don't have to run around checking for resize events, window expose events, or menu events. All you do is read a chunk of data expressing a request and process it.
The potential weakness of Web-based development is lack of control over appearance. This is not much of a weakness. If you're trying to get an application developed quickly, the last thing you want is to spend a lot of time messing about with appearance. Whatever platform you're on, the Web browser's native buttons and text widgets will be familiar to the user. That's a feature, really.
The Web browser does one more thing that is very useful: It gives you a number of preference settings you're not required to maintain yourself. Font sizes can be changed by the user on the fly. Similarly, if you can generate your output in a nice, simple HTML form, it can be printed easily and quickly. Many features you might otherwise have to implement (saving output to a file, printing output, and resizing windows are just some examples) are implemented for you.
Architecture for the single user
Although it might seem that a single-user environment completely eliminates a lot of application development considerations, this isn't quite the case. I recently wrote a document-filing system as a Web application. Because I designed it for a maximum of one simultaneous user, I felt I could dispense with a lot of the file-locking code that might otherwise be necessary. I was wrong. The user interface utilized frames, and one of the sanity-checks performed by my display program could fail catastrophically if another program modified the database while it was running. The browser would often end up running both queries at once.
A local server means that network bandwidth is not an issue; even things that might be problematic on ethernet are not a problem on a local host, although huge files will still slow down or crash browsers.
If you are developing an application to run on a local Web server, give thought to the question of what happens when someone else accesses it over the network. Ideally, ensure that the local application is only accessible on the local network interface. If you can't do that, you will need some kind of security. It might be enough to simply have your application refuse all connections that aren't from 127.0.0.1, which is a lot more secure than password-based security.
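The address check described above can be sketched in a few lines of Perl. This is a minimal sketch, not a complete security measure: it relies on the standard CGI variable REMOTE_ADDR, the function names is_local_request and respond are made up for illustration, and accepting the IPv6 loopback address ::1 is an added assumption.

```perl
use strict;
use warnings;

# Returns true when the request came from the loopback interface.
# Takes the environment as a hash reference so it can be exercised
# without a web server. Accepting ::1 is an assumption of this sketch.
sub is_local_request {
    my ($env) = @_;
    my $addr = $env->{REMOTE_ADDR} // '';
    return $addr eq '127.0.0.1' || $addr eq '::1';
}

# Builds the whole CGI response: a 403 for remote clients, the page otherwise.
sub respond {
    my ($env) = @_;
    unless (is_local_request($env)) {
        return "Status: 403 Forbidden\r\nContent-Type: text/plain\r\n\r\nLocal access only.\n";
    }
    return "Content-Type: text/html\r\n\r\n<p>Hello, local user.</p>\n";
}

# When run as a CGI script, the server supplies REMOTE_ADDR in %ENV.
print respond(\%ENV) if $ENV{GATEWAY_INTERFACE};
```

Binding the server itself to the loopback interface remains the better option where it is available; a check like this is only a fallback inside the script.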
It might be reasonable to simply pick an alternative port number so you can have a dedicated server for your application. On a UNIX®-type box, setting up a personal copy of Apache on port 8880 will take minutes and gives you complete control over server features and settings. It also ensures that your application back end runs with your usual privileges, so you don't have to make crucial files world-writeable, which is a big plus.
For a traditional Web application, using a database server on the back end simplifies development immensely, because the database engine -- just by being a single server -- provides most of the serialization and locking you need, and more can be obtained easily. For a single-user application that's going to run on a single machine, this is probably overkill and might not be worth the trouble. A local application might have no need of this and might benefit from using a simple and easily moved data file. I would move my document-filing application between my laptop and desktop computers whenever I went anywhere. This was trivial because it simply used a Berkeley DB file.
The Berkeley DB format offers a fairly simple solution that is easily accessible from a number of languages. Because there's no database server involved, it's a poor choice for an application with many simultaneous users; to be safe, you have to lock the database, perform your operations, and then unlock it. On the other hand, because no database server is involved, it's a fairly good choice for a program with very few simultaneous users.
Here's sample Perl code to attach to a Berkeley DB database, complete with locking:
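use DB_File;
use Fcntl qw(:DEFAULT :flock);   # O_CREAT, O_RDWR, LOCK_EX

my %db;
open(DBFILE, '+<', 'file.db') or die "Cannot open file.db: $!";   # file must already exist
flock(DBFILE, LOCK_EX) or die "Cannot lock file.db: $!";
my $dbhandle = tie %db, 'DB_File', 'file.db', O_CREAT|O_RDWR, 0666, $DB_HASH;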
After this code has run, you can access the database directly as a standard Perl hash:
$db{user} = "me";
print "<p>User: $db{user}</p>\n";
You can also use the dbhandle object to perform operations
on the database:
$dbhandle->del("user");
When you're done, you're done -- that is, when your program exits, the
lock is dropped automatically. The flock() locking
is purely advisory -- it won't prevent other programs from
writing to the file if they don't use it. If your program is
implemented as a number of related programs, put the locking
code in all of them, or better yet, put it in a shared module.
This way, the purely advisory locking still gives you what you
need: reliable assurance that only one program at a time is
modifying the data files.
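One way to keep the locking code in a single shared place is a small module that every related program calls. The sketch below is an illustration under stated assumptions: the package name LocalLock, the function with_lock, and the lock file path in the usage note are all hypothetical, and it locks a separate lock file rather than the database file itself, which is a design choice of this sketch and not something the article prescribes.

```perl
package LocalLock;
use strict;
use warnings;
use Fcntl qw(:flock);

# Runs $code while holding an exclusive advisory lock on $lockfile
# and returns whatever $code returns (in scalar context).
sub with_lock {
    my ($lockfile, $code) = @_;
    open(my $fh, '>>', $lockfile) or die "Cannot open $lockfile: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $lockfile: $!";
    my $result = eval { $code->() };
    my $error  = $@;
    close($fh);              # closing the handle releases the lock
    die $error if $error;    # re-raise any failure after unlocking
    return $result;
}

1;
```

A caller would wrap each database operation, for example LocalLock::with_lock('/tmp/app.lock', sub { $db{user} = "me" }). Because the lock is still advisory, it only helps if every program that touches the data goes through the same function.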
Some applications will work fine with plain files, such as CSV files or just flat field text files. Some might need a full SQL database. Don't feel compelled to adopt "enterprise-class" solutions for small applications that are intended to get a job done quickly and easily. Save your effort for good error recovery and nice convenience features. That said, if you need a relational database, use one.
Applications are probably reasonably well addressed using forms with lots of buttons in them. If you have metadata or context that needs to be passed from one page to another, go ahead and use hidden form fields; the security concerns you might face (users can override them) are non-issues in this context. You can use image buttons for many purposes, although they do require you to develop images. If you want to bypass this, create an image button with alt text and an invalid URL for its image:
print $q->image_button(-name  => "sort",
                       -value => $name,
                       -src   => "/nonexistant",
                       -alt   => $label);
This might seem ridiculous, compared to the simpler alternative of just
using a hyperlink, but in a large form, the corresponding
href=... link might be quite large, and indeed impossible
to precalculate if the button is submitting a form. It's an ugly
hack, but it works.
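Passing context from one page to the next through hidden form fields, as suggested above, might look like the following. This is a sketch using only core Perl; the helper names escape_html and hidden_fields and the field names in the usage note are made up for illustration, and the CGI library has its own facilities for the same job.

```perl
use strict;
use warnings;

# Escapes the characters that are special inside HTML attribute values.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Turns a list of name/value pairs into hidden <input> elements,
# sorted by name so the output is stable.
sub hidden_fields {
    my (%context) = @_;
    return join '', map {
        sprintf qq{<input type="hidden" name="%s" value="%s">\n},
            escape_html($_), escape_html($context{$_})
    } sort keys %context;
}
```

Calling hidden_fields(category => 'receipts', page => 3) inside a form carries both values along to whichever script handles the submission.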
If you write a couple of small local host Web applications, you will quickly notice something: They are often quick enough to write, even with debugging, to be substantially more efficient than performing common tasks by hand. The efficiency gain of simplifying a long or tedious procedure might easily pay back the development time of an application that might take only minutes to develop.
Perl's standard CGI library provides a huge variety of useful basic tools to work with. A couple of hours of playing around with it and getting used to its extremely flexible and rich feature set will let you write a broad variety of applications with very little effort. Just as a simple scripting system can make one-off command line scripts practical and cost effective, simple CGI scripts can make a variety of graphical programs practical and cost effective. If you put an afternoon into playing around with these, you'll quickly find that dozens of tasks that you wish someone would have written a program for are easily within your grasp.
A disturbingly common experience is to write a small, simple, one-off application, quite possibly intended only for local use, and then discover that there's a compelling reason to scale it up to a larger audience.
Even when an application is only for local use, write clean code, and have lots of instrumentation and debugging output available. Good code with good support for debugging pays for itself in maintenance costs very quickly. When you add in the possibility of needing to rework the application for a larger (plural!) audience, it pays off even more.
Refactoring can be a fair amount of work, but it's not as bad as you might think. For the most part, it will mean scaling up the actual back-end database or file manipulation to allow for better and smarter locking. If your program does file manipulations, you might consider writing a small server that handles these operations atomically, and thus guarantees serialization. It might well make sense for such a server to handle only one client at a time, as long as individual scripts run quickly.
The major thing you need to watch out for is a program that assumes something about internal state. For instance, in a document-classification program, the obvious thing to do is to display "the next file" to the user, then file it appropriately. With two users, you need some tracking of which file each user is processing, and some way to handle edge cases -- if someone opens the application, then wanders away at some point, you will need to let someone else look at the file in question. Sometimes this can involve a substantial rework of the back end, but it will generally be pretty easy to keep the user-visible front end stable.
Until recently, FastCGI trailed mod_php, Java, and mod_perl in popularity among web server administrators and web developers. But times have changed, and changed for good.
In the early days of web development, when the CGI interface was the leader and web servers were quite slow, developers felt they needed a faster server technology that could run their web applications on high-traffic web sites. The solution seemed obvious: take the CGI-based code and put it into the web server process.
With this solution, the operating system no longer had to start a new process for every request, which is very expensive, and you could write your application as a persistent program able to cache data between HTTP requests.
These were the days when some of the most popular web server APIs were born: Internet Information Server's ISAPI, Netscape Server's NSAPI, and Apache's module API. This trend created some of the best-known and most widely used technologies in web development, such as mod_php, mod_python, Java servlets (and later JSP), and ASP. But the concept behind these technologies is not flawless. There are many problems with applications that run inside your average web server.
For example, mod_perl's high memory usage per child process can exhaust the available RAM, PHP's problems with threads can kill the whole web server, and many security problems arise from the fact that the most popular web server (Apache) can't do simple things like changing the OS user it executes the request as. For quite some time there have been workarounds, such as putting a lightweight proxy server in front of Apache, installing third-party software for IIS, or using PHP's safe mode and open_basedir (Oh GOD!) on Apache, but these are not elegant and pose problems of their own. Hardware progress in the last few years has also made the server modules obsolete.
Meanwhile, while the server modules were gaining glory and fame, a little-known technology with a different concept and implementation was born. It was called FastCGI, and the basic problem it was designed to solve was making CGI programs run faster. Later, it became clear that FastCGI also solves many other problems and design flaws of the server modules.
How does FastCGI work?
The FastCGI module runs in the web server process, but doesn't handle the request itself. Instead, it manages a pool of so-called FastCGI servers outside the web server process. When a request arrives, the FastCGI manager sends the HTTP data through a socket to one of the available FastCGI servers, which handles the request. This strategy is quite simple and has the following advantages:
- The FastCGI servers can be written in any language that has an API for communicating through sockets.
- The FastCGI servers run outside the web server, which improves stability and lets the web server itself handle only requests for static data with very little overhead. You won't need a front-end proxy for this. Thread-unsafe applications can be run with threaded web servers.
- The FastCGI manager can change the owner of the FastCGI servers, which allows the web administrator to have different virtual hosts served by different OS users. (Anyone remember Apache2's perchild MPM?)
- The FastCGI servers are persistent processes, which serve requests many times faster than standard CGIs.
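The benefit of a persistent process can be illustrated without any FastCGI library. The sketch below only simulates the request loop with a fixed list of requests; in a real FastCGI server the loop would block waiting for the next request from the FastCGI manager. The names expensive_lookup and handle_request are hypothetical, and the "slow" work is a stand-in.

```perl
use strict;
use warnings;

my %cache;          # survives between requests because the process stays alive
my $lookups = 0;    # counts how often the slow path actually ran

# Stand-in for slow work such as a database query (hypothetical).
sub expensive_lookup {
    my ($key) = @_;
    $lookups++;
    return uc $key;
}

# Handles one request, reusing an earlier result when there is one.
sub handle_request {
    my ($key) = @_;
    $cache{$key} = expensive_lookup($key) unless exists $cache{$key};
    return $cache{$key};
}

# Simulated request loop: four requests, two of them repeats.
for my $key (qw(index about index index)) {
    print handle_request($key), "\n";
}
print "slow lookups: $lookups\n";
```

The four simulated requests trigger only two slow lookups. Under plain CGI each request starts a fresh process with an empty cache, so all four would take the slow path.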
In the beginning FastCGI was not so popular, because its use of external processes and socket communication required more resources on the host system. Today this is no longer a concern: hardware has made huge leaps in the last few years and system memory is not so expensive anymore. Many web servers now have full support for FastCGI, and the trend is to migrate existing web applications to run under it. These are some of the most popular web servers that support FastCGI:
- Apache – http://httpd.apache.org
- Lighttpd – http://www.lighttpd.net/
- Zeus Web Server – http://www.zeus.com/products/zws/
- Sun Java System Web Server – http://www.sun.com
In November Microsoft announced support for FastCGI on IIS 5, IIS 6 and IIS 7 (Beta).
Link To VerySimple Scripts -- nice collection of scripts
Installing Perl Modules on MS Windows Servers (the Easy Way)
WDVL Introduction to the Web Application Development Environment (Tools) by Selena Sol May 31, 1999
CGI Programming 101 - Learn CGI Today! online book. Average quality.
Web Client Programming with Perl -- online book
Preface
Chapter 1: Introduction
Chapter 2: Demystifying the Browser
Chapter 3: Learning HTTP
Chapter 4: The Socket Library
Chapter 5: The LWP Library
Chapter 6: Example LWP Programs
Chapter 7: Graphical Examples with Perl/Tk
Appendix A: HTTP Headers
Appendix B: Reference Tables
Appendix C: The Robot Exclusion Standard
Tutorial Documentation -- tutorial gateway -- Perl-based, very good idea
WDVL CGI The Common Gateway Interface for Server-side Processing
CGI Script Tutorial and CGI Resources
Common Gateway Interface (CGI) Specifications
CGI-Resources Page
CGI Tutorials and scripts
The Idiot's Guide to Solving Perl CGI Problems
Perl Tutorial Start
CGI Scripts from NCSA
ENMPC: Tutorial on CGI
Perl and CGI Tutorial
CGI Tutorial - Frames version
Matt's Perl Tutorial
Danny Aldham's Perl CGI Tutorial Page version 1.07
Perl and CGI Tutorial
CGI Tutorial && Link
CGI Tutorial: Start
CGI Manual
CGI & Perl links on the WWW
Perl-Related Links
CGI Tutorial: A simple CGI script
CGI Tutorial: What CGI scripts are
***** CGI Resource Index. This is just a metaindex page: an archive and catalog of CGI scripts, documentation and resources.
CGI scripts have access to 20 or so environment variables, such as QUERY_STRING and CONTENT_LENGTH mentioned on the main page. Here's the complete list at NCSA.
- REQUEST_METHOD
- The HTTP method this script was called with. Generally "GET", "POST", or "HEAD".
- HTTP_REFERER
- The URL of the form that was submitted. This isn't always set, so don't rely on it. Don't go invading people's privacy with it, either.
- PATH_INFO
- Extra "path" information. It's possible to pass extra info to your script in the URL, after the filename of the CGI script. For example, calling the URL
http://www.myhost.com/mypath/myscript.cgi/path/info/here
will set PATH_INFO to "/path/info/here". Commonly used for path-like data, but you can use it for anything.
- SERVER_NAME
- Your Web server's hostname or IP address (at least for this request).
- SERVER_PORT
- Your Web server's port (at least for this request).
- SCRIPT_NAME
- The path part of the URL that points to the script being executed. It should include the leading slash, but certain older Web servers leave the slash out. You can guarantee the leading slash with this line of Perl:
$ENV{'SCRIPT_NAME'} =~ s#^/?#/# ;
So the URL of the script that's being executed is, in Perl,
"http://$ENV{'SERVER_NAME'}:$ENV{'SERVER_PORT'}$ENV{'SCRIPT_NAME'}"
The complete URL the script was invoked with may also have PATH_INFO and QUERY_STRING at the end.
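Putting these variables together, the complete invocation URL could be rebuilt roughly as follows. This is a sketch: the function name script_url is made up, it takes the environment as a hash reference so it can be tried outside a web server, and it assumes plain http on whatever port the server reports.

```perl
use strict;
use warnings;

# Rebuilds the URL a CGI script was invoked with from its environment.
# Assumes plain http; the scheme is not taken from the environment.
sub script_url {
    my ($env) = @_;
    my $script = $env->{SCRIPT_NAME} // '';
    $script =~ s#^/?#/#;    # guarantee the leading slash
    my $url = "http://$env->{SERVER_NAME}:$env->{SERVER_PORT}$script";
    $url .= $env->{PATH_INFO} if defined $env->{PATH_INFO};
    $url .= "?$env->{QUERY_STRING}"
        if defined $env->{QUERY_STRING} && length $env->{QUERY_STRING};
    return $url;
}
```

Inside a running CGI script the call would simply be script_url(\%ENV).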
MIME types are standard, case-insensitive strings that identify a data type, used throughout the Internet for many purposes. They start with the general type of data (like text, image, or audio), followed by a slash, and end with the specific type of data (like html, gif, or jpeg). HTML files are identified with text/html, and GIFs and JPEGs are identified with image/gif and image/jpeg. Here's a pretty good list of commonly-used MIME types.
Whisker
Whisker is a CGI scanner with impressive features that makes it much
better than most CGI scanners.
Download:
http://www.wiretrip.net/rfp/p/doc.asp?id=21&iface=2
Perl
Perl CGI Scripts and Resources - a pretty good site !
Selena Sol's Public Domain CGI Script Archive and Resource Library
Brock's Perl Scripts CGI scripts plus some useful admin scripts
WDVL WebWare -- A monthly column for the cultural anthropologist and other liberal arts hackers gone Webmaster
BigNoseBird.Com's Free perl CGI Scripts Archive Page 'The Strangest Name in Web Authoring Resources'. 14 useful Perl scripts, including a virtual greeting card script, a survey script with instant colour results, domain name lookup and various search scripts. No demos are available for the scripts, but they are keen to help if you have any problems setting them up.
Selena Sol's Guestbook made a little easier by BigNoseBird.Com
CGI SCRIPT TUTORIALS
- The CGI Primer explains what a CGI program needs to do in order to function.
- CGI Tutorial Start
- Footnotes for CGI Made Really Easy
Tutorial on CGI Database Programming with Perl
- Telnet Tutorial and Basic Unix Commands
- Paths vs. URLs
- Password Protecting Directories
- Installing Perl on NT Workstations
- Interfacing your cgi to authorize.net
- 5 minute Basic MYSQL Tutorial
- Comprehensive MYSQL information
- CGI Programming Is Simple!
- Beginner's Guide to CGI Scripting with Perl
- CGI for the Total Non-Programmer - a tutorial
- CGI Made Really Easy
- CGI Scripting in PERL
- Perl Programming Help
- CGI Programming 101 - Learn CGI Today!
- CGI-Center: Learning Perl
- CGI Programming - Project Bill Resources
- The CGI Start-Up Guide
- Vanclubs cgi scripts Tutorials
- Welcome to Web consultant - Perl Tutorial
- www.perltutorial.com - under construction
- Perl Tutorial: Start
- Perl Tutorial
- Introduction to Perl 5
- Perl.com: List of Perl Tutorials
- CGI Programming FAQ
- Perl tutorial in Spanish!
- Getting Started With Perl
- Introduction to Perl
- Perl Lessons
- Introduction to Perl & CGI Programming
- perltoot
- Perl Primer - CGI/Perl Tutorial
- Perl Tutorial
- CGI City Tips and Tutorials
- CGI & Perl Tutorial
- The Perl Tutorials
- 82.562: Perl Tutorial
- NCSA Perl Tutorial
- CGI For The Non-Programmer
- A Perl Tutorial: Super-Basics
- Perl tutorial and Reference Manual
- RUCS Perl Tutorial
- Take 10 Minutes to Learn PERL
- Larry Wall Perl Tutorial
- Perl 101
- Perl Tutorials
- Webmonkey: Intro to Mod Perl
- CGI for the Total Non-Programmer
Debugging CGI Programs contains a useful script to help debug your CGI programs. Requires Apache Server v1.2.
Seite zum Thema Linux -- SendingMirror.pl, a small script to keep your remote web server or ftp server up to date by pushing the changed data from your local host, maybe behind a firewall or a dialup line.
The problem with /usr/ucb/mail shell escapes is going to stay with us for quite a while: I have found that many web sites run CGI helper scripts that send data from the network into /usr/ucb/mail without censoring, for example, newline characters embedded in the data.
Copyright © 1996-2008 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Copyright of original materials belongs to the respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.
Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.
Last modified: June 05, 2008