Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)



WDS (Webserver-Database-Scripting Language) Links

A Web Site is a Harsh Mistress


Early versions of the WWW developed a reputation as a versatile and convenient tool for accessing mission-critical data at the European Laboratory for Particle Physics (CERN). Paradoxically, the Web tools Tim Berners-Lee developed were most successful, and most widely regarded, as the best way to access the CERN phone directory. Note that the first successful WWW application was not the distribution of published papers; it was a gateway to an existing and important application. Of course, the versatility of the WWW became clearer as the technology spread among high-energy physics institutions and then to the outside world. But is it really an accident that the Web took off as a gateway to an existing information system? I think it is not, and that is why the WWW served as a launch pad for several scripting languages, including Perl, JavaScript, PHP and Python.

As Oleg Kiselyov noted in his USENIX ;login: article Speaking HTTP:

...HTTP is useful in its own right, for example, as a good file-distribution protocol with a number of important advantages over ftp. This article gives an example how to speak HTTP and get understood.

... By definition[1], HTTP is a request/response protocol that exchanges messages in a format similar to that used by Internet mail (MIME). An HTTP transaction is essentially a remote procedure call. It is usually a blocking call, although HTTP/1.1 provides for asynchronous and batch modes. HTTP allows intermediaries (caches, proxies) to cut into the request-reply chain.

An operation to execute remotely is expressed in HTTP as an application of a request method to a resource. Additional parameters, if needed, are communicated via request headers or a request body. The request body may be an arbitrary octet-stream. The HTTP/1.1 standard defines methods GET, HEAD, POST, PUT, DELETE, OPTIONS, TRACE, and CONNECT. A particular server may accept many others. This extensibility is a rather notable feature of HTTP. The parties can use not only custom methods but custom request and reply headers as well. In addition, a client and a server may exchange meta-information via "name=value" attribute pairs of the standard "Content-Type:" header.

Most of the HTTP transactions performed every day are done behind the scenes by browsers, proxies, robots, and servers. Yet the protocol is so simple that one can easily speak it oneself. The only requirement is a language or tool that is able to manipulate text strings and establish TCP connections. Even a simple telnet application may do in a pinch, which is often useful for debugging. Server-side programming is less demanding: a servlet or a scriptlet does not need to bother with the network connectivity, authentication, access restrictions, SSL, and other similar chores. Server modules or FastCGI give a server-side programmer even more tools: load-balancing, persistence, database connectivity, etc. This article demonstrates how to use Perl scripts to speak and respond HTTP directly.
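The claim that any tool able to manipulate text strings and open TCP connections can speak HTTP can be illustrated with a minimal sketch. It is written in Python rather than the Perl used in the article, and the function names (build_request, parse_response, fetch) are hypothetical names chosen for this illustration, not taken from the article.

```python
import socket


def build_request(method, host, path="/", headers=None, body=b""):
    """Assemble a raw HTTP/1.1 request as bytes."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}", "Connection: close"]
    # Custom request headers are simply extra "Name: value" lines.
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the (optional) body.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body


def parse_response(raw):
    """Split a raw response into (status code, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def fetch(host, path="/", port=80):
    """Send a GET over a plain TCP socket and read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request("GET", host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return parse_response(b"".join(chunks))
```

A call such as fetch("example.org") would return the status code, the headers and the body. The sketch does not decode chunked transfer encoding, follow redirects or speak SSL, so it is a debugging aid in the spirit of the telnet remark above rather than a replacement for a real client library.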

 

Nikolai Bezroukov


Notes:
  • These pages are written by people for whom English is not a native language. Some grammar and spelling errors should be expected.
  • This is a Spartan WHYFF (We Help You For Free) site. It cannot replace the best teachers and the best books.
  • The site contains some obsolete pages, as it develops like a living tree... Some links on older pages are broken. Please try to use Google, Open directory, etc. to find a replacement link (see HOWTO search the WEB for details). We would appreciate it if you could mail us a correct link.


 

Old News ;-)

[Mar 20, 2007] IT: Microsoft Tracks Down Mass Fake Web Pages

"According to an article on New York Times, Microsoft researchers have discovered tens of thousands of junk Web pages, created only to lure search-engine users to advertisements. While most of us have run across them from time to time, the company researchers have found the pages are deliberately generated in vast numbers by a small group of shadowy operators. By following the money trail, Microsoft researchers were able to track the flow from big-name advertisers to search engine spammers. Many use Google's blogspot.com to set up spam doorway pages. 'The practice has proved to be a vexing problem for the major search companies, which struggle to prevent both spammers and companies specializing in improving legitimate clients' Web traffic -- a field known as search-engine optimization -- from undermining their page-ranking systems. Surprisingly, the researchers noted that the vast bulk of the junk listings was created from just two Web hosting companies and that as many as 68 percent of the advertisements sampled were placed by just three advertising syndicators.' The report is available at Microsoft Strider Search Ranger project page."
 

USENIX ;login - Speaking HTTP

 

General Web Scripting Tutorials

CGI Tutorial Start

Tutorial Documentation -- tutorial gateway -- Perl-based, very good idea

Perl Primer-CGI-Perl Tutorial

Webteacher.com - web database, javascript tutorial, cgi tutorial. An interactive site for helping non-programmers learn advanced web development skills. Great for beginners.

WDVL CGI The Common Gateway Interface for Server-side Processing

An instantaneous introduction to CGI scripts and HTML forms, Academic Computing Services, University of Kansas

CGI Script Tutorial and CGI Resources

Common Gateway Interface (CGI) Specifications
CGI-Resources Page
CGI Tutorials and scripts
The Idiot's Guide to Solving Perl CGI Problems
Perl Tutorial Start
CGI Scripts from NCSA
ENMPC: Tutorial on CGI
Perl and CGI Tutorial
CGI Tutorial - Frames version
Matt's Perl Tutorial
Danny Aldham's Perl CGI Tutorial Page version 1.07
Perl and CGI Tutorial
CGI Tutorial && Link
CGI Tutorial: Start
CGI Manual
CGI & Perl links on the WWW
Perl-Related Links
CGI Tutorial: A simple CGI script
CGI Tutorial: What CGI scripts are


Reference


See also


Recommended Links


HTML Pretty Printing

 

htmlpp A Simple HTML Pretty Printer by Len Budney.

htmlpp is a simple HTML pretty printer, based on nsgmls and SGMLS.pm. The code is pretty alpha, but gives attractive results for many HTML docs. Some things, like nested tables, are rendered only passably. Other deeply-nested structures may render badly as well.

Note that this pretty-printer is oldish, and alpha, and unlikely to be developed any further. It's not a bad illustration of some of the possibilities for SGML technology in web authoring. Perhaps someone will take up the challenge, and build the "right" tool!

Since htmlpp gets its input from nsgmls, invalid documents should not be expected to work. However, a side effect of this approach is that minor errors and inconsistencies are actually fixed. Attribute values are always quoted in the pretty printed version. Characters like "<", ">" and "&" are converted into the appropriate SGML entities in attribute values and in document text. End tags are inserted automatically -- which will surprise you if you thought it was legal to imbed <pre> elements inside <p> elements, for example.
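The normalizations described above (quoted attribute values, special characters converted to entities, one element per indented line) can be sketched with Python's standard html.parser module. This is only an illustration under stated assumptions: it is not htmlpp's method, which relies on nsgmls, and unlike htmlpp it does not validate the document or insert missing end tags. The class name PrettyPrinter, the helper pretty and the VOID set are hypothetical.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take an end tag (a partial list, for illustration).
VOID = {"br", "hr", "img", "input", "link", "meta"}


class PrettyPrinter(HTMLParser):
    """Re-emit HTML one item per line, indented by nesting depth."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []
        self.depth = 0

    def _emit(self, text):
        self.lines.append("  " * self.depth + text)

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            if value is None:
                parts.append(name)  # bare attribute such as "checked"
            else:
                # Always quote the value and escape special characters.
                parts.append(f'{name}="{escape(value, quote=True)}"')
        self._emit("<" + " ".join(parts) + ">")
        if tag not in VOID:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        self.depth = max(self.depth - 1, 0)
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text:
            self._emit(escape(text, quote=False))


def pretty(html_text):
    printer = PrettyPrinter()
    printer.feed(html_text)
    printer.close()
    return "\n".join(printer.lines)
```

For example, pretty('<p class=note>a & b<br>c</p>') yields a five-line result in which the attribute reads class="note" and the ampersand becomes &amp;. Because whitespace is collapsed everywhere, the sketch would damage the contents of <pre> elements; a real tool has to treat those specially.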

HTMLPrettyPrinter - generate nice HTML files from HTML syntax trees

[June 7, 2002] A prettyprinter for HTML documents -- From the author of the book The Web Architect's Handbook; interesting in that it makes heavy use of modules:

use LWP::Simple;
use HTML::Parse;
use HTML::Entities;
use Text::Wrap;
use Getopt::Long;
 

[July 14, 2001] Clean up your Web pages with HTML TIDY -- HTML Tidy is a free utility to fix mistakes made while editing HTML and to automatically tidy up sloppy editing into nicely laid out markup.

It also works great on the atrociously hard to read markup generated by specialized HTML editors and conversion tools, and can help you identify where you need to pay further attention to making your pages more accessible to people with disabilities.

[July 14, 1999] hindent -- HTML indentation (pretty printing) utility
Mar 28th 1999, 19:16 stable: 1.0.1 - devel: none - license: GPL

Download: http://www.domtools.com/pub/hindent1.1.0.tar.gz
Homepage: http://www.domtools.com/unix/hindent.shtml
Changelog: http://www.domtools.com/pub/hindent1.1.0-changes.txt

FHTML.PL (Perl) Formats and indents HTML code and writes a new file with the results.

ZDNet Software Library - Pretty HTML

Pretty HTML is an easy-to-use program that formats your HTML Web pages. After processing, your HTML code is neatly arranged, commented, spaced, and indented, making it much easier to read and maintain. You can also use Pretty HTML to compress your Web pages by eliminating unnecessary spaces and carriage returns. Process your Web pages one at a time or batch-format entire folders in a single operation. Pretty HTML offers a number of options to ensure that the HTML formatting is done to your liking. To play it extra safe, you can have the program make backup copies of your originals. Excellent online help is included.

 

 


Search and Replace

Perl scripts

 

sarep (Console/Editors) Command-line search and replace tool written in Perl.
Sep 16th 1998, 21:51 stable: 0.32 - devel: none - license: freely distributable

replacer.pl (Perl) A utility to replace all instances of a given text string with a new text string in all the files in a single directory.

Treesed -- Freeware
Treesed, a Perl program, is a search/replace tool for lists of files. It can search for patterns in a list of files, or even a tree of directories with files.

Usage:
treesed pattern1 <pattern2> -files <file1 file2 ...>
treesed pattern1 <pattern2> -tree

Treesed searches for pattern1. If pattern2 is supplied, pattern1 is replaced by pattern2; otherwise treesed only searches. A list of files can be supplied with the -files parameter. With the -tree parameter, treesed processes all files in the current directory and its subdirectories. A backup of each original file is always made, with a random numeric suffix.
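The behaviour described for the -tree mode can be sketched in a few lines. The sketch is in Python, not Perl, and is not Treesed's own code: the function name tree_replace, the five-digit backup suffix and the choice to skip files that are not valid UTF-8 are all assumptions made for this illustration.

```python
import os
import random
import re
import shutil


def tree_replace(root, pattern, replacement=None):
    """Search every file under root; replace matches if replacement is given.

    Returns the list of paths that contained a match.
    """
    regex = re.compile(pattern)
    matched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    text = handle.read()
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            if not regex.search(text):
                continue
            matched.append(path)
            if replacement is not None:
                # Keep the original under a numeric suffix before rewriting.
                backup = f"{path}.{random.randint(10000, 99999)}"
                shutil.copy2(path, backup)
                with open(path, "w", encoding="utf-8") as handle:
                    handle.write(regex.sub(replacement, text))
    return matched
```

Calling tree_replace(".", "old.gif") only reports the matching files, while tree_replace(".", "old.gif", "new.gif") rewrites them and leaves a backup next to each one. Note that the pattern is a regular expression here, so the dot matches any character unless it is escaped.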

 

 

non-perl

 


Non-Traditional Tools

[June 7, 1999] CVS Version Control for Web Site Projects

Whatcha' gonna make - SunWorld - October 1998 -- make can be used for compiling a book or WEB site

Using m4 to write HTML.

Web Page Generator (Perl) This program allows the user to create a generic web page.


CGI Security

The problem with /usr/ucb/mail shell escapes is going to stay with us for quite a while: I have found that many web sites run CGI helper scripts that send data from the network into /usr/ucb/mail without censoring, for example, newline characters embedded in the data.
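A defensive pattern for the problem described above is to censor line breaks in single-line fields, neutralize body lines that begin with a tilde, and hand the message to the mailer without involving a shell. The following Python sketch shows one way to do this under stated assumptions: the function names are hypothetical, the mailer path /usr/sbin/sendmail is an assumed default that varies between systems, and padding a leading tilde with a space is one possible policy, not an established rule.

```python
import subprocess


def clean_header(value):
    """Reject header values (subject, address) containing line breaks."""
    if "\r" in value or "\n" in value:
        raise ValueError("line break in header field")
    return value.strip()


def clean_body(text):
    """Neutralize lines that a mail program could read as tilde escapes."""
    safe = []
    for line in text.replace("\r\n", "\n").replace("\r", "\n").split("\n"):
        # A leading "~" can start an escape such as "~!command"; pad it.
        safe.append(" " + line if line.startswith("~") else line)
    return "\n".join(safe)


def send_mail(recipient, subject, body, mailer="/usr/sbin/sendmail"):
    """Pipe a message to a mailer without going through a shell."""
    message = (
        f"To: {clean_header(recipient)}\n"
        f"Subject: {clean_header(subject)}\n\n"
        f"{clean_body(body)}\n"
    )
    # An argument list (no shell=True) keeps metacharacters away from a shell.
    subprocess.run([mailer, "-t", "-i"], input=message.encode("utf-8"), check=True)
```

The two cleaning functions can be applied to form data before any mail program sees it; send_mail is included only to show where they fit and assumes a sendmail-compatible binary that accepts the -t and -i options.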

 


Etc

WebMaker

Download: http://www.services.ru/linux/webmaker/WebMaker-0.8.0.tar.gz
Homepage: http://www.services.ru/linux/webmaker/

WebMaker is a GUI HTML Editor for Unix. Main features include a nice GUI interface, menus, toolbar and dialogs for tag editing, multiple windows support, HTML 4.0 support, color syntax highlighting, preview with external browser, ability to filter editor content through any external program that supports stdin/stdout and KDE integration.

 


Other WEB Technologies


Copyright © 1996-2007 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the Open Content License (OPL). Copyright in original materials belongs to their respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Standard disclaimer: The statements, views and opinions presented on this web page are those of the author and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: May 06, 2008