Softpanorama
May the source be with you, but remember the KISS principle ;-)
I've been working on this tool for a while. It's not quite done yet, but it's definitely at a useful stage. I call it the tabifier, but in truth it's a code beautifier.
The overarching design goal for this tool was to beautify HTML code without breaking it. This is of course not totally possible, but I've tried to get as close as possible. There are great HTML beautifiers out there, like HTML Tidy, but they generally do too much. When I take an ugly HTML page that someone else has written and that I'm not familiar with, I want to pass it through a beautifier to make it easier to work with. Tools like Tidy, though, will drastically alter the code, and turning the beautified result back into something that displays like the original is often more work than simply fixing the page by hand.
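As a rough illustration of the "change only the whitespace" idea, here is a minimal Python sketch. It is not the tabifier's code, which is not shown here. The names `reindent`, `VOID`, and `TOKEN` are made up for this example, and the regex tokenizer is an assumption that only holds for simple markup.

```python
import re

# Elements that never take an end tag, so they must not increase the depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

# Naive tokenizer (an assumption of this sketch): anything between < and > is a tag.
TOKEN = re.compile(r"(<[^>]+>)")


def reindent(html, indent="  "):
    """Put each tag and text run on its own line, indented by nesting depth.

    Tag text is copied verbatim; only the whitespace between tokens changes.
    """
    depth = 0
    lines = []
    for token in TOKEN.split(html):
        token = token.strip()
        if not token:
            continue
        # Comments and doctype declarations do not match, so they never change depth.
        name = re.match(r"</?\s*([A-Za-z0-9]+)", token)
        is_tag = token.startswith("<") and name is not None
        is_close = is_tag and token.startswith("</")
        if is_close:
            depth = max(depth - 1, 0)
        lines.append(indent * depth + token)
        opens_block = (is_tag and not is_close
                       and not token.endswith("/>")
                       and name.group(1).lower() not in VOID)
        if opens_block:
            depth += 1
    return "\n".join(lines)


print(reindent("<ul><li>one</li><li>two<br>x</li></ul>"))
```

Even this sketch can change how a page displays: it rewrites whitespace inside `<pre>` and between inline elements, which is exactly the kind of breakage the quoted author says cannot be avoided completely.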
htmlpp is a simple HTML pretty printer, based on nsgmls and SGMLS.pm. The code is pretty alpha, but gives attractive results for many HTML docs. Some things, like nested tables, are rendered only passably. Other deeply-nested structures may render badly as well.
Note that this pretty-printer is oldish, and alpha, and unlikely to be developed any further. It's not a bad illustration of some of the possibilities for SGML technology in web authoring. Perhaps someone will take up the challenge, and build the "right" tool!
Since htmlpp gets its input from nsgmls, invalid documents should not be expected to work. However, a side effect of this approach is that minor errors and inconsistencies are actually fixed. Attribute values are always quoted in the pretty-printed version. Characters like "<", ">" and "&" are converted into the appropriate SGML entities in attribute values and in document text. End tags are inserted automatically, which will surprise you if you thought it was legal to embed <pre> elements inside <p> elements, for example.
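The quoting and entity conversion described above can be illustrated with a short Python sketch. This is not htmlpp's implementation (htmlpp is built on nsgmls and SGMLS.pm); it only shows the kind of output normalization involved. The helper names `render_attr` and `render_text` are hypothetical; `html.escape` is from the Python standard library.

```python
from html import escape


def render_attr(name, value):
    """Emit an attribute with a double-quoted, entity-escaped value."""
    # quote=True also escapes quotation marks, so the value cannot end the attribute early.
    return f'{name}="{escape(value, quote=True)}"'


def render_text(text):
    """Escape <, > and & in document text; quotation marks are left alone."""
    return escape(text, quote=False)


print(render_attr("alt", "a < b & c"))
print(render_text('1 < 2 & "x"'))
```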
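use LWP::Simple;     # fetch documents by URL
use HTML::Parse;     # parse HTML into a syntax tree (deprecated in favor of HTML::TreeBuilder)
use HTML::Entities;  # encode and decode HTML entities
use Text::Wrap;      # wrap text to a fixed line width
use Getopt::Long;    # parse command-line options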
It also works great on the atrociously hard-to-read markup generated by specialized HTML editors and conversion tools, and can help you identify where you need to pay further attention to making your pages more accessible to people with disabilities.
Download: http://www.domtools.com/pub/hindent1.1.0.tar.gz
Homepage: http://www.domtools.com/unix/hindent.shtml
Changelog: http://www.domtools.com/pub/hindent1.1.0-changes.txt
(Perl) Formats and indents HTML code and writes a new file with the results.
Pretty HTML is an easy-to-use program that formats your HTML Web pages. After processing, your HTML code is neatly arranged, commented, spaced, and indented, making it much easier to read and maintain. You can also use Pretty HTML to compress your Web pages by eliminating unnecessary spaces and carriage returns. Process your Web pages one at a time or batch-format entire folders in a single operation. Pretty HTML offers a number of options to ensure that the HTML formatting is done to your liking. To play it extra safe, you can have the program make backup copies of your originals. Excellent online help is included.
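The compression described above (dropping unnecessary spaces and carriage returns) can be sketched in a few lines of Python. This is an assumption about the general technique, not Pretty HTML's actual algorithm, and the function name `compress_html` is hypothetical.

```python
import re


def compress_html(html):
    """Shrink markup by removing whitespace that is usually not significant."""
    # Collapse every run of whitespace (spaces, tabs, CR, LF) into one space.
    collapsed = re.sub(r"\s+", " ", html)
    # Drop the single space left between two adjacent tags.
    return re.sub(r">\s+<", "><", collapsed).strip()


print(compress_html("<p>\r\n  Hello   world\r\n</p>\r\n<p>x</p>"))
```

A sketch this simple is not safe for every page: it would flatten the contents of `<pre>` blocks, and removing the space between two inline elements can change the rendered text. Keeping backup copies of the originals, as the description suggests, is a sensible precaution.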
Since all the world's documents are being converted to HTML (or are about to be), an HTML beautifier is immensely important. Documents, books, articles, news, and papers on science, technology, medicine, politics, and other subjects are either already available as HTML or in the process of being converted to it. So HTML deserves a separate chapter like this one.
Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). The site uses AdSense, so you need to be aware of the Google privacy policy. Copyright in original materials belongs to their respective owners. Quotes are made for educational purposes only, in compliance with the fair use doctrine.
Last modified: August 15, 2009