Tidybot


Introduction

Tidybot is a cross-platform batch (X)HTML syntax-checker and report-generator. It traverses a directory tree of (X)HTML files on your hard disk, and generates a web page listing all the errors and warnings it encounters.

Tidybot can be used in two ways: from the command-line, or through a user-friendly GUI application. The output will be identical in either case.


Downloading Tidybot

Unix

Microsoft Windows

You can have a look at the installation instructions, at the version history of user-visible changes, or at some screenshots.


Using Tidybot

Using the command-line version

The command-line version of Tidybot is (on all platforms) an executable program called tidybot.

You can run the command tidybot --help from a suitable shell or command prompt to see a summary of the available options.

Using the GUI version

On Unix platforms, the GUI version of Tidybot is installed as tidybot_gui, and can be executed directly from the command line.

On MS Windows, the installer will have installed the GUI version as the program Tidybot GUI, which can be executed from the Start Menu: Start > Programs > Tidybot > Tidybot GUI.

In addition (depending on what you instructed the installer to do) you may also have a Tidybot GUI icon on your Desktop, and/or in the “Quick Launch” area of your Taskbar, that you can use to start the program.

The GUI version of Tidybot allows you to specify exactly the same options as the command-line version, just in a more convenient point-and-click fashion. The GUI version also has some functionality the command-line version lacks:


Tidybot Options

Command-line usage:

tidybot OPTION... <source directory>...

That is, zero or more options followed by one or more source directories.
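The synopsis can be wrapped in a small shell function that refuses to run unless every source directory actually exists. This is only a sketch: the function name `check_sites` and the `TIDYBOT` and `REPORT_DIR` variables are hypothetical names chosen for this illustration, and the only Tidybot option used is `--output-directory`.

```shell
#!/bin/sh
# Sketch: run Tidybot only if every given source directory exists.
# TIDYBOT and REPORT_DIR are hypothetical variables, not part of Tidybot.
TIDYBOT="${TIDYBOT:-tidybot}"

check_sites() {
    # All arguments are treated as source directories.
    if [ "$#" -eq 0 ]; then
        echo "usage: check_sites <source directory>..." >&2
        return 2
    fi
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir" >&2
            return 1
        fi
    done
    # Options first, then one or more source directories.
    "$TIDYBOT" --output-directory="${REPORT_DIR:-/tmp}" "$@"
}
```

Calling `check_sites example-sources` would then behave like `tidybot --output-directory=/tmp example-sources`, provided the directory exists.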

GUI usage:

Every Tidybot option has a correspondingly labelled field in the GUI screen (except for --version and --help, which are menu items instead, and --verbose, which is on the TODO list).

Options:

If both the LIST and FILE versions of an option are specified on the command-line, the option values will be merged. Multiple LIST options (or multiple FILE options) are allowed, but the value of the last one specified will override all the earlier values.


Examples

The directory ‘example-sources’ distributed with Tidybot contains three sample XHTML files: ‘good.html’, ‘bad-1.html’, and ‘bad-2.html’. The first file contains correct XHTML, the latter two contain various errors.

The GUI version of Tidybot will start up with ‘example-sources’ as the default value for the ‘Source Directories’ field.

The following examples are expressed as command-line invocations of Tidybot. They all assume the user has cd’d into the directory that contains the ‘example-sources’ subdirectory.

These examples can easily be replicated in the GUI version by placing the values in the corresponding fields. (In fact, Tidybot GUI makes it a lot easier to experiment with the options and see what kind of effect they have.)

Default behaviour: check all the files in ‘example-sources’ and place the resulting HTML report files in the current directory:

tidybot example-sources

As before, but place the report files in /tmp:

tidybot --output-directory=/tmp example-sources

Do not process bad-1.html and bad-2.html (both invocations will have this effect):

tidybot --exclude=bad-1.html,bad-2.html example-sources

tidybot --exclude=bad example-sources

Do not include any Tidy messages containing the word ‘Warning’ in the reports:

tidybot --suppress=Warning example-sources

Additionally, suppress one very specific type of error message:

tidybot --suppress="Warning, Error: <spanstyle> is not recognized" example-sources

(This also illustrates why the various -from options exist — specifying too many options directly on the command-line can get unwieldy very fast.)

Instruct the underlying Tidy library to use an XML parser rather than an HTML parser:

tidybot --tidy-options="input-xml:yes" example-sources

Have Tidybot send mail after it has finished (Unix-only):

tidybot --mailto=maintainers@kronto.org \
--note "Report at: <http://foo.bar/tidybot.html>" example-sources

The --mailto option is typically used when Tidybot is running in command-line mode as an automatically scheduled application (e.g. via the Unix crontab).
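For a scheduled run, the invocation above might be kept in a small wrapper script. The following is a minimal sketch under stated assumptions: the function name `run_nightly`, the variables, and all paths, the URL and the mail address are placeholders invented for this example; only the `--output-directory`, `--mailto` and `--note` options come from the examples above.

```shell
#!/bin/sh
# Sketch of a nightly wrapper, meant to be called from cron.
# Every value below is a placeholder; adjust to your own site.
TIDYBOT="${TIDYBOT:-tidybot}"
REPORT_DIR="${REPORT_DIR:-/var/www/reports}"
REPORT_URL="${REPORT_URL:-http://www.example.org/reports/}"
MAILTO="${MAILTO:-webmaster@example.org}"

run_nightly() {
    # "$@" holds the source directories to check.
    "$TIDYBOT" --output-directory="$REPORT_DIR" \
               --mailto="$MAILTO" \
               --note "Report at: <$REPORT_URL>" \
               "$@"
}
```

Ending the script with a call such as `run_nightly /var/www/htdocs` (a hypothetical path) and scheduling the script from a crontab entry would produce a fresh report and a notification mail after each run.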


Caveats

Tidybot, written in the high-level scripting language Python, is a typical ‘glue’ program that creates a more convenient user-interface and nicer output around already existing, lower-level components.

Tidybot was written because I needed its specific functionality and it was easier (and more fun) to cobble together my own utility than to try and adapt something else. Once it existed, I felt I might as well make it available to others.

By design, Tidybot offers access to only a small subset of what the underlying Tidy library is capable of. The original Tidy’s strong suit is actually fixing your legacy HTML files, converting them into shiny, valid XHTML files. In contrast, Tidybot purely exists to check existing files — it does not convert anything.

Another limitation is that Tidy is not a formal XHTML or XML validator. Its syntax checking, while extensive, does not guarantee 100% correctness. Similarly, there are certain exotic uses of XML/XHTML (namespaces are one example) which, while valid, are not supported by the Tidy library, and may therefore be incorrectly flagged as errors. Tidybot is good for keeping an eye on things and quickly spotting problems. It should never be used as a replacement for real validation checks.

There are quite a few other freeware wrappers and GUIs for TidyLib available that might suit your needs much better than Tidybot. A good starting point is the TidyLib Project’s Tidy Binaries page.

So why would you want to use Tidybot? Well, I wrote it for a medium-sized website (around 3200 static XHTML pages) with sections maintained by about a dozen different people, each of whom can upload their own files to the central repository. We run Tidybot automatically every night and make the report pages available as part of our developers' section. This way, there is always a current validation report available, and everyone can double-check their uploads and make sure that any errors that have crept in are quickly fixed.


Credits

Source Code

Tidybot uses the following components:

In addition, the GUI version of Tidybot uses:

On MS Windows, Tidybot also uses:

Finally, Tidybot’s name, design and functionality were much inspired by (but have by now also vastly deviated from) Hans de Graaff’s Checkbot, a link checker for web pages.

Infrastructure

The Tidybot source code is stored in a Subversion repository. Public repository viewing is provided by the FishEye browser, with additional statistics courtesy of mpy svn stats. Issue tracking is done with FogBugz project management software.


Bugs

If you encounter a problem with Tidybot (any problem, even something as trivial as, say, a typo in the documentation!) or if you have a feature request, please consider filing a bug report. Plain e-mail is of course always welcome as well, as is posting a message to the Tidybot User Forum.


Development

Anyone can browse the Tidybot repository.

You can also do a proper Subversion checkout of the latest Tidybot sources by using the command:

% svn checkout https://svn.kronto.org/tidybot/trunk/ tidybot-svn

Once you've done this, you can stay current by cd'ing into the created tidybot-svn subdirectory and saying:

% svn update
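The two steps can be combined so that the same command works on the first run and on every later one. This is a sketch only: the function name `sync_tidybot_sources` and the `SVN` and `WORK_DIR` variables are hypothetical, and it assumes the working copy keeps its `.svn` directory at the top level.

```shell
#!/bin/sh
# Sketch: check out the Tidybot sources on the first run, update afterwards.
# SVN and WORK_DIR are hypothetical variables introduced for this example.
SVN="${SVN:-svn}"
REPO_URL="https://svn.kronto.org/tidybot/trunk/"   # URL from the checkout command
WORK_DIR="${WORK_DIR:-tidybot-svn}"

sync_tidybot_sources() {
    if [ -d "$WORK_DIR/.svn" ]; then
        # A working copy already exists: bring it up to date.
        (cd "$WORK_DIR" && "$SVN" update)
    else
        # No working copy yet: do the initial checkout.
        "$SVN" checkout "$REPO_URL" "$WORK_DIR"
    fi
}
```

Running `sync_tidybot_sources` from the directory that should contain `tidybot-svn` keeps the sources current without having to remember which of the two commands applies.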


License

Tidybot is free software. It and its source code are released under the MIT License, which means they can be freely used, copied, modified, and distributed. Consult the license for full details.


Other Languages

To the best of my knowledge, Tidybot has not been translated into any other languages, but at least this documentation web page is now also available in German translation.


Tidybot is Copyright ©2005 by Leo Breebaart (leo@kronto.org)