Site Redesign
I have been working on an alternative design for HyperDrifter.com for a while. Up until the 15th of September 2009, the site was running on the Joomla CMS and, although Joomla provides practically everything you could need for a website, I wanted faster page load times and a simpler process for writing content. I also did not want to have to worry about keeping the software and its extensions up to date. This post summarises my work on some software to help me do this.
I started working on a project called WebGen about a year and a half ago. The idea was to take a bunch of unconnected content typically found on websites (HTML files, images, etc.) and collate it and organise it into a fully-navigable website with a consistent appearance. The individual HTML files could be written with any text editor and would not need to include any code related to the appearance of the site. Index pages and navigation menus would be automatically generated by looking at the directory structure and content of the original files.
The big difference between the WebGen idea and CMSs like Joomla or WordPress would, of course, be that the final website content would be static, as opposed to being retrieved from a database or cache and formatted every time a visitor calls up the page. The static format would remove the need for a server-side scripting language such as ASP or PHP and for access to a database. Of course, the static format has the disadvantage that any change made to the appearance of the site could require an upload of the entire website's contents. I know that this would be impractical for a large website, but my site didn't fall into that category.
This type of approach would also require the use of XHTML instead of HTML because files needed to be well-formed before they could be processed with XML tools. Because it is possible and very common to serve XHTML as HTML, I did not see this as a drawback.
Starting with Perl
I started out by writing a few scripts in Perl. These scanned the unconnected content and collected information such as filenames, MIME types, and creation and last-modification dates. The result of this process was an XML document that mirrored the structure of the original content and contained enough information for the other parts of WebGen to do their work.
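As a rough illustration of the scanning step, here is a minimal sketch in Python rather than the Perl the original scripts used. The element and attribute names (dir, file, name, mime, modified) are assumptions made for this example and are not taken from the real WebGen summary format.

```python
import mimetypes
import os
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def iso(timestamp):
    """Format a POSIX timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def scan(path, name=None):
    """Return a <dir> element that mirrors the directory tree under path.

    The element and attribute names are hypothetical; the real summary
    document may have looked quite different.
    """
    label = name if name is not None else os.path.basename(path)
    node = ET.Element("dir", name=label)
    for entry in sorted(os.listdir(path)):
        full = os.path.join(path, entry)
        if os.path.isdir(full):
            node.append(scan(full, entry))
        else:
            mime, _encoding = mimetypes.guess_type(entry)
            # Only the modification time is recorded here, because a true
            # creation time is not available portably through os.stat.
            ET.SubElement(
                node,
                "file",
                name=entry,
                mime=mime or "application/octet-stream",
                modified=iso(os.stat(full).st_mtime),
            )
    return node
```

Calling scan on the content directory and serialising the result with ET.tostring would give a single XML document that later steps can query instead of touching the file system again.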
I wrote some other Perl scripts that used the content of this XML document to direct their activities and generate parts of the final page. This worked reasonably well and I soon had a collection of scripts that were able to generate menus, index pages and breadcrumb trails. I then created an XSLT file that specified the structure of the final HTML page and pulled in the various fragments to produce the website-ready page.
Switching to XSLT
This all worked fairly well but I soon realised that I should have been writing XSLT files instead of Perl scripts. I couldn't replace the file scanner that generated the summary XML document because XSLT just isn't designed for that kind of functionality. But I could write XSLT files that read the summary document and wrote out the menu and breadcrumb fragments. Parameters containing XPath expressions could be passed to these XSLTs to specify which part of the summary document tree they should work on.
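To make the idea of pointing a generator at one part of the summary tree more concrete, here is a small sketch. It uses Python's ElementTree and its limited XPath support as a stand-in for XSLT, and the summary layout, the breadcrumb function and the "Home" label are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical summary document; the real one held more detail per file.
SUMMARY = """
<dir name="site">
  <file name="index.html"/>
  <dir name="projects">
    <dir name="webgen">
      <file name="intro.html"/>
    </dir>
  </dir>
</dir>
"""


def breadcrumb(summary_root, target):
    """Build an XHTML breadcrumb fragment for a slash-separated directory path.

    Each path segment is looked up with a simple XPath predicate, so a
    segment containing a single quote is not supported by this sketch.
    """
    node = summary_root
    href = "/"
    links = ['<a href="/">Home</a>']
    for part in [p for p in target.split("/") if p]:
        node = node.find(f"dir[@name='{part}']")
        if node is None:
            raise KeyError(f"no directory {part!r} under {href}")
        href += part + "/"
        links.append(f'<a href="{escape(href)}">{escape(part)}</a>')
    return '<p class="breadcrumb">' + " &gt; ".join(links) + "</p>"


if __name__ == "__main__":
    print(breadcrumb(ET.fromstring(SUMMARY), "projects/webgen"))
```

The target argument plays the role of the XPath parameter: the same function produces a different fragment for every directory without knowing anything else about the site.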
Another issue I was having at this stage was that the process didn't really allow for easy customisation. I found that too many command-line options would be necessary to allow for the kind of control I wanted. I then hit on the idea of using an XSLT file to control the entire process. The idea was that I could just write an XSLT that read the summary document and generated another XML document containing a list of actions. These actions would specify all the steps required to turn the original content into the final website-ready version.
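The shape of that transformation, summary document in and actions document out, can be sketched as follows. This is Python standing in for the controlling XSLT, and the action names (makeDir, transform, copyFile), their attributes and the content and output directory names are illustrative assumptions only.

```python
import xml.etree.ElementTree as ET

# Hypothetical summary document with just enough detail for this sketch.
SUMMARY = """
<dir name="site">
  <file name="index.html" mime="text/html"/>
  <file name="logo.png" mime="image/png"/>
  <dir name="blog">
    <file name="post.html" mime="text/html"/>
  </dir>
</dir>
"""


def generate_actions(summary_root, src="content", dest="output"):
    """Turn a summary tree into a flat, ordered list of action elements."""
    actions = ET.Element("actions")

    def walk(node, rel):
        for child in node:
            name = child.get("name")
            path = f"{rel}/{name}" if rel else name
            if child.tag == "dir":
                # Create the output directory before anything is put in it.
                ET.SubElement(actions, "makeDir", dest=f"{dest}/{path}")
                walk(child, path)
            elif child.get("mime") == "text/html":
                # Pages get wrapped in the site template.
                ET.SubElement(actions, "transform",
                              src=f"{src}/{path}", dest=f"{dest}/{path}")
            else:
                # Everything else is copied across unchanged.
                ET.SubElement(actions, "copyFile",
                              src=f"{src}/{path}", dest=f"{dest}/{path}")

    walk(summary_root, "")
    return actions


if __name__ == "__main__":
    print(ET.tostring(generate_actions(ET.fromstring(SUMMARY)), encoding="unicode"))
```

Because all the decisions live in one place, changing what the site build does means changing this one generator rather than adding another command-line option.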
The last part of the puzzle was another XSLT which would turn the actions document into system commands. I decided to use Perl again for these generated commands. So if, for example, there was a copyFile tag in the actions file, the XSLT would generate a "cp( src, dest );" command in the output.
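The mapping from action tags to command text can be sketched with a small lookup table. This is Python standing in for the XSLT; only the copyFile tag and the cp form come from the description above, while the makeDir action, the attribute names and the quoting helper are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical actions document.
ACTIONS = """
<actions>
  <makeDir dest="output/blog"/>
  <copyFile src="content/logo.png" dest="output/logo.png"/>
</actions>
"""


def perl_quote(text):
    """Return text as a Perl single-quoted literal.

    Inside single quotes Perl only treats backslash and the single quote
    itself as special, so those are the two characters escaped here.
    """
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


# One template per action tag, much like one XSLT template per element.
TEMPLATES = {
    "copyFile": lambda a: f"cp( {perl_quote(a.get('src'))}, {perl_quote(a.get('dest'))} );",
    "makeDir": lambda a: f"mkdir( {perl_quote(a.get('dest'))} );",
}


def to_commands(actions_root):
    """Translate each action element into one line of command text."""
    lines = []
    for action in actions_root:
        template = TEMPLATES.get(action.tag)
        if template is None:
            raise ValueError(f"unknown action: {action.tag}")
        lines.append(template(action))
    return lines


if __name__ == "__main__":
    print("\n".join(to_commands(ET.fromstring(ACTIONS))))
```

Supporting a new kind of action is then a matter of adding one more template, which is roughly what adding one more rule to the XSLT would look like.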
The following diagram shows the various components and their use in the process.
The advantages of this approach were that the generated actions could be as simple or as complicated as I wanted; I could control and configure the entire generation process within one file; and I could use a fairly straightforward syntax for the actions and let the XSLT that generates the Perl do the hard work.
To test this out, I created a basic XSLT file that generated actions for menus, directory contents and breadcrumbs within every content directory as well as the actions to transform every HTML file into the final version with all of these parts included.
The next diagram shows how the various parts of the process contribute to adding a menu to a web page.
Once I had the basic version working, I wanted to design a website template that would include RSS and Atom feeds, sitemaps in HTML and XML, .htaccess files for Apache, and robots.txt files. It was fairly straightforward to code up individual XSLT files for each of these features because all the information was in the summary document (called analysis.xml in the diagrams). I then set about creating the XSLT that would generate the actions required to pull these pieces together.
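An XML sitemap is a good example of how little extra work such a feature needs once the summary document exists. The sketch below is Python rather than XSLT, and the summary layout, the attribute names and the example.com base URL are assumptions; only the urlset, url, loc and lastmod elements and the namespace come from the sitemap protocol itself.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical summary document; dates are assumed to be in a format
# that the sitemap protocol already accepts.
SUMMARY = """
<dir name="site">
  <file name="index.html" mime="text/html" modified="2009-09-15"/>
  <file name="logo.png" mime="image/png" modified="2009-09-01"/>
  <dir name="blog">
    <file name="post.html" mime="text/html" modified="2009-09-14"/>
  </dir>
</dir>
"""


def sitemap(summary_root, base_url):
    """Build a <urlset> element listing every HTML page in the summary."""
    # The namespace is written as a plain xmlns attribute to keep the
    # serialised output free of prefixes.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)

    def walk(node, rel):
        for child in node:
            path = rel + child.get("name")
            if child.tag == "dir":
                walk(child, path + "/")
            elif child.get("mime") == "text/html":
                url = ET.SubElement(urlset, "url")
                ET.SubElement(url, "loc").text = base_url + path
                if child.get("modified"):
                    ET.SubElement(url, "lastmod").text = child.get("modified")

    walk(summary_root, "")
    return urlset


if __name__ == "__main__":
    tree = sitemap(ET.fromstring(SUMMARY), "http://www.example.com/")
    print(ET.tostring(tree, encoding="unicode"))
```

The feeds and the HTML sitemap follow the same pattern: walk the summary, pick out the pages, and write them in a different vocabulary.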
Building on Previous Work
One of the features of XSLT that came in handy at this point was the xsl:import feature, which allows you to import the rules from one XSLT file into another. Using this, I was able to use the basic actions generator I had done earlier as a starting point and only make the adjustments I needed to integrate the new content.
So I ended up with a fairly straightforward XSLT that generated all the content I wanted. The final task was to customise this XSLT to generate the HyperDrifter.com website.
I used the xsl:import feature again to provide a foundation for the new XSLT and, in the end, I only had to modify the actions.xsl and xhtml.xsl files.
So, on the 15th of September, I had finally generated content that I was happy to use in place of the Joomla CMS version while still providing all the extras. And this is the result. As I said at the outset, I wasn't as interested in the appearance but I think I achieved a fairly usable website. Now I just have to get on with updating my content...
Final Thoughts
I realised while working on this project that the XSL language is very flexible. Although it can seem a bit clunky at times, it does a lot of work for you and the integration of XPath can make the code very concise. If you find yourself having to write repetitious code to process an XML file in some scripting language, you should stop and ask yourself if the same thing could be achieved in an XSLT. There is a lot of help available on the web and if you come up against a road-block, the chances are that someone has attempted what you're attempting.
I think that WebGen could be useful to someone who is familiar with XSLT and I plan on getting the code into a form that can be released publicly. I'll need to write some decent documentation to explain all the steps involved but I'm hoping to work on that over the next few weeks.
Thanks for reading.