Structuring XSLT code

We continue our discussion of coding style. In the previous article we covered some aspects of naming. Today we're going to dive into another vital question: how do you structure an XSLT project?

XSLT projects have their peculiarities. True, they're not as big as many industrial Java enterprise projects; in fact, they are relatively small. But precisely because of their small scope, XSLT projects are started and completed far more often.

If your team has a policy on how to structure source code, you can get a new project off the ground quickly.

General note

We all understand why standards play a critical part in software development. Nonetheless, we don't live in an ideal world, and we frequently meet teams that maintain whole sets of divergent policies. I work in such a team. :-)

My #1 rule here is: be consistent. If consistency is not possible at the team level, be consistent across the projects you lead. There's no need (and no money) to convert legacy projects to new conventions. Think ahead.

Multi-module projects

Don't mix files from different modules in one folder. Use a folder per module.

- src
   |- common
      |- functions.xsl
      +- templates.xsl
   |- jats
      |- identity.xsl
      +- jats.xsl
   +- bits
      |- identity.xsl
      +- bits.xsl

Module structure

Single responsibility principle

Paraphrasing the single responsibility principle, every module should implement a single part of the product's functionality. But remember to keep a reasonable balance between one stylesheet full of spaghetti code and a hell of too many tiny stylesheets.

Imports file

Extract xsl:import and xsl:include statements into an imports file. Do not confuse people with imports scattered across files. Because it mirrors the module structure, the imports file also documents the internal composition of the module.

<!-- Contents of imports.xsl -->
<xsl:stylesheet ...>

    <xsl:import href="module-1.xsl"/>

    <xsl:import href="module-2.xsl"/>

</xsl:stylesheet>

A perfect example of how not to organize imports:

<!-- Contents of module-1.xsl -->
<xsl:stylesheet ...>

    <xsl:import href="module-2.xsl"/>

    <!-- Templates, functions, etc. -->

</xsl:stylesheet>

Parameters file

Create a separate parameters stylesheet, declaring the module API.

<!-- Contents of parameters.xsl -->
<xsl:stylesheet ...>

    <!-- Turns on verbose logging. -->
    <xsl:param name="debug" select="false()" as="xs:boolean"/>

    <!-- A URI of the TOC file to be generated. -->
    <xsl:param name="toc-uri" required="yes" as="xs:anyURI"/>

</xsl:stylesheet>
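
To show how such an API might be consumed, here is a minimal sketch of a stylesheet that includes parameters.xsl and uses both parameters. The version attribute, the //h1 selection, and the toc and entry output elements are assumptions made for illustration only; they are not part of the module described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Assumes parameters.xsl sits next to this file. -->
    <xsl:include href="parameters.xsl"/>

    <xsl:template match="/">
        <!-- Verbose logging only when the caller turns $debug on. -->
        <xsl:if test="$debug">
            <xsl:message>
                <xsl:text>Writing TOC to </xsl:text>
                <xsl:value-of select="$toc-uri"/>
            </xsl:message>
        </xsl:if>

        <!-- Write a placeholder TOC (hypothetical format) to the requested URI. -->
        <xsl:result-document href="{$toc-uri}">
            <toc>
                <xsl:for-each select="//h1">
                    <entry>
                        <xsl:value-of select="."/>
                    </entry>
                </xsl:for-each>
            </toc>
        </xsl:result-document>
    </xsl:template>

</xsl:stylesheet>
```

Because toc-uri is declared with required="yes", a conforming processor should reject a run that does not supply it, which is exactly the kind of contract a parameters file makes visible.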

Main file

Well-designed transforms have general rules that are invoked when a specific match is not found. I never rely on built-in rules and tend to override them explicitly.

Create a file that will answer:

  • What are the base transform rules?
    • Identity transform?
    • Inclusive copy (do not transform elements without exact match)?
    • General ignore patterns?
  • What is the error handling policy?
    • Handle dynamic errors (XSLT 3.0)?
    • Abort if document structure is confusing?
    • Log missing optional elements?
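
One possible shape for such a file is sketched below. It assumes an XSLT 2.0 processor and an html document element; the explicit priorities and the choice of what to ignore are illustrative assumptions, not a recommendation for every project.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- main.xsl: a hypothetical sketch of base rules and error policy -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Base rule: an explicit identity transform instead of the built-in rules. -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- General ignore pattern: drop comments and processing instructions.
         The explicit priority avoids a conflict with the identity rule. -->
    <xsl:template match="comment() | processing-instruction()" priority="1"/>

    <!-- Error policy: abort when the document element is not the expected one. -->
    <xsl:template match="/*[not(self::html)]" priority="10">
        <xsl:message terminate="yes">
            <xsl:text>Unexpected document element: </xsl:text>
            <xsl:value-of select="name()"/>
        </xsl:message>
    </xsl:template>

</xsl:stylesheet>
```

In XSLT 3.0, dynamic errors could additionally be caught with xsl:try and xsl:catch; that part is left out of the sketch to keep it runnable on a 2.0 processor.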

Copy source XML schema

When adding stylesheets, mirror the structure of the source XML format: give each major element its own stylesheet.

- html
   |- head.xsl
   |- body.xsl
   |- table.xsl
   |- ol.xsl
   |- ...

Sample layout

Putting these points together, a source tree might look like this:

- html
   |- html-lib
      |- body.xsl
      |- head.xsl
      |- imports.xsl
      |- main.xsl
      |- parameters.xsl
      +- utility.xsl
   +- html.xsl 
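
How html.xsl ties the library together is not prescribed here. One way to wire it, purely as an assumption, is to treat html.xsl as the public entry point that pulls in the imports file and the parameters file:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- html.xsl: hypothetical public entry point of the module -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- xsl:import must precede all other declarations.
         Assumes imports.xsl imports main.xsl, head.xsl, body.xsl and utility.xsl. -->
    <xsl:import href="html-lib/imports.xsl"/>

    <!-- The module API; global parameters are visible in every module. -->
    <xsl:include href="html-lib/parameters.xsl"/>

</xsl:stylesheet>
```

With this arrangement, callers only ever reference html.xsl, and everything under html-lib can be reorganised without touching them.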

Source file structure

A stylesheet should contain, in order:

  • XML declaration
  • Copyright notice if needed
  • Module documentation if applicable
  • xsl:stylesheet element
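
As a minimal sketch, a file that follows this order could look like the skeleton below. The copyright text, the documentation comment, and the version attribute are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
    Copyright (c) YEAR OWNER. Placeholder copyright notice.
-->
<!--
    Placeholder module documentation: transforms the body element.
-->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Templates, functions, etc. -->

</xsl:stylesheet>
```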

Stylesheet declaration

Ordering

Start with the template for the root element, then go deeper into the document tree.

<xsl:template match="html"/>

<xsl:template match="body"/>

<xsl:template match="p"/>

Secondary ordering

Some developers are comfortable reading templates in ascending priority order (0, 1, 2). They argue that going from general to specific reflects our thinking process. In object-oriented languages, too, we usually place the base class first and then its subclasses.

<xsl:template match="p[1]"/>

<xsl:template match="p[1][@class]" priority="10"/>

<xsl:template match="p[1][@class = 'info']" priority="20"/>

For me, it is easier to start from the top, that is, from the highest priority. Either way, remember to stay consistent.

Utility code

  1. Place helper functions (or templates) directly below the code that calls them.
  2. If there are multiple callers, put them at the bottom of the file.
  3. Extract them to a utility file if you use them heavily in many stylesheets.
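
The first rule can be sketched as follows. The u prefix, its namespace URI, and the u:css-class function are hypothetical names invented for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:u="http://example.com/ns/utility"
    exclude-result-prefixes="xs u">

    <!-- The caller. -->
    <xsl:template match="p">
        <p class="{u:css-class(.)}">
            <xsl:apply-templates/>
        </p>
    </xsl:template>

    <!-- Helper with a single caller: placed right under it.
         Returns the element's class, or 'default' when there is none. -->
    <xsl:function name="u:css-class" as="xs:string">
        <xsl:param name="element" as="element()"/>
        <xsl:sequence select="if ($element/@class)
                              then string($element/@class)
                              else 'default'"/>
    </xsl:function>

</xsl:stylesheet>
```

Once a second template starts calling u:css-class, the function would move to the bottom of the file, and once other stylesheets need it, to utility.xsl.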

Finally

As a final point, let me tell you the story of the Commissioners' Plan of 1811. Manhattan was originally designed as a street grid. New York citizens weren't overly enthusiastic about reviving the ancient Greek idea. But after the City Council started executing the plan, the borough began growing by leaps and bounds. Even today Manhattan has its streets north of Houston St. laid out in a numbered grid pattern, so every newcomer can get around very easily.

Finding a place on the NYC map is quick; searching for a particular line of code, by contrast, can take me hours. Neither a debugger nor an excellent modern IDE will recover the time you lose tinkering with an ill-designed system. A well-thought-out code structure, on the other hand, brings joy and happiness. :-)

Serhiy Hapiy

Started my career in software development in 2012, working on XSLT solutions. Later participated in core Java and Python projects. My current domain is Big Data and Machine Learning.
