XSLT naming conventions

Anton Chekhov

This is Anton Pavlovich Chekhov, a famous Russian writer. I like Chekhov. Among other things, he believed that writing well means writing briefly. Although his ideas are far from primitive, his style is clean and simple. It guides me through a narrative right to the idea.

We communicate in code. I want my code to be as intelligible as Chekhov's prose. And the first step towards that is a style guide.

To be honest, I was stunned that the XSLT community does not have such a document. So, I have chosen to share my own experience. My goal is not to give a formula but to explain particular choices. And I am always open to discussion. Also, having started with naming conventions, I plan to continue writing on XSLT coding style. So, in the end, we may come up with a comprehensive guide to be published on GitHub or elsewhere.

Principles

  • Spare the maintainer. XML is verbose. Over time, our brains learn to ignore angle brackets, repeated tags and xsl prefixes. But let's at least not make things more complicated.
  • Follow the rules. The standard XPath library has set a convention. Although some examples may be confusing (e.g. fn:day-from-dateTime), it is still a good reference.
  • Respect the context. We're not doing OOP here. We are not required to use its terminology.

Modules

Every XSLT module starts with the name. The name of the module explains its purpose. Good naming can save the reader's time, so we want to be careful about that.

We deal with different types of XSLT transforms, including but not limited to:

  1. Data mapping. For instance, DocBook to HTML, any XSL-FO transform.
  2. Aggregating, filtering, etc. Build TOC or a book map.
  3. Styling or typesetting. Apply @class attributes to an HTML file.

The name can vary depending on the purpose. But the overall scheme is consistent.

Purpose | Comment                    | Example
Map     | Mention the formats        | docbook-html.xsl
Filter  | Mention source and output  | docbook-toc.xsl, dita-bookmap.xsl
Style   | Mention source and purpose | html-style.xsl

I suggest omitting "to" (docbook-to-html.xsl), "2", "transform", and similar words, because the meaning follows from the nature of XSLT. I don't use CamelCase either, simply because this is not Java. I use hyphens for a single reason: the XPath library uses them.

Sometimes we want to prepend the project name: xmlrocks-docbook-html.xsl.

Sometimes we split modules by functionality.

- docbook-html.xsl
- docbook-html-common.xsl
- docbook-html-list.xsl
- docbook-html-p.xsl

The duplication above does not make sense at all, so I usually move the child stylesheets to a library folder.

- docbook-html.xsl
- docbook-html-lib
   |- common.xsl
   |- list.xsl
   +- p.xsl
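
A minimal sketch of how the main module could pull in the library folder. It assumes the children are combined with xsl:include (xsl:import would work too if you need import precedence) and that the three files exist at these relative paths.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- docbook-html.xsl: entry point of the mapping -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Child stylesheets live in the library folder, so their file
       names no longer repeat the "docbook-html" part -->
  <xsl:include href="docbook-html-lib/common.xsl"/>
  <xsl:include href="docbook-html-lib/list.xsl"/>
  <xsl:include href="docbook-html-lib/p.xsl"/>

</xsl:stylesheet>
```

The href values are resolved relative to the including stylesheet, so the folder can be moved together with the main module.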

Namespace declarations

  • Be consistent. Always use the same namespace prefixes so as not to confuse people.
  • Avoid long prefixes (e.g. my-cool-library:calculate-something()). Let your namespace URI explain your lib.
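
As an illustration of the second point, here is a sketch with a short prefix and a descriptive namespace URI. The prefix xr, the URI and the function body are all made up for this example; only the idea (short prefix, telling URI) comes from the rule above.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:xr="http://example.com/xmlrocks/docbook-html/functions"
                exclude-result-prefixes="xs xr">

  <!-- The URI says which library this is; the prefix stays short -->
  <xsl:function name="xr:calculate-something" as="xs:integer">
    <xsl:param name="items" as="item()*"/>
    <!-- Placeholder logic: just counts the items it receives -->
    <xsl:sequence select="count($items)"/>
  </xsl:function>

  <xsl:template match="/">
    <total>
      <xsl:value-of select="xr:calculate-something(//*)"/>
    </total>
  </xsl:template>

</xsl:stylesheet>
```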

Template modes

I think of modes in terms of nouns. There's no point in naming a mode generate-toc or convert-xhtml-to-oasis. We already know that we're converting something, right? It is OK to describe the nature of the conversion: toc, xhtml-oasis.

Note that I continue using hyphens in modes for the sake of rationality.
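
A small sketch of a noun mode in action. The input vocabulary (book, chapter, title, para) is a simplified, namespace-free stand-in for DocBook that I assume here for brevity.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/book">
    <html>
      <body>
        <!-- First pass over the chapters: the "toc" mode -->
        <ul>
          <xsl:apply-templates select="chapter" mode="toc"/>
        </ul>
        <!-- Second pass over the same chapters: the default mode -->
        <xsl:apply-templates select="chapter"/>
      </body>
    </html>
  </xsl:template>

  <!-- The mode names what is being built, not the act of building it -->
  <xsl:template match="chapter" mode="toc">
    <li><xsl:value-of select="title"/></li>
  </xsl:template>

  <xsl:template match="chapter">
    <h1><xsl:value-of select="title"/></h1>
    <xsl:apply-templates select="para"/>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

</xsl:stylesheet>
```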

Variables and parameters

Use hyphens and nouns.

In our team, we always disagree: some argue that global parameters must be called differently so that they can be easily recognized. Say, they can have dots as the separator: toc.levels. Personally, I don't like complex naming rules. We live in the age of IDEs, so syntax highlighting can solve many issues.
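
For comparison, a sketch where the global parameter follows the same hyphen-and-noun rule as a local variable. The section and title elements and the default value of 2 are assumptions made for the example.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Global parameter: a hyphenated noun, no special separator -->
  <xsl:param name="toc-levels" select="2"/>

  <xsl:template match="/">
    <!-- Local variable: the same rule applies -->
    <xsl:variable name="toc-sections"
                  select="//section[count(ancestor-or-self::section) &lt;= $toc-levels]"/>
    <toc>
      <xsl:for-each select="$toc-sections">
        <entry><xsl:value-of select="title"/></entry>
      </xsl:for-each>
    </toc>
  </xsl:template>

</xsl:stylesheet>
```

With the dotted variant the declaration would read name="toc.levels" and the reference $toc.levels; both are legal names, so the choice is purely a matter of convention.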

Callable components

Named templates

In "normal" programming languages callable components are verbs. But because XSLT routines are often related to transforming structures, saying transform-something is not necessary. Consider:

<xsl:call-template name="transform-paragraph"/>
<xsl:template name="transform-paragraph"/>

and:

<xsl:call-template name="paragraph"/>
<xsl:template name="paragraph"/>

However, sometimes it makes sense to include a verb:

<xsl:template name="remove-whitespace"/>

Or, say, we have various routines on paragraphs: join-paragraphs, duplicate-paragraphs. A specific name is always better. But when you cannot come up with a concrete name, beware of meaningless ones.
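
Here is a sketch that puts both kinds of names side by side. The para element, the text parameter and the choice of normalize-space() as the whitespace routine are assumptions for the example, not a prescribed implementation.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="para">
    <!-- Noun name: the call itself implies a transformation -->
    <xsl:call-template name="paragraph">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Builds the output paragraph from the given text -->
  <xsl:template name="paragraph">
    <xsl:param name="text"/>
    <p>
      <xsl:call-template name="remove-whitespace">
        <xsl:with-param name="text" select="$text"/>
      </xsl:call-template>
    </p>
  </xsl:template>

  <!-- Verb name: one specific operation. Here it strips leading and
       trailing whitespace and collapses inner runs to a single space -->
  <xsl:template name="remove-whitespace">
    <xsl:param name="text"/>
    <xsl:value-of select="normalize-space($text)"/>
  </xsl:template>

</xsl:stylesheet>
```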

Stylesheet functions

The first temptation of a rising XSLT star is to use accessor-like names. Because we cannot set something in XSLT (immutable variables), we get things: fn:get-anchor-id, fn:get-nearest-paragraph. We use "get" as a short-hand version of "query", and it's fine.

But sometimes I just omit that word and the code magically gets cleaner.

<!-- Before -->
<xsl:variable name="nearest-paragraph-anchor-id"
              select="fn:get-anchor-id(fn:get-nearest-paragraph(.))"/>

<!-- After -->
<xsl:variable name="nearest-paragraph-anchor-id"
              select="fn:anchor-id(fn:nearest-paragraph(.))"/>

Of course, it's not a must. And by and large, we still have verbs in operational functions: fn:concat-names.
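
To make the noun-style functions concrete, here is one possible way to declare them. The function bodies, the para and indexterm elements and the namespace URI are my assumptions for this sketch. Note that the fn prefix is bound to a custom URI here: stylesheet functions cannot be declared in the standard XPath functions namespace, because it is reserved.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:fn="http://example.com/xslt/functions"
                exclude-result-prefixes="xs fn">

  <!-- Nearest enclosing para (or the node itself); empty if none -->
  <xsl:function name="fn:nearest-paragraph" as="element(para)?">
    <xsl:param name="node" as="node()"/>
    <xsl:sequence select="$node/ancestor-or-self::para[1]"/>
  </xsl:function>

  <!-- The element's own @id if present, otherwise a generated one -->
  <xsl:function name="fn:anchor-id" as="xs:string">
    <xsl:param name="element" as="element()?"/>
    <xsl:sequence select="if ($element/@id)
                          then string($element/@id)
                          else generate-id($element)"/>
  </xsl:function>

  <xsl:template match="indexterm">
    <xsl:variable name="nearest-paragraph-anchor-id"
                  select="fn:anchor-id(fn:nearest-paragraph(.))"/>
    <index-entry target="#{$nearest-paragraph-anchor-id}">
      <xsl:value-of select="primary"/>
    </index-entry>
  </xsl:template>

</xsl:stylesheet>
```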

Conclusions

The XML community is known for being extremely lenient. I often meet people who tend to ignore code quality issues. I don't think of quality as an indisputable value either, but in our company we raise the standard. Each time I discover badly formatted code or a module inconsistent in style, I feel I could be doing things more important than learning new conventions.

In the next post we'll get deeper into formatting. Although it's even harder to argue about personal preferences in formatting, it should be fun. :)

In conclusion, I want to ask you, dear reader, to join the discussion. Whether or not you found my thoughts helpful, please criticize freely.

Serhiy Hapiy

Started my career in software development in 2012, working on XSLT solutions. Later participated in core Java and Python projects. My current domain is Big Data and Machine Learning.
