How not to write spec documents

A few of us keep complaining about the way the HTML5 specification is written. The argument we are given is that HTML5 is written in a more prescriptive form to reduce the ambiguity that exists in the more traditional “declarative” style of RFC documents. I’ve spent a fair bit of time reading specs over the last few years, and although they were tough to get going with, I feel pretty comfortable with them now.

So, Ben Foster was asking on Twitter about his use of the “alternate” link relation and I decided to go check out the spec. To find the spec I first went here and discovered that the spec is

Let’s walk through this text…

Link type "alternate"

Shame they decided to use the term link type instead of the standard “link relation”.  The attribute is “rel” after all. 

The alternate keyword may be used with link, a, and area elements.

Hmmm, ok. They are defining the use of alternate within the context of an HTML document. That’s unfortunate, but it is their prerogative. It just means that if someone like Ben decides that the idea of an “alternative” link is useful in his media type, then he is free to make up his own rules on what alternative means to him.

The meaning of this keyword depends on the values of the other attributes.

Whoa.  Hold on. What attributes?  Where?  Maybe it will be explained next…

If the element is a link element and the rel attribute also contains the keyword stylesheet

The alternate keyword modifies the meaning of the stylesheet keyword in the way described for that keyword. The alternate keyword does not create a link of its own.

Oh, a special case. What I think they are saying is that if we have rel="stylesheet alternate" then alternate changes the meaning of stylesheet in a way defined somewhere else. Oh dear, that directly contradicts RFC 5988, which states:

 Relation types SHOULD NOT infer any additional semantics based upon
 the presence or absence of another link relation type, or its own
 cardinality of occurrence.
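
For what it is worth, here is how I read that special case, as a small Python sketch. The function name link_kinds is mine, not the spec's, and the sketch assumes the element is a link element and ignores every keyword other than stylesheet and alternate.

```python
def link_kinds(rel):
    """Return the kinds of link a rel value creates on a link element.

    Hypothetical helper: this is one reading of the quoted text, not spec code.
    """
    # rel holds space-separated keywords; HTML compares them case-insensitively
    tokens = {token.lower() for token in rel.split()}
    kinds = set()
    if "stylesheet" in tokens:
        # Here "alternate" only modifies "stylesheet"; it creates no link of its own
        if "alternate" in tokens:
            kinds.add("alternate stylesheet")
        else:
            kinds.add("stylesheet")
    elif "alternate" in tokens:
        kinds.add("alternate")
    return kinds


print(link_kinds("stylesheet alternate"))
print(link_kinds("alternate"))
```

The branch on stylesheet is exactly the thing RFC 5988 asks relation types not to do: the meaning of one keyword depends on whether another one is present.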

Moving on...

The alternate keyword is used with the type attribute set to the value application/rss+xml or the value application/atom+xml

Huh?  In which element? Just those two media types?  No others? To set what value? What are they talking about?

The keyword creates a hyperlink referencing a syndication feed (though not necessarily syndicating exactly the same content as the current page).

How does a keyword create a hyperlink? Do they mean that the alternate link relation indicates that the target of the link will be a syndication feed with the media type designated by the type attribute?  Why don’t they say that? 

The first link, a, or area element in the document (in tree order) with the alternate keyword used with the type attribute set to the value application/rss+xml or the value application/atom+xml must be treated as the default syndication feed for the purposes of feed autodiscovery.

Yay for standards and simplicity.  I get to reference my feed three different ways as long as I only use these two predefined media types.  Good job we don’t want to use the media type for versioning or anything like that. 
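
To check that I understood the rule, here is a rough sketch of feed autodiscovery as I read it, using Python's standard html.parser module. The names FeedFinder and default_feed are made up for illustration, and the sketch assumes that the order in which start tags are seen is the same as tree order.

```python
from html.parser import HTMLParser

# The only two media types the quoted text recognises as feeds
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedFinder(HTMLParser):
    """Collect the href of every link, a, or area element that marks a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a", "area"):
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").strip().lower()
        # On a link element, "stylesheet alternate" is the other special case
        if tag == "link" and "stylesheet" in rel:
            return
        if "alternate" in rel and media_type in FEED_TYPES and attributes.get("href"):
            self.feeds.append(attributes["href"])


def default_feed(html):
    """Return the href of the first feed link in the document, or None."""
    finder = FeedFinder()
    finder.feed(html)
    finder.close()
    return finder.feeds[0] if finder.feeds else None


page = '<link rel="alternate" type="application/atom+xml" href="data.xml">'
print(default_feed(page))
```

Note how the two media types are hard-coded. A feed served as anything else simply does not exist as far as this rule is concerned.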

The following link element gives the syndication feed for the current page:

<link rel="alternate" type="application/atom+xml" href="data.xml">

The following extract offers various different syndication feeds:

<p>You can access the planets database using Atom feeds:</p>
<ul>
 <li><a href="recently-visited-planets.xml" rel="alternate" type="application/atom+xml">Recently Visited Planets</a></li>
 <li><a href="known-bad-planets.xml" rel="alternate" type="application/atom+xml">Known Bad Planets</a></li>
 <li><a href="unexplored-planets.xml" rel="alternate" type="application/atom+xml">Unexplored Planets</a></li>
</ul>

And now we are resorting to examples, because the definition was sufficiently opaque that the average reader is just going to skip over the blurb and move right on to the examples. One thing that I have learned about specs is that they tend to be split into two parts, normative and non-normative. The normative part is the bit that is worded very carefully to minimize ambiguity. The non-normative bits are examples and algorithms that help to illustrate the specification but aren’t guaranteed to be 100% precise. This separation allows the normative spec to stay smaller, although it ends up being a bit dry. Intermixing the two leads to what we have above: vague and confusing text with examples that show only part of the story.


The keyword creates a hyperlink referencing an alternate representation of the current document.

Ok, so now we seem to be talking about the more general case. Wouldn’t it make more sense to define the general case first and then call out the exceptions that are specific to the nature of HTML? The annoying thing is that the keyword doesn’t create a hyperlink. It is the link, a, or area element that tells the user agent that a link needs to be created. The “alternate” link relation provides some meaning to the link. The way this is phrased makes it sound like I could do <span class="alternate"/> and a hyperlink would be created.

The nature of the referenced document is given by the media, hreflang, and type attributes.

If you follow the links looking for the meaning of these attributes you end up in a section called “4.12.2 Links created by a and area elements”. What about the link element? If you dig deeper you get nice self-referential statements like “The media attribute describes for which media the target document was designed”. It is interesting that they use the term target, as that term is defined in RFC 5988 to mean the resource referenced by the URL of the link. However, HTML defines target completely differently. In fact, target can be an attribute of an a or area element. But that’s not the target document that they are talking about. Fun times.

If the alternate keyword is used with the media attribute, it indicates that the referenced document is intended for use with the media specified.

If the alternate keyword is used with the hreflang attribute, and that attribute's value differs from the root element's language, it indicates that the referenced document is a translation.

If the alternate keyword is used with the type attribute, it indicates that the referenced document is a reformulation of the current document in the specified format.

The media, hreflang, and type attributes can be combined when specified with the alternate keyword.

For example, the following link is a French translation that uses the PDF format:

<link rel=alternate type=application/pdf hreflang=fr href=manual-fr>

This relationship is transitive — that is, if a document links to two other documents with the link type "alternate", then, in addition to implying that those documents are alternative representations of the first document, it is also implying that those two documents are alternative representations of each other.
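
Here is my attempt at boiling the media, hreflang, and type rules above down to a few lines of Python. The function describe_alternate and its document_lang parameter are my own invention, the wording of the returned claims is arbitrary, and the sketch deliberately ignores the stylesheet and syndication feed special cases.

```python
def describe_alternate(attrs, document_lang=None):
    """List what an alternate link claims about the referenced document.

    attrs is a plain dict of the element's attributes; document_lang stands in
    for the root element's language. Both are assumptions of this sketch.
    """
    claims = []
    media = attrs.get("media")
    hreflang = attrs.get("hreflang")
    media_type = attrs.get("type")
    if media:
        claims.append(f"intended for media: {media}")
    # Only a translation if the language differs from the current document's
    if hreflang and hreflang.lower() != (document_lang or "").lower():
        claims.append(f"translation into: {hreflang}")
    if media_type:
        claims.append(f"reformulation as: {media_type}")
    return claims


# The French PDF example from the spec text, linked from an English page
print(describe_alternate({"type": "application/pdf", "hreflang": "fr"}, document_lang="en"))
```

Three independent facts that can be combined freely. Stated that way, the general case fits in a handful of lines, which is rather the point.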

I could go on nitpicking, but the point I am trying to make is that in trying to make these specs “more accessible” and “less ambiguous”, I think they have done the opposite. It is true that IETF RFCs are notorious for being dry and dense in their wording. However, they are consistent, so with a little effort to understand the style you can quickly get over the initial pain and they become much easier to read. The end result is a set of documents that are far more coherent. In my opinion, this effort to write documents that lead people by the nose through a nested bunch of “if this then that” logic produces a far inferior specification than the declarations of inter-related facts that are the basis of RFC style specs.

If you are wondering if maybe I just picked out an exceptional example from the HTML spec, I dare you to go and try and grok the “browsing contexts” section…

