... and what we are doing to bring them back
A couple of weeks ago we made the decision to start removing microformats from BBC sites that used the DateTime pattern, the most popular of which is hCalendar.
This pattern was provided to give non-BBC programmers an API (an "application programming interface" - these allow computers as well as human readers to use BBC data) with which they could create software using the information on sites such as /programmes and iPlayer.
Unfortunately the pattern had a number of flaws, which I'll summarise here:
In terms of accessibility: using the DateTime pattern causes some screen readers (in non-default configurations) to read out the contents of the title attribute rather than the text content of the element, meaning users will hear data which is designed to be understood by computers rather than information designed to be understood by people.
In terms of usability: using the DateTime pattern causes a tooltip to appear containing this machine-readable data when the user hovers the mouse over the text content. Some technical users may understand "1998-03-12T09:30:00-05:00", but the majority of BBC users will not.
Because of the above problems, we changed our semantic markup standard, adding a rule that the title attribute MUST contain human-readable data.
This is why microformats have started to disappear from BBC sites.
We need to uphold the needs of our users, and see if we can find alternative patterns which do not have these negative usability and accessibility side-effects before programmers start building too much software which depends on the DateTime pattern in its current form.
The BBC have engaged with the microformats community to come up with alternative patterns. While this is a complex process, I hope that through this engagement an alternative pattern will be found which satisfies all the demands on it, from a programming, web standards, usability and accessibility perspective.
My colleague Jake Archibald (who is a Senior Client Side developer in BBC FM&T) has more technical detail on this decision below, and the latest summary of the debate around these alternative patterns.
Jonathan Hassell is Head of User Experience & Accessibility, BBC Future Media & Technology
The Microformat DateTime Pattern
Michael Smethurst blogged about the removal of this pattern from bbc.co.uk/programmes on the BBC Radio Labs blog a couple of weeks ago, and then the world went mad.
The RDFa guys started claiming victory and a small war broke out in the microformats community around alternatives to the pattern.
I'd like to clear up exactly why we don't support the current pattern, and what alternatives have been proposed.
What are microformats?
The HTML elements we use in modern web development are from a specification released in 1999. The web has evolved considerably since then and there are notable gaps in the specification. A developer can, using HTML, identify some content as computer code, but cannot identify a telephone number or a date.
The microformats community create HTML patterns for describing things such as contact details and calendar events. Other programs and websites can read this pattern and present the data in another way. An example of this is the Operator plugin for Firefox, which recognises the hCalendar microformat and lets the user add detected events to their Google calendar. (n.b image from nennett on flickr)
What's wrong with the current datetime pattern?
To be held on
<abbr class="dtstart" title="1998-03-12T08:30:00-05:00">
12 March 1998 from 8:30am EST
</abbr> until
<abbr class="dtend" title="1998-03-12T09:30:00-05:00">
9:30am EST
</abbr>
The screen reader issue
This is the most commonly discussed issue, but it's not as big as people suggest. Some screen readers in non-default configurations will read out the contents of the title attribute rather than the text content of the element, meaning users may hear the machine data rather than the human data.
Personally, I believe that screen readers should read the title attribute rather than the text content by default (I'll come back to this later), but they don't, so it's not that much of an issue.
The tooltip issue
This is the biggest issue in my opinion. When you use the above pattern, a tooltip will appear containing the content of the title attribute when the user hovers the mouse over the text content. Like the screen reader issue, this is presenting machine data to the human user. Some technical users may understand "1998-03-12T09:30:00-05:00", but the majority of BBC users will not.
The semantic issue
The HTML4 and XHTML2 specifications say the <abbr> element is for marking up an abbreviated form with the expanded form in the title attribute, and the <abbr> element should be used around each instance of the abbreviated form. The HTML4 specification says the content of the title attribute may be presented to the user, so you can conclude that the content is intended for humans.
On the other hand, the XHTML2 spec is vague, defining the title attribute as "meta-information about the element on which it is set". In XHTML2 land, the microformat use of <abbr> seems valid.
HTML5 defines <abbr> as an abbreviation or acronym, with an optional expansion via the title attribute. In my opinion, this is the best definition of <abbr>. Expansions should only be used when they're needed. So it would be used like this:
<p>I am 6<abbr title="foot">ft</abbr> tall and work for the BBC</p>
Here I have expanded 'ft', because I read it as 'foot', whereas I read 'BBC' as each letter individually. This is why I believe screen readers should read from the title attribute of <abbr> elements rather than their text content.
The BBC's decision
Because of the above issues, we changed our semantic markup standard, adding a rule that the title attribute MUST contain human readable data. This is why some microformats have started to disappear from BBC sites.
What are the alternatives?
RDFa is a possible alternative, but BBC sites will require an exemption from our standards and guidelines before they can use it, because pages that use it don't validate as XHTML Strict.
Microformats are an excellent way of adding additional semantic value to a page without compromising validation. However, we can't use them if they create usability or accessibility issues.
Alternatives to the datetime pattern have already been proposed which attempt to solve the current problems. Here's a quick overview of 3 proposals...
Empty elements with title:
To be held on
<span class="dtstart" title="1998-03-12T08:30:00-05:00"></span>
12 March 1998 from 8:30am EST until
<span class="dtend" title="1998-03-12T09:30:00-05:00"></span>
9:30am EST
Here, empty elements are used to create key-value pairs using class and title. Screen readers ignore the empty element and the hover area for the tooltip is zero-width so it won't appear to the user in normal circumstances. For microformat parsers, there's little change from the current implementation, as most (if not all) do not require an <abbr> element.
However, it has the same semantic issues with title as the current standard has, and should an empty element even have a title?
It's also been raised that some CMS / tidying systems have issues with empty elements, making them self-closing or removing them completely.
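To illustrate why parsers would need little change, here is a minimal sketch of a reader that takes the class and title pair from any element, whether it is an <abbr> or an empty <span>. It assumes Python's standard html.parser module; the names TitleDateParser, extract_dates and DATETIME_CLASSES are hypothetical and not taken from any real microformat parser.

```python
from html.parser import HTMLParser

# Assumption: only these two property names matter for the illustration.
# A real hCalendar parser would recognise many more.
DATETIME_CLASSES = {"dtstart", "dtend"}


class TitleDateParser(HTMLParser):
    """Collects class/title key-value pairs from any element."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        title = attr_map.get("title")
        if title is None:
            return
        # The element name (abbr, span, ...) is deliberately ignored.
        for name in (attr_map.get("class") or "").split():
            if name in DATETIME_CLASSES:
                self.found[name] = title


def extract_dates(html):
    """Return a dict mapping property name to its machine-readable value."""
    parser = TitleDateParser()
    parser.feed(html)
    parser.close()
    return parser.found
```

Because the sketch never checks the tag name, the same function handles the current <abbr> markup and the empty-element markup alike.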
Data in the class attribute:
To be held on
<span class="dtstart data-1998-03-12T08:30:00-05:00">
12 March 1998 from 8:30am EST
</span> until
<span class="dtend data-1998-03-12T09:30:00-05:00">
9:30am EST
</span>
Here, the machine data is moved into the class attribute. The content of the class attribute is never presented as human readable data and the spec proposes using it "For general purpose processing by user agents". The developer is free to use whatever element is semantically best. Microformat parsers would have to find the element with the identifying class, such as 'dtstart', then look in the same attribute for the data class beginning 'data-'. Elements could have many identifying classes, but only one data class.
However, despite the "general purpose" definition of the class attribute, it's an unusual use of the attribute and not in line with the object-oriented concept of 'class'. Also, a principle of microformats is to keep data visible to humans, whereas this proposal intentionally hides data in the class attribute.
Here's a link to further discussion of the data class proposal
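The lookup a parser would need for the data class proposal might look something like the sketch below. It assumes Python's standard html.parser module; DataClassParser, extract_data_classes and the 'data-' prefix handling are my own illustration of the steps described above, not code from an existing parser.

```python
from html.parser import HTMLParser

# Assumption: a small subset of identifying classes, for illustration only.
IDENTIFYING_CLASSES = {"dtstart", "dtend"}
DATA_PREFIX = "data-"


class DataClassParser(HTMLParser):
    """Reads machine data from a 'data-' class next to an identifying class."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        names = [c for c in classes if c in IDENTIFYING_CLASSES]
        data = [c[len(DATA_PREFIX):] for c in classes if c.startswith(DATA_PREFIX)]
        # Many identifying classes are allowed, but only one data class.
        # Elements with zero or several data classes are skipped.
        if names and len(data) == 1:
            for name in names:
                self.found[name] = data[0]


def extract_data_classes(html):
    """Return a dict mapping identifying class to the value of the data class."""
    parser = DataClassParser()
    parser.feed(html)
    parser.close()
    return parser.found
```

Skipping elements with more than one data class is a design choice in this sketch; the proposal only says there should be one, not what a parser should do when there are several.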
Date and time separation using value excerption
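To be held on
<span class="dtstart dtend">
<abbr class="value" title="1998-03-12">12 March 1998</abbr>
</span> from
<span class="dtstart">
<abbr class="value" title="08:30">8:30am</abbr>
<abbr class="value" title="-0500">EST</abbr>
</span> until
<span class="dtend">
<abbr class="value" title="09:30">9:30am</abbr>
<abbr class="value" title="-0500">EST</abbr>
</span>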
Here, the time information is split up into separate parts. The <abbr> element and title attribute are used to provide the machine alternative to individual date parts. The machine data is still displayed to the user via a tooltip and potentially read by a screen reader, but splitting it up makes it feel more human (and keeps the data visible). A single span can represent both the start and end times, ideal for situations like the above where the start date is only mentioned once, as it is also the end date. Parsers would have to collect all the elements with an identifying class such as 'dtstart', then gather all the <abbr> elements within with class 'value'. The parser would recognise the string patterns in the title attributes (as they are not in a particular order) and construct a full date from them.
However, this pattern seems complicated for both implementors and parsers, involves more elements and can require multiple 'dtstart' and 'dtend' classes, as in the example above. The semantic issues around <abbr> and title remain, as does the possibility of screen readers reading the machine value rather than the human value. Tooltips would still be presented to the user, which may be unwanted and potentially confusing. Non-technical users may not be used to seeing dates year first, or timezones represented in that way.
Here's a link to further discussion of the separation proposal
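To give a feel for the extra work this puts on parsers, here is a rough sketch of the collect, recognise and construct steps described above. It assumes Python's standard html.parser and re modules. The class ValueExcerptParser, the function extract_datetimes and the three regular expressions are hypothetical and only cover the date, time and timezone shapes used in this post.

```python
import re
from html.parser import HTMLParser

PROPERTIES = {"dtstart", "dtend"}  # assumed subset, for illustration

# Assumed string shapes; the parts can appear in any order in the markup.
PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "time": re.compile(r"\d{2}:\d{2}(:\d{2})?"),
    "zone": re.compile(r"Z|[+-]\d{2}:?\d{2}"),
}


class ValueExcerptParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_elements = []  # stack of (tag, property classes on that tag)
        self.parts = {}          # property -> {"date": ..., "time": ..., "zone": ...}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = set((attr_map.get("class") or "").split())
        title = attr_map.get("title")
        if "value" in classes and title:
            self._record(title.strip())
        self.open_elements.append((tag, classes & PROPERTIES))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        for i in range(len(self.open_elements) - 1, -1, -1):
            if self.open_elements[i][0] == tag:
                del self.open_elements[i:]
                break

    def _record(self, value):
        # Recognise the part by its shape, then file it under every
        # property class found on an enclosing element.
        kind = next((k for k, p in PATTERNS.items() if p.fullmatch(value)), None)
        if kind is None:
            return
        for _, props in self.open_elements:
            for prop in props:
                self.parts.setdefault(prop, {})[kind] = value


def extract_datetimes(html):
    """Return a dict mapping property name to a reassembled date string."""
    parser = ValueExcerptParser()
    parser.feed(html)
    parser.close()
    result = {}
    for prop, found in parser.parts.items():
        if "date" not in found:
            continue
        value = found["date"]
        if "time" in found:
            value += "T" + found["time"] + found.get("zone", "")
        result[prop] = value
    return result
```

Even in this cut-down form the parser has to track nesting and guess what each title means from its shape, which is the complexity concern raised above.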
It's clear that none of the proposals are ideal, and as usual semantics play a big part in the debates between them. The microformats community need to come up with a solution that solves the issues with the current pattern and doesn't create any new ones. Once they do that, microformats such as hCalendar (in their new form) will begin to reappear on BBC sites.
Jake Archibald is a Senior CSD in BBC Future Media & Technology