Future Media Standards & Guidelines

Semantic Mark-up v1.6

1 Introduction

1.1 Semantic mark-up is HTML that describes the content, rather than the manner in which the content is presented. It allows the meaning to be delivered to users regardless of the browser they use, so that content can be provided to the widest possible audience. (See Appendix 3 for more.)

1.2 As an example, <em>this text</em> uses HTML as it was intended by the W3C, while <i>this text</i> does not. The two will look the same in visual browsers (as opposed to screenreaders, etc.), but only semantic HTML preserves the meaning of the content across all types of browser: the emphasis of the words is maintained whether it is presented visually as italics (as it is by default) or in another way.

1.3 The BBC endorses the principle of separating content from presentation in web pages, using HTML as a semantic mark-up language.

2 Scope of This Standard

2.1 This standard is an 'ideal' ambition, which cannot be fully realised at this time, due to the legacy of non-semantic content currently on bbc.co.uk.

2.2 This standard must be applied, as far as is possible, to all new templates (e.g. World Service new templates, BBC TV test templates) and to pages produced outside existing templating systems and content management systems (e.g. bespoke news 'specials', pop-ups).

2.3 For pages that are in older templates (e.g. barley) where authors only have control of the content area, you SHOULD apply the markup structure (H tag structure) to the 'content' area of the document (see Appendix 2).

3 Principles of Application

3.1 You MUST NOT use semantic tags outside the purposes defined below, e.g. if you use the blockquote tag you MUST ONLY use it as defined below and not for any other purpose, such as to set a particular presentation style.

3.2 If you are capturing semantic meaning in a document, you MUST use the appropriate semantic tag, e.g. an <abbr> tag MUST be used in preference to <span class="abbr">.
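
As a minimal sketch of 3.2, the fragment below contrasts the semantic tag with a class that merely imitates it. The sentence and the expansion text in the title attribute are illustrative only.

```html
<!-- Preferred: the element itself carries the meaning -->
<p>The <abbr title="World Wide Web Consortium">W3C</abbr> publishes the HTML specification.</p>

<!-- Avoid: the meaning exists only in a class name -->
<p>The <span class="abbr">W3C</span> publishes the HTML specification.</p>
```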

4 Headings

4.1 All pages MUST use heading elements.

4.2 Heading elements MUST convey the structure of the document (as described in Appendix 1), rather than the editorial emphasis of its content (e.g. the most important story on the page).

4.3 Heading elements MUST be ordered hierarchically, i.e. an H2 element SHOULD be preceded by an H1 element somewhere on the same page, an H3 element by an H2 element, an H4 element by an H3 element, etc. Intermediate levels MUST NOT be omitted (e.g. by going from H1 directly to H3).

NOTE: this is to maintain well-structured pages, which are essential in order to deliver a good user experience to those using assistive technologies such as screenreaders.
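
The ordering rule in 4.3 might be illustrated as follows. These are two separate page fragments, not one page, and the heading text is placeholder content.

```html
<!-- Acceptable: each level is introduced by the level above it -->
<h1>Page subject</h1>
<p>Introductory text</p>
<h2>Section</h2>
<p>Section text</p>
<h3>Sub section</h3>
<p>Sub section text</p>

<!-- Not acceptable: H3 follows H1 with no intermediate H2 -->
<h1>Page subject</h1>
<p>Introductory text</p>
<h3>Sub section</h3>
<p>Sub section text</p>
```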

4.4 Headings SHOULD be followed by further content, e.g.

<h3>Title</h3> <p>Text text</p>

4.5 Headings SHOULD NOT be treated as 'standalone' content.

4.6 There MUST NOT be a consecutive series of same-level headings without content between them, e.g.

<h3>Title</h3>

<h3>Title</h3>

4.7 Headings of different levels MAY follow one another directly (without content between them) to specify hierarchy. For example:

<h3>Section</h3>

<h4>Sub section</h4>

4.8 There MUST be one, and only one, H1 per page.

4.9 The H1 MUST be the subject of that page, e.g. for http://news.bbc.co.uk/1/hi/uk/5261908.stm, the H1 would be "Ryanair issues luggage ultimatum", not "BBC", "BBC News" or "BBC NEWS | UK | Ryanair issues luggage ultimatum".
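
A sketch of 4.8 and 4.9 for the page cited above. It assumes the longer "BBC NEWS | UK | ..." string belongs in the title element; the standard does not say where that string appears, so treat the placement as illustrative.

```html
<head>
  <!-- Assumed location for the full site / section / story string -->
  <title>BBC NEWS | UK | Ryanair issues luggage ultimatum</title>
</head>
<body>
  <!-- The single H1 names the subject of the page only -->
  <h1>Ryanair issues luggage ultimatum</h1>
  <p>Story text</p>
</body>
```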

5 Lists

5.1 There are three valid list types: ordered lists <ol>; unordered lists <ul>; definition lists <dl>

5.2 All lists SHOULD be preceded by a heading - <h*>description</h*> - that describes the content of the list. Example: <h2>Other top stories</h2> before a list of other top stories.

5.3 <ul> unordered list: MUST ONLY be used where the order of the list is not editorially significant.

5.4 <ol> ordered list: MUST ONLY be used where the order of the list items is editorially significant, even if the numbers are hidden with CSS.
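
Clauses 5.2 to 5.4 could look like this in practice. The headings and list items are placeholders; the second list assumes a ranking in which position matters.

```html
<!-- Order is not editorially significant: unordered list -->
<h2>Other top stories</h2>
<ul>
  <li>Story headline</li>
  <li>Another story headline</li>
</ul>

<!-- Order is editorially significant (a ranking): ordered list -->
<h2>Most read stories</h2>
<ol>
  <li>Most read headline</li>
  <li>Second most read headline</li>
</ol>
```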

5.5 <ul> and <ol> type lists MUST have at least one <li> item.

5.6 <dl> lists MUST contain at least one <dt> with a corresponding <dd>.

5.7 <dd> MUST have at least one corresponding <dt>.

5.8 <dl> MAY have multiple terms for a given definition, as well as multiple definitions for a given term.

5.9 <dl> definition list: MUST ONLY be used to describe terms and their definitions.
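
A minimal sketch of the definition list rules in 5.6 to 5.9, showing both several terms sharing a definition and one term with several definitions. The glossary content is invented for illustration.

```html
<h2>Glossary</h2>
<dl>
  <!-- Two terms sharing one definition -->
  <dt>Mark-up</dt>
  <dt>Markup</dt>
  <dd>Tags added to text to describe its structure and meaning.</dd>

  <!-- One term with two definitions -->
  <dt>Browser</dt>
  <dd>Software that presents web pages visually.</dd>
  <dd>Any software that presents web content to a user, such as a screenreader.</dd>
</dl>
```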

5.10 In content, you SHOULD use h tag structure rather than nested lists.

5.11 In navigation you MAY use nested lists.

5.12 Nested lists MUST NOT be more than 3 levels deep. The limit is illustrated below:

  • item
  • item
    • item
    • item
      • item
      • item
  • item
  • item
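
In mark-up, a navigation list at the maximum permitted depth might look like the sketch below. Each nested list sits inside the list item it belongs to; the item text is placeholder content.

```html
<h2>Nav</h2>
<ul>
  <li>Level 1 item
    <ul>
      <li>Level 2 item
        <ul>
          <!-- Deepest permitted level: no further nesting below here -->
          <li>Level 3 item</li>
        </ul>
      </li>
    </ul>
  </li>
  <li>Level 1 item</li>
</ul>
```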

6 Table Mark-up

6.1 Tables SHOULD ONLY be used for conveying tabular data, not for presentational (layout) purposes.

6.2 Tables MUST be used for tabular data. Tabular data is data that has relationships in two or more dimensions.

6.3 If you are displaying tabular data you MUST use a "summary" attribute to describe the editorial intent of the data.

6.4 If you wish to apply a caption to your table (e.g. to give the source or copyright of the data) you SHOULD use a caption tag to do this.

6.5 If you wish to supply a title to your data you MUST use a heading tag, so as to enable navigation to the table within the page by screenreaders.

6.6 In a data table you MUST make use of <thead> and <tbody>.

6.7 If your table has a footer this MUST be encapsulated in a <tfoot> tag.

6.8 If you have table headings you MUST use <th> tags for these.
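
The clauses in this section can be combined into one data table, sketched below. All figures, city names and wording are invented for illustration. The footer is placed before the body because HTML 4.01 requires <tfoot> to appear before <tbody>; browsers still render it at the foot of the table.

```html
<!-- Title as a heading, so the table can be reached by heading navigation (6.5) -->
<h3>Average rainfall by city</h3>
<table summary="Compares average rainfall in January and July for two cities.">
  <!-- Caption used for the source of the data (6.4) -->
  <caption>Source: illustrative figures only</caption>
  <thead>
    <tr>
      <th>City</th>
      <th>January (mm)</th>
      <th>July (mm)</th>
    </tr>
  </thead>
  <tfoot>
    <tr>
      <td colspan="3">Figures are rounded to the nearest millimetre.</td>
    </tr>
  </tfoot>
  <tbody>
    <tr>
      <td>City A</td>
      <td>50</td>
      <td>40</td>
    </tr>
    <tr>
      <td>City B</td>
      <td>80</td>
      <td>60</td>
    </tr>
  </tbody>
</table>
```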

7 Other Semantic Tags

7.1 These presentational tags MUST NOT be used:

  • <b> - bold contents
  • <i> - italic contents
  • <big> - increased font size
  • <blink> - alternating foreground and background colours
  • <marquee> - scrolling text
  • <s> - strikes through text
  • <small> - decreases font size
  • <strike> - strikes through text
  • <tt> - teletypewriter style
  • <u> - underlines contents
  • <center> - centres a section of text
  • <nobr> - creates a region of non-breaking text
  • <font> - changes the size, style and colour of text

7.2 These semantic tags MUST be used where the content matches their description:

  • <p> - defines a paragraph of text.
  • <em> - indicates emphasis.
  • <strong> - indicates stronger emphasis.

7.3 These semantic tags SHOULD be used where the content matches their description:

  • <blockquote> - defines a block quotation.
  • <q> - defines a short quotation (inline).
  • <cite> - contains a citation or a reference to other sources.
  • <abbr> - indicates an abbreviated form (e.g. BBC, HTML).
  • <dfn> - defines instances of special terms or phrases.
  • <code> - designates a fragment of computer code.
  • <samp> - designates sample output from programs, scripts, etc.
  • <kbd> - indicates text that is typed on a keyboard.
  • <var> - indicates an instance of a variable or program argument.
  • <ins> - defines inserted document content.
  • <del> - defines deleted document content.
  • <address> - defines an address.
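
A short sketch of several of the tags from 7.2 and 7.3 in use. The sentences, the publication name and the program details are invented; only the choice of tag is the point.

```html
<p>The report was <strong>not</strong> published until <em>after</em> the deadline.</p>

<!-- Block quotation: the quoted text is wrapped in its own paragraph -->
<blockquote>
  <p>A longer quotation, set apart from the surrounding text.</p>
</blockquote>

<p>The spokesperson called it <q>a short inline quotation</q>, according to
<cite>The Example Gazette</cite>.</p>

<p>Type <kbd>help</kbd> and the program prints <samp>usage: example [options]</samp>.
The <var>timeout</var> argument is passed to <code>connect()</code>.</p>

<p>The meeting is on <del>Tuesday</del> <ins>Wednesday</ins>.</p>
```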

7.4 These notational tags MUST be used:

  • <sub> - subscripted text
  • <sup> - superscripted text

7.5 These tags MAY be used:

  • <br /> - inserts a line break into the text flow
  • <pre> - preformats text, although alternatives to using this tag are recommended

7.6 You MUST ONLY use <br /> tags to create single line breaks. For more than one line break an alternative should be found (e.g. paragraphs).

7.7 You SHOULD NOT use <br /> tags in general. Recognised exceptions are poetry, address areas and where the line break may be argued as part of the meaning rather than the presentation.
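
One way to read 7.6 and 7.7 is sketched below, using an address as the recognised exception. The address itself is invented.

```html
<!-- Acceptable: single line breaks that are part of the meaning of an address -->
<address>
  Example House<br />
  1 Example Street<br />
  London
</address>

<!-- Avoid: more than one line break in a row, used to create spacing -->
<p>First paragraph<br /><br />Second paragraph</p>

<!-- Preferred alternative: separate paragraphs -->
<p>First paragraph</p>
<p>Second paragraph</p>
```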

8 Tag Attributes

8.1 You MAY use height and width attributes on images and embedded media.

8.2 You SHOULD NOT use height and width attributes on any other tags. Height and width SHOULD be defined by the CSS.

8.3 You SHOULD NOT use border attributes on tags. Borders SHOULD be defined by the CSS.

8.4 You SHOULD NOT use align, valign or clear attributes.

8.5 You SHOULD NOT use style attributes, except where using syndicated content or internal syndicating systems.

8.6 For alt and title attributes see Textual Equivalents Standard.
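
A hedged sketch of 8.1 to 8.5: presentation is moved out of attributes and into CSS, while height and width stay on the image. The class name, file name and style values are hypothetical, and in a real page the rules would normally live in the head or in an external style sheet.

```html
<style type="text/css">
  /* Hypothetical class: width, border and alignment are defined here */
  .promo { width: 20em; border: 1px solid #999; text-align: center; }
  .promo img { border: 0; }
</style>

<!-- Avoid: <div align="center" style="width: 20em"> -->
<div class="promo">
  <!-- height and width MAY be set on images (8.1) -->
  <img src="example.jpg" alt="Description of the image" width="200" height="150" />
  <p>Promotional text</p>
</div>
```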

9 Microformats

9.1 You MAY use microformats on your site where there are agreed, not draft, specifications (refer to the Microformats community wiki site for details) with the exception of those that use the title attribute of HTML's abbr element.

9.1.1 Some microformats use the abbr element to conceal machine-readable data; for example, date-times and geographical coordinates. Screen-reader users who expand abbreviations will hear the full date-time or coordinate; for example, 2008-05-15T19:30:00+01:00 instead of 19:30.

9.1.2 If you want to use microformats that rely on the abbr element you MUST first discuss this with your product lead.

9.1.3 If you want to use draft microformats you MUST first discuss this with your product lead.

9.2 If you do use microformats, you MUST ensure that the title attribute contains human-readable data. See also the section on title attributes in the Textual Equivalents Standard.
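
As an illustration, the fragment below uses the hCard class names vcard, fn, org and url, which involve no abbr element, followed by the abbr pattern that 9.1.1 describes, using the class name dtstart from hCalendar. The choice of hCard and hCalendar is an assumption made for the example; check the Microformats community wiki for the current status of any specification before relying on it.

```html
<!-- hCard for an organisation: no abbr element involved -->
<div class="vcard">
  <a class="fn org url" href="http://www.bbc.co.uk/">BBC</a>
</div>

<!-- The pattern described in 9.1.1: machine-readable data in a title attribute.
     Discuss with your product lead before using this (9.1.2). -->
<p>Starts at <abbr class="dtstart" title="2008-05-15T19:30:00+01:00">19:30</abbr></p>
```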

10 Forms

10.1 For compound elements (where text is used to label a form element), the <label> tag MUST be used to explicitly associate the relevant text label with its form element.

10.2 This MUST be done using a 'for' attribute on the label and a pairing 'id' attribute on the element.

e.g. <label for="apple">apple</label><input id="apple" />.

10.3 If there is no text that labels the form element then the element MUST have a title attribute.

10.4 A label-input pair (compound element) SHOULD be contained in a block level element (e.g. <div>, <p> or <li> tag).

10.5 A label-input pair SHOULD NOT be contained in a <dl>, as this provides no additional structural information.
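
Taken together, the clauses in this section suggest a form along the lines of the sketch below. The action URL, field names and label text are hypothetical.

```html
<form action="/search" method="get">
  <!-- Label and input explicitly paired by for/id, inside a block-level element -->
  <div>
    <label for="fruit">Favourite fruit</label>
    <input type="text" id="fruit" name="fruit" />
  </div>
  <div>
    <!-- No text labels this field, so it carries a title attribute (10.3) -->
    <input type="text" id="q" name="q" title="Search term" />
    <input type="submit" value="Search" />
  </div>
</form>
```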

11 Appendix

Appendix 1

As explained on the internal wiki, making a document structure for a web page is challenging, resulting in the following compromise of structural layout.

Idealised page heading structure for non-portal pages (those with left-hand navigation).

The structure leads with the H1 as the page 'subject', followed by story-related headings, then page sections, then global navigation:

  • H1 - page subject
    • H2 - story subsection
    • H2 - story subsection
      • H3 - story sub-subsection
    • H2 - story subsection
    • H2 - Story related Nav (functions and features around this story / associated links section)
      • H3 - right nav section
    • H2 - Nav
      • H3 - local nav (left nav)
      • H3 - toolbar nav
      • H3 - footer nav
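
Expressed as mark-up, the non-portal outline above might reduce to a skeleton like this. The heading text mirrors the outline and the content between headings is placeholder; real pages would wrap these sections in their own template structure.

```html
<h1>Page subject</h1>
<h2>Story subsection</h2>
<p>Story text</p>
<h2>Story subsection</h2>
<h3>Story sub-subsection</h3>
<p>Story text</p>
<h2>Story subsection</h2>
<p>Story text</p>

<h2>Story related nav</h2>
<h3>Right nav section</h3>
<ul>
  <li>Associated link</li>
</ul>

<h2>Nav</h2>
<h3>Local nav</h3>
<ul>
  <li>Local link</li>
</ul>
<h3>Toolbar nav</h3>
<ul>
  <li>Toolbar link</li>
</ul>
<h3>Footer nav</h3>
<ul>
  <li>Footer link</li>
</ul>
```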

Idealised page heading structure for 'portal' pages (those with no left-hand navigation).

  • H1 - page subject
    • H2 - story subsection
    • H2 - story subsection
      • H3 - story sub-subsection
    • H2 - story subsection
    • H2 - Story related Nav (functions and features around this story / associated links section)
      • H3 - right nav section
    • H2 - Nav
      • H3 - toolbar nav
      • H3 - footer nav

Appendix 2 - Existing structure

Although it leaves the navigation orphaned in templates that do not support the structure above (i.e. you cannot apply h tags to left-hand navigation under this model), the following structure remains desirable:

  • H1 - page subject
    • H2 - story subsection
    • H2 - story subsection
      • H3 - story sub-subsection
    • H2 - story subsection
    • H2 - Story related Nav (functions and features around this story / associated links section)
      • H3 - right nav section

Appendix 3

Historically, designers and developers have been forced to use HTML to control both the content and the presentation of web pages. In order to achieve increasingly complex designs in inconsistent browsers, the limited set of tags provided by HTML was used in more and more convoluted ways, resulting in documents that looked good, but which were vastly inefficient, inflexible and inaccessible.

As a result, pages became simultaneously difficult to maintain and more prone to cross-browser incompatibility: the worst of both worlds. Vitally important tags like <p>, <blockquote> and <table> were misused because of their default presentational characteristics, or replaced by other less meaningful tags. Purely presentational tags like <font> and <b> were used to make things look right, with no attention paid to the editorial meaning of the content. Often, even after many hours spent breaking the rules still further in an attempt to solve problems, pages didn't make sense to non-visual browsers. Following the 2001 DDA (Disability Discrimination Act), unreasonably inaccessible web pages are now illegal.

With the introduction of CSS (Cascading Style Sheets) and improvements in browser standards, it has become possible to achieve design in a way that preserves the meaning of the content. By separating content from presentation, and by using the proper tools for the job, pages have become smaller, more flexible, easier to maintain, easier to index in search engines, and accessible by all. Finally, the content written by writers can be presented in the way designers intend and in a way that developers can build and maintain.

BBC © 2014