Future Media Standards & Guidelines

XML Standards v1.00

1. XML Documents

1.1 Your XML Documents MUST be well-formed. Each XML Document MUST be valid according to its accompanying XML Schema.

Note: for the purposes of 1.1, the term XML Schema includes DTDs (see Definitions).

1.2 Your XML Documents SHOULD include an XML declaration including version number and character encoding.
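
As an illustration of 1.1 and 1.2, a minimal well-formed document might begin as follows. The element names and the choice of UTF-8 are assumptions made for the example, not requirements of this standard.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical vocabulary: element names are illustrative only -->
<programme>
   <title>Example Programme</title>
   <duration unit="minutes">30</duration>
</programme>
```

When present, the XML declaration has to be the very first thing in the document, with no whitespace or comments before it.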

2. XML Vocabularies

2.1 When you need to employ an XML Vocabulary in your project, you MUST search (using internet search engines for external vocabularies, and forthcoming BBC XML Namespace Registries for internal vocabularies) for existing vocabularies which fulfil your semantic requirements.

2.2 Where such a vocabulary already exists, you SHOULD use it.

2.3 If an existing XML Vocabulary partially meets your requirements you SHOULD:

  • either, contact the XML Vocabulary owner and suggest that it is extended;
  • or, create a new XML Vocabulary which imports/includes the existing XML Vocabulary for extension or restriction
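
The second option in 2.3 could be sketched in W3C XML Schema roughly as follows. The namespace URIs, the file name base-1.0.xsd and the type base:ItemType are all hypothetical; the sketch assumes the existing vocabulary exposes a named complex type that can be extended.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: every URI, file name and type name here is hypothetical -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns:base="http://example.org/ns/base/1.0"
   xmlns="http://example.org/ns/extended/1.0"
   targetNamespace="http://example.org/ns/extended/1.0"
   elementFormDefault="qualified">
   <!-- Bring in the existing vocabulary from its own namespace -->
   <xs:import namespace="http://example.org/ns/base/1.0"
      schemaLocation="base-1.0.xsd"/>
   <!-- Extend one of its types with an additional element -->
   <xs:complexType name="ExtendedItemType">
      <xs:complexContent>
         <xs:extension base="base:ItemType">
            <xs:sequence>
               <xs:element name="note" type="xs:string"/>
            </xs:sequence>
         </xs:extension>
      </xs:complexContent>
   </xs:complexType>
</xs:schema>
```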

2.4 If you can find no existing XML Vocabulary which meets your requirements, you SHOULD create your own, considering:

  • whether or not to use an XML Schema Language, and
  • creating an XML Namespace for the vocabulary if it may be of use to others in the BBC.

3. Creating XML Vocabularies using XML Schema Languages

3.1 When to use an XML Schema Language

You SHOULD use an XML Schema Language to define your XML Vocabulary when:

  • The order and structure of elements/attributes is important
  • You expect persons other than yourself to use it
  • You want to provide tools to help people author it correctly
  • You expect it to be in use for a reasonable length of time
  • You wish to document the vocabulary

3.1.1 Background

XML Schema Languages are an excellent mechanism for documenting a given XML Vocabulary; they can often be used to generate human-readable documentation as well as allowing validation of XML with a suitable parser. Defining your vocabulary in an XML Schema Language makes it easier for others to understand and support.

If an XML Vocabulary captures and describes important business knowledge, or is critical to the operation of systems and software, then an XML Schema is an excellent means of ensuring that your documents are fit for purpose.

3.2 Choosing an XML Schema Language

3.2.1 You SHOULD use W3C XML Schema (WXS) when:

  • You intend to create and/or use an XML Namespace in your vocabulary
  • You intend to share your vocabulary, or incorporate part of someone else's
  • You need to define specific data types for elements/attributes
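
A minimal sketch of a W3C XML Schema meeting the criteria in 3.2.1 might look like this. The namespace URI and element names are hypothetical; the point is the target namespace and the built-in data types (xs:date, xs:positiveInteger).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the namespace URI and element names are hypothetical -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns="http://example.org/ns/programme/1.0"
   targetNamespace="http://example.org/ns/programme/1.0"
   elementFormDefault="qualified">
   <xs:element name="programme">
      <xs:complexType>
         <xs:sequence>
            <!-- Each child element is given a specific data type -->
            <xs:element name="title" type="xs:string"/>
            <xs:element name="firstBroadcast" type="xs:date"/>
            <xs:element name="duration" type="xs:positiveInteger"/>
         </xs:sequence>
      </xs:complexType>
   </xs:element>
</xs:schema>
```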

3.2.2 You SHOULD use a Document Type Definition (DTD) when:

  • A compact representation of your document rules is important
  • The document rules primarily concern nesting of elements, not semantic constraints on contents (as in prose markup)
  • You are working with partners who can only accept this format
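
By way of contrast, a DTD for a small prose-style vocabulary could be as compact as the sketch below, shown here as an internal subset so that the rules and a conforming document appear together. The element names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: a hypothetical prose vocabulary declared in an internal DTD subset -->
<!DOCTYPE article [
   <!ELEMENT article (heading, para+)>
   <!ELEMENT heading (#PCDATA)>
   <!ELEMENT para (#PCDATA | em)*>
   <!ELEMENT em (#PCDATA)>
]>
<article>
   <heading>Example heading</heading>
   <para>Some <em>emphasised</em> prose.</para>
</article>
```

The declarations constrain nesting and order, but they cannot state that an element's text must be, say, a date or a positive integer.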

3.2.3 All things being equal, you SHOULD choose W3C XML Schema over DTD as this is likely to be the long term choice of the BBC.

3.2.4 Background

The use of a common set of schema languages (or a single language) in the BBC enables software interoperability, vocabulary reuse and skills transfer.

There are a number of different XML schema languages including: Document Type Definition (DTD), W3C XML Schema (WXS), Document Structure Description (DSD), RELAX-NG, XDuce, TREX and Schematron. Each of these has its own particular strengths and weaknesses, and it is not possible to legitimately argue that any one is better than another in all cases.

DTD and W3C XML Schema are in widespread use, have good software support, and are recommended by the W3C. These factors have persuaded us that they are likely to remain as industry standards where others, such as RELAX-NG and Schematron, may fall into obscurity.

W3C XML Schema has been chosen as the preferred option due to its excellent data-typing capabilities and very strong support for XML Namespaces, which are strategically important to the BBC.

4. Identifying vocabularies using XML Namespaces

4.1 Should I use XML Namespaces?

4.1.1 You SHOULD create an XML Namespace for each of your XML Vocabularies if you expect the vocabularies to be reused. Give this careful consideration whenever you create a new XML Vocabulary.

4.1.2 You MAY create a single XML Namespace for all your XML Vocabularies.

4.1.3 Background

Some XML Vocabularies, such as a config file or a temporary working format, may not necessitate the creation of a Namespace. Others may form part of a family of vocabularies and share the same Namespace. Others may benefit from their own Namespace.

4.1.4 The following scenarios are provided as a guide to help you decide on which approach you should take:

4.1.4.1 No Namespace:

Advantages:

  • verbosity of XML Documents reduced
  • human readability of XML Documents increased

Disadvantages:

  • reusability and resource discovery hampered
  • if defined with W3C XML Schema, elements will appear to inherit the XML Namespace of any schema that includes them (xs:include)

4.1.4.2 Single Namespace:

Advantages:

  • good for ensuring maximum interoperability within vocabularies maintained by a single entity
  • encourages careful consideration when naming and defining content models for elements

Disadvantages:

  • increases likelihood of element name collision
  • over time, the XML Namespace may cover a large XML Vocabulary with many sub-vocabularies

4.1.4.3 Multiple Namespaces:

Advantages:

  • decreased risk of element name collision
  • encourages rapid growth of XML Vocabularies
  • maximum semantic precision enabled

Disadvantages:

  • increases complexity of XML Schemas
  • high levels of interdependency between XML Namespaces could emerge
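
To make the trade-off concrete, the sketch below combines two hypothetical vocabularies, each with its own namespace. Both define an element called title, and the prefixes keep the two meanings apart at the cost of a more verbose document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: both namespace URIs and all element names are hypothetical -->
<prog:programme xmlns:prog="http://example.org/ns/programme/1.0"
   xmlns:person="http://example.org/ns/person/1.0">
   <!-- title of the programme -->
   <prog:title>Example Programme</prog:title>
   <person:presenter>
      <!-- honorific title of a person: same local name, different namespace -->
      <person:title>Dr</person:title>
      <person:name>A. N. Other</person:name>
   </person:presenter>
</prog:programme>
```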

4.2 Creating XML Namespaces

4.2.1 XML Namespaces MUST conform to the W3C Namespaces in XML recommendation (w3.org).

4.2.2 XML Namespace URIs

4.2.2.1 You SHOULD use a BBC XML Namespace Registry (where available) as the namespace URI. For example:

4.2.2.2 XML Namespace URIs MUST be valid URLs as per RFC 1738 (see w3.org/Addressing, URL memo).

4.2.2.3 XML Namespace URIs MUST be unique and persistent.

4.2.2.4 XML Namespace URIs MUST contain information identifying the version of the XML Vocabulary.

4.2.2.5 Background

Though the benefits of persistent URIs, such as Uniform Resource Names (URNs), are recognised, URLs will prove adequate if carefully managed and maintained.

You are advised to choose a URL which does not reflect internal BBC departmental structures, as these are likely to change over time.

As XML Namespaces are used to identify a particular version of an XML Vocabulary, any naming convention must provide a sensible means of labelling versions.
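
One possible shape for a namespace URI that satisfies 4.2.2.3 and 4.2.2.4 is sketched below. The URI is entirely hypothetical (it is not a BBC XML Namespace Registry URL); it simply places the version in the path and names the vocabulary rather than a department.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical URI: the version (1.0) is part of the namespace name,
     and the path names the vocabulary rather than a department -->
<programme xmlns="http://example.org/ns/programme/1.0">
   <title>Example Programme</title>
</programme>
```

Under such a convention, an incompatible revision would be published under a new URI (for instance ending in 2.0), leaving documents in the 1.0 namespace unaffected.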

4.2.3 XML Namespace URI Targets

4.2.3.1 XML Namespace URI Targets MUST be unique and persistent.

4.2.3.2 XML Namespace URI Targets SHOULD be RDDL Resource Directories (according to the 18 February 2002 RDDL specification; see www.openhealth.org, Resource Directory Description Language).

4.2.3.2.1 RDDL Resource Directories MUST be valid according to the RDDL DTD (www.rddl.org).

4.2.3.2.2 RDDL Resource Directories MUST contain a human-readable description of the XML Vocabulary identified by the namespace.

4.2.3.2.3 RDDL Resource Directories MUST contain contact information for the namespace maintainers.

4.2.3.2.4 RDDL Resource Directories MUST contain resource links to a DTD and/or W3C XML Schema for the XML Vocabulary.

4.2.3.2.5 RDDL Resource Directories MAY contain resource links to any other relevant resources.

4.2.3.3 Background

There are many possible resources that an XML Namespace could resolve to, including: a DTD, XML Schema, RDF Schema (RDFS), documentation, some item of software, XSLT or CSS stylesheet, etc.

A common means of addressing this problem is the use of an RDDL Resource Directory as the namespace target.

Although this is a hotly debated subject, the BBC considers that it is useful to have XML Namespaces resolve to a resource. RDDL provides the greatest flexibility as a target, and the simplest implementation.
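
A much-reduced sketch of an RDDL Resource Directory covering the MUST requirements above (description, contact information, and a link to a schema) might look like this. All URIs, titles and contact details are hypothetical. The xlink:role and xlink:arcrole values are those conventionally used for a W3C XML Schema and should be checked against the RDDL specification. The DOCTYPE needed for validation against the RDDL DTD is omitted for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: URIs, titles and contact details are hypothetical -->
<html xmlns="http://www.w3.org/1999/xhtml"
   xmlns:rddl="http://www.rddl.org/"
   xmlns:xlink="http://www.w3.org/1999/xlink">
   <head>
      <title>Programme vocabulary, version 1.0</title>
   </head>
   <body>
      <h1>Programme vocabulary, version 1.0</h1>
      <!-- Human-readable description and maintainer contact -->
      <p>Describes broadcast programmes. Maintainer: vocab-owner@example.org</p>
      <!-- Machine-readable link to the schema for this namespace -->
      <rddl:resource xlink:type="simple"
         xlink:title="W3C XML Schema for the vocabulary"
         xlink:role="http://www.w3.org/2001/XMLSchema"
         xlink:arcrole="http://www.rddl.org/purposes#schema-validation"
         xlink:href="programme-1.0.xsd">
         <p>The W3C XML Schema for this vocabulary.</p>
      </rddl:resource>
   </body>
</html>
```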

5. Definitions

XML Document:
An instance of an XML Vocabulary, which may be conformant to an XML Schema. Includes files written to disk, transitory constructs in memory and character streams transmitted over a network.
Well Formed:
See the "Well-Formed XML Documents" section of the W3C Extensible Markup Language (XML) 1.0 recommendation (w3.org) for a complete description.
Valid:
An XML document is valid if it has an associated XML Schema, and if the document complies with the constraints expressed within it.
XML Vocabulary:
An XML element/attribute set plus optional XML Schema for:
  • A specific problem domain (chemical markup, mathematics, vector graphics)
  • A business or technical domain for a vertical market or industry
  • A specific set of users
  • A defined set of functions and requirements.
XML Schema Language:
A formal notational language capable of defining an XML Vocabulary. Provides a means for defining the structure, content and semantics of XML Vocabularies. Allows software to enforce rules made by people. Note: this includes DTDs and should not be confused with W3C XML Schema, which is one example of an XML Schema Language.
XML Schema:
A document expressed in an XML Schema Language formally describing an XML Vocabulary.
XML Namespace:
See the W3C Namespaces in XML recommendation (w3.org).
Shown below is an example of an XML document with two namespace declarations: one default, and one bound to the namespace prefix "isbn".
<?xml version="1.0"?>
<book xmlns="urn:loc.gov:books"
   xmlns:isbn="urn:ISBN:0-395-36341-6">
   <title>Cheaper by the Dozen</title>
   <isbn:number>1568491379</isbn:number>
</book>
XML Namespace Family:
A logical group of XML Namespaces pertaining to a set of versions of an XML Vocabulary.
BBC XML Namespace Registries:
BBC XML Namespace Registries are the standard mechanism for providing URLs with the properties required of XML Namespaces (see forthcoming standards).
RDDL:
XML Resource Directory Description Language is a means of providing:
  • Human-readable descriptive material about an XML Namespace.
  • A directory of individual resources related to the XML Namespace, each directory entry containing descriptive material and linked to the resource in question.
RDDL Resource Directory:
An RDDL Resource Directory is XHTML Basic 1.0 with one additional element, 'resource', that provides XLink attributes used to locate and define related objects. As well as providing human-readable information about a namespace, RDDL Resource Directories can be interrogated by software to yield a list of resources and suitable uses for them.

6. Triggers for updates of this standard

  • Future versions of this document will include further details of Character Encoding.
  • Future accompanying standards will include XML Metadata Encoding and BBC XML Namespace Registries.

7. Document history

Date | Version | Change | Author
16/12/2003 | v1.00 | Standard renumbered as 1.00 on approval by Standards Exec | Jonathan Hassell
06/11/2003 | v0.55 | Approved by Technical Forum 31/10/2003 – one small update required (RDDL version, and SHOULD) | Jonathan Hassell
24/10/2003 | v0.54 | Final version including URI example from SE (this version to go to Tech Forum for approval) | Jonathan Hassell
20/10/2003 | v0.53 | After amendments from XML WG meeting on 17/10/03 | Jonathan Hassell
20/08/2003 | v0.52 | Final polish by JH (numbering, compression of grammar etc.) | Jonathan Hassell
19/08/2003 | v0.51 | Bit more tidying by SE and JH, insertion of definitions section and RDDL info, rationalisation of vocab. Just needs a bit of polishing by JH before submission to WG | Stephen Elson, Jonathan Hassell
14/08/2003 | v0.41 | Bit more tidy up by SE | Stephen Elson
11/08/2003 | v0.4 | General tidy up by JH, removal of schema stuff into separate doc. Doc still needs a good edit (is all of the background really necessary) and addition of a few sections (intros, character encoding). | Jonathan Hassell
08/08/2003 | v0.3 | Fundamental structuring of the document (and some clarification between XML vocabs and namespaces) around better understanding of how XML fits into BBCi projects | Stephen Elson, Jonathan Hassell
08/08/2003 | v0.2 | Brought across more sections of Wiki – Schema Registry | Stephen Elson, Jonathan Hassell
01/08/2003 | v0.1 | First attempt at creating standard from information on XML WG Wiki (internal BBC document; gain access via your Technical Account Manager), as agreed at meeting on 24/7/03 (internal BBC document; gain access via your Technical Account Manager); just brought across first 2 sections, more info to come from rest of Wiki | Stephen Elson, Jonathan Hassell

Document editor: Editor, Standards & Guidelines. If you have any comments, questions or requests relating to this document, please contact the Editor, Standards & Guidelines.

Like all other Future Media Standards & Guidelines, this page is updated on a regular basis, through the process described on About Standards & Guidelines.
