BBC Digital Public Space project
(Editor's note: It's a delight to welcome Mo to the blog with this, his first "official" posting.)
His speech emphasized the BBC's support for the organisation and its philosophies in the context of the BBC's work on a 'new broadcasting system' that can reach everyone, is free at the point of use and makes BBC programmes available to all those who can benefit from them. The speech also discussed the ways the BBC is seeking to get the maximum value from its archive and asked the audience 'what good is it to retain this archive if it can't be shared?' before describing the 'digital public space' within which the BBC now sees itself as operating as it delivers its services online.
As Ralph noted, the digital public space can mean different things to different people. To some it's a philosophical ideal, the belief that UK citizens have the right to access and interact with the country's social and cultural assets online. To me, in my role as Data Analyst within the BBC's small Archive Development team, it's something very specific.
I and a couple of colleagues work on the Digital Public Space project. This is a partnership between the BBC and other cultural institutions in the UK, including museums, archives, libraries, galleries and educational bodies, all of whom share a vision of not simply using Internet technology as a distribution channel, but instead being part of that digital environment as it evolves: being part of the Web, rather than just on it.
It aims to be an access point for all of the UK's cultural archives, marrying the rich information that has been carefully collated, checked and double-checked over the years by experts in their respective fields with the more immediately accessible, higher-level information and audio-visual material, both from the partners and from around the Web.
The first step towards this is a prototype, currently in development, that brings together the archives and catalogues of some of the partnering institutions (including the BBC's) within an 'Umbrella' data model and creates a platform on which applications and interfaces for navigating, annotating and curating them can be built. Eventually, you would be able to access and add to this information through an online gateway, but there could also be specialist entry-points.
For example, there might be an iPhone or Android app for exploring the history of your local area, or a YouView interface focussed on "British Ballet". Part of what makes the project so exciting is that we really don't know what kinds of interfaces and applications will end up being developed for the platform.
The Semantic Web lies at the very heart of this. It provides the toolkit for describing real-world things in a machine-readable way, just like ordinary web pages describe those things in a human-readable way. Like the "Web of documents" we are generally used to, the Semantic Web is built on the fundamental principle that anybody can publish anything about anything else, without having to go through layers of bureaucracy and paperwork. Even the language used to describe these things -- RDF -- uses vocabularies which are often developed independently of one another, and come into existence by being published somewhere on the Web, and having RDF documents begin to use them. There is no central "ontology authority" who decides what does and doesn't form part of the Semantic Web's vocabulary: if there isn't an ontology in existence which is able to describe the things you need to describe, there's not much, beyond time and effort, standing in the way of you creating one.
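As a rough illustration of how independently published vocabularies combine, the sketch below describes two things using terms from FOAF and Dublin Core Terms and prints the statements in the N-Triples syntax. The example.org URIs, the sample titles and names, and the helper functions are all hypothetical, invented for this illustration; none of them come from the prototype.

```python
# Two independently developed vocabularies, each identified by a Web address.
FOAF = "http://xmlns.com/foaf/0.1/"
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical URIs, made up for this sketch.
PROGRAMME = "http://example.org/programmes/ballet-documentary"
PERSON = "http://example.org/people/a-choreographer"


def uri(address):
    """Wrap a URI in angle brackets, as N-Triples expects."""
    return f"<{address}>"


def literal(text):
    """Format a string as an N-Triples literal.

    Only backslashes and double quotes are escaped, which is enough
    for the sample strings used here.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialise (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Statements mixing terms from both vocabularies in one description.
TRIPLES = [
    (uri(PROGRAMME), uri(DCTERMS + "title"), literal("A History of British Ballet")),
    (uri(PROGRAMME), uri(DCTERMS + "subject"), uri(PERSON)),
    (uri(PERSON), uri(FOAF + "name"), literal("A. Choreographer")),
]

if __name__ == "__main__":
    print(to_ntriples(TRIPLES))
```

Nothing in the code had to register with a central authority: using a vocabulary amounts to referring to its terms by their URIs.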
Within the digital public space prototype, RDF gives us a common language that institutions can use to describe their catalogues in their own terms. The prototype aggregates these catalogues, finding areas of overlap, and presenting the things described by them in a unified manner, not organised in terms of the catalogue entries that are best suited to archivists, but instead in terms of the people, places, events, things and collections which those entries describe.
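One way to picture this regrouping is sketched below. It assumes, purely for illustration, that each catalogue entry names the thing it describes and that some process has already asserted which URIs from different catalogues refer to the same thing; how the prototype actually detects overlap is not described here. All URIs and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (catalogue entry URI, URI of the thing it describes).
ENTRIES = [
    ("http://example.org/bbc/entry/1", "http://example.org/bbc/people/dancer-a"),
    ("http://example.org/museum/item/77", "http://example.org/museum/agents/dancer-a"),
    ("http://example.org/library/rec/9", "http://example.org/library/places/theatre-b"),
]

# Hypothetical equivalence claims between URIs minted by different institutions.
SAME_AS = [
    ("http://example.org/bbc/people/dancer-a", "http://example.org/museum/agents/dancer-a"),
]


def find(parent, item):
    """Return the representative URI of item's group, compressing the path."""
    parent.setdefault(item, item)
    root = item
    while parent[root] != root:
        root = parent[root]
    while parent[item] != root:
        parent[item], item = root, parent[item]
    return root


def aggregate(entries, same_as):
    """Group catalogue entries by the thing they describe, merging equivalent URIs."""
    parent = {}
    for a, b in same_as:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            # Keep the lexically smaller URI as representative so the
            # result does not depend on the order of the claims.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep
    grouped = defaultdict(list)
    for entry, thing in entries:
        grouped[find(parent, thing)].append(entry)
    return dict(grouped)


if __name__ == "__main__":
    for thing, described_by in aggregate(ENTRIES, SAME_AS).items():
        print(thing, described_by)
```

The result is keyed by person or place rather than by catalogue entry, so the two entries about the same dancer end up side by side.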
First and foremost, the aggregated information is itself published as RDF. Being intended for consumption by software, RDF isn't terribly exciting for most people to look at, so as part of the prototype we're also developing a number of user interfaces to explore different ways in which the catalogues can be navigated.
The aggregation engine doesn't have any special knowledge about the partnering catalogues, though. As far as it's concerned, there's no fundamental difference between an expert institution and anybody else. There's a language for making statements about things (RDF), a way of identifying the things in the catalogues (URIs -- of which what we know as "Web addresses" are a subset), and a way to publish those things (the Web).
There are some practical hurdles to be overcome, however.
With institutions, it's quite easy to mandate that the software that feeds catalogue information to the aggregator must push RDF documents to a RESTful Web service, using a digital certificate which provides a strong identity so that the information can be attributed to them. For individuals, things get a little more complicated. We know that user interfaces can be built to take care of the heavy lifting of generating RDF and pushing it to the aggregator, but that still leaves problems with certificates -- most people don't really use public-key cryptography on a day-to-day basis, and so we need to settle upon an approach to identity that everybody can get to grips with.
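For the institutional case, the push might look something like the following sketch, which uses only Python's standard library. The endpoint URL, the choice of HTTP PUT, the Turtle media type and the certificate file paths are all assumptions made for illustration; the prototype's real interface is not specified here.

```python
import ssl
import urllib.request

# Assumption: a hypothetical aggregator endpoint, not the prototype's real address.
AGGREGATOR_URL = "https://aggregator.example.org/catalogues/example-museum"


def build_push_request(turtle_text, url=AGGREGATOR_URL):
    """Build (but do not send) an HTTP request carrying an RDF document.

    Assumption: the service accepts Turtle via PUT.
    """
    return urllib.request.Request(
        url,
        data=turtle_text.encode("utf-8"),
        headers={"Content-Type": "text/turtle; charset=utf-8"},
        method="PUT",
    )


def client_certificate_context(cert_path, key_path):
    """Create a TLS context that presents the institution's certificate."""
    context = ssl.create_default_context()
    context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    return context


def push(turtle_text, cert_path, key_path):
    """Send the document, identifying the sender by its client certificate."""
    request = build_push_request(turtle_text)
    context = client_certificate_context(cert_path, key_path)
    with urllib.request.urlopen(request, context=context) as response:
        return response.status
```

An institution's software would call something like `push(document, "institution-cert.pem", "institution-key.pem")`, with those file names standing in for wherever its certificate and private key are kept. The certificate is what lets the receiving service attribute the statements to a known sender, and it is exactly this step that has no everyday equivalent for individuals.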
Beyond that, there are aspects of RDF which haven't been finalised yet -- attaching digital signatures to different parts of an RDF document, and specifying the source of a set of statements ("named graphs"). With all of these issues, we're looking forward to working with the Web community to find solutions.
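The idea behind specifying the source of a set of statements can be sketched by adding a fourth element to each statement that names the graph it came from. This is only an illustration of the concept, using plain Python tuples and hypothetical example.org URIs; it is not a description of any agreed mechanism or of the prototype's storage.

```python
from collections import defaultdict

# Hypothetical graph names, one per contributing source.
MUSEUM_GRAPH = "http://example.org/graphs/museum"
VISITOR_GRAPH = "http://example.org/graphs/visitor-42"

# Each statement is (subject, predicate, object, graph).
QUADS = [
    ("http://example.org/things/costume-1", "http://purl.org/dc/terms/title",
     "Stage costume", MUSEUM_GRAPH),
    ("http://example.org/things/costume-1", "http://purl.org/dc/terms/title",
     "Stage costume", VISITOR_GRAPH),
    ("http://example.org/things/costume-1", "http://purl.org/dc/terms/description",
     "Worn at my local theatre", VISITOR_GRAPH),
]


def statements_from(quads, graph):
    """Return the (subject, predicate, object) statements made in one graph."""
    return [(s, p, o) for s, p, o, g in quads if g == graph]


def sources_for(quads):
    """Map each statement to the set of graphs that assert it."""
    by_statement = defaultdict(set)
    for s, p, o, g in quads:
        by_statement[(s, p, o)].add(g)
    return dict(by_statement)


if __name__ == "__main__":
    for statement, graphs in sources_for(QUADS).items():
        print(statement, sorted(graphs))
```

Keeping the source alongside every statement means an interface could show who said what, or filter to statements made by particular contributors.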
You're probably wondering when you'll get to experience the digital public space, and in particular this prototype. The answer is "it depends". This phase of the project is due to end in June, at which point we will have something tangible that can be shared amongst select individuals in the partnering organisations, to act as a proof of concept. While the details have yet to be finalised, we hope that the next stage after that will be to make it available to everybody in each of those organisations on a permanent basis. If that's successful, then we are looking to open it up to the many schools, colleges and universities in the UK.
As you can imagine, the legal and rights issues surrounding both the catalogue information and associated digital media are complex and varied, and navigating them means working closely with rightsholders and industry bodies, and will take some time. However, the BBC remains committed to the aim set out in Putting Quality First ("Opening up the BBC's library of programmes") -- and this is a vision shared by all of the project partners -- of providing permanent access to the UK's cultural archives in a digital environment that's available to everybody.
We know that the digital public space can only become a reality if we build on open technologies and standards as championed by the W3C -- the digital environment in which we're creating this already exists, and so co-operation and partnership are absolutely key to the success of the project.
Mo McRoberts is Data Analyst, BBC