BBC Genome: The Complete Broadcast History of the BBC

Thursday 19 August 2010, 11:45

Helen Papadopoulos

Most people know that the BBC does not have a copy of every programme it has ever broadcast. The main reason for this is that when broadcasting began it was seen as an ephemeral medium, and there was no way to record and store what was being transmitted.

Although it became possible to record programmes in the 1950s, magnetic tape was very expensive and recording equipment was bulky and complicated. Until relatively recently, only those programmes considered worth the cost and effort of recording and archiving for posterity were retained. The head of BBC Information and Archives, Sarah Hayes, wrote about this in detail for the Internet blog in September 2009.

However, even though we may not have a copy of each programme in the BBC's vast archive, there may still be something related to or derived from the original programme: stills, non-broadcast footage, music, documentation, props or other material connected with what was broadcast.

The skilled researchers who work with programme-makers inside the BBC and independent production companies are used to hunting for additional material and know where to look, but on the whole the public don't even know where to start. BBC Genome is our attempt to solve that problem, by creating a comprehensive, easy-to-use online catalogue of all of the BBC's programmes so that people can discover which programmes we have, which we don't have, when and where they were broadcast and even what else we've got that might interest them.

We're working on the basis that "full or near-full public access to archives is both achievable and the right ultimate goal" and, sitting at the heart of a reshaped BBC Online, BBC Genome is the first step towards that goal. It will provide a timeline from the foundation of the British Broadcasting Company in 1922, with details of the programmes, channels and services which map on to that timeline, bringing the broadcast history of the BBC to life.

What BBC Genome does

The BBC stores information about the programmes we make and broadcast in many different ways, each designed to support a specific task or function, but none of these is comprehensive or available in a publicly accessible, searchable form. We want to ensure that our broadcast history becomes and remains a working asset for audiences, and at the end of last year we set about finding a way to reconstruct the BBC's broadcast history all the way back to 1922.

We needed to create a central core, or spine, for the catalogue of broadcast records and there was one source in particular that provided a comprehensive record of the BBC's broadcast history going back to 1923: Radio Times.

It is an ideal place to begin because we have easy access to it, it contains a record of everything we intended to broadcast - even if what actually went on air wasn't what we planned to show - and it is in a structure and format that people readily recognise, with basic but consistent details for all programmes, along with regional variations. It even lists radio frequencies!

We started with a pilot project to scan two years' worth of Radio Times and extract the programme listings details from the scanned pages, in order to establish the approach and processes. Working with experts at a UK firm which specialises in projects like ours, and with the British Library, we scanned every page of the 1948 and 1977 editions of Radio Times.

The Genome pilot: Metadata, OCR and outputs


The images were then converted into computer-readable text using optical character recognition (OCR) software before being divided into separate channel and programme listings so that we could identify details including the programme title, channel name, date and time of broadcast and a synopsis of the show. All of this information was stored in a database that we used to support an experimental website that presented the information in a form similar to BBC Programmes pages.
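To give a feel for the "dividing into programme listings" step, here is a minimal sketch in Python. It is an illustration only, not the software used in the pilot. It assumes a simplified, hypothetical layout in which each listing in the OCR text starts with a time such as "7.30" followed by the title, and any following lines are the synopsis. The names ProgrammeRecord and parse_listings, the sample text and the sample date are all made up for the example, and times are read as 24-hour values, which real Radio Times pages would not guarantee.

```python
import re
from dataclasses import dataclass
from datetime import date, time

# Hypothetical layout: a listing starts with "H.MM Title";
# the lines that follow (until the next time) are its synopsis.
LISTING_START = re.compile(r"^(\d{1,2})\.(\d{2})\s+(.+)$")


@dataclass
class ProgrammeRecord:
    channel: str
    broadcast_date: date
    start: time
    title: str
    synopsis: str = ""


def parse_listings(ocr_text: str, channel: str, broadcast_date: date) -> list[ProgrammeRecord]:
    """Split OCR text for one channel and one day into programme records."""
    records: list[ProgrammeRecord] = []
    for raw_line in ocr_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        match = LISTING_START.match(line)
        # Only accept plausible clock times; OCR noise such as "99.99"
        # falls through and is treated as synopsis text instead.
        if match and int(match.group(1)) < 24 and int(match.group(2)) < 60:
            start = time(int(match.group(1)), int(match.group(2)))
            records.append(
                ProgrammeRecord(channel, broadcast_date, start, match.group(3).strip())
            )
        elif records:
            # Continuation line: append it to the current programme's synopsis.
            current = records[-1]
            current.synopsis = f"{current.synopsis} {line}".strip()
    return records


# Invented sample text, not taken from a real issue of Radio Times.
sample = """\
6.00 News
7.30 Sample Concert
An invented synopsis line
that wraps onto a second line.
"""

records = parse_listings(sample, "Home Service", date(1948, 1, 1))
for record in records:
    print(record.start, record.title, "|", record.synopsis)
```

Real pages are far messier than this: columns, regional variations and OCR errors all mean that a single pattern would not be enough, which is part of why the pilot was needed to establish the approach.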

These are a couple of the early pages complementing the BBC Programmes services that we created from the XML files.


During the process we learned an awful lot and collaborated with several BBC departments as we worked out how to make the process accurate and repeatable. As you would expect, we set a very tough and exacting technical specification for the scanning, partly to optimise the accuracy of the OCR process, but just as importantly for long-term preservation purposes. We didn't want anyone in the BBC to have to come back and pay to scan the pages again in five, ten or even thirty years' time if we could avoid it.

At the end of the trial we knew we could extract programme records from Radio Times, but the other important part of the project was to create a BBC channel and service history. There are records for when channels and services began, ended or were rebranded, just not in a single accessible place, and we quickly discovered how complicated the BBC's broadcast history is. For example, in order to work out when regional opt-outs started we needed to search Radio Times and a host of other sources at the BBC Written Archives Centre.

The picture below shows over 20 different editions of Radio Times all for the same week in 1971.

What's next for BBC Genome?

In September we will begin the full-scale project of digitising over 80 years' worth of broadcast records. That's approximately 400,000 pages of Radio Times, 3 million programmes and 300 million words to recognise through OCR.

In less than a year we expect the Radio Times digitisation project to be completed and for the first time there will be, in one place, a comprehensive record of every programme.

What you'll be able to search and discover

Initially, you will be able to search by programme title, by year, day and time. Once we fully populate the database with contributors, programme synopses and other sources of data, you'll be able to find people and places and all the programme records they feature in.

You might well discover during your searches that the programme schedules are not entirely correct. They were, of course, correct when each issue of Radio Times was published, but in the early days of radio and television technical hitches sometimes affected the schedules. Similarly, throughout the BBC's broadcast history, changes in live broadcasts and major events at home and abroad will have meant that the published schedules in Radio Times were not always accurate.

What you'll be able to access

Radio Times is owned by the BBC's commercial arm BBC Worldwide and we currently do not have the rights to show the scanned pages themselves, although we hope we may be able to in the future. However, you will be able to take a journey back in time and rediscover how the BBC's networks and programmes reflect Britain's social history. You can already access some archive materials via the collections featured on the BBC Archive site, and Genome will provide access to additional archive information.

Although the BBC only has about 20-25% of the programmes in its physical archive, this still amounts to more than a million hours of output. Radio Times will provide the programme listing and, once that's done, we will start to provide access to the programmes themselves along with other material such as scripts or photos - which will be especially useful where physical programmes no longer exist or where we don't have the rights to make the programme itself available - and begin to make it all visible from BBC Online.

Making everything available will take time, but the Radio Times programme records will soon create the spine for Genome and are a vital first step in bringing the BBC's broadcast history to life.

One last note: Radio Times was first published on 28 September 1923 and I have referred to the foundation of the BBC in 1922. Using other sources, we do plan to make programme records available from the first ever BBC broadcast on 14 November 1922 when the Marconi transmitting station 2LO was taken over by the BBC. It truly will be a complete broadcast history of the BBC!

Helen Papadopoulos is the Project Manager of BBC Genome

Comments

  • Comment number 1.

    Wow, this looks like a really exciting project. Good luck, I can't wait to see the results.

    You said that the information was presented in a form similar to /programmes. Does this mean once you're finished, you'll be putting all the data there?

  • Comment number 2.

    I am looking forward to this - but would also like to know the projected cost.

  • Comment number 3.

    Thank you for your interest in BBC Genome.
    We are currently looking at the best way of exposing the Radio Times data online and will keep you posted here on the blog.

  • Comment number 4.

    Sounds a fascinating undertaking. Will there be some mechanism for members of the public to contribute their own information around programmes? There may well be writers, actors, directors, contributors, musicians etc and their families who are able to contribute knowledge that has been lost to the BBC. Some kind of wiki functionality might be appropriate - perhaps mediated by a third-party platform in order to keep costs down? Using /programmes unique identifiers should make this linkage more straightforward.

  • Comment number 5.

    I can't wait until the Genome project is done... It needed to be established long ago. Good work.

  • Comment number 6.

    I could have done with this data a number of years ago - a project such as this is long overdue. Of course, last-minute changes to schedules, for various reasons (breaks in live transmissions, technical problems with recordings, strikes etc), tend not to show in the Radio Times, though they do sometimes appear in newspaper listings, when the changes were made in reaction to foreseen events. A couple of such examples are the late scheduling of Apollo 8 mission reports in 1968 and the replacement of the Christmas Day Parkinson (which was not recorded, due to a strike ending slightly too late for production to be organised) with a Perry Como Special in 1978.

  • Comment number 7.

    This sounds great.

    I have a collection of Radio Times from the late 80s to late 90s, would these be of any use to you?

  • Comment number 8.

    What happens when the programmes were not broadcast as planned?

    There have been plenty of strikes where the Radio Times still got printed, but shows didn't go out.

    Obvious examples are Douglas Adams's Shada (on Doctor Who) and all those times the News was not broadcast, but "Open All Hours" was.

    And that is not forgetting the regular overrun of sport, and breaking news that happened when there were fewer channels.

  • Comment number 9.

    Would also like to register my appreciation and general excitement about this project. Couple of questions:

    1) Is this going to include full regional TV and Radio listings as well?
    2) Is the roll out of the first batch of information going to be gradual, once the information has been formatted, or do you plan on waiting until all issues have been scanned in before publishing anything?

  • Comment number 10.

    @lucas42 #1: /programmes is obviously a prime candidate for exposing this data on bbc.co.uk and we certainly built it with that in mind but there are a few technical and user experience implications when 90 years of data (!) hits a system primarily built around 3 years of data (2007-2010) - so as Helen says "we're looking at it" ;-)

    @matthew_shorter #4: great points matthew... sounds like a good option for the future although the first pass is very much to "create a central core, or spine". Then, as you point out, "unique identifiers should make this linkage more straightforward".

    @Paul_Gethin #6 & @Briantist #8: The difference between 'as broadcast' and 'as scheduled' is one of many key questions. I'll assume you're familiar with our Programme Information Platform (PIPs) structure (forgive me if you're not)... We're looking to get all the Radio Times data into the PIPs structure and then, most likely, the database itself. PIPs has the entity 'broadcasts' and when our schedules have last-minute changes those broadcasts will usually be updated as quickly as possible to reflect what actually happens. It'll be a long trawl with the Radio Times data to know when broadcast differed from scheduled, but when we do know we don't necessarily want to write over scheduled information which may have equal historical value. One current proposal is to create a 'scheduled' entity alongside the 'broadcasts' so that we can preserve both sets of data... I'm sure when things are in motion there will be more posts on the inner workings and decisions therein.

  • Comment number 11.

    @jcjl1980
    1. Is this going to include full regional TV and Radio listings as well?
    Yes the Genome project will do this. However, this is a huge endeavour so we won’t be doing it all at the same time.
    2. Is the roll out of the first batch of information going to be gradual, once the information has been formatted, or do you plan on waiting until all issues have been scanned in before publishing anything?
    We are working on how best to expose the data and once we have completed end-to-end tests with the data, we will be able to plan the roll out in detail. We’ll keep you posted here.



  • Comment number 12.

    More about the archive and why we don't have every programme ever made:
    http://www.bbc.co.uk/archive/tv_archive.shtml

  • Comment number 13.

    Will there be searchable metadata on the various shows? I know for example I've appeared a couple of times on regional news shows, would we be able to search by year or show name or what items the news features were covering?

  • Comment number 14.

    Looks like I'll have to start winding down my website - www.radiolistings.co.uk

  • Comment number 15.

    "Radio Times is owned by the BBC's commercial arm BBC Worldwide and we currently do not have the rights to show the scanned pages themselves"

    Hmm, but who owns BBC Worldwide, surely the BBC, and who owns the BBC...

    Whilst a case can be made for such 'ownership rights' where a distinct commercial re-sale or re-broadcast opportunity exists, the same cannot be said for much of the BBC's historic and archival content; this should be made freely available. Just what commercial value do out-of-date copies of the Radio Times have, beyond the value of pulped paper?!...

  • Comment number 16.

    @Boilerplated - there's also the issue of photographers' copyright. Often magazines would pay a lower fee and only license an image for 'one use'. This would certainly not cover publishing a photograph online permanently, which would be a completely different use and potentially forever.

  • Comment number 17.

    #16. At 11:53pm on 03 Nov 2010, Robert wrote:

    "@Boilerplated - there's also the issue of photographers' copyright. Often magazines would pay a lower fee and only license an image for 'one use'. This would certainly not cover publishing a photograph online permanently, which would be a completely different use and potentially forever."

    Whilst I accept that might be a problem in many magazines I suspect that most of the photos in RT are (or at least were) of BBC productions taken by the BBC's own publicity department/RT staff, certainly within the listings sections and title pages.

    Regards, the erstwhile "Boilerplated"

  • Comment number 18.

    Any news on this project? I think it is a wonderful endeavour which I would like to succeed.

    It's not being affected by the cutbacks that will close h2g2, 606 etc is it?

    When can the public expect to see something and get involved?

  • Comment number 19.

    BBC Genome update - February 2011

    Thank you for your continued interest in BBC Genome. I'm very pleased to say that the project is alive and kicking.

    Scanning of the magazines is well underway. We have been working hard with our supplier to extract the best data we can, in a meaningful way for our audiences. We expect to have all the data and scanned images delivered this summer. That's one side of the project and it's just as complex working out how to make the data available to the public and integrate it into BBC online. We intend to publish the Radio Times data in phases and that will begin in 2011.

  • Comment number 20.

    Hi, I was just wondering if the project is almost finished. When can we see them on the site?

 

This entry is now closed for comments
