BBC digitises the Radio Times as part of Genome project
An ambitious BBC project has finally reached the end of a journey that spans 90 years.
From Monday, BBC staff will be able to search all back issues of the Radio Times online.
This is the result of over three years' work on the BBC Genome project, which sought to create and publish a digital record of the BBC's complete broadcast history.
The scheme was given its name because the corporation likens each of its programmes to 'tiny pieces of BBC DNA' that will form a 'data spine' once reassembled.
Scale and complexity
But the project ran into a big problem early on - a comprehensive record of what had been broadcast on the BBC over the last 90 years simply didn't exist. The closest thing was Infax, a record of the tapes and spools held by the corporation in its archive.
What did exist, however, was a complete collection of the Radio Times, from the date of its first issue in September 1923 to the present day. Digitising the magazines proved to be a gargantuan task, explains project lead Ken McEnery.
'We seriously underestimated the scale and complexity of the endeavour,' says the member of the archive content team in west London. 'Other organisations have digitised and published their magazine back catalogue, so digitising 4,500 magazines doesn't seem that daunting until you realise that it's about 420,000 pages, containing 4.9 million programmes and information about over 8 million contributors.'
McEnery puts this into context for those still grappling with the numbers - the dataset is about the size of Wikipedia.
Of its time
Part of the digitisation effort was outsourced to a French team that scanned in the magazines' pages and then used optical character recognition (OCR) software to extract the information. It used specially designed software to make sense of the Radio Times' changing layouts so that the information could be presented in a uniform fashion.
Those searching the Radio Times back catalogue will be able to browse by date, service, issue or decade through to 2009. After that date, records generated by the iPlayer catch-up service are used for what's been broadcast.
The newly digitised data must be used with some caution because schedules often changed at the last minute, so the published record is not always accurate. But it does provide a historical overview of the BBC's output and how it has changed through the decades.
The website warns that the information has to be viewed in its historical context. 'The language used in the Radio Times to describe programmes reflects the tastes and standards at the time of original publication. Some of the language in the Radio Times is not acceptable under current editorial standards,' it says.
Features and longer-length articles in the listings magazine are still not available to search, but the plan is to eventually include these for internal use. Because of copyright restrictions, however, only some of the articles may be made available outside the BBC.
'Comprehensive history'
Staff will be able to provide feedback on how well the website works and what might need improvement. The information will be used to fine-tune the Beta site before making it available to the general public as soon as practically possible.
One aim in making this available is that eventually people might come forward with missing material that could be used to plug gaps in the archive - the BBC only has about 20-25% of its programmes in a physical archive - but it's not the ultimate goal.
Project manager Helen Papadopoulos says the project is 'really about having a comprehensive history of the BBC and its schedules'.
She believes that by creating a 'comprehensive, easy-to-use online catalogue of all of the BBC's programmes', people will be able to discover what programmes the BBC has, what it doesn't have, when and where they were broadcast and even what else the BBC has that might interest them.
This could include physical objects such as letters, stills, music, props and scripts.
'The Genome project is the equivalent of a dictionary,' judges McEnery. 'You might not want to use every single word, but it's there.'