Developing Search At The BBC - Pt 1
Until a couple of years ago, I was a Senior Development Producer at the BBC's New Media department. Whilst I was there I used to blog rather enthusiastically about my work, and the team at the BBC Internet blog has asked me to contribute some articles here about the history of the BBC's web site.
I first started to work at the BBC in 2000, as a junior member of a small team looking after the BBC's search engine. Back then, searching the BBC site was a bewildering and perplexing experience, as there was no global search across all of the content.
Instead, on the Today site, you could find a small box in the top right-hand corner that only searched the Today site. Or, if you were on the EastEnders site, there was a long search box at the bottom of the homepage that only searched the EastEnders site, and so on.
As well as being somewhat randomly placed, the search boxes weren't even all using the same technology. BBC News used a product from Autonomy, whilst other bits of the BBC were indexed using software called Muscat. The results could be pretty appalling. One of my first jobs involved artificially putting the right URLs at the top of search engine results.
This wasn't a hi-tech solution. We had a spreadsheet that listed search terms, and the URL that should be displayed if a user employed them. We used to improve it based on the frustrated emails we got from the public. A mail would come in saying "I searched for 'Jeremy Paxman' and I never found the Newsnight site", and the team would dutifully add that 'jeremy paxman', 'paxman' and 'rottweiler' should produce bbc.co.uk/newsnight as the number one result.
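For the technically curious, the spreadsheet approach can be sketched in a few lines of Python. This is only an illustration of the idea, not the code we ran at the time: the names `BEST_LINKS`, `normalise` and `search_with_best_link` are made up for this example, and the search engine's own results are represented by a plain list of URLs.

```python
# Hypothetical stand-in for the spreadsheet: search term -> URL to promote.
# The three terms and the Newsnight URL come from the example above.
BEST_LINKS = {
    "jeremy paxman": "bbc.co.uk/newsnight",
    "paxman": "bbc.co.uk/newsnight",
    "rottweiler": "bbc.co.uk/newsnight",
}


def normalise(query):
    """Lower-case the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def search_with_best_link(query, engine_results):
    """Put the hand-picked URL first, ahead of the engine's own results.

    engine_results is a list of URLs as returned by the (assumed)
    underlying search engine. Queries without a spreadsheet entry
    come back unchanged.
    """
    best = BEST_LINKS.get(normalise(query))
    if best is None:
        return list(engine_results)
    # Drop the promoted URL from the rest so it is not listed twice.
    return [best] + [url for url in engine_results if url != best]
```

Every frustrated email then becomes one more row in the table, with no change needed to the search engine itself.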
The Muscat search engine was also unable to distinguish between different languages, so if you typed in 'Tony Blair' you were just as likely to get a news story mentioning his name from the BBC's Portuguese news site as from the English language site.
It was obvious that search needed to be improved, and as part of the re-branding of BBC Online to BBCi in 2001, a new global search was introduced. The grey 'toolbar' was added to the top of (nearly) every BBC web page, placing a search box on every page of the site.
Users still didn't get the same results from everywhere. If they were on the Radio 1 site, they only saw results from Radio 1 web pages, unless they chose to do otherwise. This was fine for about 85% of searches, which would generally be 'in scope', but it gave users bad results roughly 15% of the time. Plus, of course, if you searched for "The White Stripes", there was no reason why content on the Radio 2 or 6 Music sites about the band wouldn't be of interest to you.
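The scoping behaviour described above can be illustrated with a toy example. It is a sketch under loud assumptions rather than a description of how the real index worked: the `Page` record, the tiny `PAGES` list, the URLs in it and the simple substring matching are all invented for illustration.

```python
from typing import NamedTuple


class Page(NamedTuple):
    site: str  # which BBC site the page belongs to
    url: str   # made-up URL, for illustration only
    text: str  # the text that gets searched


# A made-up, three-page "index".
PAGES = [
    Page("radio1", "bbc.co.uk/radio1/a", "The White Stripes in session"),
    Page("6music", "bbc.co.uk/6music/b", "The White Stripes interview"),
    Page("radio2", "bbc.co.uk/radio2/c", "Folk awards shortlist"),
]


def search(query, scope=None):
    """Return URLs of pages whose text contains the query.

    With a scope, only pages from that site are considered, which is
    what a user on that site saw by default. With scope=None the
    whole index is searched.
    """
    needle = query.lower()
    return [
        page.url
        for page in PAGES
        if needle in page.text.lower()
        and (scope is None or page.site == scope)
    ]
```

Here `search("white stripes", scope="radio1")` finds only the Radio 1 page, while the same query without a scope also returns the 6 Music page, which is exactly the content a scoped search would have hidden.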
To get around this, the scope of the "Best links" that used to be hand-coded into that spreadsheet was increased. It became a large behind-the-scenes taxonomy mapping relevant BBC content against thousands of keywords and concepts. If you typed in something like "Test The Nation" as your search anywhere on the site, the top results would include a BBC "Best Link" taking you to the national IQ quiz homepage. The new interface wasn't universally acclaimed, but it was a vast improvement, and the indexing of the BBC's content was improving as well.
In my next post, I'll be looking at how the BBC introduced web search to the site in 2002.
Martin Belam is a former Senior Development Producer, New Media