BBC Online and 'deleting' websites

Ian Hunter | 16:36 UK time, Monday, 14 February 2011

There's been a lively discussion on the issues around archiving websites this week that kicked off with an initial post from Adactio blogger Jeremy Keith. He suggested that the BBC's plans to halve its top level directories were cultural vandalism. This was picked up (though later clarified) by @bengoldacre and many others. The tenor of the criticism was the same - that the BBC is failing in its duty to preserve a record of its online past.

On Friday Metro reported on a 'BBC fan' who has captured for posterity a record of the 170 sites it's suggested we'll be deleting for efficiency reasons.

And many have claimed this is only the latest failure, after the wiping (or worse) of programmes in the sixties which are seen as classics today. Wholesale deletion is not, and never was, part of the plan.

My post last month explained that we were exploring a range of options for managing legacy content. "Deleting the lot" was not one of them, though offline storage is. The debate is quite complex. For example, one of our oldest sites www.bbc.co.uk/otr is still accessible but you could argue that it is a travesty of what its makers intended. Over time various features (for example, search) have ceased to function. You could argue that the BBC should spend money bringing this site up to date every time technology moves on, but would that be money well spent? The site still offers a number of transcripts of political interviews of the time and we may make it part of the news product. But there still may come a time when people interested in the site are better served by careful offline storage. We are also looking to apply this approach to www.bbc.co.uk/politics97.
Many have argued that www.bbc.co.uk/ww2peopleswar should remain accessible to a wide audience. Again, this is an example of a site we are looking to consolidate into a bigger product - in this case the history section of knowledge and learning.

Similarly, assets from many of the 170 sites will be re-presented in forms which can be more easily kept up to date. For instance, www.bbc.co.uk/hamlet has been superseded by https://www.bbc.co.uk/programmes/b00pk71s and placed into a format which will allow the data and assets to be refreshed or editorially changed going forward. The same has been done with www.bbc.co.uk/annefrank.

This is similar to what a site like the Guardian does when it updates its look and feel. A story from 1999 is still viewable but much of its context has gone, at least in the form which is most accessible, online: https://www.guardian.co.uk/news/1999/dec/03/guardianobituaries?INTCMP=SRCH

This means that if we wish to preserve a full record of what we have published, context as well as content, we need to explore a range of options including offline storage.

To restate our intentions: we are moving towards a rational content lifecycle for our websites, as practised by many other sites across the web. The aim is consistently high quality everywhere on the site. We have a number of stand-alone websites which will in due course become obsolete and need to be managed. Some will be consolidated into bigger, persistently managed content offerings. Others will be moved offline to be preserved.

As our plan develops we'll keep you informed.

Ian Hunter is Managing Editor, BBC Online

Editor's note: Some people seem to be experiencing difficulties commenting on this post. This is a technical problem or bug which is being investigated. Apologies and please bear with us. Update 1 p.m. - this bug has now been fixed and comments are now open again. Apologies.


  • Comment number 1.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 2.

    No, I'm afraid that still doesn't make much sense. While some care may be required to preserve an archival copy of the old sites suitable for longer term storage, so that they don't continue to degrade as things like search infrastructure change underneath them, there is no reason (at the very least, no reason explained in this post) why that archived copy cannot or should not be accessible online.

    You've asserted that "there still may come a time when people interested in the site are better served by careful offline storage" but not explained - what is the advantage of offline storage, as opposed to storing the exact same thing online?

  • Comment number 3.

    You say I later “clarified” and linked to one tweet of mine. That’s not entirely accurate. As I explained elsewhere in more detail at the time, the post from the BBC which I linked to was ambiguous:


    "The material taken offline is stored for future reference, or deleted altogether."

    Can you clarify as Managing Editor of BBC Online that nothing will be deleted?

    I should say that while the BBC clearly has greater costs associated with keeping this material accessible and online than random people outside the BBC, there is no doubt that people will spring up to host it all in browsable form when it is pulled down by the BBC, as a publicly accessible archive. It’ll be genuinely interesting to see what kind of IP tussles happen then.

  • Comment number 4.

    Perhaps the problem is a lack of transparency and clarity. The announcement of the plan to close 200 sites (ignoring the whole TLD/directory/etc issue) has always gone hand in hand with the announcement of 25% cuts to the budget of BBC online. However this list of sites isn't generally the stuff that costs money going forward. In fact most of these sites are already mothballed or not actively developed or maintained.

    I appreciate the need to save money and to focus on doing fewer things better, but mothballing/migrating/deleting/archiving most of this list of sites will contribute little if anything to the savings required. That doesn't mean that some housekeeping shouldn't be done - but let's not pretend that moving, labelling or removing this list of sites is what's going to save any real money.

  • Comment number 5.

    Unfortunately this post only reinforces the notion that the BBC just doesn't 'get it'--certainly true for those at the top of the organisation making these decisions.

    The central issue, and why this policy seems so shortsighted, is the assumption that the number of top level directories is indicative of the size of the website. If the entire website was placed under a /website/ directory, would the site be any smaller? No, of course not.

    As such, this seems like a superficial solution to a political problem. I've heard people suggest that the BBC has to pay for every top-level directory on the site, so there is a real cost associated with this number. If so, maybe the BBC should investigate other ways to save money, namely by not entering into restrictive and ill-conceived contracts with outsourcing companies like Siemens. Indeed, I expect this destructive policy of 'archiving offline' (which is still deletion from the web) will prove to be more expensive than just leaving it untouched.

    In your post, you cite the example of the website for Hamlet. Whilst programme information exists under /programmes/, what do you intend for the other (more useful) content on that site, for example /hamlet/characters/? Will it remain under the programmes directory, perhaps /programmes/b00pk71s/characters/ (in which case the superficial nature of this policy becomes clear)? Or will this content disappear? Or will it be repurposed somehow, moving to somewhere else on the site? With each directory facing a different fate, I hope your plans will be published in more detail prior to their deletion.

    Your second assumption is that the surrounding design is not important, but as a web designer I find these sites provide a fascinating insight into the web's past; the language of the earlier site--both written and visual--is fascinating in its casualness.

    What's most perplexing about this policy is that it replaces one that was far more appropriate, one often held up as a best-practice example of how to deal with archiving content on the web.

    I really do hope you review these plans.

  • Comment number 6.

    Even if material is still available but moved, as you suggest with your Hamlet and Anne Frank examples, what about links to the sites that you mention? You can't see any website, not even the BBC, in isolation, but as part of the wider Web ecosystem. If you do insist on moving sites around, you need to ensure that the old URLs don't break (e.g. with .htaccess or equivalent). While a small organisation could get away with leaving a trail of 404s around, the BBC simply cannot.

    As for taking content offline, why not release it to the Wayback Machine over at archive.org first? This is the most respected archive on the web and would enable the content to remain online at no cost to the BBC. You suggest that "people interested in the [OTR] site are better served by careful offline storage". I don't see how that follows at all. If I want to go and read a transcript of an interview, how am I best served by that transcript being offline and unavailable?

  • Comment number 7.

    Many thanks for these comments.

    Mydogminton is correct to say this is not about saving money, and may be right to point to a conflation of messages in the Putting Quality First announcements. In my first post on this subject I mentioned that the proliferation of top level directories was seen as one indication of the need to re-focus BBC Online. The cheapest thing would be to let sites degrade after their occasion has passed and the production teams have moved on. We have allowed this to happen in the past, but we don't think this achieves the level of quality our online audience expects from us.

    Bengoldacre asks me to clarify that nothing will be deleted. In the case of the listed tlds, we'll keep a record of any taken offline though it may not comprise every page ever published at that address. And we'll try to ensure that redirects take users to the most relevant current page. The long term value offered by tlds such as www.bbc.co.uk/wma or www.bbc.co.uk/communicate or www.bbc.co.uk/newsa or www.bbc.co.uk/holiday or www.bbc.co.uk/backpage is unlikely to be high, even accepting that it's hard to anticipate what future generations will find interesting. I would say that there'll be more interest in early versions of the BBC homepage or iPlayer or the news site, but as these are updated or replaced they are no longer accessible online.

    Ewan asks when it might make sense to store something offline. Well, the video on www.bbc.co.uk/nationonfilm is becoming harder to access and offers a poor experience. This will only improve if we invest in modernising the site and its assets (which would mean, of course, that the site as it currently is would have been replaced). The team who built and maintained the site have moved on. This seems to me an instance where over time the value offered to online users will decline and where the assets may be best preserved offline.

    We are continuing the work of re-shaping BBC Online around ten distinctive products and each of them will be considering the best way to manage the various types of legacy content in their areas. We'll publish an update in due course.

  • Comment number 8.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 9.

    Thanks for sharing.

