UK web archive goes live but not online
- 19 December 2013
A major archive of British websites has gone live - but not on the web.
Instead, the project can only be accessed in person from a terminal in one of the British Isles' six biggest libraries.
It follows a decade of legal wrangling between the British Library and publishers.
Restrictions imposed by the Legal Deposit Libraries Act 2003 mean the archive can only be accessed in library reading rooms.
The British Library said that there had been "some discussions into the possibility that the Act might be changed in future so that the archived copies of websites might be made available via the web.
"Making archived copies of material available online, and also allowing it to be indexed by search engines, could potentially affect the volume of web user traffic to the rights owner's live website and harm their business model," the statement added.
The project was announced in April this year and has since been archiving the entire UK web domain, including blogs, public tweets and Facebook pages.
It has already amassed billions of web pages.
Anyone over 18 is able to get a free pass to the reading rooms and the collection can be viewed at the British Library, the National Library of Scotland, the National Library of Wales, the Bodleian Libraries, Cambridge University Library and Trinity College Library in Dublin.
Critics have attacked the decision and the length of time it has taken to make the archive available.
"What's particularly tragic here is that the 10 years of foot-dragging and obstructionism by British publishers has resulted in a loss of countless millions of older web pages that are now probably gone forever - and with them, a key part of the UK's early digital heritage," said Glyn Moody on tech news site Techdirt.
Speaking about the archive in April, Richard Gibby, of the British Library, admitted that a lot of material, including information after the London bombings in 2005, has already been lost in a "digital black hole".
Separate from the UK domain archive, the British Library also runs the UK Web Archive, which has collected more than 13,000 websites with the permission of rights holders. This is freely available online.
In the US, the Internet Archive charity has been copying the web since 1996. Its popular Wayback Machine is an archive of 364 billion web pages, designed to show people what sites looked like in past years.
Meanwhile, the National Library of Norway is planning to digitise all the books written in Norwegian by the mid-2020s.