BBC BLOGS - dot.Rory

Public data: Free at last?

Rory Cellan-Jones | 09:33 UK time, Thursday, 21 January 2010

Go to a site called data.gov.uk this morning, and you'll find the fruits of a ten-year battle. It's being billed as a one-stop shop for developers hoping to find inventive new ways of using government data - and it was described to me by one man who's followed its long gestation closely as the "triumph of the geek".

Unlocking the huge quantities of data stored on computers across Whitehall and up and down the country in local government buildings has long been an obsession for a small group of dedicated enthusiasts. The issue brought together a combination of ambitious software developers who saw a commercial potential in all that data and cyber-utopians who believed that if the mission succeeded it would transform government and its relationship with citizens.

Screengrab of data.gov.uk

They made painfully slow progress, coming up against both incomprehension and inertia in Whitehall and some legitimate concerns about the cost of the whole project. Then in stepped a man who changed everything.

At a lunch at Chequers, the creator of the web, Sir Tim Berners-Lee, explained to the prime minister just why setting government data free was important, a vision he'd already outlined in a speech at the TED conference where he got the crowd chanting "raw data now!" I'm not sure whether he got Gordon Brown to join in with that chant around the Chequers dining-table, but he was given the job of making it happen.

In collaboration with Professor Nigel Shadbolt of Southampton University he set to work spreading the gospel through Whitehall - and surprise, surprise, when confronted with probably the nearest thing Britain has to a science superstar, the civil servants rushed to bring out their raw data and lay it at his feet.

I spoke to the two men about their efforts, Sir Tim joining us fittingly via a VOIP call after we somehow failed to fire up his ISDN connection, an older form of technology.

Nigel Shadbolt said Sir Tim had had a big effect on the civil servants: "There is star quality there; people are interested in meeting Tim and hearing the vision."

And Sir Tim said that while some people had concerns at the beginning - "will people accuse me of having 'dirty' data? Will they make unreasonable requests?" - they'd quickly realised there was some kudos in putting your data out there for people to use.

The site has been in beta for some months and now has nearly 3,000 datasets online, ranging from "Abandoned Vehicles (2003-04 to 2005-06)" to "Youth Cohort Study & Longitudinal Study of Young People in England". Already, some interesting applications have been developed - a school finder which lets you search local schools ranked by Ofsted score, FillThatHole, which uses ONS Census geography data to facilitate the reporting of potholes and other road hazards, and UK House Prices, a visualisation of property market trends using Land Registry data.

The mood amongst the "free data" community seems pretty upbeat about the whole project - one developer, Harry Metcalfe, told me that Tim Berners-Lee had really put the project on people's radars. He said that "low-hanging fruit" had been picked, but there was still a job to be done in liberating more real-time data, particularly from the transport sector. As we've seen by a couple of rows over iPhone apps, "public" data about things like train timetables and live departure boards is seen as a valuable private resource by its owners.

But can the Free Our Data crowd now be assured of victory? One man who's followed the whole project very closely is James Crabtree of Prospect magazine. He's optimistic but warns of some potential potholes - first, that when Tim Berners-Lee leaves, the spotlight will fade and it will just go back to being a geek issue; then, that despite the enthusiasm of both Gordon Brown and David Cameron, the politicians will turn to other matters. But Crabtree believes the key issue is all about maps, which are crucial to using just about any government dataset.

After a long-running row, it now looks as though the Ordnance Survey will allow free access to much of its mapping data. "But if that decision were to be reversed," says Crabtree, "then all of Tim Berners-Lee's work would come to nothing."

For now, though, prepare for a blizzard of new ways of manipulating public data, from crime maps showing which of your neighbours have been burgled to planning alerts telling you when the bloke across the road has submitted an application to build a two-storey extension. The data brigade says its mission is to use the government's own numbers to make all of our lives better. Let's see if it works.

Comments

  • Comment number 1.

    Great stuff - am delighted to see this going live today. The OS licensing issue is a big one, but only a tactical one. The bigger strategic questions are:

    - will products built from this actually be used by real people in their real lives (and how can this be fostered)?
    - what business models will appear? Nothing's free forever...
    - how will real user needs be considered in building applications using this data?
    - will all the data be there in months and years to come with assured quality? I'll want to know if I'm going to invest in developments that use it.

    There's a longer piece on these themes at http://paulclarke.com/honestlyreal/2010/01/welcoming-data-gov-uk/

  • Comment number 2.

    "it now looks as though the Ordnance Survey will allow free access to much of its mapping data"

    OS has maps at many scales, from 1:1250 up to the whole country. My bet is that OS will still fight tooth and nail to retain control of its large-scale data, as this is the costliest to keep up to date and the most valuable; however, they do need to respond to Google Earth.

    The really interesting stuff is the subsurface mapping of where services run - I'll bet this remains paid for only!

  • Comment number 3.

    Tim Berners-Lee "probably the nearest thing Britain has to a science superstar"? A nice title but he might have to fight Stephen Hawking for it.

  • Comment number 4.

    The data may now be "free" to the public but the cost to people who actually use the raw data and produce reports and results from it has increased massively.


  • Comment number 5.

    I too am really pleased to see the launch of the site. As someone who regularly advocates releasing datasets, data visualisation and the power of web-based apps to both private and public sector organisations, I think it is great to finally see the government go where many private companies fear to tread.
    http://chameleonnet.co.uk/blog

  • Comment number 6.

    "it now looks as though the Ordnance Survey will allow free access to much of its mapping data"

    Nothing in life is free. Ordnance Survey's costs will now need to be paid by the government instead of OS being a source of income.
    Everyone will be paying for the Geo data instead of just those that use it.

  • Comment number 7.

    A very quick glance late last night seemed to indicate that this is a start but it's got a huge way to go. Most of the content seems simply to be stuff which is already published through either ONS or through a variety of government departments and the site is, at present, simply signposting to a lot of stuff that already exists. Additionally, the vast majority of government data that is currently published does not fall in the category of primary data (the data which is actually observed); rather, it is a statistical summarisation of the primary data - counts, averages, median values, etc.

    The real value in publishing government data will come when the fear and paranoia that seems to surround the publication of primary data is overcome. Of course there are significant privacy issues that will have to be addressed but we've got over that hurdle for the publication of house sale prices, so with suitable anonymisation of personal data there's no reason why a lot more primary data couldn't be made available: for example the survey data that ONS bases a lot of its statistical outputs on. It is only when we get to the point of publishing a lot more primary data that we can expect to see some real innovation and some real value extracted from the data that government collects. Then the vision of data.gov.uk will have been achieved.

  • Comment number 8.

    You have to keep in mind this is not all 'raw data' - but manipulated figures - like the CRU ;)

  • Comment number 9.

    Manipulated or not, openness of all data and code for government-sponsored climate science is a key step, bang in line with where Tim Berners-Lee is taking us, and something that would have made 'Climategate' a non-event, as fervent global warming believer Glyn Moody pointed out in November - and as George Monbiot at once agreed with me at a debate on Climategate at Free Word in December. The good people (and sci/math/stats boffins) on Climate Audit have been arguing this for donkey's years. That link takes you to others with further discussion of the issue. The timing's right. Raw Data Now! And all that should go with it.

  • Comment number 10.

    "modellingman" is right - most of the "new" datasets are just links to existing Government sites providing the same old data in the same old unusable Excel or PDF documents.

    The part that was supposed to capture everyone's imagination was these datasets being made available in a format that allows people to create new, useful applications. Well, I've got news for you: the SPARQL interface that's supposed to allow developers to make these apps has almost nothing in it. The only financial or economic data in there is public spending by department, expressed in a few different ways (real terms, percentage GDP, absolute figures). No RPI, no deficit figures, not even a history of interest rates. So, if you're a human wanting to read the data, you might well be grateful for the fact that you can find a link to it more easily. But for those who wanted to make use of the data, it's next to useless.

    This had better just be the beginning. All the datasets need to be made available in an open format that programmers can use, otherwise you'll get zero benefit out of it. I wish the media hadn't got carried away in the hype.

  • Comment number 11.

    I hope we get some OS data. The large scale data (MasterMap) that I use at work will never be free, and that's fine as that actually brings in the bacon for OS. All the other datasets are simply a different cartographic rendition of the MM data, and the other datasets are losing ground to Google and open source neogeographical mapping such as OpenStreetMap.

    A smaller scale raster dataset being free would be a huge advantage, as well as unlocking the potential of all the other information in data.gov.uk. I completely disagree with Paul Clarke. The value of place against this data is of paramount importance. GIS provides something that Far better for that data to be provided by OS would make things much easier as most other mapping services would need the other data to be re-projected.

  • Comment number 12.

    @Jimbo, #6:

    That data is quite cheap really, but cheap is still out of the price range for a lot of 'hobbyist' and free-to-air services. It's also paid for by the public services you use all the time - e.g. Police and Ambulance. So really, we all pay for it anyway.

  • Comment number 13.

    Paul Clarke asked: ‘- will all the data be there in months and years to come with assured quality?’

    The National Archives, who are in the business of safeguarding government data, are pleased to reassure him that we are working with the creators of government data to ensure that the data is captured and, equally importantly, useable, for months and years to come. Plans are already in place to capture regular snapshots of the data.gov.uk site itself in the National Archives’ UK Government Web Archive.

    As for the all-important data that data.gov.uk points to, we shall be advising government departments on the best way of publishing their data on their websites so that we can capture more of it. It is our intention that, if, in the future, data.gov.uk is pointing to data that has disappeared from the government website, the public will be re-directed to an archived version of the web page, where they will be able to download the data.

    We are hoping it won’t be too long before someone comes up with an App that mashes government data from the archives…

  • Comment number 14.

    I am a contributor providing an app that makes use of publicly available data, I am now waiting for historical generated electricity to be included in the datasets, my app will then start to be really useful!

    What will really show the govt taking this liberalisation seriously is when, rather than pay a consultancy vast sums to replicate something that already exists for free, they approach the app developer and fund further development. That way we all win, both enthusiastic developer, the user and the taxpayer! I suppose the only loser will be the consultant who misses out on an inflated fee.

    Re: Comment above on underground assets, these maps are held individually by the utilities, not by OS. They are normally shared reasonably freely between utilities, however it is rare to find a single resource with say water, gas and electric on one map, let alone telecoms and cable, some of which remains restricted.

  • Comment number 15.

    Simonm is correct. Utilities data is held by the utilities themselves. So for gas data that means either one of the 4 main GDNs (National Grid, Scotia Gas Networks (AKA Scotland Gas Networks and Southern Gas Networks), Wales and West Utilities and Northern Gas Networks), or it means one of the smaller Independent Gas Transporters. For electrical data that means your local "incumbent" (whoever owns the former local electricity board in your area), and obviously for water, it is your local water company(s).

    There are steps being taken to integrate this data, but it is a slow process. Of course the background mapping is OS, but uses the large scale mapping that is not likely to be made free.

  • Comment number 16.

    This is a great development, however as with all complex data issues, the public will only get excited by this 'geeks victory' when the applications start to roll out.

    Combining high speed mobile broadband and newly available public data will hail a new era for data freedom.

  • Comment number 17.

    Your comment is great, Chris Mills. You almost said everything I wanted to say.

    Angela Dani,

  • Comment number 21.

    I'm a little confused about this story, I better do more research before I finish reading this, thanks.

  • Comment number 22.

    While some of the rules were overly fussy, in general setting these sorts of standards at EU level makes a lot of sense. It is a shame they are weakening their stance here rather than just refining the rules to make conformance as easy as possible.

  • Comment number 24.

    I spent the beginning of a very busy week talking about the benefits of making public sector data more accessible. I was speaking at the launch of public transit data in Brussels, where the local public transport agency STIB made their schedule information available for use within Google Maps in Belgium.

  • Comment number 25.

    "At a lunch at Chequers, the creator of the web, Sir Tim Berners-Lee,"

    Sorry, but the internet as we know it, a place where text, images, audio, video, and other information, is shared through linked sites and pages was not his idea. We don't have that because of him. It is why the internet was created in the first place!

    Nothing personal against the man but stop giving him so much undue credit just because he is British. He is very modest about his accomplishments regarding the internet, and that is for good and honest reasons.
