Rory Cellan-Jones

Sir Tim's cry - "raw data now"

  • Rory Cellan-Jones
  • 12 Jun 09, 10:07 GMT

When I heard Gordon Brown reveal the name of his latest celebrity appointment in the Commons, I was intrigued. Sir Tim Berners-Lee, now charged with helping free up government data for all to use, is not a political animal. Had the founding father of the web realised, 3,000 miles away in Boston where he is based, that he risked becoming embroiled in Westminster politics? After all, it took only a few hours for Sir Alan Sugar to come under attack after his appointment as the government's Competitiveness Czar.

But when I got hold of Sir Tim on the phone yesterday, he was adamant that this was not some party political job, but part of a grand global mission. He pointed me towards a speech he'd made at the TED conference in California about the issue, where he stressed the importance of setting free all sorts of public data, part of his continuing efforts to reframe the web as a tool for interpreting numbers as well as words.

"I had the audience chanting 'raw data now'!" he said. I listened back to that talk and yes, Sir Tim did manage to set a chant echoing through the crowd, though even a roomful of Californians had to be jollied along before playing ball.

The unworldly scientist became a lot more circumspect when I asked him for concrete examples of the kind of data that might be released, sidestepping my question about putting MPs' expenses online.

He was also unwilling to concede that the UK might have a poor record of releasing data so far. He produced an anecdote about the Department for Transport putting out data about bike accidents, which was swiftly transformed by an outsider into a map, and said he saw some enthusiasm within government which he would encourage: "It hasn't been clear how to do it, exactly what format to put it out in, how to index it and keep track of it."

Sir Tim encouraged bloggers to come up with ideas for the kind of data they would like to see made available. He may soon find himself swamped with suggestions, and then he will have to tackle the Whitehall bureaucracy, and some thorny questions about privacy, security and the commercial value of some public data. What, for instance, will be his line to the Ordnance Survey, which has clashed with "free our data" campaigners over its wish to continue making money from its mapping data, rather than simply putting it out there for anyone to exploit?

Now some cynics might suggest that the government needs no lessons in releasing valuable data into the wild, after all those stories of missing laptops, lost discs and USB sticks. But will the cry "raw data now" resound through the civil service, with Sir Tim leading a chanting crowd of bureaucrats through Whitehall? "We'll see - listen carefully!" was the web creator's advice. But I fear he may be in for a bruising few months, as he tries to convince Sir Humphrey et al to let it all hang out.

Still, we've at least taken his message onboard. I've posted our entire phone interview here, apart from the bit where I asked Sir Tim what he had for breakfast to check sound levels. Our little contribution to freeing up public data.

Comments

  • Comment number 1.

    It is crystal clear that we are entering an Information Century. The Internet and these new revolutionary Platforms like Facebook and Twitter have afforded the Citizen a great deal more Power: they can conjoin, scale and turn their whisper into a roar. Moldova was an excellent article and we are seeing the ripple effects all across the Globe. The Political class is indeed feeling the Citizen's hot breath on their collar. And this dynamic has a momentum of its own. It's practically a Tsunami wave. Mr. Berners-Lee is indeed tasked with managing this change. You cannot staunch it, that's for sure.

    And I do think the dynamic between the Rulers and the Ruled is being altered dynamically. There is a tide of History and this is it.

    Aly-Khan Satchu
    www.rich.co.ke
    Twitter alykhansatchu

  • Comment number 2.

    I'm amazed the person using the cycle data on a map hasn't been taken to court by Ordnance Survey for using 'derived data' on a Google map. After all, if you've even looked at or even dreamed about an OS map then any address data is classed as 'derived' by OS. By putting this 'derived data' on a Google map you're effectively sharing copyrighted OS data with Google and breaching copyright. This is the type of backward thinking that needs to be stopped in its tracks. It's stifling innovation.

  • Comment number 3.

    It is amazing that someone has been given access to accident data that could be mapped, it is a flagrant breach of the t&cs that surround its use. What next - mapping crime statistics? Where will it end, voters might get to see where they are getting a good bang for their buck on all sorts of community spending.
    This will just not do.

  • Comment number 4.

    It is something I have been wondering might happen in journalism - if all non-confidential materials used in preparing the article would be attached to the article itself. This means that if the reporter is deliberately skewing comments out of context it makes it easier for them to get caught/corrected. It could be seen as analogous to how scientists try to make all data they use available to help replication/confirmation of their work, and references to all the previous papers their work is built on.

  • Comment number 5.

    The test for this is the Ordnance Survey; if the data they collect were released free to use, just as the US equivalent's is, we know the government are serious about this, if not, it's just empty rhetoric.

  • Comment number 6.

    What about the Free Our Bills campaign ( http://www.theyworkforyou.com/freeourbills/ ) that's been campaigning for ages to try to get the parliamentary bills put online? They reckon that would at most require one person working full time to do that. If the government can't stump up for that - providing access to the documents that are the cornerstone of our society - then I can't believe they're very serious about this.

  • Comment number 7.

    if ... the government are serious about this, if not, it's just empty rhetoric.

    On past and present evidence... the very idea.

  • Comment number 8.

    We have to start with the Ordnance Survey. As long as they are tasked with making money rather than providing a public service nothing will change.

    The Guardian have been campaigning about this for years.

  • Comment number 9.

    You can't say "free our data" without saying what you want it freed for. Many organisations are campaigning for free data so that they can derive products from it which they then sell. Fair enough if they are UK companies paying UK taxes, but competition law says that the data has to be also made available to non-UK companies - net return to the taxpayer potentially zero. It is actually fairer and more transparent to make data available on reasonable terms to all comers. This was the premise of the Reuse of Public Sector Information (RoPSI) directive. Problem is too much is excluded.

    So the question becomes are the Ordnance Survey charges reasonable? I am with most other posters on that one.

  • Comment number 10.

    maybe I'm simply too cynical but firstly he's "Sir" Tim Berners-Lee, ie. a "safe hand" who's unlikely to rock the boat, and secondly, when he stresses "..the importance of setting free all sorts of public data.." he's not saying setting free all data -- big difference.

  • Comment number 11.

    do read "blind faith" by ben elton. once you have this data available, it will only be mis-interpreted.

    So there are a lot of motorbike accidents on my street. is it an unsafe street, or are there just a lot of motorbikes? (or one person particularly non-adept on his chopper?)

    recent example - crassly misquoted, as research so often is by journalists.

    "Red wine stops cancer"

    or maybe this could be "people who drink an occasional glass of red wine are more likely to have a healthy/active lifestyle than others..."

    or "people who drink red wine aren't drinking beer - which may or may not give you cancer" (as i said - i'm making this up)

    evergrowingbrain.blogspot.com

  • Comment number 12.

    The problem with "freeing data" is a matter of ownership. It sounds wonderful when it is data that is nothing to do with yourself, but becomes a different matter when you are involved or are within that data to some extent.

    The problem is that we have never defined clearly the whole idea of Privacy - at least not in terms of how that relates to the internet. Without stating the complete obvious, the internet is completely different to all previous forms of data communication: unlike phone, mails, fax and so on, the world wide web in order to COMMUNICATE data has to STORE data.

    Of course, I suppose this makes it intrinsically very inefficient (which it is). But more importantly, if you wish to use the internet, you have to be prepared to give personal information away to people with whom you have no contract.

    To be clearer:

    If you used an old-fashioned telephone in the UK, you had a contract with the General Post Office to supply a physical line and the telephone on the end of it. Your dealings with them were direct and confidential. When you made a phone call, you were physically connected to the recipient mechanically, and it was not possible to store that transaction. The only data recorded was the distance of the connection and time taken, and that was only collected by the company with which you had voluntarily contracted.

    With the Web, you have a contract with your ISP, but once the data leaves their network, it is in territory completely outside your control. The data is stored physically on data storage devices around the world as connections are made and a copy of your data (not the original) is eventually delivered. Hopefully the various copies of the data created in the process are then deleted - but you have no control over that whatsoever. It is a trust issue.

    This type of data communication, where to be transmitted the data is stored and copied, is now affecting more and more communication types, including the telephone.

    As a user, you basically have to waive all your privacy rights, or at least control over your privacy, or not use the systems at all.

    So, our privacy is not defined, and our control over our data cannot, technically, be assured.

    We now have this mad situation where governments (theoretically our servants) demand the right to collect our data beyond the control of our privacy.

    In return, with freedom of information, we demand the right to expose that data to the public.

    So not only have we no control of our data, but we are complicit in exposing that data, or at least the effects of our private existence to the general public.

    Now, obviously, there is in most of this the safety of anonymity. But remember that us being anonymous is not something automatic or implicit in the data processing, but something intentional that has to be written in.

    One way or another, we are throwing away the idea of a private existence. So much so that to attempt to have one is questioned and seen as suspicious not just by the authorities but by society as a whole. To be private, the best thing is for your family to live a hermit-like existence, self sufficient, without any form of modern, storable communication, using cash transactions, home education ... you get the picture.

    But even then, a government employee will come and ask you to register your hermit cave and family, because "this is important for the government to know."

  • Comment number 13.

    The point about organisations campaigning to get "free" data to derive products that they can sell is a good one - there are already examples of organisations in the UK (both of the "not for profit" kind and commercial entities with whom the former compete in some arenas) who are either vocal about or (apparently) surprised by having a cost of sale! Welcome to the real world! Screen scrapers in the UK and offshore are already harvesting unstructured data from diverse sources both government and commercial to create new services (often of great social value) and products - where these are offshore (and the information economy is more subject to this than some might realise) there is little or no net return.
    It all comes down to what we understand by "free" - it seems that in this and other cases it is actually being used as a verb, in the sense that data should be accessible, be understood through metadata, and be accompanied by details regarding the terms under which that data can be utilised. The term is not being used to mean "at no cost".
    Ordnance Survey may be the most high profile target for the lobbyists and hobbyists but there are myriads of other arms of government that hold far more data of far higher value than the Ordnance Survey - local authorities are a prime example but there are many others - LEAs, PCTs, SHAs, RDAs, research councils, executive agencies and central government departments, not forgetting organisations in which government has a controlling interest (and that includes some financial institutions as I recall!). With even Free Our Bills struggling to be delivered upon, the release of metadata describing data from such genuinely fully tax payer funded parts of the government machine would at the very least allow all of us to begin to understand what it is that the government collects, how it collects it and make decisions about whether it will be useful to us and if so understand the terms on which we can then use it.
    The UK Location Strategy and the POIT report have both in their own ways mooted a central repository of government data. Not only will this fall foul of civil service inertia, data sharing and privacy concerns but runs the real risk of being a central government IT project carrying with it all the baggage and cost that that entails. Mandating the publishing of metadata by all agencies kicks that into the long grass while satisfying public pressures to reform and perform.
    On a related matter, those who follow the US may be aware that YourStreet recently dropped the use of maps from their hyperlocal news site, presumably because they weren't worth it (i.e. they do cost money) and newsfeed aggregation doesn't an LBS application make. Maps can provide context and do help in understanding analyses in space and over time but if you can't get the content to undertake that analysis and generate value to you and your target market then, as they say in Ireland, "I wouldn't be starting from here".
    On this basis I would hope and expect that Sir TBL will focus on delivering the promise of the verb rather than the hollow rhetoric of the adjective. In many cases the latter may well follow as a matter of course.

  • Comment number 14.

    You might also be interested to see Rufus Pollock’s original post on this on the Open Knowledge Foundation blog [1] - which Tim Berners-Lee cites as the origin of the “Raw Data Now” meme [2].

    [1] http://blog.okfn.org/2007/11/07/give-us-the-data-raw-and-give-it-to-us-now/

    [2] http://www.w3.org/2009/Talks/0204-ted-tbl/#(34)

  • Comment number 15.

    [Unsuitable/Broken URL removed by Moderator] On their behalf, if that was true, I hope they take a civil action against him.

 
