Finding truth and beauty in data
- 29 June 2011
- From the section Technology
"Information wants to be free" has been one of the rallying cries of geeks, digital activists and hackers since the earliest days of the net.
That's free in the sense of not costing a penny and in the sense that it is always looking for ways to escape. It wants to get out from the databases and drives where it is stored and mingle with the web users of the world.
For a long time the information being freed online has been the expertise of web users. Many, many communities have sprung up around the places where information on all manner of subjects is shared.
Increasingly, the information finding its way on to the web is the raw stuff, the numbers, the data.
Some of that data comes from day-to-day use of the web but many organisations - local and national governments, corporations and web firms - are making huge stores of it available to anyone and everyone to play with.
Even better, the data visualisation tools that can manipulate and present that information are getting easier to use and available to anyone.
When it comes to data, visualisation means turning those raw numbers into graphs, diagrams and animations.
"There's a strength to visualisation because if you showed the data as a series of numbers it wouldn't mean much," said Dr Martin Austwick, a research fellow at UCL's Centre for Advanced Spatial Analysis who uses data visualisation techniques in his work.
In one project, Dr Austwick has been visualising data generated by users of Boris bikes in London to map the ebb and flow of the hireable bicycles around the capital. It is, he said, a great example of the complex data sets that people are starting to visualise.
The data comes from about 400 bike stands that log each time a bike is picked up or dropped off. The logs record which bike has gone where, so estimates can be made of the route a user takes between two stands.
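As a rough illustration of how stand logs like these could be turned into journeys, the sketch below pairs each pick-up with the same bike's next drop-off and counts trips between stands. The event layout, bike and stand names, and the functions `pair_trips` and `flow_counts` are all invented for this example; the article does not describe the real data format or Dr Austwick's method.

```python
from collections import Counter

# Hypothetical event log: (minute, bike_id, stand_id, action).
# All values are made up for illustration; the real log format is not given.
events = [
    (0, "bike-1", "A", "pickup"),
    (5, "bike-2", "B", "pickup"),
    (12, "bike-1", "C", "dropoff"),
    (20, "bike-2", "C", "dropoff"),
    (30, "bike-1", "C", "pickup"),
    (41, "bike-1", "A", "dropoff"),
]


def pair_trips(events):
    """Match each pick-up with the same bike's next drop-off."""
    open_pickups = {}  # bike_id -> (start minute, start stand)
    trips = []
    for minute, bike, stand, action in sorted(events):
        if action == "pickup":
            open_pickups[bike] = (minute, stand)
        elif action == "dropoff" and bike in open_pickups:
            start_minute, start_stand = open_pickups.pop(bike)
            trips.append((bike, start_stand, stand, minute - start_minute))
    return trips


def flow_counts(trips):
    """Count how many trips ran from each origin stand to each destination."""
    return Counter((origin, dest) for _, origin, dest, _ in trips)


trips = pair_trips(events)
flows = flow_counts(trips)
for (origin, dest), count in sorted(flows.items()):
    print(f"{origin} -> {dest}: {count} trip(s)")
```

This only recovers where each journey started and ended and how long it took. The path actually ridden between two stands would still have to be estimated separately, for example from a street map.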
Understanding the dynamics of this any other way than visually would be impossible, said Dr Austwick.
Animating it, using circles glowing blue and red to represent activity at bike stands, makes it almost symphonic. It lays bare one of the dynamics of London life.
"We're trying to find patterns in the disorder and map the underlying ebbs and flows of the city," he said.
A faithful convert to data visualisation is Andy Kirk, who has spent much of his working life in operational research roles. That job typically involves gathering data about a process or procedure within a business, then carrying out exhaustive analysis to find better ways to do it.
When he took up a job in academia, Mr Kirk started questioning how he presented the data he was analysing and started looking for tools to help do a better job.
He was lucky because at about the same time the software tools to do data visualisation, many of them open source and freely available, were starting to appear.
For him those visualisation tools often provide a quick route to understanding what data has captured. The best visualisations are a mix of art and science, said Mr Kirk, and use aesthetics to lay bare what would otherwise stay concealed.
"It brings patterns to light that [you] would not otherwise see," he said. "It's about making data accessible but that does not necessarily mean simplistic."
Data visualisation can mean that the facts buried in a data set become unearthed, no matter how unpalatable they are.
"You can obscure truth with statistics," said Mr Kirk, "but the beauty of visualisation when done correctly is that it brings out the true pattern that can be uncomfortable to interpret."
On that theme, Mr Kirk is helping to judge a competition that aims to find the best way to visualise data about the ethnic breakdown of students at Britain's universities.
The competition came about via a Tweet from data visualisation guru David McCandless who did not have time to do the job himself. The winners of the competition will be announced on 4 July.
And it is in the social and political aspects of data visualisation that its real value emerges.
"If you make these tools and data available to a broader range of people you are just going to get better ideas," said Dr Austwick.
Given that a lot of the data becoming available is from official sources, those better ideas could have a lasting impact.
"Long term, if you can analyse these systems and understand them a lot better we can have policy improvements that make them work better," he said. "These are things that affect a lot of people, it's about quality of life."