Data journalism: What’s new, what’s not, and work in progress

Jonathan Stoneman is a freelance journalism trainer and consultant, specialising in data journalism

Man of Steel: data journalists don't need to be superhuman

If you’ve heard about data journalism but haven’t worked out what it is yet - or, more importantly, what it can do for you - there are several ways to get started. You could go on a course; you could research some basic techniques on the web; you could read an introductory book on the subject.

If it’s a book you’re after there’s no shortage. A quick search of a well-known online bookstore found more than a hundred books related to data journalism. (If you think that’s good, a search for ‘big data’ - one of the buzz phrases of 2013 - produced more than 5,000 entries in the catalogue.)

Into the mix comes a new book edited by John Mair and Richard Lance Keeble entitled Data Journalism: Mapping the Future, which comes out in January. Like the Data Journalism Handbook it is made up of around 20 self-standing chapters written by experts in various subfields of data journalism.

Invited to get a preview of some of them, I plumped for Data Visualisation: Now for the Science, by Jacqui Taylor; A Beginner’s Guide to Data Journalism and Data-Mining/Scraping, by Nicola Hughes; and Data Journalism Workflow: Confronting the Myths, by Paul Bradshaw.

In her chapter, the Times data journalist Nicola Hughes offers some very helpful thoughts on coding based on how she came to it herself. There are no technical lessons here, but perhaps a reassuring comparison - for the uninitiated - with something much more familiar. The power of coding, Hughes concludes, is that “you are not building things from scratch but reusing code others have packaged for you.” She’s talking about ‘libraries’.

“A good analogy is apps for smartphones. A smartphone has very basic functions but becomes incredibly useful once you start installing apps. The same is true for coding and coding libraries.”
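To make Hughes's point concrete, here is a minimal sketch in Python of what "reusing code others have packaged" can look like. It is my own illustration, not an example from her chapter. The figures are invented, and the function name summarise_callouts is hypothetical. The csv and statistics libraries, which ship with Python, do the parsing and the maths.

```python
import csv
import io
import statistics

# Invented sample data for illustration only; not from any real source.
SAMPLE_CSV = """station,callouts
North,120
South,95
East,143
West,110
"""


def summarise_callouts(csv_text):
    """Return (median callouts, busiest station) for a CSV with
    'station' and 'callouts' columns. Hypothetical helper for this sketch."""
    # csv.DictReader turns each row into a dictionary keyed by column name,
    # so we never write our own parsing code.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = [int(row["callouts"]) for row in rows]
    # The built-in max() finds the row with the highest callout count.
    busiest = max(rows, key=lambda row: int(row["callouts"]))
    # statistics.median does the maths for us.
    return statistics.median(counts), busiest["station"]


if __name__ == "__main__":
    median, busiest = summarise_callouts(SAMPLE_CSV)
    print(f"Median callouts: {median}; busiest station: {busiest}")
```

The function body is only a few lines long because the libraries do the heavy lifting, which is the "apps for smartphones" idea in miniature.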

Paul Bradshaw’s piece in this book is necessarily short, so it’s useful to know that the investigative journalist, academic and blogger has also written what many regard as the best introductions: the ebooks Scraping for Journalists and Data Journalism Heist.

This time he starts by challenging the “myths” that data journalism is resource-intensive (it can save resources); that it only ever involves lengthy digging (it can react quickly); and that a data journalist needs to know everything from scraping to data visualisation (they’re not superhuman).

He identifies a “cultural battle” between journalists who want to “own” a story and the “open” culture of collaborative online networks - like the Guardian’s.

And in his exhortation to “lower the barriers to collaboration and seek out collaborators”, Bradshaw, like Hughes, recognises the familiar: “Think about communities of people who might help you do better journalism.”

That might be the ScraperWiki mailing list if you need data scraping, or the forums on Stack Overflow if you’re “particularly geeky”. It might also be the person who handles data at your local fire service, or a local university statistician who can help you validate data. “Contacts have always been vital in journalism, and data journalism is no different,” Bradshaw says.

As for sticking to flow charts, ‘heuristics’ and other touted data journalism workflow models - well, why not? Even if they’re work-in-progress, visible systems help journalists identify ‘blind spots’ and think hard about the way they’re working. “Ultimately, that’s key,” Bradshaw argues, “because we haven’t yet worked out what best practice is.”

You might think there is little that is familiar to most of us in the science behind data visualisation - except that it’s all based on human behaviour, writes Jacqui Taylor, and particularly the habits of ‘Generation Z’ (those born after 1993).

That’s because, like the previous generation (‘Y’), they have strong visual abilities - but combined with advanced movement and touch (kinaesthetic) capabilities. Taylor predicts increased use of interactive visualisation by journalists to engage this audience, and some of the visual structuring patterns she outlines are pretty, well, scientific.

Take heart then from one intuitive approach she describes as the “progressive reveal technique”. This allows the data story to be told in stages, “rather like the traditional journalism techniques… headline, byline, body text etc”. And it has “stickiness”, averaging eight to 10 minutes’ viewing for web-facing visualisations, claims Taylor, who’s seen the future - and it’s visual.

“Data is a language which only a few will learn but we can use data visualisation to communicate this language of data to a global audience,” she concludes.

Not a lot of this is new, of course, apart from the ability of computers to do the number crunching for us. One of my favourite books is The Visual Display of Quantitative Information by Edward Tufte. First published in 1983 [not 2001 - corrected], it includes discussion of one of the most famous infographics ever published: a chart showing Napoleon's army's strength as it marched to, and retreated from, Moscow in 1812. The graphic, by Charles Joseph Minard, dates from 1869!

But if you are looking for an entry point, I’d say the two guides to begin with are the Data Journalism Handbook and the Centre for Investigative Journalism’s Data Journalism. Both are free if viewed online. Both provide a great practical start in their field.

So where does Data Journalism: Mapping the Future fit in among all the other books on offer? I’m not sure. It’s unclear who the audience is and the multi-contributor genre - while covering many aspects of the field - leaves little room to go into depth. A launch event being hosted by the Media Society in January may take the debate further.

But, to return to the question of how to get into data, there’s another way which several contributors refer to in this latest book: the best practitioners simply learned by doing. They found out what they needed to know about data analysis/data mining/scraping/statistics/visualisation when they needed to know in order to get the story. Sound familiar?

Data Journalism: Mapping the Future is published by Abramis on 3 January.

Other College of Journalism blogs by Jonathan Stoneman
