Open data and data journalism

Yesterday I attended a workshop organized by Etalab on data journalism. Since open data, data visualization and storytelling with data are my three work interests, I could not have been anywhere else that day.

Interestingly, while speakers and attendees were very much discussing the same subject, what was said (or implied in the questions asked) was very different. On some topics participants held opposite opinions, while on others there was strong agreement.

Inspiration and enthusiasm

That was definitely the common denominator across presentations.

In short: visualization + journalism = win.

Every presenter, @dataveyes, Pierre Falga, @datastore, @sayseal, @we_do_data and @epelboin, showed or talked about things which were pretty awesome and which would not have been possible without data or visualization. While I was familiar with the other examples, I was most fired up by Fabrice Epelboin’s presentation of Tunisian media and its dataviz gallery.

What was interesting was how easy it was to tell a memorable story with the support of data. I think that for the picture to be complete you also have to include the viewer’s assumptions and the presenter’s or journalist’s narration. One example, which was shown by both Caroline Goulard and Simon Rogers, is the relationship between tweets and the UK riots.

The unspoken assumption was that social media had helped organize the riots.

Facts in hand, it turns out that the bulk of the tweets related to a riot happened after, not before, the event. So the narrator helps us conclude that riots caused tweets rather than the other way around.

Another example from
We assume that tertiary graduates have better job prospects than those with less education.

This isn’t the case in Tunisia, where graduates endure a 23% unemployment rate, while the rate for those who haven’t completed primary school is around 5%.
Comment by Fabrice Epelboin: the only thing left for them to do is prepare the revolution. I find this a very clear and rational explanation of the Arab Spring, in contrast with how television presented those events.

Is this difficult?

It requires work

And no one denies this. Cécile Dehesdin and We Do Data walked us through their work process, from the original idea to the final piece. Cécile put more stress on the usage aspects, while Karen and François emphasized what illustration and aesthetics bring to the final result. Both tried to convey to us the amount of time and effort it takes to achieve something.

and resources… or not

Then Pierre Falga and Simon Rogers gave somewhat conflicting views of the inner workings of a newsroom. While Simon Rogers depicted the process as relatively effortless and quick thanks to freely available tools, Pierre Falga’s view was that an online newsroom’s resources are very thin, which prevents most media from fully embracing data journalism. To nuance Rogers’s position and bring it closer to consensus, he argued that the work-intensive part is not the output proper but rather the data collection, and like Cécile and Pierre he had his share of horror stories on this front.

Thank you, open data

All presenters were grateful for data being increasingly accessible through open data initiatives. Not all is rosy in dataland, however, as institutions here and there are not all excited about the prospect of spending their own resources to retrieve data for journalists, even in cases where they are legally required to.
While data journalism obviously needs open data, the reverse is possibly even truer, and that may be Etalab’s motive for organizing the event. So far, official data portals haven’t proved to be directly useful to the concerned citizen, so it is those who are able to use this free data and turn it into attention-arresting stories who give it a purpose and demonstrate very visibly that the open data process truly benefits all.

Is there a demand for data journalism?

Presenters didn’t all address this question head-on, but they seemed to have mixed opinions about it. The Guardian has been doing data journalism for over a century and gave no impression of ever having reconsidered the question. Others in the room, including attendees, had less faith in the matter. Pierre Falga and Eric Mettoux from admitted their share of responsibility, as that demand is largely dependent on the supply of quality material from existing media.

More fundamentally, I see that the mix of data visualization and communication is commonly referred to as data journalism, which may be a slight oversimplification.
Why would the task of communicating with data visualization be restricted to journalists or media? Companies and government agencies alike have considerable budgets devoted to communication. IMO they should be the ones driving that effort. To a curious audience, that is, to the people who are actively seeking information on a certain topic, answers delivered through data visualization can be insanely more powerful and cost-effective than classic communication tailored to a more passive receiver.