At the recent VisWeek conference, Jessica Hullman and her coauthors presented “Benefitting Infovis with Visual Difficulties” (pdf), a paper that suggests that the charts which are read almost effortlessly are not necessarily the ones that readers understand or remember best. In response, Stephen Few wrote a rather harsh critique of this paper (pdf). As I read it, I felt the original paper was not always fairly represented but, more importantly, that the views developed by both parties are not at all irreconcilable. Let me explain.
What is cognitive efficiency, or “say it with bar charts”
For quite some time, we were told that to better communicate with data, we had to make visuals as clear as possible.
The more complicated way of saying that is to talk of “cognitive efficiency”. By reducing the number of tasks needed to understand a chart and simplifying them, which is sometimes called reducing the “cognitive cost” or “cognitive load”, we are supposed to improve the chart on every count.
For instance: bar charts are easier to process than pie charts, because it’s easier for the human eye to compare lengths than angles. So, with equivalent data, bar charts have a lower cognitive cost than pie charts. Likewise, bar charts which are ordered by value (smallest bars to largest bars) are easier to read than unordered ones. Ordered bar charts have an even lower cognitive cost than unordered ones.
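To make the ordering point concrete, here is a minimal sketch in Python that sorts a handful of values from smallest to largest before drawing them as text bars. The function name `ordered_bars`, the `width` parameter and the sample figures are all made up for illustration, and the sketch assumes at least one value is positive.

```python
def ordered_bars(values, width=40):
    """Return text bars sorted from smallest to largest value.

    Assumes `values` maps labels to non-negative numbers and that
    the largest value is greater than zero.
    """
    largest = max(values.values())
    lines = []
    # Sorting by value is the step that makes neighbouring bars easy to compare.
    for label, value in sorted(values.items(), key=lambda item: item[1]):
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<10} {bar} {value}")
    return lines


# Hypothetical figures, for illustration only.
sample = {"North": 12, "South": 30, "East": 7, "West": 21}
for line in ordered_bars(sample):
    print(line)
```

With the bars sorted, the reader only has to compare each one with its neighbours, instead of scanning the whole chart to find where a given value ranks.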
Conversely, adding non-data elements adds extra tasks for the reader and increases cognitive cost. These non-data elements have been reviled by Edward Tufte as “chartjunk”. His data-ink theory says that out of all the ink used for the chart, as much as possible should be devoted to data elements. Again, this goes in the direction of data efficiency.
Engagement rather than immediacy?
Again, for quite some time those rules were held to be universal. Yet several people have tried to challenge them, the latest being Jessica Hullman in her paper “Benefitting Infovis with Visual Difficulties”. This paper was so thought-provoking that it received an honorable mention at the recent IEEE Information Visualization Conference 2011 (as a note to the non-academic reader, this is quite a competitive achievement).
New information visualisation techniques are often evaluated. This paper argues that such evaluations typically consider response time or accuracy, and not how well users are able to interpret and remember visuals. When only the former criteria are taken into account, then cognitive efficiency is the superior framework. But this is not the case for data storytelling (which is, arguably, a small subset of all data visualizations).
When visualizations attempt to transmit a message, then how well users can receive this message, as well as their capacity to remember it for a long time, are of utmost importance, much more than the ease with which a visualization is read.
In that case, Jessica Hullman proposes a trade-off between cognitive efficiency and “obstructions”. The idea is that such obstructions, or visual difficulties, can trigger active learning processes. In other words, if a user trying to read a chart doesn’t understand it effortlessly, but is somehow willing to get to the bottom of it, she will apply all her active brainpower to it. This surge of effort will lead her not only to interpret it better but also to remember it better. To sum up, these obstructions can have positive effects, and when they do, they are called desirable difficulties.
Desirable difficulties are tricky, because if the “obstruction” is too large, if a small additional effort is not enough to understand the chart, then it will not work. So, this is definitely not about maximizing the difficulty to understand the visualizations.
In the recommendations part of the paper, the authors say:
Instead of minimizing the steps required to process visualization, induce constructive, self-directed, cognitive activity on the part of the user.
This doesn’t mean that anything goes. The paper does not argue for adding as many difficulties as possible, or for using every gratuitous effect in the book. Instead, it goes on to give actionable design suggestions to enhance reader stimulation and active information processing.
In my own practice, for instance with the Better Life Index, I can confirm the analysis of the Hullman paper: the novelty of the form and the aesthetic appeal of the representation drive users to overcome the difficulty posed by the unusual shape of the flower/glyph. Would bar charts have conveyed the data more efficiently and more accurately? Definitely! Would the user engagement have been comparable? Definitely not.
A critique by Stephen Few
Stephen Few, whose work I have praised on multiple occasions on this blog, has published a critique of this paper (pdf). Reading his article, then the paper again, I had the feeling that they weren’t talking about the same things. In certain contexts, difficulties are not desirable at all and must be eradicated. Yet, in other contexts, cognitive efficiency does not provide the optimal solution.
For instance, Stephen writes:
Long-term recall is rarely the purpose of information visualization.
Fair enough! So let’s agree that when it is not the case, we should not trouble ourselves with seeking to add obstructions to the display. For instance: business intelligence systems, dashboards (for monitoring), visual analytics (and more on this shortly). Spreadsheets, mostly. All usages of data that support decisions, and most usages in the corporate world. The Hullman paper only applies in the other cases anyway.
He also writes (emphasis mine):
Skilled data analysts learn to view data from many perspectives to prevent knee-jerk conclusions based on inappropriate heuristics.
Agreed! And by all means, let them analyse and let them view data from as many perspectives as they see fit, and don’t get in the way of their job.
This is taken from a demo by Palantir Government. Here analysts are tracking mortgage fraud. Each yellow dot on the top display is a transaction where a house has been sold for over 200% of its purchase value, and the ones which are connected are about the same house. We can immediately see 2 suspicious clusters where a property has been resold 4 times in these conditions. And if at the end of their work day the analysts don’t remember the address of the fraudulent transaction, it’s no big deal as long as they have identified a wrong practice.
Conversely, at the risk of repetition, the paper’s authors write of a trade-off between efficiency and obstructions, cognitive efficiency being generally positive. They say that obstructions become desirable difficulties only if they are constructive, that is, if they are able to trigger active information processing. They are not championing 3D pie charts or atrocious dashboards such as the one at the end of Stephen’s article. Jessica signals that novelty enhances active information processing. I don’t know how to characterize speed dials in dashboards, for instance, but novel would not be the word I’d use, and again they wouldn’t be favoured by the authors of the paper. So, I think it’s a bit unfair to associate the paper with the terrible, terrible visuals presented in Stephen’s article, the ones in the original paper being a little more defensible.
Consider this other chart (and let’s assume for the sake of discussion that its cognitive cost is low, even though it could be much lower, by showing fewer time series for instance). This was published in an OECD publication almost 2 years before the 2008 crisis. I would say this chart is easy to read (we see mortgage delinquency rates dropping in most countries) but difficult to interpret and to recall. Like other charts in the document, this one is an oracle of financial apocalypse, as the proportion of delinquent mortgages in the US, the only one without a downward trend, would have the consequences that we know. So if a different way of showing the same data could have made that more obvious at the cost of legibility, I think it would have been worth a shot.
Are we on common ground yet?
If not, let’s assume now that there exist visualizations where long-term recall is, indeed, the main purpose. Examples would include use in journalism, politics, advocacy, marketing… Jessica has been involved in the workshop series “Telling Stories with Data” at VisWeek. This suggests an interesting distinction.
- visualizations which are tools with which a user accesses or manipulates data.
- visualizations where an author, with a specific intent, tries to frame data in a certain way to an audience. In that case, the author wants to make sure the audience receives the message as intended, and remembers it.