There are virtues to an illegible chart

It all started with an extreme pie chart

A few weeks back, a chart on aid made the rounds of the internet:

All US ODA by recipient, 2004-2008, OECD data, taken from USAidWatchers.com

What this chart shows is that US aid is concentrated in a few countries. The article explains that this is a result of the 3D doctrine, which ties development to diplomacy and defense. This is why the US gives so much to strategic countries like Afghanistan, Iraq, or Sudan, but relatively little to India (highlighted in the chart), which has “a huge chunk of the world’s poor”.

When I saw that chart I was planning to create a chart or a data visualization on the same subject for my work. The original chart was being heavily criticized for its form, because half of it is not legible. Chart purists don’t like pie charts for that very reason – they are difficult to read, especially if you add more items. But I found the chart interesting. It states in a very striking way that more than a hundred countries in the world get next to nothing from the USA.

An apology for extreme charts

There are virtues to an illegible chart. In fact, I don’t believe that a chart should give equal prominence to each and every one of its data points. In most cases, a chart is there to support a story, so all it needs to do is carry a message. Tufte popularized the notion of data-ink ratio, which states that a chart designer should use the largest share of ink to represent data, not everything else. I feel this is taken too literally by many.

There is a tradition of extreme charts which purposely break presentation rules because of the very nature of the subject they are plotting. A famous example is Al Gore on his lift – if CO2 emissions hadn’t increased so much, he wouldn’t need that lift to show his chart.

 

Al Gore on his lift, the most memorable image of An Inconvenient Truth

Another one – from the NY Times, one of the charts that Matthew Ericson showed in his Infovis 2007 keynote speech:

In perspective: America's conflicts. NY Times

Click to see the full image - it is big. I really like this chart.

Again, if the number of US soldiers killed per month had not been so high in WWII, the second group of bars wouldn’t run over the text above and skyrocket to the top of the page. The logical thing to do would have been to scale the chart so that the maximum values fit in a well-delimited space, and maybe use a logarithmic scale so that the values for other wars remain legible. That’s how we would have done it if we had to plot that kind of series in an OECD book. The fact that the NYT designers chose, on the contrary, to let the data rise all the way to the top of the page expresses in a very powerful way the extreme nature of the WWII casualties.

“A conventional chart couldn’t hold all that horror”, the chart seems to say. Likewise, if CO2 emissions had grown more steadily over the past couple of centuries, Al Gore wouldn’t have needed a lift. By the same token, if aid values to about 100 countries were more than negligible, they could be seen on that chart. So granted, there could be more academic ways to show that, like a giant bar chart with values too small to see for all but a handful of states. But all in all I think the original pie chart does a good job of communicating that in a nutshell, ad absurdum if you will.

My take on the chart

I wanted to work on a specific subset of aid data, that which goes to fragile states, which are, simply put, the 43 countries in the greatest need of aid. Now official aid from developed countries, like US aid, is very concentrated, meaning that only 10 of these countries got more than $1b in 2008. Only 10 countries got more than $100 per capita in that year.

Another interesting aspect of the data is that for many of these countries, aid comes mostly from one or two donors, so they are vulnerable to a policy change in one of those donor countries. That’s what I wanted to show in the representation.
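To make the idea of donor concentration concrete, here is a minimal sketch in Python of how the share of aid coming from a country's top one or two donors could be computed. This is not how the visualization itself was built. The function name, the donor labels, and the amounts are all made up for illustration; they are not taken from the OECD data.

```python
def top_donor_share(aid_by_donor, n=2):
    """Share of total aid that comes from the n largest donors (0.0 to 1.0)."""
    total = sum(aid_by_donor.values())
    if total == 0:
        return 0.0
    largest = sorted(aid_by_donor.values(), reverse=True)[:n]
    return sum(largest) / total


# Hypothetical recipient: amounts are placeholders, not real figures.
aid = {"Donor A": 600.0, "Donor B": 250.0, "Donor C": 100.0, "Donor D": 50.0}

share = top_donor_share(aid)
print(f"Top two donors provide {share:.0%} of this country's aid")
```

A recipient where this share is close to 100% is the kind of country that would be hit hardest by a policy change in a single donor.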

 

 

Playing with Tableau contest datasets

I’ve played a bit with the other 3 datasets of the Tableau Public contest. Having manipulated them myself, it will be easier to learn something from what others have done once I get to see it. The one I’ve spent most time with is the US budget spending one. Here’s the sheet I came up with:

(if the viz doesn’t show in the blog, here’s the direct link)

A few explanations:

Unit: % of GDP

The dataset covers almost 40 years, and includes a notion of inflation. But even with that it’s too difficult to compare spending over time. Instead of trying to convert everything to 2009 constant dollars, it’s easier (and it makes more sense) to compare everything as a percentage of GDP.
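For readers who want to see the arithmetic, here is a small sketch in Python of the conversion to % of GDP. The sheet itself was made in Tableau, so this is only an illustration; the function name and the dollar amounts below are placeholders I made up, not values from the dataset.

```python
def as_percent_of_gdp(spending, gdp):
    """Convert nominal spending to % of GDP, year by year.

    Both inputs map a year to an amount in current dollars of that year,
    so no inflation adjustment is needed: it cancels out in the ratio.
    """
    return {
        year: 100.0 * amount / gdp[year]
        for year, amount in spending.items()
        if year in gdp
    }


# Placeholder amounts (billions of current dollars), for illustration only.
spending = {1970: 50.0, 2009: 700.0}
gdp = {1970: 1000.0, 2009: 14000.0}

for year, pct in sorted(as_percent_of_gdp(spending, gdp).items()):
    print(year, f"{pct:.1f}% of GDP")
```

In this made-up case nominal spending is multiplied by 14 between the two years, yet its share of GDP does not move, which is exactly why the ratio is easier to compare over time than dollar amounts.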

Filter: by function

The original dataset lists over 30 departments. I don’t think they are immediately comparable as is, some being much bigger than others. Besides, it’s just too complicated to ask people to choose between 30 items to make comparisons. So, instead, I grouped several departments by function, as defined by the COFOG (Classification of the Functions of Government, a UN classification). To be honest I wasn’t extra careful when I assigned some departments to a function. For instance, Veterans Affairs could have been assigned to Defense or to Social Protection (I chose the latter). But the assignments are fair. The added bonus is that using functions enables us to make international comparisons.
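The grouping described above boils down to a lookup table and a sum. Here is a rough sketch in Python, under a few assumptions: only the Veterans Affairs assignment comes from this post, the other department-to-function pairs are my own illustrative guesses, and the spending values are placeholders rather than figures from the dataset.

```python
from collections import defaultdict

# Department -> COFOG function. Only "Veterans Affairs" reflects the choice
# described in the post; the other entries are illustrative assumptions.
DEPARTMENT_TO_FUNCTION = {
    "Defense": "Defence",
    "Education": "Education",
    "Veterans Affairs": "Social protection",
    "Social Security Administration": "Social protection",
}


def spending_by_function(spending_by_department, mapping):
    """Sum department-level spending into function-level totals."""
    totals = defaultdict(float)
    for department, amount in spending_by_department.items():
        # Departments missing from the mapping are kept visible, not dropped.
        totals[mapping.get(department, "Unassigned")] += amount
    return dict(totals)


# Placeholder values in % of GDP, for illustration only.
sample = {
    "Defense": 4.0,
    "Education": 0.5,
    "Veterans Affairs": 0.6,
    "Social Security Administration": 4.5,
}

print(spending_by_function(sample, DEPARTMENT_TO_FUNCTION))
```

Moving Veterans Affairs to Defence is then a one-line change in the table, which makes it easy to check how much a debatable assignment affects the totals.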

Comparing with OECD values

Not too long ago I made a chart comparing OECD countries’ budget expenditures. So what I didn’t like about this dataset is that it didn’t give a way to determine whether US spending in a given area was high or low. From the dataset proper, one can tell, for instance, that social protection expenses were never as high as in 2009. But are they really “high”? Or that defense expenditure was at an all-time low in 1999. But was it really low?

Comparing with other values helps answer those questions. To continue with these 2 examples, social protection expenditure, in 2009, was 7.2% of GDP: a much higher share than in 1965 (3.9%) but still very low compared to OECD countries, where the average is 15.2%. Conversely, defense, in 1999, only represented 3.1% of GDP. It was as high as 9% during the Vietnam War, and it’s almost 5% today. Meanwhile, the OECD average is 1.4%.

Again, that comparison is not very scientific, because the numbers used for those OECD averages include other levels of government (states, cities…) which are not included here. But still, they help put the dataset in context.
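One simple way to express "high or low compared to the OECD" is the ratio of the US value to the OECD average. Here is a short sketch in Python using the four figures quoted above; the function name is mine, and the same caveat about levels of government applies to the result.

```python
def relative_to_benchmark(value, benchmark):
    """Ratio of a value to its benchmark: below 1 is lower, above 1 is higher."""
    return value / benchmark


# (US share of GDP, OECD average share of GDP), as quoted in the text above.
comparisons = {
    "Social protection, 2009": (7.2, 15.2),
    "Defense, 1999": (3.1, 1.4),
}

for label, (us_value, oecd_average) in comparisons.items():
    ratio = relative_to_benchmark(us_value, oecd_average)
    print(f"{label}: {ratio:.2f} times the OECD average")
```

So the all-time high in social protection is still less than half the OECD average, while the all-time low in defense is more than twice it.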