FlowingData’s chart contest
This week, FlowingData organized a contest: a chart was submitted, and contestants were asked to improve it.
A lot of my job revolves around reviewing and correcting graphs, so I was more than happy to compete.
Here is the original graph, hosted & designed by Swivel:
The rules of the contest stated that the new graph should use the same data. But instead of re-using the dataset hosted on Swivel, I checked the source to answer some questions I had.
Here goes:
| Period | Total | Europe | Asia | Americas | Africa | Oceania* |
| 1820-30 | 151,824 | 106,487 | 36 | 11,951 | 17 | 33,333 |
| 1831-40 | 599,125 | 495,681 | 53 | 33,424 | 54 | 69,911 |
| 1841-50 | 1,713,251 | 1,597,442 | 141 | 62,469 | 55 | 53,144 |
| 1851-60 | 2,598,214 | 2,452,577 | 41,538 | 74,720 | 210 | 29,169 |
| 1861-70 | 2,314,824 | 2,065,141 | 64,759 | 166,607 | 312 | 18,005 |
| 1871-80 | 2,812,191 | 2,271,925 | 124,160 | 404,044 | 358 | 11,704 |
| 1881-90 | 5,246,613 | 4,735,484 | 69,942 | 426,967 | 857 | 13,363 |
| 1891-00 | 3,687,564 | 3,555,352 | 74,862 | 38,972 | 350 | 18,028 |
| 1901-10 | 8,795,386 | 8,056,040 | 323,543 | 361,888 | 7,368 | 46,547 |
| 1911-20 | 5,735,811 | 4,321,887 | 247,236 | 1,143,671 | 8,443 | 14,574 |
| 1921-30 | 4,107,209 | 2,463,194 | 112,059 | 1,516,716 | 6,286 | 8,954 |
| 1931-40 | 528,431 | 347,566 | 16,595 | 160,037 | 1,750 | 2,483 |
| 1941-50 | 1,035,039 | 621,147 | 37,028 | 354,804 | 7,367 | 14,693 |
| 1951-60 | 2,515,479 | 1,325,727 | 153,249 | 996,944 | 14,092 | 25,467 |
| 1961-70 | 3,321,677 | 1,123,492 | 427,642 | 1,716,374 | 28,954 | 25,215 |
| 1971-80 | 4,493,314 | 800,368 | 1,588,178 | 1,982,735 | 80,779 | 41,254 |
| 1981-90 | 7,338,062 | 761,550 | 2,738,157 | 3,615,225 | 176,893 | 46,237 |
| 1991-00 | 9,095,417 | 1,359,737 | 2,795,672 | 4,486,806 | 354,939 | 98,263 |
| 2001-06 | 7,009,322 | 1,073,726 | 2,265,696 | 3,037,122 | 446,792 | 185,986 |
| 187 Years | 72,066,614 | 39,346,127 | 10,525,281 | 20,082,410 | 1,075,980 | 1,036,816 |
* includes others unidentified by nationality
FAIR (the Federation for American Immigration Reform), which published this data on its website, also made a chart from it:

So let’s take a look at the data.
At first glance, it is very aggregated: data are not available per country or per year, but per continent and per decade. However, the last “decade” is only 6 years long. Also, Oceania includes all the unidentified immigrants. Immigrants from Africa and “Oceania” are a tiny fraction of the total flow so it would be difficult to draw a conclusion from their data.
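To make the short last period comparable with the full decades, one option is to divide each period total by the number of years it covers. Here is a minimal sketch in Python, using the totals from the table above. The period lengths are an assumption on my part: I read the ranges as inclusive, so 1820-30 covers 11 years and 2001-06 covers 6, which adds up to the 187 years in the last row of the table. The names `PERIOD_TOTALS` and `annualize` are only for illustration.

```python
# (period label, years covered, total immigrants), copied from the table above.
# The year counts are an assumption: each range is read as inclusive.
PERIOD_TOTALS = [
    ("1820-30", 11, 151_824),
    ("1831-40", 10, 599_125),
    ("1841-50", 10, 1_713_251),
    ("1851-60", 10, 2_598_214),
    ("1861-70", 10, 2_314_824),
    ("1871-80", 10, 2_812_191),
    ("1881-90", 10, 5_246_613),
    ("1891-00", 10, 3_687_564),
    ("1901-10", 10, 8_795_386),
    ("1911-20", 10, 5_735_811),
    ("1921-30", 10, 4_107_209),
    ("1931-40", 10, 528_431),
    ("1941-50", 10, 1_035_039),
    ("1951-60", 10, 2_515_479),
    ("1961-70", 10, 3_321_677),
    ("1971-80", 10, 4_493_314),
    ("1981-90", 10, 7_338_062),
    ("1991-00", 10, 9_095_417),
    ("2001-06", 6, 7_009_322),
]


def annualize(periods):
    """Return {label: average immigrants per year} for each period."""
    return {label: total / years for label, years, total in periods}


annual = annualize(PERIOD_TOTALS)
for label, per_year in annual.items():
    print(f"{label}: {per_year:,.0f} per year")
```

With this correction, 2001-06 no longer looks like a dip after 1991-00: its raw total is lower, but its yearly average is higher.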
So if I want to tell a story about this dataset, I would choose the following.
The total flow of immigrants to the USA has gone through major changes.
Looking at the composition of this flow: over 90% of the immigrants were Europeans at some point, but now that ratio is down to around 15%.
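The share of Europeans can be checked directly from the table by dividing the Europe column by the Total column for each period. A small Python sketch, with the figures copied from the table above (the names `ROWS` and `europe_share` are hypothetical, chosen for this example):

```python
# (period label, total immigrants, immigrants from Europe), from the table above.
ROWS = [
    ("1820-30", 151_824, 106_487),
    ("1831-40", 599_125, 495_681),
    ("1841-50", 1_713_251, 1_597_442),
    ("1851-60", 2_598_214, 2_452_577),
    ("1861-70", 2_314_824, 2_065_141),
    ("1871-80", 2_812_191, 2_271_925),
    ("1881-90", 5_246_613, 4_735_484),
    ("1891-00", 3_687_564, 3_555_352),
    ("1901-10", 8_795_386, 8_056_040),
    ("1911-20", 5_735_811, 4_321_887),
    ("1921-30", 4_107_209, 2_463_194),
    ("1931-40", 528_431, 347_566),
    ("1941-50", 1_035_039, 621_147),
    ("1951-60", 2_515_479, 1_325_727),
    ("1961-70", 3_321_677, 1_123_492),
    ("1971-80", 4_493_314, 800_368),
    ("1981-90", 7_338_062, 761_550),
    ("1991-00", 9_095_417, 1_359_737),
    ("2001-06", 7_009_322, 1_073_726),
]


def europe_share(rows):
    """Return a list of (label, Europe / total) pairs, in table order."""
    return [(label, europe / total) for label, total, europe in rows]


shares = europe_share(ROWS)
# Period with the highest European share, and the most recent period.
peak_label, peak_share = max(shares, key=lambda pair: pair[1])
last_label, last_share = shares[-1]
print(f"peak:   {peak_label} {peak_share:.1%}")
print(f"latest: {last_label} {last_share:.1%}")
```

Because this is a ratio within each period, it is not affected by the last period being shorter than the others.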
Now, for a critique of these two graphs.
Swivel’s:
- It’s not very telling to keep presenting those numbers aggregated by decade.
- Especially if the last decade is not corrected. All curves seem to dip, although the underlying variables are actually growing.
- You can clearly see the point where immigrants from the Americas overtake Europeans (and later, when Asians do the same). But again, those absolute figures are not very interesting. You cannot see the share of each continent in the total.
- The Africa and Oceania curves clutter the graph and add little information.
- The fact that Oceania includes other countries is not disclosed (not that it would change the graph tremendously).
FAIR’s:
- To make this graph, they annualised the data, which is a more sensible option.
- The year labels are difficult to read.
- The last column (2001-2006) looks exactly like the others, which cover 10 years.
- Again, Oceania and Africa don’t bring much to the graph.
- It’s very difficult to see the evolution of one given continent, except Europe.
Doing a matrix chart like this (several charts one on top of the other, using the SAME SCALE, which can be added vertically and visually) is the textbook way of showing variables so that one can see both their evolution over time and their proportion of the total.
This kind of chart is not natively supported in Excel, so I’ve done it with Processing.
(I wrote a program to make them in Excel, but will talk about that in a later post.)
It’s an interesting graph: it shows European immigration peaking, then the Americas taking off, followed by Asia. In the early 20th century, the Mexican Revolution caused much emigration to the US; this is the ripple in the graph.
But then, I thought it was too complex. Frankly, by glancing at it, you don’t get anything. You might learn something by examining it closely.
So I made this one, which I am going to submit.
And here I have my 2 stories in a much lighter graph.
The blue rectangles are the total immigrants. Various laws and events have shaped that curve. I first wanted to annotate it, but I decided against it. I just kept the Immigration Act, which was in force between 1924 and 1965 and which largely explains the drastic drop in immigration in that time.
Without any other variable to compete with it, you can clearly follow its story.
Then, I’ve added the share of Europeans in all the immigrants. That’s another clear story: in the 19th century, they made up the bulk of the immigrants, but then their share dropped sharply to around 15%. My guess, though, is that the shape of the first leg of this curve (from about 70% to over 90%) is due to the fact that many unidentified immigrants were really Europeans.
For the title of the left axis, I’ve chosen naturalization over number of immigrants or another denomination because most of the “immigrants” of the last few decades are really people already residing in the USA who get naturalized.
But that’s another contradiction in the dataset. In 1868, when the 14th Amendment to the Constitution came into force, about 4 million former slaves became American citizens. They are not shown in the data. In 1924, the Native Americans who were not yet citizens were also granted citizenship. They too are not included in the dataset. However, since 1965, most “immigrants” are change-of-status migrants who were already in the USA. But then, we are meant to play with this dataset, so that’s the best I could come up with.
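One rough way to probe that guess about unidentified immigrants is to bracket the European share for the earliest periods: the lower bound counts the Europe column only, and the upper bound adds the whole “Oceania*” column to it. This is a sketch under a strong assumption, namely that every immigrant in “Oceania*” was in fact European, which the data cannot confirm. The names `EARLY_ROWS` and `share_bounds` are made up for this example.

```python
# (period label, total, Europe, Oceania*), copied from the first three table rows.
EARLY_ROWS = [
    ("1820-30", 151_824, 106_487, 33_333),
    ("1831-40", 599_125, 495_681, 69_911),
    ("1841-50", 1_713_251, 1_597_442, 53_144),
]


def share_bounds(rows):
    """Return {label: (lower, upper)} bounds on the European share.

    lower: the Europe column only.
    upper: ASSUMES every 'Oceania*' immigrant was European (an upper limit,
           not a measured value).
    """
    return {
        label: (europe / total, (europe + oceania) / total)
        for label, total, europe, oceania in rows
    }


bounds = share_bounds(EARLY_ROWS)
for label, (low, high) in bounds.items():
    print(f"{label}: between {low:.1%} and {high:.1%}")
```

If the upper bounds turn out much flatter across these periods than the lower bounds, that is consistent with the guess, although it does not prove it.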
Lastly, a few words about the design. I took some of the colours from a chart I really liked, by Viveka Weiley. In her chart she uses the MyriadPro font (guess she’s a Mac, but I’m a PC). I am using Frutiger which is quite similar.
Comments
Comment from Jon Peltier
Time December 11, 2008 at 5:15 pm
Jerome -
This description of the thought process is very informative. I try to follow this as well in my blog, but I’m usually not this thoughtful.
Comment from Syed
Time September 20, 2011 at 12:07 am
Hi Jerome,
This is very interesting data, and studying it would be very beneficial for the immigration authorities and for students, since it would give them a data visualization of the number of migrants and an opportunity to study immigration behavior further.
Pingback from Winner of Tufte Books and Many Other Good Entries | FlowingData
Time December 11, 2008 at 8:00 am
[...] Jerome compared counts against Europeans as percentage of all immigrants: [...]