Getting to “Hello world” with d3

Back when I started learning programming, it was always fairly simple to achieve the canonical first accomplishment, that is, to get the system to announce that you are ready to do more by displaying “hello world” on the screen.

In most systems then, there was a command prompt somewhere that would do just that when you typed, say:

PRINT "hello world"

Things have changed a lot since the early 80s. In some fields like fashion, I would argue it’s a good thing, but we’re definitely not going in the way of less complexity.

Now if you’re interested in web-oriented visualization and want to do it with d3.js, it’s still fairly simple, but it is built upon a number of technologies that you’re supposed to know a little. Front-end developers live and breathe the web and have been exposed to all things javascript, HTML, CSS, you name it, in enormous doses. Many developers probably have, at some point, tried to interface with the web and know enough of that to get started. So for this crowd, the amount of things you need to know to crack d3 code seems negligible, because they know all that and they are very familiar with it, just as well as people knew the first names of Friends characters by the end of the tenth season.

But what about those who didn’t? And the people who don’t see themselves as developers? Do they have to reimmerse themselves in 10-odd years of web development history to get started? It turns out that this sum of knowledge, while not insurmountable, is certainly not trivial.

So without further ado, let’s get started.

We’re cooking an omelette

And when we do, we need a few things: a pan, a recipe, eggs and stuff, a stove and then plates, knives and forks, etc.

The pan: a text editor

The first thing is really the pan. If you don’t have one when cooking eggs, you borrow one or go buy one. In our analogy the pan is the text editor. This is the tool with which you are going to make the files that will constitute your visualization.

There was a time when it was ok to use notepad (textedit if you are of the apple persuasion). It’s still possible, but you are not making your life easier. What I recommend instead is that you get hold of a copy of Sublime Text 2 (http://www.sublimetext.com/2). There are Windows, Mac and Linux versions. For Windows users, there is a portable version, so you don’t need administrator access to install it. There is a free, unlimited evaluation version, but unless you can’t spend $69, I strongly recommend that you buy it.

Sublime Text 2 has a nearly infinite amount of niceties built in. And unlike some other powerful text editors, where the best features are only understandable by the tech masters, what’s really nice about Sublime Text 2 is that it saves you time even if you are an absolute beginner. One such nicety is that it detects what language you are working with, colors the words as you type them depending on the category they fall in, suggests the word you are trying to type when possible, and formats and indents your code automatically, all in a very unobtrusive and pleasant way. This really helps you troubleshoot problems like strings not closed properly or stray closing brackets, which typically consume a lot of time.

Let’s type a fairly common d3 statement to see how Sublime Text 2 can help. First, it recognizes the var keyword as such and writes it in italics and cyan. Second, when I type my opening parenthesis, it adds a closing one, and as long as my cursor touches either, it underlines them both.

Let’s carry on. The function keyword is highlighted in italics cyan too – useful. The opening/closing thing works for curly braces too.

The return statement is highlighted in red. With the cursor on the closing parenthesis, we are starting to get a feel that the underlining function is a useful safety net.

New line. Joy! The indentation is aligned with the line above.

We now have four consecutive opening or closing curly braces and parentheses. Typically, this is where errors sneak in, and where Sublime Text 2 really shines.

And we now have 5 consecutive closing curly braces and parentheses. This is fairly common in d3 code. Is the order correct? Thank you Sublime Text 2!

We finish up writing the statement.

When moving the cursor to the left side, where the line numbers are, we notice down-pointing arrows. We know our code is correct, and we don’t want to see it again, so…

we just click on the top one to collapse this section. If we need to edit it again we can expand it.

Finally, we add a comment above. Notice the syntax highlighting: comments are colored with an unobtrusive dark grey.

The recipe: a basic file structure

In d3, you can’t really type a “print” command from a prompt. You need to write some files, which are loaded by a browser (that’s your “plate” in the metaphor, but let’s not get ahead of ourselves).

You are going to need up to 5 types of files.

First, an html file. This will be the file that your browser will read, either locally, or uploaded on a website. We’ll get to cover this in detail in a minute.

Second, believe it or not, you are going to need the d3 library, which is also a file. You may link to the version on the d3js.org site, and so not worry about having the actual file handy. That has advantages (the one we just mentioned, plus you’re pretty sure to always have the latest version on hand) and two problems. First, you always need a live internet connection, so there’s no working in the park outside of free wifi range, for example. Second, it will probably be slower than having the file locally or on your own web space. And if having your own web server seems kind of scary, I’ll show you in a short while that it’s not.

The next three kinds of files are optional, but hey.

The third file is a javascript .js file which would be where you put your code. Some people would rather put all their code in the html file, which is an option, especially for short programs. Personally, I prefer having a separate file. So to make d3 work, you need some script, but it doesn’t have to be in a separate file.

The fourth file is a style sheet, or css file. This can be used to define some formatting options, for instance to make all your circles blue by default, or only those circles that meet some pre-defined criterion. Like the javascript file, any style information can be contained within the html file, but unlike the script, it is completely optional. I also like to keep it separate from the html.

Finally, you may want a data file, you know, with data (csv, txt, json, xml…). If you have lots of data to visualize, it’s easier to keep it in separate files than in variables within the script. But it doesn’t have to be that way. And you could also use d3 without data.
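To make the difference concrete, here is a minimal sketch of the “data in a variable within the script” option, next to what the same records would look like as the text of a separate json file. The fruit names and values are made up for illustration, and the names data, fileContents and loaded are just placeholders for this sketch.

```javascript
// Hypothetical data set, kept directly in the script as a variable.
var data = [
  {name: "apples", value: 12},
  {name: "pears", value: 7},
  {name: "plums", value: 3}
];

// The same records as they would appear in a separate .json file:
// JSON.stringify turns the variable into text, JSON.parse reads it back.
var fileContents = JSON.stringify(data);
var loaded = JSON.parse(fileContents);

console.log(fileContents);   // the text you would save in the data file
console.log(loaded.length);  // 3
console.log(loaded[1].name); // "pears"
```

With three records the variable is clearly simpler. With thousands of them, the separate file starts to look a lot more attractive.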

The ingredients: contents of the files

The HTML file

So let’s see how this articulates by looking at a typical d3 html file. I am using templates which I try to change as little as possible from project to project.

<!DOCTYPE html>
<html>
 <head>
   <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
   <title>My project</title>
   <script type="text/javascript" src="../d3.v2.min.js"></script>
   <link href="style.css" rel="stylesheet">
 </head>
 <body>
   <div id="chart">
   </div>
   <script type="text/javascript" src="script.js"></script>
 </body>
</html>

Well. That is certainly longer than the BASIC one-liner (and we haven’t even printed “hello world” yet).

Let’s take this piece by piece.

The first line is a doctype declaration. It tells your browser that what follows should be interpreted as standard, HTML5-compliant HTML (standards mode). If you omit the doctype declaration, your browser will read the html in “quirks mode”, i.e. by replicating the non-standard behavior of Netscape 4 or IE5. You can still try to run d3 under quirks mode, but don’t be surprised if your HTML doesn’t behave as expected.

The doctype declaration doesn’t have to be more complicated than <!DOCTYPE html>.
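If you are ever unsure which mode a page ended up in, one way to check is the standard document.compatMode property, which reads "CSS1Compat" in standards mode and "BackCompat" in quirks mode. The sketch below wraps that in a small helper; describeMode is a name made up for this example, not something d3 or the browser provides.

```javascript
// Translate the value of document.compatMode into plain words.
// "CSS1Compat" is reported in standards mode, "BackCompat" in quirks mode.
function describeMode(compatMode) {
  return compatMode === "CSS1Compat" ? "standards mode" : "quirks mode";
}

// Only touch `document` when running inside a browser.
if (typeof document !== "undefined") {
  console.log(describeMode(document.compatMode));
}
```

With the template above, which starts with <!DOCTYPE html>, this should report standards mode.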

The second line opens the html document proper. Technically, it’s ok to omit <html>, <head> and <body> tags in HTML5. The document will still be considered valid by tools like the W3C validator. But it seems that some browsers, in some complex cases, don’t like that so much, and I as a person find it more convenient to find those tags when reading code.

The next line opens the header section of the document. Again, it’s not absolutely necessary, but I consider it helpful to explicitly differentiate the header from the rest of the document.

The next line, which goes

<meta http-equiv="Content-Type" content="text/html;charset=utf-8">

is not absolutely required either. It specifies the encoding of the page, that is, what kind of characters will be seen in the page. Since I use non-ascii characters often, being French and all, I make sure to use it all the time. After all, this is a template, not something I type from beginning to end each time.

Next, we specify a title. This is what will appear in the title area of your browser, or, more likely, as the name of your tab.

In the next line, we load the d3 library. This is my preferred syntax. This is how my files are set up:

I have a directory where all my d3 projects are, and in this directory I also keep (and maintain reasonably up to date) a version of the d3 library, a file called d3.v2.min.js (min stands for minified, which means that it’s not meant to be read by people, but it’s faster to load). All my projects proper are in folders within that directory, so my html files are one level down from where the d3.v2.min.js file is kept. This is why the src attribute reads "../d3.v2.min.js": the ../ part means “look one level up”. If the d3.v2.min.js file were in the same directory as my html, I would write src="d3.v2.min.js"; if I kept it within a specific directory like "d3", I could write src="../d3/d3.v2.min.js"; and finally, there is always the option of getting it from the website, src="http://d3js.org/d3.v2.min.js".
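The way these paths resolve can be checked with a few lines of JavaScript. This is only a sketch: the page address http://localhost/d3projects/hello/index.html is a made-up location, resolveSrc is a helper name invented here, and it relies on the standard URL constructor found in current browsers and Node.

```javascript
// Hypothetical address of the html page, one folder below the projects directory.
var page = "http://localhost/d3projects/hello/index.html";

// Resolve a src attribute the way a browser would, relative to the page.
function resolveSrc(src) {
  return new URL(src, page).href;
}

console.log(resolveSrc("../d3.v2.min.js"));
// http://localhost/d3projects/d3.v2.min.js
console.log(resolveSrc("d3.v2.min.js"));
// http://localhost/d3projects/hello/d3.v2.min.js
console.log(resolveSrc("../d3/d3.v2.min.js"));
// http://localhost/d3projects/d3/d3.v2.min.js
console.log(resolveSrc("http://d3js.org/d3.v2.min.js"));
// http://d3js.org/d3.v2.min.js
```

The first three are relative to wherever the page lives. The last one is absolute and ignores the page location entirely.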

I don’t have to load the d3 library at this point. I could have done it at the end of the page. The only requirement is that it is loaded before the script that uses it. But honestly, the file is so small that it doesn’t make much of a difference (9ms on my machine).

Next, I link to a style sheet. With this syntax, I am assuming that my style is specified in a file called style.css which will be in the same directory as this html page. And if there is no such file, it’s not a problem. It doesn’t prevent the page from loading.

Instead of using this syntax, I could have written:


<style>
/* my style definitions */
</style>

in the html file. And frankly, it is sometimes more convenient. But again, for the general case, it’s just as well to leave it like this.

Note that style information should always be in the header part of the file.

And that concludes the header, as noted by the closing tag </head>. Even if we use the <head> tag to mark the beginning of the header section, we may omit the closing tag </head>, and still get away with a valid (and slightly shorter) document, but I keep it for clarity’s sake.

The next part starts with <body>, and is where the content proper, which will get displayed on the screen, is described. <body> and </body>, just like <html> and <head>, are not mandatory, but do help, somewhat, to make the document easier to read.

So what do we find in the body section? Here, I’ve kept it very simple but also close to the conventions I use.

There is one <div> element, which is the basic building block of HTML, and with an id attribute – a document-wide, unique identifier – called “chart”.

Then, there is the <script> element, which is calling the javascript code we are going to use to create our visualization. It’s at the very bottom of the page, actually just before the closing tags (which, again, could be omitted, but let’s not).

As with the style element, it is possible to leave the script inside the html document. Instead of using a src attribute (which, incidentally, with this syntax assumes that the script is within the same directory as the html document), we can write:

<script>
// all our javascript instructions
</script>

And that’s it for the html document! A final word about the contents of the <body> element. In most of my projects, there is an interface such as buttons or controls which is also done in HTML. In that case, the contents of the <body> element get more complex. I would add a button to tweet the page, copyright notices, and other stuff. But I almost always have a <div> element with an identifier named “chart”.

Ok, so now that you’re finished writing your html file, save it under any name with the “.html” extension (or .htm, but why no love for the l? why?)

The javascript file

In this section I will walk you through a very, very basic file, which includes things I do for every project.

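// Dimensions of the visualization, in pixels.
var w = 960, h = 500,
    // Top-level svg container, appended to the <div id="chart"> element.
    svg = d3.select("#chart")
      .append("svg")
      .attr("width", w)
      .attr("height", h);

// The text itself; y moves it down so that it is visible.
var text = svg
  .append("text")
  .text("hello world")
  .attr("y", 50);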

I like to define variables that describe the width and height of the visualization that I am creating. By putting these in variables, at the beginning of the file, I can easily modify them in case I need to. 960 and 500 work well for visualizations that should appear on their own page, by the way. No scrolling should be necessary.

The next statement uses the d3.select construct. Here, it indicates that we are going to build something on top of the element that meets the criterion described between the parentheses. The syntax used for that is that of css selectors, but long story short, #chart refers to whatever has an “id” attribute of “chart”. This is our lone <div> element in the html file. Then, we add an svg element, which is what will hold the visualization proper in svg form, and give it a width of w and a height of h.

I always use that syntax, an “svg” variable that holds the top-level svg container, which resides in a <div> element which has an id of “chart”.

The final part of the file writes, finally, hello world proper. Note that I specify a y attribute (vertical position). Otherwise the text would have its lower-left corner in the top-left corner of the svg element, sit entirely above the visible area, and be effectively invisible.
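It can help to picture what the script leaves behind in the page. The sketch below uses no d3 at all: it is a plain-JavaScript stand-in that builds, as a string, roughly the markup you should find inside the <div id="chart"> element once the script has run. helloMarkup is a name made up for this illustration.

```javascript
// Plain-JavaScript stand-in (no d3): build, as a string, the markup that
// the script is expected to leave inside the #chart div.
function helloMarkup(w, h, y) {
  return '<svg width="' + w + '" height="' + h + '">' +
         '<text y="' + y + '">hello world</text>' +
         '</svg>';
}

console.log(helloMarkup(960, 500, 50));
// <svg width="960" height="500"><text y="50">hello world</text></svg>
```

Each .append call in the script adds one element, and each .attr call adds one attribute to it.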

Now, the HTML file we just created expects this file to be called “script.js”, so let’s save it under this name.

In this most simple example, we will not need a css file nor a data file. But, for the sake of discussion, let’s create a css file nonetheless.


text {font-size: 36px;}

and let’s save this under style.css (the name that, again, our HTML file expects). What this does is change the size of the font from whatever the default was to a more massive 36 pixels.

The stove: a web server

As far as writing hello world, we’re done. You can load the html file you created in a browser, and you should see the encouraging inscription. Congratulations!

Many visualizations can be seen in a browser directly, just by opening a local file. However, this won’t be the case for some, for instance, those that require external data. In that case, you need a web server. If you have web hosting, you may upload the files to your (remote) server, via FTP for instance, and see your visualization by typing the address of your site in the browser url bar. That said, it is a good idea to have a local web server, that is, one that runs on your computer, so you can view your files as if they were served by a web server, but with the added bonus that you can edit them and see the modifications directly without having to upload them each time you change them.

On Macs, you’re pretty much all set. All you have to do is enable web-sharing in your system preferences. Then, http://localhost/~YOURNAME will point to /Users/YOURNAME/Sites where YOURNAME is your user name. Just put your files there and go at it.

For windows, there are a bunch of solutions. The “Professional” versions of windows include the IIS web server, so, there. But beyond that, there is a lot of web server software available. I personally use EasyPHP. EasyPHP comes with a web server (Apache), a MySQL database, a PHP preprocessor and other niceties. And, as an aside, it doesn’t require administrator rights, for you corporate users.

EasyPHP installation is a breeze. When it’s on, by default, http://localhost/ points to the www/ directory in the install directory of EasyPHP, so you may want to install it in a place that suits you. Alternatively, you can create aliases in the admin panel of EasyPHP (http://localhost/home/index.php), that is, give a name to any part of your hard drive. This is what I do: I put all my projects there and have a shortcut to that name in my browser, so whenever I want to see a project I use that shortcut and I can see the visualization as if it were on the web.

This is how you create aliases in EasyPHP.

The plate: a browser

We’ve talked browsers before, and chances are you have one (or several) on your computer.

Now I wish that by “browsers” we could just mean “the latest version of chrome”, but it turns out that there are slight differences in the way browsers handle d3 code, so you should really test your work in at least chrome and firefox. As of this writing, Chrome + Firefox (version 5 and up) represent just under 50% of the browser market share. If you add all browsers that are d3-capable (Safari, earlier versions of Firefox, Opera, IE9) you reach about 75% of the market. Sadly, IE8 and IE7, which account for slightly over 20% of the market, are not d3-compatible, though they can use the free Google Chrome Frame plug-in and do pretty much all that chrome does.

Knives and forks: the console

At the beginning of my dad’s engineering career, code came on a punch card. People then, allegedly, thought it through. You didn’t want to be the kid who didn’t follow your algorithm carefully enough to forecast an avoidable bug and waste a perfectly good card and oh-so precious computing time.

But now? No code is perfect by the time it hits the browser. You may want to launch incomplete code to get a feel for where you’re going. You may not be too sure of whether that should be a plus or a minus in that equation and just try either because it would be quicker to correct an unexpected outcome than to troubleshoot the formula on paper. You may want to iterate, to bring newer, more complex ideas to your visualization with each change to the code. Or just try out different aesthetic options.

Not too long ago, debugging javascript was really a pain. You’d have to fire those annoying alert boxes to find out the value of your variables, and dismiss them manually. Fortunately, that time is gone and now is the age of the Console.

There are console functionalities for Chrome, Firefox and Safari, and while the interface slightly varies, the idea is the same. The console allows you to do three main things:

– first, to see if your code executed without errors or warnings. Some of those messages can be generated by javascript, and some can be added by you if certain unfortunate conditions are met. You get the position of the error in your code, which helps you to understand what went wrong and fix it.

I have planted an error at the end of the code and it’s been picked up by the console, which explains what’s wrong and where. Notice the red cross in the lower-right corner which counts errors. If there were warnings, they would be indicated by a yellow triangle.

– second, to inspect elements, that is to find out all the information about the elements displayed on screen, even if (especially if) they have been generated at runtime. So you can see if those elements you really wanted to create have indeed been added, and if the right attributes have been passed.

What a relief! All those path elements that were supposed to be created in the code have been added as expected.

– third, it can be used to interact with the code after it’s run (or even during run-time, because you can pause the code with breakpoints using the console, but we won’t go into that). The most common use for that IMO is to check the value of variables, which can change during code execution, but it can also be used to enter one-liner statements, which can be quite complicated. This allows you to test and preview code hypotheses before you write them down in your script file, or to troubleshoot a problem that you would have difficulty seeing out of context.

Here, I am using the console to check the value of one variable, and to enter a statement that turns all the shapes orange.
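The first and third uses can be sketched in a few lines. This is only an illustration under my own assumptions: checkValue, points and valid are made-up names, and the “unfortunate condition” here is simply a value that is not a number. console.warn and console.log are the standard console calls.

```javascript
// Hypothetical check: write a warning to the console when a data point looks wrong.
function checkValue(d) {
  if (typeof d.value !== "number" || isNaN(d.value)) {
    console.warn("Unexpected value for " + d.name + ":", d.value);
    return false;
  }
  return true;
}

// Made-up data: the second point carries a string instead of a number.
var points = [{name: "a", value: 4}, {name: "b", value: "oops"}];
var valid = points.filter(checkValue);

// Typing `valid` at the console prompt would show its current value;
// console.log does the same from inside the script.
console.log(valid.length); // 1
```

The warning for point “b” shows up in the console, and the variable valid remains available there for further poking around once the script has run.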

Voilà! The last thing you need when you cook food is people to share it with, and the same goes for visualizations!

 

Hollywood + data II: the sequel

A couple of days ago I posted the contribution of keywords to the earnings of a given movie.

Well, it turns out that the cast of a movie is much easier to obtain than keywords and much less messy. So I had also scraped the 4 lead actors for each movie of the Beautiful Information awards in order to determine their contribution.

I’ve done it slightly differently than with the keywords. I’ve also taken the budget of the movie into account. So, in order to predict the earnings of a movie, you take the budget, multiply it by about 3, subtract roughly 8.7 million dollars and then add (or remove) the contribution of each star.
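Spelled out as code, the recipe looks like the sketch below. The two coefficients and the three actor contributions are copied from the data table further down in this post (all amounts in million dollars). The function name predictEarnings, the 100 million budget and the choice to count unlisted actors as 0 are assumptions made for this illustration.

```javascript
// Coefficients copied from the data table further down in this post
// (all amounts in million dollars).
var BASE = -8.72132;
var BUDGET_FACTOR = 3.028754;

// A few per-actor contributions, also copied from that table.
var contribution = {
  "Emma Stone": 0.519192,
  "Kirsten Dunst": 0.215264,
  "Daniel Radcliffe": 6.820566
};

// Predicted earnings = budget x factor + base + sum of the cast's contributions.
// Actors missing from the lookup count as 0 (an assumption of this sketch).
function predictEarnings(budget, cast) {
  var total = BUDGET_FACTOR * budget + BASE;
  cast.forEach(function (actor) {
    total += contribution[actor] || 0;
  });
  return total;
}

// Hypothetical movie: 100 million budget, two of the actors above.
console.log(predictEarnings(100, ["Emma Stone", "Kirsten Dunst"]).toFixed(2));
// "294.89"
```

So for this made-up 100 million dollar movie, the budget accounts for almost all of the prediction and the two stars together add about 0.73 million.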

Since the movie budget already (mostly 🙂 ) includes the pay of the lead actors, the way to read this contribution is how much extra these actors should be paid when they appear on film. For instance, each time Emma Stone does her thing, it would be fair to pay her $500k more.

the bang being a modest 500k pay raise.

Likewise, Kirsten Dunst here could be paid an extra 200k per movie, like Rebecca Hall or Rooney Mara. But I mostly felt like posting a picture of Kirsten to stand up against what google autosuggests as keywords when you look her up.

Kirsten, too, should be paid more.

Nicolas Cage, somewhat unsurprisingly, should refund $2.44m for each movie he stars in.

Actors in movies that have done exceedingly well, such as Avatar or the Harry Potter or Twilight sagas, movies which, it’s fair to say, owe their success to more than the actors, come out of this overvalued. And because of how this is calculated, actors who have costarred with them in less ambitious movies come out undervalued: for instance, Elizabeth Banks or Anton Yelchin, who both played with Sam Worthington, are “paying” for the fact that Man on a Ledge or Terminator Salvation haven’t been as successful as Avatar.

Twilight fans take note, though, that Harry Potter actors are considered more valuable.

Now – the data.

-8.72132 base value (million dollars)
3.028754 x your budget

contribution (million dollars), actor, occurrences
6.820566 Daniel Radcliffe 4
6.820566 Rupert Grint 4
6.589512 Emma Watson 5
6.469722 Michelle Rodriguez 4
5.919662 Zoe Saldana 3
5.869411 Sam Worthington 4
5.39418 Sigourney Weaver 4
4.66799 Robert Pattinson 6
4.486437 Kristen Stewart 7
4.131046 Taylor Lautner 4
3.511325 Michael Gambon 2
3.328213 Shia LaBeouf 8
3.091938 Josh Duhamel 4
2.771696 Tyrese Gibson 4
2.430676 Justin Bartha 3
2.177533 Anne Hathaway 8
2.145379 Ed Helms 3
1.985782 Helena Bonham Carter 3
1.922826 Ray Romano 1
1.922826 Denis Leary 1
1.922826 Eunice Cho 1
1.860584 Bradley Cooper 7
1.830633 Zach Galifianakis 6
1.809581 John Leguizamo 3
1.770726 Geoffrey Rush 3
1.722315 Christina Jastrzembska 1
1.681332 Rosie Huntington-Whiteley 1
1.654913 Ellen Page 4
1.521321 Xavier Samuel 1
1.495329 Antonio Banderas 3
1.487379 Brendan Gleeson 2
1.427076 Tim Allen 2
1.40105 Mia Wasikowska 2
1.391394 Heath Ledger 1
1.374587 Mike Myers 3
1.366225 Sandra Bullock 4
1.363958 Ned Beatty 2
1.339647 Johnny Depp 8
1.305249 Stellan Skarsgard 2
1.260785 Michael Caine 2
1.217668 Jae Head 2
1.169856 Tom Hanks 4
1.167732 Jason Lee 2
1.167732 David Cross 2
1.167722 Jason Segel 6
1.161184 Amanda Seyfried 6
1.134598 Freida Pinto 2
1.083914 Ken Watanabe 1
1.081577 Katie Featherston 2
1.081577 Micah Sloat 2
1.072914 Paul Walker 2
1.072914 Jordana Brewster 2
1.07084 Jason Bateman 10
1.055046 Aaron Eckhart 5
1.031855 Joseph Gordon-Levitt 5
0.990123 Megan Fox 4
0.9881 Gerard Butler 9
0.973714 Julie Andrews 2
0.95887 Rose Byrne 4
0.953622 Dan Castellaneta 1
0.953622 Julie Kavner 1
0.953622 Nancy Cartwright 1
0.953622 Yeardley Smith 1
0.946414 Gil Birmingham 1
0.921095 Pierce Brosnan 3
0.888618 Vin Diesel 3
0.869224 George Lopez 2
0.865954 Cameron Diaz 8
0.858122 Jackie Chan 4
0.857169 Tobin Bell 4
0.857169 Costas Mandylor 4
0.827224 Lena Headey 1
0.809226 Mila Kunis 6
0.775858 Vincent Cassel 2
0.768506 Russell Brand 4
0.758058 Wenwen Han 1
0.75382 Justin Timberlake 5
0.746798 Kim Cattrall 2
0.746798 Cynthia Nixon 2
0.746798 Kristin Davis 2
0.738043 Taraji P. Henson 3
0.737633 Katy Perry 1
0.737633 Jonathan Winters 1
0.73718 Karen Allen 1
0.719066 Neil Patrick Harris 3
0.706771 Diane Kruger 3
0.697058 Will Smith 3
0.691195 Meryl Streep 6
0.68017 Karen Disher 1
0.677442 Jesse Eisenberg 6
0.676622 Dominic West 3
0.661994 Zac Efron 3
0.661289 Lucas Grabeel 1
0.659597 Quinton Aaron 1
0.646669 Justin Long 5
0.645497 Edward Asner 1
0.645497 Jordan Nagai 1
0.645497 John Ratzenberger 1
0.644743 Eddie Murphy 6
0.644017 Hank Azaria 2
0.633971 Derek Jacobi 1
0.621914 Craig T. Nelson 1
0.599633 Tim McGraw 2
0.59531 Chloe Csengery 1
0.59531 Jessica Tyler Brown 1
0.59531 Christopher Nicholas Smith 1
0.59531 Lauren Bittner 1
0.586817 Scott Patterson 2
0.585204 John Lithgow 2
0.562875 Christopher Plummer 3
0.562458 Dustin Hoffman 4
0.557976 Terry Crews 2
0.557014 Jaden Smith 2
0.55002 Brad Garrett 1
0.55002 Lou Romano 1
0.55002 Ian Holm 1
0.545008 Mark Fredrichs 1
0.545008 Amber Armstrong 1
0.542971 Adam G. Sevani 2
0.542553 Ian McShane 4
0.536569 Molly Ephraim 1
0.536569 David Bierend 1
0.528423 Natalie Portman 8
0.519192 Emma Stone 6
0.517328 Leonardo DiCaprio 4
0.508157 Dwayne Johnson 6
0.506053 Liam Neeson 5
0.504138 Betsy Russell 3
0.498758 Winona Ryder 2
0.498382 Lucy Punch 1
0.489967 Katherine Heigl 5
0.489965 Saurabh Shukla 1
0.489965 Anil Kapoor 1
0.486999 Famke Janssen 1
0.486999 Leland Orser 1
0.482205 Javier Bardem 3
0.481332 Ross Bagdasarian Jr. 1
0.481332 Janice Karman 1
0.460322 Maya Rudolph 2
0.45842 Kristen Wiig 4
0.455105 Bryce Dallas Howard 2
0.449823 Cary Elwes 2
0.443563 Hailee Steinfeld 1
0.442303 Ashley Tisdale 2
0.436495 Ali Larter 2
0.434599 Salli Richardson-Whitfield 1
0.424407 Sarah Clarke 1
0.423429 Penélope Cruz 3
0.413782 Thandie Newton 3
0.412704 Patton Oswalt 3
0.401531 Octavia Spencer 1
0.401112 James Franco 4
0.385048 Terence Stamp 1
0.377733 Wentworth Miller 1
0.377733 Kim Coates 1
0.375668 Jason Flemyng 1
0.374325 Matt Damon 7
0.374325 Christian Bale 5
0.372746 David James 1
0.372746 Jason Cope 1
0.372746 Nathalie Boltt 1
0.368065 Chiwetel Ejiofor 3
0.368005 Édgar Ramirez 1
0.368005 Julia Stiles 1
0.367801 Mélanie Laurent 2
0.362202 Vera Farmiga 3
0.356965 Johnny Knoxville 1
0.356965 Steve-O 1
0.356965 Bam Margera 1
0.356965 Ryan Dunn 1
0.356953 Bill Nighy 6
0.356736 Elle Fanning 1
0.356736 Amanda Michalka 1
0.356736 Kyle Chandler 1
0.356736 Joel Courtney 1
0.356158 Sarah Jessica Parker 4
0.355846 Maggie Grace 2
0.353032 Louis Ferreira 1
0.351145 Ralph Fiennes 3
0.343755 Keir O’Donnell 1
0.343755 Jayma Mays 1
0.343755 Raini Rodriguez 1
0.339988 P.J. Byrne 1
0.33941 Christopher Mintz-Plasse 3
0.338666 Anna Kendrick 2
0.338404 Eli Roth 1
0.338222 Morgan Lily 1
0.338222 Trenton Rogers 1
0.313245 Jessica Lucas 1
0.313245 Lizzy Caplan 1
0.312825 Cassie Ventura 1
0.307954 Viola Davis 4
0.30272 Ty Simpkins 1
0.30272 Lin Shaye 1
0.298112 Bree Turner 1
0.298112 Eric Winter 1
0.296847 Andy Serkis 2
0.294687 Mike Vogel 2
0.294687 T.J. Miller 2
0.292065 Kurt Fuller 1
0.286915 Maggie Smith 1
0.286915 Ashley Jensen 1
0.284681 Seth Meyers 1
0.280837 Tobey Maguire 2
0.280773 Jason Sudeikis 2
0.269573 Kyra Sedgwick 1
0.269573 Madison Pettis 1
0.269573 Roselyn Sanchez 1
0.268595 Lisa Kudrow 2
0.258727 Patrick Dempsey 2
0.257049 David Wenham 3
0.254023 Briana Evigan 2
0.252523 Matthew Perry 1
0.252523 Thomas Lennon 1
0.246058 John Cusack 3
0.240009 Michael Jackson 1
0.240009 Alex Al 1
0.240009 Alexandra Apjarova 1
0.240009 Nick Bass 1
0.238714 Patricia Clarkson 3
0.235798 Brian Kerwin 1
0.230146 Sharni Vinson 1
0.230146 Rick Malambri 1
0.230146 Alyson Stoner 1
0.229488 Topher Grace 3
0.22835 Meagan Good 2
0.224193 Jaime King 2
0.221369 Zachary Gordon 2
0.221369 Robert Capron 2
0.221369 Rachael Harris 2
0.219378 Alice Braga 4
0.219044 Rooney Mara 2
0.215264 Kirsten Dunst 2
0.20931 Rebecca Hall 4
0.20615 Orlando Bloom 1
0.205058 Patrick Fabian 1
0.205058 Ashley Bell 1
0.205058 Iris Bahr 1
0.205058 Louis Herthum 1
0.20394 Michael Cera 6
0.202696 Woody Harrelson 4
0.202054 Aaron Yoo 2
0.195742 Jensen Ackles 1
0.195742 Kerr Smith 1
0.195742 Betsy Rue 1
0.192528 Michael Mantell 1
0.190413 David Morse 1
0.190413 Carrie-Anne Moss 1
0.189054 Piper Perabo 1
0.189054 Manolo Cardona 1
0.187252 Charlie Day 2
0.183573 Will Arnett 3
0.182781 Charlie Tahan 2
0.181449 Kate Bosworth 1
0.175751 Tommy Lee Jones 2
0.173441 Scott Speedman 1
0.173441 Gemma Ward 1
0.173441 Alex Fisher 1
0.170702 Mary Steenburgen 3
0.169352 Amanda Bynes 1
0.169352 Dan Byrd 1
0.165945 Chloë Grace Moretz 2
0.164513 Dougray Scott 1
0.163693 Rachel McAdams 5
0.157617 Harry Connick Jr. 2
0.156691 Emily Osment 1
0.156691 Billy Ray Cyrus 1
0.156691 Jason Earles 1
0.154703 Matt Lanter 3
0.153793 Amanda Peet 3
0.152353 Jenna Elfman 1
0.141247 January Jones 1
0.141247 Aidan Quinn 1
0.14017 Nika Futterman 1
0.14017 Tom Kane 1
0.14017 Ashley Eckstein 1
0.140131 Henry Thomas 1
0.140122 Jack Nicholson 1
0.140122 Sean Hayes 1
0.140122 Beverly Todd 1
0.138843 Chris Rock 3
0.138691 Jennifer Aniston 6
0.136256 Kevin Spacey 3
0.13405 Emma Bell 1
0.13405 Arlen Escarpeta 1
0.13405 Miles Fisher 1
0.130048 André Benjamin 1
0.130048 Maura Tierney 1
0.128801 Kathy Bates 3
0.128594 Anita Briem 2
0.127092 Ayelet Zurer 1
0.12695 Geoffrey Arend 1
0.124268 An Nguyen 1
0.123344 Danielle Panabaker 2
0.122693 Jared Padalecki 1
0.122693 Amanda Righetti 1
0.122693 Derek Mears 1
0.122554 Chandler Canterbury 1
0.122554 Lara Robinson 1
0.120986 Jon Hamm 1
0.120861 Ne-Yo 1
0.117797 Bruce McGill 1
0.117466 Nick Zano 1
0.117466 Krista Allen 1
0.117466 Andrew Fiscella 1
0.117466 Bobby Campo 1
0.114774 Kirk Cameron 1
0.114774 Erin Bethea 1
0.114774 Ken Bevel 1
0.114774 Stephen Dervan 1
0.113006 Donald Faison 2
0.111429 Chris Messina 2
0.110646 Devon Bostick 1
0.109958 Paul Rudd 7
0.10794 Dolph Lundgren 1
0.107433 Liam Hemsworth 1
0.107433 Bobby Coleman 1
0.105983 Christopher Evan Welch 1
0.105053 Bill Hader 2
0.101652 Dev Patel 2
0.100744 Melissa Leo 3
0.099702 Jerry O’Connell 2
0.09952 Jennifer Coolidge 1
0.09952 Adam Campbell 1
0.099046 Caroline Dhavernas 1
0.099046 Bokeem Woodbine 1
0.099046 Logan Marshall-Green 1
0.097749 Nick Bacon 1
0.096496 Penn Badgley 2
0.095386 Amber Valletta 2
0.09417 Eric Balfour 1
0.09417 Scottie Thompson 1
0.09417 Brittany Daniel 1
0.093839 Robert Hoffman 2
0.091974 Gary Cole 1
0.091644 Robert Knepper 2
0.090532 Bebe Neuwirth 1
0.090532 Megan Mullally 1
0.090532 Kay Panabaker 1
0.090298 Candice Bergen 1
0.090298 Bryan Greenberg 1
0.089631 Joshua Jackson 1
0.089631 Rachael Taylor 1
0.089631 James Kyson 1
0.089631 Megumi Okina 1
0.088905 Angela Lansbury 1
0.088905 Ophelia Lovibond 1
0.088475 Julian McMahon 1
0.088475 Shyann McClure 1
0.087343 Rowan Atkinson 1
0.087343 Roger Barclay 1
0.084964 Bryan Cranston 1
0.084964 Albert Brooks 1
0.081511 Alessandro Nivola 1
0.081511 Parker Posey 1
0.081511 Rade Serbedzija 1
0.080952 Kal Penn 3
0.077884 Jenn Proske 1
0.077884 Diedrich Bader 1
0.077884 Chris Riggi 1
0.07668 Marcia Gay Harden 2
0.074429 Andrew Garfield 2
0.073885 Rhys Wakefield 1
0.073885 Allison Cratchley 1
0.073885 Christopher Baker 1
0.073885 Richard Roxburgh 1
0.073624 Michael O’Keefe 1
0.07336 Daeg Faerch 1
0.073189 Kevin Kline 2
0.071229 Eugenio Derbez 1
0.071229 Kate del Castillo 1
0.071229 Adrian Alonso 1
0.071229 Maya Zapata 1
0.070675 Sarah Roemer 3
0.067179 Nathan Fillion 1
0.067179 Jeremy Sisto 1
0.066353 Scout Taylor-Compton 2
0.066353 Malcolm McDowell 2
0.066353 Tyler Mane 2
0.065523 Steve Zahn 2
0.065485 Gio Perez 1
0.065485 Joel Garland 1
0.063306 Mark Rolston 1
0.06132 Jude Law 3
0.0578 David Schwimmer 1
0.0578 Jada Pinkett Smith 1
0.057107 Jake T. Austin 1
0.056938 Khalid Abdalla 1
0.056938 Ahmad Khan Mahmoodzada 1
0.056938 Atossa Leoni 1
0.056938 Shaun Toub 1
0.05549 José Luis Garcia Pérez 1
0.05549 Robert Paterson 1
0.05549 Stephen Tobolowsky 1
0.054836 Jeremy Renner 2
0.050369 Malik Yoba 1
0.048526 Warren Christie 1
0.048526 Ryan Robbins 1
0.048526 Ali Liebert 1
0.048526 Lloyd Owen 1
0.048392 Nick Sullivan 1
0.047911 Richard Jenkins 5
0.046161 Annette Bening 3
0.045566 Carol Burnett 2
0.044954 Tom Mison 1
0.043956 Walter Raney 1
0.043917 Reiko Aylesworth 1
0.043917 Steven Pasquale 1
0.043917 John Ortiz 1
0.043917 Johnny Lewis 1
0.043044 Michael Moore 1
0.043044 Tucker Albrizzi 1
0.043044 Tony Benn 1
0.043044 George W. Bush 1
0.041406 Catherine Zeta-Jones 1
0.04094 Elisabeth Shue 1
0.038995 Garrett M. Brown 1
0.038936 Jon Voight 2
0.037779 Alfre Woodard 1
0.037779 Sanaa Lathan 1
0.037779 Rockmond Dunbar 1
0.036524 Joseph Ruskin 1
0.036472 Kate Mara 1
0.036472 Sean Bott 1
0.036463 Cillian Murphy 1
0.036463 Shyloh Oostwald 1
0.035315 Jeffrey Wright 2
0.034615 Haaz Sleiman 1
0.034615 Danai Gurira 1
0.034615 Hiam Abbass 1
0.033809 Sylvester Stallone 2
0.033294 Lauren German 1
0.033294 Heather Matarazzo 1
0.033294 Bijou Phillips 1
0.033294 Roger Bart 1
0.033228 Bill Maher 1
0.033228 Tal Bachman 1
0.033228 Jonathan Boulden 1
0.033228 Steve Burg 1
0.03308 Emilie de Ravin 1
0.03308 Caitlyn Rund 1
0.03308 Moisés Acevedo 1
0.032157 Jeremy Piven 2
0.031753 Jennifer Hudson 1
0.031753 Alicia Keys 1
0.0313 Sacha Baron Cohen 1
0.0313 Gustaf Hammarsten 1
0.0313 Clifford Bañagale 1
0.0313 Chibundu Orukwowu 1
0.029753 Tyler Perry 3
0.029491 Saffron Burrows 1
0.029491 Daniel Mays 1
0.029397 Goran Visnjic 1
0.028255 Odette Annable 2
0.025992 Brittany Snow 1
0.025992 Jessica Stroup 1
0.025992 Dana Davis 1
0.025977 Kyle Gallner 2
0.024707 Thomas Jane 1
0.024707 Laurie Holden 1
0.024707 Andre Braugher 1
0.024175 Kelsey Grammer 2
0.021951 Kat Dennings 2
0.021924 Will Patton 1
0.021924 Charlotte Milchard 1
0.021893 Nicholas D’Agosto 2
0.021888 John Hawkes 1
0.021888 Garret Dillahunt 1
0.021888 Isaiah Stone 1
0.021701 Paul Dano 1
0.021701 Martin Stringer 1
0.020605 Rafi Gavron 1
0.020451 Paul Schneider 1
0.02036 Jennifer Carpenter 1
0.02036 Jay Hernandez 1
0.01921 Alan Rickman 1
0.01921 Timothy Spall 1
0.018948 Sarah Burns 1
0.018836 Mike Epps 1
0.018836 Wood Harris 1
0.018712 Jill Scott 1
0.017932 Brit Marling 1
0.017932 William Mapother 1
0.017932 Matthew-Lee Erlbach 1
0.017932 DJ Flava 1
0.017423 Jason Spevack 1
0.017114 Josh Hutcherson 2
0.016759 Fred Willard 2
0.01642 Cuba Gooding Jr. 1
0.01642 Lochlyn Munro 1
0.01642 Richard Gant 1
0.01642 Tamala Jones 1
0.015848 Katt Williams 1
0.015441 Haley Bennett 1
0.015441 Chace Crawford 1
0.015441 Jake Weber 1
0.015441 Shannon Woodward 1
0.014565 Chris Hemsworth 1
0.014565 Tom Hiddleston 1
0.014266 Michael Stuhlbarg 1
0.014266 Richard Kind 1
0.014266 Sari Lennick 1
0.014266 Fred Melamed 1
0.010856 Bridget Moynahan 1
0.010856 Ramon Rodriguez 1
0.010561 Debra Messing 1
0.008341 Vanessa Hudgens 4
0.007363 Don Cheadle 4
0.007136 Ben Stein 1
0.007136 Lili Asvar 1
0.007136 Peter Atkins 1
0.007136 Hector Avalos 1
0.00691 Michael C. Hall 1
0.006893 Janet Jackson 3
0.004992 Alison Lohman 1
0.004992 Ruth Livier 1
0.004992 Lorna Raver 1
0.004762 Sharlto Copley 2
0.004547 Clifton Collins Jr. 1
0.004547 Dwight Yoakam 1
0.003298 Katie Cassidy 2
0.002413 Alan Arkin 2
0.002303 Asa Butterfield 1
0.002303 Rupert Friend 1
0.002303 Zac Mattoon O’Brien 1
0.002073 Elise Ivy 1
0.001311 Morris Chestnut 1
0.001311 Maeve Quinlan 1
0.001311 Kevin Hart 1
0.001177 Gwyneth Paltrow 4
0.000214 Elizabeth Perkins 1
0.000214 Kaley Cuoco 1
-0.00043 Sean Maguire 1
-0.00043 Kevin Sorbo 1
-0.00043 Ken Davitian 1
-0.0007 Aaron Johnson 2
-0.00302 Scarlett Johansson 4
-0.00381 Dominic Cooper 1
-0.00381 Charlotte Rampling 1
-0.00437 Joel McHale 1
-0.00437 Rowan Blanchard 1
-0.00452 Joan Allen 3
-0.00581 Minka Kelly 1
-0.00581 Shirley Norris 1
-0.00598 Ted Ludzik 1
-0.00701 Sheri Moon Zombie 1
-0.00739 Daniella Alonso 1
-0.00739 Jacob Vargas 1
-0.00739 Michael Bailey Smith 1
-0.00739 Michael McMillian 1
-0.00748 Joaquin Phoenix 1
-0.00748 Danny Hoch 1
-0.00758 Cary-Hiroyuki Tagawa 1
-0.00913 Scott Porter 3
-0.00925 Beyoncé Knowles 2
-0.00936 Loretta Devine 2
-0.01051 Willem Dafoe 1
-0.01051 Sam Neill 1
-0.01051 Claudia Karvan 1
-0.01133 Eva Longoria 1
-0.01183 Helen Hunt 1
-0.01183 Lorraine Nicholson 1
-0.01272 Charles S. Dutton 1
-0.01272 Lucas Black 1
-0.01276 Diego Luna 1
-0.01285 Kenny Wormald 1
-0.01285 Julianne Hough 1
-0.01285 Andie MacDowell 1
-0.01341 Patrick Stewart 1
-0.01341 Mako 1
-0.01341 Nolan North 1
-0.01393 Jeff Goldblum 1
-0.01461 Julianna Margulies 1
-0.01668 Alec Baldwin 3
-0.01764 Emma Roberts 2
-0.01805 Michael Parks 1
-0.01805 John Goodman 1
-0.01838 Michael Carman 1
-0.01856 Alice Eve 1
-0.01857 John Cho 2
-0.01872 Ryan Gosling 4
-0.01881 Gael Garcia Bernal 1
-0.01881 Marcia DeBonis 1
-0.01899 Angelina Jolie 7
-0.01907 Meg Ryan 2
-0.01973 Christopher Jordan Wallace 1
-0.02072 Emma Thompson 1
-0.02072 Maggie Gyllenhaal 1
-0.02072 Oscar Steer 1
-0.02083 George Clooney 10
-0.0215 Laura Linney 1
-0.02431 Virginia Madsen 1
-0.02431 Martin Donovan 1
-0.02516 Shea Whigham 1
-0.02516 Tova Stewart 1
-0.02519 Jon Favreau 1
-0.02534 Elizabeth Berrington 1
-0.02621 Isabelle Fuhrman 1
-0.02621 CCH Pounder 1
-0.02631 Mickey Rourke 2
-0.02666 Michael Sheen 1
-0.02666 Steven Mackintosh 1
-0.02671 Kadeem Hardison 1
-0.02734 Paolo Bonacelli 1
-0.02734 Violante Placido 1
-0.02734 Thekla Reuten 1
-0.02787 Christopher Meloni 1
-0.0281 Colin O’Donoghue 1
-0.02839 Common 1
-0.02839 James Pickens Jr. 1
-0.02906 Wesley Snipes 1
-0.02948 Connor Price 1
-0.02963 Kelsey Ford 1
-0.02963 Elena Anaya 1
-0.03052 Tom Sturridge 1
-0.03199 Ron Glass 1
-0.03202 Edward Burns 1
-0.03202 Shannyn Sossamon 1
-0.03202 Ana Claudia Talancón 1
-0.03202 Ray Wise 1
-0.03285 Alexandria M. Salling 1
-0.03299 Larry David 1
-0.03299 Adam Brooks 1
-0.0341 Jamal Woolard 1
-0.0341 Mohamed Dione 1
-0.03417 Keira Knightley 4
-0.03504 Stephen Campbell Moore 2
-0.03591 Ashlyn Sanchez 1
-0.03644 Amara Karan 1
-0.03933 Tamela J. Mann 1
-0.03933 David Mann 1
-0.03957 Ethan Hawke 2
-0.04132 Ludacris 2
-0.0447 Josh Lucas 1
-0.0447 Alexis Clagett 1
-0.0452 Kiele Sanchez 1
-0.04542 Richard Dreyfuss 2
-0.04634 Josh Hartnett 1
-0.04634 Melissa George 1
-0.04698 Danny Trejo 1
-0.04699 Selena Gomez 1
-0.04699 Cory Monteith 1
-0.04823 Beau Bridges 1
-0.04834 Sean Faris 1
-0.04834 Djimon Hounsou 1
-0.04848 John Carroll Lynch 1
-0.04855 Will Forte 1
-0.04855 Val Kilmer 1
-0.04988 Mia Stallard 1
-0.05001 Devin Brochu 1
-0.05014 Nikki Blonsky 1
-0.05014 Michelle Pfeiffer 1
-0.05018 Clifton Powell 1
-0.05062 Luke Wilson 1
-0.05062 Frank Whaley 1
-0.05062 Ethan Embry 1
-0.05137 Michelle Williams 1
-0.05137 Eddie Redmayne 1
-0.05137 Julia Ormond 1
-0.05152 Timothy Dalton 1
-0.05268 Michael Kelly 1
-0.05269 Mary-Kate Olsen 1
-0.05269 Justin Bradley 1
-0.05378 Neve Campbell 1
-0.05378 David Arquette 1
-0.05378 Lucy Hale 1
-0.05387 Nathan Gamble 1
-0.0553 Richard E. Grant 1
-0.05587 Brooklyn Decker 1
-0.0564 Ben Hollingsworth 1
-0.05818 Anne Heche 1
-0.05818 Isiah Whitlock Jr. 1
-0.0588 Rumer Willis 1
-0.0588 Carrie Fisher 1
-0.0588 Teri Andrzejewski 1
-0.059 Lily Collins 1
-0.059 Jake Andolina 1
-0.05922 Christina Applegate 1
-0.05929 Mos Def 2
-0.05943 Matthew Goode 1
-0.05943 Adam Scott 1
-0.05956 Amber Tamblyn 2
-0.05961 Rodrigo Santoro 1
-0.06019 Thomas Haden Church 4
-0.06219 Whoopi Goldberg 1
-0.06219 Kimberly Elise 1
-0.06228 Vicky Krieps 1
-0.06293 Cheryl Hines 2
-0.06335 Vanessa Minnillo 1
-0.06335 Nicole Parker 1
-0.06378 Carmen Electra 2
-0.06453 Claire Foy 1
-0.06484 Kerry Washington 2
-0.06553 Rosemarie DeWitt 1
-0.06553 Debra Winger 1
-0.06553 Sebastian Stan 1
-0.06615 Brian Geraghty 1
-0.06636 Madeline Carroll 1
-0.0668 Mark Ruffalo 3
-0.06737 Adrien Brody 3
-0.06842 Thomas D. Mahard 1
-0.06852 Jason Biggs 2
-0.06862 Chris Brown 1
-0.06876 Stella Maeve 1
-0.06949 Jennifer Lopez 1
-0.06949 Alex O’Loughlin 1
-0.06949 Michaela Watkins 1
-0.07143 Leslie Mann 5
-0.07286 Dylan Walsh 1
-0.07286 Sela Ward 1
-0.07287 Natalya Rudakova 1
-0.07287 François Berléand 1
-0.07344 Ari Graynor 1
-0.07413 Matthew Marsden 1
-0.07413 Graham McTavish 1
-0.07432 Milla Jovovich 5
-0.07449 Tracy Morgan 2
-0.07474 Tate Donovan 1
-0.07474 Craig Gellis 1
-0.07611 Drew Barrymore 5
-0.07615 Demi Moore 2
-0.07663 Zachary Quinto 2
-0.07686 Dennis Hopper 1
-0.07698 Amy Smart 2
-0.07733 Jacob Latimore 1
-0.07746 Sarah Habel 1
-0.07813 Mia Farrow 1
-0.07923 Jamie Lee Curtis 1
-0.07995 Tony Hale 1
-0.07995 Lucas McHugh Carroll 1
-0.08153 Kiefer Sutherland 1
-0.08153 Cameron Boyce 1
-0.0823 Justin Chatwin 1
-0.0823 James Marsters 1
-0.0823 Yun-Fat Chow 1
-0.0823 Emmy Rossum 1
-0.08247 Laurence Fishburne 2
-0.08276 Ben Burtt 1
-0.08276 Elissa Knight 1
-0.08276 Jeff Garlin 1
-0.08358 Tim Hodge 1
-0.08358 Mike Nawrocki 1
-0.08358 Phil Vischer 1
-0.08358 Cam Clarke 1
-0.08405 Stephen Merchant 1
-0.08611 Kevin Costner 2
-0.08636 Nia Vardalos 1
-0.08636 Rachel Dratch 1
-0.08636 Alexis Georgoulis 1
-0.0876 Ed Harris 1
-0.0876 Jeremy Irons 1
-0.08809 Leonard Nimoy 1
-0.08989 Rachel Bilson 2
-0.08997 Gerry Bednob 1
-0.09033 Juan Carlos Hernández 1
-0.09033 Cory Fernandez 1
-0.09096 Portia Doubleday 1
-0.09096 Jean Smart 1
-0.09157 Dianna Agron 1
-0.09398 Shailene Woodley 1
-0.09398 Amara Miller 1
-0.09398 Nick Krause 1
-0.09433 Izzy Meikle-Small 1
-0.09496 Téa Leoni 1
-0.09496 Jordan Carlos 1
-0.09575 Daniel Olbrychski 1
-0.09586 Dustin Milligan 1
-0.09586 Chris Carmack 1
-0.09586 Katharine McPhee 1
-0.09603 America Ferrera 1
-0.09662 Yifei Liu 1
-0.09665 Steve Harris 2
-0.0978 Gillian Anderson 1
-0.0978 Billy Connolly 1
-0.09831 Lukas Haas 1
-0.09851 Michelle Nolden 1
-0.09879 John Michael Higgins 1
-0.09889 John Turturro 1
-0.09889 Emmanuelle Chriqui 1
-0.10018 Charlize Theron 3
-0.10183 Sarah Mahoney 1
-0.10183 Roxana Ortega 1
-0.10214 Tom Cavanagh 1
-0.10263 Paul Rust 1
-0.10263 Jack Carpenter 1
-0.10263 Lauren London 1
-0.10292 Gemma Jones 1
-0.10305 Sam Shepard 1
-0.10313 James Earl Jones 1
-0.10313 Margaret Avery 1
-0.10387 Michael Keaton 1
-0.10387 Zach Gilford 1
-0.10396 Blythe Danner 2
-0.10608 Adam Brody 2
-0.10638 Ted Danson 1
-0.10643 Dane Cook 3
-0.10786 Suzanne Rico 1
-0.10843 Danny DeVito 1
-0.10878 Jonah Hill 6
-0.10929 Archie Panjabi 1
-0.10929 Saïd Taghmaoui 1
-0.11021 Carice van Houten 1
-0.1118 Colin Firth 2
-0.11216 Molly Sims 1
-0.11276 Leighton Meester 3
-0.11281 Hunter McCracken 1
-0.11283 Maggie Q 2
-0.11339 Robert Redford 1
-0.11467 Michael Angarano 2
-0.11492 John Hurt 1
-0.11492 Stephen Dorff 1
-0.11535 Trevor Gagnon 1
-0.11535 Philip Bolden 1
-0.11535 David Gore 1
-0.11535 Christopher Lloyd 1
-0.11589 Shawn Wayans 1
-0.11589 Shoshana Bush 1
-0.11589 Damon Wayans Jr. 1
-0.11672 Bailee Madison 1
-0.11672 Bruce Gleeson 1
-0.11701 John Cena 1
-0.11701 Ashley Scott 1
-0.11701 Aidan Gillen 1
-0.11778 Joan Cusack 4
-0.11808 Aziz Ansari 1
-0.121 Juliette Binoche 1
-0.121 James Ransone 1
-0.12193 Alexander Ludwig 1
-0.12267 Kate Winslet 1
-0.12448 Marco Khan 1
-0.12448 Cliff Curtis 1
-0.12498 Kelly Preston 1
-0.12557 Sean Penn 2
-0.12565 Ray Liotta 3
-0.12714 Stanley Tucci 4
-0.12778 Josh Zuckerman 1
-0.12813 Russell Means 1
-0.12921 Billy Burke 4
-0.12988 Michael Buie 1
-0.13011 Andy Samberg 1
-0.13011 Jeff Daniels 1
-0.13042 István Göz 1
-0.13061 Larry the Cable Guy 1
-0.13115 Mauricio Lopez 1
-0.13207 Will Poulter 1
-0.13287 Aly Michalka 1
-0.13287 Gaelan Connell 1
-0.1335 Jeffrey Dean Morgan 1
-0.13599 James Rebhorn 1
-0.13792 Ashley Judd 2
-0.13797 Jessica Chastain 2
-0.13797 Simon Pegg 2
-0.13835 Marisa Tomei 2
-0.13901 Hayley Atwell 1
-0.1391 Steven Strait 2
-0.14098 Jodelle Ferland 1
-0.14218 Craig Ferguson 2
-0.14296 Jerry Stiller 1
-0.14426 Alex Pettyfer 2
-0.14453 Tom Hardy 1
-0.14453 Nick Nolte 1
-0.14453 Jennifer Morrison 1
-0.14562 Shawn Ashmore 1
-0.14562 Jonathan Tucker 1
-0.14562 Laura Ramsey 1
-0.14615 David Tennant 1
-0.14615 Toni Collette 1
-0.14626 Joseph Cross 1
-0.14674 Kathleen Turner 1
-0.14783 Allison Janney 1
-0.14783 Carmen Ejogo 1
-0.14791 Henry Cavill 2
-0.15115 Ricky Gervais 2
-0.15316 Queen Latifah 4
-0.15327 William H. Macy 2
-0.1542 David Duchovny 2
-0.15482 John Magaro 1
-0.15482 Denzel Whitaker 1
-0.15482 Zena Grey 1
-0.15609 Brandon Routh 1
-0.15609 Sam Huntington 1
-0.15609 Taye Diggs 1
-0.1568 Cherry Jones 1
-0.15881 Rashida Jones 2
-0.15924 Colm Meaney 1
-0.15932 Louis C.K. 2
-0.15975 Jim Cummings 1
-0.15975 Bud Luckey 1
-0.16027 Frances Conroy 1
-0.16048 David Morrissey 1
-0.16131 Alicia Witt 1
-0.16131 Ben McKenzie 1
-0.16131 Leelee Sobieski 1
-0.16158 Kenneth Branagh 2
-0.16271 Susan Sarandon 4
-0.16305 Teri Hatcher 1
-0.16305 John Hodgman 1
-0.16305 Jennifer Saunders 1
-0.16324 Steve Coogan 1
-0.16324 Brandon T. Jackson 1
-0.16746 James Marsden 6
-0.16798 Evangeline Lilly 1
-0.16798 Dakota Goyo 1
-0.16863 Lily Rabe 1
-0.1693 Keanu Reeves 2
-0.17142 Kodi Smit-McPhee 1
-0.1731 Noah Bean 1
-0.17413 Diane Lane 2
-0.17575 Michael Vartan 1
-0.17575 Callum Blue 1
-0.17627 Paula Patton 3
-0.17794 Molly Shannon 1
-0.17794 Steve Buscemi 1
-0.17794 Myleene Klass 1
-0.17946 Darrin Dewitt Henson 2
-0.17949 Hayden Panettiere 2
-0.1843 Mélanie Thierry 1
-0.1843 Gérard Depardieu 1
-0.18736 Teri Polo 1
-0.18816 Matt Dillon 2
-0.18819 Noah Emmerich 1
-0.1885 Lee Pace 1
-0.1885 George 1
-0.18979 Jamie Bell 2
-0.18995 Kevin McKidd 2
-0.19227 Oprah Winfrey 1
-0.19227 Bruno Campos 1
-0.19379 Olga Kurylenko 2
-0.19412 Tony Goldwyn 1
-0.19669 Brooke Shields 1
-0.19669 Ricky Garcia 1
-0.19669 Eugene Cordero 1
-0.19796 Colin Ford 1
-0.1999 Alexis Bledel 2
-0.20019 Josh Peck 1
-0.20019 Alex Frost 1
-0.20019 Nate Hartley 1
-0.20054 Armie Hammer 1
-0.20054 Josh Hamilton 1
-0.20082 Ken Jeong 2
-0.201 Jennifer Garner 5
-0.20531 Hayden Christensen 3
-0.21062 Ice Cube 2
-0.21392 Charlie Cox 1
-0.21392 Claire Danes 1
-0.21392 Sienna Miller 1
-0.21392 Ian McKellen 1
-0.21416 Josh Brolin 8
-0.21465 Miley Cyrus 3
-0.21491 Greta Gerwig 1
-0.21652 David Thewlis 2
-0.21697 Nick Swardson 2
-0.21737 Olivia Williams 1
-0.21737 Jon Bernthal 1
-0.21748 Keith David 2
-0.218 Robert Downey Jr. 7
-0.21882 Rhys Ifans 1
-0.21882 Sebastian Armesto 1
-0.21891 Brad Pitt 7
-0.21899 Carter Jenkins 1
-0.21899 Austin Butler 1
-0.22287 Demetri Martin 1
-0.22287 Henry Goodman 1
-0.22287 Edward Hibbert 1
-0.22287 Imelda Staunton 1
-0.22299 Meredith Droeger 1
-0.22329 Bob Hoskins 1
-0.22329 Alexander Siddig 1
-0.22329 Caryn Peterson 1
-0.22569 Abigail Breslin 7
-0.22647 Keke Palmer 1
-0.22647 Tasha Smith 1
-0.22647 Jill Marie Jones 1
-0.22956 Max Thieriot 2
-0.23105 Tracey Ullman 1
-0.23431 John Malkovich 2
-0.23522 Idris Elba 3
-0.23559 Jeff Kahn 1
-0.23763 Vanessa Redgrave 2
-0.23795 Doug Hutchison 1
-0.24264 Gabriel Macht 2
-0.24452 David Krumholtz 1
-0.24452 Nat Faxon 1
-0.24508 Clark Duke 2
-0.25125 Ron Livingston 2
-0.25182 Kim Basinger 1
-0.25248 Emily Mortimer 4
-0.2531 Bruce Campbell 1
-0.25446 Selma Blair 1
-0.25446 Doug Jones 1
-0.25446 John Alexander 1
-0.25446 Anika Noni Rose 2
-0.25658 Seann William Scott 2
-0.25883 Armin Mueller-Stahl 1
-0.26063 Jason Schwartzman 2
-0.26107 Kathy Baker 1
-0.26156 Monica Bellucci 1
-0.26156 Stephen McHattie 1
-0.26745 Timothy Olyphant 6
-0.26757 Chris Massoglia 1
-0.26811 James Caan 2
-0.26866 Steve Carell 7
-0.27004 Colin Hanks 3
-0.27109 Tom Skerritt 1
-0.27329 Liv Tyler 2
-0.27555 Ben Affleck 4
-0.27608 Zooey Deschanel 5
-0.27655 James Russo 1
-0.27655 Charlie Yeung 1
-0.27655 Shahkrit Yamnarm 1
-0.27655 Panward Hemmanee 1
-0.27851 Hugh Grant 2
-0.27948 Diane Keaton 2
-0.28318 Judy Greer 2
-0.28336 Carey Mulligan 4
-0.28797 Colm Feore 1
-0.28797 Amy Ryan 1
-0.28797 Gattlin Griffith 1
-0.28933 Eva Mendes 3
-0.29241 Matt Long 1
-0.29307 Ben Mendelsohn 1
-0.29363 Emily Barclay 1
-0.29414 Rain 1
-0.29414 Rick Yune 1
-0.29414 Naomie Harris 1
-0.29414 Ben Miles 1
-0.29423 AnnaSophia Robb 3
-0.29464 Dan Aykroyd 2
-0.29572 Christopher Eccleston 1
-0.29799 Hugh Dancy 1
-0.29799 Krysten Ritter 1
-0.29968 Toby Jones 1
-0.29968 David Ryall 1
-0.29993 Antje Traue 1
-0.30149 Jonathan Rhys Meyers 1
-0.30149 Kasia Smutniak 1
-0.30149 Richard Durden 1
-0.30374 Jenna Fischer 2
-0.30433 Alan Alda 1
-0.30445 Shea Adams 1
-0.30445 Eddie Baroo 1
-0.30461 Frank Langella 2
-0.3085 50 Cent 1
-0.30904 William Fichtner 1
-0.31005 Joe Anderson 2
-0.31069 Andy Garcia 2
-0.3107 Dana Fuchs 1
-0.31076 Sarah Michelle Gellar 1
-0.31076 George Carlin 1
-0.31208 Julie Benz 2
-0.31313 Hilary Swank 4
-0.3135 Alexis Dziena 1
-0.31375 Arielle Kebbel 1
-0.31742 Columbus Short 5
-0.32086 Cher 1
-0.32086 Christina Aguilera 1
-0.32086 Alan Cumming 1
-0.32091 Anthony Mackie 4
-0.32244 Camilla Belle 2
-0.32251 Terrence Howard 2
-0.32407 Anthony Edwards 1
-0.32658 Elias Koteas 3
-0.33024 Danny Glover 2
-0.33186 Rosamund Pike 2
-0.3343 Bruce Willis 4
-0.33616 Michael Hadley 1
-0.3371 Sarah Bolger 1
-0.33889 Julianne Moore 3
-0.33895 Michelle Monaghan 6
-0.33954 Frances McDormand 2
-0.3401 Zahf Paroo 1
-0.34013 Giovanni Ribisi 1
-0.34013 Michael Rispoli 1
-0.34041 Sharon Leal 3
-0.34369 Evan Rachel Wood 2
-0.34673 Mary-Louise Parker 2
-0.34779 Eliza Bennett 1
-0.34779 Sienna Guillory 1
-0.3497 Jennifer Lawrence 2
-0.35049 Paul Bettany 4
-0.35309 Ryan Phillippe 3
-0.35499 Michael Shannon 3
-0.3563 Leslie Bibb 2
-0.35831 Mathieu Amalric 1
-0.35831 Judi Dench 1
-0.35885 Kevin James 5
-0.35968 Colin Farrell 3
-0.36023 Richard Gere 4
-0.36129 Stephanie Szostak 1
-0.36722 Drake Bell 1
-0.36722 Leslie Nielsen 1
-0.36722 Christopher McDonald 1
-0.37159 Kevin Bacon 1
-0.37197 Rasmus Hardiker 1
-0.37287 Christoph Waltz 2
-0.37325 Laz Alonso 1
-0.37325 Omar Benson Miller 1
-0.37446 Tony Kgoroge 1
-0.37446 Patrick Mofokeng 1
-0.37481 Kate Hudson 4
-0.37761 Jet Li 3
-0.37826 Ving Rhames 2
-0.3796 Amanda Crew 2
-0.38001 John Cleese 2
-0.38141 Abbie Cornish 3
-0.38142 Derek Jeter 1
-0.38538 Gabrielle Union 1
-0.38538 Scott Caan 1
-0.38604 Nick Frost 2
-0.38614 Catherine Keener 4
-0.3877 Daniel Hansen 1
-0.3877 Wesley Singerman 1
-0.3877 Jordan Fry 1
-0.38831 Noah Ringer 1
-0.38831 Nicola Peltz 1
-0.38831 Jackson Rathbone 1
-0.38893 Maria Bello 1
-0.39071 Tom Selleck 1
-0.39332 Jay Chou 1
-0.39413 Jennifer Connelly 4
-0.39438 Kieran Culkin 1
-0.39438 Alison Pill 1
-0.39473 David de Vries 1
-0.39492 Joel Edgerton 2
-0.39802 Dakota Fanning 4
-0.39804 Tilda Swinton 2
-0.39875 Casey Affleck 2
-0.40121 Viggo Mortensen 3
-0.40188 Thomas Kretschmann 1
-0.40735 Derek Luke 2
-0.40949 Bernie Mac 1
-0.40949 Adam Herschman 1
-0.41029 Kristin Kreuk 1
-0.41029 Neal McDonough 1
-0.41029 Michael Clarke Duncan 1
-0.41029 Chris Klein 1
-0.41274 Jon Heder 2
-0.41518 Julia Roberts 5
-0.41561 Jean Reno 2
-0.41773 Naveen Andrews 1
-0.41773 Nicky Katt 1
-0.41835 Ioan Gruffudd 2
-0.41855 Radha Mitchell 2
-0.4238 Tom Hollander 1
-0.42492 Stephen Colbert 1
-0.42514 Bette Midler 1
-0.42514 Chris O’Donnell 1
-0.42514 Jack McBrayer 1
-0.42584 Karl Urban 2
-0.42594 Yara Shahidi 1
-0.42594 Ronny Cox 1
-0.42702 Angela Bassett 2
-0.42845 Clancy Brown 2
-0.429 Tom Wilkinson 4
-0.42952 Anna Friel 2
-0.43227 Greg Kinnear 3
-0.43263 Hugo Weaving 2
-0.43371 Robert Duvall 2
-0.43428 Patrick Wilson 4
-0.43428 Jordi Mollà 2
-0.43598 Romany Malco 1
-0.43598 Jessica Simpson 1
-0.44493 Oliver Platt 3
-0.44639 Billy Bob Thornton 2
-0.44673 Tim Roth 1
-0.45012 Jason Behr 1
-0.45012 Amanda Brooks 1
-0.45012 Robert Forster 1
-0.45081 Elisabeth Moss 2
-0.45444 Kate Beckinsale 3
-0.45459 Elodie Tougne 1
-0.45644 Michael Peña 4
-0.45749 Ethan Suplee 1
-0.46307 Sara Paxton 2
-0.46648 William Hurt 2
-0.46705 Dominic Purcell 1
-0.4676 Eric Dane 2
-0.47286 Ewan McGregor 6
-0.47372 Bojana Novakovic 1
-0.47493 Rainn Wilson 2
-0.47496 Michael Douglas 3
-0.47656 Salma Hayek 3
-0.47821 Michael Chiklis 2
-0.47877 Susie Essman 1
-0.47877 Mark Walton 1
-0.48214 Tom Cruise 3
-0.48289 Rob Brown 2
-0.48494 Chris Tucker 1
-0.48494 Hiroyuki Sanada 1
-0.48495 Jerry Seinfeld 1
-0.48664 Amber Heard 4
-0.49151 Jeff Bridges 5
-0.49254 Julia Ormond 2
-0.49388 Jim Sturgess 5
-0.49397 David Alexander 1
-0.49599 Dakota Blue Richards 1
-0.49599 Ben Walker 1
-0.49633 Jack Black 7
-0.49996 Rachel Weisz 3
-0.50207 Rhona Mitra 3
-0.50598 Kristen Bell 3
-0.50922 Ulrich Thomsen 2
-0.50958 Paul Giamatti 4
-0.51353 Courteney Cox 2
-0.52123 Eric Christian Olsen 4
-0.52388 Bill Murray 2
-0.5256 Helen Mirren 4
-0.5371 Liev Schreiber 3
-0.53777 Luis Guzmán 1
-0.53777 Victor Gojcaj 1
-0.53964 William Moseley 1
-0.54075 Amy Poehler 2
-0.54559 Chris Pine 2
-0.55487 Golshifteh Farahani 1
-0.56183 Ciarán Hinds 5
-0.56769 Matthew McConaughey 3
-0.57323 Michelle Yeoh 2
-0.59291 Jodie Foster 3
-0.59744 Jeremy Northam 1
-0.59744 Jackson Bond 1
-0.61008 Harrison Ford 4
-0.61034 Zachary Levi 2
-0.61171 Ben Kingsley 2
-0.61301 Jena Malone 2
-0.61555 Keri Russell 3
-0.61562 Daniel Day-Lewis 2
-0.61821 Jessica Alba 8
-0.62343 Malin Akerman 4
-0.63052 Mel Gibson 2
-0.63521 John Krasinski 4
-0.63733 Marion Cotillard 1
-0.63733 Sophia Loren 1
-0.63784 Cate Blanchett 4
-0.63804 Donald Sutherland 3
-0.63962 Steve Martin 3
-0.64079 Jackie Earle Haley 2
-0.64477 Mary Elizabeth Winstead 2
-0.64757 Max Records 1
-0.64757 Pepita Emmerichs 1
-0.65084 David Strathairn 2
-0.65479 Vince Vaughn 4
-0.65669 Tina Fey 3
-0.65691 Bruce Boxleitner 1
-0.65739 Craig Robinson 3
-0.66044 Logan Lerman 3
-0.662 Saoirse Ronan 3
-0.66534 Jamie Foxx 5
-0.66583 Jim Carrey 5
-0.66613 Jason Momoa 1
-0.66613 Rose McGowan 1
-0.66613 Stephen Lang 1
-0.66867 Sam Elliott 2
-0.6712 Adewale Akinnuoye-Agbaje 1
-0.67171 Ben Barnes 2
-0.67171 Skandar Keynes 2
-0.67171 Georgie Henley 2
-0.67769 Katie Holmes 3
-0.68674 Eric Bana 4
-0.68959 Hugh Jackman 3
-0.69001 Anna Faris 6
-0.69111 Matthew Fox 2
-0.69807 Jorma Taccone 1
-0.70598 Jay Baruchel 3
-0.71044 Lake Bell 2
-0.71601 Matthew Broderick 2
-0.71688 Garrett Hedlund 2
-0.72146 Clint Eastwood 1
-0.72146 Bee Vang 1
-0.72146 Christopher Carley 1
-0.72146 Ahney Her 1
-0.72206 Reese Witherspoon 4
-0.72222 Forest Whitaker 4
-0.73722 Danny Huston 3
-0.73748 Michael Ealy 3
-0.74577 Steve Valentine 1
-0.74612 Chris Cooper 3
-0.7519 Guy Pearce 4
-0.75576 Ray Stevenson 3
-0.76908 Ben Foster 4
-0.7697 Emile Hirsch 3
-0.77558 John C. Reilly 5
-0.77761 Rob Corddry 3
-0.7787 Isla Fisher 2
-0.78113 Emily Browning 2
-0.78249 Channing Tatum 4
-0.78709 Marlon Wayans 2
-0.79442 Amy Adams 10
-0.79518 Edward Norton 3
-0.79572 Patrick Warburton 2
-0.80053 Gemma Arterton 1
-0.80891 Nicholas Elia 1
-0.81369 Ryan Reynolds 8
-0.81879 Chris Evans 7
-0.82393 Freddie Highmore 2
-0.82719 Moon Bloodgood 2
-0.83737 Cam Gigandet 6
-0.84704 Ashton Kutcher 3
-0.85929 Mark Wahlberg 8
-0.86703 Philip Seymour Hoffman 5
-0.87171 Brendan Fraser 5
-0.8803 Crispin Glover 2
-0.90228 Seth Rogen 9
-0.90251 Michael Fassbender 3
-0.90605 Ben Stiller 6
-0.907 Pink 1
-0.907 Carlos Alazraqui 1
-0.91934 Naomi Watts 5
-0.94107 Christina Ricci 3
-0.94334 Benicio Del Toro 1
-0.94334 Simon Merrells 1
-0.95701 Jason Statham 7
-0.98138 Emily Blunt 5
-0.98557 Morgan Freeman 6
-0.9859 Gary Oldman 4
-0.98804 Teresa Palmer 3
-0.99282 Samuel L. Jackson 6
-1.002 Anton Yelchin 3
-1.01902 Elijah Wood 2
-1.0212 James McAvoy 6
-1.03203 Elizabeth Banks 7
-1.0326 Carla Gugino 4
-1.03435 Robert De Niro 7
-1.03828 Catherine O’Hara 2
-1.0454 Al Pacino 4
-1.05981 Anthony Hopkins 4
-1.06493 Hugh Laurie 2
-1.06704 Robin Wright 3
-1.07278 Lauren Graham 1
-1.0907 Owen Wilson 10
-1.09614 Danny McBride 4
-1.09668 Jim Broadbent 1
-1.10827 Renée Zellweger 4
-1.10958 Matthew Macfadyen 2
-1.12118 Robin Williams 3
-1.12426 Raven-Symoné 1
-1.12426 Kym Whitley 1
-1.12426 Adam LeFevre 1
-1.13515 Rosario Dawson 4
-1.14922 Johnny Simmons 2
-1.15232 Sam Rockwell 3
-1.18263 Jessica Biel 4
-1.20545 Dennis Quaid 8
-1.21903 Jake Gyllenhaal 5
-1.242 Ray Winstone 2
-1.25532 Blake Lively 2
-1.25729 Elisabeth Harnois 1
-1.2839 Martin Lawrence 4
-1.28437 Clive Owen 4
-1.30597 Mark Strong 3
-1.31824 Donna Murphy 2
-1.32204 Denzel Washington 4
-1.32601 Max von Sydow 2
-1.38227 Seth Green 2
-1.38594 Mandy Moore 2
-1.44403 Peter Sarsgaard 3
-1.44474 Will Ferrell 7
-1.52446 John Travolta 6
-1.52673 Dan Fogler 4
-1.56454 Alfred Molina 3
-1.56534 Adam Sandler 7
-1.74681 Nicole Kidman 5
-2.09692 Russell Crowe 6
-2.20621 Olivia Wilde 4
-2.28186 Ron Perlman 4
-2.44066 Nicolas Cage 11
-2.7271 Daniel Craig 5


 

Don’t take my word for it

Inspiration

In June 2010, I attended a Wolfram|Alpha event called the London Computational Knowledge Summit, where speakers mostly focused on how computers can transform the way we teach and transmit knowledge. Several of the presentations made a lasting impression, above all the talk by Jon McLoone:

Jon’s point was that academic papers today look an awful lot like those of the 17th century. Granted, they’re not in Latin, they can be displayed online and there is color, but as far as maths is concerned it’s still long pages of difficult language and long formulas. The computer, however, can do so much more than transmit information. In the clip above (around 6’20”) Jon shows how a paper on edge detection can be much more effective if, instead of using a static example to demonstrate the technique, it uses a live example, such as input from the camera. In that talk and throughout the day, there were more examples of how interactive displays could be useful for teaching.

Teaching, telling stories and getting a message across use similar functions. Fast forward to VisWeek 2010 and the first “Telling Stories with Data” workshop. Some of the presentations there (I’m thinking of Nick Diakopoulos and Matthias Shapiro mostly) hinted that there could be a process through which readers/users/audience could be taken so they can make the most of an intended message. Interestingly, this process is not about transmitting as much data as effortlessly as possible but rather about engaging the audience and getting them to challenge their assumptions.

Those two events really made me pause and think. Ever since I had started working in visualization, all my efforts had been focused on being as clear as possible, and on efficient visuals. However, for some tasks, clarity just isn’t optimal. That wasn’t much of an issue in most of my OECD work, where such an approach makes a lot of sense, but I started seeing that there was a world of possibility when it comes to changing people’s perception of a subject or even persuading them.

Application

French pension reform

Right at the time of VisWeek 2010, France was plagued by strikes against the proposed pension reform. At the peak of the protests, up to 3m people demonstrated (that’s as many as one adult in 14). I was quite irritated by the protests. In theory, left and right had very comparable views on this problem and only disagreed on insignificant details. They both knew reform was unavoidable, and, again, had similar plans. But when those of the current government were implemented, the opposition capitalized on the discontent and attacked the plan vigorously. Their rhetoric was entirely verbal – no numbers were harmed in the making of their discourse! Consequently, protesters and a large part of the population started to develop notions about the state of pensions which were completely disconnected from reality.

I believe that if numbers had been used early enough, they would have provided a counterpoint to such fallacies, and while that may not have prevented demonstrations, it would have greatly helped to dampen their effect. With that in mind, and with official data, I tried to build a model to show what would happen if one changed this or that parameter of pension policy. Pension mechanics are quite simple: what you give on one side, you take on another; the evolution of the population is quite well known, so making such a model is pretty straightforward. But putting that in a visual application really showed how unsustainable the current situation was. In this application I challenge the user to find a solution – any solution – to the pension problem, by using the same levers as the policy makers. It turns out that there is just one viable possibility. Yet, letting people find that by themselves and challenge that idea as hard as they could was very different from patronizing them and telling them that this was just the way it is.
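The give-and-take mechanics described above can be sketched as a toy pay-as-you-go balance. This is only an illustration under loud assumptions: the function names, the parameter names and every number below are hypothetical, and none of it comes from the official data or from the application itself.

```javascript
// Toy pay-as-you-go balance: what workers pay in, minus what retirees get.
// Hypothetical illustration only; NOT the model or the data behind the application.
function pensionBalance(p) {
  var contributions = p.workers * p.averageWage * p.contributionRate; // paid in
  var pensions = p.retirees * p.averagePension;                       // paid out
  return contributions - pensions;
}

// Return a copy of the parameters with one "lever" changed,
// so the original scenario is left untouched.
function pullLever(params, lever, value) {
  var copy = {};
  Object.keys(params).forEach(function (key) { copy[key] = params[key]; });
  copy[lever] = value;
  return copy;
}

// Made-up baseline scenario (arbitrary units).
var baseline = {
  workers: 100,
  retirees: 40,
  averageWage: 1000,
  contributionRate: 0.2,
  averagePension: 600
};

var deficit = pensionBalance(baseline);                                      // negative: more goes out than comes in
var higherRate = pensionBalance(pullLever(baseline, "contributionRate", 0.24));
var fewerRetirees = pensionBalance(pullLever(baseline, "retirees", 30));
```

In an interactive display, each slider would simply call `pullLever` with a new value and redraw the balance, which is what lets users test their own solutions instead of being told the answer.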

Over the course of the year, I got involved on several occasions in situations like this, where data visualization could be used to influence people’s opinion, and each time I tried to use that approach. Instead of sending a top-down message (with or without data), confront the assumptions of the audience and get them to interact with a model. After this experience, their perception will have changed. This technique doesn’t try to bypass the viewers’ critical thinking, but instead leverages their intelligence.

In politics

I am very concerned with the use of data visualization in politics for many reasons. One of them is that I’m a public servant. In my experience, most decisions are not taken by politicians, but by experts or technicians who are committed to the public good. Yet, when poorly explained, these decisions can be misunderstood and attacked. Visualization, I believe, can help defend such decisions (those that are justifiable at least) and explain them better to a greater number.

Although a lot of data is available out there (or perhaps for that very reason), only a few people have a good grasp of the economic situation of their country. This just can’t be helped. It’s not possible to increase the percentage of people who can guesstimate the unemployment rate, and it’s not really important. Very few people need to know such a number; what is important is being able to use that information in context when it is useful. For instance, at election time, a voter should be able to know if the incumbent has created or destroyed jobs. This is something that data visualization can handle brilliantly.

Finally, my issue with political communication is that it is written by activists, for activists. It works well to motivate people with a certain sensitivity but it is not very effective at getting others to change sides. This is a bias which is difficult to detect for those in charge of political communications because, well, they’re activists too… and here this flavor of model-based data visualization, with its appearance of objectivity and neutrality, can complement the more verbal aspect of rhetoric quite well.

In the talk I used Al Gore’s An Inconvenient Truth as a counterexample. This movie is a fine example of storytelling, and it operates at an emotional rather than at a rational level. I trust that people who feel concerned about climate change will be reinforced in their beliefs after seeing the movie. However, those who did not remained unconvinced. In fact, the movie also gave a strong boost to climate skeptics. There was a real barrage of blog posts and websites attempting to debunk the assertions of that “truth”, most often with data. That was a missed opportunity: if the really well-made stories of the movie had been complemented with a climate model that people could experiment with, it would have been perceived as less monolithic, less Manichean, less dogmatic.

The conclusions

In my practice, using an interactive model can help a lot to get a message across (and no, I don’t have a rigorous evaluation for “a lot”, that’s the advantage of not being an academic).

Such models engage the users, they come out as more objective and truthful than static representations, and they can be very useful to address preconceptions. Chances are they’re more fun, too.

Then again, just because a model is interactive and built on transparent data and equations doesn’t mean it’s objective. It is usually possible to control the model or the interface so that one interpretation is more likely than the other, and that’s precisely the point if you are using data visualization to influence.

It can be very cheap and easy to turn a static representation into an interactive display. Every chart with more than 2 dimensions can be turned into a visualization where the user controls one dimension and sees the data for the others evolve.

And if you build a model like this, you must be very open and transparent about the data and the equations and sometimes find ways to get people to overcome their doubts.

Besides, having a working interactive model is no guarantee of success. You really have to be careful that your users are not likely to interpret your visualization in ways you never intended.

The presentation



All the examples I used in the presentation, both good and bad, both mine and others’, can be found at http://www.jeromecukier.net/data-stories/

 

More fun with arrays in protovis

In my short tutorial on working with data and protovis I briefly covered some standard javascript and protovis methods to work with arrays. The more I work with Protovis, the more I am convinced that efficient array manipulation is key to achieving just about anything with the library. So, I would like to go into more detail on some javascript methods for building, processing and testing arrays that can really be helpful.

Going through arrays: map and forEach

I said rapidly that the map method was very useful in protovis, especially when used in combination with pv.range. But it's not very fair to map() to treat it this lightly. Protovis canon examples do not use many traditional loops such as for or while statements. One reason for that is that many constructs in protovis are de facto loops: when we pass an array to protovis as data, to create a bar chart for instance (or panels, pie wedges, you name it), it will go through each element of the array to create individual bars (panels, wedges...), to position them, style them, and so on and so forth. This is why it is so important to have your data elements in the best possible shape when you first pass them to protovis. It makes the rest of your code much nicer. Remember our early example:
var categories = ["a","b","c","d","e","f"];
var vis = new pv.Panel()
  .width(150)
  .height(150);

vis.add(pv.Bar)
  .data([1, 1.2, 1.7, 1.5, .7, .3])
  .width(20)
  .height(function(d) d * 80)
  .bottom(0)
  .left(function() this.index * 25)
  .anchor("bottom").add(pv.Label)
    .text(function() categories[this.index]);

vis.render();
It is not ideal to have values and categories in two separate places, because one could be changed without updating the other. So let's try to use map to create one single variable.
var categories = ["a","b","c","d","e","f"];
var data=[1, 1.2, 1.7, 1.5, .7, .3].map(function(d,i) {return {value:d, key: categories[i]};});
Let's look at our map() method here. First, it comes right after an array. It runs against this array: it performs an operation on each element, and the result is another array of the same size, with the outcome of each operation in the same order.

The next thing here is a function with two arguments: d and i. Again the naming is arbitrary, call them what you want. And they are both optional. But it has to be a function: pv.range(3).map(3) will not return [3,3,3], you need to write pv.range(3).map(function() 3).

The first argument refers to the current item of the array map is working on. So 1, then 1.2, etc. If the array is more complex, and is an array of arrays or similar, the current element can itself be an array, an object or anything. It doesn't have to be a number. Here, we want to create an array of associative arrays where the value key holds the values of the array, and where key holds the category name. So we start our output with "{value: d,". This puts the value of each array element in sequence where we need it to be.

The second argument corresponds to the index of the current item in the array, so 0, 1, 2, etc. This is not unlike using "this.index" in other parts of protovis. It helps us get the right category name, the one in the same position as the value we are fetching. So we write "key: categories[i]}". The rest of the code can then be changed to:
vis.add(pv.Bar)
  .data(data)
  .width(20)
  .height(function(d) d.value * 80)
  .bottom(0)
  .left(function() this.index * 25)
  .anchor("bottom").add(pv.Label)
    .text(function(d) d.key);
vis.render();
Now how about forEach()? forEach works in a very similar way to map(), the difference is that it doesn't output an array. It's just a function that runs on each element of the array. It can be used to perform an operation a number of times corresponding to the length of that array, for instance.
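To see the difference between the two outside of protovis, here is a minimal sketch in standard javascript (written with full function bodies rather than the protovis shorthand). It reuses the values and categories from the example above; the total and result variables are only there for illustration.

```javascript
var categories = ["a", "b", "c", "d", "e", "f"];
var values = [1, 1.2, 1.7, 1.5, 0.7, 0.3];

// map() returns a new array of the same length as the original.
var data = values.map(function(d, i) {
  return {value: d, key: categories[i]};
});

// forEach() returns nothing (undefined); it is used for its side effects,
// here adding each value to a running total.
var total = 0;
var result = values.forEach(function(d) {
  total += d;
});

console.log(data.length);   // 6
console.log(data[2].key);   // "c"
console.log(result);        // undefined
```

In other words, use map() when you need the transformed array, and forEach() when you only need something to happen for each element.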

Testing arrays

There are some times when you need to know whether some or all the elements of your array fulfill a condition. And some other times, you need to be able to extract a subset of your array also on a conditional basis. Now, that would be possible using forEach or map methods as above, but fortunately javascript provides simpler means to achieve that.

Testing a condition on an array at once

There are two methods that do that: every() and some(). every() will return true if the condition is true for, well, every element of the array. some() will return true if the condition is true for at least one element of the array. So, they can also be used to tell if the condition is false for at least one element of the array, or all elements of the array respectively. This is how they work:
[0,1,2].every(function(d) {return d;})
// will return false: 0 is false, 1 is true and 2 is true.
[0,1,2].every(function(d) {return (d<3);})
// will return true. All elements are less than 3.

[0,1,2].some(function(d) {return d;})
// will return true. 1 is true, so at least one element in the array is true.

Creating conditional subsets of an array

It is also possible to get only the elements that fit a condition using the filter() method.
[0,1,2].filter(function(d) {return d;})
// this will return [1,2]. 0 is evaluated as "false".
[1,2,3,4,5].filter(function(d) {return (d>2);})
// this will return [3,4,5].
Of course, the more complex the array, the more interesting these functions get. With the barley array from part 4:
barley.filter(function(d) {return d.variety=="Manchuria";})
/* this will return: 
  [{"yield":27,"variety":"Manchuria","year":1931,"site":"University Farm"},
   {"yield":48.86667,"variety":"Manchuria","year":1931,"site":"Waseca"},
   {"yield":27.43334,"variety":"Manchuria","year":1931,"site":"Morris"},
   {"yield":39.93333,"variety":"Manchuria","year":1931,"site":"Crookston"},
   {"yield":32.96667,"variety":"Manchuria","year":1931,"site":"Grand Rapids"},
   {"yield":28.96667,"variety":"Manchuria","year":1931,"site":"Duluth"},
   {"yield":26.9,"variety":"Manchuria","year":1932,"site":"University Farm"},
   {"yield":33.46667,"variety":"Manchuria","year":1932,"site":"Waseca"},
   {"yield":34.36666,"variety":"Manchuria","year":1932,"site":"Morris"},
   {"yield":32.96667,"variety":"Manchuria","year":1932,"site":"Crookston"},
   {"yield":22.13333,"variety":"Manchuria","year":1932,"site":"Grand Rapids"},
   {"yield":22.56667,"variety":"Manchuria","year":1932,"site":"Duluth"}]*/

Visualizing arrays

(Without plotting them, of course.) When you are manipulating arrays, turning them into maps, performing roll-ups and sorts, you may want to get a glimpse of the array. But unless it's a simple, one-dimensional array, Firebug or the Chrome debugger will represent it as a cryptic [ > Object, > Object, > Object ]. Not being able to follow step by step what's happening to an array makes understanding the data functions much more difficult. Fortunately, you can use the JSON.stringify() method.
JSON.stringify(barley)
/*returns: 
"[
{"yield":27,"variety":"Manchuria","year":1931,"site":"University Farm"},
{"yield":48.86667,"variety":"Manchuria","year":1931,"site":"Waseca"},
{"yield":27.43334,"variety":"Manchuria","year":1931,"site":"Morris"},
{"yield":39.93333,"variety":"Manchuria","year":1931,"site":"Crookston"},
{"yield":32.96667,"variety":"Manchuria","year":1931,"site":"Grand Rapids"},
{"yield":28.96667,"variety":"Manchuria","year":1931,"site":"Duluth"},
{"yield":43.06666,"variety":"Glabron","year":1931,"site":"University Farm"},
{"yield":55.2,"variety":"Glabron","year":1931,"site":"Waseca"},
{"yield":28.76667,"variety":"Glabron","year":1931,"site":"Morris"},
{"yield":38.13333,"variety":"Glabron","year":1931,"site":"Crookston"},
{"yield":29.13333,"variety":"Glabron","year":1931,"site":"Grand Rapids"},
{"yield":29.66667,"variety":"Glabron","year":1931,"site":"Duluth"},
{"yield":35.13333,"variety":"Svansota","year":1931,"site":"University Farm"},
{"yield":47.33333,"variety":"Svansota","year":1931,"site":"Waseca"},
{"yield":25.76667,"variety":"Svansota","year":1931,"site":"Morris"},
{"yield":40.46667,"variety":"Svansota","year":1931,"site":"Crookston"},
{"yield":29.66667,"variety":"Svansota","year":1931,"site":"Grand Rapids"},
{"yield":25.7,"variety":"Svansota","year":1931,"site":"Duluth"},
{"yield":39.9,"variety":"Velvet","year":1931,"site":"University Farm"},
{"yield":50.23333,"variety":"Velvet","year":1931,"site":"Waseca"},
{"yield":26.13333,"variety":"Velvet","year":1931,"site":"Morris"},
{"yield":41.33333,"variety":"Velvet","year":1931,"site":"Crookston"},
{"yield":23.03333,"variety":"Velvet","year":1931,"site":"Grand Rapids"},
{"yield":26.3,"variety":"Velvet","year":1931,"site":"Duluth"},
{"yield":36.56666,"variety":"Trebi","year":1931,"site":"University Farm"},
{"yield":63.8333,"variety":"Trebi","year":1931,"site":"Waseca"},
{"yield":43.76667,"variety":"Trebi","year":1931,"site":"Morris"},
{"yield":46.93333,"variety":"Trebi","year":1931,"site":"Crookston"},
{"yield":29.76667,"variety":"Trebi","year":1931,"site":"Grand Rapids"},
{"yield":33.93333,"variety":"Trebi","year":1931,"site":"Duluth"},
{"yield":43.26667,"variety":"No. 457","year":1931,"site":"University Farm"},
{"yield":58.1,"variety":"No. 457","year":1931,"site":"Waseca"},
{"yield":28.7,"variety":"No. 457","year":1931,"site":"Morris"},
{"yield":45.66667,"variety":"No. 457","year":1931,"site":"Crookston"},
{"yield":32.16667,"variety":"No. 457","year":1931,"site":"Grand Rapids"},
{"yield":33.6,"variety":"No. 457","year":1931,"site":"Duluth"},
{"yield":36.6,"variety":"No. 462","year":1931,"site":"University Farm"},
{"yield":65.7667,"variety":"No. 462","year":1931,"site":"Waseca"},
{"yield":30.36667,"variety":"No. 462","year":1931,"site":"Morris"},
{"yield":48.56666,"variety":"No. 462","year":1931,"site":"Crookston"},
{"yield":24.93334,"variety":"No. 462","year":1931,"site":"Grand Rapids"},
{"yield":28.1,"variety":"No. 462","year":1931,"site":"Duluth"},
{"yield":32.76667,"variety":"Peatland","year":1931,"site":"University Farm"},
{"yield":48.56666,"variety":"Peatland","year":1931,"site":"Waseca"},
{"yield":29.86667,"variety":"Peatland","year":1931,"site":"Morris"},
{"yield":41.6,"variety":"Peatland","year":1931,"site":"Crookston"},
{"yield":34.7,"variety":"Peatland","year":1931,"site":"Grand Rapids"},
{"yield":32,"variety":"Peatland","year":1931,"site":"Duluth"},
{"yield":24.66667,"variety":"No. 475","year":1931,"site":"University Farm"},
{"yield":46.76667,"variety":"No. 475","year":1931,"site":"Waseca"},
{"yield":22.6,"variety":"No. 475","year":1931,"site":"Morris"},
{"yield":44.1,"variety":"No. 475","year":1931,"site":"Crookston"},
{"yield":19.7,"variety":"No. 475","year":1931,"site":"Grand Rapids"},
{"yield":33.06666,"variety":"No. 475","year":1931,"site":"Duluth"},
{"yield":39.3,"variety":"Wisconsin No. 38","year":1931,"site":"University Farm"},
{"yield":58.8,"variety":"Wisconsin No. 38","year":1931,"site":"Waseca"},
{"yield":29.46667,"variety":"Wisconsin No. 38","year":1931,"site":"Morris"},
{"yield":49.86667,"variety":"Wisconsin No. 38","year":1931,"site":"Crookston"},
{"yield":34.46667,"variety":"Wisconsin No. 38","year":1931,"site":"Grand Rapids"},
{"yield":31.6,"variety":"Wisconsin No. 38","year":1931,"site":"Duluth"},
{"yield":26.9,"variety":"Manchuria","year":1932,"site":"University Farm"},
{"yield":33.46667,"variety":"Manchuria","year":1932,"site":"Waseca"},
{"yield":34.36666,"variety":"Manchuria","year":1932,"site":"Morris"},
{"yield":32.96667,"variety":"Manchuria","year":1932,"site":"Crookston"},
{"yield":22.13333,"variety":"Manchuria","year":1932,"site":"Grand Rapids"},
{"yield":22.56667,"variety":"Manchuria","year":1932,"site":"Duluth"},
{"yield":36.8,"variety":"Glabron","year":1932,"site":"University Farm"},
{"yield":37.73333,"variety":"Glabron","year":1932,"site":"Waseca"},
{"yield":35.13333,"variety":"Glabron","year":1932,"site":"Morris"},
{"yield":26.16667,"variety":"Glabron","year":1932,"site":"Crookston"},
{"yield":14.43333,"variety":"Glabron","year":1932,"site":"Grand Rapids"},
{"yield":25.86667,"variety":"Glabron","year":1932,"site":"Duluth"},
{"yield":27.43334,"variety":"Svansota","year":1932,"site":"University Farm"},
{"yield":38.5,"variety":"Svansota","year":1932,"site":"Waseca"},
{"yield":35.03333,"variety":"Svansota","year":1932,"site":"Morris"},
{"yield":20.63333,"variety":"Svansota","year":1932,"site":"Crookston"},
{"yield":16.63333,"variety":"Svansota","year":1932,"site":"Grand Rapids"},
{"yield":22.23333,"variety":"Svansota","year":1932,"site":"Duluth"},
{"yield":26.8,"variety":"Velvet","year":1932,"site":"University Farm"},
{"yield":37.4,"variety":"Velvet","year":1932,"site":"Waseca"},
{"yield":38.83333,"variety":"Velvet","year":1932,"site":"Morris"},
{"yield":32.06666,"variety":"Velvet","year":1932,"site":"Crookston"},
{"yield":32.23333,"variety":"Velvet","year":1932,"site":"Grand Rapids"},
{"yield":22.46667,"variety":"Velvet","year":1932,"site":"Duluth"},
{"yield":29.06667,"variety":"Trebi","year":1932,"site":"University Farm"},
{"yield":49.2333,"variety":"Trebi","year":1932,"site":"Waseca"},
{"yield":46.63333,"variety":"Trebi","year":1932,"site":"Morris"},
{"yield":41.83333,"variety":"Trebi","year":1932,"site":"Crookston"},
{"yield":20.63333,"variety":"Trebi","year":1932,"site":"Grand Rapids"},
{"yield":30.6,"variety":"Trebi","year":1932,"site":"Duluth"},
{"yield":26.43334,"variety":"No. 457","year":1932,"site":"University Farm"},
{"yield":42.2,"variety":"No. 457","year":1932,"site":"Waseca"},
{"yield":43.53334,"variety":"No. 457","year":1932,"site":"Morris"},
{"yield":34.33333,"variety":"No. 457","year":1932,"site":"Crookston"},
{"yield":19.46667,"variety":"No. 457","year":1932,"site":"Grand Rapids"},
{"yield":22.7,"variety":"No. 457","year":1932,"site":"Duluth"},
{"yield":25.56667,"variety":"No. 462","year":1932,"site":"University Farm"},
{"yield":44.7,"variety":"No. 462","year":1932,"site":"Waseca"},
{"yield":47,"variety":"No. 462","year":1932,"site":"Morris"},
{"yield":30.53333,"variety":"No. 462","year":1932,"site":"Crookston"},
{"yield":19.9,"variety":"No. 462","year":1932,"site":"Grand Rapids"},
{"yield":22.5,"variety":"No. 462","year":1932,"site":"Duluth"},
{"yield":28.06667,"variety":"Peatland","year":1932,"site":"University Farm"},
{"yield":36.03333,"variety":"Peatland","year":1932,"site":"Waseca"},
{"yield":43.2,"variety":"Peatland","year":1932,"site":"Morris"},
{"yield":25.23333,"variety":"Peatland","year":1932,"site":"Crookston"},
{"yield":26.76667,"variety":"Peatland","year":1932,"site":"Grand Rapids"},
{"yield":31.36667,"variety":"Peatland","year":1932,"site":"Duluth"},
{"yield":30,"variety":"No. 475","year":1932,"site":"University Farm"},
{"yield":41.26667,"variety":"No. 475","year":1932,"site":"Waseca"},
{"yield":44.23333,"variety":"No. 475","year":1932,"site":"Morris"},
{"yield":32.13333,"variety":"No. 475","year":1932,"site":"Crookston"},
{"yield":15.23333,"variety":"No. 475","year":1932,"site":"Grand Rapids"},
{"yield":27.36667,"variety":"No. 475","year":1932,"site":"Duluth"},
{"yield":38,"variety":"Wisconsin No. 38","year":1932,"site":"University Farm"},
{"yield":58.16667,"variety":"Wisconsin No. 38","year":1932,"site":"Waseca"},
{"yield":47.16667,"variety":"Wisconsin No. 38","year":1932,"site":"Morris"},
{"yield":35.9,"variety":"Wisconsin No. 38","year":1932,"site":"Crookston"},
{"yield":20.66667,"variety":"Wisconsin No. 38","year":1932,"site":"Grand Rapids"},
{"yield":29.33333,"variety":"Wisconsin No. 38","year":1932,"site":"Duluth"}
]"*/
No matter what manipulations you inflict on an array, you will always be able to make it reveal its innards by using this.
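As a small complement, here is a sketch in standard javascript showing that JSON.stringify() also accepts an optional third argument, the number of spaces to indent with, which makes nested arrays easier to read. The two-row sample array is just an excerpt of barley used for illustration.

```javascript
var sample = [
  {"yield": 27, "variety": "Manchuria", "year": 1931, "site": "University Farm"},
  {"yield": 48.86667, "variety": "Manchuria", "year": 1931, "site": "Waseca"}
];

// Compact form: everything on a single line.
var compact = JSON.stringify(sample);

// Indented form: the third argument is the number of spaces per level.
var pretty = JSON.stringify(sample, null, 2);

console.log(compact);
console.log(pretty);
```

The compact form is handy for a quick check in the console; the indented form is easier on the eyes when the array contains nested arrays or objects.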
 

Protovis: analysis of the Map projections example

What is a map?

Before we start looking at the code, it may be a good idea to think of the best way to represent a country.
Countries are areas of land surrounded by borders, which are imaginary (or sometimes physical) lines going through a set of points.

Some countries are made of one such surface, but many countries are not one contiguous territory (they may include islands for instance), so they can be made out of several disjoint polygons.

Now let’s put on our protovis hat. Let’s suppose we want to draw a map where each country could be colored differently (a choropleth). What kind of data structure should we use to represent that?
First there should be a sort of array of countries. Each country should be an item in that array, so they can be indexed and assigned an individual color and various data points.
Then, at the lowest level, we would be drawing polygons, which are treated as pv.Line in protovis. For each polygon, we would require an array of coordinate pairs. To draw a country, we would need a list (array) of those polygons.

So the data structure we are looking at is:

var world=[  // an array of countries
    [ // an array of polygons
        [ // an array of pairs of coordinates
            [x0, y0], // coordinates of the first point
            [x1, y1], // coordinates of the next one
                ... 
            [xn, yn],
            [x0, y0]  // coordinates of the first point to close the polygon
        ]
       ...              // another polygon, but maybe not.
   ], 
   [                    // next country
  ...
   ]
...
]
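To make this more concrete, here is a minimal sketch of such a structure in plain javascript. The coordinates are made up for illustration (two hypothetical countries, the first one made of a mainland and an island), and countPoints is a hypothetical helper that is not part of protovis.

```javascript
var world = [
  [ // first country: two polygons
    [[0, 0], [4, 0], [4, 3], [0, 3], [0, 0]],   // mainland
    [[6, 1], [7, 1], [7, 2], [6, 1]]            // island
  ],
  [ // second country: one polygon
    [[10, 0], [12, 0], [11, 2], [10, 0]]
  ]
];

// Hypothetical helper: counts the coordinate pairs of one country
// by going through each of its polygons.
function countPoints(country) {
  var n = 0;
  country.forEach(function(polygon) {
    n += polygon.length;
  });
  return n;
}

console.log(world.length);           // 2 countries
console.log(world[0].length);        // 2 polygons in the first country
console.log(world[0][1][0]);         // [6, 1], first point of the island
console.log(countPoints(world[0]));  // 9
```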

The map projections example

It can be found here: http://vis.stanford.edu/protovis/ex/projection.html

/*
 * A diverging color scale, using previously-computed quantiles of population
 * densities; in the future, we might use a quantile scale here to do this
 * automatically. Map colors based on www.ColorBrewer.org, by Cynthia A. Brewer,
 * Penn State.
 */
var fill = pv.Scale.linear()
    .domain(140, 650, 1900)
    .range("#91bfdb", "#ffffbf", "#fc8d59");

/* Precompute the country's population density and color. */
countries.forEach(function(c) {
  c.color = stats[c.code].area
      ? fill(stats[c.code].pop / stats[c.code].area)
      : "#ccc"; // unknown
});

var w = 860,
    h = 3 / 5 * w,
    geo = pv.Geo.scale("hammer").range(w, h);

var vis = new pv.Panel()
    .width(w)
    .height(h);

/* Countries. */
vis.add(pv.Panel)
    .data(countries)
  .add(pv.Panel)
    .data(function(c) c.borders)
  .add(pv.Line)
    .data(function(b) b)
    .left(geo.x)
    .top(geo.y)
    .title(function(d, b, c) c.name)
    .fillStyle(function(d, b, c) c.color)
    .strokeStyle(function() this.fillStyle().darker())
    .lineWidth(1)
    .antialias(false);

/* Latitude ticks. */
vis.add(pv.Panel)
    .data(geo.ticks.lat())
  .add(pv.Line)
    .data(function(b) b)
    .left(geo.x)
    .top(geo.y)
    .strokeStyle("rgba(128,128,128,.3)")
    .lineWidth(1)
    .interpolate("cardinal")
    .antialias(false);

/* Longitude ticks. */
vis.add(pv.Panel)
    .data(geo.ticks.lng())
  .add(pv.Line)
    .data(function(b) b)
    .left(geo.x)
    .top(geo.y)
    .strokeStyle("rgba(128,128,128,.3)")
    .lineWidth(1)
    .interpolate("cardinal")
    .antialias(false);

vis.render();

In addition, the example relies on two data structures of the following shape:
First, stats, which is an associative array of associative arrays, and which associates each 2-letter country code with values of population and area:

var stats = {
'AG': {pop:83039, area:44},
'DZ': {pop:32854159, area:238174},
...
'US': {pop:299846449, area:915896},
...
};

Then, countries, which is an array of associative arrays.

var countries = [
{code:'AG', name:"Antigua and Barbuda", 
borders:[ // an array of one or several areas, 
  [ // an array of coordinates, 
    [ // a pair of the form longitude, latitude
       ...
    ]
  ]
]}
...
]

Now this second data structure looks a lot like the one we’ve drafted in the prologue. All the geographic information is tucked in a property called “borders”. Each element of the array has other properties (code and name) for convenience.
Because the data is put in the right shape and order, this script can produce a very good map with a remarkable economy of code.
This example has been put together to showcase the various map projections of protovis (identity, mercator, and so on.). These projections have zero impact on the way data should be assembled for making maps, so we’ll just treat them as “magic”.

/*
 * A diverging color scale, using previously-computed quantiles of population
 * densities; in the future, we might use a quantile scale here to do this
 * automatically. Map colors based on www.ColorBrewer.org, by Cynthia A. Brewer,
 * Penn State.
 */
var fill = pv.Scale.linear()
    .domain(140, 650, 1900)
    .range("#91bfdb", "#ffffbf", "#fc8d59");

This part creates a color scale which will return a color according to the value passed to it. The color returned will be somewhere between the ones specified in the range, depending on where the value is relative to the values specified in the domain. So a value of 140 will result in a color of #91bfdb (bluish), it will go towards #ffffbf (pale yellow) as the value moves up to 650, and towards #fc8d59 (reddish) as the value goes up to 1900.
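To get a feel for what such a scale computes, here is a rough sketch in plain javascript. It is not the protovis implementation: it assumes a simple piecewise linear interpolation of the red, green and blue channels, and hexToRgb, rgbToHex and makeScale are hypothetical helper names. Values outside the domain are clamped here, which is also an assumption of the sketch.

```javascript
// Hypothetical helpers, not part of protovis.
function hexToRgb(hex) {
  return [1, 3, 5].map(function(i) {
    return parseInt(hex.slice(i, i + 2), 16);
  });
}

function rgbToHex(rgb) {
  return "#" + rgb.map(function(v) {
    var s = Math.round(v).toString(16);
    return s.length < 2 ? "0" + s : s;
  }).join("");
}

// Builds a function that maps a number to a color.
function makeScale(domain, range) {
  var colors = range.map(hexToRgb);
  return function(x) {
    // Clamp to the domain (an assumption of this sketch).
    x = Math.max(domain[0], Math.min(domain[domain.length - 1], x));
    // Find the segment of the domain that contains x.
    var k = 0;
    while (k < domain.length - 2 && x > domain[k + 1]) k++;
    // Position of x within that segment, between 0 and 1.
    var t = (x - domain[k]) / (domain[k + 1] - domain[k]);
    return rgbToHex(colors[k].map(function(c, j) {
      return c + t * (colors[k + 1][j] - c);
    }));
  };
}

var fillSketch = makeScale([140, 650, 1900], ["#91bfdb", "#ffffbf", "#fc8d59"]);

console.log(fillSketch(140));   // "#91bfdb"
console.log(fillSketch(650));   // "#ffffbf"
console.log(fillSketch(1900));  // "#fc8d59"
```

Any value between two domain points gets a color in between the two corresponding range colors, which is what makes the scale "diverge" around the middle value.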

/* Precompute the country's population density and color. */
countries.forEach(function(c) {
  c.color = stats[c.code].area
      ? fill(stats[c.code].pop / stats[c.code].area)
      : "#ccc"; // unknown
});

As the comment says, this will precompute the country’s color once and for all.
The forEach() method goes through every element of the countries array.
The c.color = statement will add a color key to each element of that array (which, as you may recall, already has values for the code, name and borders keys).
What it does is that it retrieves the country code of that element of countries, c.code, and uses that to find out whether we have an area value for that country code (this is the stats[c.code].area ? part).
If this is the case, we compute the color that should be attributed to the country, by passing the population divided by the area to the color scale we just made. Otherwise, we just use light grey.
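The same pattern can be tried in plain javascript without protovis. In this sketch, the 'AG' and 'DZ' entries are the ones quoted above, 'XX' is a made-up entry with no area to exercise the fallback, and toyFill is a hypothetical stand-in for the real color scale.

```javascript
var stats = {
  'AG': {pop: 83039, area: 44},
  'DZ': {pop: 32854159, area: 238174},
  'XX': {pop: 1000, area: 0}       // made-up entry: unknown area
};

var countries = [
  {code: 'AG', name: "Antigua and Barbuda"},
  {code: 'DZ', name: "Algeria"},
  {code: 'XX', name: "Nowhere"}    // made-up country
];

// Hypothetical stand-in for the color scale: two colors only.
function toyFill(density) {
  return density > 650 ? "#fc8d59" : "#91bfdb";
}

// Add a color key to each country, once and for all.
countries.forEach(function(c) {
  c.color = stats[c.code].area
      ? toyFill(stats[c.code].pop / stats[c.code].area)
      : "#ccc"; // unknown: an area of 0 is evaluated as false
});

countries.forEach(function(c) {
  console.log(c.name + ": " + c.color);
});
```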

The next few lines are standard constants that will shape the vis.
Note, however:

geo = pv.Geo.scale("hammer").range(w, h)

This is a geographic scale, which will be used to convert longitudes and latitudes to X and Y coordinates on the screen.
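To illustrate what converting longitudes and latitudes to screen coordinates means, here is a minimal sketch in plain javascript. It deliberately uses the much simpler equirectangular (plate carrée) projection rather than the Hammer projection of the example, and makeGeoSketch is a hypothetical helper, not the protovis pv.Geo.scale API. The {lng, lat} shape of the points is also an assumption made for readability.

```javascript
// Hypothetical helper: returns x and y functions for a w by h panel,
// using an equirectangular projection (not the Hammer projection).
function makeGeoSketch(w, h) {
  return {
    // Longitude -180..180 is mapped linearly to 0..w.
    x: function(p) { return (p.lng + 180) / 360 * w; },
    // Latitude 90..-90 is mapped linearly to 0..h (top to bottom).
    y: function(p) { return (90 - p.lat) / 180 * h; }
  };
}

var w = 860,
    h = 3 / 5 * w,
    geoSketch = makeGeoSketch(w, h);

var origin = {lng: 0, lat: 0};
console.log(geoSketch.x(origin), geoSketch.y(origin));  // center of the panel
```

A real projection such as Hammer uses more involved formulas, but the contract is the same: a pair of geographic coordinates goes in, a position in pixels comes out.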

/* Countries. */
vis.add(pv.Panel)
    .data(countries)
  .add(pv.Panel)
    .data(function(c) c.borders)
  .add(pv.Line)
    .data(function(b) b)
    .left(geo.x)
    .top(geo.y)
    .title(function(d, b, c) c.name)
    .fillStyle(function(d, b, c) c.color)
    .strokeStyle(function() this.fillStyle().darker())
    .lineWidth(1)
    .antialias(false);

This is where it all happens.
First, we create a series of panels, one for each country. So, we pass the countries array as data.
Then, we are going to create another series of panels for every country, that is, with as many panels as there are independent areas in the country. For instance, if there are islands, we are going to need extra panels to represent them. If the country is one contiguous mass of land, there will be just one panel here.
This time, we use function(c) c.borders as data. That is, we go into the borders array.

Finally, we are going to create a filled polygon for each of these independent areas. This is achieved by adding a pv.Line to the previous panels. Likewise, we use (function(b) b) as data, meaning that we go yet another level into the borders array. Now, we are accessing the pairs of longitude + latitude numbers.

geo.x and geo.y convert this pair of numbers to X and Y coordinates on the screen.
For the next two lines, title and fillStyle, we need to go back to the country level.
So, we use a function of the form function(d,b,c): d is the current item (a pair of longitude and latitude), b its parent (an individual area) and c its grandparent (the country).
So, function(d,b,c) c.name retrieves the country name, and function(d,b,c) c.color retrieves the color we had computed for that country to begin with.
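The way d, b and c relate to each other can be mimicked with nested loops in plain javascript. This is only a sketch of the idea, not how protovis is implemented; the two-country array and its coordinates are made up for illustration.

```javascript
var countries = [
  {name: "Mainlandia", color: "#91bfdb", borders: [
    [[0, 0], [4, 0], [4, 3], [0, 0]]
  ]},
  {name: "Archipelagia", color: "#fc8d59", borders: [
    [[6, 1], [7, 1], [7, 2], [6, 1]],
    [[8, 4], [9, 4], [9, 5], [8, 4]]
  ]}
];

var visited = [];

countries.forEach(function(c) {     // c: the country (grandparent)
  c.borders.forEach(function(b) {   // b: one area of that country (parent)
    b.forEach(function(d) {         // d: one pair of coordinates (current item)
      // At this level, the country is still reachable through c.
      visited.push({point: d, area: b, title: c.name, fill: c.color});
    });
  });
});

console.log(visited.length);    // 12 points in total
console.log(visited[4].title);  // "Archipelagia"
```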

For the color of the border, we wish to use a darker version of the fill color. This is what this.fillStyle().darker() does.

The rest of the vis is longitude and latitude ticks, using the built-in properties of the scale.

 

Working with data in protovis – part 1 of 5

When I started using protovis I had only a very basic knowledge of javascript, which in theory isn’t a problem as protovis is meant to be learned by example, and as it has its own logic and structure which is different from typical javascript code. So I started by looking at and modifying examples, which was enough to do basic stuff.
But I soon felt limited by what hid behind a single property: data. I knew that protovis had lots of features to manipulate and process data but they were not obvious from the examples.

I mean,

var vis = new pv.Panel()
.width(150)
.height(150);

vis.add(pv.Bar)
.data([1, 1.2, 1.7, 1.5, .7, .3])
.width(20)
.height(function(d) d * 80)

vis.render();

Here, it’s pretty obvious that the bars represent the values 1, 1.2, 1.7, 1.5, 0.7 and 0.3 respectively. One can infer that the bars are 20 pixels wide and 80 times their value tall.

But protovis doesn’t usually look like this “hello world” kind of example, but rather like this:

/* Compute yield medians by site and by variety. */
function median(data) pv.median(data, function(d) d.yield);
var site = pv.nest(barley).key(function(d) d.site).rollup(median);
var variety = pv.nest(barley).key(function(d) d.variety).rollup(median);
/* Nest yields data by site then year. */
barley = pv.nest(barley)
    .key(function(d) d.site)
    .sortKeys(function(a, b) site[b] - site[a])
    .key(function(d) d.year)
    .sortValues(function(a, b) variety[b.variety] - variety[a.variety])
    .entries();
[. . .]
/* A panel per site-year. */
var cell = vis.add(pv.Panel)
    .data(barley)
    .height(h)
    .top(function() this.index * h)
    .strokeStyle("#999");

What just happened? pv.nest, key, rollup, sortKeys, entries – what could that do?

To go beyond merely touching up examples, and do your own visualizations from scratch, it is important to get a good grip on how to feed protovis with data. In order to do so, you need a few javascript notions.

Arrays, arrays, how do they work?

In javascript, an array is an ordered list of stuff.

In our initial example, we had one such list:

[1, 1.2, 1.7, 1.5, .7, .3]

Anything can be put in an array: numbers, strings, Booleans (true/false values), objects … including other arrays. The elements of an array don’t all have to be of the same type. Arrays can be assigned to a variable.

var a = [1, 1.2, 1.7, 1.5, .7, .3];

Elements of the array can be accessed using the [] notation. In javascript, indices start at 0, so the first element of an array can be obtained so:

a[0];

This returns 1. Javascript has many functions to create and manipulate arrays, which we will talk about later. For the time being, let’s look at arrays of arrays. If we wrote instead:

var a = [[1, 1.2], [1.7, 1.5], [.7, .3]];

a is now an array of arrays, or “multi-dimensional array”.

a[0] is now worth [1, 1.2]. To access the first number of the array, one has to write a[0][0], which will return the first element (1) of the first element ([1, 1.2]) of a.

Javascript also has another kind of structure, often called an associative array (in javascript terms, an object), where values are assigned to keys instead of an index. For instance,

var a = {yield: 27.00000, variety: "Manchuria", year: 1931, site: "University Farm"};

is an associative array. To access a value, one can use a . operator:

a.yield

will return 27.

a["yield"]

also works.

Like other variable types, it is possible to have an array of associative arrays. In fact, this is used quite often in protovis.
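Here is a short recap of these notions as a sketch in standard javascript. The variable names are only for illustration, and the second record of rows is another barley-style record added to show an array of associative arrays.

```javascript
// A simple array: elements are accessed by index, starting at 0.
var flat = [1, 1.2, 1.7, 1.5, 0.7, 0.3];
console.log(flat[0]);        // 1

// An array of arrays: one index per level.
var nested = [[1, 1.2], [1.7, 1.5], [0.7, 0.3]];
console.log(nested[0]);      // [1, 1.2]
console.log(nested[0][0]);   // 1

// An associative array (object): values are accessed by key.
var row = {yield: 27, variety: "Manchuria", year: 1931, site: "University Farm"};
console.log(row.yield);      // 27
console.log(row["yield"]);   // 27

// An array of associative arrays, as often used in protovis.
var rows = [
  row,
  {yield: 48.86667, variety: "Manchuria", year: 1931, site: "Waseca"}
];
console.log(rows[1].site);   // "Waseca"
```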

Protovis and arrays – deconstructing the first example

The reason why I introduced javascript arrays is that the data property requires an array. Protovis then loops through that array, performing operations on each of its elements. To that end, it uses things such as accessor functions and properties of an object called this.

To explain all of this let’s go back to the first example and analyse it line by line.

var vis = new pv.Panel()
  .width(150)
  .height(150);
vis.add(pv.Bar)
  .data([1, 1.2, 1.7, 1.5, .7, .3])
  .width(20)
  .bottom(0)
  .height(function(d) d * 80)
  .left(function() this.index * 25);
vis.render();

The first 3 lines create a panel, which is like the sheet of paper on which protovis will draw the chart. Its width and height properties must be filled, as they are 0 by default which would make the whole visualization invisible.

The next line adds a bar chart to this panel we’ve just created.

The line after specifies the data on which to work: here comes our array. Here, we have written the array literally in the data property, but nothing prevents us from assigning it to a variable first and passing the variable instead.

The next line, and the line with the bottom property, assign constant numbers to these properties. It means that all the bars will have a width of 20 pixels, and they will all be aligned with the bottom of the panel – that’s what

bottom(0)

does.

Now let’s look at the two remaining lines:

.height(function(d) d * 80)
.left(function() this.index * 25);

The first line uses an accessor function. What this does is that it looks at the current element and performs an operation on it, the result of which will be the height of that element.

In proper javascript, we would have written:

function(d) {return d*80;}

but protovis uses a shorthand notation that allows us to omit curly braces and the return statement. By the way, d in the function is completely arbitrary, and could be any variable name –

function(a) a*80

also works. It’s just that the name of the variable between parentheses will represent the value of the current element.
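In standard javascript, with the curly braces and the return statement, the same point can be checked with a quick sketch: the name of the argument does not change the result. The heights are computed here with map() only for illustration; protovis does this looping for us.

```javascript
var values = [1, 1.2, 1.7, 1.5, 0.7, 0.3];

// Same function, two different argument names.
var withD = values.map(function(d) { return d * 80; });
var withA = values.map(function(a) { return a * 80; });

console.log(withD[0]);   // 80
console.log(JSON.stringify(withD) === JSON.stringify(withA));   // true
```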

The second line uses the this object. this represents what protovis is working on at the moment, and it has properties that can be used. The most commonly used is index: this.index returns the position of the current element in its array, so it is going to be: 0 for the first bar, 1 for the next one, etc.

So this line specifies that each new bar should start every 25 pixels from the left border of the panel.

You may wonder, why not write

.left(this.index * 25);

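and omit the function()? Well, function() means that the content of the property gets re-evaluated for each bar. If we had omitted it, this.index * 25 would have been computed only once, before protovis even starts going through the data (at that point this does not refer to a bar, so the result would not even be meaningful), and that single value would have been used for all the bars.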
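The difference between passing a value and passing a function can be reproduced with a small sketch in plain javascript. computeProperty is a hypothetical helper that only imitates the idea, it is not how protovis is written: a constant is reused as is, while a function is called again for each element, with this.index set to the position of that element.

```javascript
// Hypothetical helper imitating a protovis property.
function computeProperty(data, prop) {
  return data.map(function(d, i) {
    if (typeof prop === "function") {
      // Re-evaluated for every element, with "this" exposing the index.
      return prop.call({index: i}, d);
    }
    // A constant: computed once by the caller, reused for all elements.
    return prop;
  });
}

var data = [1, 1.2, 1.7, 1.5, 0.7, 0.3];

var constantLeft = computeProperty(data, 0 * 25);
var functionLeft = computeProperty(data, function() { return this.index * 25; });

console.log(constantLeft);   // [0, 0, 0, 0, 0, 0]
console.log(functionLeft);   // [0, 25, 50, 75, 100, 125]
```

With the constant, all the bars would be drawn on top of each other; with the function, each bar gets its own position.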

By the way, instead of writing the height property as it is, we could have written:

.height(function()[1, 1.2, 1.7, 1.5, .7, .3][this.index] * 80)

Using an accessor function is shorter and clearer.

Next: Multi-dimensional arrays, inheritance and hierarchy

 

Misleading with road statistics

Changing driving behaviors with campaigning alone is a tall order, but it is literally a life-or-death matter. Road fatalities range from about 40 per million in Japan to about 6 times as many in Russia. Fortunately, the numbers tend to decrease in most places, due to better equipment, better roads, harsher punishment and safer behaviors.

Of all these factors, driver behavior is the only one that isn’t directly controlled by governments, so it’s no surprise that it’s what the agencies try to target. Almost every angle has been tried: blaming alcohol, blaming speed, showing the consequences of seemingly innocuous oversights, and, obviously, gore and shocking images.

This year, in France, they’ve tried a different approach with a campaign called the 12000: thanking the drivers for their better behavior, which has saved, well, 12000 lives since 2003.

I really appreciate the upbeat tone of the campaign and its much-welcome positive spin. Unfortunately, it’s based on such a fallacy that it’s difficult to accept as such.

road1

Here’s one view of what has happened. The number of fatalities has dropped since 2003. (By the way, the unit for this and the following chart is fatalities per million population, indexed so that the value for France in 2003 is 100.) It can be argued that lives have been saved, because if the number of fatalities had remained constant since 2003, the area in green would represent extra fatalities (around 6,000).

But that’s what the agency wants us to believe.

road21

Says the website,

12000 lives have been saved between 2003 and 2008. Fatalities have dropped from 6126 in 2003 to 4275 in 2008.

To actually come up with that number of 12000 people saved, they’ve simply multiplied the difference between the 2008 and 2003 figures by 6, as if the whole drop had happened suddenly in 2003.
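To see how much this choice matters, here is a minimal sketch in JavaScript that uses only the two figures quoted above. The year-by-year figures are not given in this post, so the second estimate assumes a straight-line decline between 2003 and 2008; that is an assumption made for illustration, not the actual data.

```javascript
// Figures quoted from the campaign website.
const fatalities2003 = 6126;
const fatalities2008 = 4275;
const years = 6; // 2003 to 2008 inclusive

// Campaign-style estimate: the full 2003-2008 difference counted every year.
const suddenDropEstimate = (fatalities2003 - fatalities2008) * years;

// Alternative estimate: ASSUME a straight-line decline and add up, year by
// year, the gap between the 2003 level and the assumed figure for that year.
let gradualEstimate = 0;
for (let i = 0; i < years; i++) {
  const assumedFatalities =
    fatalities2003 - (fatalities2003 - fatalities2008) * i / (years - 1);
  gradualEstimate += fatalities2003 - assumedFatalities;
}
gradualEstimate = Math.round(gradualEstimate);

console.log(suddenDropEstimate); // 11106
console.log(gradualEstimate);    // 5553
```

Under the straight-line assumption the estimate lands near the 6,000 of the green area, about half of what the multiplication by 6 gives.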

I wonder why they do that. Behaviors have changed on the road. 75% of French drivers have a perfect driving record, and another 15% have only committed minor offences. Those are facts. So why inflate the numbers? And why, for instance, start at 2003 and not 2002, when mortality dropped by over 20%? 12000, as an absolute figure, is not more striking than 1000 or 100000.

The visuals all repeat this figure. On all the posters of the campaign, we find the following footnote: “* If behaviors had not changed since 2002 in France, 12000 more people would have died on the road between 2003 and 2008. Source: ONISR.” The ONISR says no such thing in its report, so that number must have been invented for the campaign.

Speaking of the ONISR reports, they estimate that if people observed speed, alcohol and seat belt legislation, the numbers would drop by over 2000. So are we doing that well?

road3

That’s a comparison with the UK. Again, the units are ratios per population, not absolute figures. If France had had the same road fatality rate as the UK, over 10,000 people would not have been killed over the 2003-2008 period…

Anyway. There’s no good reason why all western countries couldn’t go under 50 killed / million population within a reasonably short time frame.

 

Using data visualization to disinform

Two weeks ago I was at the DD4D conference, conveniently located at my workplace. I will write some more on DD4D; meanwhile you can see this post on infosthetics by Petra and Marian. One of the things that struck me at DD4D was that several talks were about data visualization either for advocacy or for education purposes. One speaker said that data visualization could be used to protect people against those who use numbers to mislead and disinform. Yesterday, I saw this typical example of such manipulation, which brings to mind the famous Disraeli quote.
disinform

This is a poster for restaurants to display. Yesterday, VAT for restaurants in France was cut from 19.6% to 5.5%. This is the result of over 10 years of lobbying. Initially, restaurants asked for a VAT drop and committed to cut their listed prices accordingly. That cut in price would have attracted more consumers, eventually generating more profit and possibly more tax money. That would have been a win-win-win situation for the restaurant industry, the consumer and the state.

But eventually, the changes that restaurants agreed to make to their price structure are as follows. They would cut the listed price of up to 10 menu items by 11.8% to “reflect the tax drop”. In exchange, they are allowed to display this poster, on which the chart ominously promises a massive price drop.

In reality, 11.8% is just enough to pass the VAT drop on to the customer, since 100*(1 - 1.055/1.196) is approximately 11.8%, but it only applies to the discounted items, at most 10 per menu.

Fast-food chains only have to drop some of their prices by 5% to get the poster.

The poster claims: “a cut in VAT is a cut in prices!”. But what really happens? For most items, the listed price (incl. tax) is unchanged, which means that the price excluding tax rises by approximately 13.4%, or 100*(1.196/1.055 - 1). For the discounted items, the price excluding tax stays about the same in regular restaurants, and it still rises by 7.7% in fast-food chains.
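These percentages can be checked with a few lines of JavaScript. This is only a sketch: the function name exTaxChange is made up for illustration, and the two rates are the 19.6% and 5.5% quoted above.

```javascript
const OLD_VAT = 0.196; // rate before the cut
const NEW_VAT = 0.055; // rate after the cut

// Percentage change in the price excluding tax when the listed price
// (including tax) is cut by listedCutPercent at the same time as the VAT drops.
function exTaxChange(listedCutPercent) {
  const listedRatio = 1 - listedCutPercent / 100;
  const exTaxRatio = listedRatio * (1 + OLD_VAT) / (1 + NEW_VAT);
  return (exTaxRatio - 1) * 100;
}

console.log(exTaxChange(0).toFixed(1)); // 13.4 (listed price unchanged)
console.log(exTaxChange(5).toFixed(1)); // 7.7 (fast-food chains, 5% cut)
```

Any other listed-price cut can be checked the same way by passing it to the function.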

Is this what was implied by the chart?

In the past two weeks, I have collected more examples of shameless lies backed by seemingly official numbers and charts, and will continue to collect them.

 

New data services 2: Wolfram|alpha

In March this year, überscientist Stephen Wolfram, of Mathematica fame, revealed to the world that he was working on something new, something big, something different. The first time I heard of this was through semantic web prophet Nova Spivack, who is not known to get excited by less-than-revolutionary projects. That, plus the fact that the project was announced so shortly before its release, helped build anticipation to huge levels.

wolframalpha

Wolfram|alpha describes itself as a “computational knowledge engine” or, simply put, as an “answer engine”. Like google and other search engines, it tries to provide information based on a query. But while search engines simply try to find the keywords of the query in their indexed pages, the answer engine tries to understand the query as a question and form an educated answer. In a sense, this is similar to the freebase project, which aims to put all the knowledge of the world in a database where links can be established across items.

It attempts to detect the nature of each word of the query. Is that a city? A mathematical formula? Foodstuff? An economic variable? Once it understands the terms of the query, it gives the user all the data it can to answer.

Here for instance:

wolframalpha-2

Using the same find / access / process / present / share diagram as before:

Wolfram|alpha’s got “find” covered. More about that below.

It lets you access the data. If data have been used to produce a chart, then there is a query that will retrieve those bare numbers in a table format.

Process is perhaps Wolfram|Alpha’s forte. It will internally reformulate and cook your query to produce all meaningful outputs in its capacity.

The presentation is excellent. It is very legible, consistent across the site, efficient and unpretentious. When charts are provided, which is often, they are small but both relevant and informative: only the necessary data are plotted. This is unusual enough to be worth mentioning.

Wolfram|alpha doesn’t allow people to share its outputs per se, but since a given query will produce consistent results, users can simply exchange queries or communicate links to a successful query result.

Now back to finding data.

When a user submits a query, the engine does not query external sources of data in real time. Rather, it uses its internal, freebase-like database. This, in turn, is updated from external sources when possible.

For each query, sources are available. Unfortunately, the data sources provided are for the general categories. For instance, the listed sources are the same for all country-related information: some are accurate and dependable (national or international statistical offices), others are less reliable or verifiable (such as the CIA World Factbook, or what’s cited as “Wolfram|Alpha curated data, 2009”). To me that’s the big flaw of this otherwise impressive system.

Granted, coverage is not perfect, but that can only improve. Syntax is not always intuitive, and making some results appear in a particular way can be very elusive, but this too will get better over time. Being able to verify the data presented is another matter: either it is possible or it is not. I’m really looking forward to this.