Tableau 2012 politics contest – justification and making-of

What led me to those choices

I was technically happy with my entry for the sports contest. I had done what I wanted: obtain a hard-to-find, interesting dataset and attempt to create an exotic, hard-to-make, never-tableau’d-before shape with aesthetic appeal and insights.

Yet the rules stated that entries would be judged on storytelling. While there were interesting insights, they didn’t constitute a story: a structured narration with a beginning and an end. Having worked on that subject on occasion, I think there is an inherent contradiction between a dashboard tool that lets a user freely manipulate a bunch of data and an articulated story in which the user is led through a process.

So that’s what drove most of the work.

The second idea was that there is an unspoken but, IMO, unnecessary rule that Tableau dashboards should be compact, highly interactive and interconnected. First, the elephant in the room: Tableau Public is slow. It’s too slow. So too many interactions do not make for a pleasant experience. Second, it is true that in Tableau one can assemble a dashboard out of interconnected worksheets, where clicking on one makes things happen in another. But just because you can doesn’t mean you should. Remember the <BLINK> element in the web pages of the 1990s? And it is this interconnectivity that forces dashboards to be compact and fit above the fold: if clicking on one element causes changes in another, you’d better be able to see both, even on a laptop screen.

So the second idea was to create a long dashboard instead, where the user would be held by the hand as she is taken from point A to point B. Along the way there would be text and images to explain what is going on, and data shown in worksheets that are not necessarily interconnected, have little interactivity, can be understood at first sight, and can stand some manipulation but don’t need it.

When visualization and storytelling intersect, there is one form that I like: start with a preconception and let the user discover, through manipulation, that this idea is wrong. So I tried to use that in the dashboard as well.

The subject

That’s actually the #1 issue in French politics right now: which strategy should the main right-wing party adopt? Typically, during a presidential campaign, both large parties fight for the votes of the center and are less radical than usual. But during this campaign the UMP, the party of former president Nicolas Sarkozy, steered hard to the right in an attempt to win back the voters who had gone to the far right.

Apparently that strategy was successful: even though he lost the presidential race, he managed to close some of the gap with his rival.

Yet there are those who argue that, had the party been more moderate, it would have been more successful and might have won.

Anyway. The presidential race is over. But now the party is deciding which way to go next by electing its next leader.

Fortunately, there is data that can be used to determine whether the far-right or the moderate strategy would be more fruitful. That is what this viz is about.

Making the viz

Tableau dashboards can go up to 4000px in height, so that’s what I shot for.

So let’s say it loud and clear: it’s hell to manipulate large dashboards in Tableau, even with a very powerful computer. When you add a new worksheet, its legend and quick filter are added wherever there is room, which could be thousands of pixels away. Since you can’t drag an element across screens, you may have to proceed in baby steps. Once there is a certain number of elements, be they text, blanks or very simple and stable worksheets, adding another element takes a very long time, and so does moving them around.

As usual, fixed size is your only friend: fixed heights, fixed widths, and alternating horizontal and vertical layout containers.

So up to the last 2 worksheets there is really nothing to write home about. Only this: when you interact with the published workbook on the web, it is painfully slow, as the dashboard is reloaded and recomputed in its entirety. While this is OK for most of the worksheets, for the most complex one (the one with many sliders) it’s just unacceptable, because the sheet won’t have time to be redrawn between two interactions.

So I came up with an idea: create a secondary dashboard with just that sheet, publish it independently, and then add a web page object to the main dashboard that points to that other dashboard. In effect, there is a dashboard within a dashboard, so when there are interactions in the complex worksheet, the secondary, smaller dashboard is the only one that is reloaded and recomputed, which is noticeably faster. Still not faster as in fast, but usable.

Now, publishing aspects aside, this worksheet is interesting. The idea is to update a model based on 19 criteria. For every record, the outcome depends on the “closeness” between the answers of the record and those of the candidates. The 19 parameters control the position of one of the candidates: Nicolas Sarkozy. So what I’ve done is calculate, outside of Tableau, the “distance” between each record and each of the other 8 candidates, and in the data file I’ve specified the minimal distance and the name of the corresponding candidate. Then, in Tableau, I compute in real time the distance between the record and the parameters. If that distance is lower than the threshold in the data file, Sarkozy is deemed to be the closest; otherwise it is the candidate from the data file. The worksheet tallies up the number of records that are closest to each candidate. Also, in order to keep the parameters legible, I have constrained them to 9 values, when they really represent numbers between -2 and 2.
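The logic of that worksheet can be sketched outside Tableau. This is only an illustration under stated assumptions: the distance metric (sum of squared differences), the field names, the placeholder candidate names, the evenly spaced slider values and the 3 criteria used in the sample (instead of 19) are all hypothetical, not the actual workbook.

```python
from collections import Counter

# Assumption: 9 evenly spaced slider values between -2 and 2.
SLIDER_VALUES = [-2 + 0.5 * i for i in range(9)]

def squared_distance(answers, position):
    """Sum of squared differences across criteria (an assumed metric)."""
    return sum((a - p) ** 2 for a, p in zip(answers, position))

def closest_candidate(record, movable_position, movable_name="Sarkozy"):
    """Pick the closest candidate for one record.

    record["answers"]      : the respondent's answers, one per criterion
    record["min_distance"] : precomputed distance to the nearest fixed candidate
    record["nearest"]      : name of that nearest fixed candidate
    """
    d = squared_distance(record["answers"], movable_position)
    # The movable candidate wins only if strictly closer than the stored minimum.
    return movable_name if d < record["min_distance"] else record["nearest"]

def tally(records, movable_position):
    """Count how many records are closest to each candidate."""
    return Counter(closest_candidate(r, movable_position) for r in records)

# Hypothetical sample data with 3 criteria instead of 19.
records = [
    {"answers": [1.0, -1.0, 0.5], "min_distance": 0.5, "nearest": "Candidate A"},
    {"answers": [-2.0, 2.0, 0.0], "min_distance": 1.0, "nearest": "Candidate B"},
]
result = tally(records, [1.0, -1.0, 0.0])
```

Precomputing the minimum over the 8 fixed candidates means each slider change costs one distance computation per record rather than nine.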

Also, for the record, I have made a French and an English version. Why? Because I hope to get the French version published by a media outlet and weigh in on the debate, while I need the English version for the contest. This raises a lot of issues: all the worksheets need to exist in 2 versions, and many variables need to be duplicated as well. As a side note, the marks concerning a candidate are colored in Tableau blue/orange in the English version in order to highlight candidate Sarkozy, but according to the campaign colors of the candidates in the French version.

That’s about it. I hope you enjoy my viz!

 

Tableau 2012 sports visualization contest entry

So here goes for the 2012 Tableau contest entry.
Before I go on explaining what this is about and how it’s been made, I’d like to express my thanks to Opta for letting me use this awesome dataset. They have been very responsive and supportive.

So what is this?

So the assignment was to create a visualization about sports, and who says sports says football. There have been a number of epic football games in recent history, and the Real Madrid / APOEL Nicosia match-up last April is definitely one of them. It’s that game where APOEL defender Paulo Jorge, after losing 2 teeth in an unfortunate collision with a teammate, decided to pull out a third one that was hanging loose. And although Real ended up winning by a large margin, 5-2, few teams were able to score 1 goal against 2012’s Real Madrid, let alone 2.

So I chose to represent the game as a network.

The circles on the right (Real Madrid) are much larger, which means that Real players were involved in more events in the game than their APOEL counterparts. The dataset I could work from had all sorts of events recorded, such as passes or shots. The lines linking the left side also seem darker. You can also tell that lines between attackers (lower part of the screen) are much thinner than those that involve midfielders, who are really building the game, or defenders. Lines are colored by the team that initiated the action. So if a line in the color of one team ends at a player of the other team, it means the ball has been stolen. So you can explore how defenders were able to stop individual attackers, duels among midfielders, etc.

This being a contest, I’ve thrown in a little subtlety with the possibility of switching the form of the display from a more abstract circular network diagram to a second representation where players are positioned according to their average physical position during the game. For this second representation, I’ve chosen to show teams side by side, and not one on top of the other, so as to keep the chart legible.

How is this done?

First let me explain what I attempt to do in a contest.

I try to do three things. First, work on interesting, non-trivial data. Second, aim for virtuosity in the execution: try to come up with a difficult technique and hopefully pull it off nicely. Finally, link the two together: make the visualization relevant, so it supports the interestingness of the data, without being too busy or complex.

So we talked about the data. Again, I must say it’s been really comfortable working with a professional dataset. The football statistics that reach mass audiences are usually very aggregated, so it’s really nice to have access to this level of detail. And by the way, I only used the tip of the iceberg of that data.

Now the technique.

Some time ago I wrote about how I regretted that Tableau doesn’t have a network graph module. I had a fair idea of how to hack one, though.

As in a previous post on treemaps, we are using the line mark with aptly chosen coordinates. Note that since the lines I am drawing rely on two records, we won’t be needing the path shelf this time.

There are two kinds of “lines” here: those that act as nodes, and those that act as links.

Nodes are where the player circles are, and are also used to support labels. Those are given a size depending on the number of events each player is involved in. That number has been precomputed and is made available with the data.

To explain how links work: I have 3 sheets in my data. One is on players, and provides the location of each player in each mode, their names, and other such details. One is on points, and lists all the begin- and end-points of all the links, each point being linked to a record in the player sheet. The final sheet is on lines, where each record has a start and an end point.
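To make the relationship between the three sheets concrete, here is a minimal sketch of how a line resolves to coordinates. All field names and sample values are hypothetical; the real sheets carry more columns.

```python
# Player sheet: one record per player (or special location), with its position.
players = {
    "P1": {"name": "Player one", "x": 0.0, "y": 1.0},
    "P2": {"name": "Player two", "x": 1.0, "y": 0.0},
}

# Point sheet: every begin- or end-point refers to a player record.
points = {
    "pt1": {"player": "P1"},
    "pt2": {"player": "P2"},
}

# Line sheet: each link has a start point and an end point.
lines = [{"line": "L1", "start": "pt1", "end": "pt2", "type": "pass"}]

def line_segments(lines, points, players):
    """Resolve each line to the coordinates of the players at its two ends."""
    segments = []
    for line in lines:
        a = players[points[line["start"]]["player"]]
        b = players[points[line["end"]]["player"]]
        segments.append(((a["x"], a["y"]), (b["x"], b["y"])))
    return segments

segments = line_segments(lines, points, players)
```

Because lines only reference points and points only reference players, moving a player in the player sheet moves every link attached to it.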

So essentially, all the positioning happens in the player sheet. Each player and each special location there is assigned two positions, one around the circle diagram and one in the other form. Both are calculated before the data is supplied to Tableau. In Tableau using a parameter we can simulate a transition from one form to another – the intermediary states are even legible 🙂
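The parameter-driven transition can be illustrated as a simple blend between the two precomputed positions. This is a sketch that assumes linear interpolation and a parameter `t` running from 0 to 1; the actual calculated field in the workbook may differ.

```python
def blend(circle_xy, pitch_xy, t):
    """Blend two positions linearly.

    t = 0 gives the circle-diagram position, t = 1 the average on-pitch
    position, and values in between give the intermediary states.
    """
    (cx, cy), (px, py) = circle_xy, pitch_xy
    return (cx + t * (px - cx), cy + t * (py - cy))

# Hypothetical coordinates for one player in each form.
halfway = blend((0.0, 1.0), (4.0, 3.0), 0.5)
```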

For the tooltips, I have prepared different scenarios depending on the type of line being hovered over (passes, shots, goals, players…) so that one formula can generate a useful and relevant text.

Finally, I used a little trick around transparency. Here it is absolutely necessary to use transparency, because otherwise the networks become very, very busy and difficult to read. However, I did want the goal lines to stand out: I wanted them to be opaque. Problem: there is no transparency shelf. You can only set transparency for all the marks at once. So each goal link is really several links stacked one on top of the other, so that the resulting line looks opaque.
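How many copies does a goal link need? A rough estimate, assuming marks are drawn with standard "over" alpha compositing (an assumption about the renderer, not something Tableau documents here): n stacked copies at opacity a give a combined opacity of 1 - (1 - a)^n.

```python
def stacked_opacity(alpha, copies):
    """Combined opacity of identical marks drawn on top of each other."""
    return 1 - (1 - alpha) ** copies

def copies_needed(alpha, target=0.99):
    """Smallest number of stacked copies whose combined opacity reaches target."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    n = 1
    while stacked_opacity(alpha, n) < target:
        n += 1
    return n

# With a hypothetical mark opacity of 30%, this many copies look ~99% opaque.
n_copies = copies_needed(0.3)
```

The lower the global opacity, the more duplicates a goal link needs, so the data file grows with the number of goals rather than the number of links.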

And finally the relevance.

I didn’t want to pack in too much data at once. Sure, there is much more data than in a traditional game report in the sports press, but not to the point that it’s overwhelming. I tried to make it possible to see the outcome of the game immediately by counting the thicker lines going out of the circle: 2 to 5.

Instead of packing the dashboard with tightly fitted panels, I have used just this one, with a single control to let the user change the representation.

So, I hope you all like it!

 

Hollywood + data II: the sequel

A couple of days ago I posted the contribution of relative keywords to the earnings of a given movie.

Well, it turns out that the cast of a movie is much easier to obtain than keywords, and much less messy. So I had also scraped the 4 lead actors for each movie of the Beautiful Information awards in order to determine their contribution.

I’ve done it slightly differently than with the keywords: I’ve also taken the budget of the movie into account. So, in order to predict the earnings of a movie, you take the budget, multiply it by about 3, subtract roughly $8.7 million, and then add (or subtract) the contribution of each star.
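As a worked example, the formula can be written out with the coefficients listed in the data at the end of this post. The function name, the sample budget and the choice to count actors missing from the table as zero are assumptions made for the illustration.

```python
# Coefficients from the data table in this post (all in million dollars).
BASE = -8.72132
BUDGET_COEF = 3.028754
CONTRIBUTION = {
    "Emma Stone": 0.519192,
    "Kirsten Dunst": 0.215264,
    "Daniel Radcliffe": 6.820566,
}

def predicted_earnings(budget_musd, cast):
    """Predicted earnings in million dollars for a given budget and lead cast.

    Assumption: actors missing from the table contribute 0.
    """
    stars = sum(CONTRIBUTION.get(actor, 0.0) for actor in cast)
    return BASE + BUDGET_COEF * budget_musd + stars

# A hypothetical $50M movie starring Emma Stone and Kirsten Dunst.
estimate = predicted_earnings(50, ["Emma Stone", "Kirsten Dunst"])
```

For that hypothetical movie the estimate comes to about $143.5 million, of which the two stars account for roughly $0.73 million.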

Since the movie budget already (mostly 🙂 ) includes the pay of the lead actors, the way to read this contribution is as how much extra these actors should be paid when they appear in a film. For instance, each time Emma Stone does her thing, it would be fair to pay her $500k more.

the bang being a modest 500k pay raise.

Likewise, Kirsten Dunst here could be paid an extra 200k per movie, like Rebecca Hall or Rooney Mara. But I mostly felt like posting a picture of Kirsten to stand up against what Google autosuggests as keywords when you look her up.

Kirsten, too, should be paid more.

Nicolas Cage, somewhat unsurprisingly, should refund $2.44m for each movie he stars in.

Actors in movies that have done exceedingly well, such as Avatar or the Harry Potter or Twilight sagas (movies which, it’s fair to say, owe their success to more than the actors), come out of this overvalued. And because of how this is calculated, actors who have costarred with them in less ambitious movies come out undervalued: for instance, Elizabeth Banks and Anton Yelchin, who both played alongside Sam Worthington, are “paying” for the fact that Man on a Ledge and Terminator Salvation haven’t been as successful as Avatar.

Twilight fans take note, though, that Harry Potter actors are considered more valuable.

Now – the data.

Base value: -8.72132 (million dollars)
Budget multiplier: 3.028754 x your budget

Columns below: contribution (million dollars), actor, occurrences
6.820566 Daniel Radcliffe 4
6.820566 Rupert Grint 4
6.589512 Emma Watson 5
6.469722 Michelle Rodriguez 4
5.919662 Zoe Saldana 3
5.869411 Sam Worthington 4
5.39418 Sigourney Weaver 4
4.66799 Robert Pattinson 6
4.486437 Kristen Stewart 7
4.131046 Taylor Lautner 4
3.511325 Michael Gambon 2
3.328213 Shia LaBeouf 8
3.091938 Josh Duhamel 4
2.771696 Tyrese Gibson 4
2.430676 Justin Bartha 3
2.177533 Anne Hathaway 8
2.145379 Ed Helms 3
1.985782 Helena Bonham Carter 3
1.922826 Ray Romano 1
1.922826 Denis Leary 1
1.922826 Eunice Cho 1
1.860584 Bradley Cooper 7
1.830633 Zach Galifianakis 6
1.809581 John Leguizamo 3
1.770726 Geoffrey Rush 3
1.722315 Christina Jastrzembska 1
1.681332 Rosie Huntington-Whiteley 1
1.654913 Ellen Page 4
1.521321 Xavier Samuel 1
1.495329 Antonio Banderas 3
1.487379 Brendan Gleeson 2
1.427076 Tim Allen 2
1.40105 Mia Wasikowska 2
1.391394 Heath Ledger 1
1.374587 Mike Myers 3
1.366225 Sandra Bullock 4
1.363958 Ned Beatty 2
1.339647 Johnny Depp 8
1.305249 Stellan Skarsgard 2
1.260785 Michael Caine 2
1.217668 Jae Head 2
1.169856 Tom Hanks 4
1.167732 Jason Lee 2
1.167732 David Cross 2
1.167722 Jason Segel 6
1.161184 Amanda Seyfried 6
1.134598 Freida Pinto 2
1.083914 Ken Watanabe 1
1.081577 Katie Featherston 2
1.081577 Micah Sloat 2
1.072914 Paul Walker 2
1.072914 Jordana Brewster 2
1.07084 Jason Bateman 10
1.055046 Aaron Eckhart 5
1.031855 Joseph Gordon-Levitt 5
0.990123 Megan Fox 4
0.9881 Gerard Butler 9
0.973714 Julie Andrews 2
0.95887 Rose Byrne 4
0.953622 Dan Castellaneta 1
0.953622 Julie Kavner 1
0.953622 Nancy Cartwright 1
0.953622 Yeardley Smith 1
0.946414 Gil Birmingham 1
0.921095 Pierce Brosnan 3
0.888618 Vin Diesel 3
0.869224 George Lopez 2
0.865954 Cameron Diaz 8
0.858122 Jackie Chan 4
0.857169 Tobin Bell 4
0.857169 Costas Mandylor 4
0.827224 Lena Headey 1
0.809226 Mila Kunis 6
0.775858 Vincent Cassel 2
0.768506 Russell Brand 4
0.758058 Wenwen Han 1
0.75382 Justin Timberlake 5
0.746798 Kim Cattrall 2
0.746798 Cynthia Nixon 2
0.746798 Kristin Davis 2
0.738043 Taraji P. Henson 3
0.737633 Katy Perry 1
0.737633 Jonathan Winters 1
0.73718 Karen Allen 1
0.719066 Neil Patrick Harris 3
0.706771 Diane Kruger 3
0.697058 Will Smith 3
0.691195 Meryl Streep 6
0.68017 Karen Disher 1
0.677442 Jesse Eisenberg 6
0.676622 Dominic West 3
0.661994 Zac Efron 3
0.661289 Lucas Grabeel 1
0.659597 Quinton Aaron 1
0.646669 Justin Long 5
0.645497 Edward Asner 1
0.645497 Jordan Nagai 1
0.645497 John Ratzenberger 1
0.644743 Eddie Murphy 6
0.644017 Hank Azaria 2
0.633971 Derek Jacobi 1
0.621914 Craig T. Nelson 1
0.599633 Tim McGraw 2
0.59531 Chloe Csengery 1
0.59531 Jessica Tyler Brown 1
0.59531 Christopher Nicholas Smith 1
0.59531 Lauren Bittner 1
0.586817 Scott Patterson 2
0.585204 John Lithgow 2
0.562875 Christopher Plummer 3
0.562458 Dustin Hoffman 4
0.557976 Terry Crews 2
0.557014 Jaden Smith 2
0.55002 Brad Garrett 1
0.55002 Lou Romano 1
0.55002 Ian Holm 1
0.545008 Mark Fredrichs 1
0.545008 Amber Armstrong 1
0.542971 Adam G. Sevani 2
0.542553 Ian McShane 4
0.536569 Molly Ephraim 1
0.536569 David Bierend 1
0.528423 Natalie Portman 8
0.519192 Emma Stone 6
0.517328 Leonardo DiCaprio 4
0.508157 Dwayne Johnson 6
0.506053 Liam Neeson 5
0.504138 Betsy Russell 3
0.498758 Winona Ryder 2
0.498382 Lucy Punch 1
0.489967 Katherine Heigl 5
0.489965 Saurabh Shukla 1
0.489965 Anil Kapoor 1
0.486999 Famke Janssen 1
0.486999 Leland Orser 1
0.482205 Javier Bardem 3
0.481332 Ross Bagdasarian Jr. 1
0.481332 Janice Karman 1
0.460322 Maya Rudolph 2
0.45842 Kristen Wiig 4
0.455105 Bryce Dallas Howard 2
0.449823 Cary Elwes 2
0.443563 Hailee Steinfeld 1
0.442303 Ashley Tisdale 2
0.436495 Ali Larter 2
0.434599 Salli Richardson-Whitfield 1
0.424407 Sarah Clarke 1
0.423429 Penélope Cruz 3
0.413782 Thandie Newton 3
0.412704 Patton Oswalt 3
0.401531 Octavia Spencer 1
0.401112 James Franco 4
0.385048 Terence Stamp 1
0.377733 Wentworth Miller 1
0.377733 Kim Coates 1
0.375668 Jason Flemyng 1
0.374325 Matt Damon 7
0.374325 Christian Bale 5
0.372746 David James 1
0.372746 Jason Cope 1
0.372746 Nathalie Boltt 1
0.368065 Chiwetel Ejiofor 3
0.368005 Édgar Ramirez 1
0.368005 Julia Stiles 1
0.367801 Mélanie Laurent 2
0.362202 Vera Farmiga 3
0.356965 Johnny Knoxville 1
0.356965 Steve-O 1
0.356965 Bam Margera 1
0.356965 Ryan Dunn 1
0.356953 Bill Nighy 6
0.356736 Elle Fanning 1
0.356736 Amanda Michalka 1
0.356736 Kyle Chandler 1
0.356736 Joel Courtney 1
0.356158 Sarah Jessica Parker 4
0.355846 Maggie Grace 2
0.353032 Louis Ferreira 1
0.351145 Ralph Fiennes 3
0.343755 Keir O’Donnell 1
0.343755 Jayma Mays 1
0.343755 Raini Rodriguez 1
0.339988 P.J. Byrne 1
0.33941 Christopher Mintz-Plasse 3
0.338666 Anna Kendrick 2
0.338404 Eli Roth 1
0.338222 Morgan Lily 1
0.338222 Trenton Rogers 1
0.313245 Jessica Lucas 1
0.313245 Lizzy Caplan 1
0.312825 Cassie Ventura 1
0.307954 Viola Davis 4
0.30272 Ty Simpkins 1
0.30272 Lin Shaye 1
0.298112 Bree Turner 1
0.298112 Eric Winter 1
0.296847 Andy Serkis 2
0.294687 Mike Vogel 2
0.294687 T.J. Miller 2
0.292065 Kurt Fuller 1
0.286915 Maggie Smith 1
0.286915 Ashley Jensen 1
0.284681 Seth Meyers 1
0.280837 Tobey Maguire 2
0.280773 Jason Sudeikis 2
0.269573 Kyra Sedgwick 1
0.269573 Madison Pettis 1
0.269573 Roselyn Sanchez 1
0.268595 Lisa Kudrow 2
0.258727 Patrick Dempsey 2
0.257049 David Wenham 3
0.254023 Briana Evigan 2
0.252523 Matthew Perry 1
0.252523 Thomas Lennon 1
0.246058 John Cusack 3
0.240009 Michael Jackson 1
0.240009 Alex Al 1
0.240009 Alexandra Apjarova 1
0.240009 Nick Bass 1
0.238714 Patricia Clarkson 3
0.235798 Brian Kerwin 1
0.230146 Sharni Vinson 1
0.230146 Rick Malambri 1
0.230146 Alyson Stoner 1
0.229488 Topher Grace 3
0.22835 Meagan Good 2
0.224193 Jaime King 2
0.221369 Zachary Gordon 2
0.221369 Robert Capron 2
0.221369 Rachael Harris 2
0.219378 Alice Braga 4
0.219044 Rooney Mara 2
0.215264 Kirsten Dunst 2
0.20931 Rebecca Hall 4
0.20615 Orlando Bloom 1
0.205058 Patrick Fabian 1
0.205058 Ashley Bell 1
0.205058 Iris Bahr 1
0.205058 Louis Herthum 1
0.20394 Michael Cera 6
0.202696 Woody Harrelson 4
0.202054 Aaron Yoo 2
0.195742 Jensen Ackles 1
0.195742 Kerr Smith 1
0.195742 Betsy Rue 1
0.192528 Michael Mantell 1
0.190413 David Morse 1
0.190413 Carrie-Anne Moss 1
0.189054 Piper Perabo 1
0.189054 Manolo Cardona 1
0.187252 Charlie Day 2
0.183573 Will Arnett 3
0.182781 Charlie Tahan 2
0.181449 Kate Bosworth 1
0.175751 Tommy Lee Jones 2
0.173441 Scott Speedman 1
0.173441 Gemma Ward 1
0.173441 Alex Fisher 1
0.170702 Mary Steenburgen 3
0.169352 Amanda Bynes 1
0.169352 Dan Byrd 1
0.165945 Chloë Grace Moretz 2
0.164513 Dougray Scott 1
0.163693 Rachel McAdams 5
0.157617 Harry Connick Jr. 2
0.156691 Emily Osment 1
0.156691 Billy Ray Cyrus 1
0.156691 Jason Earles 1
0.154703 Matt Lanter 3
0.153793 Amanda Peet 3
0.152353 Jenna Elfman 1
0.141247 January Jones 1
0.141247 Aidan Quinn 1
0.14017 Nika Futterman 1
0.14017 Tom Kane 1
0.14017 Ashley Eckstein 1
0.140131 Henry Thomas 1
0.140122 Jack Nicholson 1
0.140122 Sean Hayes 1
0.140122 Beverly Todd 1
0.138843 Chris Rock 3
0.138691 Jennifer Aniston 6
0.136256 Kevin Spacey 3
0.13405 Emma Bell 1
0.13405 Arlen Escarpeta 1
0.13405 Miles Fisher 1
0.130048 André Benjamin 1
0.130048 Maura Tierney 1
0.128801 Kathy Bates 3
0.128594 Anita Briem 2
0.127092 Ayelet Zurer 1
0.12695 Geoffrey Arend 1
0.124268 An Nguyen 1
0.123344 Danielle Panabaker 2
0.122693 Jared Padalecki 1
0.122693 Amanda Righetti 1
0.122693 Derek Mears 1
0.122554 Chandler Canterbury 1
0.122554 Lara Robinson 1
0.120986 Jon Hamm 1
0.120861 Ne-Yo 1
0.117797 Bruce McGill 1
0.117466 Nick Zano 1
0.117466 Krista Allen 1
0.117466 Andrew Fiscella 1
0.117466 Bobby Campo 1
0.114774 Kirk Cameron 1
0.114774 Erin Bethea 1
0.114774 Ken Bevel 1
0.114774 Stephen Dervan 1
0.113006 Donald Faison 2
0.111429 Chris Messina 2
0.110646 Devon Bostick 1
0.109958 Paul Rudd 7
0.10794 Dolph Lundgren 1
0.107433 Liam Hemsworth 1
0.107433 Bobby Coleman 1
0.105983 Christopher Evan Welch 1
0.105053 Bill Hader 2
0.101652 Dev Patel 2
0.100744 Melissa Leo 3
0.099702 Jerry O’Connell 2
0.09952 Jennifer Coolidge 1
0.09952 Adam Campbell 1
0.099046 Caroline Dhavernas 1
0.099046 Bokeem Woodbine 1
0.099046 Logan Marshall-Green 1
0.097749 Nick Bacon 1
0.096496 Penn Badgley 2
0.095386 Amber Valletta 2
0.09417 Eric Balfour 1
0.09417 Scottie Thompson 1
0.09417 Brittany Daniel 1
0.093839 Robert Hoffman 2
0.091974 Gary Cole 1
0.091644 Robert Knepper 2
0.090532 Bebe Neuwirth 1
0.090532 Megan Mullally 1
0.090532 Kay Panabaker 1
0.090298 Candice Bergen 1
0.090298 Bryan Greenberg 1
0.089631 Joshua Jackson 1
0.089631 Rachael Taylor 1
0.089631 James Kyson 1
0.089631 Megumi Okina 1
0.088905 Angela Lansbury 1
0.088905 Ophelia Lovibond 1
0.088475 Julian McMahon 1
0.088475 Shyann McClure 1
0.087343 Rowan Atkinson 1
0.087343 Roger Barclay 1
0.084964 Bryan Cranston 1
0.084964 Albert Brooks 1
0.081511 Alessandro Nivola 1
0.081511 Parker Posey 1
0.081511 Rade Serbedzija 1
0.080952 Kal Penn 3
0.077884 Jenn Proske 1
0.077884 Diedrich Bader 1
0.077884 Chris Riggi 1
0.07668 Marcia Gay Harden 2
0.074429 Andrew Garfield 2
0.073885 Rhys Wakefield 1
0.073885 Allison Cratchley 1
0.073885 Christopher Baker 1
0.073885 Richard Roxburgh 1
0.073624 Michael O’Keefe 1
0.07336 Daeg Faerch 1
0.073189 Kevin Kline 2
0.071229 Eugenio Derbez 1
0.071229 Kate del Castillo 1
0.071229 Adrian Alonso 1
0.071229 Maya Zapata 1
0.070675 Sarah Roemer 3
0.067179 Nathan Fillion 1
0.067179 Jeremy Sisto 1
0.066353 Scout Taylor-Compton 2
0.066353 Malcolm McDowell 2
0.066353 Tyler Mane 2
0.065523 Steve Zahn 2
0.065485 Gio Perez 1
0.065485 Joel Garland 1
0.063306 Mark Rolston 1
0.06132 Jude Law 3
0.0578 David Schwimmer 1
0.0578 Jada Pinkett Smith 1
0.057107 Jake T. Austin 1
0.056938 Khalid Abdalla 1
0.056938 Ahmad Khan Mahmoodzada 1
0.056938 Atossa Leoni 1
0.056938 Shaun Toub 1
0.05549 José Luis Garcia Pérez 1
0.05549 Robert Paterson 1
0.05549 Stephen Tobolowsky 1
0.054836 Jeremy Renner 2
0.050369 Malik Yoba 1
0.048526 Warren Christie 1
0.048526 Ryan Robbins 1
0.048526 Ali Liebert 1
0.048526 Lloyd Owen 1
0.048392 Nick Sullivan 1
0.047911 Richard Jenkins 5
0.046161 Annette Bening 3
0.045566 Carol Burnett 2
0.044954 Tom Mison 1
0.043956 Walter Raney 1
0.043917 Reiko Aylesworth 1
0.043917 Steven Pasquale 1
0.043917 John Ortiz 1
0.043917 Johnny Lewis 1
0.043044 Michael Moore 1
0.043044 Tucker Albrizzi 1
0.043044 Tony Benn 1
0.043044 George W. Bush 1
0.041406 Catherine Zeta-Jones 1
0.04094 Elisabeth Shue 1
0.038995 Garrett M. Brown 1
0.038936 Jon Voight 2
0.037779 Alfre Woodard 1
0.037779 Sanaa Lathan 1
0.037779 Rockmond Dunbar 1
0.036524 Joseph Ruskin 1
0.036472 Kate Mara 1
0.036472 Sean Bott 1
0.036463 Cillian Murphy 1
0.036463 Shyloh Oostwald 1
0.035315 Jeffrey Wright 2
0.034615 Haaz Sleiman 1
0.034615 Danai Gurira 1
0.034615 Hiam Abbass 1
0.033809 Sylvester Stallone 2
0.033294 Lauren German 1
0.033294 Heather Matarazzo 1
0.033294 Bijou Phillips 1
0.033294 Roger Bart 1
0.033228 Bill Maher 1
0.033228 Tal Bachman 1
0.033228 Jonathan Boulden 1
0.033228 Steve Burg 1
0.03308 Emilie de Ravin 1
0.03308 Caitlyn Rund 1
0.03308 Moisés Acevedo 1
0.032157 Jeremy Piven 2
0.031753 Jennifer Hudson 1
0.031753 Alicia Keys 1
0.0313 Sacha Baron Cohen 1
0.0313 Gustaf Hammarsten 1
0.0313 Clifford Bañagale 1
0.0313 Chibundu Orukwowu 1
0.029753 Tyler Perry 3
0.029491 Saffron Burrows 1
0.029491 Daniel Mays 1
0.029397 Goran Visnjic 1
0.028255 Odette Annable 2
0.025992 Brittany Snow 1
0.025992 Jessica Stroup 1
0.025992 Dana Davis 1
0.025977 Kyle Gallner 2
0.024707 Thomas Jane 1
0.024707 Laurie Holden 1
0.024707 Andre Braugher 1
0.024175 Kelsey Grammer 2
0.021951 Kat Dennings 2
0.021924 Will Patton 1
0.021924 Charlotte Milchard 1
0.021893 Nicholas D’Agosto 2
0.021888 John Hawkes 1
0.021888 Garret Dillahunt 1
0.021888 Isaiah Stone 1
0.021701 Paul Dano 1
0.021701 Martin Stringer 1
0.020605 Rafi Gavron 1
0.020451 Paul Schneider 1
0.02036 Jennifer Carpenter 1
0.02036 Jay Hernandez 1
0.01921 Alan Rickman 1
0.01921 Timothy Spall 1
0.018948 Sarah Burns 1
0.018836 Mike Epps 1
0.018836 Wood Harris 1
0.018712 Jill Scott 1
0.017932 Brit Marling 1
0.017932 William Mapother 1
0.017932 Matthew-Lee Erlbach 1
0.017932 DJ Flava 1
0.017423 Jason Spevack 1
0.017114 Josh Hutcherson 2
0.016759 Fred Willard 2
0.01642 Cuba Gooding Jr. 1
0.01642 Lochlyn Munro 1
0.01642 Richard Gant 1
0.01642 Tamala Jones 1
0.015848 Katt Williams 1
0.015441 Haley Bennett 1
0.015441 Chace Crawford 1
0.015441 Jake Weber 1
0.015441 Shannon Woodward 1
0.014565 Chris Hemsworth 1
0.014565 Tom Hiddleston 1
0.014266 Michael Stuhlbarg 1
0.014266 Richard Kind 1
0.014266 Sari Lennick 1
0.014266 Fred Melamed 1
0.010856 Bridget Moynahan 1
0.010856 Ramon Rodriguez 1
0.010561 Debra Messing 1
0.008341 Vanessa Hudgens 4
0.007363 Don Cheadle 4
0.007136 Ben Stein 1
0.007136 Lili Asvar 1
0.007136 Peter Atkins 1
0.007136 Hector Avalos 1
0.00691 Michael C. Hall 1
0.006893 Janet Jackson 3
0.004992 Alison Lohman 1
0.004992 Ruth Livier 1
0.004992 Lorna Raver 1
0.004762 Sharlto Copley 2
0.004547 Clifton Collins Jr. 1
0.004547 Dwight Yoakam 1
0.003298 Katie Cassidy 2
0.002413 Alan Arkin 2
0.002303 Asa Butterfield 1
0.002303 Rupert Friend 1
0.002303 Zac Mattoon O’Brien 1
0.002073 Elise Ivy 1
0.001311 Morris Chestnut 1
0.001311 Maeve Quinlan 1
0.001311 Kevin Hart 1
0.001177 Gwyneth Paltrow 4
0.000214 Elizabeth Perkins 1
0.000214 Kaley Cuoco 1
-0.00043 Sean Maguire 1
-0.00043 Kevin Sorbo 1
-0.00043 Ken Davitian 1
-0.0007 Aaron Johnson 2
-0.00302 Scarlett Johansson 4
-0.00381 Dominic Cooper 1
-0.00381 Charlotte Rampling 1
-0.00437 Joel McHale 1
-0.00437 Rowan Blanchard 1
-0.00452 Joan Allen 3
-0.00581 Minka Kelly 1
-0.00581 Shirley Norris 1
-0.00598 Ted Ludzik 1
-0.00701 Sheri Moon Zombie 1
-0.00739 Daniella Alonso 1
-0.00739 Jacob Vargas 1
-0.00739 Michael Bailey Smith 1
-0.00739 Michael McMillian 1
-0.00748 Joaquin Phoenix 1
-0.00748 Danny Hoch 1
-0.00758 Cary-Hiroyuki Tagawa 1
-0.00913 Scott Porter 3
-0.00925 Beyoncé Knowles 2
-0.00936 Loretta Devine 2
-0.01051 Willem Dafoe 1
-0.01051 Sam Neill 1
-0.01051 Claudia Karvan 1
-0.01133 Eva Longoria 1
-0.01183 Helen Hunt 1
-0.01183 Lorraine Nicholson 1
-0.01272 Charles S. Dutton 1
-0.01272 Lucas Black 1
-0.01276 Diego Luna 1
-0.01285 Kenny Wormald 1
-0.01285 Julianne Hough 1
-0.01285 Andie MacDowell 1
-0.01341 Patrick Stewart 1
-0.01341 Mako 1
-0.01341 Nolan North 1
-0.01393 Jeff Goldblum 1
-0.01461 Julianna Margulies 1
-0.01668 Alec Baldwin 3
-0.01764 Emma Roberts 2
-0.01805 Michael Parks 1
-0.01805 John Goodman 1
-0.01838 Michael Carman 1
-0.01856 Alice Eve 1
-0.01857 John Cho 2
-0.01872 Ryan Gosling 4
-0.01881 Gael Garcia Bernal 1
-0.01881 Marcia DeBonis 1
-0.01899 Angelina Jolie 7
-0.01907 Meg Ryan 2
-0.01973 Christopher Jordan Wallace 1
-0.02072 Emma Thompson 1
-0.02072 Maggie Gyllenhaal 1
-0.02072 Oscar Steer 1
-0.02083 George Clooney 10
-0.0215 Laura Linney 1
-0.02431 Virginia Madsen 1
-0.02431 Martin Donovan 1
-0.02516 Shea Whigham 1
-0.02516 Tova Stewart 1
-0.02519 Jon Favreau 1
-0.02534 Elizabeth Berrington 1
-0.02621 Isabelle Fuhrman 1
-0.02621 CCH Pounder 1
-0.02631 Mickey Rourke 2
-0.02666 Michael Sheen 1
-0.02666 Steven Mackintosh 1
-0.02671 Kadeem Hardison 1
-0.02734 Paolo Bonacelli 1
-0.02734 Violante Placido 1
-0.02734 Thekla Reuten 1
-0.02787 Christopher Meloni 1
-0.0281 Colin O’Donoghue 1
-0.02839 Common 1
-0.02839 James Pickens Jr. 1
-0.02906 Wesley Snipes 1
-0.02948 Connor Price 1
-0.02963 Kelsey Ford 1
-0.02963 Elena Anaya 1
-0.03052 Tom Sturridge 1
-0.03199 Ron Glass 1
-0.03202 Edward Burns 1
-0.03202 Shannyn Sossamon 1
-0.03202 Ana Claudia Talancón 1
-0.03202 Ray Wise 1
-0.03285 Alexandria M. Salling 1
-0.03299 Larry David 1
-0.03299 Adam Brooks 1
-0.0341 Jamal Woolard 1
-0.0341 Mohamed Dione 1
-0.03417 Keira Knightley 4
-0.03504 Stephen Campbell Moore 2
-0.03591 Ashlyn Sanchez 1
-0.03644 Amara Karan 1
-0.03933 Tamela J. Mann 1
-0.03933 David Mann 1
-0.03957 Ethan Hawke 2
-0.04132 Ludacris 2
-0.0447 Josh Lucas 1
-0.0447 Alexis Clagett 1
-0.0452 Kiele Sanchez 1
-0.04542 Richard Dreyfuss 2
-0.04634 Josh Hartnett 1
-0.04634 Melissa George 1
-0.04698 Danny Trejo 1
-0.04699 Selena Gomez 1
-0.04699 Cory Monteith 1
-0.04823 Beau Bridges 1
-0.04834 Sean Faris 1
-0.04834 Djimon Hounsou 1
-0.04848 John Carroll Lynch 1
-0.04855 Will Forte 1
-0.04855 Val Kilmer 1
-0.04988 Mia Stallard 1
-0.05001 Devin Brochu 1
-0.05014 Nikki Blonsky 1
-0.05014 Michelle Pfeiffer 1
-0.05018 Clifton Powell 1
-0.05062 Luke Wilson 1
-0.05062 Frank Whaley 1
-0.05062 Ethan Embry 1
-0.05137 Michelle Williams 1
-0.05137 Eddie Redmayne 1
-0.05137 Julia Ormond 1
-0.05152 Timothy Dalton 1
-0.05268 Michael Kelly 1
-0.05269 Mary-Kate Olsen 1
-0.05269 Justin Bradley 1
-0.05378 Neve Campbell 1
-0.05378 David Arquette 1
-0.05378 Lucy Hale 1
-0.05387 Nathan Gamble 1
-0.0553 Richard E. Grant 1
-0.05587 Brooklyn Decker 1
-0.0564 Ben Hollingsworth 1
-0.05818 Anne Heche 1
-0.05818 Isiah Whitlock Jr. 1
-0.0588 Rumer Willis 1
-0.0588 Carrie Fisher 1
-0.0588 Teri Andrzejewski 1
-0.059 Lily Collins 1
-0.059 Jake Andolina 1
-0.05922 Christina Applegate 1
-0.05929 Mos Def 2
-0.05943 Matthew Goode 1
-0.05943 Adam Scott 1
-0.05956 Amber Tamblyn 2
-0.05961 Rodrigo Santoro 1
-0.06019 Thomas Haden Church 4
-0.06219 Whoopi Goldberg 1
-0.06219 Kimberly Elise 1
-0.06228 Vicky Krieps 1
-0.06293 Cheryl Hines 2
-0.06335 Vanessa Minnillo 1
-0.06335 Nicole Parker 1
-0.06378 Carmen Electra 2
-0.06453 Claire Foy 1
-0.06484 Kerry Washington 2
-0.06553 Rosemarie DeWitt 1
-0.06553 Debra Winger 1
-0.06553 Sebastian Stan 1
-0.06615 Brian Geraghty 1
-0.06636 Madeline Carroll 1
-0.0668 Mark Ruffalo 3
-0.06737 Adrien Brody 3
-0.06842 Thomas D. Mahard 1
-0.06852 Jason Biggs 2
-0.06862 Chris Brown 1
-0.06876 Stella Maeve 1
-0.06949 Jennifer Lopez 1
-0.06949 Alex O’Loughlin 1
-0.06949 Michaela Watkins 1
-0.07143 Leslie Mann 5
-0.07286 Dylan Walsh 1
-0.07286 Sela Ward 1
-0.07287 Natalya Rudakova 1
-0.07287 François Berléand 1
-0.07344 Ari Graynor 1
-0.07413 Matthew Marsden 1
-0.07413 Graham McTavish 1
-0.07432 Milla Jovovich 5
-0.07449 Tracy Morgan 2
-0.07474 Tate Donovan 1
-0.07474 Craig Gellis 1
-0.07611 Drew Barrymore 5
-0.07615 Demi Moore 2
-0.07663 Zachary Quinto 2
-0.07686 Dennis Hopper 1
-0.07698 Amy Smart 2
-0.07733 Jacob Latimore 1
-0.07746 Sarah Habel 1
-0.07813 Mia Farrow 1
-0.07923 Jamie Lee Curtis 1
-0.07995 Tony Hale 1
-0.07995 Lucas McHugh Carroll 1
-0.08153 Kiefer Sutherland 1
-0.08153 Cameron Boyce 1
-0.0823 Justin Chatwin 1
-0.0823 James Marsters 1
-0.0823 Yun-Fat Chow 1
-0.0823 Emmy Rossum 1
-0.08247 Laurence Fishburne 2
-0.08276 Ben Burtt 1
-0.08276 Elissa Knight 1
-0.08276 Jeff Garlin 1
-0.08358 Tim Hodge 1
-0.08358 Mike Nawrocki 1
-0.08358 Phil Vischer 1
-0.08358 Cam Clarke 1
-0.08405 Stephen Merchant 1
-0.08611 Kevin Costner 2
-0.08636 Nia Vardalos 1
-0.08636 Rachel Dratch 1
-0.08636 Alexis Georgoulis 1
-0.0876 Ed Harris 1
-0.0876 Jeremy Irons 1
-0.08809 Leonard Nimoy 1
-0.08989 Rachel Bilson 2
-0.08997 Gerry Bednob 1
-0.09033 Juan Carlos Hernández 1
-0.09033 Cory Fernandez 1
-0.09096 Portia Doubleday 1
-0.09096 Jean Smart 1
-0.09157 Dianna Agron 1
-0.09398 Shailene Woodley 1
-0.09398 Amara Miller 1
-0.09398 Nick Krause 1
-0.09433 Izzy Meikle-Small 1
-0.09496 Téa Leoni 1
-0.09496 Jordan Carlos 1
-0.09575 Daniel Olbrychski 1
-0.09586 Dustin Milligan 1
-0.09586 Chris Carmack 1
-0.09586 Katharine McPhee 1
-0.09603 America Ferrera 1
-0.09662 Yifei Liu 1
-0.09665 Steve Harris 2
-0.0978 Gillian Anderson 1
-0.0978 Billy Connolly 1
-0.09831 Lukas Haas 1
-0.09851 Michelle Nolden 1
-0.09879 John Michael Higgins 1
-0.09889 John Turturro 1
-0.09889 Emmanuelle Chriqui 1
-0.10018 Charlize Theron 3
-0.10183 Sarah Mahoney 1
-0.10183 Roxana Ortega 1
-0.10214 Tom Cavanagh 1
-0.10263 Paul Rust 1
-0.10263 Jack Carpenter 1
-0.10263 Lauren London 1
-0.10292 Gemma Jones 1
-0.10305 Sam Shepard 1
-0.10313 James Earl Jones 1
-0.10313 Margaret Avery 1
-0.10387 Michael Keaton 1
-0.10387 Zach Gilford 1
-0.10396 Blythe Danner 2
-0.10608 Adam Brody 2
-0.10638 Ted Danson 1
-0.10643 Dane Cook 3
-0.10786 Suzanne Rico 1
-0.10843 Danny DeVito 1
-0.10878 Jonah Hill 6
-0.10929 Archie Panjabi 1
-0.10929 Saïd Taghmaoui 1
-0.11021 Carice van Houten 1
-0.1118 Colin Firth 2
-0.11216 Molly Sims 1
-0.11276 Leighton Meester 3
-0.11281 Hunter McCracken 1
-0.11283 Maggie Q 2
-0.11339 Robert Redford 1
-0.11467 Michael Angarano 2
-0.11492 John Hurt 1
-0.11492 Stephen Dorff 1
-0.11535 Trevor Gagnon 1
-0.11535 Philip Bolden 1
-0.11535 David Gore 1
-0.11535 Christopher Lloyd 1
-0.11589 Shawn Wayans 1
-0.11589 Shoshana Bush 1
-0.11589 Damon Wayans Jr. 1
-0.11672 Bailee Madison 1
-0.11672 Bruce Gleeson 1
-0.11701 John Cena 1
-0.11701 Ashley Scott 1
-0.11701 Aidan Gillen 1
-0.11778 Joan Cusack 4
-0.11808 Aziz Ansari 1
-0.121 Juliette Binoche 1
-0.121 James Ransone 1
-0.12193 Alexander Ludwig 1
-0.12267 Kate Winslet 1
-0.12448 Marco Khan 1
-0.12448 Cliff Curtis 1
-0.12498 Kelly Preston 1
-0.12557 Sean Penn 2
-0.12565 Ray Liotta 3
-0.12714 Stanley Tucci 4
-0.12778 Josh Zuckerman 1
-0.12813 Russell Means 1
-0.12921 Billy Burke 4
-0.12988 Michael Buie 1
-0.13011 Andy Samberg 1
-0.13011 Jeff Daniels 1
-0.13042 István Göz 1
-0.13061 Larry the Cable Guy 1
-0.13115 Mauricio Lopez 1
-0.13207 Will Poulter 1
-0.13287 Aly Michalka 1
-0.13287 Gaelan Connell 1
-0.1335 Jeffrey Dean Morgan 1
-0.13599 James Rebhorn 1
-0.13792 Ashley Judd 2
-0.13797 Jessica Chastain 2
-0.13797 Simon Pegg 2
-0.13835 Marisa Tomei 2
-0.13901 Hayley Atwell 1
-0.1391 Steven Strait 2
-0.14098 Jodelle Ferland 1
-0.14218 Craig Ferguson 2
-0.14296 Jerry Stiller 1
-0.14426 Alex Pettyfer 2
-0.14453 Tom Hardy 1
-0.14453 Nick Nolte 1
-0.14453 Jennifer Morrison 1
-0.14562 Shawn Ashmore 1
-0.14562 Jonathan Tucker 1
-0.14562 Laura Ramsey 1
-0.14615 David Tennant 1
-0.14615 Toni Collette 1
-0.14626 Joseph Cross 1
-0.14674 Kathleen Turner 1
-0.14783 Allison Janney 1
-0.14783 Carmen Ejogo 1
-0.14791 Henry Cavill 2
-0.15115 Ricky Gervais 2
-0.15316 Queen Latifah 4
-0.15327 William H. Macy 2
-0.1542 David Duchovny 2
-0.15482 John Magaro 1
-0.15482 Denzel Whitaker 1
-0.15482 Zena Grey 1
-0.15609 Brandon Routh 1
-0.15609 Sam Huntington 1
-0.15609 Taye Diggs 1
-0.1568 Cherry Jones 1
-0.15881 Rashida Jones 2
-0.15924 Colm Meaney 1
-0.15932 Louis C.K. 2
-0.15975 Jim Cummings 1
-0.15975 Bud Luckey 1
-0.16027 Frances Conroy 1
-0.16048 David Morrissey 1
-0.16131 Alicia Witt 1
-0.16131 Ben McKenzie 1
-0.16131 Leelee Sobieski 1
-0.16158 Kenneth Branagh 2
-0.16271 Susan Sarandon 4
-0.16305 Teri Hatcher 1
-0.16305 John Hodgman 1
-0.16305 Jennifer Saunders 1
-0.16324 Steve Coogan 1
-0.16324 Brandon T. Jackson 1
-0.16746 James Marsden 6
-0.16798 Evangeline Lilly 1
-0.16798 Dakota Goyo 1
-0.16863 Lily Rabe 1
-0.1693 Keanu Reeves 2
-0.17142 Kodi Smit-McPhee 1
-0.1731 Noah Bean 1
-0.17413 Diane Lane 2
-0.17575 Michael Vartan 1
-0.17575 Callum Blue 1
-0.17627 Paula Patton 3
-0.17794 Molly Shannon 1
-0.17794 Steve Buscemi 1
-0.17794 Myleene Klass 1
-0.17946 Darrin Dewitt Henson 2
-0.17949 Hayden Panettiere 2
-0.1843 Mélanie Thierry 1
-0.1843 Gérard Depardieu 1
-0.18736 Teri Polo 1
-0.18816 Matt Dillon 2
-0.18819 Noah Emmerich 1
-0.1885 Lee Pace 1
-0.1885 George 1
-0.18979 Jamie Bell 2
-0.18995 Kevin McKidd 2
-0.19227 Oprah Winfrey 1
-0.19227 Bruno Campos 1
-0.19379 Olga Kurylenko 2
-0.19412 Tony Goldwyn 1
-0.19669 Brooke Shields 1
-0.19669 Ricky Garcia 1
-0.19669 Eugene Cordero 1
-0.19796 Colin Ford 1
-0.1999 Alexis Bledel 2
-0.20019 Josh Peck 1
-0.20019 Alex Frost 1
-0.20019 Nate Hartley 1
-0.20054 Armie Hammer 1
-0.20054 Josh Hamilton 1
-0.20082 Ken Jeong 2
-0.201 Jennifer Garner 5
-0.20531 Hayden Christensen 3
-0.21062 Ice Cube 2
-0.21392 Charlie Cox 1
-0.21392 Claire Danes 1
-0.21392 Sienna Miller 1
-0.21392 Ian McKellen 1
-0.21416 Josh Brolin 8
-0.21465 Miley Cyrus 3
-0.21491 Greta Gerwig 1
-0.21652 David Thewlis 2
-0.21697 Nick Swardson 2
-0.21737 Olivia Williams 1
-0.21737 Jon Bernthal 1
-0.21748 Keith David 2
-0.218 Robert Downey Jr. 7
-0.21882 Rhys Ifans 1
-0.21882 Sebastian Armesto 1
-0.21891 Brad Pitt 7
-0.21899 Carter Jenkins 1
-0.21899 Austin Butler 1
-0.22287 Demetri Martin 1
-0.22287 Henry Goodman 1
-0.22287 Edward Hibbert 1
-0.22287 Imelda Staunton 1
-0.22299 Meredith Droeger 1
-0.22329 Bob Hoskins 1
-0.22329 Alexander Siddig 1
-0.22329 Caryn Peterson 1
-0.22569 Abigail Breslin 7
-0.22647 Keke Palmer 1
-0.22647 Tasha Smith 1
-0.22647 Jill Marie Jones 1
-0.22956 Max Thieriot 2
-0.23105 Tracey Ullman 1
-0.23431 John Malkovich 2
-0.23522 Idris Elba 3
-0.23559 Jeff Kahn 1
-0.23763 Vanessa Redgrave 2
-0.23795 Doug Hutchison 1
-0.24264 Gabriel Macht 2
-0.24452 David Krumholtz 1
-0.24452 Nat Faxon 1
-0.24508 Clark Duke 2
-0.25125 Ron Livingston 2
-0.25182 Kim Basinger 1
-0.25248 Emily Mortimer 4
-0.2531 Bruce Campbell 1
-0.25446 Selma Blair 1
-0.25446 Doug Jones 1
-0.25446 John Alexander 1
-0.25446 Anika Noni Rose 2
-0.25658 Seann William Scott 2
-0.25883 Armin Mueller-Stahl 1
-0.26063 Jason Schwartzman 2
-0.26107 Kathy Baker 1
-0.26156 Monica Bellucci 1
-0.26156 Stephen McHattie 1
-0.26745 Timothy Olyphant 6
-0.26757 Chris Massoglia 1
-0.26811 James Caan 2
-0.26866 Steve Carell 7
-0.27004 Colin Hanks 3
-0.27109 Tom Skerritt 1
-0.27329 Liv Tyler 2
-0.27555 Ben Affleck 4
-0.27608 Zooey Deschanel 5
-0.27655 James Russo 1
-0.27655 Charlie Yeung 1
-0.27655 Shahkrit Yamnarm 1
-0.27655 Panward Hemmanee 1
-0.27851 Hugh Grant 2
-0.27948 Diane Keaton 2
-0.28318 Judy Greer 2
-0.28336 Carey Mulligan 4
-0.28797 Colm Feore 1
-0.28797 Amy Ryan 1
-0.28797 Gattlin Griffith 1
-0.28933 Eva Mendes 3
-0.29241 Matt Long 1
-0.29307 Ben Mendelsohn 1
-0.29363 Emily Barclay 1
-0.29414 Rain 1
-0.29414 Rick Yune 1
-0.29414 Naomie Harris 1
-0.29414 Ben Miles 1
-0.29423 AnnaSophia Robb 3
-0.29464 Dan Aykroyd 2
-0.29572 Christopher Eccleston 1
-0.29799 Hugh Dancy 1
-0.29799 Krysten Ritter 1
-0.29968 Toby Jones 1
-0.29968 David Ryall 1
-0.29993 Antje Traue 1
-0.30149 Jonathan Rhys Meyers 1
-0.30149 Kasia Smutniak 1
-0.30149 Richard Durden 1
-0.30374 Jenna Fischer 2
-0.30433 Alan Alda 1
-0.30445 Shea Adams 1
-0.30445 Eddie Baroo 1
-0.30461 Frank Langella 2
-0.3085 50 Cent 1
-0.30904 William Fichtner 1
-0.31005 Joe Anderson 2
-0.31069 Andy Garcia 2
-0.3107 Dana Fuchs 1
-0.31076 Sarah Michelle Gellar 1
-0.31076 George Carlin 1
-0.31208 Julie Benz 2
-0.31313 Hilary Swank 4
-0.3135 Alexis Dziena 1
-0.31375 Arielle Kebbel 1
-0.31742 Columbus Short 5
-0.32086 Cher 1
-0.32086 Christina Aguilera 1
-0.32086 Alan Cumming 1
-0.32091 Anthony Mackie 4
-0.32244 Camilla Belle 2
-0.32251 Terrence Howard 2
-0.32407 Anthony Edwards 1
-0.32658 Elias Koteas 3
-0.33024 Danny Glover 2
-0.33186 Rosamund Pike 2
-0.3343 Bruce Willis 4
-0.33616 Michael Hadley 1
-0.3371 Sarah Bolger 1
-0.33889 Julianne Moore 3
-0.33895 Michelle Monaghan 6
-0.33954 Frances McDormand 2
-0.3401 Zahf Paroo 1
-0.34013 Giovanni Ribisi 1
-0.34013 Michael Rispoli 1
-0.34041 Sharon Leal 3
-0.34369 Evan Rachel Wood 2
-0.34673 Mary-Louise Parker 2
-0.34779 Eliza Bennett 1
-0.34779 Sienna Guillory 1
-0.3497 Jennifer Lawrence 2
-0.35049 Paul Bettany 4
-0.35309 Ryan Phillippe 3
-0.35499 Michael Shannon 3
-0.3563 Leslie Bibb 2
-0.35831 Mathieu Amalric 1
-0.35831 Judi Dench 1
-0.35885 Kevin James 5
-0.35968 Colin Farrell 3
-0.36023 Richard Gere 4
-0.36129 Stephanie Szostak 1
-0.36722 Drake Bell 1
-0.36722 Leslie Nielsen 1
-0.36722 Christopher McDonald 1
-0.37159 Kevin Bacon 1
-0.37197 Rasmus Hardiker 1
-0.37287 Christoph Waltz 2
-0.37325 Laz Alonso 1
-0.37325 Omar Benson Miller 1
-0.37446 Tony Kgoroge 1
-0.37446 Patrick Mofokeng 1
-0.37481 Kate Hudson 4
-0.37761 Jet Li 3
-0.37826 Ving Rhames 2
-0.3796 Amanda Crew 2
-0.38001 John Cleese 2
-0.38141 Abbie Cornish 3
-0.38142 Derek Jeter 1
-0.38538 Gabrielle Union 1
-0.38538 Scott Caan 1
-0.38604 Nick Frost 2
-0.38614 Catherine Keener 4
-0.3877 Daniel Hansen 1
-0.3877 Wesley Singerman 1
-0.3877 Jordan Fry 1
-0.38831 Noah Ringer 1
-0.38831 Nicola Peltz 1
-0.38831 Jackson Rathbone 1
-0.38893 Maria Bello 1
-0.39071 Tom Selleck 1
-0.39332 Jay Chou 1
-0.39413 Jennifer Connelly 4
-0.39438 Kieran Culkin 1
-0.39438 Alison Pill 1
-0.39473 David de Vries 1
-0.39492 Joel Edgerton 2
-0.39802 Dakota Fanning 4
-0.39804 Tilda Swinton 2
-0.39875 Casey Affleck 2
-0.40121 Viggo Mortensen 3
-0.40188 Thomas Kretschmann 1
-0.40735 Derek Luke 2
-0.40949 Bernie Mac 1
-0.40949 Adam Herschman 1
-0.41029 Kristin Kreuk 1
-0.41029 Neal McDonough 1
-0.41029 Michael Clarke Duncan 1
-0.41029 Chris Klein 1
-0.41274 Jon Heder 2
-0.41518 Julia Roberts 5
-0.41561 Jean Reno 2
-0.41773 Naveen Andrews 1
-0.41773 Nicky Katt 1
-0.41835 Ioan Gruffudd 2
-0.41855 Radha Mitchell 2
-0.4238 Tom Hollander 1
-0.42492 Stephen Colbert 1
-0.42514 Bette Midler 1
-0.42514 Chris O’Donnell 1
-0.42514 Jack McBrayer 1
-0.42584 Karl Urban 2
-0.42594 Yara Shahidi 1
-0.42594 Ronny Cox 1
-0.42702 Angela Bassett 2
-0.42845 Clancy Brown 2
-0.429 Tom Wilkinson 4
-0.42952 Anna Friel 2
-0.43227 Greg Kinnear 3
-0.43263 Hugo Weaving 2
-0.43371 Robert Duvall 2
-0.43428 Patrick Wilson 4
-0.43428 Jordi Mollà 2
-0.43598 Romany Malco 1
-0.43598 Jessica Simpson 1
-0.44493 Oliver Platt 3
-0.44639 Billy Bob Thornton 2
-0.44673 Tim Roth 1
-0.45012 Jason Behr 1
-0.45012 Amanda Brooks 1
-0.45012 Robert Forster 1
-0.45081 Elisabeth Moss 2
-0.45444 Kate Beckinsale 3
-0.45459 Elodie Tougne 1
-0.45644 Michael Peña 4
-0.45749 Ethan Suplee 1
-0.46307 Sara Paxton 2
-0.46648 William Hurt 2
-0.46705 Dominic Purcell 1
-0.4676 Eric Dane 2
-0.47286 Ewan McGregor 6
-0.47372 Bojana Novakovic 1
-0.47493 Rainn Wilson 2
-0.47496 Michael Douglas 3
-0.47656 Salma Hayek 3
-0.47821 Michael Chiklis 2
-0.47877 Susie Essman 1
-0.47877 Mark Walton 1
-0.48214 Tom Cruise 3
-0.48289 Rob Brown 2
-0.48494 Chris Tucker 1
-0.48494 Hiroyuki Sanada 1
-0.48495 Jerry Seinfeld 1
-0.48664 Amber Heard 4
-0.49151 Jeff Bridges 5
-0.49254 Julia Ormond 2
-0.49388 Jim Sturgess 5
-0.49397 David Alexander 1
-0.49599 Dakota Blue Richards 1
-0.49599 Ben Walker 1
-0.49633 Jack Black 7
-0.49996 Rachel Weisz 3
-0.50207 Rhona Mitra 3
-0.50598 Kristen Bell 3
-0.50922 Ulrich Thomsen 2
-0.50958 Paul Giamatti 4
-0.51353 Courteney Cox 2
-0.52123 Eric Christian Olsen 4
-0.52388 Bill Murray 2
-0.5256 Helen Mirren 4
-0.5371 Liev Schreiber 3
-0.53777 Luis Guzmán 1
-0.53777 Victor Gojcaj 1
-0.53964 William Moseley 1
-0.54075 Amy Poehler 2
-0.54559 Chris Pine 2
-0.55487 Golshifteh Farahani 1
-0.56183 Ciarán Hinds 5
-0.56769 Matthew McConaughey 3
-0.57323 Michelle Yeoh 2
-0.59291 Jodie Foster 3
-0.59744 Jeremy Northam 1
-0.59744 Jackson Bond 1
-0.61008 Harrison Ford 4
-0.61034 Zachary Levi 2
-0.61171 Ben Kingsley 2
-0.61301 Jena Malone 2
-0.61555 Keri Russell 3
-0.61562 Daniel Day-Lewis 2
-0.61821 Jessica Alba 8
-0.62343 Malin Akerman 4
-0.63052 Mel Gibson 2
-0.63521 John Krasinski 4
-0.63733 Marion Cotillard 1
-0.63733 Sophia Loren 1
-0.63784 Cate Blanchett 4
-0.63804 Donald Sutherland 3
-0.63962 Steve Martin 3
-0.64079 Jackie Earle Haley 2
-0.64477 Mary Elizabeth Winstead 2
-0.64757 Max Records 1
-0.64757 Pepita Emmerichs 1
-0.65084 David Strathairn 2
-0.65479 Vince Vaughn 4
-0.65669 Tina Fey 3
-0.65691 Bruce Boxleitner 1
-0.65739 Craig Robinson 3
-0.66044 Logan Lerman 3
-0.662 Saoirse Ronan 3
-0.66534 Jamie Foxx 5
-0.66583 Jim Carrey 5
-0.66613 Jason Momoa 1
-0.66613 Rose McGowan 1
-0.66613 Stephen Lang 1
-0.66867 Sam Elliott 2
-0.6712 Adewale Akinnuoye-Agbaje 1
-0.67171 Ben Barnes 2
-0.67171 Skandar Keynes 2
-0.67171 Georgie Henley 2
-0.67769 Katie Holmes 3
-0.68674 Eric Bana 4
-0.68959 Hugh Jackman 3
-0.69001 Anna Faris 6
-0.69111 Matthew Fox 2
-0.69807 Jorma Taccone 1
-0.70598 Jay Baruchel 3
-0.71044 Lake Bell 2
-0.71601 Matthew Broderick 2
-0.71688 Garrett Hedlund 2
-0.72146 Clint Eastwood 1
-0.72146 Bee Vang 1
-0.72146 Christopher Carley 1
-0.72146 Ahney Her 1
-0.72206 Reese Witherspoon 4
-0.72222 Forest Whitaker 4
-0.73722 Danny Huston 3
-0.73748 Michael Ealy 3
-0.74577 Steve Valentine 1
-0.74612 Chris Cooper 3
-0.7519 Guy Pearce 4
-0.75576 Ray Stevenson 3
-0.76908 Ben Foster 4
-0.7697 Emile Hirsch 3
-0.77558 John C. Reilly 5
-0.77761 Rob Corddry 3
-0.7787 Isla Fisher 2
-0.78113 Emily Browning 2
-0.78249 Channing Tatum 4
-0.78709 Marlon Wayans 2
-0.79442 Amy Adams 10
-0.79518 Edward Norton 3
-0.79572 Patrick Warburton 2
-0.80053 Gemma Arterton 1
-0.80891 Nicholas Elia 1
-0.81369 Ryan Reynolds 8
-0.81879 Chris Evans 7
-0.82393 Freddie Highmore 2
-0.82719 Moon Bloodgood 2
-0.83737 Cam Gigandet 6
-0.84704 Ashton Kutcher 3
-0.85929 Mark Wahlberg 8
-0.86703 Philip Seymour Hoffman 5
-0.87171 Brendan Fraser 5
-0.8803 Crispin Glover 2
-0.90228 Seth Rogen 9
-0.90251 Michael Fassbender 3
-0.90605 Ben Stiller 6
-0.907 Pink 1
-0.907 Carlos Alazraqui 1
-0.91934 Naomi Watts 5
-0.94107 Christina Ricci 3
-0.94334 Benicio Del Toro 1
-0.94334 Simon Merrells 1
-0.95701 Jason Statham 7
-0.98138 Emily Blunt 5
-0.98557 Morgan Freeman 6
-0.9859 Gary Oldman 4
-0.98804 Teresa Palmer 3
-0.99282 Samuel L. Jackson 6
-1.002 Anton Yelchin 3
-1.01902 Elijah Wood 2
-1.0212 James McAvoy 6
-1.03203 Elizabeth Banks 7
-1.0326 Carla Gugino 4
-1.03435 Robert De Niro 7
-1.03828 Catherine O’Hara 2
-1.0454 Al Pacino 4
-1.05981 Anthony Hopkins 4
-1.06493 Hugh Laurie 2
-1.06704 Robin Wright 3
-1.07278 Lauren Graham 1
-1.0907 Owen Wilson 10
-1.09614 Danny McBride 4
-1.09668 Jim Broadbent 1
-1.10827 Renée Zellweger 4
-1.10958 Matthew Macfadyen 2
-1.12118 Robin Williams 3
-1.12426 Raven-Symoné 1
-1.12426 Kym Whitley 1
-1.12426 Adam LeFevre 1
-1.13515 Rosario Dawson 4
-1.14922 Johnny Simmons 2
-1.15232 Sam Rockwell 3
-1.18263 Jessica Biel 4
-1.20545 Dennis Quaid 8
-1.21903 Jake Gyllenhaal 5
-1.242 Ray Winstone 2
-1.25532 Blake Lively 2
-1.25729 Elisabeth Harnois 1
-1.2839 Martin Lawrence 4
-1.28437 Clive Owen 4
-1.30597 Mark Strong 3
-1.31824 Donna Murphy 2
-1.32204 Denzel Washington 4
-1.32601 Max von Sydow 2
-1.38227 Seth Green 2
-1.38594 Mandy Moore 2
-1.44403 Peter Sarsgaard 3
-1.44474 Will Ferrell 7
-1.52446 John Travolta 6
-1.52673 Dan Fogler 4
-1.56454 Alfred Molina 3
-1.56534 Adam Sandler 7
-1.74681 Nicole Kidman 5
-2.09692 Russell Crowe 6
-2.20621 Olivia Wilde 4
-2.28186 Ron Perlman 4
-2.44066 Nicolas Cage 11
-2.7271 Daniel Craig 5


 

VAST challenge 2011

This year I participated in the VAST Challenge (VAST stands for Visual Analytics Science and Technology). The VAST symposium is part of the yearly VisWeek conference.

Anyway. The rules required contestants to send videos with voiceovers, so without further ado here they are.


Watch me in HD instead!!


Watch me in HD too!!

If you want to play with the tools you can download them here: mini-challenge 1, mini-challenge 3.

Unfortunately, I couldn’t find the time to complete mini-challenge 2 and the grand challenge. I’m doing this in my free time and had to balance all kinds of commitments, so I couldn’t secure enough time to finish. Unlike previous years, though, I managed to find enough time to start! So, in the words of Charlie Sheen: winning.

So what is this about?
In the fictional Vastopolis, a mysterious infection strikes. Where does it come from and how is it transmitted? To answer these questions we have one million tweets sent by residents over the past 3 weeks, and among that million there are quite a few from people reporting symptoms.

The first thing I did was to come up with a method to tell whether a tweet was actually about a disease or not, so I scored them. I made a list of words that were required to consider that a message related to sickness; they were fairly unambiguous, like sick, flu, pneumonia, etc. Each of those words added one point to a “sickness” score. Then there was a second list of more ambiguous words like “a lot”, “pain”, “fire”, etc. I added one point for each of these words or phrases, but only if the message already contained a required word. So there were a few false negatives and a few false positives, but all in all it was fairly accurate.
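
For illustration, that scoring rule can be sketched in a few lines of Python. This is a minimal sketch and not the code used for the challenge: the two word lists only contain the examples quoted above, the function name is made up, and it assumes each listed word counts once per tweet however often it appears.

```python
import re

# Hypothetical word lists: only the examples quoted in the post.
REQUIRED = {"sick", "flu", "pneumonia"}
AMBIGUOUS = {"a lot", "pain", "fire"}

def sickness_score(text):
    """One point per required word found, plus one point per ambiguous
    word or phrase, but only when a required word is present."""
    lowered = text.lower()
    words = set(re.findall(r"[a-z']+", lowered))
    score = sum(1 for word in REQUIRED if word in words)
    if score == 0:
        # No required word: ambiguous words alone do not count.
        return 0
    for phrase in AMBIGUOUS:
        # Ambiguous entries can be multi-word phrases, so match on word boundaries.
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            score += 1
    return score
```

With these lists, a message about a fire downtown scores 0 because it has no required word, which is exactly what keeps “fire” from producing false positives on its own.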

Fairly soon I had the idea to show the sums of all the scores of a part of the map, rather than showing each individual tweet. But originally, the sectors were quite large and I showed data by day.

Then I worked with finer sectors and 6-hour chunks. That’s how I could show how people moved towards the center of the map by day, and back to its edges every night. With finer geographic detail I could also see spikes in various areas of the map during the period that I couldn’t see before, and which were not necessarily related to the disease.
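
The aggregation itself is simple to sketch: snap each tweet to a grid cell and a 6-hour window, then sum the scores. The Python below is only an illustration under assumptions of my own: the tuple layout, the function name and the default cell size in degrees are hypothetical, not the values used in the actual tool.

```python
import math
from collections import defaultdict
from datetime import datetime

def aggregate_scores(tweets, cell_size=0.01, hours=6):
    """Sum scores per (time chunk, grid cell).

    tweets: iterable of (timestamp, latitude, longitude, score) tuples.
    cell_size: side of a square sector, in degrees (assumed value).
    """
    totals = defaultdict(int)
    for timestamp, lat, lon, score in tweets:
        # Round the timestamp down to the start of its chunk (0h, 6h, 12h, 18h).
        chunk = timestamp.replace(
            hour=timestamp.hour - timestamp.hour % hours,
            minute=0, second=0, microsecond=0,
        )
        # Integer grid indices identify the sector.
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        totals[(chunk, cell)] += score
    return dict(totals)
```

Each key of the result corresponds to one square on the map at one moment, and its value is the total that drives the color or size of that square.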

Eventually, I wanted to read what the tweets corresponded to, so I loaded the full text of the messages so that clicking on a square would reveal what was said at that moment. In this dataset, every spike in volume corresponds with an event that’s been added by the designers, so it was fun to discover everything happening there, from baseball games to accidents or buildings catching fire. Often, there were articles in the mini-challenge 3 dataset that would give more information about what really happened.

So, what was mini-challenge 3 about? Nothing less than diagnosing a possible terrorist threat. This time we were given not one million tweets, but thousands of articles which were much longer than 140 characters! From reading a few sample articles, I saw that most didn’t talk about terrorism or Vastopolis at all. But couldn’t they contain clues that would let me put two and two together?

My first idea was to find all entities in the articles, that is, names of people or names of organizations (which follow a certain syntax), and arrange them in a network. The problem is that there were just too many names and groups (thousands of both) and I couldn’t tell from such a list which sounded suspicious. Although a group called “network of hate” is probably not a charity. I’m sure it is possible to solve the challenge like this, but I chose another way to get my first leads.

I did as in mini-challenge 1 and scored my articles, but I gave them several scores instead of just one by comparing them to several series of words. One series, for instance, was all the proper names in Vastopolis, like names of neighborhoods, because articles about Vastopolis are probably more interesting. The other series corresponded to various kinds of threats.
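
As a rough sketch of that multi-score idea in Python: each article gets one number per series, and any two of those numbers can then serve as coordinates. The series names and their words below are invented placeholders (the real lists are not reproduced here), and counting every occurrence is an assumption.

```python
import re

# Hypothetical keyword series; the names and words are placeholders.
SERIES = {
    "vastopolis": {"vastopolis", "downtown", "uptown"},
    "threat": {"bomb", "explosive", "attack"},
}

def score_article(text, series=SERIES):
    """Return one score per series: the number of words in the article
    that belong to that series (every occurrence counts)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {
        name: sum(1 for word in words if word in vocabulary)
        for name, vocabulary in series.items()
    }
```

An article scoring high on both the place series and a threat series lands in a corner of the plot, away from the large cloud of articles that mention neither.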

That allowed me to create the scatterplot, which I used both to represent articles and to narrow the selection by selecting an area if needed. Then, as time went by, I added more and more features to the tool: an interface to read articles with keywords highlighted, the possibility to filter articles by keyword in addition to the graphical interface, the ability to see all the articles as a list and select from that list, not just from the scatterplot, and finally the possibility to mark articles as interesting and regroup them in another list…

That was about when I felt I could run out of time, so I didn’t add the other features I had planned or work on making a decent interface. Also, I spent a lot of time not just trying to solve the challenge, but reading all the stories that were planted in the dataset, linking them to the tweets of MC1, etc.

Anyway. I quite enjoyed working on that and really, really appreciated the humongous work that went into creating the VAST Challenge universe. I’m looking forward to seeing what other teams came up with. On a side note, it’s probably my last protovis project as it makes sense to completely switch to D3 now…

 

… and tableau contest results

have been announced. My entry got honorary mentions, which is the best I could hope for, not being based in the US! Anyway, the reward is going through the process, seeing what others have done, and learning from it.

I was pleasantly surprised to see that Agnieszka – Russian Sphinx chose a subject similar to mine – real estate, but in Moscow. Here’s her entry:

Now, she and I used different approaches to display our maps, and I didn’t really comment on how I made mine (let alone how she made hers).
Here’s my map for the record:

To create my map, I used the default mapping function of Tableau, where you declare the geographic role of two of your measures – one as latitude, one as longitude – and you can plot marks on a map, that can be zoomed, panned etc.

If one of your dimensions is geographical in nature, like a country, US state, US county name or code, Tableau can calculate latitude and longitude so you don’t have to provide them. This also works for zip codes. So, if you have a valid US address, you can extract the zip code out of that (before you put your data in Tableau) and Tableau can represent this on a map quite accurately.
In my case, I had tens of thousands of marks, which were available at a much finer level than the zip code (sometimes down to the apartment number), so I thought it would be a pity to aggregate them that coarsely. My data came with notions like neighborhood or block numbers, which are specific to the New York administration I suppose, but interesting in my context. The block number sounded like the right level of aggregation for my data.
The problem is, how do I get the latitude and longitude of each of the blocks?
What I did was choose one address per block, and then geocode it with Python and Google Maps. Geocoding is the process of converting addresses into geographical coordinates, and while doing it in Python does require coding, it is trivial and there are many ready-made examples around which can be reused as is.
Then, in my data, I put the coordinates of each block on every record. So any building sale can be associated with a latitude and longitude, which are those of its block.
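
The bookkeeping around the geocoder can be sketched as follows. This is a hedged illustration, not my original script: the record keys and function name are assumptions, and the geocoder is passed in as a plain function (`geocode`, hypothetical) so that any service, Google Maps or otherwise, can be plugged in.

```python
def coordinates_by_block(records, geocode):
    """Geocode one address per block and copy the result onto every record.

    records: list of dicts with at least "block" and "address" keys (assumed layout).
    geocode: any callable taking an address string and returning (lat, lon).
    """
    block_coords = {}
    for record in records:
        block = record["block"]
        if block not in block_coords:
            # The first address seen in a block stands in for the whole block,
            # so the geocoder is called once per block, not once per sale.
            block_coords[block] = geocode(record["address"])
    return [
        dict(
            record,
            latitude=block_coords[record["block"]][0],
            longitude=block_coords[record["block"]][1],
        )
        for record in records
    ]
```

Calling the service once per block rather than once per record matters with tens of thousands of sales, since geocoding services usually limit the number of requests.
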
In another view, I have a simpler map, by neighborhood. Neighborhoods are another proprietary notion; they are groups of blocks. So I simply average the latitude and longitude of all the buildings in the neighborhood to position its mark.
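
In Tableau that average is just an aggregation on the latitude and longitude measures, but the same computation written out in Python looks like this. It is a small sketch with assumed record keys and a made-up function name.

```python
from collections import defaultdict

def neighborhood_centroids(records):
    """Average latitude and longitude of all records in each neighborhood.

    records: iterable of dicts with "neighborhood", "latitude" and
    "longitude" keys (assumed layout).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for record in records:
        acc = sums[record["neighborhood"]]
        acc[0] += record["latitude"]
        acc[1] += record["longitude"]
        acc[2] += 1
    return {
        name: (lat_sum / count, lon_sum / count)
        for name, (lat_sum, lon_sum, count) in sums.items()
    }
```

Note that this is an average over sales, so a neighborhood’s mark is pulled towards the blocks where most transactions happened rather than sitting at its geometric center.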

While it can be desirable to let users interact with the map, in many cases it is not a safe option. Once you let your users zoom out and/or scroll away from the action, it is not obvious for them how to get back.
So Agnieszka has used another nifty technique, that of background images.
Her map is really a static file imported with the background image function which is found in the data menu:

This way, you can specify exactly what to use as a map, where it starts and where it ends. The map won’t move afterwards.
However, by doing so one cannot use geographic coordinates to display marks. One has to obtain, outside of Tableau, the x and y coordinates of each mark on the screen. Then, those x and y coordinates can be made into measures, and assigned respectively to the column and row shelves of the viz.
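
One possible way to obtain those x and y values, if you do know the latitude and longitude of each mark and the geographic extent of the image, is a simple linear interpolation. I don’t know how Agnieszka computed hers; the sketch below is only an assumption-laden example. It supposes the image covers a small area with north at the top, so that projection distortion can be ignored, and all names are made up.

```python
def to_image_xy(lat, lon, bounds, width, height):
    """Convert latitude/longitude to x/y positions on a background image.

    bounds: (south, west, north, east) edges of the image, in degrees.
    width, height: extent given to the image on the x and y axes.
    The y axis grows upward, like a chart axis, so south maps to 0.
    """
    south, west, north, east = bounds
    x = (lon - west) / (east - west) * width
    y = (lat - south) / (north - south) * height
    return x, y
```

For example, with bounds of (40.0, -75.0, 41.0, -73.0) and an 800 by 600 image, a point at latitude 40.5 and longitude -74.0 falls in the middle of the image, at (400, 300).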

I’ll let you figure out which one is best for your next map project. Note that you can generate highly customizable map backgrounds with a service like cloudmade.

 

Tableau contest 2011

Here is my Tableau contest 2011 entry.
The rules stipulated it had to be about business, real estate or finance, so I chose real estate. Since everyone loves New York, and since the data for real-estate transactions is relatively easy to get, that’s what I show. In order to stay within Tableau Public’s 100,000-row limit, I chose to show only the residential transactions (apartment buildings or buildings converted to apartment buildings), larger than 100 square feet and sold for at least $1.

Before I started, I wanted to create a Tableau that would be legible as a static image.
We, the Tableau tribe, have seen a lot of interactive dashboards with sophisticated actions. Here, I wanted to cater to the discriminating analyst who gets their kicks from seeing patterns emerge at a glance from 90k datapoints.

What does it show, for instance? Well, while the bulk of transactions are at a price close to the average, a large part are done considerably below that and, comparatively, very few markedly above it.
The marks which are aligned show either many transactions at the same price (there are distinct $1, $10 and $100 lines for instance) or sales of similar properties. There was a string of sales of identical apartments at 1335 6th Avenue, for instance.

It’s possible to switch the function of the Y-axis of the scatterplot, which can represent either square feet or sale price. The largest transaction over the period was around $4 billion, and the largest in terms of area was just above 10m square feet.
It is also possible to explore through the lists on the right. By selecting “other types” of buildings for instance you can see which apartment buildings once were yeshivas or fire stations or (what?) “detention houses for wayward girls” (there’s one in upper east side).
You can also switch to a map-based version of this dashboard. Speaking of maps, a right-click on any mark can open a google maps window centered on that address for you.

So here goes, enjoy the viz and good luck to all contestants

 

My tableau contest entry

So here it is. I chose to compete on the Activity Rates and Healthy Living data set, because after downloading it I really enjoyed exploring it.

If the viz doesn’t show well in the blog, here’s a link to its page

My main reason for entering the contest is to be able to see what others have done. There are obviously many, many ways to tackle this and I am very much looking forward to seeing everyone’s work! My interactions with the Tableau community, especially through the forum, have always been very rewarding, and what better way to learn than from example!

So for the fellow contestants who will see my work, here is my train of thought for the dashboard.

The dataset

I’m aware of USDA’s food environment atlas. It’s an application where people can see various food-related indicators on a map. The dataset we were handed is actually the background data of this. So, there is already a place where people can consult food indicators.

Now, this being Tableau and all, I wanted to create an analytical dashboard where people could understand if and how the input variables affected the output variables.

The dataset consists mostly of input variables: various indicators that influence how healthy a local population is. That status (output) is expressed through a few variables, such as adult and child obesity rates and adult diabetes rates. Those variables are highly correlated with each other, so in my work I chose to focus on adult obesity rates which is the simplest one.

Now, inputs. The rest of the variables fall in several categories:

  • income (median household income, poverty rates);
  • diet (consumption of various food items per capita);
  • shopping habits (for various types of stores or restaurants, the dataset would give their number and the money spent in each county, both in absolute numbers and per capita);
  • lifestyle information (data on households without cars and far from stores, on the physical activity level of the population, and the facilities offered by the state);
  • pricing variables (price ratios between some “healthy” food items and some less healthy, equivalent food items, for instance fruits vs. snacks; tax information on unhealthy food);
  • policy variables (measuring participation to various programmes such as SNAP or WIC);
  • socio-demographic variables (ethnic groups in population, “metro” status of county, whether the county was growing or shrinking, and voting preferences).

Yes, that’s a lot of variables (about 90, plus the county and state dimensions).

Oddly enough, there wasn’t a population measure in the dataset, and many indicators were available in absolute value only, so I constructed a proxy by dividing two variables on the same subject (something like “number of convenience stores” and “number of convenience stores / capita”).
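
The arithmetic behind that proxy is just a division, sketched here in Python. The function names are mine, and the sketch assumes the per-capita figure is expressed per single resident; if the dataset expresses it per 1,000 residents, the result has to be scaled accordingly. Counties where the count is zero give no estimate this way, so another pair of variables is needed for them.

```python
def population_proxy(count, count_per_capita):
    """Estimate population from an absolute count and the same count per capita.

    Returns None when the per-capita value is zero or missing, since the
    division is then undefined.
    """
    if not count_per_capita:
        return None
    return count / count_per_capita

def per_capita(value, population):
    """Turn an absolute indicator into a per-capita one using the proxy."""
    if not population:
        return None
    return value / population
```

For instance, 50 convenience stores at 0.0005 stores per capita gives a proxy population of about 100,000, which can then divide any indicator that only exists in absolute value.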

That enabled me to build indicators per capita for all subjects, so I could see if they were correlated with my obesity rates.
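
I did this exploration inside Tableau, with scatterplots and trend lines. For readers who prefer to check numbers outside of Tableau, a Pearson correlation coefficient is one common way to quantify the same thing; the plain-Python sketch below is an illustration of that measure, not a description of what Tableau computes.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences.

    Returns a value between -1 and 1; raises ValueError if either
    sequence has no variance, since the coefficient is then undefined.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)
```

Running it on each per-capita indicator against the adult obesity rate gives a quick ranking of which indicators move with obesity and which do not.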

Findings – using Tableau desktop to make sense of the dataset

The indicators which were most correlated with obesity were the income ones, which came as no surprise. All income indicators were also highly correlated with each other. In the USA, poverty means having an income below a certain threshold which is defined at the federal level. But in other contexts, poverty is most often defined in relation to the median income (typically, a household is in poverty if its income is below half of the median income), so it can be used to measure the inequality of a community and the dispersion of incomes.

As a result, many indicators appear to be correlated with obesity because they are not independent of income. This is the case for instance for most of the policy indicators: if a programme has many recipients in a county, it is because poverty is widespread, so residents are more likely to be affected by obesity. This makes it difficult to measure the impact of the programmes with this dataset. This is also the case, unfortunately, for racial indicators, as most of the counties with a very high black population have a low income.

Diet indicators also appear to be uncorrelated with obesity. This is counter-intuitive – isn’t eating vegetables or fresh farm produce the most certain way to prevent obesity? But one has to remember that this dataset is aggregated at the county level. Just because a county has a high level of, say, fruit consumption per capita doesn’t mean that every household is eating that much. Realistically, consumption will be very dispersed: the households where people cook, which are less likely to be affected by obesity, will buy all the fruit, and those where people don’t cook will simply buy none. Also, just because one buys more vegetables than average doesn’t mean they don’t also buy other, less recommended foodstuffs.

The only diet indicator that appears to be somewhat correlated with obesity is the consumption of soft drinks.

When it comes to lifestyle habits, surprisingly, the proportion of households without a car and living far from a store – people likely to walk more, and so to be healthier – is positively correlated with obesity. This is because counties where this indicator is high are also poorer than average – again, income explains most of this. However, physical activity in general plays a positive role. States where people are most active, such as Colorado, enjoy the lowest obesity figures. In fact, all the counties with less than 15% obesity are in Colorado.

Finally, pricing didn’t seem to have much impact on either obesity or consumption. Why is that? Economists would call this “low price elasticity”, meaning that price changes do not encourage people to switch products and habits. But there is another explanation. Again, people who can’t cook are not going to buy green vegetables just because they are cheaper. Also, consider the tax rates that are applied: no more than 7% in the most aggressive states. Compare that figure to the 400%+ levy that is applied to cigarettes in many countries of the world! Clearly, 4-7% is not strong enough to change habits. However, this money can be used to sponsor programmes that can help people adopt safer behaviors.

What to show? making the visualization

First, I wanted to show all of those findings. If 2 variables that you expect to be correlated (say, consumption of vegetables and obesity) are in fact not correlated, a point is made! But visually, nothing is less interesting than a scatterplot that doesn’t exhibit correlation. It’s just a stupid cloud of dots.

So instead I chose to focus on the correlations I could establish, namely: obesity and income, and obesity and activity. Those are the 2 lower scatterplots of my dashboard. For income, I chose the poverty rate measure, because I’d rather have a trend line going up than going down.

I duplicated that finding with a bar chart made with median income bins. For each bin (which represents all the counties where the median income falls in that range), I would plot the average obesity rate, and, miracle! this comes up as a markedly decreasing bar chart. Now, this figure doesn’t establish correlation, let alone causality, but it certainly suggests it more efficiently than a scatterplot. Also, it can double as a navigation aid: clicking on a bar would highlight or select the relevant counties.
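For readers who want to reproduce the binned bar chart outside Tableau, here is a minimal sketch of the aggregation. It assumes each county is a (median income, obesity rate) pair; the bin width of $10,000, the function name and the six sample counties are all hypothetical, chosen only to show the mechanics.

```python
from collections import defaultdict

def average_obesity_by_income_bin(counties, bin_width=10_000):
    """Group counties by median-income bin and average their obesity rates.

    `counties` is an iterable of (median_income, obesity_rate) pairs.
    Returns {lower bound of bin: average obesity rate}, sorted by bin.
    Note: this is an unweighted average over counties, not over people.
    """
    sums = defaultdict(lambda: [0.0, 0])  # bin -> [running total, count]
    for median_income, obesity_rate in counties:
        lower = int(median_income // bin_width) * bin_width
        sums[lower][0] += obesity_rate
        sums[lower][1] += 1
    return {lower: total / count for lower, (total, count) in sorted(sums.items())}

# Invented sample values, for illustration only.
sample = [
    (28_000, 34.0), (25_000, 32.0), (37_000, 30.0),
    (41_000, 28.0), (48_000, 27.0), (62_000, 22.0),
]

for lower, avg in average_obesity_by_income_bin(sample).items():
    print(f"${lower:,} to ${lower + 10_000:,}: {avg:.1f}% average obesity")
```

Averaging within bins hides the scatter inside each bin, which is exactly why the bars look cleaner than the dots; it is also why the chart suggests the relationship rather than proving it.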

Finally, I decided to do a map. Well, actually, it was the first thing I had done, but I had second thoughts about it, and eventually I put it in. Why? First, to allow people to look up their county. Technically, my county is Travis county (Austin, TX) and I can find it easily on a map. Less so if I have to look for county names listed in order of any of their indicators. I added a quick filter on county name, for those who’d rather type than look up.

I also wanted to see whether there was a link between geography and obesity. So try the following.

  • Where are the counties with obesity rates below 15%? Colorado only.
  • If we raise the threshold a little, we get San Francisco and New York. But up to 20%, these counties remain very localized.
  • Likewise, virtually all counties above 35% are in the South – Alabama, Louisiana, Mississippi.

Population also matters. The counties with a population above 1m people tend to have lower rates – their citizens also usually have higher incomes.

I decided to zoom the map on the lower 48 by default. It is possible to zoom out to see Alaska and Hawaii, but I don’t think that the advantage of seeing them all the time outweighs the inconvenience of a smaller view of everything else when they are not needed.

Regarding the marks. Originally, I didn’t assign any variable to their size, but then thought that the larger counties (i.e. LA, Harris (Houston), Cook (Chicago) …) were underrepresented. So I assigned my population proxy to size. But then, the density of the marks competed with the intensity of the color, which was attributed to the obesity rate. So I removed that and chose a size so that marks wouldn’t overlap each other too much.

Regarding color, I wasn’t happy with the default scale. If I left it as is, it would consider that 12.5%, the minimum value of the dataset, is an extremely low number. But in absolute terms, it’s not. Most developed countries have obesity rates lower than that value at the national level. Japan or Korea are below 4%. So I made the scale start at 0. But I didn’t like the output: the counties with the highest values didn’t stand out. Eventually, I chose a diverging scale, which helped counties with high and low values to be more visible.
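To make the color discussion concrete, here is a rough sketch of how the same value lands on the three kinds of scale. This is not how Tableau computes its palettes; it is just the underlying arithmetic. Only the 12.5% minimum comes from the dataset; the 40% maximum and the 28% midpoint are assumptions picked for the example.

```python
def sequential(value, lo, hi):
    """Position of `value` on a single-hue scale, from 0.0 (lightest) to 1.0 (darkest)."""
    return (value - lo) / (hi - lo)

def diverging(value, lo, mid, hi):
    """Position on a two-hue scale: -1.0 at `lo`, 0.0 at `mid`, +1.0 at `hi`."""
    if value < mid:
        return (value - mid) / (mid - lo)
    return (value - mid) / (hi - mid)

DATA_MIN = 12.5   # minimum obesity rate in the dataset
DATA_MAX = 40.0   # assumed maximum, for illustration
MIDPOINT = 28.0   # assumed center of the diverging scale

for rate in (DATA_MIN, MIDPOINT, DATA_MAX):
    print(
        f"{rate:5.1f}%  "
        f"default: {sequential(rate, DATA_MIN, DATA_MAX):.2f}  "
        f"from zero: {sequential(rate, 0.0, DATA_MAX):.2f}  "
        f"diverging: {diverging(rate, DATA_MIN, MIDPOINT, DATA_MAX):+.2f}"
    )
```

The default scale sends the lowest county all the way to the light end, as if 12.5% were close to nothing. Starting at zero fixes that but squeezes every county into the upper part of the range, so the extremes look alike. The diverging scale gives each end its own hue, so both low and high counties stand out.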

I edited a tooltip card for the view. In another version of the dashboard, I had a sheet with numbers written out that would change depending on which county was last brushed. I like the idea that this information can stay on. But I got confused in the configuration of the actions, and couldn’t completely prevent the filter that applied to this sheet from being disabled, sometimes, which caused the numbers for all counties to overlap, and an annoying downtime as that happens. So I made a tooltip instead. Anyway, it’s easier to format text like this. But the problem is that it can hide a good portion of the dashboard. So I exercised restraint and only chose what I found to be the 15 or so most relevant variables.

Voilà! That’s it. I hope you like my dashboard, and I look forward to seeing the work of others! If you are a contestant, please leave a link to your entry in the comments. Good luck to all!!

 

Slideshare.net 2009 contest: I’m endorsing Dan Roam

The Slideshare 2009 contest is up again, and there’s about 1 week left to vote. For the contest, I’m endorsing Dan Roam and counting on everyone to vote for him and support his presentation. Previous winners of the contest include Shift Happens and Thirst, which got a lot of coverage and views. I think that Dan’s unique presentation style should get more exposure. One way to see the contest entries is sorted by votes, so the ones with the most votes show on top. Dan’s presentation is currently #10, less than 200 votes behind the top spot. But you can only vote once per account. So if you see a presentation you like and give it your vote, it is gone forever.

Dan wrote The Back of the Napkin which is also the name of the blog he maintains. I enjoyed this book, and I think you should too.

The idea: all of the world’s problems can be solved by drawing. And even if you think you can’t draw, like most adults, it’s much simpler than it seems and it’s quite fun.

Problems can be reduced to 6 types of questions: who/what, how many, how, where, when and why. Each of these questions can be associated with a broad type of representation, for instance “where?” questions can be solved by a map where different elements are plotted. So that’s one way of categorizing visual representations.

The other axis that the author develops is what he calls the SQVID. Depending on your audience, what you want to show may be:

  • simple vs elaborate,
  • quality vs quantity,
  • vision vs execution,
  • individual vs comparison,
  • change (Delta) vs as-is.

The combination of the SQVID framework and the who/what, how many, how, where, when, and why questions leads you to one logical choice of representation, which will make your audience go “a-ha” – guaranteed.

The logic holds, although I feel he tweaked his process for most if not all of the examples in the book. Anyway, this line of thought can easily be reproduced and can solve problems. Now the hand-drawn style is not necessary to this process, but is a nice touch. I’ve used it in presentations and it gets attention and sympathy. I was amazed to see how much easier and quicker it is to draw a visual that works by hand than with user-friendly software. I’m inclined to think that the corporate world would be much more interesting (and fun) if there were more drawings and fewer Word documents.

For those reasons, go vote for Dan Roam.

Update

Dan Roam won! Congratulations!

 

Re: flowing data contest, code for my entry

Here is the Processing applet with source code for my entry. The code is based on a charting program I’ve been working on, on and off, for a while, which I’ll publish once it’s more polished.
Select the applet, then press a key to alternate between the 2 representations. The text file is the data, and the two png files are the images of the results.

Inserting Processing applets in WordPress is not obvious. Here’s a discussion post for anyone interested.



source: fdcontest.pde

media files: params.txt image.png image_matrix2.png