So here goes: my entry for the 2012 Tableau contest.
Before I go on explaining what this is about and how it’s been made, I’d like to express my thanks to Opta for letting me use this awesome dataset. They have been very responsive and supportive.
So what is this?
So the assignment was to create a visualization about sports, and who says sports says football. There have been a number of epic football games in recent history, and the Real Madrid / APOEL Nicosia match-up last April is definitely one of them. It’s that game where APOEL defender Paulo Jorge, after losing 2 teeth in an unfortunate collision with a teammate, decided to pull out a third one that was hanging loose. And although Real ended up winning by a large margin, 5-2, few teams were able to score 1 goal against 2012’s Real Madrid, let alone 2.
So I chose to represent the game as a network.
The circles on the right (Real Madrid) are much larger, which means that Real players were involved in more events in the game than their APOEL counterparts. The dataset I could work from had all sorts of events recorded, such as passes or shots. The lines linking the left side also seem darker. You can also tell that lines between attackers (lower part of the screen) are much thinner than those that involve midfielders, who are really building the game, or defenders. Lines are colored by the team that initiated the action. So if a line in the color of one team ends at a player of the other team, it means the ball has been stolen. That way you can explore how defenders were able to stop individual attackers, duels among midfielders, etc.
This being a contest, I’ve thrown in a little subtlety: the possibility of switching the display from a more abstract circle-diagram network to a second representation where players are positioned according to their average physical position during the game. For this second representation, I’ve chosen to show the teams side by side, and not one on top of the other, so as to keep the chart legible.
How is this done?
First let me explain what I attempt to do in a contest.
I try to do three things. First, work on interesting, non-trivial data. Second, aim for virtuosity in the execution: try to come up with a difficult technique and hopefully pull it off nicely. Finally, link the two together: make the visualization relevant, so it supports the interestingness of the data, without being too busy or complex.
So we talked about the data. Again, I must say it’s been really comfortable working with a professional dataset. The football statistics that reach mass audiences are usually very aggregated, so it’s really nice to have access to this level of detail. And by the way, I only used the tip of the iceberg of that data.
Now the technique.
Some time ago I wrote how I regretted that Tableau doesn’t have a network graph module. I did, though, have a fair idea of how to hack one.
As in a previous post on treemaps, we are using the line mark with aptly chosen coordinates. Note that since the lines I am drawing rely on two records, we won’t be needing the path shelf this time.
There are two kinds of “lines” here: those that act as nodes, and those that act as links.
Nodes are where the players’ circles are, and are also used to support labels. They are given a size depending on the number of events each player is involved in. That number has been precomputed and is made available with the data.
To explain how links work: I have 3 sheets in my data. The first is on players, and provides the location of each player in each mode, their names, and other such details. The second is on points, and lists the begin- and end-points of all the links, each point being linked to a record in the player sheet. The final sheet is the one on lines, where each record has a start point and an end point.
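To make the three-sheet layout a bit more concrete, here is a minimal Python sketch of how lines, points and players could be joined so that each link ends up as two records, one per endpoint. This is not the actual workbook: the field names (`player_id`, `start`, `end`, `kind`) and the sample rows are placeholders made up for illustration.

```python
# Hypothetical miniature of the three sheets; every name and value is illustrative.
players = {
    "P1": {"name": "Player A", "team": "Real Madrid"},
    "P2": {"name": "Player B", "team": "APOEL"},
}
points = {
    "pt1": {"player_id": "P1"},
    "pt2": {"player_id": "P2"},
}
lines = [
    {"line_id": "L1", "kind": "pass", "start": "pt1", "end": "pt2"},
]


def expand_lines(lines, points, players):
    """Return two records per line (start, then end), each carrying player details."""
    records = []
    for line in lines:
        for order, key in enumerate(("start", "end")):
            point_id = line[key]
            player = players[points[point_id]["player_id"]]
            records.append({
                "line_id": line["line_id"],
                "kind": line["kind"],
                "order": order,  # 0 = start of the link, 1 = end of the link
                "point_id": point_id,
                "player": player["name"],
                "team": player["team"],
            })
    return records


records = expand_lines(lines, points, players)
```

Each link becomes exactly two rows sharing a `line_id`, which is the shape a line mark needs when it is drawn from two records.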
So essentially, all the positioning happens in the player sheet. Each player and each special location there is assigned two positions, one around the circle diagram and one in the other form. Both are calculated before the data is supplied to Tableau. In Tableau, a parameter then lets us simulate a transition from one form to the other, and even the intermediary states are legible 🙂
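One way to picture the two positions and the parameter, as a Python sketch rather than the actual workbook: I’m assuming here that the circle positions are evenly spaced and that the transition is a plain linear blend driven by a value `t` between 0 and 1. The function names and the sample coordinates are made up for illustration.

```python
import math


def circle_position(index, count, radius=1.0):
    """Place item `index` of `count` evenly around a circle centred on the origin."""
    angle = 2 * math.pi * index / count
    return (radius * math.cos(angle), radius * math.sin(angle))


def blend(circle_xy, pitch_xy, t):
    """Linear interpolation: t=0 gives the circle layout, t=1 the pitch layout."""
    (cx, cy), (px, py) = circle_xy, pitch_xy
    return (cx + t * (px - cx), cy + t * (py - cy))


start = circle_position(0, 4)        # first of four slots around the circle
target = (3.0, 2.0)                  # made-up average position on the pitch
halfway = blend(start, target, 0.5)  # an intermediary state of the transition
```

Because both end positions are precomputed, the only thing that has to be evaluated live is the blend itself, once for x and once for y.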
For the tooltips, I have prepared different scenarios depending on the type of line being hovered over (passes, shots, goals, players…), so that one formula can be used to generate a useful and relevant text.
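The idea of one formula covering several scenarios can be sketched like this in Python. The record fields (`kind`, `source`, `target`, `events`) and the wording of each message are placeholders, not the text used in the actual dashboard.

```python
def tooltip(record):
    """Build the tooltip text for one mark, branching on its (hypothetical) kind."""
    kind = record["kind"]
    if kind == "player":
        return f'{record["player"]} ({record["team"]}): {record["events"]} events'
    if kind == "pass":
        return f'Pass from {record["source"]} to {record["target"]}'
    if kind == "shot":
        return f'Shot by {record["source"]}'
    if kind == "goal":
        return f'Goal by {record["source"]}'
    return ""  # unknown kinds get no tooltip


example = tooltip({"kind": "pass", "source": "Player A", "target": "Player B"})
```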
Finally, I used a little trick around transparency. Here it is absolutely necessary to use transparency, because otherwise the networks become very, very busy and difficult to read. However, I did want the goal lines to stand out, so I wanted them to be opaque. Problem: there is no transparency shelf. You can only set transparency for all the marks at once. So each goal link is really several links, one on top of the other, so that the resulting line looks opaque.
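Why stacking works can be shown with a little arithmetic. Assuming standard alpha compositing (which I have not verified is exactly what Tableau does), n identical marks of opacity a drawn on top of each other have a combined opacity of 1 - (1 - a)^n. A Python sketch, with made-up opacity values:

```python
def stacked_opacity(alpha, copies):
    """Combined opacity of `copies` identical marks of opacity `alpha`, stacked."""
    return 1 - (1 - alpha) ** copies


def copies_needed(alpha, target):
    """Smallest number of stacked copies whose combined opacity reaches `target`."""
    if not (0 < alpha <= 1) or not (0 <= target < 1):
        raise ValueError("alpha must be in (0, 1] and target in [0, 1)")
    n = 1
    while stacked_opacity(alpha, n) < target:
        n += 1
    return n


two_layers = stacked_opacity(0.5, 2)   # two half-transparent marks
needed = copies_needed(0.2, 0.9)       # copies of a 20% mark to look about 90% opaque
```

The combined opacity never quite reaches 1, but it gets close enough after a handful of copies for the line to read as solid.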
And finally the relevance.
I didn’t want to pack in too much data at once. Sure, there is much more data than in a traditional game report like those we see in the sports press, but not to the point that it’s overwhelming. I tried to make it possible to see the outcome of the game immediately, by counting the thicker lines going out of the circle: 2 to 5.
Instead of packing the dashboard with tightly fitted panels, I have used just this one, with a single control to let the user change the representation.
So, I hope you all like it!