Stuff I do with Tableau

I put together a list of some of the things I’ve done with Tableau Public there. I had put the link on Twitter last Friday, and I just saw the number of connections through bit.ly – never thought something posted just before the weekend could get so much attention! So thanks, Twitter followers. I’m putting another link here on the blog for convenience.

I’ve been using Tableau Public ever since its release (well, actually a tad before). Over time I’ve been trying to use it less as a visually pleasing and convenient way to shove a lot of data in a limited space (like here), and more as a way to promote a certain angle when looking at a dataset (like here), that is, as an invitation to the viewers to reach the same conclusions as us, but by giving them access to the data so they can see for themselves.

One of my first vizzes - just a layer on top of the dataset

This one is less neutral. I would like readers to embrace the opinion in the associated article, so I use Tableau to present the data in a way that supports this opinion.

That long list is still a subset of what I’ve done with Tableau, mostly because like many people in datavis I use Tableau at two stages.

I use it to communicate a finished visualization such as these, although I may use a static image or another interactive tool.

But I also almost systematically use it in the early stages, when I receive a dataset and I need to make sense of it before I can represent it. By manipulating a dataset in Tableau and testing various basic dimension combinations one can quickly see the points of interest in the data and come up with relevant questions to ask the data, to which a visualization is the answer. So while I can’t share these “drafts” they are very very helpful.

Also, over time I got better (hopefully) at controlling dashboards so they look exactly the way I want them to and not how I manage to put them together. What helps is setting the size of the dashboard as exact dimensions (e.g. 600 by 400), not as a range, and, unsurprisingly, drawing the dashboard on paper first. Anyway, all the dashboards on that page are freely downloadable if you want to see how they are done.

More on Tableau Public

Yesterday’s post on Tableau Public generated a surge of traffic so I thought I should add more examples and practical information for people interested in the software.

Here’s a quick one on health, based on OECD Health at a Glance:

click to interact

Just select two indicators, and you see how one influences the other. Or rather, how the two are correlated, because correlation doesn’t imply causation!
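For readers who want a number to go with the scatter plot, here is a minimal Python sketch of the Pearson correlation coefficient between two indicators. It has nothing to do with how Tableau works internally; the function name and the values are made up for illustration and are not taken from OECD Health at a Glance.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, all without the 1/n factor (it cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for two indicators, one pair per country.
indicator_a = [1.0, 2.0, 3.0, 4.0]
indicator_b = [2.0, 4.0, 6.0, 8.0]
r = pearson(indicator_a, indicator_b)
print(round(r, 3))
```

A coefficient close to 1 or -1 only says that the two indicators move together, not that one drives the other. Note that the sketch fails with a division by zero if one of the lists is constant.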

Here are links to more examples made with Tableau Public.

Another Paris-based intergovernmental organisation is using Tableau – UNESCO.

These 2 have been done by PAHO to describe the situation in Haiti (the 2nd is really powered by Tableau Server, but it’s close enough).

There are further examples on the Tableau blog.

Now more about Tableau Public and the Beta.

Tableau Public doesn’t exactly let you do, from the web, everything that Tableau does. To prepare the views which are going to be published on the web, you need to use software that runs on your computer. It lets you do whatever you can do with the regular Tableau Desktop, with a couple of limitations: you have to stick to basic source file types (Access, Excel, and text files, no exotic database) and you are limited to 100,000 records of data. One other difference with the regular Tableau Desktop is that you can’t save your work locally: you have to save it on the web, in your private space on Tableau servers. However, Tableau Public has the same analytical and visual features as Tableau Desktop.

When your work is published, users don’t have access to all the tools you had when creating the view: they can’t move dimensions around, create exotic filters or calculations. They really see the chart as you intended it to be seen. There are a certain number of interactions built-in, however: users can select, highlight, sort and filter. If you are publishing a dashboard, the different tables and charts of the dashboard can be linked, meaning that an action (such as highlighting one dimension) in one place will be replicated elsewhere, or not. The underlying data can also be downloaded. So there is a great deal of interactivity, but not enough to twist your display beyond recognition. That being said, other Tableau Public users can download your workbook and manipulate it with the client software.

About the Beta: currently, Tableau Public is in closed beta. It will be in open beta in February, as far as I know. To get a spot in the closed beta, you need to write to the people at Tableau.

Using Tableau Public: first thoughts

I am currently beta testing Tableau Public. Essentially, Tableau Public lets you bring the power of Tableau analysis online. With Tableau Public, your audience doesn’t need to download a workbook file to open in an offline software client – they can see and interact with your work directly on a web page.

There are quite a few examples of the things you can do with Tableau Public. These are the examples you are given when you start the product:

Tracking Economic Indicators by Freakalytics
A Tale of 100 Entrepreneurs by Christian Chabot
Bird strikes by airport by Crankyflier
Interactive Running Back Selector by CBS Sports

And there are always more on Tableau’s own blog. I’ve done quite a few which I’ll share progressively on this blog and on my OECD blog, http://www.oecd.blog/statistics/factblog.

So that’s the context. What’s the verdict?

1. There is no comparable data visualization platform out there.

There are many ways to communicate data visually. Count them: 1320, 2875… and many more.

However, these tools have a narrower focus than Tableau, or require some programming ability from the user. For instance, Many Eyes offers a certain number of visualization types which can be set up in seconds, but which cannot be customized. Conversely, Protovis is very flexible but requires some knowledge of JavaScript. And even for a skilled developer, coding an interactive data visualization from scratch takes time.

By contrast, Tableau is a fully-featured solution which doesn’t require programming. It has many representation types which can be deeply customized: every visual characteristic of a chart (colour, size, position, etc.) can depend on your data. Several charts can also be combined as one dashboard. On top of that, data visualization done in Tableau comes with many built-in controls, with an interface to highlight and filter data, or to get more details on demand. For dashboards, it is also possible to link charts, so that actions done on one chart (highlighting records, for instance) affect other charts.

2. The solution is not limitless.

Tableau enables you to do things which are not possible using other packages. But it doesn’t allow you to do just anything. That’s for your own good – it won’t allow you to do things that don’t make sense.

There are many safety nets in Tableau, which you may or may not run into. For instance, you can’t make a line chart for data which don’t have a temporal dimension – so much for parallel coordinates. However, the system is not fool-proof. Manipulating aggregates, for instance, can lead to errors that you wouldn’t have to worry about in plain old Excel, where the various steps through which data are computed to create a graph are more transparent (and more manual). Compared to Excel, you have to worry less about formatting – the default options for colours, fonts and positions are sterling – and be more vigilant about calculations.

3. Strength is in numbers.

Over the years, many of us grew frustrated with Excel’s visual capabilities. Others firmly believed that anything could be done with the venerable spreadsheet and have shown the world that nothing is impossible.

The same applies to Tableau. The vibrant Tableau community provides excellent advice. “Historic” Tableau users are not only proficient with the tool, but also have a better knowledge of data visualization practices than the average Excel user. Like any fully-featured product, Tableau has a learning curve, which means that there are experts (the proper in-house term is Jedis) who find hacks to make Tableau even more versatile. So of course, it is possible to do parallel coordinates with Tableau.

The forum, like the abundant training, available as videos, manuals, lists of tips, or online sessions with an instructor, doesn’t only help users solve their problems, it is also a fantastic source of inspiration.

With the introduction of Tableau Public, the forum will become even more helpful, as there will be more questions, more problems and more examples.

Review of Tableau 5.0

Over the last 2 weeks, I finally found time to give Tableau 5.0 a try. Tableau enjoys a stellar reputation among the data visualization community. About a year ago, I saw a live demo of Tableau by CEO and salesman extraordinaire Christian Chabot. Like most of the audience, I was very impressed, not so much by the capabilities of the software as by the ease and speed with which insightful analysis seemed to appear out of bland data. But what does it feel like from the user’s perspective?

Chartz: ur doing it wrong

Everyone who wrote about charts would pretty much agree that the very first step in making one is to decide what to show. The form of the display is a consequence of this choice.

Most software gets this wrong: it asks you what you want your display to look like, then asks you for your data. Take this screenshot from Excel:

excel

When you want to insert a chart, you must first choose what kind of chart (bar, line, column, pie, area, scatter, other charts) and one of its sub-types. You are not asked, what data does this apply to, and what that data really is. You are not asked, what you are trying to show through your chart – this is something you have to manage outside of the software. You just choose a chart.

I’m picking Excel because with 200m users, everyone will know what I’m talking about, but virtually all software packages ask the user to choose a rather rigid chart type as a prerequisite to seeing anything, despite overwhelming theoretical evidence that this approach is flawed. In Excel, like in many other packages, there is a world of difference between a bar chart and a column chart. They are not of the same nature.

A reversed perspective

Fortunately, Tableau does it the other way round. When you first connect with your data in Tableau, it distinguishes two types of variables you can play with: dimensions and measures. And measures can be continuous or discrete.

tableau-dimensions

(This is from an example file.)

Then, all you have to do is to drag your dimensions and your measures to the center space to see stuff happening. Let’s drag “close” to the rows…

tableau-dragging-1

We already see something, which is not terribly useful but still. Now if we drag Date into the columns…

tableau-dragging-2

Instant line chart! The software found out that this is the type of representation that made the most sense in this context. You’re trying to plot continuous variables over time, so it’s pretty much a textbook answer. Let’s suppose we want another display: we can click on the aptly named “show me!” button, and:

tableau-show-me

These are all the possible representations we have. Some are greyed out, because they don’t make sense in this context. For instance, you need to have dimensions with geographic attributes to plot things on a map (bottom left). But if you mouse over one of those greyed out icons, you’ll be told why you can’t use them. So we could choose anything: a table, a bar chart, etc.

A simple thing to do would be to switch rows and columns. What if we wanted to see date vertically and the close horizontally? Just drag and drop, and:

tableau-flip

Crafting displays

Gone are the frontiers between artificial “chart types”. We’re no longer forcing data into preset representations; rather, we assign variables (or their automatic aggregation, more on that shortly) to possible attributes of the graph. Rows and columns are two, which shouldn’t be taken too literally – in most displays, those would be better described as abscissa and ordinate – but all the areas in light grey (called “shelves”) can welcome variables: pages, filters, path, text, color, size, level of detail, etc.

more-dimensions

Here’s an example with a more complex dataset. Here, we’re looking at sales figures. We’re plotting profit against sales. The size of each mark corresponds to the volume of the order, and the colour to its category. Results are presented year by year. It is possible to loop through the years. So this display replicates the specs of the popular Trendalyzer / Motion chart tool, only simpler to set up.

Note that as I drag variables to shelves, Tableau often uses an aggregation that it thinks makes more sense. For instance, as I dragged Order Date to the page shelf, Tableau picked the year part of the date. I could ask the program to use every value of the date: the display would be almost empty, but there would be a screen for each day. Likewise, when I dragged Order Quantity to the Size shelf, Tableau chose to use the sum of Order Quantity instead. Not that it makes much of a difference here, as each bubble represents only one order. But the idea is that Tableau will automatically aggregate data in a way that makes sense to display, and that this can always be overridden.
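To make the idea of a default aggregation concrete, here is a rough Python sketch of the two choices described above: grouping by the year part of the date and summing quantities, versus keeping every exact date. This is only an illustration of the concept, not Tableau’s actual logic; the records and function names are made up.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (order date, order quantity). Values are made up.
orders = [
    (date(2006, 3, 14), 10),
    (date(2006, 11, 2), 5),
    (date(2007, 1, 20), 7),
    (date(2007, 6, 9), 3),
]

def sum_quantity_by_year(records):
    """Default-style aggregation: group by the year part and sum quantities."""
    totals = defaultdict(int)
    for order_date, quantity in records:
        totals[order_date.year] += quantity
    return dict(totals)

def sum_quantity_by_day(records):
    """Overridden aggregation: keep every exact date as its own group."""
    totals = defaultdict(int)
    for order_date, quantity in records:
        totals[order_date] += quantity
    return dict(totals)

yearly = sum_quantity_by_year(orders)
daily = sum_quantity_by_day(orders)
print(yearly)      # one "page" per year
print(len(daily))  # one "page" per distinct day
```

With yearly grouping there are two pages to loop through; with daily grouping there is one page per distinct date, each nearly empty, which is the trade-off mentioned above.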

But if I keep the data for all the years in the display, I can quickly see the transactions where profit was negative.

sets1

And I can further investigate this set of values.

So that’s the whole idea. Because you can assign any variable to any attribute of the visualization, in the Tableau example gallery you can see some very unusual examples of displays.

Using my own data

When I saw the demos, I was a little skeptical of the data being used. I mean, things were going so smoothly, evidence seemed to be jumping at the analyst, begging to be noticed. Tableau’s not bad at connecting with data of all forms and shapes, so I gave it a whirl with my own data.

Like a lot of other official data providers, OECD’s format of choice for exporting data is SDMX, a flavor of XML. Unfortunately, Tableau can’t read that. So the next easiest thing for me was Excel.

I’m not going to get too much into details, but to come up with a worksheet that Tableau liked with more than a few tidbits of data required some tweaking and some guessing. The best way seems to be: a column for each variable, dimensions and dates included, and don’t include missing data (which we usually represent by “..” or by another similar symbol).

Some variables weren’t automatically recognized for what they were: some were detected as dimensions when they were measures, and date data wasn’t processed that well (I found that using 01/01/2009 instead of 2009 or 1/2009 worked much better). But again, that was nothing that a little bit of tweaking couldn’t overcome.
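The tweaking described above can be sketched in a few lines of Python. This is a minimal example under assumptions of my own: the input is a hypothetical wide table with one row per country and one column per year, missing values are marked “..”, and the numbers are made up. It produces one row per observation, skips missing values and writes years as full dates such as 01/01/2009.

```python
import csv
import io

# Hypothetical wide export: one row per country, one column per year,
# with ".." marking missing values. The numbers are made up.
wide_csv = """Country,2007,2008,2009
France,1.0,..,3.0
Japan,4.0,5.0,..
"""

MISSING = {"..", ""}

def wide_to_long(text):
    """Return rows of (country, date, value), skipping missing values."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        country = record["Country"]
        for column, raw in record.items():
            if column == "Country" or raw.strip() in MISSING:
                continue
            # Write the year as a full date, e.g. 01/01/2009.
            rows.append((country, f"01/01/{column}", float(raw)))
    return rows

long_rows = wide_to_long(wide_csv)
for row in long_rows:
    print(row)
```

Each output row has one column per variable (country, date, value), which is the layout that seemed to work best; the rows can then be written back to a worksheet or text file.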

On a few occasions, I scratched my head quite hard trying to understand why I could get Y-o-Y growth rates for some variables but not for others, or how to make custom calculated fields. Note that there are plenty of online training videos on the website. I found myself climbing the learning curve very fast (and have heard similar statements from recent users who quickly felt empowered) but am aware that practice is needed to become a Tableau Jedi. What I found reassuring is that without prior knowledge of the product, but with exposure to data design best practices, almost everything in Tableau seems logical and simple.
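For reference, a year-on-year growth rate is (current value - previous year’s value) / previous year’s value. The small Python sketch below is independent of Tableau; the function name and values are made up. It also shows that a rate can only be computed for a year when the previous year’s value is present and non-zero, which is one general reason a growth rate may be unavailable for some series.

```python
def yoy_growth(series):
    """Return {year: growth rate} for each year whose previous year is present.

    series maps a year (int) to a value.
    Growth is (current - previous) / previous.
    """
    growth = {}
    for year in sorted(series):
        previous = series.get(year - 1)
        if previous is None or previous == 0:
            # No rate without a non-zero value for the year before.
            continue
        growth[year] = (series[year] - previous) / previous
    return growth

# Made-up values; 2007 is missing on purpose.
values = {2005: 100.0, 2006: 110.0, 2008: 121.0}
rates = yoy_growth(values)
print(rates)
```

Here only 2006 gets a rate: 2005 has no earlier year to compare with, and 2008 cannot be compared with the missing 2007.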

But anyway – I was in. Here’s my first Tableau dashboard:

my-dashboard

A dashboard is a combination of several displays (sheets) in one space. And believe me, it can become really sophisticated, but here let’s keep it simple. The top half is a map of the world with bubbles sized after the 2007 population of OECD countries. The bottom half is the same information as a bar chart, with a twist: the colour corresponds to the population change in the last 10 years. So the USA (green) has been gaining population while Hungary has seen its numbers decrease.

I’ve created an action called “highlighting on country” to link both displays. The best feature of these actions is that they are completely optional, so if you don’t want to have linked displays, it is entirely up to you and each part of the dashboard can behave independently. You can also bring in controls to filter or animate data, which I left out for the sake of simplicity. However, you can still select data points directly to highlight them in both displays, like this:

my-dashboard-highlight-bottom

Here I’ve highlighted the top 5 countries. The other ones are muted in both displays. Here my colour choice is unfortunate because Japan and Germany, which are selected, don’t look too different from the other countries. Now I can select the values for the countries of Europe:

my-dashboard-highlight-europe

And you’ll see them highlighted in the bottom pane.

Display and style

Representing data in Tableau feels like flipping the pages of a Stephen Few book, which is more than coincidental as he is an advisor to Tableau. From my discussion with the Tableau consultant who called me, I take it that Tableau takes pride in its sober look and feel, which fervently follows the recommendations of Tufte and Few. I remember a few posts from Stephen’s blog where he lashed out at business intelligence vendors for their vacuous pursuit of glossiness over clarity and usefulness. Speaking of Few, I’ve upgraded my Tableau trial by re-reading his previous book, Information Dashboard Design, and I could really see where his philosophy and that of Tableau clicked.

So there isn’t anything glossy about Tableau. Yet the interface is state-of-the-art (no more, no less). Anyone who has used a PC in the past 10 years can use it without much guessing. Colours of the various screen elements are carefully chosen and command placement makes sense. Most commands are accessible in contextual menus, so you really feel that you are directly manipulating data the whole time.

When attempting to create sophisticated dashboards, I found that it was difficult to make many elements fit on one page, as the white space surrounding all elements becomes incompressible. I tried to replicate displays that I had made or that I had seen around. I was often successful (see the motion chart reproduction above), but sometimes I couldn’t achieve in Tableau the level of customization that I had with visualizations coded from scratch. Then again, even Tableau’s simplest representations have many features and would be difficult to re-code.

Sharing data

According to Dan Jewett, VP of product development at Tableau,

“Today it is easier to put videos on the Web than to put data online.”

But my job is precisely to communicate data, so I’m quite looking forward to this state of affairs changing. Tableau’s answer is twofold.

The first half is Tableau Server. Tableau Server is software that organizes Tableau workbooks for a community so they can access them online, from a browser. My feeling is that Tableau Server is designed to distribute dashboards within an organization, less so with anyone on the internet.

That’s where the second part of the answer, Tableau Public, comes into play. Tableau Public is still in closed beta, but the principle is that users would have a free desktop application which can do everything that Tableau Desktop does, except saving files locally. Instead, workbooks would have to be published on Tableau servers for the world to see.

There are already quite a few dashboards made by Tableau Public’s first users around. See for instance How Long Does It Take To Build A Technology Empire? on one of the WSJ blogs.

Today, there is no shortage of tools that let users embed data online without technical manipulations. But as of today, there is no product that comes close to this embedded dashboard. Stephen McDaniel from Freakalytics notes that due to Tableau’s technical choices (JavaScript instead of Flash), dashboards from Tableau Public can be seen on a variety of devices, including the iPhone.

I’ve made a few dashboards that I’d be happy to share with the world through Tableau Public.

This wraps up my Tableau review. I can see why the product has such an enthusiastic fan base. People such as Jorge Camoes, Stephen Few, Robert Kosara, Garr Reynolds, Nathan Yau, and even the Federal CIO Vivek Kundra have all professed their love for the product. The Tableau Customer Conference, which I’ve only been able to follow online so far, seems to be more interesting each year. Beyond testimonies, the gallery of examples (again at http://www.tableausoftware.com/learning/examples, but do explore from there to see videos and white papers), still in the making, shows the incredible potential of the software.