Working with data in protovis: part 5 of 5

11 February, 2011 (20:11) | charts, data visualization, protovis, tips | By: jerome


Working with layouts

In this final part, we’re going to look at how we can shape our data to use the protovis built-in layouts such as stacked areas, treemaps or force-directed graphs.
This is not a tutorial on how to use layouts stricto sensu, and I advise anyone interested to first look at the protovis documentation to see what can be done with this and to understand the underlying concepts.

But if there is one thing to know about layouts, it’s that they allow you to create non-trivial visualizations in even less code than regular protovis, provided that you pass them data in a form they can use, and this is precisely where we come in.

Three great categories of layouts

Currently, there are no fewer than 13 types of layouts in Protovis. Fortunately, there are examples for all of them in the gallery.
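Broadly, there are layouts for arrays of data (grids and stacks), for networks (arc diagrams, matrix diagrams and force-directed layouts), and for hierarchies (trees, treemaps, icicles and the like).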

In addition, there are layouts like pv.Layout.Bullet which require data in a specific shape, but the example from the gallery is very explicit (the same goes for the Horizon layout).

Arrays of data

In order to work with this kind of layout, the simplest thing is to put your data in a 2-dimensional array:

var data=[
   [8,3,7,2,5],
   [9,6,1,7,4],
    ...
   [7,4,3,6,8]
];

For the grid layout, this gives you an array of cells divided in columns (number of elements in each line) and rows (number of lines).
The idea of the grid layout is that your cells are automatically positioned and sized, so afaik the only thing you can do is add a mark such as a pv.Bar which would fill them completely, but which you could still style with fillStyle or strokeStyle. You can’t really access the underlying data with functions but you can use methods that rely on default values, like adding labels.

For instance, you can use it to generate a QR code:

var qr=[
"000000000000000000000000000",
"011111110001010100011111110",
"010000010101001110010000010",
"010111010000010100010111010",
"010111010111011110010111010",
"010111010010000001010111010",
"010000010110110010010000010",
"011111110101010101011111110",
"000000000011100100000000000",
"011111011110101110101010100",
"000010101001010111101000100",
"010101111001001011111010110",
"001011000100010101010100010",
"001100010111011010010101110",
"010101100110001101001010100",
"010011010011111111100110110",
"010111101010100101000010010",
"010100110010111101111101000",
"000000000101010111000111000",
"011111110100011001010111110",
"010000010000110011000110110",
"010111010110001011111111000",
"010111010101101100110101110",
"010111010100000111001001010",
"010000010111010101101110010",
"011111110101001100011111110",
"000000000000000000000000000",
].map(function(i) i.split(""));

var vis = new pv.Panel()
    .width(216)
    .height(216);
vis.add(pv.Layout.Grid)
    .rows(qr)
    .cell.add(pv.Bar)
        .fillStyle(pv.colors("#fff", "#000"));
vis.render();
(BTW, this is the QR code to this page)

On the last line of the array definition, I’m using a map function to turn this array of strings, which is easier and shorter to type, into a bona fide 2-dimensional array.
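The same conversion can be written with a standard function expression, which also runs outside of a protovis script block. The sketch below is only an illustration: toGrid is a hypothetical helper name, and turning each character into a number is an optional extra step, not something the grid layout requires.

```javascript
// Hypothetical helper: turn an array of equal-length strings into a 2-D array.
function toGrid(lines) {
    return lines.map(function (line) {
        // split("") yields one-character strings; Number turns "0"/"1" into 0/1
        return line.split("").map(Number);
    });
}

var grid = toGrid(["0110", "1001", "0110"]);
// grid.length is the number of rows, grid[0].length the number of columns
console.log(grid.length + " rows x " + grid[0].length + " columns");
```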

That’s all there is to grids. Of all the layouts, they are among the easiest to reproduce with regular protovis.

Now, stacks.
The easiest way to use them is to pass them 2-dimensional arrays, with one inner array per layer. These don’t have to be arrays of numbers; they can be arrays of associative arrays in case you need to do something exotic. But for the following examples let’s just assume you don’t. Here is how you’d do a stacked area, stacked columns and stacked bars respectively:

// one inner array per layer, one number per x position
var data=[
    [1000,1200,1500,1700],
    [100,500,300,200]
];
var vis=new pv.Panel().width(200).height(200);
vis.add(pv.Layout.Stack)
    .layers(data)
    .x(function() 50*this.index)
    .y(function(d) d/20)
    .layer.add(pv.Area)

All you need is to feed the layers, x and y properties of your stack, then say what you want to add to your layers.
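To get a feel for what the layout does with such an array, here is a rough sketch of the arithmetic behind a stack with a zero baseline: each layer starts where the layers below it end. stackBaselines is a made-up helper for illustration and is not part of protovis; the real layout also handles ordering and other offsets.

```javascript
// Hypothetical helper: for each layer, compute the baseline (y0) of every
// point as the sum of the values of the layers below it.
function stackBaselines(layers) {
    var running = layers[0].map(function () { return 0; });
    return layers.map(function (layer) {
        var y0 = running.slice();   // copy the current baselines
        running = running.map(function (v, i) { return v + layer[i]; });
        return y0;
    });
}

var data = [
    [1000, 1200, 1500, 1700],
    [100, 500, 300, 200]
];
var baselines = stackBaselines(data);
// baselines[0] is all zeros; baselines[1] equals the first layer
console.log(baselines);
```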
Now, columns:

vis.add(pv.Layout.Stack)
    .layers(data)
    .x(function() 50*this.index)
    .y(function(d) d/20)
    .layer.add(pv.Bar).width(40)

and finally, bars:

vis.add(pv.Layout.Stack)
    .layers(data)
    .orient("left")
    .x(function() 50*this.index)
    .y(function(d) d/20)
    .layer.add(pv.Bar).height(40)

For bars, there is a little trick here. I specify that the layer orientation is horizontal (“left”) and I change the height instead of the width of the added pv.Bar.
And that’s all there is to it. You can create various streamgraphs by playing with the order and offset properties of the stack, but this doesn’t change anything about the data structure, so we’re done here.
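Data often arrives as one record per period rather than one array per series. Here is a minimal sketch of the reshaping, assuming records shaped like the made-up sales array below (toLayers and the field names are hypothetical):

```javascript
// Hypothetical input: one object per year, one field per series.
var sales = [
    {year: 2007, books: 1000, music: 100},
    {year: 2008, books: 1200, music: 500},
    {year: 2009, books: 1500, music: 300},
    {year: 2010, books: 1700, music: 200}
];

// Build one array per series, in the order given by keys.
function toLayers(records, keys) {
    return keys.map(function (key) {
        return records.map(function (record) { return record[key]; });
    });
}

var layers = toLayers(sales, ["books", "music"]);
// layers[0] holds the books series, layers[1] the music series
console.log(layers);
```

The result is a 2-dimensional array with one inner array per layer, which is the shape passed to the layers property in the examples above.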

Representing networks

Protovis provides 3 cool layouts to easily exhibit relationships between nodes: arc diagrams, matrix diagrams and force-directed layouts.
The good news is that the shape of the data required by those three layouts is identical.

They require an array that corresponds to the nodes. This can be as simple as a pv.range(), or as sophisticated as an array of associative arrays if you want to style your network graph according to several attributes of the nodes.

And they also require an array for the links. This array has a more rigid form: it must be an array of associative arrays of the shape {source: #, target: #, value: #}, where the values for source and target correspond to the position of a node in the node array, and value indicates the strength of the link.

So let’s do a simple one.

var nodes=pv.range(6); // why make it more complex, right?
var links=[
{source:0, target:1, value:2},
{source:1, target:2, value:1},
{source:1, target:3, value:1},
{source:2, target:4, value:4},
{source:3, target:5, value:1},
{source:4, target:5, value:1},
{source:1, target:5, value:3}
]
var vis = new pv.Panel()
    .width(200)
    .height(200)
    ;
var arc = vis.add(pv.Layout.Arc)
    .nodes(nodes)
    .links(links)
	.bottom(100)
arc.link.add(pv.Line);
arc.node.add(pv.Dot)
    .size(50)
vis.render();

Here, the thickness of each arc varies with the strength of its link. The nodes are left unstyled; had we passed a richer dataset to the nodes array, we could have changed their properties (fillStyle, size, strokeStyle, labels etc.) with appropriate accessor functions.
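When the nodes are richer objects, relationships are usually known by name rather than by position, so the links array has to be derived. Here is a small sketch of that step in plain JavaScript; the people and edges arrays, their field names and the buildLinks helper are all invented for the example.

```javascript
// Hypothetical richer node list: each node carries attributes usable for styling.
var people = [
    {name: "Ann", group: 1},
    {name: "Bob", group: 1},
    {name: "Cid", group: 2}
];

// Hypothetical edge list expressed with names rather than positions.
var edges = [
    ["Ann", "Bob", 2],
    ["Bob", "Cid", 1]
];

// Translate names into positions in the node array.
function buildLinks(nodes, namedEdges) {
    var position = {};
    nodes.forEach(function (node, i) { position[node.name] = i; });
    return namedEdges.map(function (edge) {
        return {source: position[edge[0]], target: position[edge[1]], value: edge[2]};
    });
}

var namedLinks = buildLinks(people, edges);
console.log(namedLinks);
```

The output has the {source, target, value} shape described earlier, with source and target pointing at positions in the people array.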

With few modifications we can create a force-directed layout and a matrix diagram.

var force = vis.add(pv.Layout.Force)
    .nodes(nodes)
    .links(links);

force.link.add(pv.Line);

force.node.add(pv.Dot)
	.size(50)
	.anchor("center").add(pv.Label)
		.text(function() this.index);

vis.render();

Here I labelled the nodes so one can tell which is which. This is done by adding a pv.Label to the pv.Dot that’s attached to the node, just like with any other mark.

var Matrix = vis.add(pv.Layout.Matrix)
	.nodes(nodes)
	.directed(true)
	.links(links)
	.top(20).left(20)

Matrix.link.add(pv.Bar)
    .fillStyle(function(d) pv.Scale.linear(0, 2, 4)
      .range('#eee', 'yellow', 'green')(d.linkValue))

Matrix.label.add(pv.Label).text(function() Math.floor(this.index/2))

vis.render();

For the matrix, things are slightly more complex than for the previous 2. Here I opted for a directed matrix, as opposed to a bidirectional one, which is the default: this means that each link is shown once, from its source to its target, and not twice (i.e. also from its target back to its source).
I chose to color the bars attached to my links (which are cells of the matrix) according to the strength of the links. Again, if my nodes array carried more attributes, I could have used those properties too.
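One way to picture the difference between the two modes is to fill the grid of cells by hand. This is a sketch of the idea only, not the layout’s actual code; adjacency is a hypothetical helper and the two links are made up.

```javascript
// Hypothetical helper: fill an n x n grid of link values.
function adjacency(n, links, directed) {
    var cells = [];
    for (var i = 0; i < n; i++) {
        cells.push([]);
        for (var j = 0; j < n; j++) cells[i].push(0);
    }
    links.forEach(function (link) {
        cells[link.source][link.target] += link.value;
        // an undirected link is mirrored across the diagonal
        if (!directed && link.source !== link.target) {
            cells[link.target][link.source] += link.value;
        }
    });
    return cells;
}

var sampleLinks = [
    {source: 0, target: 1, value: 2},
    {source: 1, target: 2, value: 1}
];
console.log(adjacency(3, sampleLinks, true));   // each link fills one cell
console.log(adjacency(3, sampleLinks, false));  // each link fills two cells
```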

Finally, we’ve added labels to the custom property Matrix.label. Only, the labels are numbered from 0 to 11 so to get numbers from 0 to 5 for both rows and columns I used Math.floor(this.index/2) (integer part of half of this number).
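The index arithmetic is easy to check by hand. A quick sketch, assuming as above that there are two labels per node and that they are numbered consecutively:

```javascript
// Map each of the 2n label indexes to a node index, two labels per node.
var nodeCount = 6;
var labelToNode = [];
for (var i = 0; i < 2 * nodeCount; i++) {
    labelToNode.push(Math.floor(i / 2));
}
console.log(labelToNode.join(","));  // prints 0,0,1,1,2,2,3,3,4,4,5,5
```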

Hierarchized data

As with networks, the shape of the data we can feed to treemaps, icicles and other hierarchical representations doesn’t change from one layout to the next. So once you have your data in order, you can easily switch representations.

Essentially, you will be passing a tree of the form:

var myTree={
   rootnode: {
      node: {
         ...
         node: {
            leaf: value,
            leaf: value,
            ...
            leaf: value
         },
         ...
      }
   }
};

The protovis examples use the hierarchy of the flare source code as an example, which really shows what can be done with a treemap and other tree representations.

For our purpose we are going for a simpler tree, inspired by the work of Periscopic on congressspeaks.com which Kim Rees showed at Strata.
Kim’s presentation featured tiny treemaps that showed the voting record for a congressperson, and whether they had voted with or against their party.

So let’s play with the voting record of a hypothetical congressperson:

var hasVoted={
	didnt: 100,
	voted: {
	    yes: {
	        yesWithParty: 241,
	        yesAgainstParty: 23
	    },
	    no: {
	        noWithParty: 73,
	        noAgainstParty: 5
	    }
	}
};

Once you have your tree, you will need to pass it to your layout using pv.dom, like this:

pv.dom(hasVoted).root("hasVoted").nodes()
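To picture what the layout receives, it helps to think of the nested object flattened into a list of nodes, parents before children. The sketch below does this in plain JavaScript. It is not how pv.dom is implemented: listNodes and the name, value and depth fields are assumptions made for the illustration.

```javascript
var hasVoted = {
    didnt: 100,
    voted: {
        yes: {yesWithParty: 241, yesAgainstParty: 23},
        no: {noWithParty: 73, noAgainstParty: 5}
    }
};

// Hypothetical helper: list nodes parent-first, with name, value and depth.
function listNodes(value, name, depth, out) {
    out = out || [];
    depth = depth || 0;
    var isBranch = typeof value === "object" && value !== null;
    // branches carry no value of their own, only leaves do
    out.push({name: name, value: isBranch ? undefined : value, depth: depth});
    if (isBranch) {
        Object.keys(value).forEach(function (key) {
            listNodes(value[key], key, depth + 1, out);
        });
    }
    return out;
}

// Print an indented outline of the tree.
listNodes(hasVoted, "hasVoted").forEach(function (node) {
    console.log(new Array(node.depth + 1).join("  ") + node.name +
        (node.value === undefined ? "" : ": " + node.value));
});
```

Only the leaves carry a number here, which is worth keeping in mind when an accessor reads a node’s value.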

Based on that let’s do two hierarchical representations.
Let’s start with a tree:

var vis = new pv.Panel()
    .width(500)
    .height(200)
    ;
var tree = vis.add(pv.Layout.Tree)
    .nodes(pv.dom(hasVoted).root("hasVoted").nodes())
    .depth(40)
    .breadth(100)
    .top(30)
    .right(100)
    ;
tree.link.add(pv.Line);
tree.node.add(pv.Dot)
    .size(function(n) n.nodeValue)
	.anchor("center").add(pv.Label).textAlign("center").text(function(n) n.nodeName)
vis.render();

And here is the result:

There are many styling possibilities obviously left unexplored in this simple example (you can control properties of tree.link and tree.node, as well as tree.label, which we didn’t use here, etc.), but this won’t change much as far as the data is concerned.

Now let’s try a treemap with the same dataset.

var vis = new pv.Panel()
    .width(400)
    .height(200)
    ;

var tree = vis.add(pv.Layout.Treemap)
	.width(200).height(200)
    .nodes(pv.dom(hasVoted).root("hasVoted").nodes())
    ;

tree.leaf.add(pv.Panel)
	.fillStyle(function(d) d.nodeName=="didnt"?"darkgrey":d.nodeName.slice(0,3)=="yes"?
	d.nodeName.slice(-9)=="WithParty"?"powderblue":"steelblue":
	d.nodeName.slice(-9)=="WithParty"?"lightsalmon":"salmon")

vis.add(pv.Panel)
	.data([
		   {label:"yes with party", 	color: "powderblue"},
		   {label:"yes against party", 	color: "steelblue"},
		   {label:"no with party", 		color: "lightsalmon"},
		   {label:"no against party", 	color: "salmon"},
		   {label:"didn't vote", 		color: "darkgrey"}
		   ])
	.left(220)
	.top(function() 50+20*this.index)
	.height(15)
	.width(20)
	.fillStyle(function(d) d.color)
	.anchor("right").add(pv.Label).textAlign("left").text(function(d) d.label)

vis.render();

The longest part of the code turned out to be the legend.

Here is the outcome:
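As a side note, the nested ternary used for the fill color can be traded for a lookup table keyed by leaf name. This is just one possible arrangement: voteColors and colorFor are hypothetical names, and the table gives the same colors as the ternary for the five leaves of hasVoted only.

```javascript
// Hypothetical lookup table: one color per leaf name.
var voteColors = {
    didnt: "darkgrey",
    yesWithParty: "powderblue",
    yesAgainstParty: "steelblue",
    noWithParty: "lightsalmon",
    noAgainstParty: "salmon"
};

// Accessor taking a node-like object with a nodeName field.
function colorFor(d) {
    return voteColors[d.nodeName];
}

console.log(colorFor({nodeName: "yesAgainstParty"}));
```

Since the legend uses the same five colors, it could be generated from the same table instead of being typed by hand.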

Comments

Comment from Jake Lin
Time September 12, 2011 at 8:04 pm

Very helpful series. I have been using javascript and protovis but this 5 part series is a nice complement to the protovis examples.
