Hollywood + data III: our info+beauty awards entry. Bonus: making of.

So Jen and I released our Info+beauty awards entry.

How did we end up with this?

it’s really cool working around movies, because it’s something we can relate to.

A part of my movie ticket stubs stash.

At first I wanted to do something out of keywords we could grab on the movies but  Jen came up with another idea I found more worth pursuing: working around the story types (which was the most interesting aspect of the curated contest dataset) and see if there was not some kind of grand truth we could unravel there. She also requested stars and glitter, because we were not going to work on this glamorous dataset with a tedious dashboard done in Excel.

That truth didn’t take so much time to find: the most frequently used story types (like comedy or movies with monsters) do not perform well in the box office while different story types (stories of teens growing up, or when the main character turns into something else), which are used less often, are much more profitable. So why doesn’t hollywood make more Junos and Black Swans and fewer College Road Trips or Dylan Dogs?

That’s the idea. Now the making.

Fair warning – the rest of this post is fairly technical. 

Making stars

If I had to contribute significantly to the project it had to be done in d3/svg.

Fortunately, it’s easy to generate star shapes in d3. Once you have the coordinates of where the points of one unitary star should be, you can easily make stars of any size with a function and a parameter.

var c1=Math.cos(.2*Math.PI),c2=Math.cos(.4*Math.PI),
    s1=Math.sin(.2*Math.PI),s2=Math.sin(.4*Math.PI),
    r=1,

    // ok the constant after r1 is the thickness of the branches.
    // 1 is a "straight" star, less is narrower, more is thicker.

    r1=1.5*r*c2/c1,
    star=[
        [0,-r],
        [r1*s1,-r1*c1],
        [r*s2,-r*c2],
        [r1*s2,r1*c2],
        [r*s1,r*c1],
        [0,r1],
        [-r*s1,r*c1],
        [-r1*s2,r1*c2],
        [-r*s2,-r*c2],
        [-r1*s1,-r1*c1],
        [0,-r]
        ];
    // this is a list of the pair of coordinates of the points that make a star.
lineStar=function(k) {
	var line=d3.svg.line()
		.x(function(d) {return d[0]*k;})
		.y(function(d) {return d[1]*k;})
	return line(star)+"Z"; // this will stitch everything together.
}

Now, running lineStar(10) will return the path description of a star with a radius of 10, thusly:

"M0,-10L3.367709824346891,-4.635254915624212L9.510565162951535,-3.0901699437494745L5.449068960040206,
1.770509831248423L5.877852522924732,8.090169943749475L0,5.729490168751577L-5.877852522924732,
8.090169943749475L-5.449068960040206,1.770509831248423L-9.510565162951535,-3.0901699437494745
L-3.367709824346891,-4.635254915624212L0,-10Z"

Placing, moving (and spinning) the stars

The next idea was placing the stars.

And for this we need two things: being able to position them somewhere, and being able to move them easily from point A to point B, ideally with some cool effect in between.

So, it would be possible to change the x and y attributes of the path, but each would have to be dealt with separately with a different function call. I found it a better approach to rely on the transform attribute and translate. Each time I want to position a star somewhere, I need it to be set at an x and y coordinate, which will always correspond to either the data of the star, or that of a group above it. For instance, a star corresponding to a movie will need to be at the position corresponding to the data of that movie, or that of the story type above it if it’s still collapsed, or that of the high-level grouping of story types if that’s collapsed.

Now all of the data structures for that are array of objects which all have x and y keys. In other terms, for any star-shaped object, I can always expect the underlying datum d to have d.x and d.y values. So, I wrote a function translate(d) which works on those 2 properties. And as a result, when I need to position any object all I have to write is:

.attr("transform",translate)

and the object will be positioned according to its underlying data. (this is equivalent to writing .attr(“transform”,function(d) {return translate(d);}) )

If I need to be them elsewhere, i.e. at the position of their parent, I can pass the data of that parent as an argument, for instance:

.attr("transform",function(d) {return translate(structs[d.struct]);})

For a cheap bit of extra action, I’ve added a spinning effect in the translate function. Since translate(d) returns a value for the transform attribute, nobody said it just had to be instructions for translation! so I’ve added a rotate after the translate. The arguments for the rotate function depend on the x and y properties of the argument as well, so when stars move across the screen, the rotate angle changes slightly with each increment of either coordinate, giving the impression of spinning.

Explosions, starlets and other effects

Most of the cool things happening in the visualization rely on one very simple principle about d3 transitions: chaining them.
In the code you’ll find oftentimes this pattern:

.selectAll("someobject").data(...).enter().append(...) // creates the items
... // sets the initial attributes
...
.transition()
... // change the attributes
...
...
...
.each("end", function() { // stuff to be done on each item after the transition is over

and within that function, you’ll find either:
another transition which starts exactly when the previous one ends, so for instance opacity can decrease (causing a fading effect): d3.select(this).transition()…

or a command to remove the object: d3.select(this).remove().

When another transition is called, there can be another one after, then another one, then another one, then eventually the object can be removed (or not).

Now you may think of transitions as ways to get one object to change smoothly from state A to state B, like a rectangle moving across the screen. But if you start to think that the objects can be discarded after the transitions, you’ll realize that there is an unbelievable number of things that can be done with them.
For instance, upon clicking on some stars, I am creating another star shape at that same location. Initially it has a the same size as the star, but I increase that radius to a large number (1000px) while decreasing its opacity to 0. So it seems that the new star is both exploding and fading. When it’s become transparent I remove it.

gStructs.append("svg:path") // here I'm creating a "path" shape
.style("stroke","none") // with no outline
.style("fill",colorXp)  // with the fill color of the explosion
.style("opacity",0.2)  // and a low opacity to start with (translucent)
.attr("d",lineStar(d.size[sizeAxis])) // I give it the shape of a star and the size of the
                                      // star that's being clicked
.attr("transform",translate(d)) // and I position it on that star

.transition() // action!

.duration(500)	// a 500ms transition. Long enough to see the effect.
.attr("d",lineStar(1000)) // the star expands to a radius of 1000.
.style("opacity",0) // while fading to transparency.

.each("end",function() {d3.select(this).remove();}) // and when it's done - it's removed.

Changing axes

In this visualization I let the user change what’s plotted along the axes. It’s not very difficult to do but it’s a hassle to do it late in the project as it has been our case because it requires a lot of housekeeping. This is really about the data structures that will support our items. Instead of having just one value for x, y and size they have an object with several keys, one per axis. Then we maintain one variable per axis type, so everywhere we should write: d.x, we write instead: d.x[xAxis].

So when there is an axis change, of course, we do a transition so that the stars and everything move smoothly to their new position. But what if the objects were already moving? When an unplanned transition interferes with an ongoing one, the results are often ugly, especially if the current transition had chained transitions waiting to be triggered. In other words, this will leave a mess.

The way I’ve dealt with this is by keeping a tab on the number of transitions going on at a certain time. The axis change could only occur if no other transitions were taken place. If that was the case they were simply denied. There are other ways to do that like a queue of actions but that seemed the simple and adequate way to deal with this.

Bootstrap and google fonts

This was the first non-trivial project where I used bootstrap and I’m just never going back. Bootstrap simply removes all the hassle of arranging all the elements of a visualization on a screen and is very easy to use. Plus, it comes up with sensible options for buttons, forms, and the like. Since the contest it has evolved faster than a pokémon, for instance it is now possible to specify custom colors in a form and bootstrap will generate the appropriate css files. Google fonts are another great help as they are a very easy solution to choose fonts among a relatively large number of choices without relying on the fact that all the users have these fonts on their computer.

Wrapping it up

There’s a lot of other hacks in the code which you are welcome to explore, I admit I don’t remember them all because I took too much time to write this blog post after creating the entry (bad). However if there is one point you would like be to explain please ask in the comments.
I’m not entirely sure of what happened when I submitted the entry though. First it wasn’t listed with the others, then I got a message saying it hadn’t been reviewed, so it didn’t win anything, yet some time after the prizes have been handled it appeared in the “shortlisted” visualizations for the contest (which I found by accident). So whether or not it was good, I let you guys judge, at any rate it was fun making.

Leave a Reply