Getting beyond hello world with d3

About a year ago I proposed a very simple template to start working with d3. But is that the way I actually work? Of course not. Because, though you and I know that displaying hello world in d3 is not exactly trivial, this is not a highly marketable skill either. That said, I just can’t afford to start each of my projects from scratch.

So in this article I will present you the template I actually use. I won’t go in as much detail as last time because the exact instructions matter less than the general idea.

My template is a set of two files, an html file and a js file. Of course, extra files can be used as needed.

There’s not much to the html file – its role is really to call the javascript code. There is a little twist though. This is also where the interface elements (ie buttons and other controls) may be. Another thing is that I don’t load a css file through the html. The reason is that when I work with svg, I may export the svg proper to a file to have it reworked in Adobe Illustrator etc. and so having style inside the file makes things easier. So I would instead load a style sheet into the svg through javascript.

The javascript file is written with the following assumptions:

  • there could be other scripts called by the same page, so let’s try to avoid conflict as much as possible.
  • some variables need not to be accessed by other scripts.
  • the execution of my visualization is divided into several phases:
    • initialize: assigning initial values to variables, if needed forecasting interaction,
    • loading data: acquiring and processing external data,
    • drawing: this is where the visualization will be actually rendered or updated

    In addition to these three phases which always occur in my visualizations, there are several optional operations which I may or may not use which are included in the template.

    • reshaping data: operations like filtering or sorting the initial dataset after certain choices of the user. Following such an operation, the visualization has to be re-rendered.
    • self-playing animation: when this is required, then the visualization should be able to update itself at given intervals of time. If that is the case, then the html will include controls such as a start and stop button and a slider that can be used to move to an arbitrary time. Then, the javascript includes functions to start and stop the animation, and the drawing function is done so it can be called with a time argument, and not assuming that it will always just show the next step (because the slider can be used to jump ahead or back).
    • helper functions which can make me gain time but which don’t need to be accessed by other scripts.

To address the first concern, I wrap all my code in an anonymous function, like so:

(function() {
// ... my code
})();

within that function, any variable which is declared using the var keyword is not accessible to other scripts. Variables which are declared without the var keyword, however, can be accessed. So in order to minimize the footprint of my code, I create one single object, vis, so I can store the functions and values I will need to call as properties of that object.

(function() {
vis = {}
vis.init = function() {
// code for my init function ...
}
vis.height = 100;
var local = 50;
})();

So outside of that anonymous function, I can call vis.init(), I can access and change the value of vis.height, but I cannot access local.

One step further:

(function() {
vis = {}
vis.init = function(params) {
  // code for my init function ...
  vis.loaddata(params);
}
vis.loaddata = function(params) {
  // code for loading data ...
  vis.draw(params);
}
vis.draw = function(params) {
  // code for drawing stuff ...
}
})();

This gets a bit closer to how the code actually works. From the HTML, I would call vis.init and pass it parameters. vis.init will do its thing (assigning values to variables, creating the svg object, preparing interactions etc.) then call vis.loaddata, passing the same parameters. vis.loaddata will fill its purpose (namely, load the data and perhaps do a little data processing on the side) then call the drawing function.

Any of these functions can be called from the outside (from the HTML, ot from the console for debugging). The nice thing about it is that nothing really happens unless there’s an explicit instruction to start the visualization.

Let’s go a step deeper:

(function() {
vis = {}
var chart, svg, height, width;
vis.init = function(params) {
  if (!params) {params = {};}
  chart = d3.select(params.chart || "#chart");
  height = params.height || 500;
  width = params.width || 960;
  chart.selectAll("svg").data([{height: height, width: width}]).enter().append("svg");
  svg = chart.select("svg");
  svg
   .attr("height", function(d) {return d.height;})
   .attr("width", function(d) {return d.width;})
  vis.loaddata(params);
}
vis.loaddata = function(params) {
  if (!params) {params = {};}
  d3.csv((params.data || "data.csv") + (params.refresh ? ("#" + Math.random()) : ""), function(error, csv) {
    vis.csv = csv;
    vis.draw(params);
  })
}
vis.draw = function(params) {
  // code for drawing stuff ...
}
})();

Now we’re much closer to how it actually works. After we create our publicly accessible object vis, we create a bunch of local variables. Again, these can be used freely by the functions within our anonymous function, but not outside of it (notably in the console). I’m assuming that the code can be called without passing parameters, in which case within the functions I am testing if params actually exists, failing that I give it an empty object value. This is because down the road, if it is undefined and I try to access its properties, that would cause a reference error. If params has a value, even that of an empty object, if a property is not assigned, its value is “undefined”. So let’s take a look at the first 2 lines of vis.init:

if(!params) {params = {};}
chart = d3.select(params.chart || "#chart");

if params is not passed to vis.init, it gets an empty object value (that’s the first line). So, all of its properties have an undefined value. So the value of (params.chart || “#chart”) will be “#chart”. Likewise, if params is passed to vis.init, but without a chart property, params.chart will also be undefined, and (params.chart || “#chart”) will also be “#chart”. However, if params is passed to vis.init and a chart property is defined (i.e. vis.init({chart: “#mychart”}), then params.chart will be worth “#mychart” and (params.chart || “#chart”) will also be “#mychart”.
So that construct of assigning an empty object value to params then using || is like giving default values which can be overridden.

Within vis.init, we use local variables for things like height, width etc. so we can redefine them with parameters, and they can be easily accessed by anything within the anonymous function, but not outside of it.
I’ve also fleshed out the vis.loaddata function.
Likewise, we use the same construct as above: instead of hardcoding a data file name, we allow it to be overridden by a parameter, but if none is specified, then we can use a default name.
The part with params.refresh is a little trick.
When developing/debugging, if your data is stored in a file, you are going to load that file many times. Soon enough your browser will use the cached version and not reload it each time. That’s great for speed, but not so great if you edit your file as part as your debugging process: changes would be ignored! By adding a hash and a random string of character at the end of the file name, you are effectively telling your browser to get your file from a different url, although it is the same file. What this does is that it will force your browser to reload the file each time. Once you’re happy with the shape of your file, you can stop doing that (by omitting the refresh parameter) and the browser may use a cached version of your file.
In the vis.loaddata function, the most important part is that d3.csv method. As you may remember this is what loads your csv file (and btw if your data is not in csv form, d3 provides other methods to load external files – d3.json, d3.text etc.). How this method works is that the associated function (i.e the bit that goes: function(error, csv) {}) is executed [em]once the file is loaded[/em].
So since loading the file, even from cache, always take some time, what’s inside that function will be executed [em]after[/em] whatever could be written after the d3.csv method. This is why in the loaddata function, nothing is written after the d3.csv method, as there is no reliable way of knowing when that would be executed. The code continues inside the function. At the very end of that function, we call vis.draw, passing parameters along.
If I need to load several files, I would nest the d3.csv functions like this:

d3.csv((params.data || "data.csv"), function(error, csv) {
  // .. do things with this first file
  d3.csv((params.otherfile || "otherfile.csv"), function (error, csv) {
    // .. and then things with that other file. repeat if necessary..

    // the end of the innermost function is when all files are loaded, so this is when we pass control to vis.draw
    vis.draw(params);
  })
})

Another way to do this is using queue.js which I would recommend if the nesting becomes too crazy. For just 2 small files it’s a matter of personal preferences.

It’s difficult to write anything inside the code of vis.draw in a template, because this will have to be overwritten for every project. But here is the general idea though.
vis.draw can be called initially to, well, draw the visualization a first time when nothing exists but an empty svg element. But it can also be called further down the road, if the user presses a button that changes how it should be displayed, etc.
So, if the external context doesn’t change, running vis.draw once more should do nothing. As such, I avoid using constructs like “svg.append(“rect”) ” and instead use “svg.selectAll(“rect”).data(vis.data).enter().append(“rect”)” systematically.
The difference between the two is that using append without enter will add elements unconditionally. Using it after enter would only add new elements if there are new data items.
But what if I need to draw just one element? well, instead of writing “svg.append(“rect”)”, I would write something like “svg.selectAll(“rect.main”).data([{}]).enter().append(“rect”).classed(“main”, 1)”.
Let me explain what’s happening there.
What I want is the function to create one single rectangle if it doesn’t exist already. To differentiate that rectangle from any other possible rectangles I am going to give it a class: “main”. Why a class and not an id if it is unique to my visualization? Well, I may want to have several of these visualizations in my page and ids should really be unique. So I never use ids in selections, to the exception of specifying the div where the svg element will sit.
If there is already one rect element with the class “main”, svg.selectAll(“rect.main”).data([{}]).enter() will return an empty selection and so nothing will happen. No new rect element will be appended. This is great because we can run this as often as we want and what’s supposed not to change will not change.
However, if there is no such rect element, since there is one item in the array that I pass via data, svg.selectAll(“rect.main”).data([{}]).enter().append(“rect”) will create one single rect element. The classed(“main”, 1) at the end will give it the “main” class, so that running that statement again will not create new rectangles. Using [{}] as default, one-item array is a convention, but it’s better than using, say [0] or [""] because when manipulating our newly-created element, we can add properties to the data element (i.e. d3.selectAll(“rect.main”).attr(“width”, function(d) {d.width = 100; return d.width;}) ) which you couldn’t do if the data elements were not objects. (try this for yourself).

That being said, the general outline of the vis.draw function is so:

  • remove all elements that need to be deleted,
  • create all elements that need to be added, including a bunch of one-off elements that will only be created once (ie legend, gridlines…)
  • select all remaining elements and update them (using transitions as needed).

One last thing: how to call vis.init() in the first place? Well, the call would have to happen in the HTML file.

<script>
var params = {data: "data.csv", width:1400,height:800};
var query = window.location.search.substring(1);

var vars = query.split("&");
vars.forEach(function(v) {
	var p = v.split("=");
	params[p[0]] = p[1]
})
vis.init(params);
</script>

What’s going on there?
First, I initiate the params variable with some values I want to pass in most cases.
Then, the next line is going to look at the url of the page, and more specifically at the search part, that is, whatever happens after the ?. (I use .substring(1) as to not include the “?”).
The idea is that when I would like to pass parameters via the browser, like so: …/vis.html?mode=1&height=500&data=”anotherfile.csv”
The two splits (first by &, then by =) allow to get the parameters passed by url, and add them to params, possibly overriding the existing ones.
Then we pass the resulting params variable to vis.init.

Wihtout further ado, here are the two files in their entirety.

<!DOCTYPE html>
<meta charset="utf-8">
<head>
	<title></title>
	<style>

	</style>
</head>
<body>
<script src="http://d3js.org/d3.v3.min.js">
</script>
<div id="chart"></div>
<script src="template.js"></script>
<script>
var params = {data: "data.csv", width:960,height:500};
var query = window.location.search.substring(1);

var vars = query.split("&");
vars.forEach(function(v) {
	p=v.split("=");
	params[p[0]]=p[1]
})
vis.init(params);
</script>
</body>
</html>
(function() {
	vis={};
	var width,height;
	var chart,svg;
	var defs, style;
	var slider, step, maxStep, running;
	var button;

	vis.init=function(params) {
		if (!params) {params = {}}
		chart = d3.select(params.chart||"#chart"); // placeholder div for svg
		width = params.width || 960;
		height = params.height || 500;
		chart.selectAll("svg")
			.data([{width:width,height:height}]).enter()
			.append("svg");
		svg = d3.select("svg").attr({
			width:function(d) {return d.width},
			height:function(d) {return d.height}
		}); 
		// vis.init can be re-ran to pass different height/width values 
		// to the svg. this doesn't create new svg elements. 

		style = svg.selectAll("style").data([{}]).enter() 
			.append("style")
			.attr("type","text/css"); 
		// this is where we can insert style that will affect the svg directly.

		defs = svg.selectAll("defs").data([{}]).enter()
			.append("defs"); 
		// this is used if it's necessary to define gradients, patterns etc.

		// the following will implement interaction around a slider and a 
		// button. repeat/remove as needed. 
		// note that this code won't cause errors if the corresponding elements 
		// do not exist in the HTML.  
		
		slider = d3.select(params.slider || ".slider");
		
		if (slider[0][0]) {
			maxStep = slider.property("max");
			step = slider.property("value");
			slider.on("change", function() {
				vis.stop(); 
				step = this.value; 
				vis.draw(params);})
			running = params.running || 0; // autorunning off or manually set on
		} else {
			running = -1; // never attempt auto-running
		}
		button = d3.select(params.button || ".button");
		if(button[0][0] && running> -1) {
			button.on("click", function() {
				if (running) {
					vis.stop();
				} else {
					vis.start();
				}
			})
		};
		vis.loaddata(params);
	}
		
	vis.loaddata = function(params) {
		if(!params) {params = {}}
		d3.text(params.style||"style.txt", function (error,txt) {
			// note that execution won't be stopped if a style file isn't found
			style.text(txt); // but if found, it can be embedded in the svg. 
			d3.csv(params.data || "data.csv", function(error,csv) {
				vis.data = csv;
				if(running > 0) {vis.start();} else {vis.draw(params);}
			})
		})
	}
	
	vis.play = function() {
		if(i === maxStep && !running){
			step = -1; 
			vis.stop();
		}
		if(i < maxStep) {
			step = step + 1; 
			running = 1;
			d3.select(".stop").html("Pause").on("click", vis.stop(params));
			slider.property("value",i);
		vis.draw(params);} else {vis.stop();}	
	}

	vis.start = function(params) {
		timer = setInterval(function() {vis.play(params)}, 50);
	}

	vis.stop = function (params) {
		clearInterval(timer);
		running = 0;
		d3.select(".stop").html("Play").on("click", vis.start(params));
	}

	vis.draw = function(params) {
		// make stuff here! 
	}
})();

4 thoughts on “Getting beyond hello world with d3

  1. Hi Jerome,

    I’m new to D3JS, and I’m trying to get my head around this post. Do you have a data.csv with dummy data that could be used just to get these template up and running. I’ve tried creating my own data.csv but it’s not displaying anything. Do I need to amend the code first, or should it work straight off the bat?

    Thanks,
    Jonathan

Leave a Reply