My last project is an interactive map of the Paris subway.

Quite some time ago, I had seen Tom Carden’s Tube Travel Times and was fascinated. So while I initially had tried to do something different I ended up borrowing a lot of his design, so it wouldn’t be fair to start this post without acknowledging that.

Then again, there are a few original things in this worth writing about.

A subway system can be seen as a network of nodes and edges. Nodes would be the stations, and edges, the subway lines that connect them. Finding the shortest path between two parts of the graph is a classic problem of computer science. That path is the sum of the length of all the edges it takes to traverse the graph from the source to the destination.

However a typical subway journey, and I should know, is not entirely spent in a train. You first have to get from the street to the platform, wait for your train; in the event of a connection, you have to go from one platform to another, wait for another train; and eventually you have to return to the street level.

So in my representation of the subway system in addition to the stations themselves, I’ve added nodes for all the platforms within a station and edges between them. So if you have to go from one station A to station B, which is three stops away on a given subway line, you would really traverse 5 edges: station A to platform, stop 1, stop 2, stop 3, then platform to station B.

While the Paris subway agency, RATP, has recently publicized its open data intent, the travel times between two stations are not public. I actually had reliable and accurate measured travel times for 90% of the network, on the basis of which I could estimate the rest. The great unknown that remains is the complexity of the stations, some are very straightforward so going from the street to a train is a matter of seconds, while some are downright mazes. While I think I have travelled through almost every stations by now, I definitely not have stopped in many of them. So as part of the project I am asking users to complement my data on the stations they know well. Actually, I didn’t use much of the datasets that RATP published. The position of the stations on the canonical plan has a lot of inaccuracies. For the geographical positions I used my own measures rather than the published data (I had been playing on and off with subway data for close to 5 years now, http://www.openprocessing.org/sketch/22252). I did use the trafic data to give an idea of the occupancy of a given section, and the official color codes as well. Because I am using my own measures, I have not added other railway systems like the RER or tramway for which I don’t have enough data.

So when a user selects a station, the rest of the network moves according to their (shortest path) distance to the selected station. So at the heart of the exercise there is a shortest path calculation from any station to any other. Including stations, platforms, connections, entrances and exits, that’s a network of 680 nodes and 1710 edges (for 299 stations, I didn’t include a couple of the very last ones). And it’s a directed network, because there are some sections which go in only one direction. At most, a network that big could have close to 500,000 edges, so it’s fairly sparse.

There are several algorithms to compute shortest paths in the code. I am using the best known one, Dijkstra, which works on path with no negative edge length. The way it is implemented here, with binary heaps, makes it so fast that it’s not noticeable. Not using that improvement, or relying on a more versatile but slower shortest path algorithm, would result in considerably greater computing times.

When a user selects one station a few things happen. First, all of the others are arranged around the selected station. Imagine the station is the fixed point in a polar coordinate system, all the other stations retain their angular coordinate but their radii now depends on the shortest path time. That idea came directly from Tom’s applet. Another thing I took from him is the addition of rings in the background which give an indication of the travelling time to those destinations. But I settled for that after much fumbling and with the conviction that there was no better solution.

What also happens is that all the edges which are not part of any shortest path disappear. So subway lines are no longer continuous.

Once a station is selected, more things happen when hovering on a second station, the actual shortest path is revealed while all others fade out, and the station marks are hidden except those of the start, destination and all connections on the way, for which names are displayed.

Finally, an interace appears that let users enter estimations for the times I don’t know so well about the stations. I haven’t added any hard figures there so I am only asking for impressions, but I believe that with some usage the time estimations can get much more accurate.

**Edit 1**

the vis has been quite successful, so I decided to awesomize it a bit.

Now by clicking on any colored line, it’s possible to “block” any given subway section. The shortest routes become those which do not go through that section. You can simulate what happens, for instance, when a given line is down.

The other change is that I’ve added walking distances. For each station I’ve computed the beeline distance with its 5 nearest neighbors. I’ve translated that distance into time (knowing that pedestrians can’t traverse buildings, have to mind trafic etc.).

But in many cases it is still quicker to go from one station to the next on foot than by train.