Guardian Tag Explorer: When the Guardian Open Platform met d3.js

My pseudo PhD supervisor for my mocorate degree, Dr Tony Hirst (Open University – and now Visiting Senior Fellow in Networked Teaching and Learning/Senior Fellow of the University of Lincoln – congrats) recently, amongst many other things, started Tinkering with the Guardian Platform API – Tag Signals and Visualising New York Times Article API Tag Graphs Using d3.js.

Like any fake PhD candidate it’s important to follow the work of your supervisor, after all they will be marking your imaginary neverending thesis. So after much toil and many pointers from Tony here’s what I’ve come up with - a collision of the Guardian Platform API and visualisation with the d3.js library – GuardianTagExplorer.

In this post I’ll highlight a couple of features of the interface and then try to recall many of the lessons learned. Below is a short clip to show how it’s supposed to work or you can have a play yourself via the link above (because it uses SVG the 9% of you who use Internet Explorer 8 or less won’t see anything):

What does it do

When you enter a search term it asks the Guardian Open Platform if there are any articles associated with that term. Each of these articles has some metadata attached, including a list of tags used to categorise the piece. Using a ported version of Tony’s Python code these tags are collected and the number of other articles from the search result with the same tag is counted. The page then renders this information as a force layout diagram using the d3.js visualisation library (tags and links = nodes and edges) and as a histogram by putting the same data into the Google Visualization API.
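To make the counting step concrete, here is a minimal sketch rather than the code behind the page. It assumes each article in the search result has already been reduced to an object with a `tags` array of tag ids; the `articles` sample data and the `countTags` and `toRows` helpers are hypothetical names made up for illustration.

```javascript
// Hypothetical, simplified search results: each article carries a list of tag ids.
const articles = [
  { id: "a1", tags: ["education/higher-education", "technology/internet"] },
  { id: "a2", tags: ["education/higher-education", "education/students"] },
  { id: "a3", tags: ["technology/internet", "education/higher-education"] }
];

// Count how many articles in the result set carry each tag.
function countTags(articles) {
  const counts = new Map();
  for (const article of articles) {
    // A Set guards against a tag being listed twice on the same article.
    for (const tag of new Set(article.tags)) {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  return counts;
}

// Turn the counts into rows sorted by frequency (most used tag first).
function toRows(counts) {
  return Array.from(counts, ([tag, count]) => ({ tag, count }))
    .sort((a, b) => b.count - a.count);
}

const rows = toRows(countTags(articles));
console.log(rows);
```

The same rows could feed both views: one row per node in the force layout and one row per bar in the histogram.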

I didn’t show it in the video but you can create predefined searches for linking and embedding. For example, here’s one for the term ‘JISC’ and if your RSS reader hasn’t stripped out the iframe the same page is embedded below:

How it was made/What I learned

I mentioned to my unsupervisor that I was thinking of doing something with a live version of the Guardian Open Platform and d3, based on his Friendviz example, and he immediately spotted a couple of problems, the biggest being that the Guardian prefer it if you keep your API key a secret.

Yahoo Pipes as a proxy service

Fortunately Tony also had the answer of using Yahoo Pipes with a private string block as a proxy service. (I’m not sure if there is much benefit to doing this: while the API key is still hidden, anyone can access the pipe.) The API is rate limited anyway and I hope the Guardian people see I’m keeping to the spirit of the terms and conditions.

So data source, check. Porting Python to JavaScript: relatively straightforward apart from JavaScript having no built-in combinations function, but having sketched out what was going on I think I’ve got an equivalent.
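For anyone facing the same gap, one possible equivalent looks like the sketch below. It is not the code used on the page: `pairs` and `countEdges` are hypothetical helper names, and it assumes the Python original only needed pairs of tags (as `itertools.combinations(tags, 2)` would give) and that tag ids never contain a `|` character, which is used here as a key separator.

```javascript
// All unordered pairs from a list: a stand-in for Python's itertools.combinations(items, 2).
function pairs(items) {
  const out = [];
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      out.push([items[i], items[j]]);
    }
  }
  return out;
}

// Count how often each pair of tags appears on the same article.
// tagLists is an array of tag-id arrays, one per article (assumed shape).
function countEdges(tagLists) {
  const weights = new Map();
  for (const tags of tagLists) {
    // De-duplicate and sort so that (a, b) and (b, a) share one key.
    const unique = Array.from(new Set(tags)).sort();
    for (const [a, b] of pairs(unique)) {
      const key = a + "|" + b; // assumes "|" never occurs in a tag id
      weights.set(key, (weights.get(key) || 0) + 1);
    }
  }
  return Array.from(weights, ([key, weight]) => {
    const [source, target] = key.split("|");
    return { source, target, weight };
  });
}

console.log(countEdges([["x", "y"], ["y", "x", "z"]]));
```

Each returned object names a source tag, a target tag and the number of articles they share, which is roughly the shape an edge list for a force layout needs once the tag ids are mapped to node indexes.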

ddd dd dd d3.js

Big headache! Even though I’ve churned out a fair bit of code I’m not, and never have been, a professional programmer, so getting my head around d3.js has been a big challenge. There were a couple of examples I spent a lot of time picking over trying to understand what was going on. The main ones were:

[These examples are renderings of GitHub Gists using the bl.ocks.org service created by … mbostock, which is a great way to publish little snippets of stuff]

I also got a peek at the generated code for Tony’s Visualising New York Times Article API Tag Graphs Using d3.js. You’ll notice that my offering is similar in appearance and functionality to yohman’s example (I’m quietly ignoring his copyright mark – fair use etc :-s).

It’s hard to convey exactly what I learned from the last couple of days of pushing pixels. The big difference between d3 and the similar protovis library I used here is that there is a lot more setting up to do in the code. The payoff is you have far more control of the end result. The days spent trying to understand d3 were in stark contrast to the minutes needed to create the tag histogram using the Google Visualization API.

One thing I never got working is a zoom/pan effect. I’ve seen tiny snippets of code that do this for charts. Unfortunately the API Reference for this behaviour is still to be written.

Where next

Now that I’ve got a basic framework for visualising tag/category information I’m interested in refining this by trying out some other examples. So if you have an API you want me to play with drop me a line ;)

PS my code is here and you can see how it renders in blocks here

PPS really must set up a labs page outlining my various experiments

PPPS forgot to say that during a d3 low I threw together a prototype of the tag explorer using protovis in a couple of minutes