Synthesising Google Fusion Tables and Visualization API to allow data exploration and mashups

As part of some work I’m doing with the Open University around the OER Research Hub project I developed this high fidelity prototype which let users explore survey responses collected by the project (the short video below highlights the main features):

In the guest post I wrote on the ‘OER Survey Exploratoratorium‘ I outlined the problem:

When presented with over 4,000 survey responses the challenge was how to let people explore the data set. When presented with this challenge the first thought invariably is what is the shape of the data. In this case the survey responses were collected in Survey Monkey. After considering options like consuming the data into the OER Impact Map via the Survey Monkey API, the overhead in terms of developing user interfaces and squeezing into a WordPress data structure resulted in the exploration of other options. The approach that looked like it would squeeze the most functionality out of little development time was to use Google Fusion Tables to host the data and put a visualisation layer over the top. The reason for choosing Fusion Tables is it allows a Guardian Datastore model of letting the user easily access and reuse the source data either in Fusion Tables itself or exporting into another tool. If you would like to peek at the data behind this there are two tables: one with the survey questions and another with the survey data.

I’ve extracted the main part of the code into this gist so you can get a sense of what’s going on. If this is something you are interested in doing yourself there is some documentation on the Google Visualisation API  for getting Google Fusion Tables. This page is has one example of how you can fetch data from Fusion Tables. It’s however worth noting that as Google Fusion Tables implements  the Chart Tools Datasource Protocol you can query the data as a datasource. This allows you to use the Google Visualization API Query Language with  SQL like syntax. The gist below is a reworking of this query example in the Google Code Playground which you can use to see the differences. The main one is how the query is set by specifying which table the data is from in the query. A couple of notes I have on using Google Fusion Tables as a datasource in this way are:

  • data returned limited to 500 rows. If you want more you can turn to the full Google Fusion Tables API  which has a separate SQL like query language. Using this API is rate limited and requires OAuth and/or API key. I got more than 500 by using LIMIT and OFFSET in my queries. (the full Google Fusion Table API is worth bookmarking for cross-referencing).
  • using back-quotes ` specified in the Visualisation API to wrap column names with spaces doesn’t appear to work. You do specify columns by their name rather that A, B C etc as used in Google Sheets. (The Google Fusion Tables API specifies single quotes which I don’t think work in this scenario – this is an example of where cross-referencing helps)
  • Google Fusion Tables doesn’t implement the OR operator (related issue ticket marked Won’t Fix. When I mentioned to Tony Hirst (@psychemedia) he suggested De Morgan’s laws which would be an alternative)

Hope you enjoy and look forward to seeing you Google Fusion/Visualization mashups ;)