It looks like Twitter are finally rolling out the option to download all your tweets. As well as providing a nice offline search interface, it appears that “the archive also includes CSV and JSON files, the latter complete with each tweet’s metadata”. I’m looking forward to seeing the data visualisations/mashups people come up with around their data. 

The Twitter API has long allowed you to extract a user’s status updates, the limitation being you could only get the last 3,200 tweets. This is something Sheila MacNeill discovered when she tried vizify’s ‘Year on Twitter’ tool.

Exporting tweets was something I looked at in 2011 in the cryptically titled Google Spreadsheets and floating point errors aka when is 65078736491511804 + 1 = 65078736491511808 (and automatically archiving your Twitter Status Updates). With this I’ve got a record of my tweets going back to April 2010, which is set to update itself every week. A reason I do this is that I often need to find things I’ve previously said in ‘the archive of me’.

Here’s the template (File > Make a copy); follow the instructions if you want to try it (please be aware of the Twitter Developer Rules of the Road). I’ve updated the code to make it compatible with version 1.1 of the Twitter API. One of the options I’ve added is a JSON dump which is saved to your Google Drive. It only took two lines of code using Google Apps Script (HT +Romain Vialard):

  var blob = Utilities.newBlob(Utilities.jsonStringify(json), "application/json", filename);
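For completeness, the second line simply writes that blob out to Drive. A minimal sketch, assuming the current DriveApp service (the original template may well have used the older DocsList service instead):

  DriveApp.createFile(blob); // saves the JSON blob created above as a file in your Google Drive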

[The JSON dump is a bit buggy – some issues with character escaping somewhere]



The 'Deprecated notice' is because access to the Topsy API (which is used to extract Twitter activity) is now behind a developer key.

A post from Alan Levine (@cogdog) on Syndication Machines (plus syndication of syndication), which details how feeds for the cMOOC course ds106 are aggregated and resyndicated, got me thinking about whether there was additional data around student blog posts that could be usefully captured. Previously for something like that I would have turned to PostRank Analytics, which allowed you to specify any RSS feed and would aggregate social activity like tweets and bookmarks from a wide range of services and let you see it all in one place. Unfortunately PostRank was bought by Google and, while some of this data is now accessible from Google Analytics, it’s restricted to your account and to data from social networks Google isn’t directly competing with.

So I thought it would be useful/interesting to start looking at what social activity you could pull together without authenticated access or API keys. So far I’ve identified a couple and, given some overlap with other work (and some late night play time), I’ve come up with a Google Spreadsheet template which pulls data together from comment feeds, Twitter and Delicious (with social counts for these plus Facebook, LinkedIn and Google+). You can give the spreadsheet a try with the link below:

*** Blog Activity Data Feed Template ***

Blog Activity Data Feed Template Overview


  • Collects the last 10 posts from an RSS feed
  • Gets overall post share counts from Facebook, Twitter, Google+, LinkedIn and Delicious
  • For supported blogs (mainly vanilla WordPress) extracts comment snippets
  • Uses Topsy to collect tweets mentioning the post
  • Fetches all the Delicious bookmarks for the post url
  • Summarises activity from comments, tweets and bookmarks on the dashboard

You can make as many copies of the template as you like to track other RSS feeds.

If you prefer a slightly different overview, the RSS Feed Social Share Counting Google Spreadsheet gives a table of share counts for a feed.

*** RSS Feed Social Share Counter ***

Social share count matrix

Technical highlights

There are a couple of cool things under-the-hood for the Activity Data template worth noting.

Managed Library = Easy to add more sources

So far I’ve limited the template to sources that have easy access. Previously if an API disappeared, changed or I discovered another service, I would have to modify the template and users would have to make a fresh copy. With Managed Libraries the Google Spreadsheet template now only needs 11 lines of code (shown below) for 3 custom formulas (see cells Dashboard!C8, #n!G2 and #n!B8).

function getSharedCountActivity(url){
  return socialActivity.getSharedCount(url);
}

function getItemFeed(url, num){
  return socialActivity.getItemRSSFeed(url, num);
}

function getActivityFeed(url){
  return socialActivity.getActivityFeed(url);
}
Because the template uses my own socialActivity library, and because I’ve left the template in Development Mode, when I update the code the changes should (I hope) filter through to all the existing templates in use. This means that if I can add more services with open APIs, details should start appearing on the post activity pages (denoted with # at the beginning).
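To give a flavour of how those wrappers are used: a custom function is simply entered in a cell, so something like =getSharedCountActivity(B1) returns the share counts for the post url held in B1, and =getActivityFeed(B1) pulls back the comment/tweet/bookmark activity for the same url (B1 here is illustrative – the cells actually used in the template are the ones listed above).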

Fun with sparklines

Activity Sparklines

If you were wondering how I did the activity sparklines, here’s how. Sparklines are a built-in Google Spreadsheet formula (here’s the reference). To get them to work you need some data and, if you want, some options. For the data all you need is an array of values and sparkline will plot them equally spaced.

In this use case I’ve got comments, tweets, and bookmarks with date/times. To plot the sparkline I need to know how many of these were made between specific time periods. Another function we can use to get this data is FREQUENCY. This lets us specify some intervals and source data and returns an array of the frequency of data points.

One issue is that for each post the activity range varies, so one post might only have activity in July, and another in August. There are other formulas (MIN/MAX) to get the range values. Here’s how the final solution was pieced together.

On each post sheet (prefixed with ‘#’), cells A1:A20 contain formulas that calculate date intervals based on the minimum and maximum date ranges. Frequency data used for each sparkline is then calculated inside the sparkline formula by indirectly looking at the data in each post sheet. The ‘indirectly’ part is handled by the INDIRECT formula, which evaluates a cell reference from other cell values. So on row 9 of Dashboard INDIRECT(B9&"!D9:D") evaluates to ‘#1!D9:D’, row 10 is ‘#2!D9:D’ and so on. I could have hardcoded these cell references but I prefer the ‘clever’ way. Here’s part of the final formula with a breakdown of what it does:

SPARKLINE(FREQUENCY(INDIRECT(B9&"!D9:D"),INDIRECT(B9&"!A1:A20")),{"charttype","line";"ymin",-1;"ymax",MAX(FREQUENCY(INDIRECT(B9&"!D9:D"),INDIRECT(B9&"!A1:A20")))+1} )

  • INDIRECT(B9&"!D9:D") – frequency data source built using INDIRECT, which evaluates a cell reference from other cell values. On row 9 of Dashboard this evaluates to ‘#1!D9:D’, row 10 is ‘#2!D9:D’ and so on
  • INDIRECT(B9&"!A1:A20") – similar trick to get an interval array for the FREQUENCY formula
  • FREQUENCY(INDIRECT(B9&"!D9:D"),INDIRECT(B9&"!A1:A20")) – gives our sparkline data array
  • MAX(FREQUENCY(… – get the maximum value from the frequency data just so we can add 1 to it to give a margin on the sparkline


Hopefully you’ll enjoy it, and if you can think of any improvements or other services to tap into just get in touch.


Update: Here's the updated Topsy Media Timeline Google Spreadsheet Template v2.1, which pulls data straight from Topsy. Follow the instructions in this template for your own Twitter/Topsy media timeline.

Recently I posted an Experiment to dynamically timeline media posted on Twitter using Topsy and Timeline (my contribution to @Arras95) #arras95, which uses a Yahoo Pipe to extract tweets with images and videos via the Topsy Otter API; the data is then pulled into a Google Spreadsheet before being rendered in the Timeline tool developed by Vérité.

This recipe appears to be working, that is, the timeline is automatically updating with new media. There’s a separate question about the practicality of the timeline and its navigation which I’m quietly ignoring; instead I want to highlight some technical hits/misses and present a revised version.

Technical misses

Because Topsy includes tweets linking to image sites like Twitpic and yfrog in the search results, and these redirect to those sites rather than providing an image source url, they appear in frames (until recently Timeline used an API to convert these into thumbnails, but that was removed because the free service behind it was stopped).


To get around this I’ve modified the source Yahoo Pipe to only let image urls through (new source Pipe here). This limits results to those uploaded via the official Twitter interfaces (e.g. web/new TweetDeck). Update: I've now coded the data collection from Topsy directly in the Google Spreadsheet using Google Apps Script. The new version is available via the link at the end of the post.
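I haven't reproduced the new collection code here, but the general shape is a UrlFetchApp call to Topsy followed by a JSON parse. A rough sketch, assuming the (now long-gone) Otter API search endpoint and response structure – the real template also handles paging and writes the rows into the sheet:

function getTopsyMedia(term) {
  var url = "http://otter.topsy.com/search.json?q=" + encodeURIComponent(term); // assumed endpoint and parameters
  var response = UrlFetchApp.fetch(url); // fetch the search results
  var data = JSON.parse(response.getContentText()); // parse the JSON
  return data.response.list; // assumed structure: an array of results (tweet text, url, date etc.)
}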

If you wanted to replicate the original experiment another drawback was that you would have to host the Timeline code somewhere. As not everyone has easy access to a web host I’ve published an interface which lets you include the published Google Spreadsheet key in the URL. Here’s an example for #cam12

Here’s a new version of:

*** Topsy Media Timeline Google Spreadsheet Template v2.1 ***


[PS I’m looking forward to seeing what Sam comes up with using Timeline ;)]


Update: New version of this spreadsheet template here

There’s a new kid on the block if you are considering open source timeline tools. For a long time the Simile Exhibit Timeline tool has been the tool of choice, appearing in places like this one (click on Timeline in this page to see a history of internet search engines).

A nice feature of ‘Timeline’ is its focus on making it easy to embed content from other sites, including individual tweets, videos from YouTube and Vimeo, images hosted on Flickr or with a direct url, and audio from SoundCloud. Here’s an out-of-the-box example (I tried to use the embed code in this post but it seems to conflict with some of the blog theme’s code (a jQuery problem)).

I wanted to try out ‘Timeline’ to see how it performed under different use cases. The two I had in mind were: Timeline/Google Spreadsheet as a simple OER creation tool (in part influenced by Pat Lockley’s post on using Google Maps as an OER authoring tool); and using Google Spreadsheet’s built-in functions to scrape and automagically publish information into a dynamic timeline.

The first case is fairly easy to do using the template and instructions on the Timeline site (although more complicated than Pat’s example). A couple of ‘gotchas’ for you: when I changed the spreadsheet setting to United Kingdom formats it messed up the dates on the timeline, and I also had problems using Google Maps with external KML files (I’ve opened an issue). On to the fun bit though: gluing webservices together to generate dynamic timelines.

The glue – Google Spreadsheet

Because Google Spreadsheets sit in the cloud and have a number of ways to get live data feeds in, they are great for gluing data streams together and republishing them in different formats. Also, as Timeline likes Google Spreadsheets, all we need to do is get some data in a format Timeline likes and it should happily start updating itself … in theory anyway.

The data side left me scratching my head a bit. There’s lots of data out there, it’s just a matter of finding some with readable timestamps. I had thought about pulling information from Wikipedia but found its tables of dates not particularly machine readable. Then I started reading about the @Arras95 event which is happening as part of the JISC funded WW1C project run by the University of Oxford.

Between the 9th April and 16th May 2012 an experiment in social media will take place. We will tweet the events of the Battle of Arras in realtime, from the perspective of a neutral reporter on the field. What makes this Twitter event different from other realtime tweeting initiatives (and there are some great ones out there!) is that @Arras95 will engage online communities, crowdsourcing facts about Arras and the individuals who played a part, asking for reappraisals and additions to the action as it happens.

You can read more about how to get involved in the Contribute. Collaborate. Commemorate. post. I could just scrape the @Arras95 tweets and put them in Timeline, but where would the fun be in that ;) Instead I want to capture some of the visual richness. Whilst I could start to unpick media links to videos and images from the official Twitter stream, there’s no need as the social web search site Topsy already does this and the data is accessible via the Topsy Otter API.

More glue – Yahoo Pipes

As Arras95 hasn’t started yet, here’s an example call for #ukoer looking for video. The result is in JSON, which is usually great for other mashups but unfortunately it’s a format Google Spreadsheets doesn’t like (you can handle it with Google Apps Script, but on this occasion I was trying to avoid that route). Instead I turned to Yahoo Pipes, which hopefully won’t disappear just yet despite Yahoo laying off 2,000 of its staff this week.

Yahoo Pipe pulling Topsy data

Pipes is right at home with JSON and, what's more (despite hiding the option), you can output the data as .csv, which Google Spreadsheets does like. Here’s a Pipe which builds a search query for images and videos posted on Twitter for a user-entered search term. I’ve also prepared a slightly different Pipe which has the search hard-coded, as well as pulling tweets from the @Arras95 twitter account (in both of these you can edit/clone the source).

Piecing it together – importing Yahoo Pipes into Google Spreadsheets

From the Timeline site there is a Google Spreadsheet Template. This gives us the format we need to get the data into. For now let’s keep working with #ukoer as this gives us some data to play with. Here’s a copy of the template with an extra sheet called ‘data’. In cell B1 of the data sheet is the formula:



This comes from running the Pipe with a search term and copying the ‘Get as RSS’ link, which is:

getting the data feed

You’ll see I’ve highlighted two parts of this url. At _render I’ve changed rss to csv and in the formula the search term is replaced by a cell value (the latter was so I could share/reuse the template). I should say urlencode is a custom formula I wrote using Apps Script to encode the search term. It’s a nice little ditty that goes like this:

function urlencode(text) {
  return encodeURIComponent(text);
}
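Putting those pieces together, the formula in cell B1 of the data sheet (not shown above) will be an importData-style call on the Pipe’s csv output. This is my reconstruction of its general shape – the Pipe id, input parameter name and search-term cell below are placeholders, not the real ones:

=importData("http://pipes.yahoo.com/pipes/pipe.run?_id=PIPE_ID&_render=csv&q="&urlencode(B2))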

Down column A of the data sheet there is another custom function to convert entity numbers into characters, e.g. turning &#39; into an apostrophe. That particular ditty goes:

function entitydecode(text){
  return text.replace(/&#(\d+);/g, function(match, number){ return String.fromCharCode(number); });
}

Back in the spreadsheet, on the ‘od1’ sheet, we start pulling in the bits of data we need for the timeline. This mainly uses ArrayFormulas in row 3 to populate all the data without having to manually fill down the columns. For example in D3 we have:

=ARRAYFORMULA(IF(ISBLANK(data!E2:E),"",(data!E2:E/ 86400) + 25569))

which reads as ‘if the cell in column E of data is blank do nothing, otherwise divide by 86400 and add 25569’ (this converts the Unix epoch times used in the Topsy API into human/spreadsheet readable dates).
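To make that concrete with my own worked example (not one from the original data): the Unix timestamp 1333929600 corresponds to 9 April 2012 00:00 UTC; 1333929600 / 86400 = 15439 days since the Unix epoch, and 15439 + 25569 = 41008, the spreadsheet serial number that Google Spreadsheets displays as 09/04/2012.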

Slapping it into a Timeline

All that’s left to do is, in the spreadsheet, File > Publish to the web… and then find somewhere to host your timeline page. So that you can see what it looks like, here’s one for #ukoer.

#ukoer media timeline

@Arras95 Living Timeline

Here is the @Arras95 timeline and the source spreadsheet.

@Arras95 Dynamic Timeline

Nothing much to see right now apart from a test tweet. The theory is that this will populate itself over time as data filters into Topsy. It’ll be interesting to see if it actually works or if I need to set up an Apps Script trigger to force a refresh.

If you would like to make your own dynamic timeline from tweeted media here’s:

*** The Topsy Timeline Template ***
[File > Make a copy to use]


In Using Google Reader to create a searchable archive of Twitter mentions Alan Cann commented:

Subscribing to RSS feeds in Google Reader is my bog standard way of archiving Twitter feeds. Now to figure out how to get an RSS feed from a Google+ hashtag…

Let’s look at how it might be possible. There’s no visible RSS feed from the Google+ Search page, but looking at the API documentation there is an Activities: search method. So we could have a query like:
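The original query link isn't reproduced here, but a request to that method looked roughly like the following (reconstructed from the Google+ API v1 as I remember it, so treat the exact endpoint and parameters as an assumption):

https://www.googleapis.com/plus/v1/activities?query=%23ukoer&key=YOUR_API_KEY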

but there are a couple of problems. Data is returned in JSON and would need remapping to RSS. The real deal breaker, which is highlighted if you click on the link above, is that you need to register for an API key from Google’s API Console to get the data. So at this point I could set up a service to convert Google+ searches into RSS feeds (and someone may have already done this), show you how to do it via the Console, or show you some other way. For now I’m opting for ‘another way’.

Publishing any XML format using Google Spreadsheets

Using the same trick as in Tracking activity: Diigo site bookmark activity data in a Google Spreadsheet (who is saving your stuff), we can extract some information from a Google+ Search page like this one into a Google Spreadsheet using the importXML function and XPath queries to pull out parts of the page (here are parts of the same search pulled into a Google Spreadsheet). There is an option to publish a Google Spreadsheet as RSS or ATOM, but it’s not structured in the same way as a blog feed (the title is a cell reference etc., like this).
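The XPath needed depends entirely on how Google marks up the search results page, so the following is only the general shape of the formula, with a made-up search URL and class name rather than what the template really uses:

=importXML("https://plus.google.com/s/"&B1, "//div[@class='post-content']")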

What we need is a way to trick Google into formatting the data in a different way. As part of the Google Earth Outreach project a Spreadsheet Mapper was developed. This spreadsheet template allows you to enter geographic data which is automatically formatted as KML (another XML language, used for presenting geo data). This is achieved by creating a KML template within the spreadsheet and using the plain text output as KML.
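Applying the same idea to RSS, each item in the feed is just a cell formula that concatenates the scraped values into XML. A minimal sketch (the cell references are placeholders, not the ones used in the actual template):

="<item><title>"&B2&"</title><link>"&C2&"</link><description>"&D2&"</description></item>"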

So using the same trick here’s a:

*** Google Spreadsheet Template to Turn Google+ Search into an RSS Feed ***

Google+ Search in Google Reader

Entering a search term and publishing the spreadsheet gives me a custom RSS feed of activity data. This feed works in Google Reader (I haven’t tested any others), and with Reader we have the benefit of the results being cached (I’m still not sure what the limitations are).

Important: Some things to be aware of. Because the data for this is extracted using XPath, when Google changes the page styling this solution probably won’t work anymore. Also, the RSS feed being produced only covers the last 10 search items, so if you’ve got an active term then data might get lost.

So yet more resource based activity/paradata for you to digest!


I keep getting this nagging voice in my head to do something with Google+ and Google Spreadsheets. I've resisted until now as I don't think there is enough of an official API there to satisfy my current interest in social network analysis. This hasn't stopped other people from getting off the beaten track and developing some really useful stuff.

Who folk who put me in a g+ circle follow

A while ago Tony (Hirst) posted a Fragment: Looking up Who’s in Whose Google+ Circles…, which highlights how friend and follower information can be extracted from Google using an undocumented API. Tony followed this up with So Where Am I Socially Situated on Google+?, in which he used a genius bit of Python scripting to extract a large fragment of his extended Google+ network, which he then processed in Gephi.

Here's my take on the problem: a Google Spreadsheet template to extract Google Plus network information for analysis in the tool of your choice, in my case NodeXL.

If you want to go and play straight away here's a link to:

*** Get Google Plus Friend/Follower Information ***
(File > Make a copy to use yourself)

Below is an example output from my network generated in NodeXL (and a copy in my online NodeXL viewer ;):

mhawksey-googleplus friend follower

How to use

Instructions for use are contained in the spreadsheet’s Readme sheet. My top tip is be patient: it can take a while to grab the data, so don’t be alarmed if a minute goes by without a notification from the sub-level run. If nothing is happening after 5 minutes something has gone wrong, so try running ‘Google+ > Start/continue collection’ again (it should continue from where it left off). Also, this script is hot off the press, so if you are having problems comments are very welcome.

One other very important caveat: looking at the execution logs, the script isn’t getting friend data for everyone. I’m not entirely sure why, but it might be Google trying to prevent use of an undocumented API. This means some nodes (in one sample 10%) are lacking outbound data.

Getting data into NodeXL

I'm not familiar with getting the data into tools like Gephi (other than via a NodeXL export) so if someone wants to post or link to something to do this for other tools leave a comment or get in touch. Instead here are some basic instructions for getting the data into NodeXL:

  1. Once you’ve collected the data, in the Google Spreadsheet select File > Download As > Excel
  2. Open the downloaded file in Excel, making sure the Links sheet is active
  3. Start a new NodeXL template and from the NodeXL ribbon menu select Import > From Open Workbook
  4. Make sure the dialog is pulling data from the Links sheet and ‘Columns have headers’ is not ticked
  5. Make sure Type is set to Directed and click Prepare Data > Get Vertices from Edge List
  6. Back in the Excel spreadsheet with your downloaded data, right-click on the Nodes tab, select ‘Move or Copy…’ and change the ‘To book’ dropdown to your new NodeXL template
  7. Back in your NodeXL template open the Vertices sheet and add two ‘Other columns’ with the headings ‘Name’ and ‘ImageUrl’
  8. In the cell beneath the new Name column enter the formula =IF(VLOOKUP([@Vertex],Nodes!A:C,2,FALSE)=0,[@Vertex],VLOOKUP([@Vertex],Nodes!A:C,2,FALSE))
  9. In the cell beneath the new ImageUrl column enter the formula =VLOOKUP([@Vertex],Nodes!A:C,3,FALSE)
  10. In the cell beneath the Visual Properties Image File column you can enter the formula =[@ImageUrl] then fill the Shape column with Image (you may of course decide not to use images for your nodes, it’s up to you)
  11. Do your other analysis: graph metrics, groups, autofill etc

Some technical notes on using Google Apps Script to collect the data  

Processing time – using ScriptProperties as a cookie

Using Google Spreadsheet/Google Apps Script I can’t get the same extent of information (where Tony got a sample of who his friends were also following, I’m just getting which of my friends follow each other). This is because Apps Scripts are executed server-side and automatically time out after around five minutes. Even with a more focused data set in mind I was finding I couldn’t process all the data in 5 minutes, so I had to resort to breaking the process into chunks, using the built-in ScriptProperties (it’s like a script cookie) to store where the process was up to.
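The pattern itself is just reading and writing a script-wide property between runs. A stripped-down sketch (the property name and block size here are for illustration only; the real script does considerably more):

function continueCollection(ids) {
  // ids is the array of Google+ user ids still to be processed
  var nextIndex = parseInt(ScriptProperties.getProperty("nextIndex") || "0", 10); // where the last run stopped
  var batchEnd = Math.min(nextIndex + 50, ids.length); // only tackle the next block of 50
  for (var i = nextIndex; i < batchEnd; i++) {
    // ... fetch and store friend/follower data for ids[i] ...
  }
  ScriptProperties.setProperty("nextIndex", String(batchEnd)); // remember progress for the next run
}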

Caching in the Cloud with Google Apps Script

Google recently announced the addition of a CacheService in Google Apps Script. In Tony’s original code he was also caching responses from APIs to prevent multiple lookups. In my version there is less need for caching as I only make one grab for each friend or follower, but I still implemented the CacheService as a way to back up a sub-level processing run. The scenario is that you get most of the way through the 50 calls per block when the script fails; the next time, it can just re-read the data it already got from the cache, speeding things up. Doing this I did however find the limits of the CacheService:

  • cache names have a size limit  (Solution: I trimmed mine to 240 characters),
  • cache size looks like it might have a 100k limit (Solution: caching only the data I needed and wrapping put in a try/catch exception – if I couldn’t put it I’d just make a fresh call if I needed the data)
  • anything cached is stored as a string (Solution: some array splitting and joining in the right places)

Below is how the function for this ended up looking (I’m sure it can be refined – I’m just a hobbyist coder). The full code is here.

function getCachedDataOids(url) {
  var cache = CacheService.getPublicCache(); // initialize
  var oids = cache.get("f"+url.substr(0,240)); // get data if any (url trimmed to prevent error)
  if (oids == null || oids == "") { // null or empty means there is no cached data or the last fetch failed
    var requestData = {"method":"GET", "headers": { "User-Agent": ""}}; // prepare request
    var response = UrlFetchApp.fetch(url, requestData); // try fetch
    if (response.getResponseCode() == 200) { // if we got a response then
      var resp = response.getContentText(); // get response text
      var reobg = new RegExp("\\d{21}","g"); // only interested in Google Plus user IDs so prepare regexp
      oids = resp.match(reobg); // get an array of ids
      if (oids != null){ // if we got something, stringify it
        oids = oids.join(",");
        try {
          cache.put("f"+url.substr(0,240), oids, defCache); // try and put it in cache (defCache is the expiry time defined in the full code)
        } catch(e){
          return oids; // if too big to cache just return it
        }
      }
    }
  }
  return oids;
}

Enjoy ;)


One of the things I really liked about the network analysis and visualisation tool NodeXL which I wrote about last week was the built-in tools for grabbing data from Twitter. I said:

The advantage of NodeXL, particularly for graphing Twitter communities, is it has built-in features for grabbing the data for you. Not only that the coding is clever enough to handle the data collection for mere mortals, so when you hit your rate limit NodeXL waits until it should be able to get more data.

What I didn’t mention at the time was that it can take a long time to get complex network data (as in, set it running overnight) and also that I was having problems getting this to work. I haven’t looked closely at how NodeXL is generating the data, but say for example I wanted to find out whether the people I follow also follow each other. I can get all the user ids of the people I follow with the friends/ids API call, which gives me 497 ids.

To see if @psychemedia also follows any of these I could get his list of friend ids and see how many match. Assuming everyone I follow has fewer than 5,000 friends (the maximum the Twitter API can return in one call), I can do this in 496 further API calls. NodeXL also captures more user information (friend/follower counts, bio etc.), which it can do in batches of 100 using users/lookup.

So in summary:

  • 1x my following ids
  • 496 x who are my friends following (but more if any of my friends follow more than 5000)
  • 4 x detailed user info

Which means at least 501 calls (with 350 API calls per hour allowed, this has to be done in two batches). And if you want to look beyond that and see who the friends of your friends are, it’s a lot more.

Fortunately there is a way to get this information a lot quicker. A separate API, mentioned at the very end of Tony’s Visualising Twitter Friend Connections Using Gephi, is the Google Social Graph API. The Social Graph API “makes information about public connections between people easily available and useful” and importantly includes connection information from Twitter. Here’s a demo page from Social Graph using my Twitter id as a starting point.
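To give an idea of what that demo page is doing under the hood, a lookup was a single unauthenticated GET along these lines (the endpoint and parameter names are from memory of the now-retired API, so treat them as assumptions):

https://socialgraph.googleapis.com/lookup?q=http://twitter.com/mhawksey&edo=1&edi=1

where edo asks for outgoing edges (who the account follows) and edi for incoming ones.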

As I’ve already Ported: Tony Hirst’s Using Protovis to Visualise Twitter Connections (which uses Social Graph) to a Google Spreadsheet, it was a quick hack to modify this to take a list of twitter usernames you’re interested in and create the two-column edge list required by NodeXL, Gephi and other network visualisation tools.

How-to get Twitter edges using Google Social Graph

Before I start I should say the endpoint in this recipe is NodeXL, but Gephi is also able to import the edge csv file generated with this spreadsheet.

  1. Grab a copy of this Google Spreadsheet template for Get Twitter Edges from Screen Names

    If you are having problems getting a copy of the template try opening this spreadsheet and File > Make a copy
  2. Paste a list of twitter usernames you want to graph connections for in the source sheet (I’ve a collection of Google Spreadsheets I’ve developed which can be used to get friends/follows, search terms and more. I also recently wrote about SocialBro, which can do a .csv export of usernames (but as pointed out by Angelo you can’t export other people’s data)). If you want to play along here’s a list from #eas11
  3. In the Twitter Edges spreadsheet open Tools > Script editor…  and follow the instructions for registering for a Twitter API key and secret (if you have used any of my other Twitter/Google Spreadsheet mashups you can use the same key/secret).
  4. Once you’ve entered the key and secret don’t forget to Run > authorize
  5. Next, on line 24 you need to enter the level of data. For this example let’s stick with 1.5
  6. Now Run > getConnections.
  7. Once the yellow status bar with ‘Running getConnections’ disappears, close the script window.
  8. Back in the spreadsheet view open the ‘Edges’ sheet which should be populated with data
  9. Select File > Download as … > CSV (Current Sheet)
  10. Start a new NodeXL template
  11. Open the CSV in Excel and copy the edges data to your new NodeXL template Edges sheet
  12. From the NodeXL ribbon menu click on Prepare Data (left hand side) and select ‘Merge Duplicate Edges’ and then again from Prepare Data select ‘Get Vertices from Edge List’

And now you can do the rest of your analysis of the data (if you haven’t generated Twitter network visualisations in NodeXL before you can follow parts of the basic recipe here).

and here’s one I made earlier

#eas11 hashtag community

One very important caveat – the data from Social Graph isn’t 100% accurate and, as I posted in Social media wars: Measuring the battle lines, since Google+ has come along this data might be becoming less reliable.

PS In making this I found that when I passed 50 usernames to the Social Graph API I only got data back for 15. I wasn’t able to find any documentation on why. Reducing the call to 10 names at a time seemed to work (but it means there are bugs in my other Social Graph bits and pieces :( )


Update: The Social Graph API has been deprecated by Google and Protovis is no longer being developed. I've revisited this recipe using new tools in Deprecated Bye-bye Protovis Twitter Community Visualizer: Hello D3 Twitter Community Visualizer with EDGESExplorer Gadget

RT @psychemedia: How do folk who send you twitter messages connect? < see address bar (this is depressingly good)

Was what I tweeted on the 13th April 2011. What Tony had managed to do was use the Protovis JavaScript library to let people make their own network visualizations for Twitter search terms (if you haven’t seen Tony’s other work on hashtag communities, mainly using Gephi, it’s well worth a read). The concept for the latest iteration was explained in more detail by Tony in Using Protovis to Visualise Connections Between People Tweeting a Particular Term.

Two limitations of this solution are: it relies on a current Twitter search, which will disappear after 7 days; and it’s difficult to embed elsewhere. Tony and I had discussed collecting the data in a Google Spreadsheet using one of my many Twitter/Spreadsheet mashups and then drawing the network visualization from the data.

I thought I would go one step further not only collecting the data in the Spreadsheet but then also generate the Protovis visualization in the sheet by making a gadget for it. The reason for going down the gadget route is in theory they provide an easy way to embed the visualization in other webpages and web apps.

This wouldn’t be my first foray into gadgets, having already done My first Google Gadget to embed a flickr photoset with Galleria navigation, and I already knew that gadgetizing Tony’s original code would just need some XML wrapping and a dash of the Google Visualization API. In fact, because I used Google Apps Script to collect the data, there was very little needed to do with Tony’s code as both use JavaScript syntax.

So here it is the:

*** Protovis Twitter Community Visualizer ***
[If the link above doesn't work open this version and File > Make a copy (make sure you are signed in)]

and here is some output from it (if you are reading this in an RSS aggregator you'll need to visit this post). Update: Turns out the Protovis library doesn't work with IE, so you'll just have to use a proper browser instead:

PS you can interact with the diagram by click-dragging nodes, using your mouse scroll to zoom, and click-dragging empty parts to pan

Life is so much easier when you stand on the shoulders of giants ;)


First I should say I don’t think this is the best solution; in fact an earlier post from 2008, DIY: A wi-fi student response system, is probably a solution, if perhaps needing more tidying up, but I’m posting anyway on the off chance that this might inspire or help solve someone else's problem.

This post has come about as a result of a couple of things:

  1. I’m on a bit of a Google Apps run.
  2. I read and enjoyed Donald Clark’s Clickers: mobile technology that will work in classes.
  3. I saw and contributed to Tom Barrett’s crowdsourced X Interesting Ways to Use Google Forms in the Classroom (I added #39 Collaboratively building a timeline, which I discovered through Derek Bruff’s class example).

Concept: Using Google Forms as an interface for a mobile/desktop based voting system.

Issue: If you are asking multiple questions the form needs prepopulating with options making it a long list for the student to navigate and potentially creating a predefined route of questions.

Solution: Use a generic form with one set of response options, with the class entering a unique question identifier from which the response graphs can be generated.

The finished result

Below (if you aren’t reading this via RSS) is this Google Form. [You can make a copy of the related Spreadsheet and customise the text and options. For example, you might want to turn qID into a list option rather than free text.]


And here is a page with a summary of responses, which allows the user to choose which response set to display (screenshot shown below):

Screenshot of summary of responses

How it was done

Some of you might already be familiar with Google Chart. This service allows you to create chart images by encoding the data in the URL. I’ve used this service in a number of my mashups, in fact all of my previous voting mashups use it in some way, and not surprisingly in Generating charts from accessible data tables using the Google Charts API.

Google Chart works well if it is easy for you to get the data and format it for the image URL. For more complex tasks there is Google Visualization. The advantage of Visualization is it gives you a way of querying a data source before displaying it as a table or chart. To see what you can do (and the place where I aped most of the code for this mashup) you should visit the interactive gallery of Visualization examples.

Using The Query Language example as a starting point, I could see you could look up data from a Google Spreadsheet and filter the response using the Google Visualization API Query Language, which is very similar to SQL. What I wanted to do was SELECT the data from the spreadsheet WHERE it matched a question identifier and COUNT the number of occurrences for each GROUP of response options. An extract from the table of data is:

Timestamp    qID    Answer

My attempts to convert the SQL version of this query, which is something like:

SELECT C, Count(C) AS CountOfC WHERE B = ‘questionID’ GROUP BY C

initially left me with keyboard-shaped indentations on my forehead, but Tony Hirst (@psychemedia) was able to end my frustration with this tweet. This meant I was able to use the following VQL-friendly query:

SELECT C, Count(B) WHERE B = ‘questionID’ GROUP BY C
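For anyone wiring this up themselves, the query is sent to the published spreadsheet using the Google Visualization API’s Query class. A minimal sketch (the spreadsheet key and question id are placeholders):

var dataSourceUrl = "https://spreadsheets.google.com/tq?key=YOUR_SPREADSHEET_KEY";
var query = new google.visualization.Query(dataSourceUrl);
query.setQuery("SELECT C, Count(B) WHERE B = 'q1' GROUP BY C"); // count the responses for question q1
query.send(function(response) {
  var data = response.getDataTable(); // a DataTable of option/count pairs ready to chart or list
});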

The next part of the problem was how to let the user decide which question ID they wanted to graph. Looking at the Simple Visualization example I could see it would be easy to iterate across the returned data and push out some html using JavaScript. What I wanted to do was GROUP the questionIDs and COUNT the number of responses, which is possible using a query along these lines (the original isn’t shown here, but this is its likely shape):

SELECT B, Count(C) GROUP BY B
This returns a table of unique question IDs with a separate column counting the number of responses for each. A form list element is then populated with the results using:

for (var i=0; i < data.getNumberOfRows(); i++) {
  var ansText = data.getValue(i, 0)+' (No of votes '+data.getValue(i, 1)+')';
  var valText = data.getValue(i, 0);
  ansSet.options[ansSet.options.length] = new Option(ansText, valText);
}

And that’s it. If you want to play around with this, the code is here. Enjoy, and if you find this idea useful or spot any issues, as always I value your comments.


I’ve written about the different ways you can do electronic voting without buying clickers a number of times, from creating a simple wi-fi system, to using third-party services, to even using Twitter (more on the latest developments with this one in a separate post).

For the ‘eAssessment Scotland 2010: Marking the decade’ conference we ran a poster competition and, not wanting to collect lots of slips of paper, we thought it would be good to have an SMS vote. Having seen Learning Apps (formerly the xlearn textwall) being used at other events, and knowing it allowed data to be exported via RSS, it was the ideal candidate. Using the same concept as voting via Twitter (TwEVS), counting the occurrences of options after a hashtag, it was easy to just substitute the feed from Twitter search with the one from Learning Apps.

Wanting to add a bit more than just a static Google Chart I was interested to see if I could get the graph to update automatically without browser refresh. After looking at a couple of options including the Javascript plotting library ‘flot’ I came across a post by Sony Arianto Kurniawan on Create Realtime Chart Without Page Refresh using FusionCharts Free and Ajax (prototype.js), which worked a treat.

The advantage of this home grown solution is that it gives you some flexibility in how it is used, in particular using the space before the question identifier for users to explain why they think their answer is correct. You can access the voting site using the link below (here is also the source code for download).

*** XVS – SMS voting using Learning Apps ***


  1. Rent a textwall from Learning Apps (xlearn) for £25/year (this solution only requires you to receive messages so you won’t need any additional credit unless you plan on contacting students via SMS)
  2. Once created login to the xlearn admin panel and click either ‘Text Wall’ or ‘Inbox’ and note/copy the code after ' (might be 12 random characters) Update: There's a new url so it's the code after
  3. When you want to ask a question give users the options and instructions like “to vote for option ‘A’ send a text message to 07XXX XXX XXX with ‘xyz #q1 A’” (where 07XXX XXX XXX is the mobile number and xyz is the short code provided by Learning Apps). The question identifier (in this example #q1) can be anything you like as long as it starts with ‘#’, and the options can be anything you like (a, b, c … 1, 2, 3 … etc.)
  4. On the XVS site enter your textwall RSS id saved earlier and the hashtag identifier without the ‘#’ (in this example it would be ‘q1’). You can also optionally set the maximum number of options to graph. The reason you would use this is to try and prevent any malicious uses like sending rude messages.
  5. Once the form is submitted you can swap between the live results and a static chart (the url of this page can be included in PowerPoint slides, allowing you to link directly to the results). Below is the format it uses: ?id={see note}&tag=q1&options=-&type=live

id – an encoded version of your textwall RSS id. It’s encoded to try and prevent direct access to your entire text wall. The encoded id is fixed so can be reused

tag – your question identifier

options – optional number to restrict the number of options displayed

type – setting this to ‘live’ displays the chart with realtime updates; leaving it blank displays the static chart

One last thought. As this solution uses RSS feeds to pull the voting results, just as with the Twitter voting example, it would be very straightforward to combine the two (already a feature of, but something I’m not interested in doing).