On Friday (31st August) I was at eAssessment Scotland 2012. This is an event I’ve had a long running involvement with and it’s always a pleasure to travel up to Dundee with around 300 other delegates to attend one of the premier eAssessment events in the UK (if not Europe the world). Sheila was also at the conference at has already posted some notes and there isn’t much for me to add (although I do want to write something on Open Badges after attending Doug Belshaw's session). Instead I wanted to have a quick look at the conference twitter backchannel.

The archive of 1196 tweets taken between 23rd August and 3rd September has contributions from 184 twitter accounts. The median number of tweets per account was 2.


As well as monitoring the number of tweets being posted during the day, following on from my work on analysing threaded Twitter discussions from large archives using NodeXL #moocmooc, I included the number threaded replies* in the archive. Of the 1,196 #eas12 tweets 10% (n.115) were threaded replies. I was able to monitor this in real-time with the graph below which is embedded in the dashboard.

*the metadata from Twitter includes a ‘in reply to id’ field which identifies if the tweet is part of a thread. This isn’t 100% accurate as it is only recorded if the user uses a reply button.

Graph of tweets and threaded replies at #eas12
Graph of tweets and threaded replies at #eas12

At the time I commented:

interesting that #eas12 tweeters are in broadcast mode (red line bottom left graph) docs.google.com/spreadsheet/cc…

— Martin Hawksey (@mhawksey) August 31, 2012

which was responded to (as a threaded reply which creates all sorts of problems from Twitter’s embedded tweet option):

@mhawksey What does that say about us then... (/makes effort to think, edit and RT...) :D #eAS12

— Derek Jones (@plug103) August 31, 2012

@plug103 perhaps if the person at the front is in broadcast mode then it's okay for us to be our own individual mini-broadcasters ;) #eas12

— Martin Hawksey (@mhawksey) August 31, 2012

So how does this compare with #moocmooc. Hopefully the graph below illustrates this (interactive version here), but in numbers the '#moocmooc archive contains 6,883 tweets where 44% (n.3046) are conversation threads.

Graph of tweets and threaded replies at #moocmooc
Graph of tweets and threaded replies at #moocmooc

It’s perhaps not surprising that different modes of twitter usage produce different conversation patterns. Next week I’ll be at ALT-C 2012 in Manchester where I’ll also be monitoring the twitter backchannel. This is a 3 day event aimed at learning technologists so it’ll be interesting to see if a different pattern of usage emerges.

Something else I’m reminded preparing the TAGSExplorer view of eas12 (graph of replies – solid lines, and in this case mentions – dashed lines, between twitter account using eas12 hashtag) is that there is more interaction going on that could be analysed (there’s a PhD in all of this – if Twitter keep the data flowing).


Other links


I noticed a couple of search referrals around scheduling timed triggers in Google Apps Script so here is what I know and how to do it. Within Google Apps Script you can create basic time-based triggers (poor mans cron) to run functions on specific date/time intervals. Using the ‘Current Script’s triggers’ dialog (accessed from Script Editor > Resources > ‘Current Script’s triggers’):

  • minute (1, 5, 10 ,15 or 30)
  • hour (2, 4, 6, 8 or 12)
  • day (hour of day intervals)
  • week (day of week and hour of day interval)
  • specific date/time (in YYYY-MM-DD HH:MM format)

Current project's triggers dialog

Recently on one of my posts someone asked: ‘Is there a way to schedule when the script runs?’. In particular they were interested in running a particular function every 10 minutes for a set period.

I did briefly look at scripting time-based triggers, but quickly realised that my original plan to control a number of timed triggers from a central spreadsheet wasn’t possible because the class TriggerBuilder doesn’t allow forSpreadsheet on TimeBased triggers. Instead I came up with this code snippet:

function scheduledCollection(){
  var schedule = [];
  // dates/times in mm/dd/yyyy hh:mm - timezone matches settings in File > Project properties 
  schedule.push({start:"08/29/2012 15:00", end:"08/29/2012 16:00"});
  schedule.push({start:"08/29/2012 20:00", end:"08/29/2012 22:00"});

function checkCollect(schedule){
  var now = new Date();
  for (i in schedule){
    var start = new Date(schedule[i].start);
    var end = new Date(schedule[i].end);
    if (now > start && now < end){
      // insert function you want to run here


To use you enter the time ranges you want to use in the first function, enter the sub function you want to run in ‘interest function you want to run here’, and then create basic time-based triggers at a low interval (in the example above every 10 minutes) calling scheduledCollection. Enjoy!

Share this post on:
| | |
Posted in Google Apps Script on by .

Here's some posts which have caught my attention this month:

Automatically generated from my Diigo Starred Items.
Share this post on:
| | |
Posted in Starred on by .


This method uses a UK based SMS gateway and most likely not suitable for international use

Two years ago I wrote how you could have a  free SMS voting using intelliSoftware SMS Gateway service. This recipe automatically forwarded text messages from the IntelliSoftware SMS gateway to a blogger account using posting via email. Votes were then extracted from messages from the blogs RSS feed using some PHP code on my server.

Last year a modified version of this was used to collect votes for the poster competition at eAssessment Scotland 2011. I was recently asked if the recipe would still work for this year’s conference. It does but I thought I could make it better.

30 lines of code, source is in the templateThe main change is to directly ingest SMS messages into a Google Spreadsheet (using 30 lines of code) which makes it easier for manipulation and presentation. The method for doing this is relatively simple because the IntelliSoftware gateway has a HTTP interface and you can also use Google Spreadsheets as a Database – INSERT with Apps Script form POST/GET submit method.

If you would like to do this yourself here’s how:

  1. Signup for an account at intelliSoftware (it’s free!)
    Note: the username you select is also used to direct texts so you might want to use a class or course name)
  2. Open a copy of this Google Spreadsheet template (also free)
  3. Open Tools > Script editor...
  4. Select Run > setup and okay, then Publish > Deploy as web app.. and:
    - enter Project Version name and click 'Save New Version'
    - set execute web app 'as me'
    - security level as 'anyone, even anonymously'
  5. Click Update and copy the service url you are given (it will look like https://script.google.com/macros/s/[random_characters]/exec
  6. Now open your IntelliSoftware control panel
  7. Click on Forwarding and change, tick 'Enable incoming message forwarding' and change forwarding type to http
  8. Copy the web app url into the address field and click Save

To receive messages tell users to send a text message to 07786 XXX XXX with ‘xyz and their message’ (where 07786 XXX XXX is the mobile number found in the Trial Service section and xyz is your username created with intelliSoftware).

Simple response graphIn the example template I show how you can parse messages to generate a response graph. You might also want to look at how I’ve used a Google Form for Hacking stuff together with Google Spreadsheets: A simple electronic voting system, at the very basic level you’ve got a free SMS textwall to play with. If you do come up with any interesting mashups please leave a note in the comments :)


This post is a bit messy. I got caught trying out too many ideas at once, but hopefully you'll still find it useful

Sheila recently posted Analytics and #moocmooc in which she collects some thoughts on the role of analytics in courses and how some of the templates I’ve developed can give you an overview of what is going on.  As I commented in the post I still think there is more work to make archives from event hashtags more useful even if just surfacing tweets that got most ‘reaction’.

There are three main reactions that are relatively easy to extract from twitter: retweets, favouring and replies. There are issues with what these actions actually indicate as well as the reliability of the data. For example users will use ‘favouring’ in different ways, and not everyone uses a twitter client that can or uses a reply tweet (if you start a message @reply without clicking a reply button Twitter looses the thread).

But lets ignore these issues for now and start with the hypothesis that a reaction to a tweet is worth further study. Lets also, for now, narrow down on threaded discussions. How might we do this? As mentioned in Sheila's post we’ve been archiving #moocmooc tweets using Twitter Archiving Google Spreadsheet TAGS v3. As well as the tweet text other metadata is recorded including a tweet unique identifier and, where available the id of the tweet it is replying to.

Google Spreadsheet columns

We could just filter the spreadsheet for rows with reply ids but lets take a visual approach. Downloading the data as a Excel file we can open it using the free add-in NodeXL.

NodeXL allows us to graph connections, in this case conversation threads. NodeXL allows use to do other useful things like group conversations together to make further analysis easier. Skipping over the detail here’s what you get if you condense 6,500 #moocmooc tweets into grouped conversations.

 moocmooc grouped converstations

This is more than just a pretty picture. In NodeXL I’ve configured it so that when I hover over each dot which represents and individual tweet I get a summary of what was said by who and when (shown below).

NodeXL being used to examine nodes

It’s probably not too surprising to see strings of conversations, but by graphing what was an archive of over 6500 tweets we can start focusing on what might be interesting subsets and conversation shapes. There are some interesting patterns that emerge:

conversation group 1 conversation group 2conversation group 3

Within NodeXL I can extract these for further analysis. So the middle image can be viewed as:

Examination of conversation group 2

There’s a lot more you can do with this type of data, start looking at how many people are involved in conversations, number of questions per conversations and lots more. I should also say before I forget that NodeXL can be configured to collect twitter search results with it’s built-in twitter search tool. It can also be configured to do the collection on a regular basis (hmm I should really have a go at doing that myself). So potentially you’ve got a nice little tool to analysis twitter conversations in real-time …

If you’d like to explore the data more it’s available from the NodeXLGraphGallery. I’m going off to play some more ;)

Share this post on:
| | |
Posted in NodeXL, Twitter and tagged on by .


Templated Export allows users to reshape and selectively publish data from Google Spreadsheets in custom formats. The tool is powered by Google App Script which allows integration with Google Spreadsheets using existing authenticated access. Example uses include: turning data from a spreadsheet into a RSS feed and publishing data from a spreadsheet in JSON format to be used by other tools/services. The framework for creating a custom wrapper for your data means there are many more opportunities.

Templated Export

Main features

  • Cloud based
  • Selective publication of data (down to individual cells)
  • Publication of data in different file types (text, atom, rss, xml, json)
  • Complete customisation of output file structure


  1. Open Templated Export for Google Spreadsheets
  2. File > Make a copy of the script
  3. File > Manage Versions and enter a description and Save New Version
  4. Run > onInstall twice (first time to authorise, second to fun the function)
  5. Publish > Deploy as web app... and click Update and open the 'Current web app URL' or click on the 'latest code' link if available. Bookmark this link (there are many tutorials for turning chrome bookmarks into applications)

Example usage

Generating a custom RSS Feed of data from a Google Spreadsheet

Publishing data from other sources like Guardian Datastore

Selectively publishing cell ranges as JSON

Some example templates

If you download this JSON file it contains the templates used in the demonstrations above. Use the Local Template: Backup/Restore button in Templated Export to install them.

Comments/suggestions welcome

I make the usual warranties that I'm not a professional programmer and I'm already aware of some niggly bugs, but comments and suggestions are always welcome. If there is enough interest in this tool I'm happy to put it on a community coding site or donate it to someone who actually knows what they are doing ;)


Warning: Very techie post, the noob version is here Google Spreadsheet Template for getting social activity around RSS feeds HT @dajbelshaw

In my last post announcing a Google Spreadsheet Template for getting social activity around RSS feeds I mentioned it was built from spare parts. It came out of on-going work at CETIS exploring activity data around CETIS blogs (including this one) and JISC funded projects. In this post I’ll highlight some of the current solutions we are using and how they were developed.

To put this work into context it aligns with my dabbling's as a data scientist.

A data scientist … is someone who: wants to know what the question should be; embodies a combination of curiosity, data gathering skills, statistical and modelling expertise and strong communication skills. … The working environment for a data scientist should allow them to self-provision data, rather than having to rely on what is formally supported in the organisation, to enable them to be inquisitive and creative. [See this post by Adam Cooper for the origins of this definition]

It’s perhaps not surprising that the main tool I use for these explorations with data is Google Spreadsheets/Google Apps Script. The main affordances are a collaborative workspace which can store and manipulate tablature data using built-in and custom functionality. To illustrated this let me show how easy it is for me to go from A to B.

Example 1: OER Rapid Innovation Projects – Blog post social activity

I’ve already posted how we are Using Google Spreadsheets to dashboard project/course blog feeds #oerri. For this we have a matrix of project blog feeds and a set predefined categories/tags. As projects make different posts a link to them appears on the dashboard.

OERRI Project Post Directory

As part of the programme completion JISC are piloting a new project survey which will include reporting the ‘reach of the papers/articles/newsitems/blogposts out of the project’. Out of curiosity I wanted to see what social share and other activity data could be automatically collected from project blogs. The result is the page shown below, which is also now part of the revised OERRI Project Post Directory Spreadsheet. The important bit is the columns on the right with social counts from 10 different  social network services. So we can see a post by Xenith got 29 tweets, a post by bebop got 5 comments, etc.

Caveat: these counts might not include modified urls with campaign tracking etc …

OERRI Post Social Activity

How it works

The three key aspects are:

  • Getting a single sheet of blog posts for a list of RSS feeds.
  • Getting social counts (Likes, Tweets, Saves) for a range of services.
  • Allowing columns sorting on arrays of results returned by custom

Getting a single sheet of blog posts
You can use built in formula like importFeed to do this, but you are limited to a maximum 20 post items and you need to do some juggling to get all the results from 15 projects in one page. An alternative, which I quickly ruled out was fetching the rss xml using Google Apps Script and parsing the feed. The issue with this one is RSS feeds are usually limited to the last 10 posts and I wanted them all.

The solution was to use the Google Feed API (can’t remember if this is on Google cull list). The two big pluses from this are it can pull historic results (up to 100 items), and results can be returned in JSON which is easy to work with in App Script.

Here’s what the function looks like which is called as a custom formula in cell Data!A4 (formula used looks like =returnPostLinks(Dashboard!A3:B17,"postData")):

function returnPostLinks(anArray,cacheName){
  var output = [];
  var cache = CacheService.getPublicCache(); // using Cache service to prevent too many urlfetch 
  var cached = cache.get(cacheName);
  if (cached != null) { // if value in cache return it
    output = Utilities.jsonParse(cached);
    for (i in output){
      output[i][3] = new Date(output[i][3]); // need to recast the date
    return output; // if cached values just return them
  // else make a fetch
  var options = {"method" : "get"};
  try { 
    for (i in anArray){ // for each feed url
      var project = anArray[i][0]; // exact the project name from 1st col
      if (project != ""){ // if it's not blank
        var url = "https://ajax.googleapis.com/ajax/services/feed/load?v=1.0&num=100&q="+anArray[i][1];
        var response = UrlFetchApp.fetch(url , options);
        var item = Utilities.jsonParse(response.getContentText()).responseData.feed.entries; // navigate to returned items
        for (j in item){
          // for each item insert project name and pull item title, link, date, author and categories (other available are content snippet and content)
          output.push([project,item[j].title,item[j].link,new Date(item[j].publishedDate),item[j].author,item[j].categories.join(", ")]);
    // cache the result to prevent fetch for 1 day
    cache.put(cacheName, Utilities.jsonStringify(output), 86400); // cache set for 1 day
    return output; // return result array
  } catch(e) {
    return output;


Getting social share counts

Collecting social share counts is something I’ve written about a couple of times so I won’t go into too much detail. The challenge is this project was to get results back from over 100 urls. I’m still not entirely sure if I’ve cracked it and I’m hoping more result caching helps. The issue is because the custom formula used in this appears in each row of column H on the Data sheet, when the spreadsheet opens it simultaneously tries to do 100 UrlFetches at the same time which Apps Script doesn’t like (Err: Service has been invoked too many times). {I’m wondering if away around this would be to set a random sleep interval (Utilities.sleep(Math.floor(Math.random()*10000))… just tried it and it seems to work}

Sorting columns when arrays of results are returned by custom formula

Sort order selectionWhen you use spreadsheet formula that write results to multiple cells sorting becomes a problem because the cells are generated using the CONTINUE formula which evaluates a result from the initiating cell. When you sort this reference is broken. The solution I use is to pull the data into a separate sheet (in this case Data) and then use the SORT formula in the ‘Post SN Count’ sheet to let the user choose different sort orders. The formula used in ‘Post SN Count’ to do this is in cell A4: =SORT(Data!A4:Q,D1,IF(D2="Ascending",TRUE,FALSE)). To try and prevent this being broken an ‘Item from list’ data validation is used with the ‘Allow invalid data’ unchecked.

Time to hack together: functional in 2hrs, tweaks/cache fixes 4hrs

Example 2: JISC CETIS Blog Post Dashboard

Having got social share stats for other peoples feeds it made sense for the JISC CETIS blogs. At the same time because I keep my blog on my own domain and Scott (Wilson) is on wordpress.com it was also an opportunity to streamline our monthly reporting and put all our page views in one place.

Below is a screenshot of this de-page viewed version of the CETIS dashboard (the code is all there I’ve just haven’t authenticated it with our various Google Analytic accounts). With this version the user can enter a date range and get a filtered view of the posts published in that period. Sort and social counts make a re-appearance with the addition of an ‘Examine Post Activity’ button. With a row highlighted, clicking this gives individual post activity (shown in the second screenshot). The code for this part got recycled into my last post Google Spreadsheet Template for getting social activity around RSS feeds.

CETIS Post Dashboard

Individual post activity

Getting to B

To get to this end point there are some steps to pull the data together. First the ‘Sources’ sheet pulls all our blog urls from the CETIS contacts page using the importHtml formula. Next on the ‘FeedExtract’ sheet the array of blog urls is turned into a sheet of posts using the same code in example 1. Social counts are then collect on the ‘Data’ sheet which is read by the Dashboard sheet.

A detour via page views

Having some existing code to get stats from Google Analytics made it easy to pass a url to one of two GA accounts and get stats back. Because Scott is on wordpress.com another way was required. Fortunately Tony Hirst has already demonstrated a method to pull data from the WordPress Stats API into Google Spreadsheets. Following this method stats are imported to a ‘Scott’sStats’ sheet and tallied up with a code loop (WordPress stats come in on per post, per day).

WordPress Stats API import

Here’s the custom formula code to return page views for a url:

function getCETISPageViews(url, startDate, endDate){
  // format dates
  var startDate = Utilities.formatDate(startDate, "BST", "yyyy-MM-dd");
  var endDate = Utilities.formatDate(endDate, "BST", "yyyy-MM-dd") ;
  // extract domain and path
  var matches = url.match(/^https?\:\/\/([^\/:?#]+)(?:[\/:?#]|$)/i);
  var domain = matches && matches[0];
  var path = url.replace(domain,"/");
  // switch for google analytic accounts
  var ids = false;
  if (domain == "http://mashe.hawksey.info/"){
    ids = ""//ga-profile-id  
  } else if (domain == "http://blogs.cetis.ac.uk/"){
    ids = ""//ga-profile-id;
  } else if (domain == "http://scottbw.wordpress.com/"){
    // code to compily stats imported to sheet
    // get all the values
    var doc = SpreadsheetApp.getActiveSpreadsheet();
    var sheet = doc.getSheetByName("Scott'sStats")
    var data = sheet.getRange(1, 4, sheet.getLastRow(), 2).getValues();
    var count = 0;
    // count on url match
    for (i in data){
      if (data[i][0] == url) count += data[i][1];
    return parseInt(count);
  if (ids){
    // GA get data using sub function
    return parseInt(getGAPageViews(startDate, endDate, path, 1, ids));

Note: I use a custom Google Analytics function available in the spreadsheet script editor. There’s a new Analytics Service in Apps Script as an alternative method to connect to Google Analytics.

Time to hack together: functional in 3hrs, tweaks fixes 1hr


So there you go two examples of how you can quickly pull together different data sources to help us record and report the social reach of blog posts. You’ll notice I’ve conveniently ignored whether social metrics are important, the dangers of measurement leading to gaming, the low recorded social activity around OERRI projects. I look forward to your comments ;)


The 'Deprecated notice' is because the access to the Topsy API (which is used to extract Twitter activity) is now behind a developer key.

A post from Alan Levine (@cogdog) on Syndication Machines (plus syndication of syndication) which details how feeds for the cMOOC course ds106 are aggregated and resyndicated got me thinking if there was additional data around student blog posts that could be usefully captured. Previously for something like that I would have turned to PostRank Analytics, which allowed you to specify any rss feed and it would aggregate social activity like tweets, bookmarks from a wide range of services and let you see it all in one place. Unfortunately PostRank were bought by Google and while some of this data is now accessible from Google Analytics it’s restricted to your account and data from social networks Google isn’t directly competing with.

So I thought it would be useful/interesting to start looking at what social activity you could pull together without authenticated access or API keys. So far I’ve identified couple and given some overlap with other work (and some late night play time), I’ve come up with a Google Spreadsheet template which pulls data together from comment feeds, Twitter and Delicious (with social counts for these plus Facebook, LinkedIn and Google+). You can give the spreadsheet a try with the link below:

*** Blog Activity Data Feed Template ***

Blog Activity Data Feed Template Overview


  • Collects last 10 posts from a RSS feed
  • Uses sharedcount.com to get overall post share counts from Facebook, Twitter, Google+, LinkedIn and Delicious
  • For supported blogs (mainly manila WordPress) extracts comment snippets
  • Uses Topsy to collect tweets mentioning the post
  • Fetches all the Delicious bookmarks for the post url
  • Summarises activity from comments, tweets and bookmarks on the dashboard

You can make as many copies of the template as you like to track other RSS feeds.

If you prefer a slightly different overview then the RSS Feed Social Share Counting Google Spreadsheet gives a table of share counts for a feed

*** RSS Feed Social Share Counter ***

Social share count matrix

Technical highlights

There are a couple of cool things under-the-hood for the Activity Data template worth noting.

Managed Library = Easy to add more sources

So far I’ve limited the template to sources that have easy access. Previously if an API disappeared, changed or I discover another service I would have to modify the template and users would have to make a fresh copy. With Managed Libraries the Google Spreadsheet template now only needs 11 lines of code (shown below) for 3 custom formulas (see cells Dashboard!C8, and #n!G2 and #n!B8).

function getSharedCountActivity(url){
  return socialActivity.getSharedCount(url);

function getItemFeed(url, num){
  return socialActivity.getItemRSSFeed(url, num)

function getActivityFeed(url){
  return socialActivity.getActivityFeed(url);

Because the template uses my own socialActivity library and because I’ve left the template in Development Mode when I update the code changes should (I hope) filter to all the existing templates in use. This means if I can add more services with open APIs details should start appearing on the post activity pages (denoted with # at the beginning).

Fun with sparklines

Activity SparklinesIf you were wondering how I did the activity sparklines, here’s how. Sparklines are a built-in Google Spreadsheet formula (here’s the  reference). To get them to work you need some data if you want some options. For the data all you need is an array of values and sparkline will plot them equally spaced.

In this use case I’ve got comments, tweets, and bookmarks with date/times. To plot the sparkline I need to know how many of these were made between specific time periods. Another function we can use to get this data is FREQUENCY. This lets us specify some intervals and source data and returns an array of the frequency of data points.

One issue is that for each post the activity range varies, so one post might only have activity in July, and another in August. There are other formula (MIN/MAX) to get the range values. Here’s how the final solution was pieced together.

On each post sheet (prefixed with ‘#’) in cells A1:A20 are some formula to calculate date intervals based on the minimum and maximum date ranges. Frequency data used for each sparkline is then calculated inside the sparkline formula by indirectly looking at the data in each post sheet. The indirectly part is handled by the INDIRECT formula which evaluates a cell reference from other cell values. So on row 9 of Dashboard INDIRECT(B9&"!D9:D") evaluates to ‘#1!D9:D’, row 10 is ‘#2!D9:D and so on. I could have hardcoded these cell references but I prefer the ‘clever’ way. Here’s part of the final formula with a breakdown of what it does:

SPARKLINE(FREQUENCY(INDIRECT(B9&"!D9:D"),INDIRECT(B9&"!A1:A20")),{"charttype","line";"ymin",-1;"ymax",MAX(FREQUENCY(INDIRECT(B9&"!D9:D"),INDIRECT(B9&"!A1:A20")))+1} )

  • INDIRECT(B9&"!D9:D") – Frequency data source built using INDIRECT which evaluates a cell reference from other cell values. On row 9 of Dashboard this evaluates to ‘#1!D9:D’, row 10 is ‘#2!D9:D
  • INDIRECT(B9&"!A1:A20") – similar trick to get an interval array for the FREQUENCY formula
  • FREQUENCY(INDIRECT(B9&"!D9:D"),INDIRECT(B9&"!A1:A20")) – gives our sparkline data array
  • MAX(FREQUENCY(… – get the maximum value from the frequency data just so we can add 1 to it to give a margin on the sparkline


Hopefully you’ll enjoy and if you can think of any improvements or other services to tap into just get in touch.


For those not familiar with MOOCs Dave Cormier & co. have a nice video explaining them here. Dave should know as the term:

MOOC was coined in 2008 by Dave Cormier, Manager of Web Communication and Innovations at the University of Prince Edward Island, and Senior Research Fellow Bryan Alexander of the National Institute for Technology in Liberal Education in response to an open online course designed and lead by George Siemens, associate director, Technology Enhanced Knowledge Research Institute at Athabasca University and Stephen Downes, Senior Researcher at The National Research Council (Canada). The course was called "Connectivism and Connective Knowledge" and was presented to 25 tuition-paying students at the University of Manitoba in addition to 2,300 other students from the general public who took the online class free of charge – from Wikipedia

If MOOCs aren’t already on the Gartner ‘peak of inflated expectations’ I’m sure by the end of this year they will as various people are gambling Where there’s MOOC, there’s brass?. Recently projects like Coursera, Udacity and edX have attracted over $100 million in investment to deliver MOOCs. It’s worth reiterating that not all MOOCs are the same and as George Siemens recently wrote:

Largely lost in the conversation around MOOCs is the different ideology that drives what are currently two broad MOOC offerings: the connectivist MOOCs (cMOOCs?) that I [George Siemens] have been involved with since 2008 (with people like Stephen Downes, Jim Groom, Dave Cormier, Alan Levine, Wendy Drexler, Inge de Waard, Ray Schroeder, David Wiley, Alec Couros, and others) and the well-financed MOOCs by Coursera and edX (xMOOCS?).

George’s post was entitled ‘MOOCs are really a platform’, which got me thinking that given the recent announcement that Coursera is offering it’s platform to other partners, including the University of Edinburgh, this falls into the ‘as a Service’ model as used in cloud computing. So Coursera are offering chosen partners (for a fee) ‘MOOCs as a Service’ (MaaS), or using the distinction from above ‘xMaaS’.

One other consideration is whether xMOOCs are really MOOCs. Martin Weller recently questioned if the recent offerings from Coursera et al. are truly open. So ignoring the xMOOCs for now I thought it would be useful to survey some ‘real’ MOOCs and identify if there were any technological trends and possible future directions. This process has left me the questioning if there is a need for more shared guidance and support on aggregation architectures. [This post is mainly for me to record thoughts and resources but it might be something we (JISC CETIS) take forward. If this is something you’d like please comment or share this post to let us gauge the level of interest]

How the data was collected

A list of MOOCs was taken from the ‘(Very) Partial History of MOOCs’ section of mooc.ca (maintained by Stephen Downes). This was added to with UK based examples funded by JISC/HEA. All 14 courses were then visited and examined for technologies used. The criteria for inclusion were that the course had to have finished or started.

Because of the nature of MOOCs there is a grey line between the technology: chosen by the course team; recommended to students; and used by students either as part of their personal learning environment or used to facilitate smaller study groups. A distinction was attempted to only include technologies chosen/used by the course team.

Data collection was also not entirely conclusive due some of the early MOOCs (CCK08  and CCK09) no longer having a web presence and others like MobiMOOC reusing their existing web presence (the 2011 version of the course has been edited to become the version for the 2012 session).

A Google Spreadsheet with the data and annotations is here. Please contact me for edits/additions.


Frequency of MOOC tech

MOOC tech treemapAbove is the obligatory word cloud of the frequency different technologies were used in the MOOCs surveyed (if you are feeling brave you can explore the data as a treemap on ManyEyes).

Two things are apparent in this data. Firstly, email is often a hidden technology. Services like Google Groups, Twitter and Facebook all allow (and in some cases rely on) email notifications. Secondly, it’s of limited use to know what technologies are being used, the important question is how they are being used.

We can get some insight into this from the diagram below taken from George Siemens “What is the Theory that Underpins Our MOOCs?

Siemens: MOOC Architecture

gRSShopper is an application that allows you to define your own community of RSS feeds, aggregates content from those feeds and organizes it, and helps you integrate that content into your own posts, articles and other content [source]

Because of the connectivist theory underpinning cMOOCs gRRShopper is a key component in aggregating distributed activity. It’s worth noting that only 5 of the 14 MOOCs surveyed used gRSShopper, but there is evidence that other mechanisms are in place to preform similar functions. For example in Digital Storytelling (DS106) this functionality is handled by the FeedWordPress plugin which allows administrators to specify feeder RSS feeds and selectively repost content to a self hosted WordPress blog. In PHONAR and PICBOD, which are photography based courses, students were directed to publish work to Flickr using a common tag to allow aggregation of work.

The general sense of it all – distributed, chaotic, emergent 

The ‘distributed, chaotic, emergent’ comes from a recent presentation by George Siemens for EDUCAUSE talking about cMOOCs. It’s apparent from the survey of MOOC technology that course teams are taking a loosely joined set of tools that they are comfortable with to facilitate a shared experience with the learner. As commented by Downes when writing about gRSShopper “the users are assumed to be outside the system for the most part, inhabiting their own spaces, and not mine”. It’s also apparent that people are taking discipline based approaches using tools aligned to study areas as previously described with PHONAR/PICBOD.

Even with the bespoke nature of MOOCs there are still opportunities to start collectively raiding the parts bin. Given the widespread use of Twitter in MOOCs are there tools/techniques required to aggregate and disseminate the course discussions? Given the wide use of WordPress within education are there opportunities for MOOC specific themes or plugins? With the ability to freely record and stream video from a Google Hangout do we need a wrapper to allow comment collection and annotation?


It’s just not the technology that sits on top. It’s been fascinating to read the blog posts from JIm Groom as he and his colleagues come to grips with the infrastructural issues of running ds106.us. As commented by Stephen Downes, and as Jim is finding, “aggregation kills websites”. So if it’s the aggregation of content that’s the glue in cMOOCs, perhaps this becomes the area of priority? Perhaps this is the area where JISC CETIS can most useful provide or facilitate guidance?  As Jim recently commented:

No where in the raging discussion around MOOCs is there anyone talking about sharing the infrastructural/architectural work they’ve done freely with others – Jim Groom in Integrating FeedWordPress with BuddyPress

So having loosely started with the pedagogy, looked at some of the technology I’m beginning to think aggregation architecture/infrastructure might be the best place where an organisation like JISC CETIS could help. Do you agree?   


Today Tony Hirst posted Data Shaping in Google Refine – Generating New Rows from Multiple Values in a Single Column. Coincidently I faced a similar problem today working with a Google Spreadsheet so thought I’d share my solution.

Here’s a Google Spreadsheet where I’ve been capturing some MOOC infrastructure (contract me if you’d like to contribute/edit). For the technology column I’ve just been dumping in comma separated terms.

Source spreadsheet

Having entered some data I started thinking about how it could be presented. One idea was to use the Smile timeline widget, which I’ve experimented with before here. To get the data in the right shape I need to split the technologies to have one per row. To do this I wrote the custom function below in Google Apps Script which splits the technology into one per row (I’m undecided now if I’ll still use this for a timeline – it’s another case of it’s taken longer to write this up than actually do it)

Reshaped spreadsheet

// splitColumnAndRepeatRows iterates row by row in the input range and returns an array spliting
// by a comma separated column
// Arguments:
//   - anArray: values[][]
//   - splitColumnIndex: specifies the column number where the cell values need to be comma split 
//                       0 indexed
// Returns an [][] Array of values.
function splitColumnAndRepeatRows(anArray, splitColumnIndex) {
  var output = [];
  for (i in anArray){ // for each row
    var splitArray = anArray[i][splitColumnIndex].split(","); // split values in specified column
    for (j in splitArray){ // for each split cell value
      var row = anArray[i].slice(0); // take a copy of source row 
      row[splitColumnIndex] = alltrim(splitArray[j]); // replace comma separated value with current split value
      output.push(row); // push new row to output
  return output;

function alltrim(str) {
  return str.replace(/^\s+|\s+$/g, '');

Update: added an alltrim function to remove whitespaces