Archive for the 'Google Apps Script' Category

Creating a ‘full fat’ RSS feed for Google Gmail labels (enabling some dark social judo)

In this post I want to cover three things. First I want to introduce a little app I’ve developed which allows you to create a RSS feed for any of your Gmail labels (with the option to remove certain links – useful if you don’t want others unsubscribing you from mailing lists). Secondly I explain how it was made and how you can use it yourself. Finally I want to discuss how this could be used in an open course environment, utilising the vast processing power from services like Twitter and reusing there target marketing emails to your benefit with a bit of ‘dark social judo’.

What is Gmail Label Feeder?

It’s a little Google Apps Script app that you can setup to select one of your Gmail labels, preview content (with the option to remove all the links you’d like not to publish, like unsubscription links) and publish a public feed of the result. This video (embedded below) gives an overview of the problem and how the ‘Gmail Label Feeder’ app works:


How can I setup Gmail Label Feeder and how does it work?

Whilst logged in to your Google account open this script and then follow the instructions below:

  1. In the Script Editor File > Make a copy (this makes your own personal copy of the script which lives in Google Drive. With this copy you can control permissions etc and means I except no liability if it goes wrong, breaks, doesn’t work)
  2. From the Script Editor Run > setup
    IMPORTANT: when authenticating use the Gmail account you want to create feeds for
  3. Still in the Script Editor File > Manage versions… and create an initial name and ‘Save New Version’
  4. Then, Publish > Deploy as web app… and select ‘Execute the app as: me’ and allow anyone access even anonymously
  5. Finally open (and bookmark) the ‘current web app url’ to create (atom) feeds for your gmail labels

The process covering the creation of individual feeds is covered in the video above. The video below shows the five steps for the initial setup:

How it works

Having already used Eric Koleda’s Feed+ script when creating a similar app to turn Scoop.it searches into an RSS feed it wasn’t too much effort to change the source data to a Gmail account. It’s not the first time I’ve hacked the functionality of Gmail having already used my inbox to re-enable Google Reader social share features, so I already knew/had code to GmailApp.search using the ‘label:’ search operator. One thing I struggled with was removing selected links from emails. The UI side was straight forward thanks to the fantastic framework already developed by Eric. Initially I tried Removing html tags and content where tag content matches an array of values using Xml.parse() but as you see from the answer and comments from Jonathan Broughton and Bruce Mcpherson (thanks guys!) I used regex instead (only after getting caught out by line breaks and tabs).

Dark social judo: Pulling the email push

I was first made aware of ‘dark social’ via Alan Cann (who is ironically quoted in the THE today for his work around social media) and in particular his move to email as the main communication tool for students.

Alexis Madrigal at The Atlantic — who writes about the influence of what he calls “dark social” on engagement and traffic patterns. While everyone is busy watching Twitter and Facebook because they are easy to track, Madrigal argues that most social traffic still comes from old-fashioned or difficult-to-track sources like email and chat messages – (From Dark social: Why measuring user engagement is even harder than you think)

The use of email is something we’ve paid particular attention to in the open online course ocTEL, developing a daily newsletter to push a automated summary of course activity (talking about automated summaries read Tony Hirst’s Notes on Narrative Science and Automated Insights). There’s nothing particularly new in this but one whole I wanted to plug was getting an archive of these back into the ocTEL Course Reader (an RSS aggregation of all available course activity), hence the Gmail Label Feeder. Whilst doing this and picking over my own inbox for example emails I started to think about the ‘push’ I got from social sites like Twitter, Google and LinkedIn keeping me aware of activity and making suggestions for people and content I might like. Whilst some of these are very basic action reporting others require a degree of processing to generate.

Social push

The particular scenario I had in mind was if for example you were running a course Twitter account, when you get the inevitable ‘suggestions similar to’ or ‘do you know’ would there be some value judo style ’pull when your opponent pushes’  and publishing this content so it’s available to all your students. Even if this is of little individual value to the student wouldn’t aggregating this data in a machine readable way be useful down the line. Have a look at the menu of push options, wouldn’t at least some of these be useful to your students. Should we be doing a bit more dark social judo?

Twitter dark social menu

Playing with Google Docs sidebar using Google Apps Script

The Google Apps Script team have recently announced a host of new features. The three that caught my eye were:

  • Script editor added to Google Docs and Forms
  • Addition of the Forms Service which lets you programmatically manipulate forms
  • Extending Google Docs functionality using Custom menus and user interfaces including creating custom sidebars

The last one in particular looked interesting. Having a scriptable area to supplement the main control area immediately made me think about resurrecting tools like the citation robot ‘Igor’ or supplement Google Spreadsheets on the fly graphs or extra info from 3rd party sites.

As Tom Smith (University of York) has discovered sidebar integration in Google Spreadsheets isn’t available yet, but the word from Google I/O session announcing this feature (video not available yet) is it’ll be here in a couple of weeks (see comments thread here).

Word Navigation PaneSo, like Tom, to kick the tyres on the Google Docs sidebar functionality I set myself a small project. One of the features of MS Word I like is the ‘Navigation Pane’, in particular for jumping around a document using section headings. Given this operates from a sidebar it seems an ideal candidate to try and replicate.

Looking at the Google Apps Script documentation we can see that we can getLinkUrl() from a TableOfContents within a Google Doc. Using an example from stackoverflow it’s easy to extract the link urls using:

 var tocDat = {};
  var doc = DocumentApp.getActiveDocument(); //get active document
  for (var i = 0; i < doc.getNumChildren(); i++) { // loop all the document elements
    var p = doc.getChild(i);
    if (p.getType() == DocumentApp.ElementType.TABLE_OF_CONTENTS) { // if the element type is a TABLE_OF_CONTENTS extract item links
      var toc = p.asTableOfContents();
      for (var ti = 0; ti < toc.getNumChildren(); ti++) { // looping over each ToC item
        var itemToc = toc.getChild(ti).asParagraph().getChild(0).asText();
        var itemText = itemToc.getText();
        var itemUrl = itemToc.getLinkUrl();
        tocDat[itemText] = itemUrl; // create object array
      }
    }
  }

It’s worth noting that to get this requires the user to have already inserted a table of contents into the document. There is an open issue ticket to do this using script. Something else I was unable to do was return what level the heading link was for (e.g. Heading 1, Heading 2 etc). To do this I had to loop arose the entire document, which you can see in the final project code.

Here is a copy of the example document with the code included. Because no need edit rights to run custom menus you’ll need to File > Make a copy to get the ‘Custom’ dropdown menu option.

custom menu

The first time you select Custom > Show Document Map you get a big scary authentication window (another one of the new features announced was a pilot of a new authentication flow). Once you’ve ‘Ok’ you can run Custom > Show Document Map which launches the sidebar:

doc map

At this point you are probably asking where are the links in the document map. For some reason the caja sanitisation is stripping the anchor link. Regardless of this, if you dig around the page source you’ll see as part of the sanitisation links target _blank which will open a new browser tab.

href target blank

In the sidebar documentation it says that communication with other Apps Script services is possible, which might be a way to hook the navigation functionality in, but as I can’t find any methods to change document position it looks like for now it’s a lost cause.

So while I’ve hit a dead-end having the sidebar, particularly when it reaches Google Sheets, is a big bonus but as always it’s important to be aware of the limitations. I’m looking forward to what others come up with.

PS Must try the programmable forms next (it might be an opportunity to update EventManager v3)

Tapping the rhizomes of dotlife: Creating an RSS search feed for Scoop.it posts

This was a useful post to write and reflect on some ideas. If you are here because you want to make RSS feeds for Scoop.it posts searches here’s the Feed+ Machine Chrome Web App and the standalone Feed+ Machine App

Recently when reflecting on my career path two key moments came to mind. First around 1999 having graduated as a structural engineer I got a job converting course content marked up in Word documents into HTML. Why this was important was I was forced to understand the raw building blocks of the web, I was in a foreign world and I needed to learn the language and fast.

Luke, view the source

A decade later and my interests shifted, less about static content and more dynamic data. Tapping into the work, primarily of Tony Hirst, my eyes were opened to the opportunity of remixing the web. Taking existing data, a dash of code/3rd party services and creating new things and ideas. In this world data feeds are the new building blocks to play with. In particular RSS and Atom feeds are the prize. Usually openly available and easy to manipulate/render.

Luke, find the feed

Unfortunately this ease of access appears to be putting off the big 3rd party services. With the latest Twitter API atom isn’t an option, with Google+ RSS feeds should be forgot about. This is a great shame a number of open online courses recognise the value of learners finding their own space, feeding their activity into tools which can aggregate, potentially sense make, and feed the rhizomes of the networked learner.

Like other open courses this is the challenge we face in ocTEL, pulling distributed activity into the machine. For some services RSS is still championed.  Diigo where the RSS badge with pride..

RSS from Diigo

JISCMail go even further also flying the auto-discovery flag (yet another front being eroded) …

RSS from JISCMail

What about Delicious …

Where's the RSS from Delicious

Mendeley … exists but you have to go digging

Mendeley have some, but not telling you

Google+ … this is where it gets interesting

[trumpets] Feed+ by Eric Koleda

Eric Koleda has created a Chrome Web App (Feed+) that lets you create RSS(Atom) feeds for Google+ searches. Using Google Apps Script Eric has created a user interface and it does all the negotiation with the Google+ API converting the data into RSS. Given the availability of a Scoopit API and using Eric’s base code I’ve forked the project to create the Feed+ Machine Chrome Web App and the standalone Feed+ Machine App (Google login is required to manage your feeds). To begin with it only supports creating feeds from Scoop.it! post searches, but given Eric has done such a great job on the design it’ll be easy to extend. I’ll spare you the details of the code but the source files are here if you want to unpick what’s going on and ask questions.

Here’s an example of a feed for Scoop.its referencing ocTEL, which we are already digesting in the ocTEL Course Reader. But why do this? My interest is primarily in capturing as much of the digital landscape, each artefact collected adding more detail to the map. Whilst material pulled from Scoop.it is potentially yet another churn of existing content, who is sharing and any insight they add provides additional context and may be another way in which participants can identify and migrate to new clusters of activity.

There is a downside to this to this approach. Having a search feed from Scoop.it is useful but having wrestled with the ScoopIt API there is much more data available not being captured such as metadata around posts and topics. So perhaps like others I should abandon the feed and favour JSON and start filling the Tin Can

PS NEW!!! CETIS briefing on Activity Data and Paradata for more developments in this area.

Guest post: How Apps Script Makes Classroom Observation Quicker and Easier

Just a quick note to highlight a guest post I’ve written for the Google Apps Developer Blog on How Apps Script Makes Classroom Observation Quicker and Easier. In this Google Apps Script example I show how a couple of lines of code can do some custom Google Form handling. In this case the scenario was:

Justin Marckel, the assistant principal at Cornatzer Elementary School in North Carolina, asked for help in modifying one of my existing Apps Script examples. Justin was recording teachers’ classroom activities using a Google Form, then manually copying and pasting data into separate spreadsheets for each teacher to review. Justin wanted to know whether there was a way for a Google Form to store the results in a master spreadsheet, then filter results to each teacher’s spreadsheet.

You can rad the full post here

PS Will Welch spotted that there will be new Google Apps Script/Google Forms functionality announced at Google I/O.

#ocTEL Using Google Spreadsheets for a basic analytic to find your fledgling bloggers

One of the nice things about open courses like ocTEL is whilst having your own blogging space wasn’t a mandatory requirement we were aware that a number of participants had setup one up anyway. Conscious that activities like blogging can be very lonely and also aware that this is often a critical moment in motivating people to engage in reflective writing we’ve been encouraging tutors to show these blogs some extra love, which seems to have gone down well.

This raises the question that given blogging is optional and we haven’t asked people to self-declare if they are new to blogging how do we identify this community? With this question in mind I quickly (initial pass took me about 5 minutes thanks in part to earlier work in this area) put together a spreadsheet which took the participant blog list and got feed counts.

post table

How it was made

If you look at Sheet 1 you’ll see there is a static list of Feed urls (Column B). This was obtained by using the Chrome Screen Scrape extension. The reason it is static is I wanted a column where tutors could make comments and a dynamic import would screw this up. Looking at Sheet 2 cell A1 you can see how to do a dynamic list using the importXML function (see this post for more examples of spreadsheet import options).

To get the post counts and dates some Google Apps Script was required to write a custom function (custom function, you ask? Where have you been, read my blog). After that it was a bit of conditional formatting.

What’s missing

It would be useful to have a dynamic list of feeds which could retain comment cells (ideas on how to do this very welcome. My thought was to do a cron job to read the sheet, fetch more feeds then output the result)

Another metric it would be useful to add a comment count. So 1 post 3 comments would be identified as a lower priority than 1 post 0 comments.

So do you think this spreadsheet is potentially a useful tool in the open course toolbox? How would you improve it?

Geek Out Webinar: Using Google App Scripts and Spreadsheets to Analyze Canvas Usage Data

Just a note to say on 26th April at 4pm BST (in your timezone) I’ll be giving a webinar on using Google Apps Script to extract data from Canvas (LMS by Instructure). Even if you’re not a Canvas user hopefully there will be some general techniques you’ll find useful. It might also be useful for people thinking about applying basic learning analytic techniques to your courses. I’ll update this post with any resources from the session.

Martin Hawksey of Jisc CETIS started playing around with discussion usage data in Canvas when he was participating in the Learning Analytics course in Canvas Network. Using Google’s cloud scripting service Google Apps Script,  free for Google Apps and Google account users, he could dig deep into the data and start doing some interesting (and valuable) analysis all from the comfort and familiarity of a spreadsheet.

Join us for this free 50-minute technical webinar on Friday, April 26th as Martin details his experience using the Canvas APIs and walks through building scripts to get meaningful Canvas data using Google Apps Script and the Canvas APIs.

Registration is free and you can book here.

Here’s a recording of the session and the bundle of links (including slides)

Sankey your Google Spreadsheet Data #d3js

Sankey DiagramSankey diagrams are a specific type of flow diagram, in which the width of the arrows is shown proportionally to the flow quantity.” Wikipedia.

I first came across Sankey diagrams by (the OKFN’s latest School of Data’s contributors ;) Tony Hirst in d3.js Powered Sankey Diagram. Subsequently Bruce McPherson showed how to create Sankey diagrams from Excel also using d3.js.

Having collect some survey data for Analytics and Institutional Capabilities at #cetis13 (here’s a copy of the form) we were looking for a way to take us beyond the Google Form reporting and gaining extra insight. In particular I was interested in trying to see if there were any relationships between the multiple choice questions. Using a Sankey diagram seemed like a possible solution and my colleague David Sherlock quickly came up with a modification of Mike Bostock’s Sankey example to accept a csv input (I’ll link if/when it gets written up).

Seeing this I thought it might be useful to make a wrapper to generate Sankey diagrams for data stored in a Google Sheet. The solution was relatively straight forward, using the Google Visualisation API to get the data in the right shape for d3js. An example of the prototype is here

There’s no interface yet for you to select a spreadsheet, sheets, columns etc but you can take you Google Spreadsheet ‘publish to the web’ and then add the following data to the url.

One of the issues with this solution is you might not want to make all your data available. To get around this I’ve written a Google Apps Script that lets you use a custom formula to preformat the data. To see this in action this Spreadsheet contains an example. The formula is in cell A1 and uses the format =setSankey(datarange, cols , separator)

  • datarange – sheet/cell reference for source data eg ‘Form Responses – Edit’!A2:D Note must start with column A
  • cols – comma separated list of columns to use to generate a chart for eg “B,C,D”
  • separator {optional} – used to split multi value cells defaults to “, ” eg ” | “

To use this in your own spreadsheets open Tools > Script editor and copy the code from this gist. Here’s an example url using pre-processed data. My main difference is the addition of the &output=1 to the querystring.

Obviously creating your own querystrings to render the data isn’t ideal and it would be relatively straight forward to create a UI wrapper similar to the one used in EDGESExplorer, but it’s something I reluctant to do unless there is enough demand. The other consideration is the question – does the sankey diagram provide useful insight for the type of data or is it just more ‘damn lies’.

It would have of course been nice to write a Google Gadget to include this in a Spreadsheet … but Google are discontinuing those :(.

#LAK13: Recipes in capturing and analyzing data – Using SNA on Canvas Discussions with NodeXL (for when it’s not a SNAPP)

In my last post on Canvas Network Discussion Activity Data I mentioned I was a little disappointed to not be able to use social network analysis (SNA) modelling techniques on the Canvas Network discussion boards. My main barrier was accessing the data via the Canvas API using my preferred toolset. Fortunately Brian Whitmer at instructure.com (the company behind Canvas) posted a comment highlighting that as a student on the course it was easy for me to get access to this data using a token generated on my Canvas profile page. With this new information in this post I’ll cover three main areas:

  • a very quick introduction into techniques/opportunities for analysing threaded networks using SNA;
  • how I retrieved data from the Canvas platform for the #LAK13 discussions; and finally
  • some analysis using the NodeXL add-on for Microsoft Excel (Windows).

On Friday 1st March at 4pm GMT I’ll also be having a live Hangout on Air with Marc Smith, one of the original creators and continued project coordinator of NodeXL. The live feed will be embedded below embedded here and you can ask question via Twitter or Google+ using the combined hashtags #lak13 and #nodexl  e.g. ‘What the best place to find out more about NodeXL? #lak13 #nodexl’. For the hangout session we’ll look at how easy it is to use NodeXL to analyse a Twitter hashtag community in a couple of clicks. [The rest of this post is introducing a more advanced use of NodeXL so if I lose you in the rest of this post fear not as the session will be a lot easier going]

Opportunities for analysing threaded networks using SNA

Hello graph

WiredUK friend/follower graphOne of the dangers of SNA is people see images like the one to the right and miss the point of using this modelling technique (the image is from one of my first goes at using NodeXL so I’d be the first to admit there’s room for improvement). If you do there are couple of things to bear in mind. Often these images are exhaust, generated as a snapshot of a visual and/or exploratory analytic. A certain level of literacy is required to understand the underlying structure of the graph. Taking this last point a threshold concept for me was understanding that a basic concept of these graphs are more often than not constructed from an edge list which is essentially two columns of data representing and start point and end point to a node. For example, all the names in columns Vertex 1 and Vertex 2 are nodes and each row represents an edge between the nodes so in this case Beth –> Adam generates the highlighted edge.

Basic edge list and force layout

There is a lot more to understand about these types of graphs, but this basic concept means I know if I have any relationship data its easy to graph and explore.

Hello SNAPP

At this point it’s worth mentioning the browser plugin SNAPP.

The Social Networks Adapting Pedagogical Practice (SNAPP) tool performs real-time social network analysis and visualization of discussion forum activity within popular commercial and open source Learning Management Systems (LMS). SNAPP essentially serves as a diagnostic instrument, allowing teaching staff to evaluate student behavioural patterns against learning activity design objectives and intervene as required a timely manner.

Valuable interaction data is stored within a discussion forum but from the default threaded display of messages it is difficult to determine the level and direction of activity between participants. SNAPP infers relationship ties from the post-reply data and renders a social network diagram below the forum thread. The social network visualization can be filtered based upon user activity and social network data can be exported for further analysis in NetDraw. SNAPP integrates seamlessly with a variety of Learning Management Systems (Blackboard, Moodle and Desire2Learn) and must be triggered while a forum thread is displayed in a Web browser.

The social network diagrams can be used to identify:

  1. isolated students
  2. facilitator-centric network patterns where a tutor or academic is central to the network with little interaction occurring between student participants
  3. group malfunction
  4. users that bridge smaller clustered networks and serve as information brokers

The paper referencing SNA research supporting these areas was presented at LAK11 (if you don’t have access also available in the presentation’s slidedeck).  The paper Visualizing Threaded Conversation Networks: Mining Message Boards and Email Lists for Actionable Insights (Hansen, Shneiderman & Smith, 2010) also highlights simple ways to identify question people, answer people and discussion starters which are all potentially very useful within courses for identifying network clusters individuals might want to join/follow.

Retrieving data from Canvas

Hopefully with that quick intro you can see there might be some value in using SNA from threaded discussion analysis. Reading the SNAPP overview hopefully you spotted that it currently doesn’t support extracting data from Canvas discussion boards. This is an opportunity to understand some of the analysis SNAPP is doing behind the scenes.

Hello Google Apps Script

If you have been following my posts you’ll see that I favour using Google Apps Script as a lightweight tool for extracting data. Thanks to Brian (Instructure) I’ve got a way to access the Discussion Topics API. Looking at the API documents I decided the best way to proceed was to get all of the LAK13 discussion topics (top level information) and use this to get the full topic data. If you speak JSON we are essentially turning this:

Data as rendered in Canvas

into this:

Data in JSON

finally getting this (web version here):

Data in spreadsheet

The code to do this is available here. I’m going to spare you the details of the code but here are the instructions is you’d like to export data from other Canvas hosted discussion boards. If you’re not interested in that you can just jump to the next section.

Generating an edge list (extracting data) from Canvas to Google Sheets

  1. Create a new Google Spreadsheet and then in Tool > Script editor copy in the code from here
  2. If you are not pulling data from LAK13 you need to edit values in lines 2-4. If you visit your course homepage hopefully you can decode the url pattern based on the example for LAK13 https://learn.canvas.net/courses/33 (I should also point out you need to be enrolled on the course to receive data. Also read Canvas API Policy)
  3. Next you need an access token which is generated from your Canvas Profile Settings page. Scroll down to the bottom and click New Access Token, filling in a purpose and leaving expires blank. Make a copy of the token as it’s needed for the next step (I added a copy to a .txt file just in case the next step didn’t work
  4. Back in the Script Editor in Google Spreadsheets click File > Project Properties. In the ‘Project properties’ tab click  ‘+ Add row’ and replace (name) with access_token and (value) with the token you got from Canvas before clicking Save
    Project Properties
  5. Make sure everything is saved in the Script Editor and then Run > getCanvasDiscussionEdges, wait for the script to finish and on Sheet1 you should have a bunch of data to play with.

Using NodeXL to analyse Canvas Discussions

There are a number of different questions we could ask of the LAK13 data. The particular one I want to look at is who are the core community members stimulating/facilitating discussion (e.g. applying a connectivist theory who are the people you might want to connect with). To do this we need to (I’m assuming you’ve already installed NodeXL):

  1. Download the data extracted to the Google Spreadsheet (File > Download as > Microsoft Excel). [If you just want the data I’ve extracted here’s the download link – the data is automatically refreshed nightly]
  2. Open the download file in Excel and in the created_at column select all and Format Cells as General (I needed to do this because NodeXL was miss formating dates on import)
  3. Start a new NodeXL Template (I use the NodeXL Excel Template option from my windows Start menu)
  4. From the NodeXL ribbon you want to Import > From Open Workbook
    Import > From Open Workbook
  5. In the import dialog vert1_name and vert2_name are edges, anything else prefixed with ‘vert’ is assigned to the corresponding Vertex n property column and everything else is an  Vertex 1 property:
    import dialog
  6. Once imported you can open the Edges sheet, select the created_at column and Format Cells reassigns a date/time format.
  7. In Prepare Data chose ‘Count and merge duplicate edges’ and select Count and Vertex1 and Vertex 2
    Count and merge duplicate edges
  8. In the Graph section of the NodeXL ribbon we want to make this a directed graph (replies are directed) and choose you layout algorithm (I usually go Harel-Koren Fast Multiscale)
    make this a directed graph
  9. Next we want to prepare the data we want to analyse. In the Autofill Columns (within Visual Properties portion of the ribbon) set Edge Visibility to ‘topic_id’ and in Edge Visibility Options set ‘If the source column number is: Not equal to 558’ Show otherwise Skip (this will skip edges that are responses to the Pre-course discussion forum – I’ll let you question this decision in the comments/forum)
    Edge Visibility Options
  10. Click Ok then Autofill
  11. Next open the Vertices sheet and select all the rows (Ctrl+A) and from the Visibility option select ‘Show if in an Edge’
    Show if in an Edge
  12. Now find the row in the Vertices sheet for George Siemens and Skip (doing this were creating a ‘what if George wasn’t there’ scenario
    ‘what if George wasn’t there’ scenario
  13. Open the Graph Metrics window and add Vertex in-degree, vertex out-degree, Vertex betweenness and closeness centrality and Top items (in the Top items options you’ll need to add these as the metrics you want top 10s for), finally click Calculate metrics.
    Graph Metrics window

At this point you could use the calculated metrics to weight nodes in a graph, but for now I’m going to skip that. You should now have a Top Items sheet with some useful information. In the Betweenness Centrality list you should have these names:

  • Martin Hawksey
  • Simon Knight
  • Alex Perrier
  • Khaldoon Dhou
  • Rosa Estriégana Valdehita
  • Maha Al-Freih
  • Suzanne Shaffer
  • Maxim Skryabin
  • Bryan Braul
  • Peter Robertso

Excluding the pre-course discussion forum and George Siemens the discussions these people have engaged with provide the shortest paths to other people engaging in discussions on the Canvas site. Strategically these are potentially useful people  within the network that you might want to follow, question or engage with.

Getting to this point obliviously hasn’t been straight forward and had SNAPP been available in this instance it would have turned this in to a far shorter post. Programmatically using tools like R we could have arrived at the same answer with a couple of lines of code (that might be my challenge for next week ;). What it has hopefully illustrated is if you have data in an edge format (two column relationships) tools like NodeXL make it possible for you use SNA modelling techniques to gain insight. (I’m sure it also illustrates that data wrangling isn’t always straight forward, but guess what that’s life).

NodeGL for LAK13 discussionsSo at this point some of you might be wondering what does the LAK13 reply network look like. I could give you a flat image but why don’t you use my NodeXL online graph viewer to explore it yourself or download the data from the NodeXL Graph Gallery.

#LAK13: Recipes in capturing and analyzing data – Canvas Network Discussion Activity Data

In my last post I looked at the data available around a course hashtag from Twitter. For this next post I want to start looking at what’s available around the Canvas Network platform which is being used to host Learning Analytics and Knowledge (LAK13). Sizing up what was available I did come across the Canvas LMS API documentation, which provides a similar method of accessing data as the Twitter API. I wasn’t sure if this extended to Canvas Network but because the authentication method it uses (oAuth2) isn’t possible using my dev tools of choice (mainly Google Apps Script) I looked for something else.

Whilst browsing on the discussion page for the course I noticed that my browser was auto-detecting a feed:

image

an excerpt of this feed is below:

https://learn.canvas.net/courses/33/discussion_topicsLearning Analytics and Knowledge Discussion Feed
  2013-02-20T12:39:02+00:00
  	Discussion: Week 8 Discussion Forum
    tag:canvas.instructure.com,2013-02-01:/discussion_topics/discussion_topic_580
    2013-02-01T21:15:20+00:00
    2013-02-01T21:15:02+00:00
    	George SiemensDiscussion: Week 3 Discussion Forum
    tag:canvas.instructure.com,2013-02-01:/discussion_topics/discussion_topic_575
    2013-02-01T21:15:56+00:00
    2013-02-01T21:13:24+00:00
    	

      George Siemens

Looking at the raw feed I could see it wasn’t limited (often feeds only contain the last 10 entries) and contained the entire content of messages.

Update: Below I use Google Apps Script to extract the data. Since then I’ve created a ‘no code’ solution that only uses existing Spreadsheet formula. There is an accompanying presentation (the webinar should eventually appear here)

Looking for an easy way to consume this I first turned to the importFeed formula in Google Spreadsheet’s but unfortunately it only returned the last 20 results. A trick I’ve used in the past is to put feeds through Yahoo Pipes to get a JSON/CSV to work with, but as working with dates this way isn’t straight forward I opted for some Google Apps Script which would create a custom formula to fetch the feed from Canvas Network and enter the results into a sheet.  The 12 lines of code for the main part of this are below:

function getCanvasDiscussions(url) {
  var response = UrlFetchApp.fetch(url);
  var contentHeader = response.getHeaders();
  if (response.getResponseCode() == 200) {
    var d = Xml.parse(response.getContentText()).feed.entry;
    var output = [['published','updated','title','author','link','id','content']];
    for (i in d){
      output.push([getDateFromIso(d[i].published.Text),getDateFromIso(d[i].updated.Text),d[i].title.Text,d[i].author.name.Text,d[i].link.href,d[i].id.Text,d[i].content.Text]);
    }
    return output;
  }
}

The getDateFromIso is a subfunction I use quite often and is available in this stackoverflow answer. Adding the above code to a Google Sheet (via Tools > Script editor..) allows me to use a custom formula to fetch the data.

Below is a quick look at the data returned (here it is published in a table). In the columns we have publish dates, title, author, link, uri and post content. As it goes this isn’t too bad. The big thing that is missing is whilst we can see which topic the message is in the reply threading is lost.

Canvas Network Discussion Data

Even with this like the #lak13 Twitter dashboard from last week I can quickly add some formulas to process the data and get an overview of what is going on (for the live view visit this spreadsheet – File > Make a copy is you want to edit).

CanvasNetworkDashboard

This obviously isn’t a complicated analytic and it wouldn’t surprise me if the course tutors didn’t have something similar on the backend of Canvas Network. As a student it’s useful for me to see how I’m doing compared to others on the course and get a sense of who else is contributing. [Adam Cooper has a great post on How to do Analytics Right... with some tips he picked up for John Campbell who is behind Purdue’s Signals Project which fits in nicely here.]

Summary

So with a bit of ken and a couple lines of code I can see how the #lak13 discussions are going. Again I’ve avoided any deep analytics such as analysing what has been said, to who, at what time, but hopefully now that I’ve highlighted and freed the data you can do something else with it. Not being able to extract the conversation thread is a little disappointing as it would have been nice to fire up SNAPP or NodeXL, but I’ll have to save those for another day ;)

Twitter Archiving Google Spreadsheet TAGS v5

For a couple of years now to support my research in Twitter community analysis/visualisation I’ve been developing my Twitter Archiving Google Spreadsheet (TAGS). To allow other to explore the possibilities of data generated by Twitter I’ve released copies of this template to the community.

In September 2012 Twitter announced the release of a new version of their API (the spreadsheet uses this to request data from Twitter). Around the same time Twitter also announced that the old version of their API would be switched off in March 2013. This has required some modification of TAGS to work with the new API. The biggest change for TAGS is that all requests now need authenticated access.

So here it is:

*** Twitter Archive Google Spreadsheet – TAGS v5.0 ***
[If the first link doesn't work try Opening this Spreadsheet and File > Make a copy]

Instructions for setting up TAGSv5

Instructions are included on the Readme/Settings sheet of the template. If you are having problems it’s worth checking written by Stacy Blasiola (@Blasiola) or this modified version by Karen Smith & Shanifa Nasser made for Open Data Day Toronto available as CC-BY-SA.

What will happen to my existing TAGS sheets that aren’t version 5.0?

When Twitter turn off the old API (test outages this March) all authenticated and unauthenticated search requests will stop working.

How do I upgrade existing versions of TAGS spreadsheets (v3.x to v4.0) to keep collecting beyond March 2013?

As I can’t push an update to existing copies of TAGS you’ll have to manually update by opening your spreadsheet, then opening Tools > Script editor… and replacing the section of code that starts function getTweets() { and finishes 134 lines later (possiblly with the line function twDate(aDate){ ) with the code here. [And yes I know that’s a pain in the ass but best I could do] … or you can just start a new archive using TAGSv5.0

More additional tips and info when I get a chance

About

This blog is authored by Martin Hawksey Google+

JISC CETIS Learning Technology Advisor (OER Programme Support)
jisc cetis logo

The MASHezine (tabloid)

It's back! A tabloid edition of the latest posts in PDF format (complete with QR Codes). Click here to view the MASHezine

Preview powered by:
Bluga.net Webthumb

The MASHebook

You can also download this post as:

Subscribe to monthly email digest of posts

Loading...Loading...


Subscribe to per post email updates

Enter your email address:

Delivered by FeedBurner

Copyright License

Creative Commons Licence
This work is licensed under a Creative Commons Attribution 3.0 Unported License. CC-BY mhawksey

Privacy /Cookies

This blog uses Google Analytics (which makes use of 'cookie' technologies) to provide information on usage. Here's an overview of Google Analytics Privacy and how to opt-out (other 3rd party services like Twitter might also be tracking you via this site, but as far as possible I try and prevent this by removing official tweet buttons).

Badges

. . .