First look at analysing threaded Twitter discussions from large archives using NodeXL #moocmooc

This post is a bit messy. I got caught trying out too many ideas at once, but hopefully you’ll still find it useful

Sheila recently posted Analytics and #moocmooc in which she collects some thoughts on the role of analytics in courses and how some of the templates I’ve developed can give you an overview of what is going on.  As I commented in the post I still think there is more work to make archives from event hashtags more useful even if just surfacing tweets that got most ‘reaction’.

There are three main reactions that are relatively easy to extract from twitter: retweets, favouring and replies. There are issues with what these actions actually indicate as well as the reliability of the data. For example users will use ‘favouring’ in different ways, and not everyone uses a twitter client that can or uses a reply tweet (if you start a message @reply without clicking a reply button Twitter looses the thread).

But lets ignore these issues for now and start with the hypothesis that a reaction to a tweet is worth further study. Lets also, for now, narrow down on threaded discussions. How might we do this? As mentioned in Sheila’s post we’ve been archiving #moocmooc tweets using Twitter Archiving Google Spreadsheet TAGS v3. As well as the tweet text other metadata is recorded including a tweet unique identifier and, where available the id of the tweet it is replying to.

Google Spreadsheet columns

We could just filter the spreadsheet for rows with reply ids but lets take a visual approach. Downloading the data as a Excel file we can open it using the free add-in NodeXL.

NodeXL allows us to graph connections, in this case conversation threads. NodeXL allows use to do other useful things like group conversations together to make further analysis easier. Skipping over the detail here’s what you get if you condense 6,500 #moocmooc tweets into grouped conversations.

 moocmooc grouped converstations

This is more than just a pretty picture. In NodeXL I’ve configured it so that when I hover over each dot which represents and individual tweet I get a summary of what was said by who and when (shown below).

NodeXL being used to examine nodes

It’s probably not too surprising to see strings of conversations, but by graphing what was an archive of over 6500 tweets we can start focusing on what might be interesting subsets and conversation shapes. There are some interesting patterns that emerge:

conversation group 1 conversation group 2conversation group 3

Within NodeXL I can extract these for further analysis. So the middle image can be viewed as:

Examination of conversation group 2

There’s a lot more you can do with this type of data, start looking at how many people are involved in conversations, number of questions per conversations and lots more. I should also say before I forget that NodeXL can be configured to collect twitter search results with it’s built-in twitter search tool. It can also be configured to do the collection on a regular basis (hmm I should really have a go at doing that myself). So potentially you’ve got a nice little tool to analysis twitter conversations in real-time …

If you’d like to explore the data more it’s available from the NodeXLGraphGallery. I’m going off to play some more ;)

Last updated by at .

16 Responses to “First look at analysing threaded Twitter discussions from large archives using NodeXL #moocmooc”


Comments are currently closed.

About

This blog is authored by Martin Hawksey Google+

JISC CETIS Learning Technology Advisor (OER Programme Support)
jisc cetis logo

The MASHezine (tabloid)

It's back! A tabloid edition of the latest posts in PDF format (complete with QR Codes). Click here to view the MASHezine

Preview powered by:
Bluga.net Webthumb

The MASHebook

You can also download this post as:

Subscribe to monthly email digest of posts

Loading...Loading...


Subscribe to per post email updates

Enter your email address:

Delivered by FeedBurner

Copyright License

Creative Commons Licence
This work is licensed under a Creative Commons Attribution 3.0 Unported License. CC-BY mhawksey

Privacy /Cookies

This blog uses Google Analytics (which makes use of 'cookie' technologies) to provide information on usage. Here's an overview of Google Analytics Privacy and how to opt-out (other 3rd party services like Twitter might also be tracking you via this site, but as far as possible I try and prevent this by removing official tweet buttons).

Badges

. . .