I recently had a chance to spend some time with Marc Smith, co-founder of the Social Media Research Foundation, which is behind NodeXL, the networks add-in for Microsoft Excel. I’ve done a couple of blog posts with NodeXL now, and after a prompt from Marc I thought it was worth a revisit. So in this post I’ll highlight some of the new features of NodeXL's Twitter Search tools that make it a useful tool for community/resource detection and analysis.
Importing a Twitter Search – expanding all the urls
Before going too far I should also point out there has been a separate social network importer for Facebook Fan pages for a while now. On the new Twitter Search import there is now an option to ‘Expand URLs in tweets’. This is useful because Twitter now wraps all links in its own shortening service, t.co. The shortened urls are also unique for each tweet* even if the target url is the same. Expanding them lets you see what people are linking to (it makes it easier to see if people are sharing the same resources or resources from the same domain). And as you’ll see later, it makes the data easier to use in mashups.
*except new style RTs which use the same t.co link
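To illustrate why expanded urls are easier to work with, here is a minimal Python sketch that counts how often each website domain appears in a list of already-expanded links. It is not how NodeXL does it; the function name `domain_counts` and the sample list are my own assumptions, and apart from the Creative Commons chooser address the urls are made up.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(expanded_urls):
    """Count how often each website domain appears in a list of expanded urls."""
    counts = Counter()
    for url in expanded_urls:
        host = urlparse(url).netloc.lower()
        # Treat www.example.org and example.org as the same site.
        if host.startswith("www."):
            host = host[4:]
        if host:
            counts[host] += 1
    return counts


# Hypothetical sample data standing in for an 'expanded URLs' column.
sample_urls = [
    "http://creativecommons.org/choose/",
    "http://www.creativecommons.org/licenses/",
    "http://example.org/some-page",
]

# Domains ordered from most to least shared.
top_domains = domain_counts(sample_urls).most_common()
```

With t.co links this kind of grouping would not work, because every tweet carries a different short url even when the target is the same.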
Did you know you can use urls and website domains in your search? This is a trick I’ve been using for a long time and I’m not sure how widely known it is. For example, here is everyone who has been sharing the new Creative Commons License chooser at http://creativecommons.org/choose/ or just everyone sharing a link to anything on the Creative Commons website domain. In Tweetdeck I use a search column with ‘hawksey.info OR cetis.ac.uk’ to pick up any chatter around these sites.
Replicable research and the (almost) one button expert
NodeXL has been a great tool for me to start learning about network analysis, but as I play with various settings I’m conscious that I’m missing some of the basic tricks to get the data into a meaningful shape. For a while now people have been able to upload and share their network analysis in the NodeXLGraphGallery. This includes downloading the NodeXL data as an Excel Workbook or GraphML (this is a nice way to allow replicable research).
An even newer feature is the ability to download the NodeXL Options the graph author used. This means a relative amateur like myself, with no sociological background and, unlike Marc, unaware of what the increasing popularity of zombie films might be saying about our society (although they can be used to explain betweenness centrality), can tap into the author's expertise and format a graph in a meaningful way with a couple of clicks. There’s still the danger that you don’t understand the graph, but it can still be a useful jumpstart.
Twitter Search Top Items Summary
The next new thing is a Twitter Search Network Top Items page. I did a search for ‘#oer OR #ukoer’ to pull the last 7 days’ tweets. By importing the options from this NodeXL Graph Gallery example and running ‘Automate’ you can reuse my settings on your own search result. By running Graph Metrics > Twitter search network top items (part of my Automate options) I get this sheet, which I’ve uploaded to Google Spreadsheet.
This sheet lets you quickly see, at both the overall and the group level:
- Top Replied-To in Entire Graph
- Top Mentioned in Entire Graph
- Top URLs in Tweet in Entire Graph
- Top Hashtags in Tweet in Entire Graph
- Top Tweeters in Entire Graph
These are useful summaries for seeing who is most active in the community, which urls are being shared most and where tag communities overlap. I admit that it can look like a scary dashboard of stuff which not all of you will like, but NodeXL is a network graphing tool so it’s easy to visually explore the data.
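If you want a feel for what sits behind a ‘top items’ summary, the sketch below tallies hashtags, mentions and tweeters from a handful of tweets in Python. This is only an approximation under my own assumptions: the `top_items` function, the `user`/`text` field names and the sample tweets are all hypothetical, and NodeXL's own calculation may differ.

```python
import re
from collections import Counter

# Simple patterns for #hashtags and @mentions in tweet text.
HASHTAG = re.compile(r"#(\w+)")
MENTION = re.compile(r"@(\w+)")


def top_items(tweets, n=10):
    """Return the n most frequent hashtags, mentions and tweeters."""
    hashtags, mentions, tweeters = Counter(), Counter(), Counter()
    for tweet in tweets:
        tweeters[tweet["user"].lower()] += 1
        hashtags.update(tag.lower() for tag in HASHTAG.findall(tweet["text"]))
        mentions.update(name.lower() for name in MENTION.findall(tweet["text"]))
    return {
        "hashtags": hashtags.most_common(n),
        "mentions": mentions.most_common(n),
        "tweeters": tweeters.most_common(n),
    }


# Hypothetical tweets, not taken from the real search results.
sample_tweets = [
    {"user": "alice", "text": "New #oer resource from @bob"},
    {"user": "bob", "text": "Thanks @alice #ukoer #oer"},
    {"user": "alice", "text": "More #oer reading"},
]

summary = top_items(sample_tweets)
```

Running the same tallies on each group's tweets separately would give the group level version of the summary.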
So looking at the macro level we can quickly graph the ripples typical within a Twitter community, which mainly show the effects of retweets (this view was extracted from my online NodeXL Google Spreadsheet Graph Viewer). This can help you quickly see the smaller clusters within the community who are generating retweets and conversations.
Community (group) in a box
Because my data was also on the NodeXL Graph Gallery, Marc kindly created this view, which groups sub-communities using an algorithm and overlays the hashtags each sub-community uses most (Marc’s version on NodeXL Graph Gallery). The group hashtag labels, which are in frequency order, are very useful in this situation because the search term I used was pulling in overlapping hashtag communities (#oer and #ukoer). So boxes where ‘ukoer’ is near the beginning of the label are likely to be from the UK crowd.
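The labelling idea can be sketched in a few lines of Python. I'm assuming here that the group each person belongs to has already been worked out (the grouping algorithm itself isn't covered), and the function name `group_hashtag_labels`, the group ids and the sample tweets are all made up for illustration.

```python
import re
from collections import Counter, defaultdict

HASHTAG = re.compile(r"#(\w+)")


def group_hashtag_labels(tweets, group_of, n=3):
    """For each group, list its n most used hashtags in frequency order."""
    per_group = defaultdict(Counter)
    for tweet in tweets:
        group = group_of.get(tweet["user"])
        if group is None:
            continue  # skip anyone who has not been assigned to a group
        per_group[group].update(
            tag.lower() for tag in HASHTAG.findall(tweet["text"])
        )
    return {
        group: [tag for tag, _ in counts.most_common(n)]
        for group, counts in per_group.items()
    }


# Hypothetical group membership and tweets.
group_of = {"alice": "G1", "bob": "G1", "carol": "G2"}
sample_tweets = [
    {"user": "alice", "text": "Workshop notes #ukoer #oer"},
    {"user": "bob", "text": "Programme update #ukoer"},
    {"user": "carol", "text": "Open textbooks #oer #opened"},
    {"user": "carol", "text": "More links #oer"},
]

labels = group_hashtag_labels(sample_tweets, group_of)
```

In this toy data the first group's label starts with ‘ukoer’ and the second with ‘oer’, which is the same cue I'm using to spot the UK boxes in Marc's graph.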
Getting more from the data
Earlier I mentioned that having expanded urls was useful for further analysis. Something I quickly played with that I’m not entirely sure how to get the most out of (if anything) is reusing my RSS Feed Social Share Counting Google Spreadsheet code to get social share data from the most tweeted links. Here’s the result (embedded below). Let me know if you have any thoughts on how it can be used: