TAGSExplorer

Search Archive feature in TAGSExplorer

At IWMW12 I made a searchable/filterable version of TAGS Spreadsheets. This feature lets you use the Google Visualisation API to filter tweets stored in a Google Spreadsheet (more about TAGS). It has been available via a separate web interface for some time but I’ve never got around to publicising it. As TAGSExplorer also uses the Google Visualisation API to wrap the same data in a different visualisation tool (predominantly d3.js) it made sense to merge the two. So in any existing TAGSExplorer archive (like this one for #jiscel12) you should now also have a button to ‘Search Archive’.

The archive view has some basic text filtering on the tweet text and on who tweeted the message, as well as a time range filter (drag the handles indicated). The scattered dots indicate when messages were tweeted. The denser the dots, the more tweets were made.

I’ve hastily thrown this together so feedback very welcome.


It’s here folks. The most advanced aggregation and visualisation of tweets for the JISC Innovating e-Learning 2012 online conference taking place next week. Over two years ago I started developing a Google Spreadsheet to archive tweets, and since then I have not only been evolving the code but also creating tools which use the spreadsheet as a data source. It’s pleasing to see these tools being used for a wide range of projects, from citizen journalism to a long list of academics, students and community groups, and even TV broadcasters.

I’ve been a little remiss in posting some of the latest developments and I’ll have to cover those soon. For now here’s your #jiscel12 Twitter basecamp.

Overview of features

 #jiscel12 Twitter basecamp

Whilst it probably just looks like another spreadsheet you should explore:

A. The ability to easily filter archive by person

[Still need to document]

B. The TAGSExplorer conversation overview

[TAGSExplorer: Interactively visualising Twitter conversations archived from a Google Spreadsheet]

C. The entire searchable/filterable archive

[Still need to document]

D. The question and answer filter

[Any Questions? Filtering a Twitter hashtag community for questions and responses]

Dashboard

[Contains a number of summaries – I find ‘most RTs in last 24hrs’ one of the most useful (how this works also needs documenting)]

Currently these are automatically updating every hour, but I’ll probably crank up the frequency next week. Your thoughts on these are always gratefully received ;)


It’s interesting to watch the popularity of Twitter hashtag chat communities (like #uklibchat) grow. It’s also interesting to see the number of different ways these chats are recorded, from dumping/exporting a Twitter Search into a Word or PDF doc to using tools like Storify to capture the highlights. If you organise or are thinking of organising a #chat here’s one way you might want to keep a complete record of what was said and a couple of ways you can use this data.

Archiving the conversation

Perhaps not surprisingly it starts with my Google Spreadsheet Template for capturing Twitter searches. I realise that this solution isn’t as straightforward as old services like Twapper Keeper and it can look a bit daunting, but trust me it’s not that bad and even with my dodgy typing it can be set up in under 3 minutes.

Publish archive as a spreadsheet

Here’s an example from the last #uklibchat (for demo purposes only, the archive isn’t being updated). What you can do is select the rows that cover your chat period and paste them to a new sheet (as I have done here). Once you’ve copied you might want to sort on the Time column to get the tweets back in chronological order. At this point you have a couple of options. You can File > Publish to the web, generating a link for the sheet (or, as I’ve done, just File > Share the entire spreadsheet).

Embed tweets in your blog

By adding a column with the formula ="[tweet "&M2&"]" and filling it down the entire column of your chat sheet you generate a set of values which, when pasted into certain blogging tools like WordPress.com, automatically embed the Twitter status urls in the page like the one shown below (you might want to be selective about the tweets you include as too many will kill your page load time):
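If you would rather generate the shortcodes outside the spreadsheet, the same idea can be sketched in a few lines of Python. This is only an illustration of what the formula does: the function name and the sample status urls are made up, and it assumes you have already exported the status url column as a list of strings.

```python
def tweet_shortcodes(status_urls):
    """Wrap each non-empty status url in a [tweet ...] shortcode,
    mirroring the spreadsheet formula ="[tweet "&M2&"]"."""
    return [f"[tweet {url}]" for url in status_urls if url]


# Hypothetical status urls standing in for the exported column
urls = [
    "https://twitter.com/example_user/status/1",
    "",  # blank cells are skipped
    "https://twitter.com/another_user/status/2",
]

lines = tweet_shortcodes(urls)
print("\n".join(lines))
```

Paste the printed lines into the post editor, one per line, and trim the list if the page gets too heavy.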


Experimental - Visualise, interact and replay the conversation

With your #chat in a Google Spreadsheet another thing you can do is use my TAGSExplorer to let other people see and interact with the conversation. Here’s #uklibchat for the 12th January:

Some things worth noting: clicking on a node lets you see that person’s tweets, replies and mentions; and you can replay the conversations with that person.

What do you think?


When I initially pushed TAGSExplorer to the world one of the first reactions I got was from my friend Tony Hirst who suggested having a simple version which let users select two columns to generate a force layout diagram (that’s what I think he was suggesting anyway ;).

I’ve played around with other ways to get online network (force layout) visualisations from tools like NodeXL and Gephi, but these need a high level of faff. Being able to take an edge list (here's an example of edge data), upload it to Google Spreadsheets and then point a browser window at it makes the process a lot easier.

So stripping out some of the extra code from TAGSExplorer gives:

*** EDGESExplorer – A tool to generate force layout diagrams from a Google Spreadsheet ***

EDGESExplorer

How to use it

  1. Generate or upload a 2 column edge list (target/source) to Google Spreadsheets
  2. File > Publish to the web…
  3. Copy the spreadsheet url and paste it in http://hawksey.info/edgesexplorer
  4. Click ‘get sheet names’ and select the sheet with your data on
  5. Enter the comma separated column letters for where your data is stored
  6. ‘go’

Advanced tips

You can specify a 3rd column with link styles for each of your edge rows. The default is solid but there are two more options ‘dashed’ and ‘dashedblue’
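To make the data shape concrete, here is a rough Python sketch of how a two or three column edge list could be turned into the nodes/links structure a force layout typically wants. It is not the EDGESExplorer code: the function name, the sample rows and the choice to treat the first column as the source and the second as the target are all assumptions for illustration. Only the three style names come from the tip above.

```python
import csv
import io

# Style names taken from the tip above; anything else falls back to solid
ALLOWED_STYLES = {"solid", "dashed", "dashedblue"}


def edges_to_graph(rows):
    """Turn edge rows (col 1, col 2, optional style) into nodes and links."""
    index = {}  # node name -> position in the nodes list
    links = []
    for row in rows:
        if len(row) < 2 or not row[0].strip() or not row[1].strip():
            continue  # skip blank or incomplete rows
        source, target = row[0].strip(), row[1].strip()
        style = row[2].strip() if len(row) > 2 else ""
        if style not in ALLOWED_STYLES:
            style = "solid"
        for name in (source, target):
            index.setdefault(name, len(index))
        links.append({"source": index[source], "target": index[target], "style": style})
    return {"nodes": [{"name": name} for name in index], "links": links}


# Hypothetical edge list, as it might look when exported to CSV
sample = "alice,bob\nbob,carol,dashed\ncarol,alice,wavy\n"
graph = edges_to_graph(csv.reader(io.StringIO(sample)))
print(len(graph["nodes"]), "nodes,", len(graph["links"]), "links")
```

Links refer to nodes by position, which is the convention older d3 force layouts use; an unrecognised style such as ‘wavy’ is simply drawn as solid.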

Like TAGSExplorer, the data used for your diagram is queryable so you can refine what you see.

Something for TAGS users

If you are using a more recent version of my Twitter Archiving Google Spreadsheet (TAGS), v2.3 or greater, you can generate an edge list for follow/followee relationships from the Summary sheet data. To do this create a sheet called ‘Edges’ then run Twitter > Get Protovis.  Here’s an example from the #devxs tweets I captured

(note the relationships are generated from the Google Social Graph API so aren’t 100% accurate. The code for this is based on Tony Hirst’s Friendviz)

BTW I had a query about whether the code for this is open-source and for the processing/generating d3.js svg area the answer is yes (I feel the html and css parts include too much from yohman's Twitter Smash so I’m asking for approval first).

PS I now have a nagging voice in my head to turn EDGESExplorer into a Google Gadget as a replacement to the protovis version. Anyone?


Since pushing out my Twitter archive visualisation tool, TAGSExplorer, it’s nice to see people are already pushing out tweets for their own archives. After a full on development period of a couple of weeks, TAGSExplorer is reasonably stable but like most other web services there is some continual tweaking going on behind the scenes. One of the biggest tweaks comes from a suggestion from Tony (Hirst). Tony thought it would be great to give some more control over the data visualised, essentially a queryable visualisation tool.

For example if you take the archive for #mozfest, which has almost 8,000 tweets in it, the visualisation you get is impressive but will send your browser into overdrive computing almost 2,000 nodes and 1,000 edges. Part of the problem is you get a lot of isolated nodes around the edge taking up space and making navigation harder. These nodes represent people who tweeted #mozfest but didn’t @reply anyone with this tag.

So how can we easily filter these out? Fortunately Tony has a lot of experience with using Google Spreadsheets as a database and way back in 2009 started developing an explorer tool to help users build spreadsheet queries. This tool (here’s one of the latest versions) lets you input your spreadsheet url and provides tools for writing queries in the Google Visualization API Query Language. To take it back one step, TAGSExplorer uses the Google Visualization API to read data from a Google Spreadsheet, so using the API Query Language we can refine the data pulled in.

For example, if I take the #mozfest data and plug it into Tony’s Guardian Datastore Explorer I can select columns A,B,C,D,J,L,M,N (the minimum for TAGSExplorer to work are the columns from_user, text, created_at, time (optional for sort), source, id_str, profile_image_url, status_url) and start defining ‘where’. This is a bookmark to select where B contains '@' and not B starts with 'RT', which only selects tweets that contain a @reply or @mention, excluding all RTs. Using the Datastore Explorer gives me a preview of the data pulled back by the query.

I can now drop the gqc and gqw parts of this bookmark (eg &gqc=A%2CB%2CC%2CD%2CJ%2CL%2CM%2CN&gqw=%28B%20contains%20%27@%27%20and%20not%20B%20starts%20with%20%27RT%27%29) straight into a TAGSExplorer url, for example http://hawksey.info/tagsexplorer/?key=0AqGkLMU9sHmLdEZFejItVVh2RGVQTjlsWHBVWlBWN2c&sheet=oau&gqc=A%2CB%2CC%2CD%2CJ%2CL%2CM%2CN&gqw=%28B%20contains%20%27@%27%20and%20not%20B%20starts%20with%20%27RT%27%29, which reduces the number of nodes displayed to fewer than 900 (refining this further with “starts with ‘@’” reduces the node count to 643).
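If hand-encoding those querystring parts feels error prone, a small Python sketch can assemble the url for you. The key, sheet, gqc and gqw values are the ones from the example above; the function name is mine, and this is just one way to do the percent-encoding, not how TAGSExplorer itself builds its links.

```python
from urllib.parse import quote, urlencode

BASE = "http://hawksey.info/tagsexplorer/"


def explorer_url(key, sheet, columns, where):
    """Build a TAGSExplorer url with gqc (columns) and gqw (where clause)."""
    params = {
        "key": key,
        "sheet": sheet,
        "gqc": ",".join(columns),
        "gqw": where,
    }
    # quote_via=quote gives %20 for spaces; '@' is left readable as in the example
    return BASE + "?" + urlencode(params, safe="@", quote_via=quote)


url = explorer_url(
    "0AqGkLMU9sHmLdEZFejItVVh2RGVQTjlsWHBVWlBWN2c",
    "oau",
    ["A", "B", "C", "D", "J", "L", "M", "N"],
    "(B contains '@' and not B starts with 'RT')",
)
print(url)
```

Changing the where clause (for example to B starts with '@') is then a one-line edit rather than a fiddle with %27s and %20s.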

#mozfest with @mentions and no RTs

Dipping into the Google Query Language Reference you can see there are a lot of other select options. For example here’s a Guardian Datastore example which selects part of the #mozfest archive that contains ‘@’ and filters responses for the 4th November which can be dropped into TAGSExplorer via the url giving this:

 #mozfest 4th November only

[dotted lines are @mentions triggered by mentions=true in the querystring. This and retweets=true are both still undocumented oops]

Admittedly getting your head around the query language isn’t straightforward, but for the pro-user TAGSExplorer is now a queryable data visualisation tool which I think is pretty cool!

BTW if you’re more used to writing tq queries as per the documentation then TAGSExplorer can also parse these (e.g. here’s the last example using a tq querystring rather than gqc and gqw)
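For anyone wondering how the two styles relate, here is a hedged Python sketch that folds a column list and a where clause into a single query string of the kind the Google Visualization API Query Language uses. The mapping (columns to the select clause, condition to the where clause) is my reading of how the two forms line up, and the function name is made up; check the result in the Datastore Explorer before relying on it.

```python
from urllib.parse import quote


def to_tq(columns, where=None):
    """Combine a column list and optional where clause into one query string."""
    query = "select " + ",".join(columns)
    if where:
        query += " where " + where
    return query


tq = to_tq(list("ABCDJLMN"), "(B contains '@' and not B starts with 'RT')")
print(tq)

# Percent-encode the whole query before putting it in a querystring
print("tq=" + quote(tq, safe=""))
```

The first print shows the human-readable query, the second the form you would append to a url.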


The use of Twitter to collect tweets around an event hashtag, allowing participants to share and contribute, continues to grow and has even become part of mass media events, with various TV shows now having and publicising their own tag. This resource is often lost in time, with only tiny snippets being captured in blog posts or summaries using tools like Storify, which often lose the richness of individual conversations between participants.

It doesn’t have to be this way. Using a combination of Google Spreadsheets as a data source and a simple web interface to add interactivity it’s possible to let users explore your entire event hashtag and replay any of the conversations.


View example conversation replay

Try out a LIVE version

Update: If you are still struggling to understand the concept Radical Punch have done an overview of this tool

Here's how to archive event hashtags and create an interactive visualization of the conversation (written instructions below):



Capturing the tweets

Use this Google Spreadsheet template. Newer version of the template here

For more reliable data collection it's recommended that you follow the steps to get authenticated API access to Twitter search results and setup a 'script trigger' to automate collection. Here are instructions on how to do it: Twitter now requires authenticated access. An updated version of the template and revised instructions is here.

  1. Open the TAGS Google Spreadsheet making a copy
  2. Register for an API key with Twitter at http://dev.twitter.com/apps/new. In the form these are the important bits:
    • Application Website = anything you like
    • Application Type = Browser
    • Callback URL = https://spreadsheets.google.com/macros
    • Default Access type = Read-only
  3. Once finished filling in the form and accepting Twitter's terms and conditions you'll see a summary page which includes a Consumer Key and Consumer Secret
  4. Back in the Google Spreadsheet select Twitter > API Authentication (you'll need to select this option twice, the first time to authorise read/write access to the spreadsheet). Paste in your Consumer Key and Secret from the previous step and click 'Save' (if the Twitter menu is not visible click on the blue button to show it)
  5. From the spreadsheet select Tools > Script Editor ... and then Run > authenticate and Authorize the script with Twitter using your Twitter account
  6. While still in the Script Editor window select Triggers > Current script's triggers... and Add a new trigger. Select to run 'collectTweets' as a 'Time-driven' choosing a time period that suits your search (I usually collect 1500 tweets once a day, but increase to hourly during busy periods eg during a conference). Click 'Save'
  7. Now close the Script Editor window. Back in the main spreadsheet on the Readme/Settings sheet enter the following settings (starting in cell B9):
    • Who are you = any web address that identifies you or your event
    • Search term = what you are looking for eg #jiscel11
    • Period = default
    • No. results = 1500 (this is the maximum Twitter allows)
    • Continuous/paged = continuous
  8. Click TAGS > Run Now! to check you are collecting results into an 'Archive' sheet
  9. To allow the results to be visualised from the spreadsheet select File > Publish to the web... You can choose to Publish All sheets or just the Archive sheet. Make sure Automatically republish when changes are made is ticked and click Start publishing

Creating a public interactive visualisation of the archived tweets

  1. Copy the url of the spreadsheet you just created
  2. Visit http://hawksey.info/tagsexplorer and paste your spreadsheet url in the box, then click 'get sheet names'
  3. When it loads the sheet names leave it on the default 'Archive' and click 'go'
  4. You now have a visualisation of your spreadsheet archive (click on nodes to delve deeper)
  5. To share the visualisation at the top right-click 'link for this' which is a permanent link (as your archive grows and the spreadsheet is republished this visualisation will automatically grow)



Over on UK Web Focus Brian Kelly has posted What Twitter Told Us About ILI 2011, which gives a breakdown of Twitter statistics for the recent Internet Librarian International conference #ili2011. These statistics give some of the headline figures (number of tweets, top tweeters etc) from Andy Powell’s Summarizr interface for the TwapperKeeper service (recently bought by Hootsuite).

But what if you wanted to see what @bethanar said that got her mentioned/replied to over 140 times? What if you wanted to see how some of the other Twitter conversations outside the top 10 evolved? What if you wanted to extend the conversation beyond the conference by providing easy ways for people to amplify key messages or reply to statements made during the event?

Getting value from Twitter archives is something I’ve looked at in the past with the Twitter subtitling experiments of conference keynotes/TV (not surprisingly there appear to be a growing number of Twitter/TV projects springing up – did Tony and I miss a trick? One to tell the grandkids). My latest offering uses my work in network visualisation to provide an interface for entire events.

Below is what the #ili2011 archive of almost 3,000 tweets by 434 people looks like when you start connecting the replies and mentions made to other people (over 1,000 in total). And if you click on a node you can actually drill down into what they tweeted and the replies/mentions they received in the box shown on the right.

TAGSExplorer of #ili2011

Those with keen eyes will spot that this screenshot is running in a browser window, and if you want to play head here for the low res version without mention connections, or here if you want to test the capabilities of your graphics card (you’ll need a modern browser to see anything as SVG, despite being a recognised standard for years, has only just made it into IE9).

Where is this data coming from? Well it isn’t TwapperKeeper, which also stopped offering an API or data export some time ago. No instead nowadays I suck interesting hashtags into a Google Spreadsheet which sits quietly in the cloud waiting for me to see how I, or anyone else, can twist and pull the data into a more readable shape. You can read more about the technology behind this in TAGSExplorer: Interactively visualising Twitter conversations archived from a Google Spreadsheet and also discover how you can visualise your own Google Spreadsheet of tweets.

PS In Brian’s post he said:

If you carry out a sentiment analysis of the archive of the tweets from last week’s #ili2011 (Internet Librarian International) conference I suspect you’ll find a lot of positive comments

I can confirm by using my Using the Viralheat Sentiment API and a Google Spreadsheet of conference tweets to find out how that keynote went down recipe that there were a lot of positive comments :)


https://docs.google.com/spreadsheet/ccc?key=0AqGkLMU9sHmLdE5EOV9aSjlQVm52U3BsWWR2aGl2VEE&hl=en_GB#gid=78


Graphs can be a powerful way to represent relationships between data, but they are also a very abstract concept, which means that they run the danger of meaning something only to the creator of the graph. Often, simply showing the structure of the data says very little about what it actually means, even though it’s a perfectly accurate means of representing the data. Everything looks like a graph, but almost nothing should ever be drawn as one. Ben Fry in ‘Visualizing Data’

I got that quote from Dan Brickley’s post Linked Literature, Linked TV – Everything Looks like a Graph and like Dan I think Ben Fry has it spot on. When I started following Tony’s work on network analysis (here’s a starting point of posts), my immediate response was ‘Where’s Wally?’, where was I in relationship to my peers, who was I connected to, or even who wasn’t I connected to.

As I start my exploration of tools like NodeXL it's very clear that being able to filter, probe and wander through the data provides far more insights to what’s going on. This is why when I, and I’m sure Tony as well, show our tangled webs it’s designed as a teaser to inspire you to follow our recipes and get stuck into the data yourself. This isn’t however always practical.

A recent example of this was when I was looking through the Guardian’s Using social media to enhance student experience seminar #studentexp. I’d captured the #studentexp tagged tweets using my TAGS spreadsheet, used my recipe to get sentiment analysis from ViralHeat and imported the data into NodeXL to start exploring some of the tweets and conversations from the day.

 

But what does this graph actually mean? I could start highlighting parts of the story, but that would be my interpretation of the data. I could give you the NodeXL file to download and look at, but you might not have this software installed or be proficient at using it. I could point you at the raw data in the Google Spreadsheet, but it lacks ‘scanability’. So I’ve come up with a halfway house: a re-useable interface to the TAGS spreadsheet which starts presenting some of the visual story, with interactivity to let you drill down into the data. I give you:

*** TAGSExplorer ***

TAGSExplorer
http://hawksey.info/tagsexplorer/?key=0AqGkLMU9sHmLdDJYMDZYR3FUcnVwWTkwLWpScnFIUXc&sheet=ob7&mentions=true

What is TAGSExplorer?

TAGSExplorer is the result of a couple of days of code bashing (so a little rough around the edges). It mainly uses the DataTable part of the Google Visualization API to read data from a TAGS spreadsheet and format it for use with the d3.js graphing library. By chucking in some extra JavaScript/jQuery code (partly taken from johman’s Twitter Smash example) I’ve been able to reformat the raw Twitter data from the Google Spreadsheet and restore Twitter functionality like reply/retweet by using their Web Intents API.

What is displayed:

  • A node for each Twitterer who used the #studentexp hashtag and is stored in the spreadsheet archive.
  • Solid lines between nodes are conversations eg @ernestopriego tweeted @easegill I agree completely. Learning how to use social media tools is part of digital literacy and fluency; part of education. #studentexp  creating a connection between @ernestopriego and @easegill.
  • Dotted lines are not direct replies but mentions eg @theREALwikiman tweeted “If you're an academic librarian it might be worth following @GdnHigherEd's #studentexp tag right now, if you have time. Interesting stuff.” For performance by default these are turned off but enabled by following the instructions below.
  • Node text size based on the number of @replies and @mentions
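The display rules above can be approximated in a few lines of Python. This is a sketch of the idea rather than the TAGSExplorer code (which is JavaScript): the function name and sample tweets are invented, and I have assumed that a tweet starting with @name counts as a reply to that first name, that every other @name is a mention, that links are only drawn between people who are in the archive, and that the text size count is replies and mentions received.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")


def build_graph(tweets, include_mentions=False):
    """Return (users, links, received) for a list of archived tweets."""
    users = {t["from_user"].lower() for t in tweets}  # one node per tweeter
    links = []
    received = Counter()  # replies + mentions received, for text size
    for t in tweets:
        sender = t["from_user"].lower()
        text = t["text"].lstrip()
        for i, name in enumerate(n.lower() for n in MENTION.findall(text)):
            if name == sender or name not in users:
                continue  # assumption: only link people in the archive
            received[name] += 1
            kind = "reply" if text.startswith("@") and i == 0 else "mention"
            if kind == "reply" or include_mentions:
                links.append((sender, name, kind))
    return users, links, received


# Hypothetical tweets using the from_user/text column names
tweets = [
    {"from_user": "alice", "text": "@bob I agree completely #tag"},
    {"from_user": "bob", "text": "Interesting stuff #tag"},
    {"from_user": "carol", "text": "Worth following @bob and @alice today #tag"},
]

users, links, received = build_graph(tweets)
print(links)
print(build_graph(tweets, include_mentions=True)[1])
```

With the default settings only the solid (reply) link is returned; passing include_mentions=True adds the dotted (mention) links, in the same spirit as the mentions option described in the next section.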

How to make your own?

  1. If you haven’t already you need to capture some tweets into a TAGS spreadsheet
  2. When you have some data from the spreadsheet File > Publish to the web …
  3. Head over to TAGSExplorer and enter your spreadsheet key (or just paste the entire spreadsheet url, HT to Tony Hirst for this code)
  4. Click ‘get sheet names’ and select the sheet of the data you want to use (if you are doing a continuous collection the default is archive)
  5. Click ‘go’
  6. If you want to share with others, click the ‘link for this’ at the top right which gives you a permanent url – the permanent link also hides the spreadsheet selection interface. By default mention lines are off but can be enabled by adding &mentions=true to the link (see example above)
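If you want to build the permanent link yourself, the steps above boil down to pulling the key out of the spreadsheet url and adding the sheet and mentions parameters. The Python sketch below assumes an old-style spreadsheet url that carries the key in a key= parameter; the helper names and the sample spreadsheet url are mine, while the key and sheet id are taken from the example link above.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def spreadsheet_key(url_or_key):
    """Return the key= value from a spreadsheet url, or the input if it is already a key."""
    query = parse_qs(urlparse(url_or_key).query)
    if "key" in query:
        return query["key"][0]
    return url_or_key.strip()


def permalink(url_or_key, sheet, mentions=False):
    """Build a TAGSExplorer link; mentions=True adds &mentions=true."""
    params = {"key": spreadsheet_key(url_or_key), "sheet": sheet}
    if mentions:
        params["mentions"] = "true"
    return "http://hawksey.info/tagsexplorer/?" + urlencode(params)


# Assumed url shape for illustration; the key matches the example above
sheet_url = (
    "https://docs.google.com/spreadsheet/ccc"
    "?key=0AqGkLMU9sHmLdDJYMDZYR3FUcnVwWTkwLWpScnFIUXc&hl=en_GB"
)
link = permalink(sheet_url, "ob7", mentions=True)
print(link)
```

Note the sheet value is the short id that appears in the ‘link for this’ url rather than the sheet’s display name, so copy it from a link TAGSExplorer has already generated for you.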

Some examples

If you don’t have your own data yet here’s some examples from data I’ve already collected:

Where next?

I’ve got some ideas, I’m interested in integrating the sentiment scores from ViralHeat, but more importantly where do you think I should go next with this?