A flare went up this morning from Tony Hirst, who is at the Daten, Recherchen, Geschichten (2012) conference (#drg12) in Hamburg:
Anyone know of twitter streaming search clients that run all tweets through language translation?
— Tony Hirst (@psychemedia) March 24, 2012
Translating tweets is something I’ve dabbled with before as part of experiments with my Twitter Subtitling tool iTitle (really must revisit this). Notable examples were the Google I/O 2010 Android Keynote and the presentations from #UIMPUni20 discussed in Kirsty Pitkin’s Lost In Translation blog post (because of domain shuffles most of the links are broken, but here is Professor Alejandro Piscitelli’s talk at #UIMPUni20 with tweets normalised to English).
Both these examples used the Google Translate API to convert tweets from one language to another. Back then the API was free for anyone to use, but in August 2011 Google switched it to a paid-for service … doh. All is not lost though, as Google Apps’ programming environment, Google Apps Script, still has a ‘Language Service’ <cough>Google Translate</cough>.
The disadvantage of this service is that, unlike Translate, it doesn’t have an option to autodetect the source language. [Update: I re-read the documentation and it clearly states it can autodetect. It’s probably still not a bad idea to use ISO codes.] This is not a problem when dealing with Twitter, as the tweet metadata includes ‘iso_language_code’.
So I have a solution to archive tweets in a Google Spreadsheet using Google Apps Script (must write more about v4) … ponder, ponder, 5 minutes later by adding:
objects[i]["text_en"] = LanguageApp.translate(objects[i]["text"], objects[i]["iso_language_code"], "en");
I’ve got a spreadsheet archiving #drg12 tweets and translating the text into English. Impressed much?
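For anyone wanting to try the same trick, here is a rough sketch of how that one-liner might sit inside a loop over the tweet objects. It is not the actual archiving script: the function name addEnglishText and its translate parameter are made up for illustration, and it assumes each tweet object has a "text" field and possibly an "iso_language_code" field. Where the language code is missing it passes an empty string, which the Language Service documentation says triggers auto-detection.

```javascript
// Sketch only: addEnglishText and the translate parameter are hypothetical names,
// not part of the archiving script described above.
// translate is any function with the shape (text, sourceLanguage, targetLanguage),
// which is the shape of LanguageApp.translate in Google Apps Script.
function addEnglishText(objects, translate) {
  for (var i = 0; i < objects.length; i++) {
    var tweet = objects[i];
    // Use the tweet's own language code; "" asks the service to auto-detect.
    var source = tweet["iso_language_code"] || "";
    if (source === "en") {
      // Already English, so skip the translation call.
      tweet["text_en"] = tweet["text"];
      continue;
    }
    try {
      tweet["text_en"] = translate(tweet["text"], source, "en");
    } catch (e) {
      // If translation fails (e.g. an unsupported code), keep the original text.
      tweet["text_en"] = tweet["text"];
    }
  }
  return objects;
}
```

Inside Apps Script the call would be something like addEnglishText(objects, function (t, s, d) { return LanguageApp.translate(t, s, d); }). Passing the translator in as an argument is just a convenience so the loop can be tried out without the Language Service to hand.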