Keep your Twitter Archive fresh on Google Drive using a bit of Google Apps Script

Update: There is a newer take on this solution which dumps Google Drive web hosting and uses Github Pages instead. You can read more in Keeping your Twitter Archive fresh and freely hosted on Github Pages.

Important: As noted in the comments, Google announced last year that it is killing Google Drive web hosting, which this solution relies on. The good news is I’ve come up with a solution that lets you keep your archive on the web. If you’ve got an existing archive the only change is a new url: https://script.google.com/macros/s/AKfycbwrXr8ejYjHwGEO6kj8f4WHIh096ARDRHdNOgAXPqGltoa80FU/exec?folder_id=YOURFOLDERID (e.g. my archive is at https://script.google.com/macros/s/AKfycbwrXr8ejYjHwGEO6kj8f4WHIh096ARDRHdNOgAXPqGltoa80FU/exec?folder_id=0B6GkLMU9sHmLRFk3VGh5Tjc5RzQ ). Your archive will be a bit slower to load and will have a bar at the top to say it’s not a Google application. I’ve updated the template to include the new url.

[Image: Twitter Archive interface]

Like a growing number of other people I’ve requested and got a complete archive of my tweets from Twitter … well almost complete. The issue is that while Twitter have done a great job of packaging the archives, even going as far as creating a search interface powered by HTML and JavaScript, the data is stale as soon as you’ve requested it. The other issue is that, unless you have some web hosting, where can you share your archive to give other people access?

Fortunately, Google recently announced site publishing on Google Drive, so by uploading your Twitter archive to a folder and then sharing the folder so that it’s ‘Public on the web’ you can let other people explore your archive (here’s mine). Note: Mark Sample (@samplereality) has discovered that if you have file conversion on during upload this will break your archive. [You can also use the Public folder in Dropbox if you don’t want to use a Google account]

The documentation wasn’t entirely clear on how to do this. Basically it seems that as long as there’s an index.html file in the folder root and links to subdirectories are relative, all you need to do is open the folder in Google Drive and swap the first part of the url with https://googledrive.com/host/ e.g. https://drive.google.com/#folders/0B6GkLMU9sHmLRFk3VGh5Tjc5RzQ becomes https://googledrive.com/host/0B6GkLMU9sHmLRFk3VGh5Tjc5RzQ/
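
As a rough illustration of the url swap just described, here is a minimal sketch in plain JavaScript. It is not code from the template: toHostedUrl is a made-up helper name, and the pattern used to pick out the folder ID is an assumption based only on the example url above.

```javascript
// Hypothetical helper (not part of the template): turn a Drive folder url
// into the matching googledrive.com/host/ url.
function toHostedUrl(folderUrl) {
  // Assumption: the folder ID follows "folders/" as in the example url,
  // and only contains letters, digits, underscores and hyphens.
  var match = /folders\/([A-Za-z0-9_-]+)/.exec(folderUrl);
  if (!match) {
    throw new Error('No folder ID found in: ' + folderUrl);
  }
  // Keep the trailing slash so relative links inside index.html resolve.
  return 'https://googledrive.com/host/' + match[1] + '/';
}

console.log(toHostedUrl('https://drive.google.com/#folders/0B6GkLMU9sHmLRFk3VGh5Tjc5RzQ'));
```

The same folder ID is what goes after folder_id= in the newer script.google.com url.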

So next we need to keep the data fresh. Looking at how Twitter have put the archive together, we can see tweets are stored in /data/js/tweets/ with a file for each month’s tweets, and some metadata about the archive in /data/js/, the most important being tweet_index.js.
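
To make the layout a bit more concrete, here is a small sketch of how one monthly file could be read in plain JavaScript. It relies on details that come up in the comments below (files named like 2013_02.js, each starting with an assignment such as Grailbird.data.tweets_2013_01 = followed by a JSON array). The helper names parseTweetFile and monthNames are made up for illustration and are not taken from the template, and the "text" field in the sample is an assumption.

```javascript
// Hypothetical helper: parse the text of one monthly data file into an array.
// Everything up to the first "=" is the variable assignment; the rest is JSON.
function parseTweetFile(text) {
  var eq = text.indexOf('=');
  if (eq === -1) {
    throw new Error('Not an archive data file');
  }
  return JSON.parse(text.slice(eq + 1));
}

// Hypothetical helper: build the names used for a given year and month (1-12).
function monthNames(year, month) {
  var key = year + '_' + (month < 10 ? '0' + month : String(month));
  return {
    varName: 'tweets_' + key,            // e.g. Grailbird.data.tweets_2013_02
    fileName: key + '.js',               // e.g. 2013_02.js
    path: 'data/js/tweets/' + key + '.js'
  };
}

// A tiny made-up sample in the same shape as an archive file.
var sample = 'Grailbird.data.tweets_2013_01 = \n[ { "id_str": "1", "text": "hello" } ]';
console.log(parseTweetFile(sample).length);
console.log(monthNames(2013, 2).path);
```

Splitting on the first "=" rather than stripping the first line means the parse works whether the opening bracket sits on the first line or the second.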

Fortunately, not only does Google Apps Script provide an easy way to interface with Drive and other Google Apps/3rd party services, but the syntax is based on JavaScript, making it easy to handle the existing data files. Given all of this it’s possible to read the existing data, fetch new status updates and write new data files, keeping the archive fresh.
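
The read, fetch and write idea can be sketched in plain JavaScript along these lines. This is an illustrative sketch rather than the template’s actual code: mergeTweets, compareIdsDesc and serializeMonth are made-up names, the newest-first ordering is an assumption about how the archive files are laid out, and the Drive and Twitter calls are left out entirely. Only the output format (the Grailbird.data assignment followed by tab-indented JSON) follows the lines quoted in the comments below.

```javascript
// Compare two tweet IDs held as strings, largest (newest) first.
// IDs are compared as strings because they can exceed what a JavaScript
// number holds exactly.
function compareIdsDesc(a, b) {
  if (a.length !== b.length) {
    return b.length - a.length; // longer ID means a larger number
  }
  return a < b ? 1 : (a > b ? -1 : 0);
}

// Hypothetical helper: combine existing and newly fetched tweets for a month,
// dropping duplicates by id_str and sorting newest first (an assumption).
function mergeTweets(existing, fetched) {
  var seen = {};
  var merged = [];
  existing.concat(fetched).forEach(function (tweet) {
    if (!seen[tweet.id_str]) {
      seen[tweet.id_str] = true;
      merged.push(tweet);
    }
  });
  merged.sort(function (a, b) {
    return compareIdsDesc(a.id_str, b.id_str);
  });
  return merged;
}

// Hypothetical helper: produce the text of a monthly data file.
function serializeMonth(varName, tweets) {
  return 'Grailbird.data.' + varName + ' = \n' + JSON.stringify(tweets, null, '\t');
}

var existing = [{ id_str: '10' }, { id_str: '9' }];
var fetched = [{ id_str: '11' }, { id_str: '10' }];
console.log(serializeMonth('tweets_2013_01', mergeTweets(existing, fetched)));
```

In the real script the resulting text would then be written back to the matching file in data/js/tweets/.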

To do all of this I’ve come up with this Google Spreadsheet template:

*** Update Twitter Archive with Google Drive ***
[Once open File > Make a copy for your own copy]

Important: Google have changed some of their backend. If you took a copy of this template prior to 8th Dec 2014 the script will fail with the error ‘Script is using OAuthConfig, which has been shut down. Learn more at http://goo.gl/IwCSaV (line 277, file "Code")’. To get your archive running again the best solution is to copy the template and set it up using your existing Drive folder and API key/secret.

Billy has spotted an inconsistency with the way Drive searches for the files used in the script. This has been addressed in an update to the code, but existing users will need to either take and set up a fresh copy or open their existing copy, then open Tools > Script editor and replace the code with the version here.

Note: There is currently an open issue which is producing the error message ‘We’re sorry, a server error occurred. Please wait a bit and try again.’ Hopefully the ticket will be resolved soon.

The video below hopefully explains how to set up and use the template (Update: the script uses a new authentication flow which makes setup easier, and API registration is not necessary if you already use TAGS v6.0):

A nice feature of this solution is that even if you don’t publicly share your archive, if you are using the Google Drive app to sync files with your computer the archive stays fresh on your local machine.

The model this solution uses is also quite interesting. There are a number of ways to create interfaces and apps using Google Apps Script. Writing data files to Google Drive and having a static HTML-based interface is ideal for scenarios like this one where you don’t rely on heavy write processes or dynamic content (aware of course that there will be some sanitisation of code).

It would be easy to hook some extra code to push the refreshed files to another webserver or sync my local Google Drive with my webhost but for now I’m happy for Google to host my data ;s

171 Comments


  1. I tried making a copy of the spreadsheet, but that option was not available to me.


    1. Hi Jedidiah, first thing to check is you are logged in to your google account when you open the spreadsheet


      1. Hi Martin – I was logged into my google account. I have tried it since and been able to make a copy. Thanks.


  2. I’m not getting any action, have triple checked settings. I’ve entered my folder ID twice but the status says “Folder ID not entered yet”. When I run the script, it just stays at Running script updateArchive – it looks like it’s not getting my folder ID?


    1. Think it was my dodgy coding not updating the location on the readme. If you open the Log sheet has it recorded any updates?


    1. [closed – for some reason authentication with Twitter didn’t work 1st time. Added some extra UI to help]


  3. Looking for a link to archiving twitter on a website (other than on Google Drive)…




  4. hi

    am getting Unexpected error: (line 35) when i try to authorize function?

    var result = UrlFetchApp.fetch("http://api.twitter.com/1.1/account/verify_credentials.json", requestData);

    thanks in advance


    1. Sounds like your API key/secret are not set properly. Make sure there are no spaces before/after them when you enter them into the configuration dialog (I’ve added a trim to prevent this in the future)


      1. hi again

        used your new spreadsheet template and now getting a different error message says:

        Apps Script

        Unexpected error:

        have triple checked the api key/secret.
        any ideas?

        thanks


        1. ak ignore last message working now, i forgot to add the callback url, doh!
          thanks again great script!


  5. Holy smokes, this is great! I had trouble getting it to work at first, and I wanted to share the solution, in case your readers encounter the same problem. Depending upon one’s Google Drive settings, your Twitter archive folder might be converted to “Google Docs” format during the uploading process. This breaks your files and makes them unreadable (as I found out!). So make sure that in your Google Drive upload settings, you have all the file conversion rules turned off.

    Here’s my own now-working archive: https://googledrive.com/host/0By7OircJ9labZktFR0xkT1ExUXc/


    1. Thanks Mark! I’ve added a couple of notes in the template/post to highlight this.



  6. Wow, great. That even worked for a coding/web-admin dummy like me. Thank you so much. Because I was too stupid to set up Tweetnest.


    1. In Tools > Script Editor when you then clicked Run > authorize did any error msgs come up?


  7. Hi Francisco George,

    I had the same problem – if you do as Martin suggested and Run > authorize part of the script, you’ll get a dialog box (can’t remember what it said now) with a button. If you click the button, it takes you off to Twitter, and you’ll be able to authorize the script there.

    It fixed the problem for me at least – hope it works for you too.

    Nice work Martin!


  8. Hi Martin,

    Thanks a lot, it worked!

    I somehow didn’t quite understand this step…and now with your hint everything worked fine.

    You should maybe submit your solution to Twitter; having it implemented would ease their load, as they would not have everyone downloading their archive too frequently.

    Very nice work!

    Francisco

    PS: is there any way to change the frequency of the automatic update and set it to update every 8 hours, for example?


    1. UX isn’t my strong point. I’m not sure what Twitter would make of this ;s

      [You can update frequency by opening Tools > Script Editor and then Resources > Current script triggers. In the dialog box you should see a function set to run daily. Just adjust the frequency from one of the drop downs and save. I wouldn’t recommend going any lower than every 15 min to avoid API/usage limits]


      1. Hi Martin, sorry if I’m being dense but what is the current update frequency set at – daily? I tried the route you mention above Tools > Script Editor > Resources – I don’t have ‘current script triggers’, instead I get the option ‘current project triggers’. The dialog box this brings up shows one trigger but it isn’t editable at all and I can’t see anywhere that mentions update frequency

        It’s not the end of the world – I would be happy with once a day if this is what it is

        Btw, love it. Thanks very much :)


        1. Hi Patrick – in the spreadsheet there should be a Sync Twitter Setup menu where you can toggle the auto-refresh


          1. Hi Martin, thanks I found that in the end but thanks for getting back to me. Very chuffed with this – great bit of work, well done :)



  9. Is the formatting of the added (updated) tweets exactly the same as the original Twitter Archive? I had a script written to parse the original archive JSON by stripping the first line (per the README in the archive) but that seems to fail with the updated archive files. It seems that in the original archives, the first line reads as “Grailbird.data.tweets_2013_01 = ” whereas yours outputs “Grailbird.data.tweets_2013_01 = [” with that opening bracket appearing on the first line rather than the second. It’s probably a simple fix to your script but it does break JSON parsing if the first line is stripped.


    1. ah hadn’t spotted that. I’ve updated the template. The changes are line 109 which is now

      tweet_file.replace("Grailbird.data."+var_name+" = \n"+ JSON.stringify(Grailbird.data[var_name], null, '\t')); // replace old file

      and line 116 which is now

      new_tweet_file.replace("Grailbird.data."+i+" = \n"+ JSON.stringify(newData[i], null, '\t')); // replace content with new data


  10. So I think something may be broken now that a new month has rolled around. The script generated a “2013_02.js” file in the root of my Google Drive rather than in the appropriate directory, which resulted in future updates failing.


      1. Very weird. I moved the file back into the appropriate location and everything seems to be working fine. Guess I’ll find out next month if it’s actually a problem.


    1. but you are right too, the file “2013_02.js” appears in my root too but along it, it says it is located in the “Tweets” folder(the public folder I use)


      1. Appeared to work okay for me. Like @paco229 the new file appeared in the root directory and data>js>tweets (which is surprising as I thought Drive had done away with files having multiple labels/locations). I’ve added an extra line in the template to prevent this. For existing copies of the script insert the following line after line 115:

        new_tweet_file.removeFromFolder(DocsList.getRootFolder());

        so it should look like:

        new_tweet_file.addToFolder(tweetsFolder); // move new file to the data/js/tweets/ folder
        new_tweet_file.removeFromFolder(DocsList.getRootFolder());




  11. hi

    my archive is giving me a 404 not found error.

    when i try to manually update archive using your script i get this error message:

    TypeError: Cannot call method “getContentAsString” of undefined

    thanks in advance!


      1. hi
        had to eventually upload new Twitter folder, and set share permissions to “Public on the web – Anyone on the Internet can find and view”

        thanks


    1. Hi Willem, search is done via the browser (no clever indexing). When I tried doing a search in your archive the console said the file data/js/tweets/2008_05.js was missing
      Thanks,
      Martin


      1. Thanks so much, you found the bug! Actually, the file was not missing, but existed two times! No idea how that happened. Maybe a hiccup with ftp.
        Anyways, search works now thanks to your attention.
        Lots of people here in the Netherlands tried your script already. It’s been retweeted quite a lot!


  12. Hi Martin

    I’m getting this message from [email protected] on intervals of one hour(my refresh rate)

    26/02/13 18:51 updateArchive We’re sorry, a server error occurred. Please wait a bit and try again. (line 72, file “Code”) time-based 26/02/13 18:51

    Any idea of the problem?

    Thanks



  13. When starting “update Archive Now” I’m getting to see (in Dutch) an error message telling me that function getFolders cannot be found in object false. There the script stops. What’s going wrong ?


  14. How to fix this Twitter Api not configured???


    1. did you get a twitter api key and secret? If you did, did you add them via the Twitter sync menu? If you did, did you open Tools > script editor and run authenticate?


      1. Yes, I took all these steps without getting any error message, everything went smooth as far as these steps are concerned.


          1. what happens if you run the authenticate function again? (In particular does a dialog box appear taking you to Twitter to authorise the connection or nothing)


          2. Harmen – I’ve confused myself my responses were supposed to be for Mike. Your problem sounds like all the folders/files aren’t uploaded to Drive properly


      2. Sorry, I don’t know how to get the key and secret, I’m stuck there. I followed the instructions in the video, by clicking authorize and run, but a box appears saying: Twitter Api not configured.


        1. In the spreadsheet menu bar there should be a Sync Twitter Setup menu option. From it click API Authentication for instructions on setting up the API connection


          1. Thanks a lot! I just realized I hadn’t created those things, that’s the reason I couldn’t authorize the app. Everything is fine and my twitter archive is able to be updated now. Thank you again. =)


          2. By the way, it only updated the latest tweets. How about old tweets that I have deleted but which still appear in my archive?


  15. Hi, Martin. Great script and easy to set up. Thank you for sharing it. I credit you on my website.



  16. Is it possible to edit the title tags within the index.html & how would I go about using a bespoke address instead of the long googledrive address.

    Great script and so easy to set up.

    Thanks

    Adam


    1. Hi Adam, I’ve done some minor editing of the index.html for my own archive but haven’t dug that deep. I also use a redirect on my webserver http://tweets.hawksey.info (doesn’t cloak the address, I’m sure there are ways of doing it)



  17. Hi Martin,

    I’ve tried this a few times, followed step by step and to a T, but every time I get an “oAuth Error” after running the “authorize” script. Any ideas?


    1. i had the same problem. right click on “My Drive/tweets” folder
      click “Details”
      then edit the Sharing visibility and change the folder from private to public.
      if u succeeded you’ll see a small word beside your folder name “shared”
      then its done


  18. I’ve been running this for weeks and I have to say I’m very satisfied with it. The search is great and really works. But there’s one question I worry about: will this still work when Twitter changes to its 1.1 API completely?


    1. Fear not – when I wrote the script I was aware of the impending changes so it entirely uses version 1.1 of the API


  19. I hope this isn’t a cheeky question, but does anyone know how I would go about putting my own Twitter background on this for when people are searching tweets? At the moment it’s pretty uninspiring and blank. Thanks


    1. Easiest way is to change the img/bg.png image. For more tweaking most of the styling is done in the css/application.min.css file. It’s minified so not easy to navigate (sites like http://procssor.com/process can decompress it if you are doing lots of edits)


      1. Phew, you were right about it being minified! Thanks Martin I’ll have a bash and thanks too for keeping up with everyone’s comments, it’s great style. Much appreciated



  20. Hi, it’s me again! I would like to ask you whether there’s a way to remove DELETED tweets on twitter archive from google drive?


    1. Easiest way is to re-request archive from Twitter and replace the files. The script will just continue updating the data files it has available


  21. Hi Martin,

    Since this morning I’m getting this error report. Any advice?

    Details:

    Start Function Error Message Trigger End
    16/03/13 19:51 updateArchive TypeError: Cannot read property “id_str” of undefined. (line 95, file “Code”) time-based 16/03/13 19:51


    1. The error code from Appscript has changed 3 hours ago now it states

      17/03/13 6:51 updateArchive ReferenceError: “tweet_index” is not defined. (line 80, file “Code”) time-based 17/03/13 6:51


      1. Hmm looks like your archive files have become corrupted (I’m guessing a write process wasn’t completed properly). Easiest solution is to request your archive from Twitter again and replace the files on drive. If you use the same folders setting up the script again is not required


    1. Hi Rodney – I know a number of graduate researchers swing by this blog and I’m sure they find it a valuable resource :)
      Martin





  22. Hi, Thanks for this wonderful guide. I have some issues though. When I get to the API Configuration step, I am not sure what to put in for the Application Details Page. I see: Name, Description, Website, and Callback URL. What is the website we should set to?

    Also will this work with a private twitter account? I don’t want to share my tweets and want my own personal archive.

    Thank you!


  23. Hey, I have downloaded twitter archive several times, it was complete. However, my recent download was incomplete, it started at my latest 3000 tweets. Why it happened to me?


        1. This sounds like a question for Twitter. I’ve no control over what is in your archive from them



  24. I’ve just figured out how to create a twitter archive page hosted by Google Drive thanks to your great article and video. Just a quick question- is it possible to insert/include the actual web page- the one that the twitter archive is hosted on- in a new Google Document or Spreadsheet on Drive? Thanks for any advice!


  25. When starting “update Archive Now” I’m getting to see (in Dutch) an error message telling me that function getFolders cannot be found in object false. There the script stops. What’s going wrong?


  26. dear martin
    I tried above and seemed to get things working but not viewable. what i mean is that the twitter data was updating each time I tweeted – the new tweets were supposedly being added but I couldn’t open the archive to see these no matter what I tried.
    I reloaded the archive up again and even upgraded my google apps account to make it possible to change settings of folders/files to public (did you know you could not do that on a google apps free edition?). I am now getting the same error as Seungow
    Cannot find function getFolders in object false. – despite going through all the steps of the spreadsheet.
    If you could shed any light on this I’d be grateful. Not so au fait with scripts unfortunately.
    thanks, Helen


  27. please forget my previous problem. everything is working now and i think it’s amazing. have to get my head around how great this is and how much it can help my clients.
    you are a genius!
    thanks….


  28. This is absolutely one of the most useful tools I use. Thanks!

    Has anyone tried using this system to add additional twitter handles to the tweets that are displayed? I currently archive my own tweets, but want to add my family members in one central archive. I may try on my own, but thought I’d see what other people thought of my chances before I destroy my current archive. ;-)


    1. From recollection it is feasible but would require rewriting the collection routine. Another way to do it might be to use this with the search operator ‘from:mhawksey OR from:dhkeller’ etc. (without quotes). The display of data however isn’t as pretty and it won’t be a full archive


  29. Thanks for the fabulous tool, been using it for the last few months & it was working like a charm..

    Now for the past couple of days, I’ve been getting these error emails from Google drive

    Your script, Copy of Sync Twitter Archive v1.0, has recently failed to finish successfully. A summary of the failure(s) is shown below. To configure the triggers for this script, or change your setting for receiving future failure notifications, click here.
    The script is used by the document Update/Host Twitter Archive with Google Drive.
    Details:
    Start Function Error Message Trigger End
    11/20/13 12:11 AM updateArchive TypeError: Cannot call method “getContentAsString” of undefined. (line 73, file “Code”) time-based 11/20/13 12:11 AM
    Sincerely,
    Google Apps Script

    I went into the spreadsheet & tried to do a manual refresh, it doesn’t work with the same error. I am no expert by any means but it seems either the Twitter API or Google Apps API has changed?

    I’m just asking if you have encountered the same error and if you plan to update the sheet to fix it. :)

    Thanks and keep up the good work.


    1. Hi Michael, I know of one other recent case similar to yours. It looks like a glitch at Google’s end. Usually it appears to clear itself and start working again. One thing you can do is search your Google Drive for a file called user_details.js. If this is missing or corrupt the script won’t work. The fix is to get a fresh copy of your archive from twitter and replace the file (it’s in the data/js folder).
      Thanks,
      Martin


      1. Heya Martin. Thanks again for this excellent resource. It’s really amazing, and I love being able to host an automatically updated archive of all my tweets.

        I’m having the same problem as Michael. I’ve tried replacing user_details.js, as you suggest, but I’m still receiving the same error. Is there something else I can try, or should I just delete everything and start again from step one, hoping that a fresh go will fix this odd and out-of-nowhere problem?


  30. Hey Martin,

    This has been working great all year, and about 5 days ago I started to get this error when it tries to update:

    ‘TypeError: Cannot call method “getContentAsString” of undefined’

    I’ve requested a new backup from twitter but it seems to be taking longer than usual. Any ideas what’s going on?

    So far I have deleted my old set of files and tried starting afresh but I still get the same issue :|

    any help would be appreciated.

    Cheers,

    Mikey.

    PS. Would there be any way to get it to download any images we post? I understand there might be storage issues after a while, just curious


    1. Hi there,
      I have the same problem as Mickey and Michael. “TypeError: Cannot call method getContentAsString of undefined. (line 73, file ‘Code’)” Is there a fix yet?


      1. *UPDATE*

        Right, I’ve finally received a new archive from twitter. I deleted everything and started again. Same issue persists. Over 4 weeks now so I’m not sure if it’s a permanent thing with drive or not :(


      2. Hey Everyone having the “TypeError: Cannot find function getContentAsString in object” error, I figured I would put that string in for people searching for this FIX.

        First off, thank you Martin for the amazing script and idea, I love my Twitter archive, but just like many people I’ve seen commenting, it hasn’t been working for a little while now. I was having the error absolutely every time, so I decided to try some things out, and I figured out the cause of the problem, and a workaround for now, but I thought I would post things here to see if we could fix it properly.

        Around line 70 of Code.gs, there are the json calls to bring in the data from the existing archive, starting with this:
        var user_details_file = js.find("title:user_details.js")[0];

        The issue is, for some reason, the built-in .find() function is not liking the ‘title’ part of the query. If you remove ‘title’ from those three .find functions, and the one further down (around Line 90, just look for ‘find’ and ‘title’ near each other), the queries still complete, and the archive works properly again!

        Any idea why Google Scripts isn’t accepting the ‘title:’ argument for a search anymore?


        1. Billy – Big thanks for taking the time to look at this and finding the bug. Seems like this is inconsistent as my archive has been happily updating using the title: search. I guess it must be the way Google indexes the search in Drive.

          I’ve updated the code to try and take account of this (untested). Unfortunately this will only take effect for new copies of the script. Existing users experiencing problems should either take a fresh copy or open the new version then open Tools > Script editor and copy the code across to their own version.

          Thanks again,
          Martin


          1. Looks like a good solution to me, but it looks like the .find on line 92 of your Gist is going to toss the same error. Since I’m finally playing around with git again after a little while, I quickly forked your code:
            https://gist.github.com/WillPresley/7772938/revisions

            Since I was having the issue constantly, I figure I’m a good test-case, and your updated code (with my tiny edit) works perfectly!


    2. I’d also be very curious about how to get the Twitter images pulled. The self-hosted archive is definitely cool, but as long as the images themselves are still hosted on Twitter rather than in my GoogleDrive, the “archive” is incomplete. Anybody gets it working, let me know! :-D


  31. @Billy – well done mate. Went in and edited it manually (been years since I’ve gone anywhere near a piece of code) and it worked a treat.

    @Martin – I didn’t realise how much I rely on this to search old tweets. A big thank you for spending the time figuring out how to do it. I’ve been looking into ways of getting uploaded images (twitpic/yfrog/twitter etc.) and archiving them. Doesn’t appear to be anywhere near as simple as I thought it would be. Think I’ll leave it until I have some time, over Xmas maybe.

    Anyway thanks again to the both of you.

    Mikey


  32. Hi Martin,

    After a looooong time working smoothly, for the last 24h I get this message from Google every hour (my refresh rate): “TypeError: Cannot call method “getContentAsString” of undefined. (line 73, file “Code”)”

    Is it a google error once again?


  33. @Martin,
    Since two days I get an error each from Google that the script fails.
    What can I do to repair it?
    I really love my archive and use it a lot, so I hope this is temporary?
    Hope (and trust:) you can help me out…