Tag Archives: oerri

The JISC OER Rapid Innovation programme is coming to a close and as the 15 projects do their final tidying up it’s time to start thinking about the programme as a whole, emerging trends and lessons learned, and to help projects disseminate their outputs. One of the discussions we’ve started with Amber Thomas, the programme manager, is how best to go about this. Part of our support role at CETIS has been to record some of the technical and standards decisions taken by the projects. Phil Barker and I ended up having technical conversations with 13 of the 15 projects, which are recorded in the CETIS PROD Database in the Rapid Innovation strand. One idea was to see if there were any technology or standards themes we could use to illustrate what has been done in these areas. Here are a couple of ways to look at this data.

To start with, PROD has some experimental tools to visualise the data. By selecting the Rapid Innovation strand and selecting ‘Stuff’ we get this tag cloud. We can see JSON, HTML5 and RSS are all very prominent. Unfortunately some of the context is lost, as we don’t know without digging deeper which projects used JSON etc.

PROD Wordcloud

To get more context I thought it would be useful to put the data in a network graph (click to enlarge).

NodeXL Graph

NodeXL Graph - top selected

We can now see which projects (blue dots) used which tech/standards (brown dots) and again JSON, HTML5 and RSS are prominent. Selecting these (image right) we can see they cover most of the projects (10 of them), so these might be some technology themes we could talk about. But what about the remaining projects?

As a happy accident I put the list of technologies/standards into Voyant Tools (similar to Wordle but way more powerful – I need to write a post about it) and got the graph below:

Voyant Tools Wordcloud

Because the wordcloud is generated from words rather than phrases the frequency is different: api (16), rss (6), youtube (6), html5 (5), json (5). So maybe there is also a story about APIs and YouTube.
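To see why counting words rather than phrases pushes ‘api’ to the top, here is a minimal sketch. The function names and the sample phrases are made up for illustration; this is not the PROD data or how Voyant Tools works internally.

```javascript
// Count single-word frequencies across a list of technology/standard phrases.
// The phrases passed in are illustrative only, not the PROD data.
function wordFrequencies(phrases) {
  var counts = {};
  phrases.forEach(function (phrase) {
    // Lower-case and split on anything that is not a letter or digit
    phrase.toLowerCase().split(/[^a-z0-9]+/).forEach(function (word) {
      if (word) counts[word] = (counts[word] || 0) + 1;
    });
  });
  return counts;
}

// Return [word, count] pairs, most frequent first (ties broken alphabetically)
function topWords(counts, n) {
  return Object.keys(counts)
    .map(function (w) { return [w, counts[w]]; })
    .sort(function (a, b) { return b[1] - a[1] || a[0].localeCompare(b[0]); })
    .slice(0, n);
}

// Three different "... API" phrases each count once as phrases,
// but the single word "api" is counted three times.
var sample = ["YouTube API", "Flickr API", "RSS", "JSON", "Twitter API"];
var top = topWords(wordFrequencies(sample), 3);
```

Each ‘... API’ entry is a distinct phrase, so a phrase-based tag cloud spreads them out, while a word-based count adds them together.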


5 Comments

As part of the JISC OER Rapid Innovation Programme we’ve been experimenting with monitoring project blogs by gluing together some scripts in Google Spreadsheets. First there was Using Google Spreadsheets to dashboard project/course blog feeds #oerri which was extended to include social activity around blog posts.

As the programme comes to a close projects will soon be thinking about submitting their final reports. As part of this, projects agreed to submit a selection of their posts, with a pre-identified set of tags (shown below), as an MS Word document.

tag | structure
projectplan | detailed project plan, either in the post or as an attachment
aims | reminder of the objectives, benefits and deliverables of your project
usecase | link to / reproduce the use case you provided in your bid
nutshell | 1-2 paragraph description in accessible language, an image, a 140 character description [1 post per project]
outputs | update posts on outputs as they emerge, with full links/details so that people can access them
outputslist | end of project: complete list of outputs, refer back to #projectplan and note any changes [1 post per project]
lessonslearnt | towards the end of the project, a list of lessons that someone like you would find useful
impact | end of project: evidence of benefits and impact of your project and any news on next steps
grandfinale | this is the follow up to the nutshell post. a description in accessible language, and a 2 minute video [1 post per project]

 

OERRI Dashboard

When this was announced at the programme start-up, concerns were raised about the effort needed to extract some posts into a document rather than just providing links. As part of the original experimental dashboard, one thing I had in mind was to automatically detect the tag-specific posts and highlight which had been completed. Having got the individual post urls it hasn’t been too hard to throw a little more Google Apps Script at the problem to extract the content and wrap it in an MS Word document (well almost – if you have some html and switch the file extension to .doc it’ll open in MS Word). Here’s the code and template to do it:
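The .doc trick itself can be sketched roughly as follows. This is not the script from the template: the function names and the { tag, title, link, content } shape of a post are assumptions made purely for illustration.

```javascript
// Escape text so it can be placed safely inside HTML markup
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: wrap a list of tagged posts in a single HTML document.
// Saved with a .doc extension, MS Word will open HTML like this.
function buildReportHtml(projectName, posts) {
  var parts = ["<html><head><meta charset=\"utf-8\"><title>" +
               escapeHtml(projectName) + "</title></head><body>"];
  parts.push("<h1>" + escapeHtml(projectName) + "</h1>");
  posts.forEach(function (post) {
    parts.push("<h2>" + escapeHtml(post.tag) + ": " + escapeHtml(post.title) + "</h2>");
    parts.push("<p><a href=\"" + escapeHtml(post.link) + "\">" + escapeHtml(post.link) + "</a></p>");
    parts.push(post.content); // the post body is assumed to be HTML already (full-text feed)
  });
  parts.push("</body></html>");
  return parts.join("\n");
}
```

The returned string would then be written to a file whose name ends in .doc, one file per project.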

And here are the auto-generated reports for each project:

Project | Posts (est.) | PROD url | Generated Report url | Comments
Attribute images | 2 | http://prod.cetis.ac.uk/projects/attribute-image | | No tagged posts
bebop | 14 | http://prod.cetis.ac.uk/projects/bebop | Report Link |
Breaking Down Barriers | 10 | http://prod.cetis.ac.uk/projects/geoknowledge | Report Link |
CAMILOE | 1 | http://prod.cetis.ac.uk/projects/camiloe | | No tagged posts
Improving Accessibility to Mathematics | 15 | http://prod.cetis.ac.uk/projects/math-access | Report Link |
Linked data approaches to OERs | 15 | http://prod.cetis.ac.uk/projects/linked-data-for-oers | Report Link | Partial RSS Feed
Portfolio Commons | 10 | http://prod.cetis.ac.uk/projects/portfolio-commons | Report Link |
RedFeather | 18 | http://prod.cetis.ac.uk/projects/redfeather | Report Link |
RIDLR | 7 | http://prod.cetis.ac.uk/projects/ridlr | Report Link | Not WP
sharing paradata across widget stores | 10 | http://prod.cetis.ac.uk/projects/spaws | Report Link |
SPINDLE | 17 | http://prod.cetis.ac.uk/projects/spindle | Report Link |
SupOERGlue | 6 | http://prod.cetis.ac.uk/projects/supoerglue | Report Link | Not WP
synote mobile | 16 | http://prod.cetis.ac.uk/projects/synote-mobile | Report Link |
TRACK OER | 12 | http://prod.cetis.ac.uk/projects/track-oer | Report Link | Not WP
Xenith | 4 | http://prod.cetis.ac.uk/projects/xenith | Report Link |
Total | 157 | | |

Issues

I should say that these are not issues I have with the OERRI projects, but my own issues I need to solve to make this solution work in a variety of contexts.

  • Missing tags/categories – you’ll see the dashboard has a number of blanks. In some cases it’s not the project’s fault (as the majority of projects used WordPress installs it was easier to focus on these), but in other cases projects mix tags/categories or just forget to include them
  • Non-WordPress – 3 of the projects don’t use WordPress, so other ways to grab the content are required
  • RSS Summary instead of full feed – ‘Linked data approaches to OERs’ uses a summary in their RSS feed rather than full-text. As this script relies on a full text feed it can’t complete the report (one of my pet hates is RSS summary feeds – come on people, you’re supposed to be getting the word out, not putting up barriers.)

Hopefully it’s not a bad start and if nothing else maybe it’ll encourage projects to sort out their tagging. So what have I missed … questions welcomed.

4 Comments

Warning: Very techie post, the noob version is here: Google Spreadsheet Template for getting social activity around RSS feeds (HT @dajbelshaw)

In my last post announcing a Google Spreadsheet Template for getting social activity around RSS feeds I mentioned it was built from spare parts. It came out of on-going work at CETIS exploring activity data around CETIS blogs (including this one) and JISC funded projects. In this post I’ll highlight some of the current solutions we are using and how they were developed.

To put this work into context it aligns with my dabblings as a data scientist.

A data scientist … is someone who: wants to know what the question should be; embodies a combination of curiosity, data gathering skills, statistical and modelling expertise and strong communication skills. … The working environment for a data scientist should allow them to self-provision data, rather than having to rely on what is formally supported in the organisation, to enable them to be inquisitive and creative. [See this post by Adam Cooper for the origins of this definition]

It’s perhaps not surprising that the main tool I use for these explorations with data is Google Spreadsheets/Google Apps Script. The main affordances are a collaborative workspace which can store and manipulate tabular data using built-in and custom functionality. To illustrate this let me show how easy it is for me to go from A to B.

Example 1: OER Rapid Innovation Projects – Blog post social activity

I’ve already posted how we are Using Google Spreadsheets to dashboard project/course blog feeds #oerri. For this we have a matrix of project blog feeds and a set of predefined categories/tags. As projects make different posts a link to them appears on the dashboard.

OERRI Project Post Directory

As part of the programme completion JISC are piloting a new project survey which will include reporting the ‘reach of the papers/articles/newsitems/blogposts out of the project’. Out of curiosity I wanted to see what social share and other activity data could be automatically collected from project blogs. The result is the page shown below, which is also now part of the revised OERRI Project Post Directory Spreadsheet. The important bit is the columns on the right with social counts from 10 different social network services. So we can see a post by Xenith got 29 tweets, a post by bebop got 5 comments, etc.

Caveat: these counts might not include modified urls with campaign tracking etc …

OERRI Post Social Activity
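To make the caveat about campaign tracking concrete: a tracked link is the same post url with extra utm_ parameters on the end, so a share of it is counted against a different url. A minimal sketch of collapsing such variants back to one form is below. The helper name is made up and nothing like this is in the spreadsheet; it only illustrates what ‘modified urls’ means.

```javascript
// Hypothetical helper: remove utm_* campaign parameters from a URL so that
// tracked variants of a post link collapse to one canonical form.
function stripCampaignParams(url) {
  // Set any #fragment aside so it can be re-attached at the end
  var hashIndex = url.indexOf("#");
  var hash = hashIndex === -1 ? "" : url.slice(hashIndex);
  var withoutHash = hashIndex === -1 ? url : url.slice(0, hashIndex);

  var queryIndex = withoutHash.indexOf("?");
  if (queryIndex === -1) return url; // no query string, nothing to strip

  var base = withoutHash.slice(0, queryIndex);
  // Keep every key=value pair whose key does not start with utm_
  var kept = withoutHash.slice(queryIndex + 1).split("&").filter(function (pair) {
    return pair !== "" && !/^utm_/i.test(pair);
  });
  return base + (kept.length ? "?" + kept.join("&") : "") + hash;
}
```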

How it works

The three key aspects are:

  • Getting a single sheet of blog posts for a list of RSS feeds.
  • Getting social counts (Likes, Tweets, Saves) for a range of services.
  • Allowing column sorting on arrays of results returned by custom formula.

Getting a single sheet of blog posts
You can use a built-in formula like importFeed to do this, but you are limited to a maximum of 20 post items and you need to do some juggling to get all the results from 15 projects in one page. An alternative, which I quickly ruled out, was fetching the RSS XML using Google Apps Script and parsing the feed. The issue with this one is RSS feeds are usually limited to the last 10 posts and I wanted them all.

The solution was to use the Google Feed API (can’t remember if this is on Google’s cull list). The two big pluses from this are it can pull historic results (up to 100 items), and results can be returned in JSON which is easy to work with in Apps Script.

Here’s what the function looks like which is called as a custom formula in cell Data!A4 (formula used looks like =returnPostLinks(Dashboard!A3:B17,"postData")):

function returnPostLinks(anArray, cacheName){
  var output = [];
  var i, j; // declare loop counters so they don't leak into the global scope
  var cache = CacheService.getPublicCache(); // using Cache service to prevent too many urlfetch
  var cached = cache.get(cacheName);
  if (cached != null) { // if value in cache return it
    output = JSON.parse(cached); // JSON.parse/JSON.stringify used in place of Utilities.jsonParse/jsonStringify
    for (i in output){
      output[i][3] = new Date(output[i][3]); // need to recast the date
    }
    return output; // if cached values just return them
  }
  // else make a fetch
  var options = {"method" : "get"};
  try {
    for (i in anArray){ // for each feed url
      var project = anArray[i][0]; // extract the project name from 1st col
      if (project != ""){ // if it's not blank
        // encode the feed url so a query string of its own (e.g. ?feed=rss2) stays part of q
        var url = "https://ajax.googleapis.com/ajax/services/feed/load?v=1.0&num=100&q="+encodeURIComponent(anArray[i][1]);
        var response = UrlFetchApp.fetch(url , options);
        var item = JSON.parse(response.getContentText()).responseData.feed.entries; // navigate to returned items
        for (j in item){
          // for each item insert project name and pull item title, link, date, author and categories (other available are content snippet and content)
          var categories = item[j].categories || []; // guard against entries with no categories
          output.push([project,item[j].title,item[j].link,new Date(item[j].publishedDate),item[j].author,categories.join(", ")]);
        }
      }
    }
    // cache the result to prevent fetch for 1 day
    cache.put(cacheName, JSON.stringify(output), 86400); // cache set for 1 day
    return output; // return result array
  } catch(e) {
    //output.push(e);
    return output; // on any error return whatever was collected so far
  }
}

 

Getting social share counts

Collecting social share counts is something I’ve written about a couple of times so I won’t go into too much detail. The challenge in this project was to get results back from over 100 urls. I’m still not entirely sure if I’ve cracked it and I’m hoping more result caching helps. The issue is that because the custom formula used for this appears in each row of column H on the Data sheet, when the spreadsheet opens it tries to do 100 UrlFetches at the same time, which Apps Script doesn’t like (Err: Service has been invoked too many times). {I’m wondering if a way around this would be to set a random sleep interval, Utilities.sleep(Math.floor(Math.random()*10000)) … just tried it and it seems to work}
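The random sleep idea can be sketched as a small wrapper so each formula call waits a different amount of time before fetching. The names randomDelayMs and fetchWithJitter are made up for this sketch; only Utilities.sleep and the Math.random expression come from the note above.

```javascript
// Pick a random whole number of milliseconds in the range [0, maxMs).
// rng defaults to Math.random; it is a parameter only so it can be swapped out when testing.
function randomDelayMs(maxMs, rng) {
  var random = rng || Math.random;
  return Math.floor(random() * maxMs);
}

// Hypothetical wrapper: pause for a random interval, then run the fetch.
// Utilities.sleep only exists inside Apps Script, so it is guarded here.
function fetchWithJitter(fetchFn, maxMs) {
  var delay = randomDelayMs(maxMs);
  if (typeof Utilities !== "undefined") {
    Utilities.sleep(delay);
  }
  return fetchFn();
}
```

With maxMs set to 10000 the 100 or so calls are spread over roughly ten seconds instead of all firing at once; it makes hitting the limit less likely rather than impossible.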

Sorting columns when arrays of results are returned by custom formula

Sort order selection

When you use spreadsheet formulas that write results to multiple cells, sorting becomes a problem because the cells are generated using the CONTINUE formula, which evaluates a result from the initiating cell. When you sort, this reference is broken. The solution I use is to pull the data into a separate sheet (in this case Data) and then use the SORT formula in the ‘Post SN Count’ sheet to let the user choose different sort orders. The formula used in ‘Post SN Count’ to do this is in cell A4: =SORT(Data!A4:Q,D1,IF(D2="Ascending",TRUE,FALSE)). To try and prevent this being broken, an ‘Item from list’ data validation is used with ‘Allow invalid data’ unchecked.
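For anyone who would rather do the same thing in script than with the SORT formula above, a rough equivalent on a 2D array of rows might look like this. The function name is an assumption, and the column argument is 1-based to mirror the formula.

```javascript
// Return a sorted copy of a 2D array of rows, leaving the original untouched.
// columnNumber is 1-based, like the second argument of the SORT formula.
function sortRows(rows, columnNumber, ascending) {
  var index = columnNumber - 1;
  var direction = ascending ? 1 : -1;
  return rows.slice().sort(function (a, b) {
    if (a[index] < b[index]) return -1 * direction;
    if (a[index] > b[index]) return 1 * direction;
    return 0;
  });
}
```

Working on a copy is the script version of keeping the raw results on the Data sheet and only sorting the view.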

Time to hack together: functional in 2hrs, tweaks/cache fixes 4hrs

Example 2: JISC CETIS Blog Post Dashboard

Having got social share stats for other people’s feeds it made sense to do the same for the JISC CETIS blogs. At the same time, because I keep my blog on my own domain and Scott (Wilson) is on wordpress.com, it was also an opportunity to streamline our monthly reporting and put all our page views in one place.

Below is a screenshot of a version of the CETIS dashboard with the page views removed (the code is all there, I just haven’t authenticated it with our various Google Analytics accounts). With this version the user can enter a date range and get a filtered view of the posts published in that period. Sort and social counts make a re-appearance with the addition of an ‘Examine Post Activity’ button. With a row highlighted, clicking this gives individual post activity (shown in the second screenshot). The code for this part got recycled into my last post Google Spreadsheet Template for getting social activity around RSS feeds.

CETIS Post Dashboard

Individual post activity

Getting to B

To get to this end point there are some steps to pull the data together. First the ‘Sources’ sheet pulls all our blog urls from the CETIS contacts page using the importHtml formula. Next on the ‘FeedExtract’ sheet the array of blog urls is turned into a sheet of posts using the same code as in example 1. Social counts are then collected on the ‘Data’ sheet, which is read by the Dashboard sheet.

A detour via page views

Having some existing code to get stats from Google Analytics made it easy to pass a url to one of two GA accounts and get stats back. Because Scott is on wordpress.com another way was required. Fortunately Tony Hirst has already demonstrated a method to pull data from the WordPress Stats API into Google Spreadsheets. Following this method stats are imported to a ‘Scott’sStats’ sheet and tallied up with a code loop (WordPress stats come in per post, per day).

WordPress Stats API import

Here’s the custom formula code to return page views for a url:

function getCETISPageViews(url, startDate, endDate){
  // format dates; "Europe/London" is used rather than "BST" on the assumption that
  // Java-style time zone lookups read "BST" as Bangladesh, not British Summer Time
  var start = Utilities.formatDate(startDate, "Europe/London", "yyyy-MM-dd");
  var end = Utilities.formatDate(endDate, "Europe/London", "yyyy-MM-dd");
  // extract domain and path
  var matches = url.match(/^https?\:\/\/([^\/:?#]+)(?:[\/:?#]|$)/i);
  var domain = matches && matches[0];
  var path = url.replace(domain,"/");
  // switch for google analytic accounts
  var ids = false;
  if (domain == "http://mashe.hawksey.info/"){
    ids = ""; // ga-profile-id
  } else if (domain == "http://blogs.cetis.ac.uk/"){
    ids = ""; // ga-profile-id
  } else if (domain == "http://scottbw.wordpress.com/"){
    // code to compile stats imported to sheet
    // get all the values
    var doc = SpreadsheetApp.getActiveSpreadsheet();
    var sheet = doc.getSheetByName("Scott'sStats");
    var data = sheet.getRange(1, 4, sheet.getLastRow(), 2).getValues();
    var count = 0;
    // count on url match (one row per post, per day)
    for (var i in data){
      if (data[i][0] == url) count += data[i][1];
    }
    return parseInt(count, 10);
  }
  if (ids){
    // GA get data using sub function
    return parseInt(getGAPageViews(start, end, path, 1, ids), 10);
  }
}

Note: I use a custom Google Analytics function available in the spreadsheet script editor. There’s a new Analytics Service in Apps Script as an alternative method to connect to Google Analytics.

Time to hack together: functional in 3hrs, tweaks/fixes 1hr

Summary

So there you go: two examples of how you can quickly pull together different data sources to help record and report the social reach of blog posts. You’ll notice I’ve conveniently ignored whether social metrics are important, the dangers of measurement leading to gaming, and the low recorded social activity around OERRI projects. I look forward to your comments ;)

6 Comments

In the original JISC OER Rapid Innovation call one of the stipulations, due to the size and duration of the grants, was that the main reporting process is blog-based. Amber Thomas, who is the JISC Programme Manager for this strand and a keen blogger herself, has long been a supporter of projects adopting open practices, blogging progress as they go. Brian Kelly (UKOLN) also has an interest in this area, with some posts including Beyond Blogging as an Open Practice, What About Associated Open Usage Data?

For the OERRI projects the proposal discussed at the start-up meeting was that projects adopt a taxonomy of tags to indicate key posts (e.g. project plan, aims, outputs, nutshell etc.). For the final report projects would then compile all posts with specific tags and submit them as ms-word or pdf.

There are a number of advantages to this approach. One of them, for people like me anyway, is that it exposes machine readable data that can be used in a number of ways. In this post I’ll show how I’ve created a quick dashboard in Google Spreadsheets which takes a list of blog RSS feeds and filters for specific tags/categories. Whilst I’ve demonstrated this with the OERRI projects the same technique could be used in other scenarios, such as tracking student blogs. As part of this solution I’ll highlight some of the issues/affordances of different blogging platforms and introduce some future work to combine post content using a template structure.

OERRI Project Post Directory
Screenshot of OERRI post dashboard

The OERRI Project Post Directory

If you are not interested in how this spreadsheet was made and just want to grab a copy to use with your own set of projects/class blogs then just:

*** open the OERRI Project Post Directory ***
File > Make a copy if you want your own editable version

The link to the document above is the one I’ll be developing throughout the programme so feel free to bookmark the link to keep track of what the projects are doing.

The spreadsheet is structured so that the tags/categories the script uses to filter posts are in cells D2:L2 and urls are constructed from the values in columns O-Q. The basic technique being used here is building urls that look for specific posts and returning links (made pretty with some conditional formatting).

Blogging platforms used in OERRI

So how do we build a url to look for specific posts? With this technique it comes down to whether the blogging platform supports tag/category filtering, so let’s first look at the platforms being used in OERRI projects.

This chart (right) breaks down the blogging platforms. You’ll see that most (12 of 15) are using WordPress in two flavours: ‘shared’, indicating that the blog is also a personal or team blog containing other posts not related to OERRI, and ‘dedicated’, set up entirely for the project.

The 3 other platforms are 2 MEDEV blogs and the OU’s project on Cloudworks. I’m not familiar with the MEDEV platform and only know a bit about Cloudworks so for now I’m going to ignore these and concentrate on the WordPress blogs.

WordPress and Tag/Category Filtering

One of the benefits of WordPress is you can get an RSS feed for almost everything by adding /feed/ or ?feed=rss2 to urls (other platforms also support this; I have a vague recollection of doing something similar in Blogger(?)). For example, if you want a feed of all my Google Apps posts you can use http://mashe.hawksey.info/category/google-apps/feed/.

Even better is you can combine tags/categories with a ‘+’ operator so if you want a feed of all my Google Apps posts that are also categorised with Twitter you can use http://mashe.hawksey.info/category/google-apps+twitter/feed/.

So to get the Bebop ‘nutshell’ categorised post as a RSS item we can use: http://bebop.blogs.lincoln.ac.uk/category/nutshell/feed/

Looking at one of the shared wordpress blogs to get the ‘nutshell’ from RedFeather you can use: http://blogs.ecs.soton.ac.uk/oneshare/tag/redfeather+nutshell/feed/
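The url patterns above can be generated with a few lines of script. This is only a sketch: the function name and its arguments are made up for illustration, and it assumes the blog uses WordPress ‘pretty’ permalinks as in the examples.

```javascript
// Build a WordPress feed URL filtered by one or more categories or tags.
// taxonomy is "category" or "tag"; terms are joined with "+" so a post must have all of them.
function buildFilteredFeedUrl(blogUrl, taxonomy, terms) {
  var base = blogUrl.replace(/\/+$/, ""); // drop any trailing slash
  var joined = terms.map(function (term) {
    return encodeURIComponent(term);
  }).join("+");
  return base + "/" + taxonomy + "/" + joined + "/feed/";
}

// Reproduces the Bebop and RedFeather examples
var bebopNutshell = buildFilteredFeedUrl("http://bebop.blogs.lincoln.ac.uk/", "category", ["nutshell"]);
var redFeatherNutshell = buildFilteredFeedUrl("http://blogs.ecs.soton.ac.uk/oneshare", "tag", ["redfeather", "nutshell"]);
```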

Using Google Spreadsheet importFeed formula to get a post url

The ‘import’ functions in Google Spreadsheet must be my favourites and I know lots of social media professionals who use them to pull data into a spreadsheet and produce reports for clients from the data. With importFeed we can go and see if a blog post under a certain category exists and then return something back, in this case the post link. For my first iteration of this spreadsheet I used the formula below:

importFeed formula

This works well but one of the drawbacks of importFeed is we can only have a maximum of 50 of them in one spreadsheet. With 15 projects and 9 tags/categories (15 × 9 = 135 lookups) the maths doesn’t add up.

To get around this I switched to Google Apps Script (macros for Google Spreadsheets I write a lot about). This doesn’t have an importFeed function built-in but I can do a UrlFetch and Xml parse. Here’s the code which does this (included in the template):

Note this code also uses the Cache Service to improve performance and make sure I don’t go over my UrlFetch quota.

We can call this function like other spreadsheet formula using ‘=fetchUrlfromRSS(aUrl)’.
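As a rough idea of what a function like this has to do, here is a minimal sketch. It is not the code from the template: fetchUrlFromRssSketch and firstItemLink are hypothetical names, the link is pulled out with a regular expression rather than a proper XML parse to keep things short, and the one hour cache time is an arbitrary choice.

```javascript
// Pure helper: pull the <link> of the first <item> out of RSS text.
// A regular expression keeps the sketch short; a real XML parser is more robust.
function firstItemLink(rssText) {
  var item = /<item[\s>][\s\S]*?<\/item>/i.exec(rssText);
  if (!item) return ""; // no posts for this tag/category
  var link = /<link>\s*([\s\S]*?)\s*<\/link>/i.exec(item[0]);
  if (!link) return "";
  // Strip a CDATA wrapper if the feed uses one
  return link[1].replace(/^<!\[CDATA\[|\]\]>$/g, "").trim();
}

// Hypothetical stand-in for a custom formula; this part only runs inside Apps Script.
function fetchUrlFromRssSketch(aUrl) {
  var cache = CacheService.getScriptCache();
  var cached = cache.get(aUrl);
  if (cached !== null) return cached; // served from cache, no UrlFetch used
  var link = firstItemLink(UrlFetchApp.fetch(aUrl).getContentText());
  cache.put(aUrl, link, 3600); // keep for an hour (arbitrary)
  return link;
}
```

An empty string comes back when the filtered feed has no items, which is what shows up as a blank on the dashboard.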

Trouble at the tagging mill

So we have a problem getting data from non-WordPress blogs, which I’m quietly ignoring for now; the next problem is people not tagging/categorising posts correctly. For example, I can see Access to Math have 10 posts including a ‘nutshell’ but none of these are tagged. From a machine side there’s not much I can do about this but at least from the dashboard I can spot something isn’t right.

Tags for a template

I’m sure once projects are politely reminded to tag posts they’ll oblige. One incentive might be to say that if posts are tagged correctly then the code above could easily be extended to pull not just post links but the full post text, which could then be used to generate the project’s final submission.

Summary

So stay tuned to the OERRI Project Post Directory spreadsheet to see if I can incorporate MEDEV and Cloudworks feeds, and also if I can create a template for final posts. Given Brian’s post on usage data mentioned at the beginning should I also be tracking post activity data on social networks or is that a false metric?

I’m sure there was something else but it has entirely slipped my mind …

BTW here’s the OPML file for the RSS feeds of the blogs that are live (also visible here as a Google Reader bundle)

As the JISC OER Rapid Innovation projects have either started or will start very soon, mainly for my own benefit, I thought it would be useful to quickly summarise the technical choices and challenges.

Attribute Images - University of Nottingham

Building on the Xpert search engine which has a searchable index of over 250,000 open educational resources, Nottingham are planning a tool to embed CC license information into images.

The Attribute Images project will extend the Xpert Attribution service by creating a new tool that allows users to upload images, either from their computer or from the web and have a Creative Commons attribution statement embedded in the images. … It will provide an option for the user to upload the newly attributed images to Flickr through the Flickr API … In addition it will have an API allowing developers to make use of the service in other sites.

From the project’s first post, when they talk about ‘embedding’ CC statements it appears to be visible watermarking. It’ll be interesting to see if the project explores the Creative Commons recommended Adobe Extensible Metadata Platform (XMP) to embed license information into the image data. Something they might want to test is if the Flickr upload preserves this data when resizing. Creative Commons also have a range of tools to integrate license selection so it’ll be interesting to see if these are used or if there are compatibility issues.

Attribute Images Blog
Read more about Attribute Images on the JISC site

Bebop – University of Lincoln

Bebop is looking to help staff at Lincoln centralise personal resource creation activity from across platforms into a single stream.

This project will undertake research and development into the use of BuddyPress as an institutional academic profile management tool which aggregates teaching and learning resources as well as research outputs held on third-party websites into the individual’s BuddyPress profile. … This project will investigate and develop BuddyPress so as to integrate (‘consume’) third-party feeds and APIs into BuddyPress profiles and, furthermore, investigate the possibility of BuddyPress being used as a ‘producer application’ of data for re-publishing on other institutional websites and to third-party web services.

In a recent project post asking Where are the OERs? you can get an idea of the 3rd party APIs they will be looking at, which include Jorum/DSpace, YouTube, Slideshare etc. Talking to APIs isn’t a problem, after all that is what they are designed to do, and having developed plugins on WordPress/BuddyPress myself I know it is a great platform to work on. The main technical challenge is more likely to be doing this at scale and the variability in the type of data returned. It’ll also be interesting to see if Bebop can be built with flexibility in mind (creating its own APIs so that it can be used on other platforms) – it looks like the project is going down the route of aggregating RSS endpoints.

Bebop Blog
Read more about Bebop on the JISC site

Breaking Down Barriers: Building a GeoKnowledge Community with OER

The proposed project aims to Build a GeoKnowledge Community at Mimas by utilising existing technologies (DSpace) and services (Landmap/Jorum). The aim of the use case is to open-up 50% (8 courses) of the Learning Zone through Creative Commons (CC) Attribution Non-Commercial Share Alike (BY-NC-SA) license as agreed already with authors. A further aim is to transfer the hosting of the ELOGeo repository to Jorum from Nottingham (letter of support provided by University of Nottingham) and create a GeoKnowledge Community site embedded in Jorum using the DSpace API and linking the repository to the Landmap Learning Zone. … The technical solution in developing a specific community site within Jorum will be transferable to other communities that may have a similar requirement in the future.

I still don’t feel I have an entire handle on the technical side of this project, but it’s early days and already the project is producing a steady stream of posts on their blog. One for me to revisit.

Breaking Down Barriers Blog
Read more about Breaking Down Barriers on the JISC site

CAMILOE (Collation and Moderation of Intriguing Learning Objects in Education)

This project reclaims and updates 1800 quality assured evidence informed reviews of education research, guidance and practice that were produced and updated between 2003 and 2010 and which are now archived and difficult to access. … These resources were classified using a wide range of schemas including Dublin core, age range, teaching subject, resource type, English Teaching standard and topic area but are no longer searchable or browsable by these categories. … Advances in Open Educational Resources (OER) technologies provide an opportunity to make this resource useful again for the academics who created it. These tools include enhanced meta tagging schemas for journal documents, academic proofing tools, repositories for dissemination of OER resources, and open source software for journal moderation and para data concerning resource use.

So a lot of existing records to get into shape and put in something that makes them accessible again. Not only that, if you look at the project overview you can see usage statistics play an important part. CAMILOE is also one of the projects interested in depositing information into the UK Learning Registry node set up as part of the JLeRN Experiment.

Having dabbled with using Google Refine to get Jorum UKOER records into a different shape I wonder if the project will go down this route, or given the number and existing shape manually re-index them. I’d be very surprised if RSS or OAI-PMH didn’t make an appearance.

Read more about CAMILOE on the JISC site

Improving Accessibility to Mathematical Teaching Resources

Making digital mathematical documents fully accessible for visually impaired students is a major challenge to offer equal educational opportunities. … In this project we now want to turn our current program, that is the result of our research, into an assistive technology tool. … According to the identified requirements we will adapt and embed our tool into an existing open source solution for editing markup to allow post-processing of recognised and translated documents for correction and further editing. We will also add facilities to our tool to allow for suitable subject specific customisation by expert users. … In addition to working with accessibility support officers we also want to enable individual learners to employ the tool by making it available firstly via a web interface and finally for download under a Creative Commons License.

The project is building on their existing tool Maxtract, which turns mathematical formulae in pdf documents into other formats including full text descriptions, which are more screen reader friendly (a post with more info on how it works). So turning

example equation

into:

1 divided by square root of 2 pi integral sub R e to the power of minus x to the power of 2 slash 2 dx = 1 .
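Read back into notation, that description appears to correspond to the normalisation of the standard Gaussian. The LaTeX below is a reconstruction from the wording above rather than a copy of the project’s example, and it assumes the amssymb package for the \mathbb symbol.

```latex
\[
  \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} e^{-x^{2}/2}\, dx = 1
\]
```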

The other formats the tool already supports are PDF annotated with LaTeX and XHTML. The project is partnering with JISC TechDis to gather specific user requirements.

Improving Accessibility to Mathematics Blog
Read more about Improving Accessibility to Mathematics on the JISC site

Linked Data Approaches to OERs

This project extends MIT’s Exhibit tool to allow users to construct bundles of OERs and other online content around playback of online video. … This project takes a linked data approach to aggregation of OERS and other online content in order  to improve the ‘usefulness’ of online resources for education. The outcome will be an open-source application which uses linked data approaches to present a collection of pedagogically related resources, framed within a narrative created by either the teacher or the students. The ‘collections’ or ‘narratives’ created using the tool will be organised around playback of rich media, such as audio or video, and will be both flexible and scaleable.

MIT’s Exhibit tool, particularly the timeline aspect, was something I used in the OER Visualisation Project. The project has already produced some videos demonstrating a prototype that uses a timecode to control what is displayed (First prototype!, Prototype #2 and Prototype #2 (part two)). I’m still not entirely sure what ‘linked data approaches’ will be so it’ll be interesting to see how that shapes up.

Linked Data Approaches to OERs Blog
Read more about Linked Data Approaches to OERs on the JISC site <- not on the site yet

Portfolio Commons

… seeks to provide free and open source software tools that can easily integrate open educational practices (the creation, use and sharing of OERs) into the daily routines of learners and teachers … This project proposes to create a free open source plugin for Mahara that will enable a user to select content from their Mahara Portfolio, licence it with a Creative Commons licence of their choosing, create metadata and make a deposit directly into their chosen repositories using the SWORD protocol

The SWORD Protocol, which was developed with JISC funding, has a healthy ecosystem of compliant repositories, clients and code libraries, so the technical challenge on that part is getting it wired up as a plugin for Mahara. Creative Commons also have a range of tools to integrate licence selection into web applications. It’ll be interesting to see if these are used.

When I met the project manager, John Casey, in London recently I also mentioned that, given the arts background of this project, it might be worth scoping whether integrating with the Flickr API would be useful. Given that the Attribute Images project mentioned above is looking at this area, the ideal scenario might be to link the Mahara plugin to an Attribute Images API, but timings might prevent that.

Read more about Portfolio Commons on the JISC site

Rapid Innovation Dynamic Learning Maps-Learning Registry (RIDLR)

Newcastle University’s Dynamic Learning Maps system (developed with JISC funding) is now embedded in the MBBS curriculum, and now being taken up in Geography and other subject areas … In RIDLR we will test the release of contextually rich paradata via the JLeRN Experiment to the Learning Registry and harvest back paradata about prescribed and additional personally collected resources used within and to augment the MBBS curriculum, to enhance the experience of teachers and learners. We will develop open APIs to harvest and release paradata on OER from end-users (bookmarks, tags, comments, ratings and reviews etc) from the Learning Registry and other sources for specific topics, within the context of curriculum and personal maps.

The technical challenge here is getting data into and out of the Learning Registry; it’ll be interesting to see what APIs they come up with. It’ll also be interesting to see what data they can get and if it’s usable within Dynamic Learning Maps. More information including a use case for this project has been posted here.

RIDLR and SupOERGlue Blog
Read more about RIDLR on the JISC site

RedFeather (Resource Exhibition and Discovery)

RedFeather (Resource Exhibition and Discovery) is a proposed lightweight repository server-side script that fosters best practice for OER, it can be dropped into any website with PHP, and which enables appropriate metadata to be assigned to resources, creates views in multiple formats (including HTML with in-browser previews, RSS and JSON), and provides instant tools to submit to Xpert and Jorum, or migrate to full repository platforms via SWORD.

The above quote nicely summarises the technical headlines. In a recent blog post the team illustrate how RedFeather might be used in a couple of use cases. The core component appears to be creating a single file (coded in PHP, which is a server-side scripting language) and transferring files/resources to a web server. It’ll be interesting to see if the project explores different deployments, for example packaging RedFeather on a portable web server (a server on a USB stick), deploying on ScraperWiki (a place in the cloud where you can execute PHP), or looking at how other cloud/3rd party services could be used.

Update: I forgot to mention the OERPubAPI, which is built on SWORD v2. The interesting part that I'm watching closely is whether this API will provide a means to publish to non-SWORD repositories like YouTube, Flickr and Slideshare.

RedFeather Blog
Read more about RedFeather on the JISC site

Sharing Paradata Across Widget Stores (SPAWS)

We will use the Learning Registry infrastructure to share paradata about Widgets across multiple Widget Stores, improving the information available to users for selecting widgets and improving discovery by pooling usage information across stores.

For more detail on what paradata will be included the SPAWS nutshell post says:

each time a user visits a store and writes a review about a particular widget/gadget, or rates it, or embeds it, that information can potentially be syndicated to other stores in the network

There’s not much for me to add about the technical side of this project as Scott has already posted a technical overview and gone into more detail about the infrastructure and some initial code.
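To make the idea of pooling usage information a little more concrete, here is a minimal Python sketch. It is not the SPAWS implementation: the record layout (actor, verb, object) is only loosely modelled on activity-style paradata, and the field names, store identifiers and widget URI are all hypothetical.

```python
from statistics import mean


def make_rating_record(store_id, widget_uri, rating):
    """Build an illustrative paradata-style record for one rating.

    The field names here are assumptions for the sketch, not a schema
    taken from the Learning Registry or SPAWS documentation.
    """
    return {
        "actor": {"objectType": "store", "id": store_id},
        "verb": {"action": "rated", "measure": {"value": rating}},
        "object": widget_uri,
    }


def pooled_rating(records, widget_uri):
    """Average the ratings for one widget across every contributing store."""
    values = [
        record["verb"]["measure"]["value"]
        for record in records
        if record["object"] == widget_uri and record["verb"]["action"] == "rated"
    ]
    return mean(values) if values else None


# Two hypothetical stores rate the same widget; a third record is unrelated.
WIDGET = "http://widgets.example/clock"
records = [
    make_rating_record("store-a", WIDGET, 4),
    make_rating_record("store-b", WIDGET, 2),
    make_rating_record("store-b", "http://widgets.example/other", 5),
]
pooled = pooled_rating(records, WIDGET)
```

A store receiving syndicated records like these could then show the pooled figure alongside its own local rating.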

SPAWS Blog
Read more about SPAWS on the JISC site

SPINDLE: Increasing OER discoverability by improved keyword metadata via automatic speech to text transcription

SPINDLE will create linguistic analysis tools to filter uncommon spoken words from the automatically generated word-level transcriptions that will be obtained using Large Vocabulary Continuous Speech Recognition (LVCSR) software. SPINDLE will use this analysis to generate a keyword corpus for enriching metadata, and to provide scope for indexing inside rich media content using HTML5.

Enhancing the discoverability of audio/media is something I’m very familiar with, having used tweets to index videos. My enthusiasm for this area took a knock when I discovered Mike Wald’s Synote system, which uses IBM’s ViaScribe to extract annotations from video/audio. There’s a lot of overlap between Synote and SPINDLE, which is why it was good to see them talking to each other at the programme start-up meeting. As far as I’m aware JISC funding for Synote ended in 2009 (although it has just been funded again for a mobile version), so now is a good time to look at how open source LVCSR software can be used in a scenario where accuracy for accessibility as an assistive technology is being replaced by best guess to improve accessibility in terms of discoverability.

On the technical side it will be interesting to see if SPINDLE looks at WebVTT, which seems to be finding its way at the W3C and does include an option for metadata (the issue might be that the ‘V’ in WebVTT stands for video). Something that I hope doesn’t put SPINDLE off looking at WebVTT is the lack of native browser support (although it is on the way). There are some JavaScript libraries you can use to handle WebVTT. It’ll also be interesting to see if there is a chance to compare (or highlight existing research comparing) an open source offering like Sphinx with a commercial one (e.g. ViaScribe).
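As a rough illustration of what a WebVTT file looks like, the Python sketch below turns timed transcript segments into WebVTT text. It is not SPINDLE's code: the function names and the sample cue are made up for the example, and it only covers the basic header-plus-cues layout of the format.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def to_webvtt(cues):
    """Serialise (start_seconds, end_seconds, text) tuples as WebVTT.

    The file starts with the WEBVTT header line, and each cue is a
    timing line followed by its text and a blank line.
    """
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)


# Hypothetical segment from an automatically generated transcript.
vtt_text = to_webvtt([(0, 2.5, "open educational resources")])
```

The same cue structure could carry keywords instead of caption text, which is roughly where the metadata option mentioned above comes in.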

SPINDLE Blog
Read more about SPINDLE on the JISC site

SupOERGlue

SupOERGlue will pilot the integration of OER Glue with Newcastle University’s Dynamic Learning Maps, enabling easy content creation and aggregation from within the learning and teaching support environments, related to specific topics. … Partnering with Tatemae to use OER Glue, which harvests OER from around the world and has developed innovative ways for academics and learners to aggregate customised learning packages constructed of different OER, will enable staff and students to create their own personalised resource mashups which are directly related to specific topics in the curriculum.

Tatemae have a track record of working with open educational resources and courseware, including developing OER Glue. There’s not a huge amount for me to say on the technical side. I did notice that OER Glue currently only works in the Google Chrome web browser. Having worked in a number of institutions where installing extra software is a chore, it’ll be interesting to see if this causes a problem. More information including a use case for this project has been posted here.

Update: Related to the RedFeather update, I’m wondering if SupOERGlue will be looking at OERPub (“An architecture for remixable Open Educational Resources (OER)”) as a framework to republish OER.

RIDLR and SupOERGlue Blog
Read more about SupOERGlue on the JISC site

Synote Mobile

Synote Mobile will meet the important user need to make web-based OER recordings easier to access, search, manage, and exploit for learners, teachers and others. …This project will create a new mobile HTML5 version of Synote able to replay Synote recordings on any student’s mobile device capable of connecting to the Internet. The use of HTML5 will overcome the need to develop multiple device-specific applications. The original version of Synote displays the recording, transcript, notes and slide images in four different panels which uses too much screen area for a small mobile device. Synote Mobile will therefore be designed to display captions and notes and images simultaneously ‘over’ the video. Where necessary existing Synote recordings will be converted into an appropriate format to be played by the HTML5 player. Success will be demonstrated by tests and student evaluations using Synote recordings on their mobile devices.

I’ve already mentioned Synote in relation to SPINDLE. Even though it’s early, the project is already documenting a number of its technical challenges. This includes reference to LongTail’s State of HTML5 Video report and a related post on Salt Websites. The latter references WebVTT and highlights some libraries that can be used. Use of JavaScript libraries gets around the lack of <track> support in browsers, but as the LongTail State of HTML5 Video report states:

The element [<track>] is brand new, but every browser vendor is working hard to support it. This is especially important for mobile, since developers cannot use JavaScript to manually draw captions over a video element there.

The report goes on to say:

Note the HTML5 specification defines an alternative approaches to loading captions. It leverages video files with embedded text tracks. iOS supports this today (without API support), but no other browser has yet committed to implement this mechanism. Embedded text tracks are easier to deploy, but harder to edit and make available for search.

Interesting times for Synote Mobile and potentially an opportunity for the sector to learn a lot of lessons about creating accessible mobile video.

Synote Mobile Blog
Read more about Synote Mobile on the JISC site

Track OER

The project aims to look at two ways to reduce tensions between keeping OER in one place and OER spreading and transferring. If we can find out more about where OER is being used then we can continue to gather the information that is needed and help exploit the openness of OER. … The action of the project will be to develop software that can help track open educational resources. The software will be generic in nature and build from existing work developed by BCCampus and MIT, however a key step in this project is to provide an instantiation of the tracking on the Open University’s OpenLearn platform. … The solution will build on earlier work, notably by OLnet fellow Scott Leslie (BCCampus) and JISC project CaPRéT led by Brandon Muramatsu (MIT project partner in B2S).

Talking to Patrick McAndrew, who is leading this project, at the programme start-up meeting, I learned that part one of the solution is to include a unique Creative Commons License icon hosted on OU servers, which leaves a trace whenever it is requested by a page reusing some content (option 3 in the suggested solutions here). This technique is well established and one I first came across when using the ClustrMaps service, which uses a map of your website visitors as a hit counter (ClustrMaps was developed by Marc Eisenstadt, Emeritus Professor at the Open University – small world ;). It looks like Piwik, an open source alternative to Google Analytics, is going to be used to handle/dashboard the web analytics. The second solution is extending the CETIS-funded CaPRéT, developed by Brandon Muramatsu & Co. at MIT, which uses JavaScript to track when a user copies and pastes some text. It’ll be interesting to see if Track OER can port the CaPRéT backend to Piwik (BTW Pat Lockley has posted how to do OER Copy tracking using Google Analytics, which uses similar techniques).
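The hosted-icon idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the Track OER code: it assumes the server logs the request path and the HTTP Referer header for each hit, and the icon path, host names and log entries are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def count_reuse_hosts(requests, icon_path, home_host):
    """Count icon requests per referring host, ignoring the home site.

    `requests` is an iterable of (path, referer) pairs, e.g. parsed
    from a web server access log. Requests with no Referer header
    cannot be attributed to a page and are skipped.
    """
    hosts = Counter()
    for path, referer in requests:
        if path != icon_path or not referer:
            continue
        host = urlparse(referer).netloc.lower()
        if host and host != home_host:
            hosts[host] += 1
    return hosts


# Hypothetical log entries: (requested path, Referer header).
log = [
    ("/cc-icon.png", "http://example.org/course/1"),
    ("/cc-icon.png", "http://example.org/course/2"),
    ("/cc-icon.png", "http://openlearn.example/unit"),
    ("/other.png", "http://blog.example/post"),
    ("/cc-icon.png", ""),
]
reuse = count_reuse_hosts(log, "/cc-icon.png", "openlearn.example")
```

In practice an analytics package would do this aggregation; the sketch just shows why a remotely hosted icon is enough to reveal where a resource has been reused.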

Track OER Blog
Read more about Track OER on the JISC site

Xerte Experience Now Improved: Targeting HTML5 (XENITH)

Xerte Online Toolkits is a suite of tools in widespread use by teaching staff to create interactive learning materials. This project will develop the functionality for Xerte Online Toolkits to deliver content as HTML5. Xerte Online Toolkits creates and stores content as XML, and uses the Flash Player to present content. There is an increasing need for Xerte Online Toolkits to accommodate a wider range of delivery devices and platforms.

Here’s a page with more information about Xerte Online Toolkits, here’s an example toolkit and the source XML used to render it (view source). I haven’t seen the detail for the XENITH project, but something I initially thought about was whether they would use XSLT (Extensible Stylesheet Language Transformations), although I wondered if this would be a huge headache when converting their Flash player. Another possible solution I recently came across is Jangaroo:

Jangaroo is an Open Source project building developer tools that adopt the power of ActionScript 3 to create high-quality JavaScript frameworks and applications. Jangaroo is released under the Apache License, Version 2.0.

This includes the option to “let your existing ActionScript 3 application run in the browser without a Flash plugin”. It’ll be interesting to see the solution the project implements.

XENITH Blog
Read more about XENITH on the JISC site

BTW here’s the OPML file for the RSS feeds of the blogs that are live (also visible here as a Google Reader bundle)

So which of these projects interests you the most? If you are on one of the projects do my technical highlights look right or have I missed something important?


The JISC OER Rapid Innovation projects are all quickly finding their feet and most are already fully embracing the open innovation model and blogging their progress. Having attended the programme start-up meeting on the 26th March 2012 and spoken to most of the projects, there are rich pickings for me to blog about over the next couple of months.

In our role (JISC CETIS) supporting this programme we’ve already dusted the programme with some of our wizardry. Phil Barker has aggregated all of the registered project RSS feeds into a single stream using Yahoo Pipes and I’ve bundled an OPML file of registered feeds (if you are a Google Reader user you can subscribe directly here). Note: not all the projects have provided feeds yet. I’ve also started an archive of the #oerri tweets, which is looking sparse now but will grow over time.

Wordle: OERRI Feed

Something I was interested in trying out was to see if there was a way to dynamically create a word cloud from an RSS feed. Wordle.net does have an option to generate a word cloud from a blog feed (shown here), but it looks like it’s a static image, i.e. it won’t update as new project blog posts are created.

So I turned my attention to Jason Davies and his Cloud extension to the D3 JavaScript library. Jason has a demonstration site which lets you experiment with wordcloud outputs using data from Twitter and Wikipedia. Here’s an example for the Twitter search term jisccetis (clicking on a word starts a new search for that term).

OER RI posts straight from Yahoo Pipe

There is also an option on Jason’s site to use a ‘custom’ URL. This seems to accept a range of sources: HTML pages, RSS feeds and JSON. You can just use the RSS output from Phil’s pipe to get this. This however looks a bit suspect to me. For example, the word ‘rapid’ appears in the cloud, but ‘innovation’ doesn’t, even though there are just as many occurrences of it in the source text. What I think is happening is the script is picking up the first 250 words and then counting the occurrences of those words. I haven’t had time to test that theory, but if anyone else does, leave a comment and I’ll update the post.
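One way to picture that theory is the Python sketch below. It is not Jason's code and says nothing about what his script really does; it simply contrasts counting every word with one reading of the hypothesis, where the vocabulary is fixed by the first N words of the text. The function names and sample text are invented for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_all(text):
    """Count every word in the whole text."""
    return Counter(tokenize(text))


def count_first_n_vocabulary(text, n=250):
    """Count only words that already appear among the first n tokens.

    This mimics the hypothesised behaviour: words that first occur
    after the cut-off never reach the cloud, however frequent they are.
    """
    words = tokenize(text)
    vocabulary = set(words[:n])
    return Counter(word for word in words if word in vocabulary)


# Tiny example with a cut-off of 5 tokens instead of 250.
sample = "rapid rapid rapid filler filler innovation innovation innovation"
full_counts = count_all(sample)
truncated_counts = count_first_n_vocabulary(sample, n=5)
```

Here 'innovation' is as frequent as 'rapid' in the full count but is missing from the truncated one, which is the kind of discrepancy described above.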

Instead I tried a workaround using Yahoo Pipes Term Extract. With this Pipe I take Phil’s Pipe as a source and extract terms for each blog post. I can then output this as JSON and use it as a data source for Jason’s cloud generator, creating a wordcloud that will update as more posts are published (although I’ve got no way of embedding it yet):

OER RI Posts using term extract
Dynamic cloud of OER-RI Posts using term extract
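The shape of that workaround (posts in, terms out, JSON for the cloud) can be sketched in Python as below. This is a stand-in, not the Pipe itself: the term extraction is a naive stop-word filter rather than Yahoo's Term Extract service, and the JSON layout with "text" and "size" keys is an assumption, not a format confirmed for Jason's generator.

```python
import json
import re
from collections import Counter

# A deliberately tiny stop-word list, just for the illustration.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from", "are", "was", "have"}


def extract_terms(post_text, min_length=4):
    """Naive stand-in for term extraction: keep longer, non-stop words."""
    words = re.findall(r"[a-z][a-z0-9-]+", post_text.lower())
    return [w for w in words if len(w) >= min_length and w not in STOPWORDS]


def terms_as_json(posts):
    """Extract terms from every post and serialise the counts as JSON."""
    counts = Counter()
    for post in posts:
        counts.update(extract_terms(post))
    ranked = [{"text": term, "size": n} for term, n in counts.most_common()]
    return json.dumps(ranked)


# Hypothetical post texts standing in for the aggregated feed items.
cloud_json = terms_as_json([
    "Paradata and the Learning Registry",
    "More paradata from widget stores",
])
```

Because the terms are extracted per post before counting, every post contributes to the totals, which avoids the cut-off problem suspected in the direct RSS approach.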

Visual inspection would suggest that this version is more reliable. There are however some things to remember:
