JISC Managing Research Data programme launch - Day 1

On December 1-2, 2011, the launch workshop for the JISC Managing Research Data programme 2011-2013 took place at the National College for School Leadership, on the University of Nottingham’s Jubilee Campus. Each project was invited to send two delegates. Chris Cannam and I took part in the event, which was a good opportunity to meet people from other projects (mainly DataFlow) with longer experience in data management.

After a nice lunch, the workshop started with a presentation of the research programme by Simon Hodson, the programme manager. Key points in his presentation were:

  • The necessity of gathering evidence of the importance and the benefits of research data management. The results of this information gathering will have to be published through each project's blog. Although six-month projects (Strand B) such as ours are not formally required to produce any evidence because of the lack of time, we are encouraged to do so. More on evidence gathering in the post about Day 2, in which I will talk about the evidence gathering exercise we performed under the guidance of the "information gatherers".
  • The time to obtain results and show the importance of data management is now: there will be less and less money in the future for such projects.
  • Two calls in this same programme will be advertised at the beginning of 2012, one about data management training, and one about open data publication.
  • There will be a workshop in March 2012, where projects are welcome to present demos of their work. Another small workshop about DataCite is planned. The programme also has resources to support small workshops on specific themes, if there is enough interest.
  • The launch workshop aims at encouraging communication and collaboration among projects.

Simon Hodson concluded the presentation with some "homework" for the projects:

  • Blog about evidence of the benefits of data management.
  • Blog about commonalities and possible collaborations among projects.

The second speaker was Brian Kelly from UKOLN, runner-up in the IT professional blogger award. The title of his talk was "Blogging Practices To Support Project Work", a relevant topic given that projects are supposed to report on their work through their blogs. A summary of his presentation, including slides, can be found here and here. From my point of view, the important points in the presentation/exercise were:

  • Identify the reasons for having a blog. These might include the fact that it is in the project's contract, the wish to tell a story, or the wish to disseminate knowledge and foster discussion through the comments. In order to involve the project's team in blogging, let them know WHY you have a blog.
  • Define the purpose and scope of the blog, create a page about it and make it visible from anywhere. The purpose of the blog and the purpose (i.e. goals) of the project should not be mixed up.
  • Identify who in the team is going to post on the blog, and what their writing style is (e.g. research papers, code, blogs). Identify the good bloggers, provide opportunities for the reluctant bloggers, and maybe invite guest bloggers to post on your blog.
  • Identify what can go wrong, and be prepared. The main problems are usually technical ones (e.g. servers down) and spam in the comments. To address the latter, use spam filters and, for example, moderation (although registration might be a barrier to comments).
  • How to measure the success of a blog? It depends on its intended goals, as stated in the policies. A good tip is to register the blog on Technorati and Ebuzzing (formerly Wikio) to track the activity, and perhaps create a community through programme tags.
  • To increase the number of regular readers, enhance interoperability (desktop computers and mobile devices). For example, don't use embedded content that might not work on mobile devices, and use optimised themes.
  • About Twitter: use bit.ly, which gives stats and is useful to track the success of a tweet; and hashtags, which help with citations.
  • Engage the commenters by replying to the comments (have alerts for new comments).
  • Very important: plan and manage the closure of the blog (i.e. at the end of the project), so that the content doesn't get lost!

The audience then split into two groups: one attended a presentation by the Digital Curation Centre (DCC) about the tools they offer for data management, and the other a presentation of the results of the UMF Research programme. I personally went to the UMF presentation because the DataFlow project was being introduced, and also because I had already learnt about the DCC at the Cambridge Roadshow (see my previous blog post). A summary of the DCC session can be found on the Orbital project's blog.

Without going into too much detail, I will try to summarise the UMF session.

  • JANET is going to offer cloud services to support research through the new JANET brokerage service.
  • Eduserv: a pilot project in collaboration with JANET to provide IaaS for HE and FE institutes and support long-term data storage in the UK. A data centre has been built in Swindon, and a second one is planned.
  • BRISSkit project (University of Leicester): aims at creating a biomedical research infrastructure that should avoid duplication in data collection, study design and ethical matters, and bridge research data with real world data (i.e. NHS).
  • DataFlow (University of Oxford): a two-tier system that aims at making data management simple and integrated into a researcher's workflow. The first part of the system is called DataStage (formerly ADMIRAL), a virtual space where researchers can freely store any kind of data. They can share it with colleagues or keep it private, and it works like a virtual hard drive (with access through SMB, WebDAV, FTP and so on). It is then possible to select important data and create a dataset that is automatically added to DataBank, the second part of the system. This is a dataset repository that supports rich metadata (RDF), can export through the SWORD protocol, offers different retention and access policies, and has many other features.
  • Smart Research Framework (University of Southampton): a set of tools (LabTrove, blog3) to facilitate data management in the lab environment (chemistry in the case of Southampton). The idea is to have a digital lab notebook as a blog, with linked data, versioning and so on.
  • VIDaaS (University of Oxford): Database as a Service (DaaS) for researchers who lack the time or the experience to work with database technology.
  • YouShare (University of York): an online platform where any research software can be installed and quickly deployed, and offered as SaaS. There are already around 65 applications running. It also offers a data upload facility.

The last item on the agenda for day 1 was a poster session in which the projects got to know each other and researchers had a chance to discuss possible commonalities and collaborations. Our poster can be downloaded from here. During the poster session we saw a demo of the DataFlow system, and talked to Prof. David Shotton, Katherine Fletcher, and Bhavana Ananda about testing it at C4DM. Chris also wanted to understand whether it would be possible to integrate DataStage into our authentication system. We agreed that we would be given access to the source code so that we can study it and try it out. Katherine pointed out that the system is not easy to install at the moment, but that it is quite stable once up and running. They are planning to release a beta version as a Debian package by Christmas.

The day closed with a very nice dinner at the conference centre, and a couple of pints at the Johnson Arms.