JISCMRD Workshop: "Meeting (Disciplinary) Challenges in Research Data Management Planning"

As our project officially came to an end on 31 March (although we will continue until the end of May on an unfunded extension), we participated, together with the other projects in our strand, in the JISCMRD Workshop: "Meeting (Disciplinary) Challenges in Research Data Management Planning". The workshop was designed to allow projects to share their final findings and experiences with other projects and with JISC. It took place on 23 March at Etc Venues Paddington and was a very good opportunity to understand what similar projects have been working on over the past six months. Although projects shared their developments on their respective blogs, listening to summary presentations of the results was extremely useful for identifying similarities between other projects and ours that we might previously have missed. In fact, I realised that some of the observations I made in a previous post on our approach to RDM also apply to other projects.

DSpace access control: public and private data

One of the required features of the data management system for C4DM is the ability to hide certain datasets from public access (for copyright or other reasons) while leaving them accessible to C4DM staff. This was shown in the schematic overview in Figure 1 in the post about the online user survey. In the following post, I will discuss a few possible solutions.

Authenticating DSpace Users

The DSpace documentation lists several options for authenticating users: basic password access; Shibboleth; LDAP; IP Address; and X509. These can be combined in an authentication “stack”, in which the configured methods are tried in turn until one succeeds.
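The idea of a stack can be illustrated with a minimal Python sketch. This is not DSpace code: the two login functions, the in-memory account tables and the `Credentials` type are hypothetical stand-ins, and the only point is the control flow, where each method is tried in order and the first success wins.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Credentials:
    username: str
    password: str


# An authentication method returns a user identifier on success, or None.
AuthMethod = Callable[[Credentials], Optional[str]]


def eecs_login(creds: Credentials) -> Optional[str]:
    """Hypothetical stand-in for a departmental (e.g. LDAP) check."""
    directory = {"alice": "eecs-secret"}  # illustrative data only
    if directory.get(creds.username) == creds.password:
        return f"eecs:{creds.username}"
    return None


def local_password_login(creds: Credentials) -> Optional[str]:
    """Hypothetical stand-in for accounts created by an administrator."""
    accounts = {"collaborator": "local-secret"}  # illustrative data only
    if accounts.get(creds.username) == creds.password:
        return f"local:{creds.username}"
    return None


def authenticate(creds: Credentials, stack: Sequence[AuthMethod]) -> Optional[str]:
    """Try each method in order and stop at the first one that succeeds."""
    for method in stack:
        user = method(creds)
        if user is not None:
            return user
    return None  # no method accepted the credentials


# The order of the stack decides which method is consulted first.
STACK = (eecs_login, local_password_login)
```

With this ordering, a departmental account is checked before a locally created one, and a request that no method accepts is simply treated as unauthenticated.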

Our authentication stack allows four levels of access:

  • anonymous – users who don't log in to the system, but can browse the communities and collections and download any publicly available data on the system;
  • eecs – users from the School of Electronic Engineering and Computer Science (EECS) can log in with their EECS credentials for access to a wider range of data;
  • users added by administrators – this gives external users (e.g. research collaborators) additional rights within the repository;
  • administrators – to set up communities and collections and to manage users when required.

In addition, users on the QMUL network are automatically placed in a “QMUL” group in DSpace based on their IP address. This allows “intranet” type material to be controlled in the system.
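As a rough illustration of IP-based group membership, the sketch below places a request in a “QMUL” group when its address falls inside a configured network. It is not DSpace configuration: the network range is a documentation placeholder rather than a real QMUL range, and `groups_for` and `can_read` are hypothetical helper names.

```python
import ipaddress

# Placeholder range from the documentation address space, NOT a real QMUL network.
QMUL_NETWORKS = (ipaddress.ip_network("192.0.2.0/24"),)


def groups_for(remote_addr: str, login_groups=()) -> set:
    """Return the groups a request belongs to, given its IP address
    and any groups obtained by logging in."""
    groups = {"Anonymous"}
    groups.update(login_groups)
    addr = ipaddress.ip_address(remote_addr)
    if any(addr in network for network in QMUL_NETWORKS):
        groups.add("QMUL")
    return groups


def can_read(item_groups, user_groups) -> bool:
    """An item is readable if the user shares at least one group with it."""
    return bool(set(item_groups) & set(user_groups))
```

Under these assumptions, an item restricted to the “QMUL” group is readable from an address inside the placeholder range without logging in, while a request from outside it only holds the “Anonymous” group.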

DSpace test repository: first user feedback

One of the main goals of our project is to test a pilot dataset repository for the Centre for Digital Music (C4DM). After surveying the user requirements and selecting DSpace out of several software options, we installed it and started customising it. Once we felt the system was ready for some user testing, we selected five "power users" from those who were interviewed or answered the online questionnaire and who have some datasets ready to be published, and asked them to submit their data. Of the five users, three (A, B, and C) have given us detailed feedback so far.

A bottom-up approach to research data management

During the past five and a half months we went from being completely unaware of data management and data curation to having a draft data management policy document and a test data repository for the Centre for Digital Music. While participating in the programme launch at the beginning of December, I had the impression that our project was a bit of an outsider compared to other teams, which clearly had much longer experience in the field. This, I think, allowed us to have a slightly different point of view on the subject.

DSpace: metadata schemas and data submission

During the past month we have been busy customising DSpace to suit the publication needs of the Centre for Digital Music. In this post, I will talk about our approach to dataset descriptive metadata and the online submission process.

Easily create DSpace submissions with many files

One of the limitations of DSpace pointed out by our test users is the way the web interface manages the upload of bitstreams (i.e. files) during the submission process. We were aware of this from the beginning, and tried to find alternative solutions.
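One possible route for submissions with many files is DSpace's batch import, which reads items laid out in its Simple Archive Format: one directory per item, holding a `dublin_core.xml` metadata file, a `contents` file listing the bitstreams, and the files themselves. The sketch below builds such a directory in Python. It is only an illustration, not necessarily the solution the post describes, and `build_saf_item` is a hypothetical helper name.

```python
import shutil
from pathlib import Path
from xml.etree import ElementTree as ET


def build_saf_item(item_dir: Path, title: str, files) -> None:
    """Write one Simple Archive Format item directory:
    dublin_core.xml, a contents manifest, and copies of the files."""
    item_dir.mkdir(parents=True, exist_ok=True)

    # Minimal descriptive metadata: a single Dublin Core title.
    root = ET.Element("dublin_core")
    dcvalue = ET.SubElement(root, "dcvalue", element="title", qualifier="none")
    dcvalue.text = title
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    # Copy each file next to the metadata and list it in the manifest,
    # one bitstream file name per line.
    names = []
    for source in files:
        source = Path(source)
        shutil.copy(source, item_dir / source.name)
        names.append(source.name)
    (item_dir / "contents").write_text("\n".join(names) + "\n", encoding="utf-8")
```

A directory of such items can then be loaded with DSpace's batch import tool instead of uploading each file through the web form.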

DSpace for Digital Music Research Data Management

DSpace is a free, open-source data management system originally created by MIT and Hewlett Packard. It is currently developed under the auspices of Duraspace, a not-for-profit set up to support the Fedora Commons repository framework. There is an active development community, which both updates the core application and creates application-specific changes: for example, DSpace is the basis for the Dryad project for research data management in the biosciences.

DSpace provides a fully functional repository system. It is very widely used with over 1000 live instances listed online. As DSpace provides a turnkey system, it is less flexible than operating in the Fedora framework, but allows a fully-functional web-based repository to be set up within a short time-scale (see installing DSpace on CentOS).

Apologies for the long silence

A sincere apology from the entire SMDMRD team for the long silence on this blog, which was caused by technical difficulties: the hosting server had some security issues and, given the large number of accounts to restore, it took our IT support a long time to fix it.

Open Access to Scientific Information

POST, the Parliamentary Office of Science and Technology, published "POST Note 397" on Open Access to Scientific Information on 25th January 2012. The briefing considers open access to both publications and data.

Awareness of Open Access within the government seems to be rising: in March 2011 the government committed to expanding access to research publications and data. In September 2011, an independent working group was set up to examine how to achieve this, with recommendations expected in spring 2012.

The POST Note points out that: OA to data could allow validation of findings and data re-use to "advance knowledge and promote innovation"; sharing data requires effective data management and archiving; sharing data presents challenges regarding intellectual property (IP) and privacy; and expanding access requires collaboration between researchers, librarians, HEIs, funders and publishers.
