A summary of the work undertaken during the data management planning work package, lessons learnt and next steps.
Project: RoaDMaP: Leeds Research Data Management Pilot
Workpackage: 4: Data Management Planning
Report title: Developing Data Management Planning Tools at the University of Leeds
Author: Tim Banks
Data Management Planning is a crucial part of the research data lifecycle and as such it is vital that researchers have access to appropriate tools that enable them to create high quality data management plans without unnecessary duplication of effort and with easy access to help and guidance. The situation is further complicated by the fact that each funder has slightly different requirements for what questions should be answered within a data management plan (and in some cases exactly how those questions are worded).
There are further differences in requirements between the research funders in the maximum length of plans and how they are formatted. As such we identified that the process of creating data management plans could be significantly enhanced through the use of an online tool that addressed all of these requirements. Our chosen route was to work with the Digital Curation Centre to identify specific enhancements to their DMPOnline platform and test these with researchers at the University of Leeds.
Finally, we wanted to move to a position where the existing ‘research data risk assessment’ processes at the University of Leeds could be replaced with ‘data management planning’. This would require a detailed analysis of what the funders, the University and individual research disciplines each require a DMP to include.
Questions we wanted to answer
1. How could we use technology to minimise the effort required to produce high quality data management plans?
2. What was the overlap between the University’s established ‘research data risk assessment’ processes and data management planning? Where were the gaps?
3. What were the differences and similarities between the DMP requirements of the various research funders?
4. What feedback would we receive from researchers who had no previous experience of creating data management plans?
What we did
We spoke to a number of researchers from a range of subject disciplines and sought feedback on their experiences of using the DMPOnline tool. The feedback was generally positive although there were some issues with the system that were quickly identified.
The first was that the default formatting of plans wasted a large amount of space on the page. Given that several funders (e.g. ESRC) put a limit on the overall length of a plan, it soon became clear that it was not possible to include as much information in DMPOnline as would have been possible in a Word document. Clearly, this was a disincentive to use of the tool, so we submitted an urgent development request to the DCC (https://github.com/DigitalCurationCentre/DMPOnline/issues/39) and also published a blog post on this subject (http://blog.library.leeds.ac.uk/blog/roadmap/post/115). The DCC quickly responded with alternative formatting options that addressed this concern.
Researchers also commented that it was annoying to have to complete a follow-on question that had been made redundant by a previous answer, for example:
Q1: Are there any IP issues to consider with this project?
Q1.1 If yes, then please give details.
In this example, if the researcher answered ‘No’ to question 1, the system would not allow them to continue until they had answered ‘n/a’ or similar to question 1.1.
This issue was logged as a feature request to the DCC (https://github.com/DigitalCurationCentre/DMPOnline/issues/40) and has now been implemented.
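The behaviour we asked for can be illustrated with a minimal sketch. It is not how DMPOnline implements the feature; the `Question` class, its `parent` and `required_if` fields and the `unanswered_required` helper are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Question:
    key: str
    text: str
    # Hypothetical fields: a follow-up question is only required when
    # its parent question has been answered with `required_if`.
    parent: Optional[str] = None
    required_if: Optional[str] = None


def unanswered_required(questions, answers):
    """Return the keys of questions that should still block progress."""
    missing = []
    for q in questions:
        if q.parent is not None:
            parent_answer = answers.get(q.parent, "").strip().lower()
            if parent_answer != (q.required_if or "").lower():
                continue  # follow-up is redundant, so do not demand an answer
        if not answers.get(q.key, "").strip():
            missing.append(q.key)
    return missing


QUESTIONS = [
    Question("1", "Are there any IP issues to consider with this project?"),
    Question("1.1", "If yes, then please give details.",
             parent="1", required_if="yes"),
]
```

With these definitions, `unanswered_required(QUESTIONS, {"1": "No"})` returns an empty list, so the researcher can continue without typing ‘n/a’, whereas answering ‘Yes’ to question 1 leaves question 1.1 outstanding.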
One comment that was made by several researchers was that the process of registering an account (particularly for use with institutional Shibboleth authentication) was very ‘clunky’ and seemed to require a number of unnecessary steps. We suggested an approach to the DCC around the pre-creation of accounts by authorised users (https://github.com/DigitalCurationCentre/DMPOnline/issues/53) to which we have yet to receive any feedback.
We also undertook a large-scale exercise to map the DCC checklist questions used in each of the DMPOnline funder templates against the University research data risk assessment questions, and took the opportunity to analyse the differences between the questions included in the different funder templates.
In the later stages of the project, when it became clear that the Digital Curation Centre were planning to relaunch DMPOnline and fundamentally change the way in which it operated, we contributed a number of ideas in the form of blog post comments (http://www.dcc.ac.uk/news/future-plans-dmponline & http://www.dcc.ac.uk/blog/dmponline-current-status) and further suggestions for enhancement (https://github.com/DigitalCurationCentre/DMPOnline/issues/59).
We also suggested ways in which the DMPOnline tool could be used to automatically handle the various requirements for plan formatting (font, size, margins, page length etc.) from the funders based on the template selected. We suggested that this would be one way in which we could realise the benefits of using an online (rather than a manual document based) process.
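As a rough sketch of what template-driven formatting might look like, the example below keeps one set of formatting rules per template and looks them up when a plan is exported. Everything in it is an assumption made for illustration: the template names `funder_a` and `funder_b`, the `PlanFormat` fields and every value shown are placeholders, not real funder requirements and not part of DMPOnline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanFormat:
    font: str
    size_pt: int
    margin_cm: float
    max_pages: int


# Placeholder values for illustration only; these are NOT real funder rules.
FORMATS = {
    "funder_a": PlanFormat("Arial", 11, 2.0, 3),
    "funder_b": PlanFormat("Times New Roman", 12, 2.5, 2),
}


def format_for(template):
    """Look up the export formatting implied by the selected template."""
    try:
        return FORMATS[template]
    except KeyError:
        raise ValueError(f"no formatting rules recorded for {template!r}")


def fits(template, page_count):
    """Return True if a rendered plan is within the template's page limit."""
    return page_count <= format_for(template).max_pages
```

The point of the design is that the researcher only chooses a template; font, size, margins and the page limit then follow from that choice rather than being set by hand in a document.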
What we found
The analysis of the mapping of DCC checklist questions to funder templates in DMPOnline revealed the following:
Only one question (DCC 2.1 “Give a short description of the data being generated or reused in this research”) appears in every funder template
2 questions appear in all funder templates except one: DCC 4.1.1 and follow-up questions (“Are you under obligation or do you have plans to share all or part of the data you create/capture?”) and DCC 6.1 (“What is the long-term strategy for maintaining, curating and archiving the data?”)
2 questions appear in 6 funder templates (DCC 4.2.1 and DCC 4.2.3)
4 questions appear in 5 funder templates (DCC 2.5.1, DCC 7.1, DCC 2.3.2, DCC 2.3.3)
5 questions appear in 4 funder templates
13 questions appear in 3 funder templates
12 questions appear in 2 funder templates
11 questions appear in just 1 funder template.
Of the 36 questions appearing in 3 funder templates or fewer, 24 (two-thirds) are ‘Required for core DMP’ as determined by the DCC checklist.
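The arithmetic behind these figures can be checked with a short script. The counts are copied from the list above; the dictionary name and the use of string keys where the exact number of templates is not stated are our own choices for this sketch, and the overall total assumes the list is exhaustive.

```python
from fractions import Fraction

# Number of questions appearing in a given number of funder templates,
# copied from the list above. String keys are used where the report
# does not state an exact template count.
QUESTIONS_BY_COVERAGE = {
    "all": 1,
    "all but one": 2,
    6: 2,
    5: 4,
    4: 5,
    3: 13,
    2: 12,
    1: 11,
}

# Total number of checklist questions covered by the list.
total = sum(QUESTIONS_BY_COVERAGE.values())

# Questions that appear in 3 funder templates or fewer.
low_coverage = sum(
    n for k, n in QUESTIONS_BY_COVERAGE.items()
    if isinstance(k, int) and k <= 3
)

core_in_low_coverage = 24  # figure reported in the text
share = Fraction(core_in_low_coverage, low_coverage)

print(total, low_coverage, share)  # prints: 50 36 2/3
```

The 13 + 12 + 11 questions in the last three groups give the 36 quoted, and 24 of 36 is exactly two-thirds.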
Our conclusions (summarised in our comments on this blog post: http://www.dcc.ac.uk/news/future-plans-dmponline) were that the current DMPOnline questions were driven entirely by the funders’ requirements for what should be included in a data management plan. Our suggestion to the DCC was that they build a tool which also captures University and discipline specific requirements. We proposed a model of intersecting circles to demonstrate the point.
In this representation, our suggestion was that DMPOnline currently only covered the green circle and its scope should be broadened.
Our further work mapping the University research data risk assessment questions with those in DMPOnline confirmed that some (whilst still valid) fall outside the scope of a data management plan. We therefore proposed to the University Research Data Steering Group that some current questions should be included within the new University grants management system (KRISTAL) whilst others should be included within a DMP (possibly via a DMPOnline institutional template). Some examples of the questions considered are as follows:
||App / Award
|Does proposed methodology comply with legal / funder requirements & best practice?|Remove this question, because this is already covered by the University RDM policy which applies to all projects|
|Requirements discussed with Faculty IT Staff?||
|Would disclosure of data damage University reputation?|Remove this question, because the reasons behind it are covered by other DMP questions relating to the data protection act and other forms of sensitive data.|
|Does data have commercial value to University?||
|Do staff require CRB / VBS checks?||
|How will off-campus access to data be managed?|DMP Institutional template|
|Does data need to be passed to other organisations and how?|DMP Institutional template|
|Have arrangements with other organisations been checked?|DMP Institutional template|
One of the main lessons we learnt was that developments were entirely reliant on the DCC development team, and thus what was a priority for us wasn’t necessarily top of their list of issues. We do believe we engaged constructively, but a larger scale pilot of the tool was not taken forward as we didn’t feel it was yet at a point where this was appropriate. Several proposed developments were put on hold pending a larger review of the future of the tool. Whilst we fully support the new direction of DMPOnline and believe it is the right thing to do, one consequence of the timing of this announcement was that we were unable to progress a larger scale roll out and integration with systems at Leeds.
We are continuing to engage with the DCC (most recently suggesting a simple way in which links could be built with institutional grants management systems; see our comment on the blog post http://www.dcc.ac.uk/blog/dmponline-current-status and https://github.com/DigitalCurationCentre/DMPOnline/issues/59). We are looking to run a Faculty-wide pilot of the new version of DMPOnline in the next 6-8 months, prior to a wider roll out within the institution.