‘Has anyone seen my data?’

Reflective blog post on a research data management training session from Andy Turner

Has anyone seen my data? Why research data management matters and how you can help

This post is primarily about an internal training and awareness-raising event held at the University of Leeds on the 5th of March 2014. The event was aimed at University of Leeds staff engaging with the process of research data management (RDM), particularly staff with research administration roles, but also researchers actively managing research data, and anyone with a role to play in developing RDM practice institutionally. Before focussing on the event itself, which ran with the same title as this post and had 16 participants, let me introduce myself and my growing interest in RDM to add a bit of context.

I am primarily a researcher based in the Centre for Computational Geography, School of Geography, University of Leeds and have been for over 15 years. The management of research data has been a big issue for me since I started working on my first professional research project back in 1997. That project was about halfway through when I took on the research assistant role from a colleague who was leaving academia. I had to learn all about what they had done on the project and where all the data was in a few short meetings. By this time, the World Wide Web had matured and my forward-thinking boss, Stan Openshaw, helped us realise that the best way to document everything going forward, for the project and in general, was to work openly and tell research stories using web pages and web-based tools to link, as much as possible, to whatever data we used and developed in research. All my career I have been trying to realise the power and utility of the Internet for developing awareness of the data that exists, what data might be wanted and what might be used in research to try to understand the world (especially its problems) and to forecast and shape the future so it is better for all.

About a year ago I started attending some JISC RoaDMaP Project meetings and became (unofficially) a project observer. RoaDMaP was part of the JISC Managing Research Data Programme and concluded in July 2013. RoaDMaP engaged the University of Leeds in the evolving RDM landscape and delivered a number of things, not least an embryonic governance framework for developing RDM practice at the University of Leeds. Amongst other things, RoaDMaP developed a roadmap for how the university could move forwards to develop RDM best practice. For more details of all of this check out the RoaDMaP Web Pages.
The work and efforts of RoaDMaP continued after the project concluded and the university provided some internal funding to establish Research Data Leeds (RDL) – an organisation with an Operational Team that I now feel part of. RDL work is reported to the Research Data Working Group (RDWG) and Research Data Steering Group which work in tandem to inform the official committees in the university governance structure.

From October 2013 I’ve been working about one day a week on RDM in the faculties of ESSL and PVAC under the management of Tim Banks, an integral member of the RDL Operational Team. It has been great working with Tim to both follow and develop RDM processes in these faculties and in doing so learn about their research work and research interests, much of which is geographical. I make an effort to document the work I am doing, including my growing interest in RDM. Reflecting on my master document about RDM, I can see that I’ve already done quite a lot of work and reading over the last year to get up to speed. I look forward to continuing to work on developing RDM at the University of Leeds in the years to come whilst actively researching and developing new research data of a geographical nature to address the problems faced by society and particularly by vulnerable groups.

With perhaps more than enough context about me, the author, this post now reflects on the event, what I got out of it and what I think we all got out of it.

The event was organised and resourced as part of the RDL Operational Team work programme and was a team effort coordinated expertly by Rachel Proudfoot. Two colleagues from the Digital Curation Centre (DCC), Sarah Jones and Joy Davidson, based at the University of Glasgow, joined us to share their expertise and knowledge about RDM. The event was informative and consisted of a number of presentations with time for questions and discussion. There was a demonstration of the latest version of DMPOnline and an exercise that took a critical and constructive look at a couple of data management plan documents.

Rachel introduced the event, which began with a series of three short presentations: “Introduction to Research Data Management” presented by Joy; “Introduction to Research Data Leeds” presented by Rachel; and “Why Share Data?” presented by Sarah. This first set of presentations provided good context and information, and helped me reconsider the issues and reasoning about when researchers should share data, with whom, and when they shouldn’t.


Open discussion

Next we had an open discussion where several good and interesting points were made:

  • Although research administrators could encourage researchers to share research data and adopt RDM best practice, it was not easy for them to change hearts and minds, and trying to do so could be more problematic than beneficial.
  • The last thing RDM processes should do is put in place barriers making it harder to get research done. RDM should be a research enabler that benefits researchers from the outset so that skeptical researchers can quickly transition to become adopters and even champions.
  • We learned that at least one HEI is considering registering all its researchers with ORCID researcher identifiers. Doing this requires thinking about those researchers that already have an ORCID.
  • Distinguishing RDM from open access and open data is important, especially when liaising with those skeptical about the open agenda. We don’t want anyone to be skeptical about RDM!
  • The more sensitive data is, the more management of it is required.
  • Arguments for better RDM:
      • It makes research more robust.
      • It reduces the risk of losing important data.
      • It improves research continuity.
      • It leads to greater research impact.
  • Sharing or exposing data is a driver for improving its quality, especially the quality of metadata.
  • The Social Sciences have a long history of sharing and reusing data whereas in the Arts and Humanities, the concept is relatively new.
  • Researchers get excited when they start to think what might be possible if the whole discipline started to share its data.

RDM processes in practice at the University of Leeds

Next, Tim and Graham Blyth gave a presentation on RDM processes in practice at the University of Leeds. I learned that Research Data Risk Assessment (RDRA) efforts were standardised at the university following an audit in 2008 which raised RDM concerns. We considered metadata, the research data reuse life-cycle and technical terminology, in particular the difference between repositories, registries and catalogues. We also considered Tim’s Circular Diagram about Data Management Planning, which is displayed in this earlier post about Developing Data Management Planning Tools at the University of Leeds.

Demo of DMPOnline

Sarah gave a demo of DMPOnline and outlined the planned options for customising the tool via templates and tailored guidance. There was a concern about EPSRC being missing from the list of funder templates, as this could put off researchers developing a Data Management Plan (DMP) for an EPSRC project proposal. For such cases, Sarah advised looking at the US National Science Foundation (NSF) Directorate for Engineering Data Management Requirements.

Sample data management plans

Next we considered two sample data management plans, discussing them first in pairs and then collectively. There was a sample AHRC technical plan dated 1st of August 2013, which had reviewer comments, and an older, briefer technical appendix from an AHRC project. This was a useful exercise. Imperfections and omissions in the data management plans were identified. In discussion we considered:

  • The role of research administrators in developing DMPs.
  • How to gather and use reviewer feedback on DMPs.
  • Potential role for DCC in helping research councils to review DMPs.
  • Sharing DMPs.
  • The business case for developing an RDM Service at the University of Leeds and the demand for examples of the DMP process helping to get things costed and worked out better at the application and post-award stages.
  • How to ensure DMPs are being followed and revisited while research projects are ongoing. It was mentioned that an Ethical Review Auditing process is being piloted at the University of Leeds by the Research and Innovation Service and that part of this would be looking to see that RDM was being done satisfactorily.

Funder Policies and Costings

Joy gave a final presentation about Funder Policies and Costings. This covered eligible costs and the UK Data Service Data Management Costing Tool and Checklist. The point was made that some costs may be very high, for example the cost of data anonymisation such as pixelating faces in videos. If the costs of anonymisation are not worked out and budgeted for adequately, then projects may fail to make data available or institutions may be obliged to meet the costs of anonymisation from core funds. There was a discussion about research project budgets and about whether research councils and panels would prefer research projects costed more cheaply over those which had fully considered the costs of managing and sharing research data. A further suggestion was made that projects should start the work needed to share research data as early as possible in the project.

Final Discussion

In a final discussion session we considered:

  • Generating revenue through data reuse.
  • Trusted Repository Status of institutional repositories.
  • Ongoing activity as part of the RDL Operational Team work plan to conduct a Research Data Audit building on the RoaDMaP Research Data Survey. It was mentioned that this audit work is looking at all research data, both physical and digital, and that one of the current efforts is trying to identify large volumes of research data and interesting case studies from which we can learn.

All in all, this was a well-organised event and I learned and gained quite a bit from it. It has been good to recall the event and write this blog post, which I hope can help to improve research data management at the University of Leeds and beyond. To learn more about RDM at the University of Leeds please visit the Managing Research Data Web Pages.

‘Has anyone seen my data?’ Why research data management matters and how you can help.

Presentations and blog from research support staff research data management training event

Training session held on 05/03/2014 and targeted at research support staff. Attendees were given the options of coming at 12.30-1.30 for an introduction to research data management (and a free lunch!) or joining the session at 1.30pm to look at some aspects of research data management in detail. All bar one participant came for the whole session.

  • Digital Curation Centre presenters: Joy Davidson, Sarah Jones
  • University of Leeds presenters: Tim Banks, Graham Blyth, Rachel Proudfoot

Programme with links to presentations.

12:30 – 13:30 Lunch and Research Data Management introduction