Meeting with DataFlow team – Oxford, 7 February 2012

Meeting with DataFlow team to explore potential collaboration and plan for Leeds implementation.


The future of DataStage and DataBank

At the end of the current DataFlow project DataBank development will move into the Bodleian Library under Neil Jefferies.

DataStage may move into the library or may be supported through its user and developer communities.

There is the potential for DataStage and DataBank to be decoupled.

Implementation and feedback

Both DataStage and DataBank will be available as virtual machines or as installable packages. Delivery is imminent.

It would help the DataFlow project if we could offer feedback on the installation process and our initial use:

  • problems encountered
  • customisation needed
  • perceived barriers to adoption by researchers

Support and training for researchers

It is envisaged that there will be several layers of support and training material:

  • Core materials such as those developed by the DCC
  • DataFlow general – a wiki under development by the DataFlow project
  • Discipline-specific examples of DataFlow use through RoaDMap project case studies

Agreed that RoaDMap will make training materials available.


Discussed metadata, standards and implementation options

DataBank includes core metadata collected at time of deposit.

Including richer discipline-, project- and experiment-specific metadata within the stored research objects is probably the best approach.
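
As a rough illustration of that approach (a minimal sketch only – the directory layout and field names are illustrative and are not DataBank's actual deposit schema), a deposited research object could carry a small set of core metadata at the top level, with the richer discipline- and experiment-specific record travelling inside the object alongside the data files:

    import json
    from pathlib import Path

    def build_research_object(root, core, experiment):
        """Write a simple deposit package: core metadata at the top level,
        richer discipline/experiment-specific metadata inside the object."""
        root = Path(root)
        (root / "data").mkdir(parents=True, exist_ok=True)
        # Core metadata of the kind collected at deposit time.
        (root / "metadata.json").write_text(json.dumps(core, indent=2))
        # Richer metadata travels with the data files rather than living
        # only in the repository's own record.
        (root / "data" / "experiment_metadata.json").write_text(
            json.dumps(experiment, indent=2))

    build_research_object(
        "deposit_0001",
        core={"title": "Example dataset", "creator": "A. Researcher",
              "date": "2012-02-07"},
        experiment={"instrument": "example spectrometer",
                    "sample_id": "S-42", "temperature_K": 293},
    )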

Interesting links

DAMARO project – Policy and infrastructure for Oxford

BagIt specification – format for data collections (see the sketch below)

ISA-TAB – Investigation/Study/Assay metadata handling

XForms – aid to metadata creation

Orbeon – support for XForms

MIIDI – good example of metadata handling
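
Since BagIt comes up above: a bag is just a directory containing a short declaration file, a data/ payload directory and a checksum manifest. The following is a minimal sketch using only the Python standard library (a real deposit workflow would normally use an existing BagIt library rather than hand-rolling this):

    import hashlib
    from pathlib import Path

    def make_minimal_bag(bag_dir, files):
        """Create a minimal BagIt-style bag: bagit.txt declaration,
        data/ payload directory and an MD5 manifest."""
        bag = Path(bag_dir)
        (bag / "data").mkdir(parents=True, exist_ok=True)
        (bag / "bagit.txt").write_text(
            "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n")
        manifest = []
        for name, content in files.items():
            (bag / "data" / name).write_bytes(content)
            manifest.append(f"{hashlib.md5(content).hexdigest()}  data/{name}")
        (bag / "manifest-md5.txt").write_text("\n".join(manifest) + "\n")

    make_minimal_bag("example_bag",
                     {"readings.csv": b"t,value\n0,1.2\n1,1.3\n"})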

Meeting with National Instruments – 19 January 2012

Meeting with National Instruments. Preliminary discussion of technologies and fit.

Graham Blyth (RoaDMap)

Robert Lee (NI)

Steve Crowe (NI)

Discussion of the approach intended for RoaDMap and how this will fit with the existing NI tools and technologies. In particular we focussed on the creation of metadata as part of the experimental procedure and within the interface being used to run the experiment. NI have data management tools and formats (TDM, TDMS and DIAdem) within their software. In addition, utilities have been written that can be used to create metadata and save it as XML.

The idea of using the inbuilt tools within the active part of the research lifecycle, then exporting the data and metadata to more open and generally usable formats, was suggested. This would give the option of archiving both the native format and the more open format of the data and metadata, along the lines used by the UKDA. It was agreed that this would be explored further.
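
As a very rough sketch of that export step (the file names, field names and helper below are purely illustrative – in practice the metadata would come from the NI utilities mentioned above), the metadata captured during a run could be written out as plain XML and archived alongside the untouched native file:

    import shutil
    import xml.etree.ElementTree as ET
    from pathlib import Path

    def archive_run(native_file, metadata, archive_dir):
        """Keep the native (e.g. TDMS) file and an open XML copy of its
        metadata side by side in the archive directory."""
        archive = Path(archive_dir)
        archive.mkdir(parents=True, exist_ok=True)
        # Archive the native format untouched.
        shutil.copy(native_file, archive / Path(native_file).name)
        # Export the metadata to plain XML as the open, generally usable form.
        root = ET.Element("experiment")
        for key, value in metadata.items():
            ET.SubElement(root, key).text = str(value)
        ET.ElementTree(root).write(str(archive / "metadata.xml"),
                                   encoding="utf-8", xml_declaration=True)

    # Assuming run_001.tdms already exists from the acquisition step:
    archive_run("run_001.tdms",
                {"operator": "G. Blyth", "rig": "test rig 3",
                 "sample_rate_hz": 1000},
                "archive/run_001")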

Institutional Challenges in the Data Decade – 7 Feb 2012 (Day 1)

Some key themes from the day and a few personal thoughts about regional collaboration

The DCC have run – and continue to run – a series of regional Data Management Roadshow events under the umbrella title Institutional Challenges in the Data Decade. For example, see a write-up from the March 2011 event in Sheffield published in Ariadne.

Here are links to the presentations from the two-day event in Loughborough (7th-8th Feb).

Day one brought together a number of projects from the region to share their RDM progress and experience to date and discuss emerging trends. Several of the projects are funded under the JISCRMD02 programme.

Key themes of the day included:

  • RDM drivers: can be internal and external. We need to understand what they are but also use them to engage researchers. More work needed to fully understand why some academic disciplines have embraced RDM whilst others have not.
  • Discoverability: what’s out there, how do we find it, what do we need to do to maximise discoverability in our own datasets?
  • RDM as risk management: risk of loss, risk of fines for non-compliance, FOI requests, risks of not acting vs. risks of acting.
  • Roles and responsibilities: how can we articulate these, who should be involved, is there a role for library professionals, and is there a need for a research broker / facilitator to bring together the players from multiple departments across the institution?
  • RDM policies: these are already taking different forms, from the high-level and aspirational – as at the University of Edinburgh – to the much more nitty-gritty, defining specific roles and defining research data, as at the University of Hertfordshire. Which approach will be most effective? Both? How do we link the different levels together and translate policy to practice and, where appropriate, practice to policy?
  • How do we know what’s out there: if we don’t have a handle on the size and scope of our institutional output, how can we hope to manage it? An audit, perhaps DCC’s Data Asset Framework, seems essential.

Regional Collaboration

One question that occurred to me at the event is whether there is a regional role in fostering a community of data management practice. I have a particular interest in this, having worked for several years on the White Rose Research Online and White Rose Etheses Online repository services for the Universities of Leeds, Sheffield and York. The three library services work closely together to explore opportunities for collaboration, including identifying areas of redundancy or economies of scale. The three universities work together closely in a number of ways – for example offering PhDs with joint supervision from two of the three partners – and the collaboration is promoted by dedicated staff in the White Rose University Consortium.

We have found that the Collaboration Continuum model, proposed by OCLC, can be helpful in articulating the different levels at which working together to build community and services can operate. In particular, how do we move from good ideas and shared experience to the more challenging prospect of developing longer-term shared services or solutions?

In the case of White Rose, the institutions have many commonalities apart from geographical proximity and it is these, as much as geographical convenience, which drive the cooperative initiatives. However, geography is a factor, whether it be bringing people together in a convenient way or understanding how universities contribute to and interact with local and regional communities – for example, local businesses. Maybe regionalism is not relevant in the research data management world – but we certainly see shared research projects across the White Rose Consortium, and a regional approach does offer opportunities for exchange of experience: the question is whether this can/will/should translate into regional shared services – for example, computing and storage. Of course, the White Rose consortium is not the only collaborative configuration – for example, there is also the N8 research partnership and YHMAN.

Any thoughts welcome!