Professor Bren Neale describes some of the challenges of managing qualitative longitudinal research data and reflects on the relationship between the RoaDMaP project and Timescapes, one of RoaDMaP’s case study projects.
Timescapes provides an interesting research data management case study – partly because Timescapes staff have a wealth of experience of data management practice but also because Timescapes has created a subject specific archive being managed within an institutional data management structure. A recent leaflet outlining the new UK Data Service states that “the UK Data Service will be operating a more selective collections development policy which will include working with universities towards the goal of holding more research data within institutional repositories.” The relationship between different types of repository is something we will need to be aware of and plan for. The following post by Director of Timescapes, Professor Bren Neale, outlines the Timescapes programme and its research data management challenges.
Timescapes – our Research
One of the major tasks of the ESRC-funded Timescapes programme (www.timescapes.leeds.ac.uk, 2007-12) was to create a specialist archive of Qualitative Longitudinal (QL) social science research data for sharing and re-use. QL data, which is gathered over time through in-depth interviews and ethnographic methods, explores the lived experience of change and continuity in the social world and gives insights into how and why micro processes unfold.
The Timescapes Data Archive
Over the five years of the programme we set up the Timescapes archive as a collaborative venture with the University of Leeds Institutional Repository (LUDOS), using the DigiTool platform from ExLibris, and hosted by the University Library. We developed the resource in collaboration with the UK Data Archive and, in doing so, adhered to national level standards for data management and archiving.
By the close of our funding in May 2012, we had archived nine social science datasets, comprising nearly 3,000 files of multi-media data. Eight of these datasets were drawn from a network of projects that were funded through Timescapes to explore the dynamics of family lives and relationships. The ninth dataset (on the experiences of health and illness) was drawn from our network of affiliated projects.
The Timescapes Affiliation Scheme
The affiliation scheme was set up to encourage data sharing and re-use; over the course of our funding we supported the development of over 50 QL projects, and found ourselves expanding into inter-disciplinary areas of scholarship as QL methods became more widely established. This included a project funded by the Engineering and Physical Sciences Research Council on the Dynamics of transport. We currently have a queue of affiliated projects ready to deposit data in the archive, demonstrating that researchers see value in making their data available to share.
To encourage secondary use, and develop a community of users for the resource, we set up a secondary analysis demonstrator project and a series of training workshops; to date we have over 200 users registered for the archive and requests for registration continue to grow steadily. These are significant achievements in a context where hardly any QL datasets were available for re-use at the outset of our programme. New projects continue to seek affiliation with Timescapes, benefitting from ongoing methodological and data management advice. This reflects a growing commitment among researchers to archive and share data as an integral part of the research process.
New models of archiving
The advances outlined above were achieved within Timescapes through a stakeholder model of data sharing and re-use. In this model, archiving does not occur in a vacuum, but is harnessed to particular research agendas and becomes embedded in the research process as a project unfolds. This is important in QL research because there is no clear point at which a primary project, which is addressing dynamic research questions and producing cumulative findings, comes to an end, and secondary use can begin.
Archiving, in this context, serves a dual purpose. It becomes a useful tool for the safe storage and longitudinal use of data by the originating team, as well as creating archive-ready data for wider sharing and re-use. The archive is set up in such a way that primary researchers can store their data in secure areas of the resource, with the originating team controlling who can access the data. The sense of ‘giving data away’ is therefore avoided. Providing data security through such access controls is often preferable to anonymising data, since anonymisation can strip some qualitative data (especially audio and images) of their integrity and meaning.
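The idea of team-controlled access can be illustrated with a minimal sketch. This is not how the Timescapes archive or DigiTool implements it; the class, the function names and the three access levels ("public", "registered", "approved") are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedFile:
    """A deposited file whose access is controlled by the originating team."""
    file_id: str
    owning_team: str
    access_level: str = "approved"  # hypothetical levels: public, registered, approved
    approved_users: set = field(default_factory=set)


def can_access(item, user, team=None, registered=False):
    """Return True if the user may open the file."""
    if team == item.owning_team:
        return True  # the originating team always keeps access
    if item.access_level == "public":
        return True
    if item.access_level == "registered":
        return registered
    # "approved" (and any unrecognised level): explicit, file-level approval only
    return user in item.approved_users


def grant(item, granting_team, user):
    """Only the originating team can approve a secondary user for a file."""
    if granting_team != item.owning_team:
        raise PermissionError("only the originating team can grant access")
    item.approved_users.add(user)
```

In this sketch a newly deposited file defaults to the most restrictive level, so nothing is shared until the originating team chooses to share it.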
Timescapes – in the longer term
In the longer term, Timescapes aims to build further collections of QL datasets for sharing and re-use, bringing related datasets together through the archive, and providing refined means of thematic searching and data retrieval both within and across projects. This creates the opportunity for new forms of cross-project analysis and the potential to enhance the evidence from studies that are often small scale, scattered and localised in their findings and impact. For example, we have recently set up a network of projects that are researching the voluntary sector using QL methods.
Funding permitting, we will scale up the evidence on the third sector through a programme of archiving, data sharing and knowledge exchange activities across the network. Of crucial importance, this new project will also promote the ethos of data sharing and re-use within policy and practice communities. Two further networks are under development, both of which address important themes for public policy (environmental sustainability and the lived experience of welfare reform).
Data Management Planning (DMP) & RoaDMaP
Data Management Planning (DMP) is integral to the developments outlined above, and has prompted us to produce guidelines for QL researchers who are facing the challenges of organising and presenting cumulative waves of data for their own and others’ use (see www.timescapes.leeds.ac.uk/about/timescapes-methods-guide-series for our methods guides on the archive, secondary analysis, the ethics of data sharing and re-use, and data management planning).
Our inclusion as a case study in the JISC funded RoaDMaP project here at Leeds has highlighted this important dimension of our work and enabled us to reflect on the processes involved and how we might have managed things better. We are also considering what we need to do in the longer term to sustain and improve the data resource that we have created, both in terms of technical development and its scientific value and use to the research community.
Research Data Management challenges
The challenges that we have faced straddle two domains: research and archiving. The research tasks are associated with the generation and safe storage of data, and preparation for archiving and sharing, including:
- Identifying a lead researcher to take overall responsibility for DMP, sourcing suitable training for this role, and costing and allocating sufficient time and budget for this task from the outset as an integral part of a project.
- Ensuring high technical and scientific standards for the generation of data in the field.
- Developing and applying ethical templates to seek permission from research participants to archive and share data about their lives, including transferring copyright to researchers.
- Specifying and applying mechanisms for the safe storage, formatting, ‘future proofing’ and labelling of data files that accumulate over time, to enable longitudinal as well as case based analysis by both primary and secondary teams.
- Developing and applying ethical and technical protocols for the representation of a dataset for archiving, re-use and dissemination purposes, including templates for the layout of interview transcripts, and for anonymising data, including multi-media data where appropriate.
- Developing and applying gold standards for the production of metadata (data about data) to document and contextualise a dataset to aid longitudinal and secondary use.
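To make the labelling task concrete, here is a minimal sketch of a file-naming convention that supports both case-based and wave-based views of an accumulating dataset. The label pattern (project, case number, wave number, data type) and the project name "siblings" are hypothetical, chosen for illustration rather than taken from the Timescapes guidelines.

```python
import re
from collections import defaultdict

# Hypothetical label pattern: <project>_c<case>_w<wave>_<kind>.<ext>
LABEL = re.compile(
    r"^(?P<project>[a-z]+)_c(?P<case>\d+)_w(?P<wave>\d+)_(?P<kind>[a-z]+)\.(?P<ext>\w+)$"
)


def make_label(project, case, wave, kind, ext):
    """Build a file name with zero-padded case and wave numbers."""
    return f"{project}_c{case:03d}_w{wave:02d}_{kind}.{ext}"


def parse_label(name):
    """Split a file name back into its parts; reject names that break the pattern."""
    match = LABEL.match(name)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {name}")
    parts = match.groupdict()
    parts["case"] = int(parts["case"])
    parts["wave"] = int(parts["wave"])
    return parts


def group_by(names, key):
    """Group file names by 'case' (longitudinal view) or 'wave' (cross-sectional view)."""
    groups = defaultdict(list)
    for name in sorted(names):
        groups[parse_label(name)[key]].append(name)
    return dict(groups)
```

Grouping by "case" gathers everything collected from one participant across waves, while grouping by "wave" gathers everything collected in one round of fieldwork. Zero-padding keeps the files in wave order when they are sorted by name.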
The archiving challenges faced within Timescapes (set out below) may not be currently applicable in many research contexts, but they are likely to have a growing currency in future as institutional repositories assume greater responsibility for the curation, preservation and sharing of research data, and as archiving and data sharing increasingly come to be seen as a collaborative venture between research and archiving teams. Particular challenges for Timescapes have included:
- Building and sustaining the archive collections within an institutional repository, requiring archive and repository to advance in tandem, technically and scientifically, and with adequate institutional support.
- Working with and applying national level standards for data curation and dissemination as part of our collaboration with the UK Data Archive.
- Ensuring an appropriate software platform to maximise ease of use and technical backup to maintain the platform. We are currently facing the challenge of migrating the archive to a new open source software platform (Eprints) in line with developments in the University of Leeds LUDOS system.
- Creating and applying metadata (cataloguing) templates for the ingest of data into the resource.
- Creating a useful interface and search and retrieval tools in line with the analytical needs of researchers. We have identified the need to improve the interface and search tools once we have migrated to Eprints.
- Making provision for varied levels of access controls, including ‘approved’ access (fine grained, file level access) that enables secure deposit and controls on re-use for the benefit of primary teams.
- Tagging files in the resource to enable thematic searching and retrieval of data within and across projects (e.g. through the assignment of key words to data files and free text searching).
- Supporting QL researchers in data management planning and facilitating creative synergies between research and archiving.
- Ongoing collaboration with ‘stakeholder’ data depositors and users e.g. seeking feedback on and refining the presentation of archived data and metadata.
- Securing resources and skilled staff to manage, develop and promote the resource and ensure its medium and longer-term sustainability through external and institutional funding.
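The keyword tagging described in the list above can be sketched as a simple inverted index that maps each key word to the files carrying it. This is an illustrative assumption about how thematic retrieval across projects might work, not a description of the DigiTool or Eprints search tools; the file identifiers and key words in the usage example are invented.

```python
from collections import defaultdict


def build_index(tagged_files):
    """Map each key word (lower-cased) to the set of file ids tagged with it."""
    index = defaultdict(set)
    for file_id, keywords in tagged_files.items():
        for keyword in keywords:
            index[keyword.lower()].add(file_id)
    return index


def search(index, *keywords):
    """Return the ids of files tagged with every one of the given key words."""
    matches = [index.get(keyword.lower(), set()) for keyword in keywords]
    return set.intersection(*matches) if matches else set()


# Hypothetical files from two different projects sharing a theme.
tagged = {
    "projA/c001_w01": ["Grandparenting", "Care"],
    "projA/c002_w01": ["Care", "Work"],
    "projB/c010_w02": ["care", "Grandparenting"],
}
index = build_index(tagged)
hits = search(index, "care", "grandparenting")
```

Because the index is keyed on the key word rather than on the project, a single search returns matching files from every project in the archive, which is what makes cross-project analysis possible.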
The challenges outlined above are substantial but there are also significant rewards in working at the cutting edge of new archiving developments and supporting a new ethos of data sharing. We hope that our involvement in RoaDMaP will help us in future to hone our skills and refine our practices as well as promoting new ways of sharing data that are in line with researcher needs.