Professor Bren Neale responds to comments and questions from Simon Hodson on her previous blog post
In a recent blog post (http://blog.library.leeds.ac.uk/blog/roadmap/post/132), Professor Bren Neale described how the Timescapes data archive was created to house qualitative, longitudinal social science data. JISC Managing Research Data Programme Manager, Simon Hodson, asked for more details on specific aspects of the Timescapes data archive. Prof Neale’s response is below:
Thanks for your useful comments and questions relating to technical development, sustainability, cost recovery and the role of institutional repositories in supporting ventures such as the Timescapes Archive. These are issues that we are currently grappling with! Here are some clarifications, thoughts and reflections on these questions from a research perspective:
Q. Why did we choose the DigiTool Platform and has it proved a good choice?
A. Our choice was a pragmatic and strategic one. At start-up in 2007 we chose to build our archive within the digital repository at Leeds (LUDOS – Leeds University Digital ObjectS), as a test case for the development of data resources that would be re-used, as well as curated and preserved in new ways. In our case, managing and preserving data is not an end in itself – it is data re-use which is the end point, the goal of the exercise; hence the drive to find more tailored solutions for data re-use than were available nationally at that time. (From this point of view, the designation of our fledgling data facility as an archive – which implies a goal of data preservation – is perhaps a misnomer.) We were encouraged to take the decision to build the archive locally by senior members of the faculty, who were keen to maximise the funds that could be brought into the University through the Timescapes proposal.
We were experimenting with the idea of a devolved, specialist data facility, which would complement the centralised, generic archive (Qualidata at the UK Data Archive) but enable more engaged and bespoke input and investment in the resource from academic researchers, as both data depositors and users. We also wished to try out more refined ways of searching and retrieving data than those available through the national data archive – ways that did not necessarily mean downloading a full dataset, but allowed for downloading of comparable slices of data within and across datasets.
Tying our enterprise in with the institution’s LUDOS repository gave us a secure infrastructure within which to test out our ideas and develop our data resource. DigiTool had been adopted as a digital repository platform by the University Library and it offered particular functions, such as good graphics and the ability to accommodate multi-media data, which were useful for QL research. We anticipated too that we could refine the software to suit our purposes as the archive took shape. We found it useful to think of our enterprise as having two parallel strands of development, the technical and the scientific. It was necessary for them to develop in tandem in order to create a viable resource, and this remains the case today.
We found that DigiTool has some notable constraints. The proprietary nature of the software limited our ability to refine its features, to accommodate different forms of data, or to add bespoke retrieval functions. It was not possible, for example, to drop slices of data into a basket for download, so our ideas for refined searching and retrieval were compromised. Currently, there remains a need to improve the interface for end users of the facility, and to enable more effective search and retrieve functions to support secondary use.
During our Timescapes funding period our technical officer carried out a feasibility study on moving to an open source Fedora system, and by 2011 a Fedora prototype had been developed, with much potential to overcome the limitations of the current system. However, with the project end point (mid 2012) looming and limited funding options for the future, we could not run the risk of migrating to this independent platform and losing our institutional support and technical backup.
Over the past year, the library has migrated its other collections from DigiTool to Eprints, but retained the archive on DigiTool in order to preserve its current capabilities. We are shortly to embark upon a feasibility study to assess the potential for migrating the archive to Eprints and to identify what the benefits and pitfalls might be. Again we may find that functionality is limited; for example, it may not be possible to replicate the access controls that we have set up in DigiTool, so this may turn out to be another short-term solution. However, it is the best we can achieve in the current climate. By 2014, the university will have completed its own feasibility study, and may be in a position to commit to the development of a robust and sustainable infrastructure for the preservation and re-use of research data in all its variety.
Q. What is the sustainability model for Timescapes? Who is taking responsibility for the long term preservation of the collection? Are there plans for further activity to expand the archive now that funding has ended?
A. As the above discussion shows, the sustainability of the Timescapes Data Archive is tied in with the institution’s digital repository infrastructure, whose support we continue to rely on. During our one-year hiatus in funding, the University Library undertook to support the DigiTool platform for the Archive, thus tiding us over until further funds could be secured. This support has been vital to our survival, and has allowed us to keep the resource open for re-use and for teaching purposes here at Leeds, and to provide a skeleton service to registered and new users.
However, in order to carry out remedial work on the user interface, and expand the scope of the resource by building more collections of thematically related datasets, further external funding is needed. We have recently secured funding through to mid 2014 under the ESRC Knowledge Exchange scheme (Changing Landscapes, Hughes and Neale). The new funding reflects one of the hallmarks of Timescapes – the integration of research and archiving. The aim is to enhance research evidence on the third (voluntary) sector, through a programme of archiving and sharing of new datasets and a synthesis of findings from across a network of related QL projects. Timescapes provides the data and methodological infrastructure to support the work of this group of projects, and in turn we have secured funds for both archive and technical developer posts to support the archive. We plan to develop similar bids with other research networks in the future, using the infrastructure of the Timescapes archive and our data management expertise to support new research agendas.
Summing up on long-term sustainability, the archive was a collaborative venture, developed in tandem by the institution and QL researchers. In the same way, ongoing collaboration and commitment on the part of both the institution and QL researchers is needed if the Timescapes Archive is to be sustained and further developed over time.
Q. Are there any plans for cost recovery from affiliated projects?
A. We asked projects that were seeking affiliation with Timescapes to build a nominal sum of £3,000 into their funding proposal as a contribution to our work of preparing and ingesting the datasets into the archive. This was a straightforward process that worked well. Some of our affiliates, however, sought affiliation after they had been funded and were, therefore, without ring-fenced funds to contribute. To date we have archived one affiliated dataset, with funds provided through a retrospective application to the funder (Dept of Health). The idea of cost recovery is valuable, and we plan to develop this further to underpin our future work. However, this would not obviate the need for some core funding, e.g. for sustaining staff, and it would be likely to provide no more than a contribution to ongoing costs.
Q. Do you think the University Library can have a role in supporting, curating and preserving special collections of this sort, where there is a strong institutional interest? What are the limits and constraints that might apply? What are the barriers to Timescapes type initiatives operating in other research areas?
A. There is clear potential for University Libraries to play a key role in curating and preserving collections of research data and making them available for re-use. Indeed, the need for this is likely to grow in the not too distant future. Over the next five to ten years, Universities may well find themselves responding to new agendas from the Research Councils, which are likely to make such provision at local level a condition of funding. In order to respond to these new agendas, two things are needed. Firstly, and crucially, there must be the will and vision at institutional level to build data management, curation and re-use into IT and Library functions, and to identify and adopt new systems that are flexible enough to support these functions over time and across varied forms of research and research data. Secondly, these new data infrastructures are more likely to fit researcher needs if they are developed in tandem with selected groups of researchers and can support the specialist requirements of research groups who are in the vanguard of developing new substantive or methodological research. A flexible platform that can accommodate both generic and relatively standardised needs, but respond also to these specialist needs, would be ideal.