As the Fellowship ends, archivist Caroline Bolton reflects on the benefits of a catalogues as data approach.


This blog series has highlighted examples of the benefits of publishing reusable data that can be mined, enhanced and visualised. Aside from improving discoverability, accessibility and engagement, it has shown that even small steps towards catalogues as data can be useful for:

  • Preparing to manage and explore an increasing volume of born-digital and digitised archives and their metadata, including techniques that support cataloguing from the ‘bottom-up’ using item-level metadata.
  • Learning about approaches and tools that can also be used to exploit data within collections. The Collections as Data project provides useful resources and case studies.
  • Providing a small sample of structured and unstructured data to:
    • Engage volunteers, researchers, and staff to work with data remotely and support digital and (post) Covid expectations for access.
    • Contribute to an aggregation of data that can be used to train artificial intelligence tools.
  • Exploring opportunities to use digital technologies that enable us to do more with creating and using archival descriptions that benefit both researchers and archivists:
    • Embedding connections and fostering collaborations and interdisciplinary research
    • Using them for management purposes, including working around system limitations, for example through the increasing use of CSV imports for bulk changes

Towards research ready/re-usable catalogue data:

The Fellowship case studies have focused on catalogues of single collections and certainly there are benefits at this scale, especially as learning opportunities. Publishing on a bigger scale could multiply these benefits and efficiencies, but this needs agreements on standardising data and investment in sustainable technical infrastructure like aggregators and linked data services. These issues are currently being explored within and across cultural heritage sectors with initiatives like Towards a National Collection. Whilst best practice is emerging there is still plenty that can be done to prepare:

  • Know your (meta)data: Understanding what it is, where it’s held and what standards it meets can be important in identifying gaps, assessing data quality and checking whether it is sufficient for cataloguing in the digital age.
  • Strategies for enhancing catalogues: This might include reducing narrative and using controlled vocabularies as access points for people, organisations, places and subjects. Considerations include whether and how this should be applied retrospectively to existing catalogues, how it could be resourced, and how to embed it in cataloguing practice for new analogue and born-digital collections.
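As one small illustration of the “know your (meta)data” step above, the following Python sketch counts how many records in a catalogue export are missing key fields. It is only a sketch under stated assumptions: the column names, the sample records and the list of required fields are all hypothetical, and a real export from a collections management system will look different.

```python
import csv
import io
from collections import Counter

# Hypothetical catalogue export. Column names and records are invented
# for illustration and do not come from any real system or collection.
SAMPLE_EXPORT = """reference,title,date,level,extent,scope_content
MS 1/1,Letter to the editor,1923,item,1 sheet,Discusses publication plans
MS 1/2,Draft manuscript,,item,,
MS 1/3,,c.1930,item,2 sheets,Photographs of the mill
"""

# Assumed set of fields we want every record to have.
REQUIRED_FIELDS = ["reference", "title", "date", "level", "extent"]


def count_missing(rows, fields):
    """Return a Counter of how many rows have an empty value for each field."""
    missing = Counter()
    for row in rows:
        for field in fields:
            if not (row.get(field) or "").strip():
                missing[field] += 1
    return missing


rows = list(csv.DictReader(io.StringIO(SAMPLE_EXPORT)))
missing = count_missing(rows, REQUIRED_FIELDS)

# Print a simple gap report, one line per required field.
for field in REQUIRED_FIELDS:
    print(f"{field}: {missing[field]} of {len(rows)} records missing")
```

A report like this could help decide where retrospective enhancement is most worthwhile before any data is published or aggregated.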

Catalogues as data – part of managing the digital shift

The “digital shift” is much more than digital preservation; it is about the opportunities for efficiencies, improvement and innovation that digital technologies and data can bring. For archivists, it is obvious that this change includes managing digital archives, but it is also about meeting the expectations of new and digital audiences for discovery of and access to all collections. This data-driven approach has raised questions about what a next-generation digital catalogue looks like, particularly:

  • The effectiveness of current archival standards such as ISAD(G), and the systems built on them, compared with emerging technologies and standards such as Records in Contexts, which are designed to support digital archives and new opportunities for access.
  • The levels of IT support and data skills needed to adapt and how these might be developed:
    • DPC’s proposed follow-up to N2NH looking at digital access (TBC 2021)
    • Intuitive tools for both archivists and researchers, such as OpenRefine and GLAM Workbench
    • Opportunities for digital labs to grow these skills
    • Emergence of Computational Archival Science to inform new approaches
  • The role of the (local) online catalogue as the only source of access and discovery, and where it fits in relation to a move towards collective or web solutions like Wikidata.
  • The roles of ‘authoritative sources’ and managing expectations of confidence/control if cataloguing is automated or co-created.
Suggested levels of data expertise. Image credit Leeds University Library

These are not easy questions to answer, especially when the effectiveness of any new approach will depend on some consensus, such as establishing standards for interoperability. For now, if nothing else, we may just need to be part of this conversation, within our own organisations and across and beyond the archive profession.