As a busy library in a popular university, we want to make finding our books as easy as possible. We’ve amassed a lot of bibliographic data over the years and we’ve always uploaded and maintained it manually. Introducing a new library management system gave us a chance to look at our data, as well as how we manage it, and see what we could improve.

We worked with Backstage, a third-party company that specialises in data analysis and clean-up for libraries. This helped us to see where our data needed to be updated and how we could change our cataloguing processes in preparation for the new system.

We spoke with Sheila Gallagher, one of our metadata cataloguers who led on the data enhancement project:

What was the condition of the catalogue data prior to the project?

A lot of the data on the system is really old. When I first started at the Library, 27 years ago, we were working on a retrospective conversion project turning the card catalogue into digital records. This involved sending our records out to CURL (Consortium of Research Libraries, which later became RLUK). They matched our data by looking at author, title, ISBN etc. We then got the records back on massive chest-high printouts, which we would go through to check the matches.

A lot of the data from the 80s and 90s, before we started cataloguing online, was quite brief. Over the years, the amount of metadata we include in a record has grown – some fields are now mandatory that weren’t before.

To prepare for the new library management system, we took the opportunity to undertake a similar conversion project – in which we’d send out our data to be matched and updated. This time, we worked with a company called Backstage who matched records against the Library of Congress to see if we could get a more complete view of the data in our records.

So how did this project work?

There were two stages to the project.

The initial phase started just after Christmas 2018, when we asked Backstage to add 300 fields of data – this included the physical description field: pagination, illustrations, size and dimensions. Then they enhanced the records with subject headings.

When I started, subject headings were added by subject specialists. We would create the bibliographic records and the books would then go to a subject specialist, who would assign subject headings and a classmark to determine where each book would sit on the shelf. Their assistants would then create the item records. Over the years, we took over the classification but we didn't assign subject headings – we would look for external records with subject headings and take the fuller record if we could. This project was an opportunity to include subject headings where we hadn't previously had them.

The second phase was more about tidying up – Backstage ran the records through their systems to pick up obsolete fields and update fields that had changed over the years, bringing the data up to date. They also did an RDA (Resource Description and Access) update, which involved things like spelling out abbreviated words – "pages" instead of "p.", "illustrations" instead of "ill.", and so on.
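To give a flavour of what an abbreviation-expansion pass looks like, here is a minimal sketch in Python. The mapping below is hypothetical and covers only a handful of the AACR2-era abbreviations an RDA update would handle; Backstage's actual process is far more sophisticated.

```python
import re

# Hypothetical mapping from AACR2-era abbreviations to the spelled-out
# forms RDA prefers; a real update covers many more terms and contexts.
RDA_EXPANSIONS = {
    "p.": "pages",
    "ill.": "illustrations",
    "ed.": "edition",
}

def expand_abbreviations(physical_description: str) -> str:
    """Spell out abbreviated words in a physical-description string."""
    result = physical_description
    for abbrev, full in RDA_EXPANSIONS.items():
        # Match the abbreviation as a whole token; the trailing dot is
        # part of the abbreviation itself, so it is replaced too.
        result = re.sub(r"\b" + re.escape(abbrev), full, result)
    return result

print(expand_abbreviations("xii, 404 p. : ill. ; 23 cm"))
# → xii, 404 pages : illustrations ; 23 cm
```

The point of the sketch is that these updates are mechanical and can be applied consistently across millions of records, which is why an automated pass makes sense before records that didn't match cleanly go to manual review.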

They also produced reports for where they didn’t (or couldn’t) make any changes. Using these reports, we have the opportunity to update these records manually in future.

How many records went through this process?

We sent around 1.6 million bibliographic records to them. We excluded a lot, such as Special Collections items dated before 1800, because they were unlikely to find matches. We also didn't include journals and theses in the phase-two tidy-up.

So what does this mean for library users?

The new library management system, Alma, runs authority linking every night – in the past we did authority linking manually – so new records will go through a similar process to the project we've just completed and pick up the most recent information daily. This means our records will always be up to date and will contain a lot more searchable data.

It will also co-locate authors – where we might previously have had several different versions of a single author's name, for example because of initials, these have now been brought together.
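In essence, co-location means mapping every variant form of a name to one authorised heading. Here is a minimal sketch, assuming a simple lookup table standing in for an authority file; real authority linking matches against external files such as the Library of Congress authorities, not a hand-built dictionary.

```python
# Hypothetical authority lookup: each variant form of a name maps to a
# single authorised heading, so records for the same author co-locate.
AUTHORITY = {
    "Smith, J.": "Smith, John, 1950-",
    "Smith, John": "Smith, John, 1950-",
    "Smith, John, 1950-": "Smith, John, 1950-",
}

def authorised_heading(name: str) -> str:
    """Return the authorised form of a name, or the name unchanged
    if no authority record matches (those go on an exceptions report)."""
    return AUTHORITY.get(name, name)

headings = ["Smith, J.", "Smith, John", "Jones, Mary"]
print(sorted({authorised_heading(n) for n in headings}))
# → ['Jones, Mary', 'Smith, John, 1950-']
```

Both Smith variants collapse to one heading, so a catalogue search for the author retrieves all of their works together, while unmatched names are left as-is for manual follow-up.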

And, of course, there are the subject headings – these should mean more relevant results are returned when searching.

So we're hoping that this data project means people will be able to find materials more easily, and that the data they see in our catalogue will be more consistent and up to date.

You can explore what’s next on the Library website and view the other blogs in this series right here on the blog.