Visualising library catalogues

The recent generous release by the Library of Congress of 25 million MARC records for open use prompted us to re-purpose a tool we had written for visualising and querying our own collections data. As we start to think about the re-development of our Search the Collections, we have been considering (and learning from much helpful research on) how to go about “breaking free of the tyranny of the keyword search box” by exploring other ways to interrogate and visualise data. One tool we have developed is a structured natural language interface that can query museum collection data (with a deeper understanding of the data structure used to catalogue it) and display visualisations of the results. It was relatively easy to re-purpose this tool to query MARC records (as used by libraries), first to visualise our own National Art Library catalogue and then, in principle, any other set of MARC records. So it seemed obvious to load in the Library of Congress records.

Watch the video below to see an example of exploring the library collection (best viewed in full screen).

Of course, this is not the entire collection from the Library, so it should not be taken as an authoritative view of that collection, but hopefully it provides an example of the ways in which data visualisation can help in exploring a collection. And as we’re not trying to create a new OPAC, there is a noticeable and deliberate lack of design on the results page (book cover images courtesy of Open Library Cover API).

How we did it

We made use of the following libraries and programmes:

We created an index in Elastic, holding a subset of fields from the MARC record, using elasticsearch_dsl. We then loaded the records into Elastic using Catmandu, with a custom mapping for the relevant MARC fields. The resulting index can then be queried using aggregations in Elastic, which return the counts that the visualisations are drawn from. Code to follow shortly on GitHub.
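Until that code is published, the following is a rough sketch of the general shape of this approach, not our actual implementation. It uses plain Python dictionaries rather than elasticsearch_dsl or Catmandu so that it runs without any dependencies. The index field names, the choice of MARC subfields (245$a, 100$a, 650$a, 260$c), the simplified (tag, subfield, value) record format and the function names are all assumptions made for illustration.

```python
# Illustrative sketch only: field names and record format are assumptions,
# not the mapping used in the tool described above.

# Hypothetical subset of MARC (tag, subfield) pairs mapped to index fields.
MARC_TO_FIELD = {
    ("245", "a"): "title",             # title statement
    ("100", "a"): "author",            # main entry, personal name
    ("650", "a"): "subject",           # topical subject heading
    ("260", "c"): "publication_date",  # date of publication
}

# Fields that may occur more than once in a record.
REPEATABLE = {"subject"}


def index_body():
    """Build an index-creation body for the chosen subset of fields.

    Keyword fields are stored unanalysed so they can be aggregated on;
    the title is a text field intended for full-text search.
    """
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "author": {"type": "keyword"},
                "subject": {"type": "keyword"},
                "publication_date": {"type": "keyword"},
            }
        }
    }


def to_document(marc_subfields):
    """Turn (tag, subfield_code, value) tuples into a flat document.

    Unmapped subfields are skipped, and trailing cataloguing punctuation
    such as " /" or "," is trimmed from each value.
    """
    doc = {}
    for tag, code, value in marc_subfields:
        field = MARC_TO_FIELD.get((tag, code))
        if field is None:
            continue
        value = value.rstrip(" /:,;")
        if field in REPEATABLE:
            doc.setdefault(field, []).append(value)
        else:
            # Keep the first occurrence of a non-repeatable field.
            doc.setdefault(field, value)
    return doc


def subject_counts_query(size=10):
    """Build a search body asking only for the most common subjects.

    "size": 0 suppresses the matching documents, so the response
    contains just the aggregation buckets (subject and record count).
    """
    return {
        "size": 0,
        "aggs": {
            "by_subject": {"terms": {"field": "subject", "size": size}}
        },
    }
```

The dictionaries returned by index_body() and subject_counts_query() follow the JSON request format that Elastic accepts for creating an index and for running a terms aggregation. Each bucket in the aggregation response pairs a subject heading with the number of records carrying it, which is the kind of count a chart can be drawn from.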

Relevant Research: