Open Repositories 2016

‘Better it is to get wisdom than gold’

Motto above the original entrance to the V&A

As the V&A has recently launched its research institute, VARI, and is obliged to make its publicly funded project outcomes available to all under open access (currently journal articles but not monographs; see AHRC Open Access), what better place to learn how to go about this than the 11th annual Open Repositories conference, held recently in Dublin? This gathering of university librarians, metadata analysts, developers, researchers and administrators included a day of workshops and three days of discussion about the hows and whys, pros and cons, and best practices of setting up and running an open access institutional repository (and beyond).

A Quick Guide to Open Access

Before anything else, it’s helpful to spell out the intention of open access:

Government-funded research outputs (e.g. articles and possibly data) should be available for all to read (and use) for free, not just in journals affordable only to libraries at large institutions.

(see the UK Government statement and Research Councils UK policy)

That’s it. And it would seem fairly easy to satisfy: just upload the PDFs onto your website… Oh, and a few other things:

Add the relevant metadata to the publications (who funded it? Who took part in it? What kind of outputs did it have? etc).
Make it fully text searchable.
It would help if the metadata was available via various APIs, so funders can automatically discover the outcomes of the projects they funded.
You also need to ensure the data is preserved, always accessible and migrated to new formats as and when necessary.
Only do all this when the article’s embargo ends (if it is embargoed, see Green or Gold).
Make sure you’re aware of all the articles published by researchers in your organisation, wherever and whenever they are published.
Produce regular reports and statistics on all of this, in standard formats.
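To make a couple of the items above concrete, here is a minimal sketch in Python of a publication record carrying funder and author metadata, plus the embargo check. The class name, field names and functions are all hypothetical and chosen for illustration; a real repository would follow whichever metadata schema its funders require.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class ArticleRecord:
    """Hypothetical minimal record; the field names are illustrative only."""
    title: str
    authors: List[str]
    funder: str
    published: date
    embargo_ends: Optional[date] = None  # None means no embargo applies


def is_openly_available(record: ArticleRecord, today: date) -> bool:
    """Return True once any embargo on the article has expired."""
    if record.embargo_ends is None:
        return True
    return today >= record.embargo_ends


def releasable(records: List[ArticleRecord], today: date) -> List[ArticleRecord]:
    """Keep only the records that may be made public on the given day."""
    return [r for r in records if is_openly_available(r, today)]
```

Passing the date in explicitly, rather than reading the clock inside the function, keeps the check easy to test and to re-run for reporting on past dates.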

So, a little more complex to implement than it would first seem. And so Institutional Repositories were born, to provide a home for research articles published in journals and to make available all the metadata associated with them. Later, when open access to research data was also requested, Research Data Repositories followed, and possibly other repositories for other purposes (multimedia, etc.).

Conference

With all this in mind, the majority of the talks and discussions at the conference were on the leading repository systems in use around the world, tools and utilities to help manage them, and analytics on the usage of the repositories.

Repository Systems

The open repository systems discussed at the conference were:

EPrints

EPrints is the oldest of the repository systems, developed at the University of Southampton, with a first release in 2000. Written in Perl, it probably has the best support ‘out-of-the-box’ for a UK university’s metadata requirements.

Hydra

One of the newer repository systems, Hydra is a framework that integrates several modular components such as Fedora Commons and Blacklight. Written in Ruby, it is configurable for many different purposes, with the intention that multiple ‘heads’ can share a common Fedora implementation (Institutional Repository, Multimedia repository, etc.).

DSpace

DSpace was released in 2002 and is written in Java. It provides support out of the box for many of the required metadata standards/APIs. In the UK it’s probably the most commonly used system alongside EPrints.

Islandora

Islandora combines Fedora Commons, Solr and Drupal to provide a repository system written in PHP.

Fedora

Fedora serves as the backend preservation system for multiple repository systems, but it can also be used on its own by those who want to create their own custom implementation.

Invenio

Another new(ish) repository system, Invenio is based on CERN’s document server, which has been running since 2002. It is written in Python and uses PostgreSQL and Elasticsearch as its backend.

Tools & Utilities

The conference covered a huge number of subjects related to open access and repositories, including:

ORCID – an attempt to uniquely identify researchers so their work can be tracked and automatically imported into an institutional repository.
RIOXX – a metadata application profile to record research funder and project/grant identifiers to help institutional repositories comply with the RCUK policy on open access.
Various services developed by Jisc to assist UK universities with their Open Access requirements.
The International Image Interoperability Framework (IIIF) and the many benefits of implementing this for the interchange of images between institutions (the equivalent of our collections API for images).
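As a small illustration of the ORCID item above: an ORCID iD is 16 characters long, and the last one is a check character computed from the first 15 digits using ISO 7064 MOD 11-2. A repository importing records can use this to reject mistyped identifiers before looking them up. The sketch below is Python, and the function names are my own, not part of any ORCID library.

```python
import re


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, character set and check character of an ORCID iD.

    Accepts the usual hyphenated form (e.g. 0000-0002-1825-0097) or the
    same 16 characters without hyphens.
    """
    compact = orcid.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{15}[0-9X]", compact):
        return False
    return orcid_check_character(compact[:15]) == compact[15]
```

This only confirms that an identifier is well formed; it says nothing about whether the iD is actually registered or belongs to the researcher in question.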

Analytics

There was much discussion on providing correct analytics for access to repositories, especially around the issue of trying to remove bots (web search engine crawlers indexing pages) from the stats. This is important for the use of altmetrics to measure the impact of a paper, which may be used in future research funding allocation decisions.
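The basic idea of removing bots from the stats can be sketched in a few lines of Python. This is a deliberately naive version that looks for tell-tale words in the user agent string; the marker list, the Hit structure and the function names are all assumptions made for illustration, and a production system would rely on a maintained robots list and probably on behavioural signals too.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    """One download or page view, reduced to the fields used here."""
    item_id: str
    user_agent: str


# Illustrative substrings only; real crawler lists are far longer.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_bot(user_agent: str) -> bool:
    """Guess whether a request came from a crawler, based on its user agent."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def human_hit_counts(hits: Iterable[Hit]) -> Dict[str, int]:
    """Count hits per repository item, skipping anything that looks like a bot."""
    counts: Dict[str, int] = {}
    for hit in hits:
        if is_bot(hit.user_agent):
            continue
        counts[hit.item_id] = counts.get(hit.item_id, 0) + 1
    return counts
```

Substring matching like this will miss crawlers that disguise themselves as ordinary browsers, which is part of why the topic generated so much discussion.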

Lack of Museum Representation

I was rather surprised to be (as far as I know) one of only two people there from a museum (certainly from the UK). Whilst the open access movement started in the sciences, it has had an impact across all the disciplines. It may be that museum-based researchers are often affiliated with a university, so for the moment are using a university repository for their publications.

It may be helpful for those museums who are planning on setting up a repository to meet to discuss some of the issues unique to us. For example, how does research work related to the museum’s collections data fit into the open access requirements? (Does making available a non-standard public collections API suffice? Would copies of the updated collection object’s data stored in an Institutional Research Data Repository be suitable for funders? Would using Zenodo to store the data be suitable?)

Jargon

One commonly mentioned point at the conference was the seeming incomprehensibility of the language used in the world of IR (Institutional Repository) and OA (Open Access). New acronyms seem to be coined daily, and each funder tends to have its own required metadata schema, probably with a name even more cryptic than the previous one. This makes the field quite hard for a newcomer to understand.

Summary

For the V&A’s immediate needs, this was an extremely useful way to quickly learn about the current state of open access and institutional repository technology, and related projects.

Thanks

Thanks to the many helpful conference attendees who guided me through the technologies, acronyms and projects that make up the world of Open Repositories.
