Persistent identifiers: more important than you think

Article dated: 22-Feb-11

Researchers: Have you ever wondered why your work is not cited more often? It could be related to the way your data is identified.

The Digital Object Identifier (DOI) framework is an international standard for identifying objects (or information about them) in a globally unique way. By using a DOI you ensure that even if the location of an object, or the information about it, changes, the DOI will always link or 'resolve' to the same object.


So what exactly is a DOI? It is a string of letters and numbers that can be used to make resources directly available to anyone over the internet. When the identifier is persistent, a researcher can remain confident that a citation will always resolve to the original object.
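
As a rough illustration of what 'resolving' means in practice, the sketch below splits a DOI into its prefix and suffix and builds the web address that the public DOI resolver at doi.org would redirect from. The DOI used here (10.1234/example-study-1) and the function names are hypothetical, chosen only for the example; they are not identifiers or tools used by the Archive.

```python
from urllib.parse import quote

# Public DOI proxy resolver; a request to RESOLVER + DOI is redirected
# to wherever the object currently lives.
RESOLVER = "https://doi.org/"


def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first slash.

    The prefix identifies the registrant and always starts with "10.";
    the suffix is chosen by the registrant.
    """
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi):
    """Build the resolver URL for a DOI, percent-encoding the suffix."""
    prefix, suffix = split_doi(doi)
    return RESOLVER + prefix + "/" + quote(suffix, safe="/")


# Hypothetical DOI, for illustration only.
print(resolver_url("10.1234/example-study-1"))
```

The citation carries only the DOI, never the current location, which is why the link keeps working if the object moves.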

These issues are important to the UK Data Archive, as we are in the business of managing, sharing and preserving digital objects.

We refer to the collection of files that make up an object as a study. As a key part of our management process, each study is assigned an identifier unique within the Archive and has an associated text citation, available with the study, to allow researchers to cite the work in their own publications.

Collaborating to establish standards

DataCite was founded by organisations from six countries, including the British Library, to establish research data as legitimate, citable, and capable of verification and re-use – all goals the Archive holds dear. DataCite sees DOIs as an effective method of persistent identification and has already passed the milestone of one million registered objects.

The Archive is now using a pilot service in conjunction with DataCite to test the implementation of DOIs for each instance of a study. This will guarantee that with each significant change to a study (such as an amendment or addition to the data or metadata), a new DOI is assigned and researchers can confidently go beyond the study level and cite individual versions of a dataset.

During the pilot, the Archive will clearly define 'significant changes', review its citation structure and assign DOIs to a subset of its 5,200 catalogued studies. Users accessing a DOI over the web will be taken to a page describing the full history of each version of the study. It will be immediately clear whether the DOI resolves to the latest or a past version of a study and what changes have been made over time. These DOI 'jump' pages will provide links to the full Data Catalogue at ESDS.
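
The versioning approach described above can be sketched in a few lines. This is only a minimal model under stated assumptions, not the Archive's implementation: the DOI prefix 10.1234, the 'study-N-vM' naming pattern and the class and method names are all hypothetical, and what counts as a 'significant change' is left to the caller because the pilot has yet to define it.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    doi: str
    note: str  # short description of what changed


@dataclass
class Study:
    study_id: int
    prefix: str = "10.1234"  # hypothetical DOI prefix
    versions: list = field(default_factory=list)

    def record_change(self, note, significant):
        """Mint a new DOI for a significant change and return it.

        The first deposit always gets a DOI. Later changes that are
        not significant return None and leave the history untouched.
        """
        if self.versions and not significant:
            return None
        number = len(self.versions) + 1
        # Hypothetical naming pattern, for illustration only.
        doi = f"{self.prefix}/study-{self.study_id}-v{number}"
        self.versions.append(Version(doi, note))
        return doi

    def is_latest(self, doi):
        """True if the DOI identifies the most recent version."""
        return bool(self.versions) and self.versions[-1].doi == doi

    def history(self):
        """Version history, as a 'jump' page might list it."""
        return [(v.doi, v.note) for v in self.versions]


study = Study(study_id=1)
first = study.record_change("initial deposit", significant=True)
study.record_change("typo fixed in documentation", significant=False)
second = study.record_change("new wave of data added", significant=True)
print(study.history())
print(study.is_latest(first), study.is_latest(second))
```

Because every earlier DOI stays in the history, a citation of a past version still resolves, and the jump page can show that a newer version exists.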

The Archive will also promote its use of DOIs to depositors and researchers as an efficient and coherent means to cite source data or metadata. For researchers, this avoids inconsistent references and broken links and simplifies the citation process. For depositors, it ensures that results will be verifiable and re-usable, and increases the visibility of their name and work.

As the Archive focuses on its commitment to the full data lifecycle and adopts more detailed management methods and technologies such as those offered by web services and the Data Documentation Initiative (DDI3), we expect to increase our adoption of DOIs and persistent identification.

Archive Director Matthew Woollard showcased our experiences to date with DOIs in a talk titled 'Persistent Identifiers at the UK Data Archive' at a workshop on Persistent Identifiers for the Social Sciences, held in Germany in early February 2011.