HOW TO CURATE DATA
TRUSTED DIGITAL REPOSITORIES
TRUST IN DATA ARCHIVES: A SOCIAL SCIENCE PERSPECTIVE
Trust has always been critical to our relationships with depositors and users, but it has become a more formal issue as standards and best practices emerge. Trust is also a key theme in digital preservation.
The UK Data Archive has to address trust in two important areas by providing:
- access to data which may be sensitive in some way or another
- preservation services which allow the long-term reuse of any data, despite their potentially disclosive nature
In the first case, data owners like the Office for National Statistics have to trust the staff of the Archive not only to provide access solely to authorised users, but also to carry out services such as ingest processing (which includes ensuring the data are appropriately anonymised and internally consistent) and data archiving (managing the data within a secure environment) without disclosing any sensitive information.
Trust is even more important when we deal with data that require provisions over and above our standard information security procedures.
Trust ought to be transitive: data subjects who trust data owners to look after information about them appropriately should implicitly trust the data archives and repositories that become custodians of these data.
In the second case, data users have to trust that the data that the Archive holds and makes accessible are the same data that have been deposited by the data owners. The data should remain so as we migrate them to new standards and formats to support their long-term preservation.
The data we provide should be not only usable, but also authentic and reliable versions of the data we receive. Data users also have the right to know whether the reproducibility of results will be affected by changes to the data.
Increasingly, and especially with transnational access to data, repositories that are licensed to provide access to data, and that can assign access to these data to other archives, have to trust each other.
In the age of 'big data', with a growing focus on the use of administrative data for research, data creators, repositories and users increasingly rely upon each other for services as well as data.
Clearer criteria for trust benefit all.
This network of trust is underpinned by standards, legislation and criteria for trustworthiness.