HOW WE CURATE DATA
What happens once data come into the UK Data Archive? We follow current best practice in preparing, curating and documenting our digital data to ensure continuous access.
The UK Data Archive has been curating data for more than 40 years, over which time it has gained a wealth of expertise in all aspects of data preparation and curation.
All acquisitions to the Archive undergo a number of steps to prepare them for sharing and re-use. We always contact data owners or creators for advice and clarification if queries arise during the processing phase of our curation activities.
Step 1. Transfer of data
Even before data arrive at the Archive, staff liaise with government departments, researchers and other data owners/creators to ensure that data arrive at the Archive in the best possible shape for processing.
Step 2. Assigning processing standard
We assign one of four processing standards (levels) to each data collection that we archive. This processing standard determines the work we carry out on the data collection. The selected standard depends on the nature and condition of the data, the quantity and complexity of the documentation, the estimated level of use and whether or not the data collection is to be made available through online browsing tools.
Further information may be found in our data processing standards document.
Step 3. Data processing
Before processing, we examine the data against the documentation supplied and, where appropriate, validate the data and check the data labels and the overall integrity of the data. Where necessary, we check confidentiality and carry out anonymisation, usually in collaboration with the depositor.
We then prepare variable lists for survey data or data listings for qualitative and historical studies. Using in-house scripts, and often manual processing, we prepare archival and dissemination versions of the data.
Enhanced labelling, grouping of survey variables and mark-up in XML for interview texts are carried out for the highest processing levels. This allows access via the online systems.
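As an illustration only, the kind of check described at the start of this step (examining data against the documentation supplied) can be sketched in Python. This is a minimal sketch, not the Archive's in-house scripts: the function name check_against_codebook, the codebook layout and the sample data are all hypothetical and invented for the example.

```python
import csv
import io


def check_against_codebook(rows, codebook):
    """Return a list of problems found when comparing data to documentation.

    rows: iterable of dicts, one per record (for example from csv.DictReader).
    codebook: hypothetical structure mapping each variable name to
              {"label": str, "codes": set of allowed values, or None}.
    """
    problems = []
    rows = list(rows)
    data_vars = set(rows[0].keys()) if rows else set()

    # Variables present in the data but absent from the documentation.
    for name in sorted(data_vars - set(codebook)):
        problems.append(f"variable {name!r} is in the data but not documented")

    # Variables documented but absent from the data.
    for name in sorted(set(codebook) - data_vars):
        problems.append(f"variable {name!r} is documented but missing from the data")

    # Every documented variable should carry a label.
    for name, entry in codebook.items():
        if not entry.get("label"):
            problems.append(f"variable {name!r} has no label")

    # Values must belong to the documented code list, where one exists.
    for number, row in enumerate(rows, start=1):
        for name, entry in codebook.items():
            codes = entry.get("codes")
            if codes is not None and name in row and row[name] not in codes:
                problems.append(
                    f"record {number}: {name}={row[name]!r} is not a documented code"
                )
    return problems


# Invented sample data: 'region' is undocumented, 'age' is missing,
# and the second record uses an undocumented code for 'sex'.
SAMPLE_CSV = "id,sex,region\n1,1,N\n2,3,S\n"
CODEBOOK = {
    "id": {"label": "Respondent identifier", "codes": None},
    "sex": {"label": "Sex of respondent", "codes": {"1", "2"}},
    "age": {"label": "Age in years", "codes": None},
}

problems = check_against_codebook(csv.DictReader(io.StringIO(SAMPLE_CSV)), CODEBOOK)
for line in problems:
    print(line)
```

A report of this kind would give a processor a concrete list of queries to raise with the depositor. Confidentiality checks and anonymisation need human judgement and are not covered by the sketch.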
Step 4. Documentation processing
At the same time as processing data we process documentation. Documentation is all the relevant material we feel that users will need to make the best use of the data. If documentation is incomplete we request more from the depositor, or where possible, create it ourselves. We then prepare the materials into a relevant usable format, such as a collated online user guide. More detailed documentation is prepared for enhanced user resources, such as the user guides created by the UK Data Service.
Further information may be found in our documentation processing procedures document.
Step 5. Metadata creation
Metadata creation is carried out at the same time as steps 3 and 4 and often continues after they are complete. We create a metadata record based on information provided by depositors in the deposit form and on additional work by our specialist cataloguers. This catalogue record is based on the Data Documentation Initiative (DDI) standard. It allows our collections to be searched and our citation record to be created.
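To give a feel for what a DDI-based record looks like, here is a minimal sketch that builds a small study description with Python's standard xml.etree.ElementTree module. The element names (codeBook, stdyDscr, citation, titlStmt, titl, IDNo, rspStmt, AuthEnty) are taken from the DDI Codebook vocabulary. Everything else is an assumption made for the example: the function name build_study_record is hypothetical, the title, identifier and creator are invented, and a real catalogue record carries far more fields, along with the XML namespace that is omitted here.

```python
import xml.etree.ElementTree as ET


def build_study_record(title, study_number, creator):
    """Build a very small DDI Codebook style study description.

    Illustrative only: a production record would include the DDI
    namespace, abstract, keywords, coverage, access conditions, etc.
    """
    code_book = ET.Element("codeBook")
    study = ET.SubElement(code_book, "stdyDscr")
    citation = ET.SubElement(study, "citation")

    # Title statement: the study title and its identifier.
    title_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_stmt, "titl").text = title
    ET.SubElement(title_stmt, "IDNo").text = str(study_number)

    # Responsibility statement: who created the data.
    responsibility = ET.SubElement(citation, "rspStmt")
    ET.SubElement(responsibility, "AuthEnty").text = creator

    return code_book


# Invented values, used only to show the shape of the output.
record = build_study_record("Example Household Survey", 9999, "Example Research Team")
print(ET.tostring(record, encoding="unicode"))
```

Because the title, identifier and creator sit in predictable places, the same record can feed both a search index and a formatted citation.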
Step 6. Additional user information
Once the main work has been completed on preparing the data and documentation, some additional documentation is prepared, including a 'Read' file which gives details of the processing levels of the data collections. Depositors are given the opportunity to see what we will be archiving and making accessible.
Step 7. Publishing data
After all the data and documentation are complete, the material we hold at the Archive is transferred to the preservation system. The system is designed in such a way as to verify all the files which are added to it and to ensure that there are no problems with these files. A subset of this material is brought together to allow users to order data easily. At this stage the catalogue record is published and the data become available to users. A DataCite DOI is created for every collection, taking the form, e.g. 10.5255/UKDA-SN-3314-1. Read more about our citation approach.
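For readers who handle these identifiers in their own scripts, the DOI form shown above can be checked and split with a short regular expression. This is a sketch under stated assumptions: the text gives only the example 10.5255/UKDA-SN-3314-1, so reading the two trailing numbers as a study number and a version number is our interpretation, and the function names are hypothetical. The https://doi.org/ prefix is the standard DOI resolver.

```python
import re

# Pattern inferred from the single example in the text. Treating the last
# two numeric parts as "study number" and "version" is an assumption.
DOI_PATTERN = re.compile(r"^10\.5255/UKDA-SN-(?P<study>\d+)-(?P<version>\d+)$")


def parse_collection_doi(doi):
    """Split a collection DOI into (study_number, version) integers."""
    match = DOI_PATTERN.match(doi)
    if match is None:
        raise ValueError(f"not a recognised collection DOI: {doi!r}")
    return int(match["study"]), int(match["version"])


def doi_url(doi):
    """Return the resolver URL for a DOI, validating its form first."""
    parse_collection_doi(doi)
    return "https://doi.org/" + doi


print(parse_collection_doi("10.5255/UKDA-SN-3314-1"))  # (3314, 1)
print(doi_url("10.5255/UKDA-SN-3314-1"))
```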
Step 8. Delivering data
We deliver most of the data we hold via a download system which is linked to our resource discovery system. We administer data access and manage the conditions associated with access of all data collections. We also deal with most user enquiries ourselves, and we keep records of the use of data for administrative and strategic purposes.
Step 9. Preserving data
We manage the preservation and curation of data using techniques built up over years of practice and informed by the use of diverse standards. In essence, we keep multiple back-ups of data, in multiple versions, in secure locations which ensure no data loss. We also follow a strategy of migrating data and other files when new formats become commonplace. Since 2010 our organisation has been certified under the international ISO 27001 standard for information security. A number of government departments have also carried out surveillance visits to the Archive to clarify our information security regime.
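Multiple back-ups protect against loss only if the copies can be shown to match. One common way to confirm this is to compare checksums. The sketch below assumes SHA-256 digests computed with Python's standard hashlib module. The text does not say which technique the Archive uses, and the function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_copies(reference, copies):
    """Return the copies whose contents differ from the reference file."""
    expected = sha256_of(reference)
    return [copy for copy in copies if sha256_of(copy) != expected]
```

Running a comparison like this on a schedule, and after every format migration, would flag a damaged copy while a good copy is still available to restore from.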