Preparing data for deposit

"Guidance on preparing and managing data"

Whether depositing large-scale survey data in our curated collection or smaller research collections via our self-deposit system, ReShare, data creators should consult our guidance below on preparing data. Ideally, this should happen before fieldwork or data collection begins. In addition to the summary points noted here, we provide comprehensive best practice guidance, aimed at individual researchers and research support staff, on our Manage data web pages.

We run a programme of regular training workshops covering key areas of managing and sharing research data. Please also get in touch with us if you would like to discuss any of these issues further.

Data files
Confidential data

Preparing data documentation

Data documentation should give future researchers sufficient information to understand and reuse the data. Consider which survey, data and other contextual documentation can explain what the data mean, how they were collected and which methods were used to create them. For survey series, ensure that documentation refers to the current year’s data.

Types of survey documentation

  • survey technical report with standard headings, describing sampling, achieved sample size, fieldwork, weighting and so on; the level of detail may vary depending on the scale and resourcing of the survey. See the Health Survey for England Report for gold-standard documentation
  • information leaflets and consent forms
  • questionnaires, with universe and routing instructions
  • showcards
  • interviewer instructions – if these are commercially sensitive, then a summary of briefing content
  • coding frames and coding instructions
  • links to primary reports and publications
  • information about questionnaire variables that have been removed and the reason for removal, e.g. confidentiality
  • information about any known errors or issues in the data
  • structured information for the UK Data Service collection-level metadata record, using DDI-compliant metadata and controlled vocabularies, as set out in our deposit form

Types of data documentation

  • variable list
  • links from variables to questions in the questionnaire (CAI or otherwise)
  • code book or data dictionary (we can also generate a data dictionary from the data files at the ingest stage)
  • clear, unique definitions of variables
  • geographical identifiers or spatial units should be defined using:
    • a unique name
    • referenced definition or authority
    • datestamp of when unit boundaries were defined, not when the sampling was recorded
  • weighting variables described
  • syntax for any derived variables
  • where possible or practical, changes in key content (questions and variables) over time, e.g. the summary of changes in topics and sampling in the User Guide for the Health Survey for England
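
As an illustration of the code book or data dictionary item above, the following is a minimal Python sketch of how a depositor might draft a basic data dictionary from a CSV export before deposit. It is not the process used at the ingest stage. The function names (build_data_dictionary, infer_type), the output fields and the sample variable names are assumptions made for this example only.

```python
import csv
import io


def infer_type(values):
    """Guess a simple storage type from the non-missing values of a column."""
    if not values:
        return "empty"
    for caster, type_name in ((int, "integer"), (float, "decimal")):
        try:
            for value in values:
                caster(value)
            return type_name
        except ValueError:
            continue
    return "text"


def build_data_dictionary(csv_text, labels=None):
    """Return one dictionary entry per column of the CSV text.

    `labels` is an optional mapping of variable name to a human-readable
    label (hypothetical; supply labels taken from your questionnaire).
    """
    labels = labels or {}
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        values = [row[name] for row in rows]
        # Treat empty cells (and cells absent from short rows) as missing.
        present = [v for v in values if v not in ("", None)]
        entries.append({
            "variable": name,
            "label": labels.get(name, ""),
            "type": infer_type(present),
            "n_missing": len(values) - len(present),
            # Show at most ten distinct values as a quick check of the coding.
            "distinct_values": sorted(set(present))[:10],
        })
    return entries


if __name__ == "__main__":
    sample = "respid,age,region\n1,34,North East\n2,,London\n"
    for entry in build_data_dictionary(sample, {"age": "Age at interview"}):
        print(entry)
```

A draft produced this way still needs manual review: variable labels, value labels, weighting descriptions and links to questionnaire items have to come from the survey documentation itself.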

Qualitative data

  • topic guide for qualitative data
  • data list to serve as finding aid

Access and licensing
