Anonymisation

Anonymisation is a valuable tool that allows data to be shared whilst preserving privacy. Anonymising data means changing identifiers in some way, for example by removing, substituting, distorting, generalising or aggregating them.

A person's identity can be disclosed from:

  • Direct identifiers such as names, postcode information or pictures
  • Indirect identifiers which, when linked with other available information, could identify someone, for example information on workplace, occupation, salary or age

You decide which information to keep so that the data remain useful, and which to change. Removing key variables, applying pseudonyms, generalising and removing contextual information from textual files, and blurring image or video data could result in important details being missed or incorrect inferences being made. See example 1 and example 2 for balancing anonymisation with keeping data useful for qualitative and quantitative data.

Anonymising research data is best planned early in the research to help reduce anonymisation costs, and should be considered alongside obtaining informed consent for data sharing or imposing access restrictions. Personal data should never be disclosed from research information, unless a participant has given consent to do so, ideally in writing.

Step-by-step

Follow these steps to anonymise a data file:

Find and highlight direct identifiers

  • Quantitative: visually scan variables
  • Qualitative: read the transcript

Assess indirect identifiers

  • Can the identity of a participant be known from information in the data file?
  • Can a third party be identified or harmed through information in the data file?

Assess the wider picture

  • Quantitative: run descriptive statistics and crosstabs to find unique cases and combinations of variables that can identify an individual in the dataset
  • Qualitative: consider which identifying information about an individual participant can be pieced together from all the data and documentation available to a user
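
For quantitative data, the search for unique combinations can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `find_unique_combinations`, the threshold `k` and the sample rows are all assumptions made up for the example.

```python
from collections import Counter


def find_unique_combinations(records, indirect_identifiers, k=2):
    """Return value combinations shared by fewer than k records.

    `k` is a hypothetical threshold: with k=2, only combinations that
    occur exactly once (unique cases) are reported.
    """
    counts = Counter(
        tuple(record[name] for name in indirect_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Invented rows, for illustration only.
records = [
    {"occupation": "teacher", "age_band": "30-39", "region": "North"},
    {"occupation": "teacher", "age_band": "30-39", "region": "North"},
    {"occupation": "surgeon", "age_band": "60-69", "region": "North"},
]

risky = find_unique_combinations(
    records, ["occupation", "age_band", "region"]
)
print(risky)
```

Here the single surgeon aged 60-69 is the only case reported, because no other record shares that combination. Such cases are candidates for the aggregation or redaction steps that follow.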

Remove (or pseudonymise) direct identifiers
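
As a rough sketch of this step, the Python below replaces a name with a consistent pseudonym and drops another direct identifier. The function `pseudonymise`, the field names and the "Participant N" label format are assumptions chosen for the example, not part of any particular tool.

```python
def pseudonymise(records, id_field, drop_fields=(), prefix="Participant"):
    """Replace `id_field` with a consistent pseudonym and drop other fields.

    Returns the cleaned records and the lookup key. The key links
    pseudonyms back to real identities, so it should be stored
    separately from the shared data (an assumption of this sketch).
    """
    key = {}  # original value -> pseudonym
    cleaned = []
    for record in records:
        original = record[id_field]
        if original not in key:
            key[original] = f"{prefix} {len(key) + 1}"
        row = {k: v for k, v in record.items() if k not in drop_fields}
        row[id_field] = key[original]
        cleaned.append(row)
    return cleaned, key


# Invented rows, for illustration only.
records = [
    {"name": "Alex Example", "email": "alex@example.org", "score": 4},
    {"name": "Sam Sample", "email": "sam@example.org", "score": 2},
    {"name": "Alex Example", "email": "alex@example.org", "score": 5},
]

cleaned, key = pseudonymise(records, "name", drop_fields=("email",))
```

The same person always receives the same pseudonym, so repeated records can still be linked within the file after the names and email addresses are gone.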

Aggregate or blur (in)direct identifiers
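
One possible way to aggregate is to replace exact values with broader categories. The sketch below bands ages into ten-year groups and keeps only the first part of a UK postcode. The helper names, the band width and the assumption that postcodes are written with a space are all illustrative choices.

```python
def age_band(age, width=10):
    """Map an exact age to a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def postcode_area(postcode):
    """Keep only the part of a UK postcode before the space.

    Assumes the postcode is written with a space, e.g. 'SW1A 1AA'.
    """
    return postcode.strip().split()[0].upper()


def generalise(record):
    """Return a copy of the record with age and postcode generalised."""
    row = dict(record)
    row["age"] = age_band(row["age"])
    row["postcode"] = postcode_area(row["postcode"])
    return row


# Invented record, for illustration only.
result = generalise({"age": 34, "postcode": "sw1a 1aa", "salary": 30000})
print(result)
```

How coarse the bands need to be depends on the data: wider bands lower the disclosure risk but also remove detail that users of the data may need.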

Redact indirect identifiers
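
For textual data, redaction can be sketched as replacing identifying phrases with a bracketed generic description, so that readers can see something was changed. The function `redact`, the bracket convention and the sample sentence are assumptions for illustration. The list of phrases to replace still has to come from reading the transcript.

```python
import re


def redact(text, replacements):
    """Replace each listed phrase with a bracketed generic description.

    Matching is case-insensitive and limited to whole words.
    """
    for term, label in replacements.items():
        pattern = re.compile(
            r"\b" + re.escape(term) + r"\b", flags=re.IGNORECASE
        )
        # A function is used as the replacement so the label is
        # inserted literally, without backslash interpretation.
        text = pattern.sub(lambda match: f"[{label}]", text)
    return text


# Invented sentence, for illustration only.
sentence = "I started at Acme Hospital in 2019 as head of cardiology."
redacted = redact(
    sentence,
    {
        "Acme Hospital": "workplace",
        "head of cardiology": "senior medical role",
    },
)
print(redacted)
```

Replacing a phrase with a description of its type, rather than deleting it, keeps some of the context that makes the transcript usable.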

Re-assess any remaining disclosure risk


An add-on MS Word anonymisation macro tool for qualitative data is available from our tools collection.