Research data management

The importance of managing and sharing data

In the digital age, the volume of research data generated has grown exponentially, and data can now be stored, kept and exchanged around the world with ease. Digital infrastructures and the internet facilitate both the creation of ever larger amounts of research data and their sharing.

Since 2000 we have seen a boom in both the drivers of data sharing and the human and technological capability to share. Data sharing is increasingly encouraged or required by research funders and journal publishers, and also by the research community itself. Research funders want to maximise the scientific outputs and benefits to society from their investments and make sure that data can be reused for future research.

Data lifecycle

Increase opportunities for learning and innovation

Data often have a longer lifespan than the research project that creates them. Researchers may continue to work on data after funding has ceased. Follow-up projects may also analyse or add to the data, and data are often reused by other researchers.

Well organised, well documented, preserved and shared data are invaluable to advance scientific inquiry and to increase opportunities for learning and innovation.

Plan to share

Plan ahead to create high-quality, shareable research data

In research projects, early planning is essential to make sure that activities are considered in detail and organised so that the work is completed efficiently and successfully.

The same applies to planning how research data will be managed over the length of a research project and beyond. In this digital age, most research projects are data-centric, so research needs to be planned around the data. A Data Management Plan is therefore the ideal planning tool for researchers.

Planning for data sharing also means considering the FAIR Data Principles. This standard framework, endorsed by funders, institutions and data repositories, aims to ensure that research data are Findable, Accessible, Interoperable and Reusable.

Find out more

Below are links to web pages containing best practice guidance on data management planning:

Data management planning overview

Data management checklist

ESRC data management plan and policy

Data management roles and responsibilities

Data management costing

Data sharing strategies

FAIR data principles

CARE data principles

Rights in data

Rights relating to research data

Rights in research data refer to the legal and ethical entitlements associated with the creation, use, and sharing of research data. These rights are often tied to intellectual property laws such as copyright and implemented through legal tools such as data licensing. Data licences govern who can access, use, modify, or distribute the data.

Intellectual property rights

Intellectual property (IP) rights apply to research data and play a key role when creating, sharing, and reusing data. Many kinds of data created as part of a research project are subject to the same rights as literary or artistic work. These might include texts, maps, audiovisual recordings, or information arranged in a database structure. Such items attract rights such as copyright, or more general IP rights, when they are created. This gives the rights owner control over the exploitation of their work, such as the right to copy and adapt the work, the right to rent or lend it, the right to communicate it to the public, and the right to license and distribute it. The two types of rights most relevant to research data are copyright and database rights. These rights need to be considered when creating, using, and sharing data.

Find out more

Below are links to web pages which give further information on rights relating to research projects:

Copyright

Licensing

Copyright scenarios

Other rights

Cite the data

Resources on rights

Collaborative research

Provide a data management framework for researchers

Large-scale and collaborative research is becoming more commonplace, with many research projects taking a cross-national and interdisciplinary approach to research.

This brings additional data management challenges in providing shared storage and access, and in transferring research data between the various partners or institutions.

Due to the nature of these projects, the coordination and streamlining of data management become important tasks.

Find out more

Below are links to web pages which give further information on strategies for collaborative research:

Data inventory

Standards

File sharing

Resources library

Data Protection

The legal landscape

Much research data – even sensitive data – can be shared legally if researchers employ strategies of informed consent, anonymisation and controlling access to data.

Researchers obtaining data from people are expected to comply with the relevant legislation, such as data protection legislation (e.g. the UK General Data Protection Regulation (UK GDPR) and the UK Data Protection Act 2018). Carrying out an assessment of disclosure risk can help to apply best practices of gaining consent, anonymising data and regulating access to enable data to be shared.

Some ethical issues, such as the duty of confidentiality, are legally binding.

Find out more

Below are links to web pages which give further information on legal obligations and practical guidance on how to meet them:

Access control

Data protection legislation

Disclosure assessment

Further resources on legal issues

Ethical issues

Upholding ethical standards

Most researchers will confront ethical considerations in practice when writing a research proposal, applying for ethics approval, or having to deal with ethical dilemmas that arise during a research project.

Collecting, using and sharing data in research with people all require that researchers meet ethical obligations. These responsibilities relate to respecting people, being fair with both research participation and the benefits of research, and minimising harm.

Upholding scientific standards (integrity) and complying with the law (data protection and intellectual property) are often also considered ethical duties.

In the aftermath of the Facebook and Cambridge Analytica scandal, involving inappropriate data sharing, it is even more imperative that the research community upholds ethical standards.

Interactive module

Ethical consent and data sharing

Find out more

Below are links to web pages which give further information on ethical obligations and practical guidance on obtaining consent for sharing data:

Ethical obligations

Consent for data sharing

Resources on ethical issues

Storing data

Keep your digital data safe, secure and recoverable

Ensuring your data are safe is crucial to any research project. A good storage and backup strategy will help prevent potential data loss.

Ensuring data security requires attention to physical security, network security, and the security of computer systems and files to prevent unauthorised access or unwanted changes to data, disclosure, or destruction. Data security arrangements need to be proportionate to the nature of the data and the risks involved.

Encryption can be used to safely store and send files. Regular backups protect against accidental or malicious data loss, and this procedure can be easily automated. Data needs to be securely destroyed once it is no longer needed, as merely deleting files and reformatting a hard drive will not prevent data recovery.
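The automated backup mentioned above can be sketched in a few lines. The following is a minimal illustration using only the Python standard library, not a complete backup strategy: it copies a file to a backup folder and confirms that the copy is identical by comparing SHA-256 checksums. The function names `sha256_of` and `backup_file`, and the file name `interviews.csv`, are hypothetical and chosen only for this example.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and verify the copy by checksum."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} failed verification")
    return target


# Demonstration on a throwaway file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "interviews.csv"  # hypothetical file name
    original.write_text("id,answer\n1,yes\n")
    copy = backup_file(original, Path(tmp) / "backup")
    backup_ok = sha256_of(original) == sha256_of(copy)
```

In practice a script like this would be scheduled to run regularly and would write to a separate device or location. It does not encrypt the copy, so sensitive files would still need to be encrypted with a dedicated tool.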

Find out more

Below are links to web pages containing best practice guidance on data storage and security:

Security

Storage

Encryption

Backup

Disposal

Formatting data

Create well organised and sustainable data

Research data exist in many different forms: textual and numerical data, databases, geospatial data, images, audio-visual recordings and data generated by machines or instruments. Digital data exist in specific file formats, which are coded so that a software programme can read and interpret these data.

Using standard, interchangeable or open lossless data formats ensures longer-term usability of data. For long-term preservation, digital data are typically converted to such formats.

Data files should be clearly named, well organised, well structured, quality-checked and version-controlled throughout the research. It is vital to develop suitable procedures before data gathering starts in order to adhere to any conventions, instructions, guidelines or templates that will help to ensure quality and consistency across a data collection.
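One way to keep file names consistent is to generate them from a fixed pattern rather than typing them by hand. The sketch below assumes a hypothetical convention of project, description, date and version number, separated by underscores; the function name `build_file_name` and the pattern itself are illustrative assumptions, and any convention agreed by the project team would serve equally well.

```python
import re
from datetime import date


def build_file_name(project: str, description: str, created: date,
                    version: int, extension: str) -> str:
    """Build a file name of the form project_description_YYYYMMDD_vNN.ext."""

    def slug(text: str) -> str:
        # Lower-case the text and replace runs of other characters with "-".
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (f"{slug(project)}_{slug(description)}_"
            f"{created:%Y%m%d}_v{version:02d}.{extension.lstrip('.')}")


example_name = build_file_name(
    "Health Survey", "Interview 3 transcript", date(2024, 3, 5), 2, "txt"
)
# example_name == "health-survey_interview-3-transcript_20240305_v02.txt"
```

Putting the date in year-month-day order and zero-padding the version number means that files sort chronologically and by version in an ordinary file listing.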

Find out more

Below are links to web pages containing best practice guidance on formatting data:

Versioning

Digitisation

File formats

Organising

Quality

Transcription

Recommended formats

Anonymising data

Data privacy strategies

Wherever possible, data privacy strategies should take the form of a three-pronged approach: consent, anonymisation, and access control. This strategy ensures that data from human participants are shared ethically and legally while preserving privacy and maintaining usability for secondary research. For unconsented studies, enhanced anonymisation and strict access control are essential.

Preserving the privacy of participants through data modifications

Anonymisation is a valuable tool that allows data collected from human participants to be shared safely, within both ethical and legal boundaries. This requires modifying identifiable data elements through removal, substitution, distortion, generalisation or aggregation.

A person’s identity can be disclosed from:

Data elements that directly identify the participants. These are known as direct identifiers. Examples include names, full addresses, identification numbers such as NHS numbers and raw audio and visual data capturing living individuals.
Data elements that indirectly identify the participants. These are known as indirect identifiers which, when linked with other available information, could identify someone. Examples include ages, education information, employment details, ethnicity, income and other financial information.
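The modification techniques described above can be illustrated on a single record. The sketch below is a simplified example under stated assumptions, not a complete anonymisation procedure: the field names, the set `DIRECT_IDENTIFIERS`, the ten-year age bands and the pseudonym `P001` are all hypothetical. It removes direct identifiers, substitutes a pseudonym, and generalises an exact age into a band.

```python
from typing import Any, Dict

# Hypothetical field names treated as direct identifiers in this example.
DIRECT_IDENTIFIERS = {"name", "address", "nhs_number"}


def age_band(age: int, width: int = 10) -> str:
    """Generalise an exact age into a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def anonymise(record: Dict[str, Any], pseudonym: str) -> Dict[str, Any]:
    """Return a modified copy of record; the original is left unchanged."""
    # Removal: drop fields that directly identify the participant.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Substitution: refer to the participant by a pseudonym instead.
    cleaned["participant"] = pseudonym
    # Generalisation: replace the exact age (an indirect identifier) with a band.
    if "age" in cleaned:
        cleaned["age_band"] = age_band(cleaned.pop("age"))
    return cleaned


record = {
    "name": "Jane Example",      # invented participant
    "nhs_number": "0000000000",  # placeholder, not a real number
    "age": 37,
    "occupation": "teacher",
}
result = anonymise(record, "P001")
# result == {"occupation": "teacher", "participant": "P001", "age_band": "30-39"}
```

Whether a field such as occupation also needs generalising depends on the other information available about participants, so a sketch like this is a starting point for a disclosure assessment rather than a substitute for one.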

The Information Commissioner’s Office (ICO) has published extensive guidance (PDF) on what is considered “effective anonymisation”.

Balancing data modifications with keeping data useful

When deciding which information to keep and which to remove, you must always consider the usability of the data. Removing key variables, applying pseudonyms, generalising or removing contextual information from textual files, and blurring image or video data could result in important details being missed or incorrect inferences being made.
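A simple count can help to judge how much modification is needed. The sketch below, which is one possible check and not a prescribed method, counts how many records share each combination of indirect identifiers and flags combinations that occur fewer times than a chosen threshold. The function name `rare_combinations`, the threshold of 2 and the sample records are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Tuple


def rare_combinations(records: List[dict], keys: List[str],
                      threshold: int = 2) -> Dict[Tuple, int]:
    """Return combinations of the given fields that occur fewer than threshold times."""
    counts = Counter(tuple(record[key] for key in keys) for record in records)
    return {combo: n for combo, n in counts.items() if n < threshold}


# Invented records that have already been generalised into age bands.
records = [
    {"age_band": "30-39", "occupation": "teacher"},
    {"age_band": "30-39", "occupation": "teacher"},
    {"age_band": "60-69", "occupation": "surgeon"},
]
flagged = rare_combinations(records, ["age_band", "occupation"])
# flagged == {("60-69", "surgeon"): 1}
```

Combinations that are flagged may need further generalisation or controlled access, while fields that produce no rare combinations can often be kept at their current level of detail, which preserves more of the data's usefulness.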

Anonymising research data is best planned early in the research process. It should also be considered alongside consent and appropriate data licensing in line with the information that is to be made available.

Directly identifiable personal data should never be disclosed unless a participant has consented to the disclosure, ideally in writing. Even with consent, always consider the necessity and benefits of sharing such data, ensuring that any disclosure respects both the privacy and the intent of the participant.

Interactive modules

De-identification and anonymisation of transcript data

De-identification and anonymisation of quantitative data

Find out more

Below are links to web pages containing best practice guidance on anonymising data:

Anonymising qualitative data

Anonymising quantitative data

Anonymisation step-by-step

Documenting data

Make data clear to understand and easy to use

A crucial part of ensuring that research data can be shared and reused by a wide range of researchers for a variety of purposes is taking care that those data are accessible, understandable and (re)usable.

Original researchers wishing to return to their data some time later, or new users wanting to use data, need sufficient contextual and explanatory information to make sense of those data.

Documentation deposited alongside data files should enable users, with no prior knowledge of the research project and data collected, to understand exactly how the research was carried out and what the data mean, in order to (re)use the data correctly in their respective projects and for their respective purposes.

This requires clear and detailed data description and annotation. Besides the information that is needed to reuse the data, data also need to be accompanied by information for citing and discovering the data.

To prepare data for secondary research, researchers should document data appropriately. They should also explain the procedures and fieldwork methods, the objectives and methodology of the research, and explicitly describe the meanings of variables and codes used. Additionally, they should describe any derivation, transformations, de-identification (pseudonymisation/anonymisation) or data cleaning carried out.
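Descriptions of variables, codes and processing steps can be kept in a structured, machine-readable form alongside the data. The sketch below shows one possible layout as a Python dictionary saved as JSON; the file name, variable name, labels and codes are invented for illustration and do not follow any particular metadata standard.

```python
import json

# Hypothetical codebook describing one variable of an invented survey file.
codebook = {
    "data_file": "example_survey.csv",
    "variables": {
        "q1_satisfaction": {
            "label": "Overall satisfaction with the service",
            "type": "integer",
            "codes": {
                "1": "Dissatisfied",
                "2": "Neutral",
                "3": "Satisfied",
                "-9": "Not answered",
            },
        }
    },
    "processing_notes": [
        "Names replaced with pseudonyms",
        "Ages generalised to ten-year bands",
    ],
}


def describe_value(variable: str, value: int) -> str:
    """Look up the documented meaning of a coded value."""
    entry = codebook["variables"][variable]
    meaning = entry["codes"].get(str(value), "undocumented code")
    return f"{entry['label']}: {meaning}"


# Serialise the codebook so it can be deposited alongside the data file.
codebook_json = json.dumps(codebook, indent=2)
```

For example, `describe_value("q1_satisfaction", -9)` returns `"Overall satisfaction with the service: Not answered"`, so a secondary user can interpret a missing-value code without contacting the original researchers.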

They should also ensure that data are held in an organised manner. Documentation is invaluable in enabling secondary users to contextualise data and conduct better, informed re-use of the material.

Any consent and confidentiality concerns that may inhibit archiving data should be resolved before the deposit is made. See our guidance on consent for data sharing.

Creating comprehensive data documentation is easiest when begun at the onset of a project and continued throughout the research.

Interactive module

Best practices for documenting data collections

Find out more

Below are links to web pages containing best practice guidance on documenting data:

Data level