Answers to some of our most frequently asked questions are provided here.
If you are unable to find an answer to your question, please get in touch by selecting one of the short web forms.
Questions about how to manage data
Data should be prepared in such a way as to enable them to be used by other researchers, and to enable the data archive to create accurate catalogue records. Researchers are encouraged to document data appropriately (see our guidance on data documentation and metadata), to include research procedures and fieldwork methods and to ensure that data are held in an organised manner. Documentation is invaluable in enabling secondary users to contextualise data and conduct better informed re-use of the material. Any consent and confidentiality concerns which may inhibit archiving data should be resolved. See our guidance on consent and confidentiality.
There should be no additional costs for archiving data, other than researchers' time to prepare data and documentation for deposit. This time should be costed into the application. As a guide, it is recommended that two to three weeks are costed into a typical two-year research grant application to prepare and collate materials for deposit. However, owing to the disparate nature of research and data creation, we cannot provide advice on exact costs likely to be incurred in data preparation. We have created a useful costing tool as a starting point.
Various activities typically associated with preparing data are outlined below, for which you can work out appropriate costs in terms of people's time and equipment/software needed. Much data preparation can be carried out as part of the research process during data entry and transcription, therefore significantly reducing the cost of preparing data for archiving.
For quantitative/numerical data, allow time to add appropriate variable, value and code labelling to data, to create SPSS setup files (if relevant), and to supply the syntax or logical statements for derived variables etc. as part of data-level documentation.
For qualitative data, provision should be made for the full transcription of interviews, focus group discussions and so on, where the budget will allow. Transcription costs should be included in the overall research budget. If full transcription is really not feasible, interviews or focus groups should be fully summarised. For transcripts in non-English languages, English summaries should be prepared for archiving, so costs for translation might be appropriate. A data listing giving details of each interview should be created.
Consent and data confidentiality also impact on costs for archiving. Consent for data archiving should be arranged during research when consent for participation and data use is obtained, or arranged afterwards. Allow time to anonymise data, where required. Ideally anonymisation should be undertaken during the project but will need to be checked before archiving the data. The time involved should not be underestimated as anonymisation appropriate for archiving may require the use of pseudonyms, or the preparation of an anonymisation key. The anonymised document should be meaningful and usable by other researchers. See anonymising research data guidelines and techniques for detailed guidance.
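As a rough illustration of the pseudonym and anonymisation-key approach mentioned above, the following Python sketch replaces listed names in a piece of text with pseudonyms and returns the key separately. The function name `pseudonymise`, the "Participant N" label format and the sample sentence are assumptions made for this example only; this is not an Archive tool or a prescribed procedure.

```python
import re


def pseudonymise(text, names):
    """Replace each listed name with a pseudonym and return (new_text, key).

    The key maps pseudonym -> original name and should be stored
    separately from the anonymised text.
    """
    key = {}
    for index, name in enumerate(names, start=1):
        pseudonym = f"Participant {index}"  # hypothetical label format
        key[pseudonym] = name
        # Whole-word, case-sensitive match so "Ann" does not alter "Anna".
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, pseudonym, text)
    return text, key


# Invented sample text, used only to show the behaviour.
sample = "Anna said that Ann and Anna met in Leeds."
anonymised, key = pseudonymise(sample, ["Anna", "Ann"])
print(anonymised)
print(key)
```

The sketch only handles names that are listed explicitly. Place names and other indirect identifiers are left untouched, so the result would still need to be checked by hand before archiving.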
The cost of digitisation of non-digital sources, e.g. if the research involves work on a paper-based collection, can usually be included in the overall research budget. Additional suggestions and requirements on preparing data and documentation for archiving can be found in the 'prepare and manage data' pages.
Data can only be understood and used to their full potential by other researchers if they are adequately documented (see document your data). Any potential re-user should understand exactly how the research was carried out and what the data mean. The data creator should provide sufficient information on the objectives and methodology of the research; explain the data collection methods used; and explicitly describe the meanings of variables and codes used and any derivation, transformations or data cleaning carried out.
It is recommended that transcriptions of interviews are made. Full transcriptions significantly extend the potential for analysis and re-use of a research collection, both by the original researchers and by secondary users. Transcription should be seen as a step within the analytical process of research, rather than as a mechanical conversion of data. If interviews are not transcribed, then recorded interviews could be archived alongside summaries, but it may be difficult or time consuming to effectively anonymise the audio files. See guidance on transcription.
Audio-visually recorded interviews are usually transcribed manually (see guidance on transcription). A standard transcription structure is recommended if transcripts are to be archived or if Computer Assisted Qualitative Data Analysis (CAQDAS) software is to be used to analyse the data. Transcriptions should possess a unique identifier, adopt a uniform layout, make use of speaker tags clearly indicating the question/answer sequence, carry line breaks, be page numbered and carry a document header giving brief details of the interview: date, place, interviewer name, interviewee details, etc.
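The layout elements listed above can be sketched in a few lines of Python. This is only one possible arrangement: the function name `format_transcript`, the field labels, the speaker tags and all sample values are hypothetical, and page numbering is left to whatever word processor or export tool is used.

```python
def format_transcript(identifier, header, turns):
    """Build a plain-text transcript with a header and speaker-tagged turns.

    header: dict of brief interview details (date, place, and so on).
    turns:  list of (speaker_tag, utterance) pairs in spoken order.
    """
    lines = [f"Transcript ID: {identifier}"]
    for field, value in header.items():
        lines.append(f"{field}: {value}")
    lines.append("")  # blank line separates the header from the dialogue
    for speaker, utterance in turns:
        lines.append(f"{speaker}: {utterance}")
        lines.append("")  # line break between turns
    return "\n".join(lines).rstrip() + "\n"


# Invented sample values, used only to show the resulting layout.
transcript = format_transcript(
    "INT001",
    {"Date": "2020-01-15", "Place": "Colchester", "Interviewer": "I1",
     "Interviewee": "Participant 1, teacher"},
    [("I1", "How long have you worked here?"),
     ("P1", "About ten years.")],
)
print(transcript)
```

Generating every transcript through one routine of this kind is a simple way to keep the layout uniform across a collection.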
If, after reviewing the information on preparing and documenting data for sharing and archiving, any question remains unresolved, get in touch with the Collections Development and Data Publishing team through the UK Data Service.
The General Data Protection Regulation (GDPR) defines ‘personal data’ as ‘any information relating to an identified or identifiable natural person’ (‘data subject’).
An identifiable natural person is defined as one ‘who can be identified, directly or indirectly, by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.’ See further details on Data protection and sharing research data.
Personal data may include photographs, email messages and data recorded by closed-circuit television (CCTV), if a person can be identified from them. It also includes data identified by reference numbers, where a separate list can be used to match the reference numbers to named individuals.
This does not mean, however, that all information provided during research by a person (e.g. during interviews) is personal data. If a person cannot be identified directly or indirectly from the information, then the information is not defined as personal data.
Special categories of personal data are defined in the General Data Protection Regulation (GDPR) as data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs or trade union membership, together with genetic data, biometric data, data concerning health and data concerning a natural person's sex life or sexual orientation.
The Statistics and Registration Service Act applies only to data designated as Official Statistics. Data access is an express statutory function of the Statistics Authority and the Act defines the legal gateways under which 'personal information' can be disclosed. The Act permits disclosure of personal information to an Approved Researcher, i.e. an individual to whom the Statistics Authority has granted access, for the purposes of statistical research, to personal information held by it. The criteria for access require the Statistics Authority to consider whether the individual is a fit and proper person, and whether the purpose for which access is requested is valid. The Act also states that disclosure of personal information outside of the legal gateways is a criminal offence.
The Act does not apply to individual researchers who are managing confidential research data that are not designated as Official Statistics. Further information on the Act is available at legislation relevant to data sharing.
If research data were given in confidence to an archive and the release of such data would breach this confidence, then an FOI request to disclose such research data can be refused.
Such research data could be data that under the licence agreement are only available to researchers, or may even have been placed under more restrictive access conditions due to the data being confidential. In addition, if a 'record' (e.g. research data) contains personal data (under the definition of the DPA or the GDPR) or personal information (under the definition of the Statistics and Registration Service Act) that would allow a person to be identified from the data, then such data cannot be released under the FOI Act.
In general, an FOI request entitles access to the content of information held, not necessarily to an exact original document (e.g. an interview transcript or dataset).
There is no overarching recommendation on length of time data should be kept, but some disciplines have firmer requirements than others. For example, in the UK:
However, for most social science data there is no recommended retention period for data, unless you have gained third party data under licence which actively requires you to destroy it after use.
Retention needs to be balanced against data protection requirements, which state that personal records (names, addresses and so on) should not be held for any longer than necessary. This does not apply to most research data, but does apply to administrative data associated with a research project, unless you have sought explicit permission to keep it for a particular defined purpose.
Anonymised research data deposited with the Archive will be retained indefinitely and it is important that this distinction is made on the Information Sheet and Consent Form to inform participants. See our guidance on consent for data sharing.
There are a number of options, depending on the type of research being conducted. Consent should ideally be sought whilst conducting the research, e.g. at the time of an interview or survey. However, at times it may be more logical to obtain consent for data sharing at a later stage, when the participants have a better understanding of the data in question. It is also possible to return to participants to seek consent for data uses (such as data sharing) possibly not discussed at the time of fieldwork. See also when to seek consent for information on one-off or process consent.
Best practice would be to gain informed consent from participants for each part of the research project data to be archived, so they can choose if they wish for the transcripts, audio files etc. to be deposited. It is important to ensure that participants are informed of the difficulties around anonymisation of audio and visual material, so they can choose whether the non-anonymised version can be deposited with an Archive. See our guidance on consent for using or sharing audio-visual material.
At a minimum, the law requires notification of the recording through clear signage. For example, a researcher video recording interaction and conversation in a shopping centre could not brief everyone in that location face to face. It would be more appropriate to have information sheets/signs about the project and the recording displayed in the location.
Based on a legal ruling (known as ‘the Gillick Principle’), young people aged 16 years and above can give their own consent. For younger children, a judgement must be made about their ability to understand what is being asked of them. They should have clear and intelligible information about the project, suited to their level of understanding. They should be asked for their individual, voluntary consent, in addition to that of a parent/guardian and/or head teacher. See detailed information on gaining consent in research with children.
Explain in a way that is appropriate to the cultural environment that the information they give you may be used by other like-minded researchers. If you have made certain agreements, e.g. that data will not be seen by government officials or the press, make it clear these promises will still apply to archived data. Please see our guidance on informing participants about an archive.
The UK Data Archive assesses data that are offered for archiving for potential confidential, sensitive or personal data. We also seek information on whether or not researchers have obtained consent for data sharing. Where we have concerns, we discuss with depositors options such as further anonymisation of data or renegotiating consent. Where this is not possible, data may be rejected from the collection. In this way, we ensure the ethical re-use of archived data. Note, however, that the Archive is not an ethics review committee: it focuses only on the ethics of data sharing. In addition, we advise researchers on issues of informed consent, anonymisation and research ethics and encourage researchers to adhere to appropriate ethical guidelines when collecting data.
Yes. Because of the nature of the sample group, special considerations are needed when gaining consent for data sharing, just as they are when obtaining consent for research participation from these populations; see special cases of consent.
No, the use of written consent forms in research is not mandatory, though consent should be documented. Obtaining informed consent for people's participation in research and for the use of the data gathered for various research purposes is an ethical requirement for most research and is typically required by Research Ethics Committees. Whether such consent is obtained verbally or in writing, using a brief informative statement or a detailed consent form, depends on the nature of the research, the kind of data gathered and how the data will be used.
Although obtaining consent in writing is recommended where possible, as it reduces the uncertainty over what was agreed between researcher and participant, it may be too formal for some research with people. It is the responsibility of the researcher to address this problem.
If data are collected verbally through audio-recordings, verbal consent agreements can be audio-recorded together with the data.
If personal data, sensitive data or confidential data are gathered during the research, the use of written consent forms is recommended, see Data protection and sharing research data.
For personal and/or sensitive data to be processed, archived and disseminated by the Archive, explicit consent is needed from each participant. Ideally this should be in writing.
Further information is available on the options of written or oral consent, on the use of consent forms and on consent in surveys under the consent pages.
The UK Data Archive Collections Development and Data Publishing team can comment on draft consent forms. Detailed examples for various types of research that take into account data sharing are available, see consent forms. Information specific to data archiving, that explains the procedures, benefits and risks of archiving and sharing data, is available for researchers, see why share data and informing research participants. We also have a model consent form which can be adapted for your own research project.
If consent forms contain personal information on the participants that could result in the disclosure of the participant's identity from the data, then they should be stored separately from the research data. The length of time that consent forms should be stored will be informed by institutional or research ethics committee requirements. A blank sample consent form should be archived at the UK Data Archive alongside research data as part of the documentation providing background information to the data.
No, since consent forms contain personal data, they are not archived at the Archive alongside research data. It is the researcher's or the research institution's responsibility to store consent forms safely and to decide how long they should be kept. A blank consent form and information sheet should be archived at the Archive alongside research data as documentation providing background information on data.
Restrictions on the use of archived data obtained by users from the UK Data Archive are outlined in the End User Licence (EUL). All users agree to this when registering prior to downloading data. In particular there is a fundamental restriction concerning the confidentiality of data. Users should not attempt to use the data to deliberately compromise the confidentiality of individuals, households or organisations and are required to abide by the Data Protection Act and the General Data Protection Regulation. The EUL also covers requirements for citation of publications and safeguarding of data. The University of Essex may refer any breach of the EUL for legal action under relevant legislation.
In addition, data re-users have the same ethical and legal obligations as primary data users and researchers in general to not disclose confidential or identifiable information from research data.
Exceptions to the duty of confidentiality occur where there is a legal compulsion, for example, the information may be subpoenaed by relevant police investigations or court proceedings, or where there is a disclosure of the information made 'in the public interest', as defined by the courts. There are no mandatory reporting laws in the UK, but guidance issued by professional bodies and local safeguarding children boards emphasises the need to make a referral where there is a reasonable belief that a child is at risk of significant harm. There are thus ethical obligations on researchers working with children to make provision for the required actions to be taken in cases of disclosure of e.g. child abuse. Under the Children Act 1989 (England and Wales), the Children (Scotland) Act 1995 and the Children (Northern Ireland) Order 1995, the local authority has a duty to make enquiries about any allegation of abuse, that is, where a child is suffering, or is likely to suffer, significant harm. There are also instances where legislation requires reporting, such as under the Terrorism Act when a member of the public is informed of a planned terrorist attack. Additionally, some researchers are members of professional groups such as teachers and social workers who have a legal duty to report suspected child abuse.
In an increasingly surveyed society what access do government departments have to data held at the UK Data Archive and to personal data of participants, e.g. through researchers working for government departments?
Researchers working in or for government departments access data held at the UK Data Archive under the same restrictions as HE researchers. All researchers agree to an End User Licence (EUL) before data can be accessed. This EUL poses restrictions on how data can be used. All researchers wishing to access data held under a special licence must apply for access through the Approved Researcher route.
It is important to distinguish between personal data collected in research, and research data in general. Personal data should not be disclosed unless consent has been given for their disclosure. A Research Ethics Committee may indeed request that personal data collected during research, i.e. data that can identify participants, are destroyed after a certain time period to avoid possible disclosure (for example if data were left unattended on an old PC). Identifiable information could also be excluded from data sharing. A Research Ethics Committee should not, however, ask you to destroy research data in general.
If research data contain sensitive or confidential information, then the sharing of such data should be considered carefully, but should not be dismissed as being impossible. If researchers need advice on how to address the sharing of research data as part of their ethical review, or if there exist conflicts between the need to archive data and a Research Ethics Committee's guidelines on data management, they can consult the information on research ethics review or they can contact the Collections Development and Data Publishing team.
A researcher does have a duty of confidentiality towards informants with regard to information obtained from them. An exception to this duty of confidentiality is when the informants have consented to the information being used in specific ways and for agreed purposes. For the purposes of sharing or archiving data, researchers should make clear to informants that information will be shared with other academic researchers under strict terms and conditions, and should indicate how data may be anonymised where necessary.
It is important to demonstrate the agreement on confidentiality and data sharing between researchers and participants by obtaining consent from informants for the use of the information obtained for the purposes of research, publication, and data sharing. Ideally consent is obtained in writing.
Confidentiality of data therefore does not prohibit the archiving of data, as long as informed consent is obtained from informants to archive and share data, or where data are anonymised. Detailed information on this topic can be found on the pages on legal and ethical issues.
All users of data archived at the UK Data Archive are registered users - data are therefore not in the public domain - and users sign an End User Licence (EUL) that legally binds them to maintaining appropriate confidentiality of data.
Not all research collects data that are confidential or even sensitive. Even if yours does, this does not automatically prohibit data sharing. It is common practice for researchers to obtain informed consent to use data for their research and publication purposes. In the same way, consent can be obtained for archiving purposes to allow secondary use of the data for research. Where necessary, data can be anonymised, or access and usage can be restricted, to safeguard sensitive information. Detailed information on this topic can be found in the section on legal and ethical issues.
Data archives value the data deposited with them and take their duty very seriously to make sure the materials are used only in ethical and appropriate ways. Detailed information for participants, explaining what researchers and data archives jointly do to protect the confidentiality of interview data, whilst enabling data archiving and sharing for research purposes, is available at informing research participants.
In the case of quantitative data that are adequately anonymised, there is strictly speaking no need to obtain separate consent for archiving in order to enable their use by the wider research community (although it is ethically recommended to do so).
If substantial descriptive (qualitative) information obtained from informants is to be archived and informants were not asked for their consent to archive this information, they can still be re-contacted to obtain their consent. However, it is possible to share qualitative material that possesses no disclosure risk. If consent forms were presented and informants chose not to give consent for the archiving and re-use of qualitative data, then the data cannot be archived.
For especially confidential research data, additional access restrictions may be imposed beyond the standard licensed access. Data access authorisation may be required from the data owner prior to release of the data; or confidential data may be placed under embargo for a given period of time. This is decided on a case-by-case basis in dialogue between ourselves and the data owner.
Personal data or sensitive data may not be suitable for sharing with other researchers, depending on the informed consent that has been obtained from participants. Also data for which partial copyright lies with parties other than the researcher cannot be shared unless permission for data sharing has been given by all copyright holders. The Archive asks for specific information on such circumstances when data are being offered for archiving so data can be assessed by us to ensure that they can be shared with other researchers in an ethical and legal way.
No, data archived at the UK Data Archive remain the property of the original data creator(s). The Archive preserves, stores and disseminates the data for you, but does not own the data or hold any rights in the collection, unless added-value work such as transcription has been undertaken as part of processing in-house.
Copyright, an Intellectual Property Right reflecting the output of human intellect, applies to creative and artistic original work including written work, spoken word, photographs, databases, research data, etc. Copyright is automatically assigned and does not need to be applied for.
Usually copyright is retained by the author of the original work; this could be an individual, organisation or institution. If a piece of work is completed as part of employment, the employer will retain copyright of the work. Anyone who is commissioned to create a piece of work on behalf of someone else will retain copyright of that work. See detailed information on copyright.
When data have been created from a variety of sources or if the research has been funded by a number of organisations, there is shared copyright for all involved parties. In these cases permission to archive data must be sought from all interested parties and a covering letter confirming agreement should accompany the materials when deposited. We have a useful set of scenarios exploring various copyright considerations.
Where the words are recorded by the speaker, they would hold the copyright. Transcription of the words on paper or computer is protected by copyright and is owned by the person making the transcription. If the transcription is a substantial reproduction of the words recorded, the speaker will own copyright in the words and a separate copyright will apply to the transcription. This is of particular relevance to the recording of in-depth interviews. This also applies to a recording on tape or video. The person making the recording will own the copyright in the recording and the interviewee will own the copyright in the recorded words.
Copyright can only be transferred in writing and signed by the person making the transfer. This document is called an assignment. If researchers wish to publish large extracts from an interview, it is advisable to obtain a transfer of copyright from interviewees.
Yes, but not sole copyright to the new material. The creator (author) of the existing data used for the research will still retain copyright in that material. For the purpose of data archiving, permission is needed from the person/organisation holding copyright of existing data to archive the new data.
Access conditions for the dissemination of all materials deposited at the UK Data Archive are agreed between the Archive and the data depositor at the time of deposit, using the licence agreement. This agreement is a contractually binding legal document. Similarly, when materials are requested from the Archive all users agree to an End User Licence (EUL) whereby the user undertakes to abide by all conditions stipulated in the agreement. This must be completed before any materials are supplied to any user. It is the combination of these contractual agreements that ensures copyright infringement does not occur.
Yes. You will need to formulate an agreement with the person commissioned to create the data, stating that copyright is to be assigned to you.
Database rights were introduced in 1996 (Directive 96/9/EC on the legal protection of databases and Copyright and Rights in Databases Regulations 1997) exclusively to protect databases. If the Directive applies, the owner can prevent unauthorised extraction or reutilisation of all or a substantial part of the contents. The rights last for 15 years and can be renewed. This may pose problems for those wishing to reuse data for research purposes, but the 'fair dealing' exception (which is restricted to extraction) by a 'lawful' user allows illustration for non-commercial teaching or research. The term 'substantial' is also not clearly defined. If in doubt about sharing your database, contact the Collections Development and Data Publishing team in the first instance.
Some journals require authors to submit data alongside a publication so that the published results can be replicated by others. Note that data obtained from the UK Data Archive, including subsets and derived data, cannot be submitted to journals alongside publications, as this would be a breach of the End User Licence (EUL) that users agree to when they register. However, in most cases it is sufficient for the author of the publication to supply information to the journal about the data and their location, using a proper citation.
In addition, for derived data there are a number of options available including:
The Archive has a preservation strategy which ideally archives data in a non-proprietary open format, so they are software and hardware independent. This enables wider use and easier access, and supports long-term preservation of archived data. If your data cannot be converted to a standard non-proprietary format, we cannot guarantee their long-term accessibility. An example might be NVivo or ATLAS.ti files, which are held in a proprietary fixed format that is not fully exportable. These project-based files would be acceptable as long as you also kept the raw data files in MS Word, RTF or plain text formats. See file formats for examples of acceptable data formats.
We judge each data offer on a case-by-case basis. Whilst it is preferable for research purposes that recorded interviews/discussions are transcribed (as it makes re-use of such data much easier), at times audio-visual materials are archived too. Paper-based artefacts, such as photos, postcards, family trees, could be digitised. We can discuss formats with you if you are unclear. Contact the Collections Development and Data Publishing team.