Bridging the data divide: UK Data Service experts take part in IASSIST 2015
Article dated: 30 June 2015
Representatives from the UK Data Service travelled to Minneapolis in June to take part in this year's IASSIST conference.
IASSIST is an international organisation of professionals working in and with information technology and data services to support research and teaching in the social sciences. The conference focused on Bridging the data divide: data in the international context. Our experts gave papers and presented posters on running a data service, including:
Public use of public data - finding space in an open data world
As part of a session on 'Enabling public use of public data', the UK Data Service's Margherita Ceraolo explained how the Service is finding space in an open data world. Here she emphasised how, as part of our commitment to increasing access to data, we are making more of the data we hold open to the public where possible – including a range of UK census data, qualitative data, survey datasets and international macrodata.
Using international macrodata as an example, Margherita illustrated how we have developed the open delivery platform UKDS.Stat to create an invaluable resource for anyone interested in international socio-economic data. She also outlined how the Service is providing APIs and visualisation, integrating social media, and acting as a broker to highlight the impact of our data.
Training users to work with research data
The UK Data Service's Sarah King-Hele, during a dedicated session on Training users, explained how the Service is teaching users how to work with research data. Sarah outlined the variety of methods we use to do this, including our rolling series of webinars, face-to-face presentations, practical workshops and our suite of online resources (access our entire suite of specialist training courses and webinars from our events page). Sarah highlighted how these different methods enable us to meet the needs of different users, and how our training programme can be developed to cover new kinds of data (such as open data and big data) and target an even wider range of users.
The impact of data citation - making citation connections
In a session later the same day, the focus shifted to the Impact of data citation, in which Melanie Wright presented on how the UK Data Service is helping users make effective data citation connections – from study level, to subsets of data, to paragraphs of text. We produce citations for every data catalogue record in Discover that can be copied and pasted into any reference list, all of which include a persistent identifier that gives a unique access code for the data. Melanie explained how our tools also dynamically create a citation from paragraphs in qualitative text, whereby users can select a passage and the Service then mints a unique identifier that can be used to cite, precisely, that piece of text.
Improving the user experience through better visibility
The Service has embraced a number of initiatives to engage with users to learn about and improve their experience of using the service. In the session on Improving the user experience, Hersh Mann, Support Manager for User Support and Training, explained how collaboration with data producers can greatly enhance data visibility and use. He emphasised how data services can do much more than simply 'get, store and provide' the data. Hersh used the Young Lives birth cohort study as an example of how a dedicated plan to improve a study's visibility resulted in a spike in usage. Indeed, the UK Data Service continues to build on its relationship with these data producers and the Young Lives data portfolio now includes a teaching dataset, as well as new waves of data.
Data user insights: Listening to the user voice
The UK Data Service is actively listening to the user voice in order to deliver an improved and targeted service. As part of a session on Data user insights, Sarah King-Hele explained how it is essential that we provide our users with the tailored support that they need. Her presentation outlined how we consult with our users through our annual stakeholder consultation, a pop-up survey on the Service website, ad-hoc consultations with specific user groups, regular user-testing of the website, monitoring of Google Analytics, user conferences, and monitoring of feedback and attendance figures from training events. Sarah also introduced some of our exciting new plans to reach new audiences, including expanding our use of data visualisation and a new dissertation zone.
ReShare: re-shaping the landscape of research data repositories
Contributing to the session on Data repository models and infrastructure, Louise Bolger, Collections Development Officer, discussed how our new ReShare interface is re-shaping the landscape of research data repositories. She explained how depositors can now archive and share research data through this interface, by self-depositing their data and preparing data files themselves. Louise highlighted that ReShare minimises the 'burden' on depositors of completing metadata records, by optimising linkages with other systems to maximise standardisation. She also noted that any data deposited this way are easily discoverable for users through the UK Data Service's Discover interface.
Leveraging metadata - extending access through public APIs
In the session on Leveraging metadata, John Shepherdson, Director of Technical Services, discussed how the UK Data Service is boosting access to its data collections and encouraging the development of exciting ways to access and view data through the use of public APIs. He illustrated how these facilitate a self-service approach for our data producers and researchers alike, whilst also enabling third-party developers to write applications that consume our APIs to present new ways of accessing some of the data collections that we hold.
Another colleague in our Technical Services team, Jonathon Sexton, presented a poster outlining how the Service has implemented Solr Cloud – through which we were able to provide a simpler, more maintainable and stable system for fast full-text and faceted searches. He highlighted how the Technical Services team sought an alternative route of Solr Replication which provided a low-maintenance solution that met the needs of the Service in terms of its reliability, scalability and easy configuration – keeping things simple.
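For readers unfamiliar with faceted search, the sketch below shows how a full-text query with facet counts can be expressed as a request to Solr's standard select handler. It is an illustration only, not the Service's implementation: the host, the core name "catalogue" and the facet fields "subject" and "country" are hypothetical, while q, rows, wt, facet and facet.field are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_faceted_query(base_url, text, facet_fields, rows=10):
    """Build a Solr /select URL for a full-text query with facet counts.

    base_url and facet_fields are placeholders for illustration; a real
    deployment would use its own core name and schema fields.
    """
    params = [
        ("q", text),        # full-text query string
        ("rows", rows),     # number of documents to return
        ("wt", "json"),     # ask for a JSON response
        ("facet", "true"),  # switch faceting on
    ]
    # One facet.field parameter per field we want counts for.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core and field names, used only to show the URL shape.
url = build_faceted_query(
    "http://localhost:8983/solr/catalogue",
    "labour force",
    ["subject", "country"],
)
print(url)
```

The response to such a request would contain the matching records plus, for each facet field, a count of records per value, which is what lets a search page offer filters alongside the results.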
The challenges of data linking
Meanwhile, in a dedicated session on the Challenges of data linking, our colleagues at the Administrative Data Service presented the challenges of reducing the public's data trust deficit and improving the efficiency and accuracy of administrative data linkage. Here, Ilse Verwulgen presented on behalf of Trazar Astely-Reid, Chris Coates and Judith Knight. She discussed how securing an understanding of public attitudes to the use and linking of administrative data has been the cornerstone of setting up the new Administrative Data Research Network (ADRN), a UK-wide partnership between universities, government bodies, national statistics authorities and the wider research community. She emphasised how the ADRN's role is both to secure the public's trust and to provide a service to researchers that is secure, lawful and ethical, run by experts in the field who ensure privacy is protected.
As part of the same session, Trazar's colleague Kakia Chatsiou outlined one of the biggest challenges of enabling access to linked de-identified administrative data: the accurate and timely pre-processing and linkage of such large datasets. Only some of the records are successfully linked using automated methods (i.e. using deterministic or probabilistic methods), while most are linked by indexing professionals. Kakia's paper provided an overview of the ADRN's current methods for preparing and linking administrative records and of how other disciplines, such as Natural Language Processing, have dealt with similar challenges when working towards similar goals.
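To make the distinction between deterministic and probabilistic linkage concrete, here is a minimal sketch. It does not represent the ADRN's actual methods: the field names, the example records and the m and u probabilities are invented for illustration, and the scoring follows the general Fellegi-Sunter idea of summing log-likelihood weights per field.

```python
import math

# Hypothetical per-field probabilities (assumed values, not real estimates):
# m = P(field agrees | the two records are a true match)
# u = P(field agrees | the two records are not a match)
FIELD_PROBS = {
    "surname": (0.95, 0.01),
    "birth_year": (0.98, 0.05),
    "postcode": (0.90, 0.02),
}


def deterministic_match(a, b, keys=("surname", "birth_year", "postcode")):
    """Deterministic linkage: link only if every key field agrees exactly."""
    return all(a[k] == b[k] for k in keys)


def probabilistic_score(a, b, probs=FIELD_PROBS):
    """Probabilistic linkage: sum a log2 weight for each field.

    Agreement adds log2(m / u); disagreement adds log2((1 - m) / (1 - u)),
    which is negative. A higher total means a more likely match.
    """
    score = 0.0
    for field, (m, u) in probs.items():
        if a[field] == b[field]:
            score += math.log2(m / u)
        else:
            score += math.log2((1 - m) / (1 - u))
    return score


# Two invented records that differ only by one character in the postcode.
rec_a = {"surname": "smith", "birth_year": 1980, "postcode": "CO4 3SQ"}
rec_b = {"surname": "smith", "birth_year": 1980, "postcode": "CO4 3SX"}

exact = deterministic_match(rec_a, rec_b)
score = probabilistic_score(rec_a, rec_b)
print(exact, round(score, 2))
```

In this example the deterministic rule rejects the pair because of the postcode difference, whereas the probabilistic score stays positive because two of the three fields agree. In practice a threshold on the score decides which pairs are linked automatically and which are passed on for clerical review.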