Productive visit by team from South Africa's DataFirst

Article dated: 15 July 2016

The UK Data Service has been working in partnership with South Africa’s DataFirst as part of an International Centre Partnership Fund sponsored by the Economic and Social Research Council (ESRC) and South Africa's National Research Foundation (NRF).

Photo: UK Data Service and DataFirst, meeting in London

A focus of the collaborative project is to investigate the challenges of curating 'big' and complex data and providing social scientists with access to it. The theme of household energy data is used to create use cases for work on our Hadoop data environment, currently under construction. So far two knowledge exchange workshops have taken place: the first in London in December 2015 and the second in Cape Town in January 2016. Reports on both, with accompanying presentations, can be found on our blog.

This month we were delighted to welcome the DataFirst team for a week-long visit. Lynn Woolfrey, Manager of DataFirst, presented on the work of her Data Service, focusing on providing social science research data and services with limited resources and a small core staff. Lynn demonstrated their Data Portal, which uses an open source software platform, the National Data Archive, for rapidly publishing survey data and metadata. She also talked about their Secure Data Centre, as well as issues around citation and version control.

Finally, we heard about their current data rescue project, in which DataFirst has teamed up with other South African universities to convert rich, at-risk historical Apartheid Era records to research-ready formats, opening them up for new analyses.

Two days were spent at a hands-on workshop where DataFirst staff experimented with the Hadoop ecosystem, importing and visualising different types of data.

On the last day our guests and members of the Smart Household Energy project team attended a meeting hosted by our colleagues at the RCUK Centre for Energy Epidemiology, based at University College London, to hear from researchers about the tools and techniques they are using for their own energy projects. The projects covered areas such as energy performance certification, identifying vulnerable customers, tariff nudging, and household temperature monitoring. A selection of these are being taken forward as use cases for modelling data processing and analysis workflows and tools on the Hadoop system.

Louise Corti, Investigator on the project, commented that “this was a really productive week. Not only did we learn a great deal from our guests on the challenges of running efficiently the full spectrum of data archiving activities with 4-5 people (and efficient it is), but we hope that our visitors have a better sense of how we plan to scale up our data curation to accommodate new and novel forms of data, and enable grander scientific data investigation”.

Lynn Woolfrey added to Louise’s comment by saying “We built our Data Service on the exemplary work of founding archives such as the UK Data Service. The Smarter Energy data collaboration has strengthened links between our Services, and it has been rewarding to visit and see the work you do first-hand.”
