Synthetic data code demo

15 June 2021
Online, 13.00 - 14.00 BST

In this free training series, we cover the advantages and disadvantages of synthetic data. We explore the variety of methods available to generate synthetic data. Finally, we discuss the nuances of how synthetic data itself is defined.

In this code demo, we showcase some data synthesis tooling. We work through a manual process of creating synthetic data in Python. We explore Mockaroo, a web-based data generation tool. We then replicate the Mockaroo dataset using Faker, a data generation library for Python. Finally, we simulate the rolling of dice and children’s shoe sizes.
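To give a flavour of the simulation part of the demo, the sketch below shows one way dice rolls and shoe sizes could be simulated using only the Python standard library. It is not the code used in the webinar. The function names, the numeric size scale, and the mean, spread and size limits are all hypothetical values chosen for illustration.

```python
import random
from collections import Counter


def roll_dice(n_rolls, n_dice=2, sides=6, seed=None):
    """Return the total of n_dice fair dice for each of n_rolls rolls."""
    rng = random.Random(seed)  # seeded generator so runs can be reproduced
    return [
        sum(rng.randint(1, sides) for _ in range(n_dice))
        for _ in range(n_rolls)
    ]


def simulate_shoe_sizes(n_children, mean=30.0, sd=2.0,
                        smallest=24.0, largest=38.0, seed=None):
    """Draw shoe sizes from a normal distribution on a hypothetical scale.

    The mean, sd and size limits are illustrative assumptions, not
    measured values. Sizes are rounded to the nearest half size and
    clipped to the [smallest, largest] range.
    """
    rng = random.Random(seed)
    sizes = []
    for _ in range(n_children):
        raw = rng.gauss(mean, sd)
        half_size = round(raw * 2) / 2  # snap to the nearest 0.5
        sizes.append(max(smallest, min(largest, half_size)))
    return sizes


if __name__ == "__main__":
    rolls = roll_dice(1000, seed=42)
    # Tally how often each total occurred, from lowest to highest
    for total, count in sorted(Counter(rolls).items()):
        print(total, count)

    print(simulate_shoe_sizes(10, seed=42))
```

Passing a seed makes the output repeatable, which helps when a synthetic dataset needs to be regenerated exactly. Leaving the seed as None gives a different sample on each run.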

Presenter: Joe Allen, UK Data Service

In the first webinar, we introduce synthetic data and explore some reasons why we want to use it.

In the second webinar, we explore two categories of data synthesis methods: Masking and Redaction.

In the third webinar, we explore three categories of data synthesis methods: Coarsening, Mimicking and Simulation.

Recordings of UK Data Service webinars are made available on our YouTube channel and, together with the slides, on our past events pages soon after the webinar has taken place.