We all know that the Internet is awash with data, not all of which is directly available on a web page.
This free webinar, organised by the UK Data Service, will illustrate techniques for automating the download of files from the Internet, show how to use APIs to download data and how you might store it, and finally, for data that sits on web pages, show how you can systematically identify it, scrape it and store it in a dataset.
About the webinar series: getting data, storing data, manipulating data
All projects need data. You can generate it yourself via surveys or you can get some from the UK Data Service. If you didn’t generate it yourself, the chances are it is not quite what you wanted, but you can adapt it to your needs. You can adapt the data by a variety of means: cleaning the data, extracting parts of the data (and ignoring the rest) and, trickiest of all, joining data from different sources.
This series of three webinars will cover ways of dealing with these data issues using both familiar software tools such as Excel and others that you may be aware of but may not have used directly.
In the first webinar we will look at SQL and databases: useful for storing large datasets (hundreds of GB or even more) and for retrieving only the items you want with simple SQL queries, possibly across more than one dataset.
In the second webinar we will look at various means of getting data from the Internet: this might range from simple copy and paste through to systematic data downloads from datasets or scraping specific items from hundreds of similar web pages.
The third webinar will look at the new functionality available in the later versions of Excel: Power Pivot, which gets around the 1M rows problem and makes dataset joins much easier, and the latest dynamic array functions, which can simplify many common tasks.