Data tidying with Python and Pandas
This workshop covers practical approaches to handling data in Python using the Pandas library. It is a recommended prerequisite for the Data Visualisation workshop. Effective data analysis or visualisation usually requires data that is clean and in a consistent format. We will cover the concepts of “tidy”, long-form, and wide-form data, along with hands-on approaches for manipulating data and fixing common problems. The workshop concentrates on tabular data, like that found in spreadsheets or databases.
By the end of the workshop, you will be able to:
- read tabular data into Python using Pandas, and manipulate it
- identify problems in datasets that will hinder analysis
- use Python to fix common problems
- understand and convert between different data layouts such as wide-form and “tidy” as appropriate for the problem being solved.
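As a rough illustration of the last point, the sketch below converts a small wide-form table to a long-form (“tidy”) layout and back again using Pandas. The table, its column names (`site`, `count`, and the year columns), and its values are invented for this example and are not taken from the workshop materials.

```python
import pandas as pd

# Hypothetical wide-form table: one row per site, one column per year.
wide = pd.DataFrame({
    "site": ["A", "B"],
    "2019": [10, 7],
    "2020": [12, 9],
})

# Wide -> long ("tidy"): one row per (site, year) observation.
tidy = wide.melt(id_vars="site", var_name="year", value_name="count")

# Long -> wide: the years become columns again.
wide_again = tidy.pivot(index="site", columns="year", values="count").reset_index()
wide_again.columns.name = None  # drop the leftover "year" label on the column index

print(tidy)
print(wide_again)
```

The long-form layout tends to suit grouping and plotting, while the wide-form layout is often easier to read by eye, which is why being able to move between the two is useful.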
This workshop is designed for participants with a basic knowledge of Python, but is also appropriate for attendees who do not know Python but have significant experience using another programming language.
Attendees are required to bring their own laptop computers.