Wednesday 1:20 p.m.–4:40 p.m.
Hands-on Data Analysis with Python
Sarah Guido
- Audience level: Intermediate
- Category: Python Libraries
Description
Python is quickly becoming the go-to language for data analysis. However, it can be difficult to figure out which tools are good to use. In this workshop, we’ll work through in-depth examples of tools for data wrangling, machine learning, and data visualization. I’ll show you how to work through a data analysis workflow, and how to deal with different kinds of data.
Abstract
Python is quickly becoming the go-to language for data analysis. However, there are so many tools out there that it can be difficult to figure out which ones are useful. In this workshop, I’ll give you an in-depth look at some of the best tools for data wrangling, machine learning, and data visualization. You’ll learn strategies for working with data, how to structure a data analysis workflow, and which tools are appropriate for handling different kinds of data. You’ll leave with a good understanding of different data analysis techniques in Python.
Using pandas, scikit-learn, and matplotlib, we’ll work through a data analysis workflow from start to finish, covering the following steps:
- Data preprocessing and data wrangling with pandas
- Using scikit-learn for machine learning
- Visualizing our results with matplotlib
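To give a sense of how these three steps fit together, here is a minimal sketch of such a workflow. It is not taken from the tutorial materials: the synthetic data, the column names (`feature_a`, `feature_b`, `label`), the median fill, and the choice of logistic regression are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical synthetic data: two numeric features and a binary label.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.normal(size=n),
})
df["label"] = (df["feature_a"] + df["feature_b"] > 0).astype(int)
# Blank out every 25th value to simulate messy input.
df.loc[df.index[::25], "feature_a"] = np.nan

# 1. Wrangling with pandas: fill missing values with the column median.
clean = df.fillna({"feature_a": df["feature_a"].median()})

# 2. Machine learning with scikit-learn: hold out a test set, then fit a classifier.
X = clean[["feature_a", "feature_b"]]
y = clean["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LogisticRegression().fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# 3. Visualization with matplotlib: color test points by predicted class.
fig, ax = plt.subplots()
ax.scatter(X_test["feature_a"], X_test["feature_b"], c=model.predict(X_test))
ax.set_xlabel("feature_a")
ax.set_ylabel("feature_b")
ax.set_title(f"Predicted classes (test accuracy {accuracy:.2f})")
fig.savefig("predictions.png")
```

In a notebook, the `matplotlib.use("Agg")` line and the `savefig` call can be dropped in favor of inline plotting. A real analysis would replace the synthetic frame with data loaded through something like `pd.read_csv`.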
Students should have an intermediate knowledge of Python, including the ability to write functions. Familiarity with IPython Notebook will also help, since the materials will be in that format. Please install all of the required packages before the tutorial: I'll spend only a few minutes on installation, though I'll present several options for participating. The packages are most easily installed through a distribution such as Anaconda or Enthought Canopy.
Student Handout
No handouts have been provided yet for this tutorial.