Wednesday 9 a.m.–12:20 p.m.

Exploring Machine Learning with Scikit-learn

Jake Vanderplas, Olivier Grisel

Audience level:
Intermediate
Category:
Science

Description

This tutorial will offer an introduction to the core concepts of machine learning and show how they can be applied in Python using scikit-learn. We will use the scikit-learn API to introduce and explore the basic categories of machine learning problems, related topics such as feature selection and model validation, and the application of these tools to real-world data sets.

Abstract

Machine learning is the branch of computer science concerned with the development of algorithms that can learn from previously seen data in order to make predictions about future data. It has become an important aspect of work in a variety of applications: from optimization of web searches, to financial forecasts, to studies of the nature of the Universe.

This tutorial will provide a hands-on introduction to the central concepts of machine learning and the scikit-learn package. Beginning from the broad categories of supervised and unsupervised learning problems, we will dive into the fundamental areas of classification, regression, clustering, and dimensionality reduction. In each section, we will introduce aspects of the scikit-learn API and explore practical examples of some of the most popular and useful methods from the machine learning literature.
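As a rough illustration of the four areas named above, the sketch below runs one method from each on the iris data set bundled with scikit-learn. The choice of data set and of these particular estimators is an assumption made for illustration, not a statement of what the tutorial itself will cover.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, LogisticRegression

# Iris: 150 samples, 4 numeric features, 3 species labels.
# (Illustrative choice of data set, not taken from the tutorial.)
X, y = load_iris(return_X_y=True)

# Supervised learning, classification: predict the species label.
clf = LogisticRegression(max_iter=1000).fit(X, y)
class_predictions = clf.predict(X)

# Supervised learning, regression: predict the last feature
# (petal width) from the other three.
reg = LinearRegression().fit(X[:, :3], X[:, 3])
width_predictions = reg.predict(X[:, :3])

# Unsupervised learning, clustering: group samples without using y.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = km.fit_predict(X)

# Unsupervised learning, dimensionality reduction: project 4 features to 2.
X_2d = PCA(n_components=2).fit_transform(X)
```

The supervised estimators take both `X` and `y` in `fit`, while the unsupervised ones take only `X`.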

The strength of scikit-learn lies in its clean, uniform, and well-documented interface to efficient implementations of a large number of the most important machine learning algorithms. By the end of this tutorial, participants will have a basic practical background in machine learning and the use of scikit-learn, and will be well poised to apply these tools in many areas, whether for work, for research, for Kaggle-style competitions, or for their own pet projects.
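One way to see the uniform interface in practice is to swap several classifiers into the same validation loop without changing any other code. The following is a minimal sketch under assumed choices (the iris data set, three arbitrary classifiers, 5-fold cross-validation); it is not drawn from the tutorial materials.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Three unrelated algorithms, chosen only for illustration.
# Each exposes the same fit/predict methods.
estimators = {
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "support vector machine": SVC(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

# cross_val_score relies only on the shared estimator interface,
# so the loop body is identical for every model.
scores = {}
for name, estimator in estimators.items():
    fold_scores = cross_val_score(estimator, X, y, cv=5)
    scores[name] = fold_scores.mean()

for name, mean_accuracy in scores.items():
    print(f"{name}: mean accuracy {mean_accuracy:.3f}")
```

Because every estimator follows the same conventions, trying a different algorithm usually means changing a single line.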

Student Handout

No handouts have been provided yet for this tutorial.