Wednesday 9 a.m.–12:20 p.m.

Faster Python Programs through Optimization

Mike Müller

Audience level:
Intermediate
Category:
High Performance Computing

Description

Although Python programs may be slow for certain types of tasks, there are many different ways to improve performance. This tutorial will introduce optimization strategies and demonstrate techniques to implement them. A further objective is to enable participants to decide which solution is likely to work best for a given performance problem.

Abstract

Objective

This tutorial provides an overview of techniques to improve the performance of Python programs. The focus is on concepts such as profiling and the differences between data structures and algorithms, as well as a selection of tools and libraries that help to speed up Python.
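As a rough illustration of the profiling concept mentioned above, the following sketch uses the standard-library cProfile and pstats modules. The function names slow_sum and profile_call are hypothetical and not part of the tutorial material, and the code is written for Python 3 even though the tutorial itself targets Python 2.6 or 2.7.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical workload: sum squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


result, report = profile_call(slow_sum, 100000)
print(report)
```

The report lists call counts and times per function, which is usually enough to see where a program spends most of its time before deciding what to optimize.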

Intended Audience

Python programmers who would like to learn concepts for improving performance. Audience level: programmers with good Python knowledge.

Prerequisites

Please bring your laptop with the operating system of your choice (Linux, Mac OS X, Windows). In addition to Python 2.6 or 2.7, we need:

  • RunSnakeRun (http://www.vrplumber.com/programming/runsnakerun)
  • the Guppy_PE framework (http://guppy-pe.sourceforge.net)
  • psyco (http://psyco.sourceforge.net, Python 2.6 only, version 1.5.2 or higher)
  • pypy (http://pypy.org)
  • NumPy (http://numpy.scipy.org, version 1.2 or higher)

Update: See updated tutorial preparation instructions at Faster Python Programs through Optimization

Method

This is a hands-on course. Students are strongly encouraged to work along with the trainer at the interactive prompt. There will also be exercises that students need to do on their own. Experience shows that this active involvement is essential for effective learning.

Outline

  • How fast is fast enough? (10 min)
  • Optimization guidelines (10 min)
      • Premature optimization
      • Optimization rules
      • Seven steps for incremental optimization
  • Optimization strategy (30 min)
      • Measuring in stones
      • Profiling CPU usage
      • Profiling memory usage
  • Algorithms and Anti-patterns (40 min)
      • String concatenation
      • List and generator comprehensions
      • The right data structure
      • Caching
  • The example (5 min)
  • Testing speed (5 min)
  • Pure Python (15 min)
  • Meet Psyco, the JIT (10 min)
  • Using PyPy (10 min)
  • NumPy for numeric arrays (10 min)
  • Using multiple CPUs with multiprocessing (20 min)
  • Combination of optimization strategies (10 min)
  • Results of different example implementations (5 min)
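
To give a flavor of the "String concatenation", "Caching" and "Testing speed" items in the outline, here is a minimal sketch, not taken from the tutorial itself. The names concat_plus, concat_join, memoize and fib, and the test data, are hypothetical. The code assumes Python 3, although the same ideas apply to Python 2.6 or 2.7.

```python
import timeit


def concat_plus(words):
    """Build a string by repeated += in a loop."""
    result = ""
    for word in words:
        result += word
    return result


def concat_join(words):
    """Build the same string with a single str.join call."""
    return "".join(words)


def memoize(func):
    """Cache results of a one-argument function in a dict."""
    cache = {}

    def wrapper(arg):
        if arg not in cache:
            cache[arg] = func(arg)
        return cache[arg]

    return wrapper


@memoize
def fib(n):
    """Naive recursive Fibonacci; the cache removes repeated work."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


words = ["spam"] * 10000  # hypothetical test data

# Time both concatenation variants on the same input.
for func in (concat_plus, concat_join):
    seconds = timeit.timeit(lambda: func(words), number=100)
    print("%-12s %.4f s" % (func.__name__, seconds))

print(fib(80))
```

The measured times depend on the interpreter and machine. Some interpreters optimize repeated += on strings, so the gap between the two variants may be small, which is one reason to measure rather than assume.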