Python, Linkers, and Virtual Memory

Type:
Talk
Audience level:
Intermediate
Category:
Core Python (Language, Stdlib)
March 11th 2:10 p.m. – 2:55 p.m.

Description

Why does “top” show that your Python process uses 110 MB of virtual memory but has a resident set size of 9 MB? Does it consume more memory to spawn several interpreters, or to run one Python and have it fork() further workers? What is an “undefined symbol,” anyway? Learn how an operating system manages memory and loads shared libraries, and what this means for Python servers and applications.

Abstract

If you have ever seen the error “undefined symbol” when running a Python program, then you have encountered dynamic linking: a feature of modern operating systems by which they minimize program size and maximize the memory shared between processes, but which requires software to have been compiled against exactly the right version of a third-party library.

This talk will tackle modern operating system memory management from the ground up, steadily building a picture of its impact on Python performance. By considering how this very limited resource is partitioned and managed by the operating system, we will arrive at very specific recommendations about how your Python program should be debugged, deployed, and monitored.

Many topics will be covered:

  1. The invention of virtual memory. The memory space of each Python process is a fiction sustained by the operating system and processor hardware. Why is this fiction, which originated in the 1960s, necessary? What does it accomplish? How much does it cost? And which parts of your Python application are allocated to the text segment, the stack, and the heap?

  2. Memory, caching, and swap. We will examine the hierarchy of storage media on a modern computer system, and how quickly costs grow as information moves several levels away from the processor. We will contrast swap, which persists physical memory pages not currently in active use, with disk buffers, by which information already on disk is brought much closer to the processor. When a machine begins thrashing, we will learn, our Python application gets clobbered.

  3. Linking to shared libraries. To avoid the expense of recompiling large programs, programmers invented linking so that they could combine pre-compiled objects together. After looking at how symbol tables are used in static linking, we will explore what happens when linking takes place at runtime instead, and how disk space and physical memory can be saved as a result. This will teach us why “dev” packages are so often necessary to compile Python extensions, and why undefined symbols result when libraries are missing or when we mix and match shared library versions.
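     As a taste of what attendees will see: a dynamic-loading failure can be provoked from pure Python with ctypes. This is a minimal sketch; the library name is deliberately bogus, and `try_load` is simply a helper introduced for the example.

     ```python
     import ctypes

     def try_load(name):
         """Return the dynamic loader's error message, or None on success."""
         try:
             ctypes.CDLL(name)
             return None
         except OSError as exc:
             return str(exc)

     # A deliberately nonexistent library: the loader reports the failure
     # at load time, much as a missing symbol surfaces at program startup.
     error = try_load("libno_such_library_xyz.so.0")
     print("dynamic loader said:", error)
     ```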

  4. Measuring the Memory Footprint. Stepping back to see how these pieces fit together in a typical Python process, we will learn how to measure its memory usage as accurately as we can. We will examine basic tools like “top” and the Windows tasklist, as well as lower-level sources of information like the /proc filesystem and scripts such as ps_mem.py.
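     For example, on Linux the virtual size and resident set size that “top” reports can be read directly from /proc. This sketch is Linux-only, and `vm_stats` is a name introduced just for the example.

     ```python
     def vm_stats(pid="self"):
         """Read VmSize and VmRSS, in kB, from /proc/<pid>/status (Linux only)."""
         stats = {}
         with open(f"/proc/{pid}/status") as f:
             for line in f:
                 if line.startswith(("VmSize:", "VmRSS:")):
                     field, value = line.split(":", 1)
                     stats[field] = int(value.split()[0])  # value reads "<n> kB"
         return stats

     if __name__ == "__main__":
         s = vm_stats()
         print(f"virtual: {s['VmSize']} kB  resident: {s['VmRSS']} kB")
     ```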

  5. Taking Control of Compilation. What if you want to make your own decisions about what the Python interpreter does and does not include? What if you want to link a library statically, gaining safety and portability in exchange for the benefits of dynamic linking? We will answer these questions by looking at how Python itself, and also individual extension modules, can choose to link statically against important libraries instead of leaving them dynamic; our specific example will be my pyzmq-static package on PyPI. Finally, we will consider how forking on Linux can produce a set of Python processes with as much memory in common as possible, provided the application is careful to build its shared state before forking.
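     The pre-fork pattern can be sketched in a few lines of Unix-only Python. (A caveat the talk will cover: CPython updates reference counts even on reads, so some shared pages are dirtied anyway; the sharing is real but imperfect.)

     ```python
     import os

     # Build the large, mostly read-only state *before* forking, so the
     # child shares the parent's physical pages copy-on-write instead of
     # receiving a duplicate of them.
     lookup = {i: str(i) for i in range(50_000)}

     pid = os.fork()
     if pid == 0:
         # Child: inherits `lookup` with no copying done at fork() time.
         assert lookup[123] == "123"
         os._exit(0)

     _, status = os.waitpid(pid, 0)
     print("child exit status:", status)
     ```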