Tuesday 4:15 p.m.–5 p.m.
Python Sorted Collections
Grant Jenks
- Audience level:
- Intermediate
- Category:
- Python Libraries
Description
Abstract
Python's standard library is great until you need sorted collections types. Most go far with the standard library's bisect
, heapq
, and queue.PriorityQueue
. Others mistake collections.OrderedDict
for a dictionary that maintains sort order. But when you really need a sorted list, dict, or set, you're faced with a dozen varying options on PyPI. Together we'll look at when to use sorted collections types and the tradeoffs of those options.
Of particular focus will be the SortedContainers module, an Apache2 licensed collections library, written in pure-Python and as fast as modules with C-extensions. Performance is a feature in SortedContainers and benchmarks across alternatives, runtimes, and workloads will be highlighted. After only a couple years, SortedContainers has gained popularity and unseated decade-old incumbents.
Our discussion will survey what makes SortedContainers so fast. We'll see why benchmarks matter most in claims about performance and why the strengths and weakness of modern processors affect how you choose your data structures. It's possible to write fast code in pure-Python! Come see where algorithms and theory intersect processors and machines.