Flexing SQLAlchemy's Relational Power

Brandon Rhodes

Type:: Talk
Audience level:: Intermediate
Category:: Databases/NoSQL

March 10th 11:05 a.m. – 11:45 a.m.

Description

How do you take the big step from casual SQLAlchemy user, who treats your database as a mysterious object store, to advanced power user, who optimizes critical queries, plans indexing and migrations, and generates efficient reports? This talk will teach you how databases think; why humanity invented the Relational Algebra; and how SQLAlchemy grants you access to relational power.

Abstract

While drawing enlightening comparisons between SQLAlchemy, the Django ORM, and several NoSQL databases, this talk will focus most of its energy on understanding relational databases — a technology foundation that has been steadily improved since the early 1970s. The talk will first tackle the big questions that all databases have to answer, then teach specific SQLAlchemy techniques for taking advantage of relational queries.

Records and Indexes: whether a database is relational, key-value, document-based, or hierarchical, it must both store some kind of record, and also allow indexes to be built across its collection of records. After reviewing why hardware speeds make indexes a necessity, we will consider their structure, performance, cost (especially for writing), and the trade-offs between building them directly from data versus through functions or map-reduce mechanisms.
The Relational Algebra and Query Optimization: we will learn — using concrete, well-illustrated examples — how relational databases use a combination of powerful indexes and intelligent query planning to support “normalized” data storage. This will be briefly contrasted with the different normalizations approach encouraged by modern document databases.
Building Queries: with brief glances at pure SQL syntax for the very curious, we will learn how SQLAlchemy lets you build SQL queries as a series of Python method calls. We will see how queries can return either raw rows or ORM class instances, and how the use of joins reduces the number of round-trips to the database.
Advanced ORM: finally, we will use our knowledge of database structure and queries to see how high-level ORM operations can benefit from eager loading, query-specific indexes, query logging, and the EXPLAIN operator to learn how your database is — or is not — optimizing your operations. Finally, we will review the transactional nature of relational databases and learn about the SQLAlchemy object cache (pointing out the big difference between it and the Django ORM), the SQLAlchemy unit-of-work construct, and how these can vastly confuse you if you are not prepared for their behavior.