Sunday 1:50 p.m.–2:20 p.m.

PostgreSQL is Web Scale (Really :) )

Hannu Krosing

Audience level:
Intermediate
Category:
Databases

Description

This talk outlines and demonstrates the use of PostgreSQL for data storage and processing scenarios where the new common wisdom would usually turn to "NoSQL" databases for scalability reasons. It demonstrates both NoSQL-style usage and techniques that use more traditional relational storage models, with the adjustments required for infinite scalability.

Abstract

Why is NoSQL the current hot scalability solution

First I discuss why NoSQL is easier to scale than traditional relational databases. Then I describe how NoSQL-style compromises can be made in SQL databases, often with similar or better results because of decades of work on making the base data engine really fast and robust on single servers.

How to do the same and more using PostgreSQL

Finally I show you how to set up a Python and **PostgreSQL** based system which is easy to scale, provides ACID guarantees where they are needed most, and makes compromises for scalability and availability where the latter are deemed more important.

The best thing is that this kind of scalability works for both OLTP and OLAP workloads, so the learning effort pays off for both. Sometimes you can even have just a single large "database" which can take almost any type of transaction processing **and** analytics load. (Though it may still be a good idea to have many instances/replicas of it for robustness.)

Also, if you'd rather not write any SQL, you can do all the OLTP stuff in a more Pythonic way using an automagically generated ORM layer inside the database, near the data. If you are really masochistic, you can also use the same ORM for map-reduce type distributed data processing, though for more complex data queries the small effort of learning SQL usually pays off big by letting the database optimizer select the best plan for the "map" part of map-reduce. But as I said, everything runs inside the database, **near the data**, and thus even the ORM and map-reduce analytics work fast.

A few comparative benchmarks are provided for NoSQL-style workloads on popular NoSQL databases and PostgreSQL.
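The abstract does not say how the scale-out is implemented, so the following is only a minimal sketch of one common approach, under stated assumptions: keys are routed to a partition by a stable hash, and analytics run as a "map" step on each partition followed by a "reduce" step that merges the partial results. The in-memory dicts stand in for separate PostgreSQL databases, and every name here (`N_SHARDS`, `shard_for`, `put`, `get`, `map_reduce`) is hypothetical and not taken from the talk.

```python
import hashlib
from collections import Counter

# Hypothetical shard count; each dict below stands in for one PostgreSQL database.
N_SHARDS = 4
shards = [dict() for _ in range(N_SHARDS)]


def shard_for(key: str) -> int:
    """Map a key to a shard number using a stable hash.

    Python's built-in hash() is salted per process for strings,
    so a digest is used to get the same answer on every run.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % N_SHARDS


def put(key: str, doc: dict) -> None:
    """Key-value style write: the document goes to exactly one shard."""
    shards[shard_for(key)][key] = doc


def get(key: str):
    """Key-value style read: only the owning shard is consulted."""
    return shards[shard_for(key)].get(key)


def map_reduce(map_fn, reduce_fn):
    """Run map_fn on every shard, then merge the partial results."""
    partials = [map_fn(shard) for shard in shards]
    return reduce_fn(partials)


def count_by_country(shard: dict) -> Counter:
    """'Map' step: aggregate locally, next to the data."""
    return Counter(doc["country"] for doc in shard.values())


def merge_counts(partials) -> Counter:
    """'Reduce' step: combine the per-shard aggregates."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


# Example usage with made-up data.
put("alice", {"country": "EE"})
put("bob", {"country": "US"})
put("carol", {"country": "EE"})
totals = map_reduce(count_by_country, merge_counts)
```

In a real deployment the per-shard "map" step would be a SQL query or an in-database function, so the optimizer chooses the plan and only the small aggregated result travels over the network.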