<blockquote class="callout" style="text-align: center; margin: auto; width: 45%">
<h3>PyCon US 2018 Talk Lineup</h3>
See the schedule in grid form <a href="/2018/schedule/talks">here</a>
</blockquote>
James Bennett
Saturday
5:10 p.m.–5:40 p.m.
in
Grand Ballroom B
At some point every Python programmer sees Python bytecode files -- they're those '.pyc' files Python likes to leave behind after it runs. But have you ever wondered what's really going on in those files? Well, wonder no more! In this talk you'll learn what Python bytecode is and how it's used to execute your code, as well as how to decipher and read it, and how to reason about bytecode to understand the performance of your Python code.
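As a small taste of what reading bytecode looks like, the standard library's `dis` module can disassemble a function. This is a minimal sketch and is not taken from the talk; the function `add` is a made-up example, and the exact instruction names differ between Python versions.

```python
import dis


def add(a, b):
    """A tiny, hypothetical function to disassemble."""
    return a + b


# Collect the names of the bytecode instructions compiled for add().
# The names vary by Python version (for example BINARY_ADD vs. BINARY_OP).
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

if __name__ == "__main__":
    # Print the human-readable disassembly listing.
    dis.dis(add)
```

Running the script prints one line per instruction, which is usually the easiest way to start reasoning about what the interpreter does for a given line of source.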
Irina Truong
Friday
2:35 p.m.–3:05 p.m.
in
Grand Ballroom A
Until very recently, Apache Spark has been a de facto standard choice of a framework for batch data processing. For Python developers, diving into Spark is challenging, because it requires learning the Java infrastructure, memory management, and configuration management. The multiple layers of indirection also make it harder to debug things, especially when throwing the PySpark wrapper into the equation.
With Dask emerging as a pure Python framework for parallel computing, Python developers might be looking at it with new hope, wondering if it might work for them in place of Spark. In this talk, I’m using a data aggregation example to highlight the important differences between the two frameworks, and make it clear how involved the switch may be.
Note: Just in case it's unclear, there's no Java of any kind in this talk. All the code / examples use Python (PySpark).
Philip James, Asheesh Laroia
Friday
5:10 p.m.–5:40 p.m.
in
Grand Ballroom B
In this talk, you’ll learn about a category of security issue known as side channel attacks. You’ll be amused to see how features like automatic data compression, short-circuit execution, and deterministic hashing can be abused to bypass security systems. No security background knowledge is required. The talk assumes at least intermediate Python experience.
We’ll take a tour of real side channel vulnerabilities in open source Python codebases, including the patches that fixed them. The talk also offers practical advice for avoiding these issues. Our goal is to demystify this topic, even if you aren’t writing security-critical software.
Alex Petralia
Friday
3:15 p.m.–3:45 p.m.
in
Global Center Ballroom AB
“So tell me,” my manager said, “what is an average?”
There’s probably nothing worse than that sinking feeling when you finish an analysis, email it to your manager or client to review, and they point out a mistake so basic you can’t even fathom how you missed it.
This talk is about mine: how to take an average.
Averages are something we use everywhere - it’s a simple np.mean() in pandas or AVG() in SQL. But recently I’ve come to appreciate just how easy it is to calculate this statistic incorrectly. We learn once - in middle school no less - how to take an average, and never revisit it. Then, when we are faced with multidimensional datasets (i.e., pretty much every dataset out there), we never reconsider whether we should be taking an average the same way.
In this talk, we follow my arduous and humbling journey of learning how to properly take an average with multidimensional data. We will cover how improperly calculating it can produce grossly incorrect figures, which can slip into publications, research analyses and management reports.
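One way the pitfall can show up (an illustrative assumption, not necessarily the example used in the talk) is averaging per-group averages instead of the underlying rows. The sketch below uses made-up order values for two hypothetical stores.

```python
from statistics import mean

# Hypothetical data: order values grouped by store. Store "a" has many
# more orders than store "b".
orders = {
    "a": [10, 10, 10, 10],
    "b": [50],
}

# Average of the per-store averages: every store counts equally.
mean_of_means = mean(mean(values) for values in orders.values())

# Average over all orders: every order counts equally.
pooled_mean = mean(value for values in orders.values() for value in values)

if __name__ == "__main__":
    print("mean of store means:", mean_of_means)
    print("mean of all orders:", pooled_mean)
```

The two figures come out as 30 and 18 for this data. Neither is wrong in itself; they answer different questions ("what does a typical store look like?" versus "what does a typical order look like?"), which is why the grouping dimension matters.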
Daniel Pyrathon
Sunday
1:50 p.m.–2:20 p.m.
in
Room 26A/B/C
Recommender systems have become increasingly popular in recent years, and are used by some of the largest websites in the world to predict the likelihood of a user taking an action on an item. In the world of Netflix, this means recommending similar movies to the ones you have seen. In the world of dating, this means suggesting matches similar to people you already showed interest in!
My path to recommenders has been an unusual one: from a Software Engineer to working on matching algorithms at a dating company, with a little background on machine learning. With my knowledge of Python and the use of basic SVD (Singular Value Decomposition) frameworks, I was able to understand SVDs from a practical standpoint of what you can do with them, instead of focusing on the science.
In my talk, you will learn 2 practical ways of generating recommendations using SVDs: matrix factorization and item similarity. We will be learning the high-level components of SVD the "doer way": we will be implementing a simple movie recommendation engine with the help of Jupyter notebooks, the MovieLens database, and the Surprise recommendation package.
Kelsey Pedersen
Friday
11:30 a.m.–noon
in
Grand Ballroom A
Do we even need humans? Humans and data science are flawed on their own. Humans lack the ability to process large volumes of information. Machines lack intuition, empathy, and nuance. You'll learn how to guide users of expert-use systems by applying data science to their user experience. This allows us to take advantage of the human touch while leveraging our large datasets. What is the relationship between human decisions and algorithms? Are we thinking about data science all wrong? In this talk, you'll learn the ways we balance human decisions and data science throughout our applications, the challenges we have faced along the way, and the future of the relationship between humans and data.
Kyle Knapp
Sunday
2:30 p.m.–3 p.m.
in
Grand Ballroom A
Writing quality Python code can be both tough and tedious. On top of the general design, there are many code quality aspects that you need to watch out for when writing and reviewing code such as adherence to PEP8, docstring quality, test quality, etc. Furthermore, everyone is human. If you are catching these code quality issues by hand, there is a good chance that at some point you will miss an easy opportunity to improve code quality. If the quality check can be done by a machine, then why would you even try to catch the code quality issue by hand? In the end, the machine will be able to perform the quality check with much more speed, accuracy, and consistency than a person.
This talk will dive into how existing open source projects offload and automate many of these code quality checks resulting in:
- A higher quality and a more consistent codebase
- Maintainers being able to focus more on the higher level design and interfaces of a project.
- An improved contribution process and higher quality pull requests from external contributors
By diving into how these open source projects automate code quality checks, you will learn about:
- The available tooling related to checking code quality such as `flake8`, `pylint`, `coverage`, etc.
- How to automate code quality checks for both a development and team setting.
- First-hand accounts of the benefits and lessons learned from automating code quality checks in real-life open source projects.
Christopher Fonnesbeck
Saturday
1:40 p.m.–2:25 p.m.
in
Global Center Ballroom AB
Nowadays, there are many ways of building data science models using Python, including statistical and machine learning methods. I will introduce probabilistic models, which use Bayesian statistical methods to quantify all aspects of uncertainty relevant to your problem, and provide inferences in simple, interpretable terms using probabilities. A particularly flexible form of probabilistic models uses Bayesian *non-parametric* methods, which allow models to vary in complexity depending on how much data are available. In doing so, they avoid the over-fitting that is common in machine learning and statistical modeling. I will demonstrate the basics of Bayesian non-parametric modeling in Python, using the PyMC3 package. Specifically, I will introduce two common types, Gaussian processes and Dirichlet processes, and show how they can be applied easily to real-world problems using two examples.
Andrew Knight
Friday
12:10 p.m.–12:40 p.m.
in
Grand Ballroom A
Behavior-Driven Development (BDD) is gaining popularity as an improved way to collaborate over product features and tests. In Python, **behave** is one of the leading BDD test frameworks. Using **behave**, teams write Gherkin behavior scenarios (e.g., tests) in plain language, and then programmers write Python code to automate the steps. BDD testing is great because tests are self-documenting and steps abide by the DRY principle. An example test could be:
> Given the DuckDuckGo home page is displayed
> When the user searches the phrase "Python"
> Then search results for "Python" are shown
This talk will teach how to use **behave** to develop well-designed test scenarios and a robust automation framework. It will focus on the layers of the behave framework: feature files, step definitions, support classes, and config files. A full example project will be hosted on GitHub for audience members to reference after the talk.
Julie Lavoie
Friday
2:35 p.m.–3:05 p.m.
in
Grand Ballroom B
Scraping one web site for information is easy; scraping 10,000 different sites is hard. Beyond page-specific scraping, how do you build a program that can extract the publication date of (almost) any news article online, no matter the web site?
We’ll cover when to use machine learning vs. humans or heuristics for data extraction, the different steps of how to phrase the problem in terms of machine learning, including feature selection on HTML documents, and issues that arise when turning research into production code.
Hillel Wayne
Friday
12:10 p.m.–12:55 p.m.
in
Room 26A/B/C
You've used pytest and you've used mypy, but bugs are still slipping through your code. What's next? In this talk, we cover two simple but powerful tools for keeping your code problem-free. Property-based testing, provided by the [Hypothesis](https://hypothesis.readthedocs.io/en/latest/) library, lets you run hundreds of tests from a single template. Contracts, via [dpcontracts](https://github.com/deadpixi/contracts), make your program test itself. You'll learn how and why to use these tools and how to combine them with the rest of your testing suite.
Ned Batchelder
Sunday
2:30 p.m.–3 p.m.
in
Grand Ballroom C
Big-O is a computer science technique for analyzing how code performs as data gets larger. It's a very handy tool for the working programmer, but it's often shrouded in off-putting mathematics.
In this talk, I'll teach you what you need to know about Big-O, and how to use it to keep your programs running well. Big-O helps you choose the data structures and algorithms that will let your code work efficiently even on large data sets.
You can understand Big-O even if you aren't a theoretical computer science math nerd. Big-O isn't as mystical as it appears. It's wrapped in mathematical trappings, but doesn't have to be more than a common-sense assessment of how your code will behave.
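To make the "common-sense assessment" concrete, here is a minimal sketch (not from the talk) that counts how many elements two search strategies examine. The function names are hypothetical; the point is only that one count grows with the size of the data and the other grows with its logarithm.

```python
def linear_search_steps(items, target):
    """Count the elements examined when scanning left to right: O(n)."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Count the elements examined when halving a sorted list: O(log n)."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            break
        if sorted_items[middle] < target:
            low = middle + 1
        else:
            high = middle - 1
    return steps


if __name__ == "__main__":
    data = list(range(1_000_000))
    print("linear:", linear_search_steps(data, 999_999))
    print("binary:", binary_search_steps(data, 999_999))
```

For a sorted list of a million integers, the linear scan examines every element in the worst case, while the halving strategy needs at most 20 probes.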
Kirk Kaiser
Friday
2:35 p.m.–3:05 p.m.
in
Grand Ballroom C
In the past few years, the power of computer vision has exploded. In this talk, we'll apply a deep learning model to a bird feeder. We'll use that model to detect, identify, and record birds that come to a smart bird feeder.
Along the way, we'll learn about different platforms to deploy deep learning cameras on, from the lowly Raspberry Pi all the way up to the powerful NVIDIA Jetson embedded computer with a built-in GPU.
Sam Kitajima-Kimbrel
Friday
5:10 p.m.–5:40 p.m.
in
Global Center Ballroom AB
Facebook, Google, Uber, LinkedIn, and friends are the rarefied heights of software engineering. They encounter and solve problems at scales shared by few others, and as a result, their priorities in production engineering and architecture are just a bit different from the rest of us down here in the other 99% of services. Through deconstructing a few blog posts from these giants, we’ll evaluate just what it is that they’re thinking about when they build systems and whether any of their choices are relevant to those of us operating at high scale yet still something less than millions of requests per second.
This talk will go into depth on how to make technological decisions to meet your customers’ requirements without requiring a small army of engineers to answer 2 AM pages, and how to set realistic goals for your team around operations, uptime, communications, and disaster recovery.
With these guidelines in mind, you should be better equipped to say no (or yes!) the next time your team’s software hipster proposes moving everything to the Next Big Thing.
Russell Keith-Magee
Saturday
2:35 p.m.–3:05 p.m.
in
Grand Ballroom C
Have you ever wanted to write a GUI application you can run on your laptop? What about an app that you can run on your phone? Historically, these have been difficult to achieve with Python, and impossible to achieve without learning a different API for each platform. But no more.
BeeWare is a collection of tools and libraries that allows you to build cross-platform native GUI applications in pure Python, targeting desktop, mobile and web platforms. In this talk, you'll be introduced to the BeeWare suite of tools and libraries, and see how you can use them to develop, from scratch, a GUI ChatBot application that can be deployed as a standalone desktop application, a mobile phone application, and a single page webapp - without making any changes to the application's codebase.
Jiaqi Liu
Friday
4:30 p.m.–5 p.m.
in
Room 26A/B/C
It’s one thing to build a robust data pipeline process in Python but a whole other challenge to find tooling and build out the framework that allows for testing a data process. In order to truly iterate and develop a codebase, one has to be able to confidently test during the development process and monitor the production system.
In this talk, I hope to address the key components for building out end-to-end testing for data pipelines by borrowing concepts from how we test Python web services. Just like how we want to check for healthy status codes from our API responses, we want to be able to check that a pipeline is working as expected given the correct inputs. We’ll talk about key features that allow a data pipeline to be easily testable and how to identify timeseries metrics that can be used to monitor the health of a data pipeline.
Joyce Jang
Friday
10:50 a.m.–11:20 a.m.
in
Grand Ballroom C
We build product and software as teams. And as anyone who has worked on a team knows, there’s often a lot more that goes into working together to build that product than actually just building the product itself. A highly functional team is not as elusive as it may seem. Software engineering is a skill we’ve developed, but even more importantly, software engineering on teams is another skill we’ve been practicing and improving on as an industry. Software engineering principles and best practices may seem to have very little to do with teamwork, but by thoughtfully applying some of what we’ve learned as engineers to teamwork, we can help move towards creating such success with our teams.
Dmitry Filippov, Ewa Jodlowska
Saturday
1:55 p.m.–2:25 p.m.
in
Room 26A/B/C
Want to know about the latest trends in the Python community and see the big picture of how things have changed over the last few years? Interested in the results of the latest official Python Developers Survey 2017, which was supported by the Python Software Foundation and gathered responses from more than 10,000 Python developers? Come learn about the most popular types of Python development, trending frameworks, libraries and tools, additional languages being used by Python developers, Python version usage statistics, and many other insights from the world of Python. All of this is derived from actual data and professional research: the Python Developers Survey 2017, organized in partnership between the Python Software Foundation and JetBrains; the Python Developers Survey 2016; third-party surveys; and supplementary analytical research.
Greg Price
Sunday
1:50 p.m.–2:20 p.m.
in
Grand Ballroom B
Python now offers static types! Companies like Dropbox and Facebook, and open-source projects like Zulip, use static types (with [PEP 484](https://www.python.org/dev/peps/pep-0484/) and [mypy](https://github.com/python/mypy)) to make Python more productive and fun to work with — in existing codebases from 40k lines to 4 million, in Python 2 and 3, and while preserving the conciseness and flexibility that make Python a great language in the first place. I’ll describe how.
Reading and understanding code is a huge part of what we do as software developers. If we make it easier to understand our codebases, we make everyone more productive, help each other write fewer bugs, and lower barriers for new contributors. That's why Python now features optional static types, and why Dropbox, [Facebook](https://engineering.instagram.com/let-your-code-type-hint-itself-introducing-open-source-monkeytype-a855c7284881), and [Zulip](https://blog.zulip.org/2016/10/13/static-types-in-python-oh-mypy/) use them on part or all of their Python code.
In this talk, I’ll share lessons from Zulip’s and Dropbox’s experience — having led the mypy team at Dropbox and working now on the Zulip core team — for how you can start using static types in your own codebases, large or small. We’ll discuss how to make it a seamless part of your project’s tooling; what order to approach things in; and powerful new tools that make it even easier today to add static types to your Python codebase than ever before.
Sophie Rapoport
Friday
11:30 a.m.–noon
in
Global Center Ballroom AB
As engineers, we care a lot about the reliability of our applications. When a website falls over, pagers go off, and engineers burst into action to bring a site back to life. Postmortems are written, and teams develop strategies to prevent similar failures in the future.
But what about the reliability of our data? Would _you_ trust financial reports built on your data?
If not, what can you do to improve data health? If you _would_ trust these reports, how can you prove to customers, investors, and auditors alike that they should too?
In this talk, you’ll learn to apply strategies from the world of dev-ops to data. You’ll learn about questions auditors ask that can help you pinpoint data problems. You’ll also learn some accounting-specific tools for accurate and timely record keeping that I’ve found fascinating and helpful!
Stacy Morse
Friday
10:50 a.m.–11:20 a.m.
in
Grand Ballroom A
Code reviews don't have to be time-consuming, morale-zapping, arduous tasks. Not only can they catch bugs and errors, but they can contribute in positive ways to the individual developer, the team, management, and the company as a whole.
Art critiques have existed in academia for hundreds of years. The methodology of the critique has evolved to be time sensitive and productive, while keeping the enthusiasm of the student artist intact.
The purpose of the art critique is to get peers and mentors to look at the work and raise any problems they may see. It's also a time when people with more experience can contribute their knowledge in a helpful way. This process is about producing the best work quickly, in a productive and constructive way.
These methods can be applied to code review.
Chris Schuhmacher
Friday
10:50 a.m.–11:20 a.m.
in
Grand Ballroom B
In 2017, I was released from prison after serving 17 years. One of the most transformational experiences I had while incarcerated was learning to code, through a pioneering new program called Code.7370 — the first coding curriculum in a United States prison.
In this talk, I’d like to share my experiences learning to code in prison and getting a software engineering job after my release, with the goals of:
- Inspiring new programmers to stick with it and be confident in their abilities
- Inspiring educators to think about how to support new coders in a broad range of learning environments (there’s no internet in prison!)
- Inspiring everyone to think about the potential for rehabilitation in prison in a new way
Christopher Swenson
Saturday
12:10 p.m.–12:55 p.m.
in
Grand Ballroom B
Colossal Cave, also known as Adventure or ADVENT, is the original text adventure. It was written in FORTRAN IV and there is practically no way to run the original program without translating it. We'll explore software archeology to write a Python interpreter to run the FORTRAN code as-is, without translating it. Come learn about pre-ASCII and 36-bit integers and writing interpreters in Python!
And, we'll show how to use BeeWare's Batavia Python interpreter (in JavaScript) to execute the program. FORTRAN IV in Python in JavaScript in your browser!
Nir Arad
Saturday
3:15 p.m.–3:45 p.m.
in
Grand Ballroom A
Testing mobile applications is hard. Testing manually is nearly impossible.
That’s where automated testing shines. Just sit back and watch the machine go!
Python is a very powerful language for writing automated tests, but since Python is not installed on mobile platforms, we need to find a way to remotely control and monitor the device.
But how do we automate a device remotely? The answer is Appium.
In this talk I will go over the process of deploying and testing iOS (or Android) applications, and how to work with Appium to easily generate Python 3 code for testing your application.
Amit Saha
Saturday
11:30 a.m.–noon
in
Grand Ballroom B
Setting up application monitoring is often an afterthought, and in the speaker's opinion can be a bit overwhelming to get started with. What is a `metric`? What is a `gauge`? What is a `counter`? What's that `upper 90` metric you have up on your `dashboard`? And what *all* metrics should I monitor?
This talk aims to get you started on the monitoring journey in Python. In addition to clearing up some of the jargon, we will look at `statsd` and `prometheus` monitoring systems and how to integrate our applications with these.
Without the numbers, we are really flying blind!
Shohei Hido
Friday
2:35 p.m.–3:05 p.m.
in
Global Center Ballroom AB
![Logo][1]
[**Website**](https://cupy.chainer.org/) | [**Docs**](https://docs-cupy.chainer.org/en/stable/) | [**Install Guide**](https://docs-cupy.chainer.org/en/stable/install.html) | [**Tutorial**](https://docs-cupy.chainer.org/en/stable/tutorial/) | **Examples** ([Official](https://github.com/cupy/cupy/blob/master/examples)) | [**Forum**](https://groups.google.com/forum/#!forum/cupy)
CuPy is an open-source library with NumPy syntax that increases speed by doing matrix operations on NVIDIA GPUs. It is accelerated with the CUDA platform from NVIDIA and also uses CUDA-related libraries, including cuBLAS, cuDNN, cuRAND, cuSOLVER, cuSPARSE, and NCCL, to make full use of the GPU architecture. CuPy's interface is highly compatible with NumPy; in most cases it can be used as a drop-in replacement. CuPy supports various methods, data types, indexing, broadcasting, and more.
[1]: https://raw.githubusercontent.com/cupy/cupy/master/docs/image/cupy_logo_1000px.png
Raymond Hettinger
Saturday
3:15 p.m.–4 p.m.
in
Grand Ballroom C
The PEP 557 dataclasses module is available starting in Python 3.7. It will become an essential part of every Python programmer's toolkit. This talk shows what problem the module solves, explains its key design decisions, and provides practical examples of how to put it to work.
Dataclasses are shown to be the next step in a progression of data aggregation tools: tuple, dict, simple class, bunch recipe, named tuples, records, attrs, and then dataclasses. Each builds upon the one that came before, adding expressiveness at the expense of complexity.
Dataclasses are unique in that they let you selectively turn on or turn off their various capabilities, and they let the user choose the underlying data store (either instance dictionary, instance slots, or an inherited base class).
Dataclasses and typing.NamedTuple both use variable annotations which were new in Python 3.6.
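A minimal sketch of what "selectively turning capabilities on or off" can look like, using only the standard library (Python 3.7+). The `Version` class and its fields are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, field
from typing import List


# order=True generates comparison methods; frozen=True makes instances
# read-only. Both are opt-in switches on the decorator.
@dataclass(order=True, frozen=True)
class Version:
    """Hypothetical record type; fields are declared with annotations."""

    major: int
    minor: int = 0
    # compare=False leaves this field out of equality and ordering.
    tags: List[str] = field(default_factory=list, compare=False)


v1 = Version(3, 6)
v2 = Version(3, 7, tags=["dataclasses"])

if __name__ == "__main__":
    print(v1)
    print(v1 < v2)
```

The generated `__repr__`, `__eq__`, and ordering methods come from the field declarations alone, and the per-field `compare=False` switch shows that capabilities can be tuned per field as well as per class.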
Anna Nicanorova
Saturday
10:50 a.m.–11:20 a.m.
in
Grand Ballroom B
Data visualization charts are supposed to be our map to information. However, when making charts, we customarily just resize lines and circles based on metrics instead of creating a data-driven version of reality. Contemporary charting techniques have a few shortcomings (especially when dealing with high-dimensional datasets):
* **Context Reduction**: in order to fit a high-dimensional dataset into a chart, one needs to filter, aggregate, or flatten the data, which strips away the full context of the information. Without context, most charts show only a part of the story, which can lead to data misinterpretation or misunderstanding.
* **Numeric Thinking**: humans naturally have a hard time perceiving big numbers. While data visualization is supposed to help us conceptualize large volumes, unless the dataset is carefully prepared, 2D charts rarely give us an intuitive grasp of magnitude.
* **Perceptual de-humanization**: when examining charts, it is easy to forget that we are dealing with activity in the real world instead of lines and bars.
Augmented/Mixed Reality can potentially solve all of the issues listed above by presenting an intuitive and interactive environment for data exploration. Three-dimensional space provides conditions to create complex data stories with more “realistic assets” (beyond lines and bars). The talk will present the architecture required to create an MR data visualization story with Python (70% of architecture), starting with drawing 3D assets in a data-driven way and finishing with deployment on MR devices.
Holden Karau
Friday
3:15 p.m.–3:45 p.m.
in
Grand Ballroom A
Apache Spark is one of the most popular big data projects, offering greatly improved performance over traditional MapReduce models. Much of Apache Spark’s power comes from lazy evaluation along with intelligent pipelining, which can make debugging more challenging. This talk will examine how to debug Apache Spark applications, the different options for logging in PySpark, as well as some common errors and how to detect them.
Spark’s own internal logging can often be quite verbose, and this talk will examine how to effectively search logs from Apache Spark to spot common problems. In addition to the internal logging, this talk will look at options for logging from within our program itself.
Spark’s accumulators have gotten a bad rap because of how they interact in the event of cache misses or partial recomputes, but this talk will look at how to effectively use Spark’s current accumulators for debugging, as well as a look to the future for data property type accumulators, which may be coming to Spark in a future version.
In addition to reading logs, and instrumenting our program with accumulators, Spark’s UI can be of great help for quickly detecting certain types of problems.
Debuggers are a wonderful tool; however, when you have 100 computers the “wonder” can be a bit more like “pain”. This talk will look at how to connect remote debuggers, but also remind you that it’s probably not the easiest path forward.
Van Lindberg
Friday
11:30 a.m.–noon
in
Grand Ballroom B
In 2011 I gave a talk about "Killing Patents with Python" - finding the right piece of prior art by using statistical natural language processing techniques on the US Patent Database. A number of unexpected benefits came out of that exploration, including the ability to describe large patent portfolios and businesses in a way that had not been done before.
Since then, the state of the art has advanced - and so has the ability to do strange and wonderful things by applying the latest neural network-based analysis to the nine million patents and patent applications that people have submitted to the USPTO. Not only can we learn new things about what people have invented, we might just be able to get the computer to do a little "inventing" itself.
Matthew Rocklin
Saturday
10:50 a.m.–11:20 a.m.
in
Room 26A/B/C
We use JupyterHub, XArray, Dask, and Kubernetes to build a cloud-based system to enable scientists to analyze and manage large datasets. We use this in practice to serve a broad community of atmospheric and climate scientists.
Atmospheric and climate scientists analyze large volumes of observational and simulated data to better understand our planet. They have historically used tools like NumPy and SciPy along with Jupyter notebooks to combine efficient computation with accessibility. However, as datasets increase in size and collaboration extends to new populations of scientists these tools begin to feel their age. In this talk we use more recent libraries to build a modern deployment for academic scientists. In particular we use the following tools:
- **Dask:** to parallelize and scale NumPy computations
- **XArray:** as a self-describing data model and tool kit for labeled and indexed arrays
- **JupyterLab:** to enable more APIs for users beyond the classic notebook
- **JupyterHub:** to manage users and maintain environments for a new population of cloud-friendly users
- **Kubernetes:** to manage everything and deploy easily on cloud hardware
This talk will focus less on how these libraries work and will instead be a case study of using them together in an operational setting. During the talk we will build up and deploy a running system that the audience can then use to access distributed computing resources.
Lisa Roach
Friday
12:10 p.m.–12:55 p.m.
in
Grand Ballroom B
One of the most challenging and important things for Python developers to learn is the unittest mock library. The patch function is particularly confusing: there are many different ways to use it. Should I use a context manager? Decorator? When would I use it manually? Improperly used patch functions can make unit tests useless, all the while making them look as if they are correctly testing code. Let’s learn how to wield patch with confidence!
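The three styles mentioned above can be sketched side by side with `unittest.mock.patch` from the standard library. This is an illustrative example rather than material from the talk; `current_dir_name` and the paths are made up.

```python
import os
from unittest import mock


def current_dir_name():
    """Hypothetical function under test: basename of the working directory."""
    return os.path.basename(os.getcwd())


# 1. As a context manager: the patch is active only inside the with block.
def check_with_context_manager():
    with mock.patch("os.getcwd", return_value="/tmp/project") as fake_getcwd:
        assert current_dir_name() == "project"
        fake_getcwd.assert_called_once_with()


# 2. As a decorator: the mock is passed to the function as an extra argument.
@mock.patch("os.getcwd", return_value="/srv/app")
def check_with_decorator(fake_getcwd):
    assert current_dir_name() == "app"
    fake_getcwd.assert_called_once_with()


# 3. Manually: the caller is responsible for pairing start() with stop().
def check_manually():
    patcher = mock.patch("os.getcwd", return_value="/home/user/demo")
    patcher.start()
    try:
        assert current_dir_name() == "demo"
    finally:
        patcher.stop()


if __name__ == "__main__":
    check_with_context_manager()
    check_with_decorator()
    check_manually()
    print("all three patch styles behaved the same way")
```

In all three cases the original `os.getcwd` is restored afterwards; the manual form is the only one where forgetting the cleanup step would leak the patch into other tests.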
Lilly Ryan
Saturday
1:55 p.m.–2:25 p.m.
in
Grand Ballroom C
In the 1850s, Edward Orange Wildman Whitehouse was appointed the lead engineer of the first attempt to build a trans-Atlantic telegraph cable. With the entire population of two continents waiting for his go-live, their handlebar moustaches aquiver, he demonstrated in fine form just how spectacularly a big project can be a bigger disaster.
This is a tale of long-winded rants, spectacular sideburns, and gentlemen scientists behaving badly. It is also a lesson about the importance of honest reflection in technical teamwork. Lilly outlines some of the mistakes made during one of the biggest tech delivery projects in history, and how a constructive view of failure helped to turn it all around. Through the public meltdowns of Wildman Whitehouse you will learn the importance of feedback, how to handle complex tasks gracefully, and the best way to recover from having your pipeline eaten by a whale.
Paul Vincent Craven
Saturday
2:35 p.m.–3:05 p.m.
in
Room 26A/B/C
Want to have fun with Python? Do something visual? Get started today? Learn how to draw, animate, and use sprites for games with the [Python Arcade](http://arcade.academy/) library.
"Arcade" is an easy-to-use Python library for creating 2D arcade games. We'll show you how to get started creating your own game, and find plenty of example code to get an idea of what you can do with this library. If you are familiar with PyGame, Arcade is easier, more powerful, and uses recent Python features like type hinting and decorators.
This talk is great for beginners, educators, and people who want to create their own arcade games.
Claudio Freire
Sunday
1:10 p.m.–1:40 p.m.
in
Grand Ballroom B
Multithreading makes shared memory easy, but true parallelism next to impossible. Multiprocessing gives us true parallelism, but it makes sharing memory very difficult and adds high overhead. In this talk, we'll explore techniques to share memory between processes efficiently, with a focus on sharing read-only massive data structures.
Mario Corchero
Sunday
1:10 p.m.–1:40 p.m.
in
Room 26A/B/C
Logs are our best friend, especially on those late nights when we try to troubleshoot a problem in production that was written by a co-worker who is on vacation. Logs are the main way to know what is happening with an application at runtime, but we don’t realize how important they are until we actually need them. Unfortunately, they are usually an under-estimated part of the development process.
This talk aims to convey the need for the logging module, briefly explains how to use it and how it is built, and dives into all the complexity that is hidden from us. This will help attendees not just understand all the magic that allows us to inspect our applications at runtime, but also to avoid mistakes and adapt the module to our needs for more esoteric scenarios.
The talk is structured to simplify the understanding of the logging module. Many people have read the documentation, but still struggle to fully understand what is happening under the hood. This talk aims to eliminate that barrier by presenting it in an easier-to-digest manner.
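As a minimal sketch of the pieces the logging module is built from (a logger, a handler, and a formatter), the following wires them together by hand. The logger name `demo.app` and the message format are illustrative assumptions, not taken from the talk.

```python
import io
import logging


def build_logger(stream):
    """Wire a logger to a handler and a formatter, the three core pieces."""
    logger = logging.getLogger("demo.app")  # hypothetical logger name
    logger.setLevel(logging.INFO)           # records below INFO are dropped
    logger.propagate = False                # keep records away from the root logger
    logger.handlers.clear()                 # avoid duplicate handlers on repeated calls

    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.debug("not emitted: below the INFO threshold")
log.info("service started")
output = buffer.getvalue()
```

Writing to an in-memory buffer keeps the example self-contained; in an application the handler would more likely point at a file or standard error.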
Nina Zakharenko
Friday
5:10 p.m.–5:40 p.m.
in
Room 26A/B/C
Are you an intermediate Python developer looking to level up? Luckily, Python provides us with a unique set of tools to make our code more elegant and readable: language features that make code more intuitive and cut down on repetition. In this talk, I’ll share practical Pythonic solutions for supercharging your code.
Specifically, I'll cover:
- What magic methods are, and show you how to use them in your own code.
- When and how to use partial methods.
- An explanation of context managers and decorators, as well as multiple techniques for implementing them.
- How to effectively use `NamedTuples`, and even subclass and extend them!
Lastly, I'll go over some example code that ties many of these techniques together in a cohesive way. You'll leave this talk feeling confident about using these tools and techniques in your next Python project!
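To give a flavor of the tools listed above, here is a small sketch that combines a subclassed `NamedTuple` with a magic method, a context manager built with the `contextmanager` decorator, and `functools.partial`. All names (`Point`, `recorded`, `scale`) are hypothetical and chosen only for illustration; this is not the speaker's example code.

```python
from contextlib import contextmanager
from functools import partial
from typing import NamedTuple


class BasePoint(NamedTuple):
    x: float
    y: float


class Point(BasePoint):
    """Extend the named tuple with a magic method for vector addition."""
    __slots__ = ()  # keep instances as lightweight as the base tuple

    def __add__(self, other):
        return Point(self.x + other.x, self.y + other.y)


@contextmanager
def recorded(events, label):
    """Context manager (written with a decorator) that records entry and exit."""
    events.append(f"enter {label}")
    try:
        yield
    finally:
        events.append(f"exit {label}")


def scale(factor, value):
    return factor * value


double = partial(scale, 2)  # pre-fill the first argument

events = []
with recorded(events, "math"):
    total = Point(1, 2) + Point(3, 4)
    doubled = double(total.x)
```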
Stephanie Kim
Friday
4:15 p.m.–5 p.m.
in
Grand Ballroom A
Anyone who is interested in deep learning has gotten their hands dirty playing around with TensorFlow, Google's open source deep learning framework. TensorFlow has its benefits like wide scale adoption, deployment on mobile, and support for distributed computing, but it also has a somewhat challenging learning curve, is difficult to debug, and is hard to deploy in production. PyTorch is a new deep learning framework that solves a lot of those problems.
PyTorch is only in beta, but users are rapidly adopting this modular deep learning framework. PyTorch supports tensor computation and dynamic computation graphs that allow you to change how the network behaves on the fly, unlike the static graphs used in frameworks such as TensorFlow. PyTorch offers modularity, which enhances the ability to debug or see within the network, and for many it is more intuitive to learn than TensorFlow.
This talk will objectively look at PyTorch and why it might be the best fit for your deep learning use case. We'll also look at use cases that showcase why you might want to consider using TensorFlow instead.
Colin Carroll, Karin C. Knudson
Saturday
1:55 p.m.–2:25 p.m.
in
Grand Ballroom B
At the end of 2017, there were seven states with ongoing redistricting litigation. We will discuss a statistical model that the United States Supreme Court declared to be appropriate in cases of racial gerrymandering, and show how it can be implemented and used with the library `PyMC3`. We will also discuss what the model tells us about racial gerrymandering in North Carolina.
Jason Fried
Saturday
4:30 p.m.–5 p.m.
in
Grand Ballroom B
Today, services built on Python 3.6.3 are widely used at Facebook. But as recently as May of 2014 it was actually impossible to use Python 3 at Facebook. Come learn how we cut the Gordian Knot of dependencies and social aversion to the point where new services are now being written in Python 3 while older Python 2 projects are actively migrated to Python 3. All accomplished by a small group of individual contributors in their spare time. Learn to fight the good fight and upgrade your organization to Python 3 like we did at Facebook.
Anna Ossowski
Friday
1:55 p.m.–2:25 p.m.
in
Grand Ballroom B
You maintain an Open Source project with great code? Yet your project isn’t succeeding in the ways you want? Maybe you’re struggling with funding or documentation? Or you just can’t find new contributors and you’re drowning in issues and pull requests?
Open Source is made up of many components, and we are often better trained in methods for writing good code than in methods for succeeding in the other dimensions in which we want our project to grow.
In this talk we’ll explore the different components of an Open Source project and how they work together. After this talk you’ll be well-equipped with ideas and strategies for growing, cultivating, and nourishing your Open Source project.
For your project to succeed, all of its non-code components must be well-maintained. What are these different components and what methods can we learn to maintain them?
* Build real relationships with your sponsors and determine how both sides can benefit from the relationship; don’t just ask people for money.
* Establish a good communication system with your contributors: Keep them informed, listen to their feedback and input, make them feel heard.
* Thank the people who worked on ticket triage or marketing, not just those who wrote code, in your release notes.
* Make it easy for new contributors to get started: Write and maintain good documentation, answer questions in a friendly and timely manner.
* Market and evangelize in the right places and at the right time: Give conference talks, organize sprints, keep your project’s Twitter account active, always curate new and interesting content on your blog or website.
* Implement a Code of Conduct and enforce it if needed: Make your project a safe space to contribute for everyone.
With these methods and a half-dozen others, you’ll handle beautifully all the components your project needs to succeed.
Barry Warsaw
Saturday
5:10 p.m.–5:40 p.m.
in
Grand Ballroom C
Resources are files that live within Python packages. Think test data files, certificates, templates, translation catalogs, and other static files you want to access from Python code. Sometimes you put these static files in a package directory within your source tree, and then locate them by importing the package and using its `__file__` attribute. But this doesn't work for zip files!
You could use `pkg_resources`, an API that comes with `setuptools` and hides the differences between files on the file system and files in a zip file. This is great because you don't have to use `__file__`, but it's not so great because `pkg_resources` is a big library and can have potentially severe performance problems, even at import time.
Welcome to `importlib.resources`, a new module and API in Python 3.7 that is also available as a standalone library for older versions of Python. `importlib.resources` is built on top of Python's existing import system, so it is very efficient. It also defines an abstract base class which loaders can implement to provide their own resource access. Python's built-in zipimporter uses this to provide efficient access to resources within a zip file. Third party import hooks can do the same, so resources can come from anything that is importable by Python.
This talk will step through the motivations behind `importlib.resources`, the library's usage, its interfaces, and the hooks made available to third party packages. It will also talk about the minor differences between the standalone version and the version in Python 3.7's standard library. Hopefully audience members will come away with compelling reasons to port their code to this much more efficient library.
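A minimal sketch of reading a resource through the import system rather than through `__file__` might look like the following. It builds a throwaway package (the hypothetical `demo_pkg_resources`) in a temporary directory so it runs on its own. Note one assumption: it uses `importlib.resources.files()`, which was added in Python 3.9, later than the Python 3.7 API the talk describes.

```python
import importlib
import sys
import tempfile
from importlib import resources
from pathlib import Path


def read_greeting():
    """Create a tiny package on disk, then read one of its data files."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = Path(root) / "demo_pkg_resources"  # hypothetical package
        package_dir.mkdir()
        (package_dir / "__init__.py").write_text("", encoding="utf-8")
        (package_dir / "greeting.txt").write_text(
            "hello from a package\n", encoding="utf-8"
        )

        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            # Locate the resource via the package, not via a filesystem path.
            resource = resources.files("demo_pkg_resources").joinpath("greeting.txt")
            return resource.read_text(encoding="utf-8")
        finally:
            sys.path.remove(root)
            sys.modules.pop("demo_pkg_resources", None)


greeting = read_greeting()
```

The calling code never touches `__file__`, so the same lookup can keep working when the package is served by a different loader.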
Nicolle Cysneiros
Saturday
1:40 p.m.–2:25 p.m.
in
Grand Ballroom A
Have you ever considered how many relationships you have in your virtual life? Every friend or page liked on Facebook, each connection on LinkedIn or Twitter account followed is a new relationship not only between two people, but also between their data. In Brazil alone, there are 160 million Facebook users. How can we represent and manipulate all these relationships? Graph databases are storage systems that use graph structure (nodes and edges) to represent and store data in a semantic way.
This talk will begin by examining the challenge of representing relationships in relational databases and then introduce a friendlier solution using graphs. The definition of a graph database, its pros and cons, and some available tools (Neo4J, OrientDB and TitanDB) will be shown during the presentation, as well as how these tools can be integrated with Python.
Amjith Ramanujam
Saturday
11:30 a.m.–noon
in
Global Center Ballroom AB
During peak hours, Netflix video streams make up more than one third of internet traffic. Netflix must stream uninterrupted in the face of widespread network issues, bad code deploys, AWS service outages, and much more. Failovers make this possible.
Failover is the process of transferring all of our traffic from one region in AWS to another. While most of Netflix runs on Java, failovers are powered entirely by Python. Python's versatility and rich ecosystem mean we can use it for everything from predicting our traffic patterns to orchestrating traffic movement, while dealing with the eventual consistency of AWS.
Today, we can shift all of our 100 million+ users in under seven minutes. A lot of engineering work went into making this possible. The issues we faced and solutions we created have broad application to availability strategies in the cloud or the datacenter.
Jack Diederich
Saturday
12:10 p.m.–12:55 p.m.
in
Room 26A/B/C
A function is a small chunk of code that does useful work. Your job when writing a function is to do it in a way that is easy to read. Based on over 15 years of code reviews, here are some tips and guidelines I give again and again.
Hynek Schlawack
Saturday
4:15 p.m.–5 p.m.
in
Grand Ballroom A
The DevOps movement gave us many ways to put Python applications into production. But should your *application* care? Should it need to know whether it’s running on your notebook, on a server, in a Docker container, or in some cloud platform as a service?
It should not, because environment-agnostic applications are easier to **test**, easier to **deploy**, easier to **handle**, and easier to **scale**.
But how can you *practically* structure and configure your applications to make them indifferent to the environment they run in? How do secrets fit into the picture? And where do you put that log file?
By the end of this talk you’ll know the tools and techniques that enable you to write such Python applications and you’ll be ready for the next big change.
Christopher Neugebauer, Josh Simmons, Sam Kitajima-Kimbrel
Friday
10:50 a.m.–11:20 a.m.
in
Room 26A/B/C
New conferences rarely have resources to run the sort of outreach and inclusion programs that big conferences have. It’s hard to guess how much money you’ll have to spend, how many attendees you’ll have, and what your new community will look like. With so many things to worry about, it’s no surprise that most events don’t prioritise outreach until they’ve got a few years under their belt, if at all.
It doesn’t have to be this way, and it can even be easier to build a new event around outreach and inclusion than it is to build it in later on!
This talk shares the story of North Bay Python’s inaugural conference, which we planned in under 6 months and ran on a $40,000 budget, building a welcoming community to make it real. We made inclusivity a founding principle without compromising our speaker lineup, while still attracting great sponsorship and hosting an event that almost every attendee wants to return to.
In this talk, we’re going to share with you how we built a conference, from the ground up, to be as inclusive as we could make it. We’ll touch on early organisation, marketing, and on-the-ground logistics. Throughout the talk, you’ll learn:
* How we designed a budget that let us prioritise outreach and inclusion activities
* How we built the community that we wanted before the conference even started
* How we ran an event that proved that we meant everything we said
You too can host a new conference with a great lineup on a shoestring budget and short timeline, and you can do it while being inclusive, welcoming, and putting attendee safety first. Find out how you can have your cake, eat it, and still have lots to share with your new community.
Amber Brown ("HawkOwl")
Sunday
2:30 p.m.–3 p.m.
in
Global Center Ballroom AB
Most software has a user. Depending on the software, the user may need to provide various details about themselves for proper operation -- their name, their date of birth, where they live. However, it is quite common for software systems such as these to ask the wrong questions, collect too much data, and when it comes down to it, serialise the parts of the user's identity wrongly. This talk will discuss common ways that real-world systems store identity wrong, what questions you shouldn't ask, and how you can fix it in your own projects.
David Gouldin
Friday
12:10 p.m.–12:55 p.m.
in
Grand Ballroom C
Timezones are one of those things every programmer loves to hate. Most of us, at
least in the US, just try to ignore them and hope nobody notices. Then twice a
year, we face with a sense of impending doom those three small words: Daylight Saving Time.
It doesn't have to be this way. Armed with some best practices and a little help
from supporting libraries, timezone-related bugs can be a thing of the past.
This talk explores standard library and 3rd party library timezone support, as
well as persistence and serialization techniques for timezone-aware datetimes.
By the end of the talk, the listener should feel confident in their ability to
correctly store, send, receive, and manipulate datetime objects in any timezone.
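One common practice for persisting timezone-aware datetimes (an assumption here, since the abstract does not spell out its recommendations) is to convert to UTC before storing and to serialize with an explicit offset. The sketch below uses only the standard library and a fixed UTC-4 offset as a simplified stand-in for US Eastern daylight time; handling real daylight saving transitions requires a timezone database library.

```python
from datetime import datetime, timedelta, timezone

# Simplification: a fixed offset, not a full timezone with DST rules.
eastern_summer = timezone(timedelta(hours=-4), "EDT")
local_event = datetime(2018, 5, 12, 12, 10, tzinfo=eastern_summer)

# Store: normalize to UTC and serialize as ISO 8601 with the offset included.
stored = local_event.astimezone(timezone.utc).isoformat()

# Load: parsing the string gives back an aware datetime for the same instant.
restored = datetime.fromisoformat(stored)
same_instant = restored == local_event
```

Because both values are timezone-aware, the comparison is made on the instant in time, not on the wall-clock fields.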
Dustin Ingram
Saturday
11:30 a.m.–noon
in
Room 26A/B/C
Questions and confusion about the Python packaging ecosystem abound. What is this `setup.py` file? What's the difference between wheels and eggs? Do I use setuptools or distutils? Why should I use twine? Do I put my project's dependencies in a `requirements.txt` or in `setup.py`? How do I just get my module up on PyPI? Wait, what is Warehouse?
This talk will identify the key tools one might encounter when trying to distribute Python software, what they are used for, why they exist, and their history (including where their weird names come from). In addition, we'll see how they all work together, what it takes to make them work, and what the future has in store for Python packaging.
Alex Gaynor
Sunday
1:50 p.m.–2:20 p.m.
in
Grand Ballroom C
Unless you work on pacemakers or at NASA, you've probably accepted the fact that you will make mistakes in your code, and those mistakes will creep into production. This talk will introduce you to post-mortems, and how to use them as a vehicle for improving your code and your process.
Padmaja Bhagwat
Friday
4:30 p.m.–5 p.m.
in
Grand Ballroom B
Imagine you have an appointment in a large building you do not know. Your host sent instructions describing how to reach their office. Though the instructions were fairly clear, in a few places, such as at the end, you had to infer what to do. How does a _robot (agent)_ interpret an instruction in the environment to infer the correct course of action? Enabling harmonious _Human - Robot Interaction_ is of primary importance if robots are to work seamlessly alongside people.
Dealing with natural language instructions is hard for two main reasons. First, humans know how to interpret natural language through prior experience, but agents do not. Second, natural language instructions are inherently ambiguous. This talk is about how deep learning models were used to solve the complex and ambiguous problem of converting a natural language instruction into its corresponding action sequence.
Following verbal route instructions requires knowledge of language, space, action and perception. In this talk I shall present a neural sequence-to-sequence model for direction following, a task that is essential to realizing effective autonomous agents.
At a high level, a sequence-to-sequence model is an end-to-end model made up of two recurrent neural networks:
- **Encoder** - which takes the model’s input sequence as input and encodes it into a fixed-size context vector.
- **Decoder** - which uses the context vector from above as a seed from which to generate an output sequence.
For this reason, sequence-to-sequence models are often referred to as _encoder-decoder_ models. The alignment-based encoder-decoder model translates the natural language instructions into corresponding action sequences. This model does not assume any prior linguistic knowledge: syntactic, semantic or lexical. The model learns the meaning of every word, including object names, verbs, and spatial relations, as well as the syntax and the compositional semantics of the language, on its own.
In this talk, steps involved in pre-processing of data, training the model, testing the model and final simulation of the model in the virtual environment will be discussed. This talk will also cover some of the challenges and trade-offs made while designing the model.
Allison Kaptur
Friday
5:10 p.m.–5:40 p.m.
in
Grand Ballroom C
Wrestling bugs can be one of the most frustrating parts of programming - but with the right framing, bugs can also be our best allies. I'll tell the tales of two of my favorite bugs, including the time I triggered a DDOS of a logging cluster, and explain why I love them. I'll also give you concrete strategies for approaching tricky bugs and making them easier and more fun.
Esther Nam
Saturday
12:10 p.m.–12:40 p.m.
in
Global Center Ballroom AB
Those of us who have worked in software development for longer than a few years probably feel we have an intuitive sense of what a great developer is. Some traits come more easily to mind than others when it comes to identifying a great developer. In this talk we will take a slightly different approach to evaluating software development best practices, and identify one underrated skill common to great software developers: empathy. I hope to demonstrate that cognitive and emotional empathy skills are critical to good software development. We will explore ways to cultivate this trait in order to become better developers, both for our own sakes and for the sake of the teams in which we work.
Miguel Grinberg
Saturday
11:30 a.m.–noon
in
Grand Ballroom C
What do AWS, GitHub, Travis CI, DockerHub, Google, Stripe, New Relic, and the rest of the myriad of services that make our developer life easier have in common?
They all give you secret keys to authenticate with. Did you ever commit one of these to source control by mistake? That happened to me more times than I'm willing to admit!
In this talk I'm going to go over the best practices to follow when writing Python applications that prevent this type of accident.
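One widely used way to keep keys out of source control is to read them from environment variables at runtime. The abstract does not say which practices the talk covers, so treat the following as an illustrative sketch; `get_secret`, `MissingSecretError`, and `DEMO_API_KEY` are hypothetical names.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not configured."""


def get_secret(name, environ=None):
    """Fetch a secret from the environment instead of hard-coding it."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        # Fail loudly at startup rather than running with a missing key.
        raise MissingSecretError(f"Set the {name} environment variable")
    return value


# A plain dict stands in for os.environ so the example is self-contained.
fake_environ = {"DEMO_API_KEY": "not-a-real-key"}
api_key = get_secret("DEMO_API_KEY", fake_environ)
```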
Jake VanderPlas
Friday
11:30 a.m.–noon
in
Grand Ballroom C
Python provides a powerful platform for working with data, but often the most straightforward data analysis can be painfully slow. When used effectively, though, Python can be as fast as even compiled languages like C. This talk presents an overview of how to effectively approach optimization of numerical code in Python, touching on tools like numpy, pandas, scipy, cython, numba, and more.
Kenneth Reitz
Saturday
2:35 p.m.–3:05 p.m.
in
Global Center Ballroom AB
This talk is about the history of Python packaging, the tools that have been historically available for application deployment, the problems/constraints presented by them, and presents a holistic solution to many of these problems: Pipenv.
A live demo of the tool will be presented, as well as a Q&A session.
Carol Willing
Saturday
11:30 a.m.–noon
in
Grand Ballroom A
Each member of your project team uses something different to document
their work -- reStructuredText, Markdown, and Jupyter Notebooks. How do
you combine all of these into useful documentation for your project's users?
Sphinx and friends to the rescue!
Learn how to integrate documentation into your everyday development
workflow, apply best practices, and use modern development tools and
services, like Travis CI and ReadTheDocs, to create engaging and up-to-date
documentation which users and contributors will love.
Janet Matsen
Saturday
12:10 p.m.–12:40 p.m.
in
Grand Ballroom A
The genome of a typical microbe contains roughly 5 million base pairs of DNA including > 4000 genes, which provide the instructions for cellular replication, energy metabolism, and other biological processes. At Zymergen, we edit DNA to design microbes with improved ability to produce valuable materials and molecules. Microbes with these edits are built and tested in high throughput by our fleet of robots. Genomes are far too large for exhaustive search, so identifying which edits to make requires machine learning on non-standard features. Our task is to extract information from trees, networks, and graphs of independently representable knowledge bases (metabolism, genomics, regulation), in ways that respect the strongly causal relationships between systems. In this talk, I will describe how we use Python’s biological packages (e.g. BioPython, CobraPy, Escher, goatools) and other packages (NetworkX, TensorFlow, PyStan, Airflow) to extract machine learning features and predict which genetic edits will produce high-performance microbes.
vigneshwer dhinakaran
Sunday
1:50 p.m.–2:20 p.m.
in
Global Center Ballroom AB
If you’ve spent much time writing (or debugging) Python performance problems, you’ve probably had a hard time managing memory with its limited language support.
In this talk, we venture deep into the belly of the Rust Language to uncover the secret incantations for building high performance and memory safe Python extensions using Rust.
Rust has a lot to offer in terms of safety and performance for high-level programming languages such as Python, Ruby, JavaScript, and more, thanks to its easy Foreign Function Interface capabilities, which enable developers to easily develop bindings for foreign code.
Trey Hunner
Saturday
4:30 p.m.–5 p.m.
in
Grand Ballroom C
The end of life for Python 2 is 2020. Python 3 is the future and you'll need to consider both your upgrade plan and what steps you'll take after upgrading to start leveraging Python 3 features.
During this talk we'll briefly discuss how to start **the process of upgrading your code to Python 3**. We'll then dive into some of **the most useful Python 3 features** that you'll be able to start embracing once you drop Python 2 support.
A number of the most powerful Python 3 features are syntactic features that are **Python 3 only**. You won't get any experience using these features until you fully upgrade. These features are an incentive to drop Python 2 support in existing 2 and 3 compatible code. You can consider this talk as a teaser of Python 3 features that you may have never used.
After this talk I hope you'll be inspired to fully upgrade your code to Python 3.
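The abstract does not list the features it covers, so the following is only an illustrative pick of syntax that works on Python 3 alone: f-strings (3.6+), keyword-only arguments, and extended iterable unpacking. The function name `describe` is hypothetical.

```python
def describe(name, *, shout=False):
    """`shout` is keyword-only: everything after the bare * must be named."""
    message = f"Hello, {name}!"  # f-string interpolation
    return message.upper() if shout else message


# Extended unpacking: the starred name collects whatever is left over.
first, *middle, last = [1, 2, 3, 4, 5]

greeting = describe("PyCon", shout=True)
```

Calling `describe("PyCon", True)` raises a `TypeError`, which is the point of keyword-only arguments: flags must be spelled out at the call site.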
Victor Stinner
Saturday
3:15 p.m.–4 p.m.
in
Grand Ballroom B
A look back at Python's evolution over the last 10 years.
Python 3.0 was released ten years ago (December 2008). It's time to look back: analyze the migration from Python 2 to Python 3, see the progress we made on the language, list bugs that cannot be fixed in Python 2 because of backward compatibility, and discuss whether it's time to bury Python 2.
Python has become the de facto language in the scientific world and a favorite first language for learning to program.
Justin Myles Holmes
Saturday
2:35 p.m.–3:05 p.m.
in
Grand Ballroom A
For 2 years, a family of three has traveled on a converted school bus from conference to conference, building tooling for the road in Python and visiting Python families in every corner of the country.
Erin Braswell
Saturday
10:50 a.m.–11:20 a.m.
in
Global Center Ballroom AB
What do Geiger counters, black holes, heart monitors, and volcanoes have in common? They all can use sound to convey information! This talk will explore using Python for sonification: the process of translating data that could otherwise be represented visually into sound. Have you ever wondered how to use Python to represent data other than making charts and graphs? Are you a musician looking for inspiration in the world around you? This talk will go over how to use Python to translate time series data to MIDI that can be played back in real time. We’ll sonically interpret light-curve data from the Kepler space telescope using pygame, MIDIUtil, and astropy, turning points on a graph into a musical masterpiece! Come learn about how data sonification is used to help people, to expand the reach of scientific research, and to create music from data.
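The core of turning a time series into notes is a mapping from data values to pitches. As a rough sketch, assuming a simple linear mapping (the talk may use a different one), the function below rescales values onto a range of MIDI note numbers, where 60 is middle C. It deliberately avoids MIDIUtil and pygame so that it runs with the standard library alone; `to_midi_pitches` and the sample data are hypothetical.

```python
def to_midi_pitches(values, low_note=48, high_note=84):
    """Linearly rescale data values onto a range of MIDI note numbers."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Flat data: play one note in the middle of the range.
        return [(low_note + high_note) // 2] * len(values)
    span = high_note - low_note
    return [
        low_note + round((value - lowest) / (highest - lowest) * span)
        for value in values
    ]


# Made-up brightness readings standing in for light-curve data.
brightness = [0.0, 0.25, 0.5, 1.0, 0.5]
pitches = to_midi_pitches(brightness)
```

Each resulting pitch could then be written out as a note with a MIDI library, one note per data point.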
Ravi Chityala
Friday
3:15 p.m.–4 p.m.
in
Room 26A/B/C
Quantum computers are slowly turning into reality more than 30 years after they were first theorized. The need for quantum computers has become clear as we reach the limits of Moore’s law and yet need more computational power. We are at a very early stage of quantum computing, yet Python is slowly becoming a de facto language for programming quantum computers.
In this talk, we will discuss the difference between a traditional computer and a quantum computer. We will learn about the two architectures, namely quantum annealing and quantum gate. Finally, we will learn to program quantum computers using Python.
Rae Knowler
Saturday
5:10 p.m.–5:40 p.m.
in
Global Center Ballroom AB
Python 3 removes a lot of the confusion around Unicode handling in Python, but that by no means fixes everything. Different locales and writing systems have unique behaviours that can trip you up. Here are some of the worst ones and how to handle them correctly.
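Two well-known examples of such behaviours, chosen here for illustration and not necessarily the ones the talk covers, are combining characters and case-insensitive comparison. A minimal sketch using only the standard library:

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# The two strings render the same but compare unequal until normalized.
equal_raw = composed == decomposed
equal_nfc = unicodedata.normalize("NFC", decomposed) == composed

german = "Straße"
# lower() keeps the sharp s, so it is not enough for caseless matching.
lower_matches = german.lower() == "strasse"
# casefold() is designed for caseless comparison and expands "ß" to "ss".
casefold_matches = german.casefold() == "strasse"
```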
Matt Davis
Saturday
4:30 p.m.–5 p.m.
in
Room 26A/B/C
Occasionally we’ll find that some bit of Python we’ve written doesn’t run as fast as we’d like. What can we do? Performance bottlenecks aren’t always intuitive or easy to spot by reading code, so we need to collect data with [profiling](https://docs.python.org/3.6/library/profile.html). Once we’ve identified the bottleneck we’ll need to change our approach, but which options are faster than others?
This talk illustrates a Python performance investigation and improvements using an [Advent of Code](http://www.adventofcode.com/) programming challenge. I’ll walk through starting from a slow (but correct) solution, look at profiling data to investigate _why_ it’s slow, and explore multiple paths for improving performance, including more efficient algorithms and using third-party tools like [Cython](http://cython.org/). You’ll leave this talk with a recipe for analyzing Python performance and information about some options for improved performance.
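As a minimal sketch of the data-collection step, the standard library's `cProfile` and `pstats` modules can profile a function and report where time goes. The deliberately naive `slow_sum_of_squares` below is a hypothetical stand-in, not the Advent of Code solution from the talk.

```python
import cProfile
import io
import pstats


def square(x):
    return x * x


def slow_sum_of_squares(n):
    """Slow on purpose: one Python-level function call per element."""
    total = 0
    for i in range(n):
        total += square(i)
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum_of_squares(10_000)
profiler.disable()

# Sort by cumulative time so the most expensive call paths come first.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
report_text = report.getvalue()
```

The report lists call counts and timings per function, which is the evidence needed before deciding whether a better algorithm or a tool such as Cython is the right fix.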
Amanda Sopkin
Friday
2:35 p.m.–3:05 p.m.
in
Room 26A/B/C
There are many computational needs for randomness--from creating a game to building a simulation involving naturally occurring randomness similar to the physical world. For most purposes, using the Python `random` module to create random numbers within a specific range can be done with no further questions, but sometimes we require a more nuanced implementation.
We will look at both pseudo-random number generators, which use deterministic, repeatable processes to generate seemingly random series, and true random number generators, which draw on physical processes like atmospheric noise to generate sequences of numbers. We will discuss the benefits and drawbacks of both approaches and common methods of implementing these two types of generators in Python.
Finally, we will look at several real applications for randomness and discuss the best method for generating “randomness” in each scenario.
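The repeatability of a pseudo-random generator is easy to see in a short sketch: seeding two generators with the same value yields the same series. For contrast, the example also draws a token from the `secrets` module, which relies on the operating system's entropy source; that is a cryptographically secure generator, not the atmospheric-noise kind of true randomness mentioned above. The function name `dice_rolls` and the seed are hypothetical.

```python
import random
import secrets


def dice_rolls(seed, count=5):
    """Roll a die `count` times using a generator with a fixed seed."""
    rng = random.Random(seed)  # independent generator; global state untouched
    return [rng.randint(1, 6) for _ in range(count)]


# Same seed, same series: handy for reproducible simulations and tests.
first_run = dice_rolls(seed=2018)
second_run = dice_rolls(seed=2018)

# Unpredictable value suitable for tokens; cannot be reproduced from a seed.
token = secrets.token_hex(16)
```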
Valery Calderon
Friday
4:15 p.m.–5 p.m.
in
Global Center Ballroom AB
Web applications contain lots of database operations, network calls, nested callbacks, and other computationally expensive tasks that might take a long time to complete or even block other threads until they are done. This is where ReactiveX comes in: it not only lets us convert almost anything (variables, properties, user inputs, caches, and so on) into a stream that can be managed asynchronously, but also gives us an easy way to handle errors, which is a hard task in asynchronous programming. ReactiveX makes our code more flexible, readable, maintainable, and easy to write.
We will explore how ReactiveX helps us make things easier with its toolbox of operators, which can be used to filter, create, transform, or unify any of those streams. We will learn that, in just a few lines of maintainable code, we can have multiple web sockets that receive multiple requests, all handled by an asynchronous process that serves a filtered output.
To do that, I will walk through an example that implements observables, observers/subscribers, and subjects. We will start by requesting our data stream from the GitHub API with a Tornado web socket and then filter and process it asynchronously.
David Beazley
Saturday
12:10 p.m.–12:55 p.m.
in
Grand Ballroom C
Writing lexers and parsers is a complex problem that often involves the use of special tools and domain specific languages (e.g., the lex/yacc tools on Unix). In 2001, I wrote Python versions of these tools which can be found in the PLY project. PLY predates a huge number of modern Python features including the iteration protocol, generators, decorators, metaclasses, and more. As such, it relied on a variety of clever hacks to layer a domain specific parser specification language on top of Python itself.
In this talk, I discuss a modernization of the PLY project that abandons its past and freely abuses modern Python features including advanced metaclasses, guaranteed dictionary ordering, class decorators, type hints, and more. The result of this work can be found in the SLY project. However, this talk isn't so much about SLY as it is focused on how far you can push Python metaprogramming features to create domain-specific languages. Prepare to be horrified--and to write code that will break your IDE.
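To illustrate the general idea of layering a small specification language on top of class syntax, here is a toy sketch in which a metaclass collects uppercase string attributes, in definition order, as token rules. This is not SLY's implementation; `TokenNamespace`, `LexerMeta`, `Lexer`, and `CalcLexer` are hypothetical names written only for this example.

```python
import re


class TokenNamespace(dict):
    """Class-body namespace that records UPPERCASE string attributes as rules."""

    def __init__(self):
        super().__init__()
        self.rules = []

    def __setitem__(self, key, value):
        if key.isupper() and isinstance(value, str):
            self.rules.append((key, value))  # remember definition order
        super().__setitem__(key, value)


class LexerMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Hand the class body a custom namespace instead of a plain dict.
        return TokenNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        pattern = "|".join(f"(?P<{tok}>{regex})" for tok, regex in namespace.rules)
        cls._master = re.compile(pattern) if pattern else None
        return cls


class Lexer(metaclass=LexerMeta):
    ignore = " \t"

    def tokenize(self, text):
        pos = 0
        while pos < len(text):
            if text[pos] in self.ignore:
                pos += 1
                continue
            match = self._master.match(text, pos)
            if match is None:
                raise SyntaxError(f"Illegal character {text[pos]!r} at index {pos}")
            yield match.lastgroup, match.group()
            pos = match.end()


class CalcLexer(Lexer):
    # The class body itself is the token specification.
    NUMBER = r"\d+"
    PLUS = r"\+"
    TIMES = r"\*"


tokens = list(CalcLexer().tokenize("3 + 4 * 10"))
```

The subclass reads like a declarative grammar, while the metaclass does the work of compiling it into a single regular expression.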
Graham Dumpleton
Saturday
5:10 p.m.–5:40 p.m.
in
Room 26A/B/C
The WSGI (Web Server Gateway Interface) specification for hosting Python web applications was created in 2003. Measured in Internet time, it is ancient. The oldest mainstream implementation of the WSGI specification is mod_wsgi, for the Apache HTTPD server, and it is over 10 years old.
WSGI is starting to be regarded as not up to the job, with technologies such as HTTP/2, web sockets and async dispatching being the way forward. The reality is that WSGI will be around for quite some time yet and for the majority of use cases is more than adequate.
The real problem is not that we need to move to these new technologies, but that we aren't using the current WSGI servers to their best advantage. Moving to a new set of technologies will not necessarily make things better and will only create a new set of problems you have to solve.
As one of the oldest WSGI server implementations, Apache and mod\_wsgi may be regarded as boring and not cool, but it is still the most stable option for hosting WSGI applications available. It also hasn't been sitting still, with a considerable amount of development work being done on mod\_wsgi in the last few years to make it even more robust and easier to use in a development environment as well as production, including in containerised environments.
In this talk you will learn about many features of mod\_wsgi which you probably didn't even know existed, features which can help towards ensuring your Python web application deployment performs to its best, is secure, and has a low maintenance burden.
Topics which will be covered include:
* Easy deployment of Python web applications using mod\_wsgi-express.
* Integration of mod_wsgi-express with a Django web application.
* Using mod\_wsgi-express in a development environment.
* How to make use of mod\_wsgi-express in a production environment.
* Using mod_wsgi-express in a containerised runtime environment.
* Ensuring consistency between development and production environments using warpdrive.
* Using mod\_wsgi-express to bootstrap a system Apache installation for hosting WSGI applications.
* Why you should be using daemon mode of mod\_wsgi and not embedded mode.
* How to properly associate mod\_wsgi with a Python virtual environment.
* Building a robust deployment that can recover from misbehaving application code, backend services, or request overloading.
* Using hooks provided by mod\_wsgi to monitor the performance of your Python web application.
If you are a beginner, come learn why mod\_wsgi is still a good option for deploying your Python web applications. If you are a long-time user of mod\_wsgi, find out about all the features you probably didn't know existed, then revisit your current Python web application deployment and make it even better.
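For readers who have not seen WSGI before, the interface that servers such as mod\_wsgi host is small: a callable that receives the request environment and a `start_response` function. The sketch below is a minimal example using only the standard library; the greeting text is illustrative, and `application` is the conventional name for the entry point.

```python
def application(environ, start_response):
    """Minimal WSGI application: returns a plain-text greeting."""
    # PATH_INFO holds the request path supplied by the server.
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # A WSGI application returns an iterable of byte strings.
    return [body]
```

Saved as `wsgi.py` (a hypothetical file name), a file like this should be servable with `mod_wsgi-express start-server wsgi.py`, assuming mod\_wsgi is installed.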
Liz Sander
Friday
1:40 p.m.–2:25 p.m.
in
Grand Ballroom A
When you think of an API, you’re probably thinking about a web service. But it’s important to think about your developer interface when designing a software library as well! I’ll talk about the scikit-learn package, and how its API makes it easy to construct complex models from simple building blocks, using three basic pieces: transformers, estimators, and meta-estimators. Then I’ll show how this interface enabled us to construct our own meta-estimator for model stacking. This will demonstrate how to implement new modeling techniques in a scikit-learn style, and more generally, the value of writing libraries with the developer interface in mind.
Larry Hastings
Saturday
10:50 a.m.–11:20 a.m.
in
Grand Ballroom C
Stop writing crappy shell scripts—write crappy Python scripts instead!
Other talks will show you how to write clean, performant, robust Python. But that's not always necessary. When writing personal automation or solving one-shot problems, it can be safe (and fun!) to quickly hack something together.
This talk will show examples of problems suitable for this approach, scenarios where it's reasonable to cut corners, novel techniques that can help break a problem down, and shortcuts that can speed development.
Julie Qiu
Friday
11:30 a.m.–noon
in
Room 26A/B/C
At some point, we all find ourselves at a SQL prompt making edits to the production database. We know it's a bad practice and we always intend to put in place safer infrastructure before we need to do it again — what does a better system actually look like?
This talk progresses through 5 strategies for teams using a Python stack to do SQL writes against a database, to achieve increasing safety and auditability:
(1) Develop a process for raw SQL edits
(2) Run scripts locally
(3) Deploy and run scripts on an existing server
(4) Use a task runner
(5) Build a Script Runner service
We’ll talk about the pros and cons of each strategy and help you determine which one is right for your specific needs.
By the end of this talk you’ll be ready to start upgrading your infrastructure for making changes to your production database safely!
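One idea that could sit behind the script-based strategies is a guard around each write: run the statement in a transaction, compare the affected row count with what was expected, and commit only if they match. The sketch below assumes this approach and uses an in-memory SQLite database as a stand-in for production; `guarded_update`, the `users` table, and the `dry_run` flag are hypothetical names, not something the talk prescribes.

```python
import sqlite3


def guarded_update(conn, sql, params, expected_rows, dry_run=True):
    """Run one write statement; commit only if it touched the expected rows.

    Returns the number of rows the statement affected.
    """
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        affected = cur.rowcount
        if affected != expected_rows:
            raise RuntimeError(
                f"expected {expected_rows} rows, statement touched {affected}"
            )
        if dry_run:
            conn.rollback()  # show the effect without keeping it
        else:
            conn.commit()
        return affected
    except Exception:
        conn.rollback()  # never leave a half-applied write behind
        raise
    finally:
        cur.close()


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)
conn.commit()

guarded_update(
    conn,
    "UPDATE users SET email = ? WHERE id = ?",
    ("new@example.com", 1),
    expected_rows=1,
    dry_run=True,
)
```

Defaulting `dry_run` to `True` means a write only sticks when the caller asks for it explicitly, which is one way to make the safe path the easy path.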
Scott Triglia
Saturday
3:15 p.m.–3:45 p.m.
in
Global Center Ballroom AB
Taking on leadership roles always includes new demands on your attention and time. Inevitably, your finite work week will conflict with the sheer amount of tasks you have to do. How can we as leaders keep stepping up to new responsibilities while balancing our pre-existing ones?
This talk will focus on strategies for managing a too-large workload without abandoning important tasks or doing a shoddy job. We’ll look at techniques to prioritize what work matters most, identify tasks we should be doing ourselves, and finally delegate the rest to build our team’s skills while reducing our own workload.
Alvaro Leiva Geisse
Sunday
1:10 p.m.–1:40 p.m.
in
Grand Ballroom C
Done! Your shiny new application is functionally complete and ready to be deployed to production! But how exactly do you deploy properly on Linux? Wonder no more! In 30 minutes, this talk explains how you can harness the power of the init system and systemd to solve common deployment problems, including some that you didn't even know you had. Examples of things we will cover:
* How to secure your system by giving your process a private /tmp, read-only paths that it cannot write to, and inaccessible paths, and by protecting users' home directories, network access, bin directories, etc.
* How to limit the resources your app can consume.
* How to interact directly with systemd, so it can start transient units, start/stop services, mount disks, resolve addresses.
* How to isolate your service without containers.
* How to isolate your service using containers (using systemd to spawn a namespace).
All this will be covered from a Python developer's perspective.
Andrew Godwin
Friday
3:15 p.m.–4 p.m.
in
Grand Ballroom C
The Django Channels project has taken a major turn with version 2.0, embracing Python's async functionality and building applications around an async event loop rather than worker processes.
Doing this, however, wasn't easy. We'll look through some of the techniques used to make Django coexist in this async world, including handing off between async and sync code, writing fully asynchronous HTTP and WebSocket handling, and what this means for the future of Django, and maybe Python web frameworks in general.
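A generic way to hand off from async code to sync code, using only the standard library, is to run the blocking function in a worker thread and await its result. The sketch below shows that pattern; it is not the Channels implementation, and `blocking_query` and `handler` are hypothetical names.

```python
import asyncio
import time


def blocking_query(n):
    """Stand-in for synchronous code such as a database call."""
    time.sleep(0.05)
    return n * 2


async def handler(n):
    """Async handler that delegates the blocking call to a worker thread."""
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default thread pool executor.
    return await loop.run_in_executor(None, blocking_query, n)


async def main():
    # The calls overlap because each one blocks a worker thread,
    # not the event loop.
    return await asyncio.gather(handler(1), handler(2), handler(3))


results = asyncio.run(main())
```

`asyncio.gather` returns results in the order the awaitables were passed, regardless of which thread finishes first.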
Emily Morehouse-Valcarcel
Friday
3:15 p.m.–4 p.m.
in
Grand Ballroom B
Get under the hood and learn about Python's beloved Abstract Syntax Tree. Ever wonder how Python code is run? Overheard people arguing about whether Python is interpreted or compiled? In this talk, we will delve into the lifecycle of a piece of Python code in order to understand the role that Python's Abstract Syntax Tree plays in shaping the runtime of your code. Utilizing your newfound knowledge of Python's AST, you'll get a taste of how you probably already rely on ASTs and how they can be used to build awesome tools.
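The standard library's `ast` module makes this lifecycle easy to poke at. The sketch below, with an arbitrary one-line snippet chosen for illustration, parses source into a tree, inspects its nodes, and then compiles and runs the tree.

```python
import ast

source = "total = price * 2 + 1"
tree = ast.parse(source)

# Collect every variable name the snippet reads or writes.
names = sorted(
    {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
)

# List the binary operators that appear in the expression.
ops = sorted(
    type(node.op).__name__
    for node in ast.walk(tree)
    if isinstance(node, ast.BinOp)
)

# compile() accepts the tree directly and produces a code object to execute.
namespace = {"price": 10}
exec(compile(tree, filename="<ast>", mode="exec"), namespace)
total = namespace["total"]
```

Linters and code formatters typically work on a tree like this rather than on raw text, which is why they can reason about names and operators instead of characters.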
Steven Sklar
Sunday
1:50 p.m.–2:20 p.m.
in
Grand Ballroom A
As web apps grow increasingly complex, distributing asynchronous work across multiple background workers is often a basic requirement of a performant app. While there are a variety of tools that exist to solve this issue, one common feature among them is the need for a robust messaging platform.
[RabbitMQ][1] is a stable, full-featured, and mature solution that is usually found in the Python ecosystem backing [Celery][2] implementations. While Celery's utilization of RabbitMQ works just fine out of the gate, users with complex workflows, unique constraints, or tight budgets can take advantage of the flexibility of RabbitMQ to streamline their data pipelines and get the most out of their infrastructure.
This talk will provide an overview of RabbitMQ, review its varied message-routing capabilities, and demonstrate some of the ways in which these features can be utilized in Python applications to solve common yet difficult use-cases.
[1]: https://www.rabbitmq.com/
[2]: http://www.celeryproject.org/
VM (Vicky) Brasseur
Saturday
3:15 p.m.–4 p.m.
in
Room 26A/B/C
Projects fail in droves. Systems hiccup and hours of downtime follow. Screws fall out all the time; the world is an imperfect place.
We talk a lot about building resilient systems, but all systems are (at least for now) built by humans. Humans who have been making the same types of mistakes for thousands of years.
Just because failure happens doesn’t mean we can’t do our best to prevent it or—at the very least—to minimize the damage when it does. As a matter of fact, embracing failure can be one of the best things you do for your system. Failure is a vital part of evolution. By learning to love failure we learn how to take the next step forward. Ignoring or punishing failure leads to stagnation and wasted potential.
This talk distills 3000 pages of failure research into 40 minutes of knowledge about the human factors of failure, how it can be recognised, and how you can work around it to create more resilient systems.
By the end of this talk the audience will have an awareness of the most common psychological reasons for mistakes and failures and how to develop systems and processes to protect against them.
Opetunde Adepoju
Friday
1:55 p.m.–2:25 p.m.
in
Room 26A/B/C
All the data in the world is useless if you cannot understand it. Exploratory data analysis (EDA) and data visualization are the most crucial yet overlooked stages in the analytics process. This is because they give insight into the features of a data set that are most relevant for building an accurate model. It is often said that the more data, the better the model, but more data can sometimes be counter-productive. EDA helps avoid that.
EDA is useful for professionals while data visualization is useful for end-users.
For end-users:
A good sketch is better than a long speech. The value of a machine learning model is not known unless it is used to make data driven decisions. It is therefore necessary for data scientists to master the act of telling a story for their work to stay relevant. This is where data visualization is extremely useful.
We must remember that the end-users of the results are not professionals like us but people who know little or nothing about data analysis. To communicate our analysis effectively, we need data visualization that is detailed yet simple, because the work of a data scientist is not done until data-driven insights lead to decisions.
For professionals:
How do you ensure you are ready to use machine learning algorithms in a project? How do you choose the most suitable algorithms for your data set? How do you define the feature variables that can potentially be used for machine learning? Most data scientists ask these questions. EDA answers these questions explicitly.
Also, EDA helps in understanding the data. Understanding the data brings familiarity with the data, giving insights on the best models that fit the data set, the features in the dataset that will be useful for building an accurate machine learning model, making feature engineering an easy process.
In this talk, I will give a detailed explanation on what EDA and data visualization are and why they are very helpful in building accurate machine learning models for analytics as well as enhancing productivity and better understanding for clients. I will also discuss the risks of not mastering EDA and data visualization as a data scientist.
Sara Packman
Saturday
5:10 p.m.–5:40 p.m.
in
Grand Ballroom A
Congratulations on finishing your first tutorials or classes in Python! In the parlance of the hero’s journey myth, you’ve had your “threshold moment”: you’ve started down a path that could lead to a long and fulfilling career. But the road to this glorious future is frustratingly obscured by a lack of guidance in the present. You know enough to realize that you don’t have all the skills you need yet, but it’s hard to know how to learn those skills, or even articulate what they are. There are no easy solutions to this problem. There are, however, a few fundamental things to know and advice to keep in mind. Drawing from my own experience and with input from others, I’ve compiled some helpful hints about the skills, tools, and guiding questions that will get you to mastery.
Zekun Li
Sunday
1:10 p.m.–1:40 p.m.
in
Grand Ballroom A
Python's cyclic garbage collector wonderfully hides the complexity of memory management from the programmer. But we pay the price in performance. Ever wondered how that works? In this talk, you'll learn how garbage collection is designed in Python, what the tradeoffs are and how Instagram battled copy-on-write memory issues by disabling the garbage collector entirely.
You'll also learn why that isn't such a great idea after all and how we ended up extending the garbage collector API which allowed us to (mostly) re-enable garbage collection. We'll discuss our upstream contributions to the garbage collector that landed in Python 3.6 and 3.7.
This is an in-depth talk about memory management but no prior experience with CPython internals is necessary to follow it.
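To see why a cyclic collector exists at all, it helps to watch a reference cycle survive when collection is switched off. The sketch below uses only the standard `gc` and `weakref` modules; the `Node` class and helper names are hypothetical.

```python
import gc
import weakref


class Node:
    """Object that can take part in a reference cycle."""

    def __init__(self):
        self.partner = None


def make_cycle():
    a, b = Node(), Node()
    a.partner, b.partner = b, a  # a and b keep each other alive
    return weakref.ref(a)        # a weak reference lets us observe collection


gc.disable()  # stop automatic cyclic collection
probe = make_cycle()
# Reference counting alone cannot free the pair: each still has one referrer.
alive_without_gc = probe() is not None

gc.collect()  # a manual pass finds and frees the unreachable cycle
alive_after_collect = probe() is not None
gc.enable()
```

With collection disabled, cycles like this accumulate until something calls `gc.collect()`. Python 3.7 also provides `gc.freeze()`, which moves all currently tracked objects into a permanent generation that later collections ignore.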
John Reese
Friday
10:50 a.m.–11:20 a.m.
in
Global Center Ballroom AB
Have you ever written a small, elegant application that couldn't keep up with the growth of your data or user demand? Did your beautiful design end up buried in threads and locks? Did Python's very special Global Interpreter Lock make all of this an exercise in futility?
This talk is for you! With the combined powers of AsyncIO and multiprocessing, we'll redesign an old multithreaded application limited by the GIL into a modern solution that scales with the demand using only the standard library. No prior AsyncIO or multiprocessing experience required.
Nathaniel J. Smith
Friday
4:30 p.m.–5 p.m.
in
Grand Ballroom C
Concurrent programs are super useful: think of web apps juggling lots of simultaneous downloads and websocket connections, chat bots tracking multiple concurrent conversations, or web spiders fetching pages in parallel. But *writing* concurrent programs is complicated, intimidating to newcomers, and often challenging even for experts.
Does it have to be? Python is famous for being simple and straightforward; can Python make concurrent programming simple and straightforward too? I think so. By carefully analyzing usability pitfalls in other libraries, and taking advantage of new Python 3 features, I've come up with a new set of primitives that make it dramatically easier to write correct concurrent programs, and implemented them in a new library called [Trio](https://trio.readthedocs.io). In this talk, I'll describe these primitives, and demonstrate how to use them to implement a basic algorithm for speeding up TCP connections. Compared to the best previous Python implementation, our version turns out to be easier to understand, more correct, and dramatically shorter.
This talk assumes basic familiarity with Python, but does *not* require any prior experience with concurrency, async/await, or networking.
Carl Meyer
Sunday
1:10 p.m.–1:40 p.m.
in
Global Center Ballroom AB
You've heard about Python type annotations, but wondered if they're useful in the real world? Worried you've got too much code and can't afford to annotate it? Type-checked Python is here, it's for real, and it can help you catch bugs and make your code easier to understand. Come learn from our experience gradually typing a million-LOC production Python application!
Type checking solves real world problems in production Python systems. We'll cover the benefits, how type checking in Python works, how to introduce it gradually and sustainably in a production Python application, and how to measure success and avoid common pitfalls. We'll even demonstrate how modern Python typechecking goes hand-in-hand with duck-typing! Join us for a deep dive into type-checked Python in the real world.
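The abstract does not say which mechanism the talk demonstrates, but one way static typing and duck typing can coexist is structural typing with `typing.Protocol` (available in the standard library from Python 3.8). The sketch below assumes that approach; `Quacker`, `Duck`, `Robot`, and `chorus` are hypothetical names.

```python
from typing import Iterable, Protocol


class Quacker(Protocol):
    """Structural type: anything with a matching quack() method qualifies."""

    def quack(self) -> str: ...


class Duck:
    def quack(self) -> str:
        return "Quack"


class Robot:
    # No inheritance from Quacker or Duck; the method signature is enough.
    def quack(self) -> str:
        return "Beep"


def chorus(animals: Iterable[Quacker]) -> str:
    """Join the sounds of anything that satisfies the Quacker protocol."""
    return " ".join(animal.quack() for animal in animals)


result = chorus([Duck(), Robot()])
```

A type checker accepts both classes as `Quacker` because they have the right method, so existing duck-typed code can be annotated without adding base classes.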
Pieter Hooimeijer
Sunday
2:30 p.m.–3 p.m.
in
Room 26A/B/C
Many projects already take advantage of static analysis tools like flake8, PyLint, and MyPy. Can we do better? In this talk, I'll discuss how to take a type checker, bolt on an interprocedural static analyzer, and delight your security team with high quality results.
Abstract
It is incredibly challenging to build a halfway decent static analysis tool for a dynamic language like Python. Fortunately, it gets quite a bit easier with Python type annotations. To explain why, I'll present a tool that finds security vulnerabilities by tracking dangerous flows of information interprocedurally across an entire codebase. **Then,** I'll demonstrate how that tool is really just a slightly slower, more sophisticated, type checker.
Renato Oliveira
Sunday
2:30 p.m.–3 p.m.
in
Grand Ballroom B
When we talk about Web API Design, we're usually driven to think about architecture, verbs, and nouns. But we often forget our user: the developer.
UX designers rely on many techniques to create great experiences: user research, user testing, personas, usage data analysis, and others. However, when creating `invisible products` we’re not used to thinking about usability. So why don’t we take advantage of this background to improve our API experiences?
Adam Fletcher, Jonathan Mortensen
Friday
1:55 p.m.–2:25 p.m.
in
Grand Ballroom C
![ryu-python](http://www.thesimplelogic.com/wordpress/wp-content/uploads/2017/12/ryu-python.png)
Hear the story of how we used Python to build an AI that plays Super Street Fighter II on the Super NES. We'll cover how Python provided the key glue between the SNES emulator and AI, and how the AI was built with `gym`, `keras-rl` and `tensorflow`. We'll show examples of game play and training, and talk about which bot beat which bot in the bot-v-bot tournament we ran.
After this talk you'll know how easy it is to use Python and Python's machine learning libraries to teach a computer to play games. You'll see a practical example of the same type of machine learning used by AlphaGo, and also get to find out which character in Street Fighter II is best to pick when playing your friends.
Christopher Beacham / Lady Red
Saturday
10:50 a.m.–11:20 a.m.
in
Grand Ballroom A
Recently, a new LED strip specification, APA102, has been released which allows these strips to be driven by a general purpose CPU instead of a dedicated microcontroller. This allows us the luxury of controlling them with Python!
I'll teach you how to get the hardware, how to think about programming for lights, and how to build anything from a psychedelic art installation to home lighting to an educational tool.
Programming with lights is awesome because you can SEE bugs with your eyes. I think these LEDs have great potential as a teaching tool because of the immediacy of the feedback.
LIVE hardware demos! See Quicksort in brilliant colors!
Brian Okken, Paul Everitt
Friday
5:10 p.m.–5:40 p.m.
in
Grand Ballroom A
Know you should be doing testing but haven’t gotten over the hurdle to learn it? pytest is Python’s modern, friendly, and powerful testing framework. When paired with an IDE, testing gets a visual interface, making it much easier to get started.
In this talk we cover “visual testing”: starting, learning, using, and mastering test-driven development (TDD) with the help of a nice UI. We’ll show PyCharm Community Edition, a free and open-source Python IDE, as a productive TDD environment for pytest. Specifically, we’ll show a workflow using pytest and PyCharm that helps make tests speed up development, or at the very least helps make testing seem less "in the way" of other development activities.
Mariatta Wijaya
Saturday
4:15 p.m.–5 p.m.
in
Global Center Ballroom AB
How do you become a Python core developer? How can I become one? What is it like to be a Python core developer?
These are the questions I have often received since I became a Python core developer a year ago. Contributing to Python is a long journey that does not end when one earns the commit privilege. There are responsibilities to bear and expectations to live up to.
In the past year, I've been learning more about what it really means to be a Python core developer. Let me share all of that with you.
Justin Crown
Friday
12:10 p.m.–12:40 p.m.
in
Global Center Ballroom AB
Many of us practice test-driven development and pride ourselves on our code coverage. This is relatively easy to do when you begin a new project, but what happens when you take over an existing code base with few or no tests? Where and how do you start writing tests? This task can be very intimidating and frustrating, but it can be accomplished!
This talk will run through some common approaches and methodologies for adding test coverage to pre-existing code (that you might not even be familiar with at all). The next time you take over an untested monolith, you will be able to do the right thing and start writing tests instead of hoping for the best!
Keith Yang
Friday
1:40 p.m.–2:25 p.m.
in
Global Center Ballroom AB
REST has been the go-to choice of the API world. Why another API approach? To support more data-driven applications, provide more flexibility, reduce unnecessary code and calls, and address a wide variety of large-scale development problems, **GraphQL** is designed with HTTP, JSON, versioning, nullability, pagination, and server-side batching and caching in mind to make APIs "simple yet powerful".
Using [Graphene-Python](http://graphene-python.org/), a library for building GraphQL APIs in Python easily, this talk will go through the background and challenges of applying GraphQL as the new API service in a restaurant POS (point of sale) system within a complex cloud infrastructure in Python. An introduction, testing, and a live demo are included.
Shannon Turner
Saturday
2:35 p.m.–3:05 p.m.
in
Grand Ballroom B
Knowing how to code and being able to teach it are two separate skills. When we have expertise in a subject, it's common to take for granted that we'll be able to effectively communicate our expertise to someone else. Come learn (or re-learn!) how to teach and discover practical examples you can put to work right away.
By sharpening your teaching skills, you'll be a more effective mentor, trainer, and team member.