Thursday 9 a.m.–12:20 p.m.

Build a data pipeline with Luigi

Aaron Knight

Description

[Luigi][1] is a Python library for building pipelines of batch processes. It "handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more." In this tutorial, we will use Luigi to build a data pipeline that runs a series of interdependent jobs. We will also discuss some real-world use cases for Luigi and show how it can make running a data pipeline more robust and reliable.

[1]: https://pypi.python.org/pypi/luigi

Student Handout

No handouts have been provided yet for this tutorial.