Thursday 9 a.m.–12:20 p.m.
Build a data pipeline with Luigi
Aaron Knight
Description
[Luigi][1] is a Python library for building pipelines of batch processes. It "handles dependency resolution, workflow management, visualization, handling failures, command line integration, and much more."
In this tutorial, we will use Luigi to build a data pipeline that runs a series of interdependent jobs. We will also discuss some real-world use cases for Luigi and show how it can make a data pipeline more robust and reliable.
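To give a feel for what "a series of interdependent jobs" means, here is a minimal plain-Python sketch of the idea. It deliberately does not use Luigi's API: the `Task` base class, the `Extract` and `Transform` jobs, the file names, and the `build` helper are all hypothetical names invented for this illustration. Each job declares what it depends on and which file it produces, and a small runner executes dependencies first and skips any job whose output already exists, which is roughly the bookkeeping a pipeline library takes off your hands.

```python
import os
import tempfile


class Task:
    """Hypothetical base class: one batch job that writes one output file."""

    def __init__(self, workdir):
        self.workdir = workdir

    def requires(self):
        # Jobs that must finish before this one; none by default.
        return []

    def output(self):
        raise NotImplementedError

    def complete(self):
        # A job counts as done once its output file exists.
        return os.path.exists(self.output())

    def run(self):
        raise NotImplementedError


class Extract(Task):
    """Hypothetical first job: writes some unsorted numbers."""

    def output(self):
        return os.path.join(self.workdir, "raw.txt")

    def run(self):
        with open(self.output(), "w") as f:
            f.write("3\n1\n2\n")


class Transform(Task):
    """Hypothetical second job: sorts the numbers written by Extract."""

    def requires(self):
        return [Extract(self.workdir)]

    def output(self):
        return os.path.join(self.workdir, "sorted.txt")

    def run(self):
        with open(self.requires()[0].output()) as f:
            numbers = sorted(int(line) for line in f)
        with open(self.output(), "w") as f:
            f.writelines(f"{n}\n" for n in numbers)


def build(task, log):
    """Run dependencies depth-first, skipping jobs that are already complete."""
    if task.complete():
        return
    for dep in task.requires():
        build(dep, log)
    task.run()
    log.append(type(task).__name__)


workdir = tempfile.mkdtemp()

first_run = []
build(Transform(workdir), first_run)
print(first_run)   # ['Extract', 'Transform']

second_run = []
build(Transform(workdir), second_run)
print(second_run)  # [] because both outputs already exist
```

Asking for the last job is enough to run the whole chain, and rerunning after a failure only repeats the jobs whose outputs are missing. A real pipeline library adds scheduling, failure handling, and visualization on top of this basic pattern.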
[1]: https://pypi.python.org/pypi/luigi
Student Handout
No handouts have been provided yet for this tutorial.