Friday 3:15 p.m.–4 p.m.

Realtime predictive analytics using scikit-learn & RabbitMQ

Michael Becker

Audience level:
Intermediate
Category:
Python Libraries

Description

scikit-learn is an awesome tool allowing developers with little or no machine learning knowledge to predict the future! But once you've trained a scikit-learn algorithm, what now? In this talk, I describe how to deploy a predictive model in a production environment using scikit-learn and RabbitMQ. I'll demonstrate the design with a realtime content classification system.

Abstract

Predicting the future!

scikit-learn is an awesome tool allowing developers with little or no machine learning knowledge to predict the future! Once you've trained your algorithm, one of the first issues you'll tackle is how to deploy it. There are many requirements to take into account when deploying your algorithm in a production environment. Message loss and scalability are two of the bigger issues you'll have to deal with.

But how?

I'll work through each of the design issues and build an example system for predicting the language of a piece of text using scikit-learn and RabbitMQ. I'll start off by training a scikit-learn model on data from Wikipedia. Next, I'll show a simple web UI, similar to Google Translate, which I'll use to send input to the model. I'll discuss the issue of message loss and why RabbitMQ is a good fit for solving it. I'll write a simple AMQP worker that wraps the scikit-learn model, then review the overall design of the solution. Finally, I'll demo the system in action by predicting the language of some text. I'll wrap up by covering some other important design decisions that you'll want to consider when deploying this type of solution in a production environment.
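
As a rough illustration of the pieces described above, here is a minimal sketch, not the code from the talk. It assumes a character n-gram Naive Bayes pipeline for language prediction and the pika client for AMQP. The tiny inline corpus stands in for the Wikipedia data, and the queue name, host, JSON message format, and function names are all hypothetical.

```python
import json

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in for the Wikipedia training data.
SAMPLES = [
    ("the quick brown fox jumps over the lazy dog", "en"),
    ("this is a short sentence written in english", "en"),
    ("le renard brun saute par-dessus le chien paresseux", "fr"),
    ("ceci est une courte phrase écrite en français", "fr"),
    ("der schnelle braune fuchs springt über den faulen hund", "de"),
    ("dies ist ein kurzer satz auf deutsch", "de"),
]


def train_model(samples=SAMPLES):
    """Fit a character n-gram classifier on (text, language) pairs."""
    texts, labels = zip(*samples)
    model = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 3)),
        MultinomialNB(),
    )
    model.fit(list(texts), list(labels))
    return model


def handle_message(model, body):
    """Decode one JSON request and return a JSON-encoded prediction."""
    request = json.loads(body)
    language = model.predict([request["text"]])[0]
    return json.dumps({"id": request.get("id"), "language": str(language)})


def run_worker(model, queue="language_requests", host="localhost"):
    """Consume requests from RabbitMQ (queue name and host are assumptions)."""
    import pika  # imported lazily so the model code runs without a broker

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    # A durable queue survives a broker restart.
    channel.queue_declare(queue=queue, durable=True)
    # Hand each worker one unacknowledged message at a time.
    channel.basic_qos(prefetch_count=1)

    def on_message(ch, method, properties, body):
        reply = handle_message(model, body)
        if properties.reply_to:
            ch.basic_publish(
                exchange="",
                routing_key=properties.reply_to,
                properties=pika.BasicProperties(
                    correlation_id=properties.correlation_id
                ),
                body=reply,
            )
        # Acknowledge only after the work is done, so a crash mid-prediction
        # leaves the message on the queue for redelivery.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    channel.basic_consume(queue=queue, on_message_callback=on_message)
    channel.start_consuming()
```

Calling `run_worker(train_model())` would start a worker against a local broker. Keeping `handle_message` separate from the AMQP plumbing lets the prediction logic be exercised without RabbitMQ, and running several copies of the worker against the same queue is one way to approach the scalability concern.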