The demand for software engineers with data analytics and machine learning skills is growing rapidly. Python with NumPy is one of the best environments for quickly prototyping scalable data-centric applications and for exploring your data interactively, especially thanks to tools such as IPython and Matplotlib.
scikit-learn is a very active open source project that implements a variety of state-of-the-art machine learning algorithms. The goal of this project and tutorial is to take the algorithms out of the academic papers and make them work on a selection of real-world tasks to unleash the value of your data.
We will focus on hints for choosing the right data preprocessing steps and on how to select algorithms and parameters suitable for the task at hand. We will also introduce tools and methodologies to measure the performance of the trained models as objectively as possible.
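
To give a flavor of that workflow, here is a minimal sketch using scikit-learn. The dataset (iris), the choice of a scaled support vector classifier, and the parameter grid are illustrative assumptions, not the material the tutorial necessarily covers. The sketch chains a preprocessing step with a model, selects parameters by cross-validation on the training data only, and then scores the result on data held out from the search.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative dataset: 150 iris samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Keep a test set aside so the final score is not biased by the parameter search.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing and model in one pipeline: the scaler is fitted on the
# training folds only, which avoids leaking test statistics into the model.
pipeline = make_pipeline(StandardScaler(), SVC())

# Hypothetical parameter grid; make_pipeline names the SVC step "svc".
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": [0.01, 0.1, 1]}

# 5-fold cross-validated grid search over the training data.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# Accuracy of the best pipeline on the untouched test set.
test_accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("held-out accuracy: %.3f" % test_accuracy)
```

Putting the scaler inside the pipeline, rather than scaling the whole dataset up front, is the design choice that keeps the evaluation honest: each cross-validation fold and the final test set are transformed with statistics learned from training data alone.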