Saturday 9 a.m.–12:20 p.m.
Practical Network Analysis Made Simple
Eric J. Ma
- Audience level: Intermediate
- Category: Science
Description
Have you ever wondered how data scientists at Facebook and LinkedIn make friend recommendations? Or how epidemiologists track down patient zero in an outbreak? If so, then this tutorial is for you. In this upgraded version of last year's tutorial, we will explore a variety of network analysis problems, including graph construction, traversal, and statistics.
Abstract
In this tutorial, I will show you how you can use data to construct networks for data analysis. By the end of the tutorial, you will be equipped to think through network problems, and have enough familiarity with the `networkx` API to hack at them on your own.
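As a rough illustration of what constructing a network from data can look like, here is a minimal sketch. The edge list, column names, and people are made up for this example and are not the tutorial's data set; it assumes `pandas` and a `networkx` version that provides `from_pandas_edgelist` (2.0 or later).

```python
import networkx as nx
import pandas as pd

# Hypothetical toy edge list: each row is one friendship between two people.
edges = pd.DataFrame(
    {
        "source": ["alice", "alice", "bob", "carol"],
        "target": ["bob", "carol", "carol", "dave"],
    }
)

# Build an undirected graph whose nodes are the people and whose
# edges are the rows of the table.
G = nx.from_pandas_edgelist(edges, source="source", target="target")

n_nodes = G.number_of_nodes()
n_edges = G.number_of_edges()

# Neighbors of a node are the nodes it shares an edge with.
carol_neighbors = sorted(G.neighbors("carol"))
```

Keeping the raw data as a table and converting it to a graph in one step is only one possible design; edges can also be added one at a time with `G.add_edge`.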
Starting with a toy data set as an anchor, we will go through:
- graph basics (nodes + edges, list and matrix representations),
- modelling problems as graphs,
- computing statistics (network connectivity) and discovering interesting structures (hubs, paths and clusters) in the data,
- importing data using `pandas` and `networkx`,
- generating visualizations using `matplotlib`,
- using random graphs and permutation tests for statistical inference on networks, inspired by Jake Vanderplas.
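To give a flavor of several of these topics, the sketch below finds hubs, a shortest path, and a clustering statistic on a made-up toy graph, then compares that statistic against random graphs of the same size. The graph, the choice of average clustering as the statistic, and the 200 random draws are all assumptions for illustration. Note that this is a simple Monte Carlo comparison against a random-graph null model, not the tutorial's own method.

```python
import networkx as nx

# Hypothetical toy network: two triangles joined by a single bridge edge.
G = nx.Graph()
G.add_edges_from(
    [
        ("a", "b"), ("b", "c"), ("a", "c"),  # first cluster
        ("d", "e"), ("e", "f"), ("d", "f"),  # second cluster
        ("c", "d"),                          # bridge between the clusters
    ]
)

# Hubs: the two nodes with the highest degree centrality.
centrality = nx.degree_centrality(G)
hubs = sorted(centrality, key=centrality.get, reverse=True)[:2]

# Paths: one shortest path between two nodes in different clusters.
path = nx.shortest_path(G, "a", "f")

# Statistic of interest: the average clustering coefficient.
observed = nx.average_clustering(G)

# Null model: random graphs with the same number of nodes and edges.
n, m = G.number_of_nodes(), G.number_of_edges()
null = [
    nx.average_clustering(nx.gnm_random_graph(n, m, seed=seed))
    for seed in range(200)
]

# Fraction of random graphs that are at least as clustered as the toy graph.
p_value = sum(value >= observed for value in null) / len(null)
```

A small `p_value` here would suggest that the toy graph is more clustered than chance alone would produce. For a quick look at the graph itself, `nx.draw(G, with_labels=True)` renders it with `matplotlib`.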
IPython notebooks and data files will be shared on GitHub before the tutorial.
Following good pedagogical practice, this tutorial will be structured as a series of coding problems (from easy to challenging) interspersed with mini lectures (2-3 minutes long) to emphasize concepts. Group work is encouraged. **Python 3 only!** It's time to heed @jakevdp's call and move from legacy versions to Python 3.
Student Handout
No handouts have been provided yet for this tutorial.