PyCon 2011 Atlanta

March 9th–17th

Mining and Visualizing Data from the Social Web with Python

Experienced / Tutorial
March 9th 1:20 p.m. – 4:40 p.m.
This lab teaches you how to harvest, store, analyze, and visualize data from the most popular social networking sites (Twitter, Facebook, LinkedIn, etc.) with Python, pragmatic storage technologies like Redis and CouchDB, and popular visualization tools like Graphviz and JavaScript toolkits.


Popular social networks such as Facebook, Twitter, and LinkedIn generate a tremendous amount of valuable social data. Who's talking to whom? What are they talking about? How often are they talking? Where are they? This lab tutorial teaches you how to answer these types of questions and more. Each teaching module presents a soup-to-nuts approach that combines popular social web data, analysis techniques, and visualization so that you can find the needles you've been looking for as well as some of the ones you didn't even know to look for in the first place.
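
As a rough illustration of the "who's talking to whom, and how often" question, the sketch below counts @mentions between users. It is not taken from the lab materials: the sample tweets, the `MENTION` pattern, and the `mention_edges` helper are all made up for this example, and real harvested data would come from the network's API instead.

```python
import re
from collections import Counter

# Hypothetical sample data: (author, text) pairs standing in for harvested tweets.
tweets = [
    ("alice", "@bob have you tried CouchDB yet?"),
    ("bob", "@alice yes, and Redis too"),
    ("alice", "@bob @carol see you at the tutorial"),
    ("carol", "looking forward to PyCon"),
]

# Simplified mention pattern (an assumption): "@" followed by word characters.
MENTION = re.compile(r"@(\w+)")


def mention_edges(tweets):
    """Count how often each author mentions each other user."""
    edges = Counter()
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            edges[(author, mentioned.lower())] += 1
    return edges


edges = mention_edges(tweets)
# Print the directed "talks to" pairs, most frequent first.
for (source, target), count in edges.most_common():
    print(f"{source} -> {target}: {count}")
```

The resulting (source, target) counts form a weighted directed graph, which is the kind of structure a tool such as Graphviz can draw.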

This lab is taught by the author of Mining the Social Web, and exercises are largely adapted from the book. Much of the source code for the lab is already available on GitHub, and you are encouraged (but not required) to hack on it prior to the lab, so that you are ready with questions and ideas.

  • Get a concise and straightforward synopsis of the social web landscape so you know which 20% of the space to spend 80% of your time on
  • Use easily adaptable scripts hosted on GitHub to harvest data from popular social network APIs including Twitter, Facebook, and LinkedIn
  • Learn how to slice and dice social web data with easy-to-use Python tools, and apply more advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detection
  • Build interactive visualizations with easily adaptable web technologies built upon HTML5 and JavaScript toolkits
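
Two of the mining techniques named in the list above, TF-IDF and cosine similarity, can be sketched in a few lines of plain Python. This is a minimal sketch and not the lab's own code: the sample posts are invented, tokenization is a naive whitespace split, and the weighting is the basic tf × log(N/df) variant. The function names `tf_idf_vectors` and `cosine_similarity` are chosen here for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical posts standing in for harvested social web text.
posts = [
    "python makes mining social data fun",
    "mining social data with python and redis",
    "graphviz draws pretty network diagrams",
]
vectors = tf_idf_vectors(posts)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(sim_related, sim_unrelated)
```

The first two posts share several terms and so score above zero, while the third shares none with the first and scores exactly zero. Sparse dictionaries keep the example dependency-free; a library such as NLTK or scikit-learn would be a more practical choice for real data.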