Learn about powerful Python libraries for analyzing all types of data, including spatial data, through the following illustrated examples.
Problem: I have a large voter data file in CSV format. I want to examine it, check the column headings and data types, and do some basic stats, but I don’t want to pull it into Excel or Access. What are my options?
Solution: csvkit - I can explore my data, slice it, sort it, summarize it, and prepare it for import into PostGIS.
Bonus: Developers and journalists have been working hard to add functionality to csvkit. You can contribute!
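To make the kind of inspection described above concrete, here is a minimal sketch that lists column headings, guesses a type for each column, and computes a few basic stats. It uses only Python's standard library and is not csvkit's own code or API. The sample rows, the column names, and the helpers `infer_type` and `summarize` are all hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a large voter file.
SAMPLE = """voter_id,county,age,registered
1,Cook,34,2008-10-01
2,Lake,51,1996-03-15
3,Cook,,2012-09-30
4,Will,27,2016-02-11
"""


def infer_type(values):
    """Guess 'int', 'float', 'text', or 'empty' from the non-empty values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "empty"
    for name, cast in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return name
        except ValueError:
            continue
    return "text"


def summarize(handle):
    """Return a per-column report: inferred type, null count, basic stats."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name in columns:
            columns[name].append(row[name] or "")

    report = {}
    for name, values in columns.items():
        kind = infer_type(values)
        entry = {"type": kind, "nulls": values.count("")}
        if kind in ("int", "float"):
            nums = [float(v) for v in values if v != ""]
            entry.update(min=min(nums), max=max(nums), mean=sum(nums) / len(nums))
        else:
            # For text columns, report the single most frequent value.
            entry["most_common"] = Counter(v for v in values if v).most_common(1)
        report[name] = entry
    return report


report = summarize(io.StringIO(SAMPLE))
for name, entry in report.items():
    print(name, entry)
```

For a real file, pass an open file handle instead of the in-memory sample. Note that the sketch holds every column in memory, which a very large file may not allow.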
Problem: I have a bunch of data points from Twitter. How do I make sense of what I have in front of me, and where do I start?
Solutions: matplotlib, NetworkX
Bonus: Learn how Python libraries plug into one another.
Problem: I have a year’s worth of crime incidents for a large city. I want to explore global and local patterns in the data and identify clusters.
Solutions: PySAL (NumPy, SciPy)
Bonus: We’ll look at the full ESDA (Exploratory Spatial Data Analysis) module in PySAL, and we’ll briefly touch on a selection of the rest of PySAL’s functionality.
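As a rough illustration of what a "global pattern" measure looks like, here is a minimal sketch of global Moran's I, a standard spatial autocorrelation statistic used in ESDA. It is written in plain Python and is not PySAL's implementation or API. The function name `morans_i`, the four-block layout, and the incident counts are hypothetical. It assumes binary weights: 1 for neighbors, 0 otherwise.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary weights.

    values:    list of observations, one per area (e.g. incident counts)
    neighbors: dict mapping each area index to the indices of its neighbors
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean

    # S0 is the sum of all weights; with binary weights, the neighbor-link count.
    s0 = sum(len(js) for js in neighbors.values())

    # Cross-products of deviations over every (area, neighbor) pair.
    cross = sum(z[i] * z[j] for i, js in neighbors.items() for j in js)
    denom = sum(d * d for d in z)
    return (n / s0) * (cross / denom)


# Hypothetical example: four blocks in a row, each adjacent to the next.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

gradient = morans_i([1, 2, 3, 4], chain)     # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], chain)  # high and low values alternate

print(gradient, alternating)
```

A positive value suggests that similar counts cluster together, and a negative value suggests a checkerboard-like pattern. Judging significance takes a permutation test or an analytical approximation, which this sketch leaves out.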
To wrap up the talk, we'll give some tips on using PostGIS and GeoDjango to go from data analysis and mapping to building a web application.