Thursday 9 a.m.–12:20 p.m.
Python for Social Scientists
Renee Chu
- Audience level: Novice
- Category: Science
Description
Abstract
A) Greeting and Orientation - 15 min
- Survey of programming levels in the audience.
- Survey of educational/research experience.
Confirm installation prerequisites:
- Python 2.7 or higher
- Matplotlib
- A text editor
Break into pairs to be partners during the workshop. This is to help keep anyone from falling behind because of small questions.
Time Total: 0:15
B) Group Work: Import World Bank Data - 45 min
- We will write a function to pull in CSV data from the World Bank Indicators page.
- We'll be using "Ratio of girls to boys in primary and secondary education" (http://data.worldbank.org/indicator/SE.ENR.PRSC.FM.ZS).
- Output the data to the console (i.e., as a Python dict).
Time Total: 1:00
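The import step in section B could be sketched roughly as follows. This is a minimal illustration, not the workshop's actual code: the function name `load_indicator`, the simplified column layout (one row per country, one column per year), and every value in `SAMPLE_CSV` are assumptions made up for the example. A real World Bank download may carry extra header lines and columns that would need handling. The sketch is written for Python 3.

```python
import csv
import io

# Made-up sample in a simplified, World Bank-like layout (an assumption):
# one row per country, one column per year. Values are placeholders, not real data.
SAMPLE_CSV = """Country Name,Country Code,2009,2010,2011
Country A,AAA,95.1,96.0,
Country B,BBB,101.2,,100.8
"""


def load_indicator(csv_file):
    """Return {country name: {year: value}} from an open CSV file object."""
    data = {}
    for row in csv.DictReader(csv_file):
        years = {}
        for column, cell in row.items():
            # Year columns are the ones whose header is all digits;
            # empty cells (missing observations) are skipped.
            if column and column.isdigit() and cell and cell.strip():
                years[int(column)] = float(cell)
        data[row["Country Name"]] = years
    return data


ratios = load_indicator(io.StringIO(SAMPLE_CSV))
for country, years in sorted(ratios.items()):
    print(country, years)
```

To read a downloaded file instead of the inline sample, the same function can be passed an open file object, for example `load_indicator(open("indicator.csv", newline=""))`, where the filename is hypothetical.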
C) Group Work: Graph the Data, Time Series - 45 min
As a group, we will find and read through pre-existing code to borrow from, specifically this bar chart demo with two sets of data, "Men" and "Women" (http://matplotlib.org/examples/api/barchart_demo.html). Then we'll alter it to meet our needs and create 2 different charts:
- Display "Ratio of girls to boys in primary and secondary education" across years, grouped by country, for 2-4 chosen countries.
- 2 axes: One axis will display "ratio of girls to boys in primary education" across countries over time. The other will display another indicator from the World Bank library.
Share the pattern you found with the rest of your table.
Time Total: 1:45
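The two charts in section C might look something like the sketch below. It is an illustration only, not an adaptation of the linked demo: the function names `plot_grouped_bars` and `plot_two_indicators`, the output filenames, and all numbers are hypothetical placeholders rather than World Bank figures. The first function draws one cluster of bars per year with one bar per country; the second uses Matplotlib's `twinx()` to put a second indicator on a right-hand axis.

```python
import matplotlib
matplotlib.use("Agg")  # render to files, so no display is needed
import matplotlib.pyplot as plt

# Placeholder values only (not real World Bank figures).
ratios = {
    "Country A": {2009: 95.1, 2010: 96.0, 2011: 96.4},
    "Country B": {2009: 101.2, 2010: 100.9, 2011: 100.8},
}


def plot_grouped_bars(data, ylabel):
    """Draw one cluster of bars per year, with one bar per country."""
    countries = sorted(data)
    years = sorted(set(y for c in countries for y in data[c]))
    width = 0.8 / len(countries)  # the countries share 80% of each year slot
    fig, ax = plt.subplots()
    for i, country in enumerate(countries):
        xs = [j + i * width for j in range(len(years))]
        # A missing year is drawn as a zero-height bar in this sketch.
        heights = [data[country].get(year, 0) for year in years]
        ax.bar(xs, heights, width, label=country)
    # Center each year label under its cluster of bars.
    centers = [j + width * (len(countries) - 1) / 2.0 for j in range(len(years))]
    ax.set_xticks(centers)
    ax.set_xticklabels([str(y) for y in years])
    ax.set_ylabel(ylabel)
    ax.legend()
    return fig, ax


def plot_two_indicators(years, first, second, first_label, second_label):
    """Plot two series against the same years on separate y-axes."""
    fig, left = plt.subplots()
    left.plot(years, first, "b-o")
    left.set_ylabel(first_label)
    right = left.twinx()  # second y-axis that shares the x-axis
    right.plot(years, second, "r-s")
    right.set_ylabel(second_label)
    return fig, left, right


fig, ax = plot_grouped_bars(ratios, "Ratio of girls to boys (placeholder values)")
fig.savefig("ratio_by_year.png")

fig2, left, right = plot_two_indicators(
    [2009, 2010, 2011],
    [95.1, 96.0, 96.4],   # placeholder ratio values
    [10.0, 12.5, 11.0],   # placeholder values for a second indicator
    "Ratio of girls to boys (placeholder)",
    "Second indicator (placeholder)",
)
fig2.savefig("two_indicators.png")
```

Keeping the plotting in functions that take plain dicts and lists means the same code can be reused once real data has been loaded in place of the placeholders.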
D) On Your Own: Answer Your Own Question - 45 min
- With your partner, pick another data set you find interesting. Pose a question that can be answered with that data.
You don't have to limit yourself to World Bank data sets; any source that exports to .csv or .xls can work. Some resources:
- SF city data: https://data.sfgov.org/
- DC city data: http://data.dc.gov/
- Daily Treasury Statements from 1998 to 2013: https://fms.treas.gov/fmsweb/DTSFilesArchiveAction.do (TODO: crawl the website and convert .txt to .csv if there is time, or see if results are in the GitHub repository at https://github.com/csvsoundsystem/fms-treasury-statements-analysis)
- US Dept of Education: http://www.ed.gov/developer (NOTE: many currently unavailable due to shutdown)
- US Dept of Defense open data sets: http://open.defense.gov/Data.aspx (NOTE: currently not available due to shutdown)
- Science Hack Day listings: http://sciencehackday.pbworks.com/w/page/24500475/Datasets
Using the tools we've learned so far, display the data in a way that answers your question. The teacher (and hopefully TAs) will circulate to answer questions.
Time Total: 2:30
-- 15 min break, may be mixed in earlier with the on-your-own time --
Time Total: 2:45
E) Present to the Group - 15 min
- Each pair in the tutorial will present the question they asked and what they found to the group.
Time Total: 3:00
Student Handout
No handouts have been provided yet for this tutorial