Data Management and Research Life Cycle

This class equips thoughtful students with powerful data science skills. You will learn how to manage and work with large, complex datasets in social science research, particularly in policy and nonprofit studies. You are expected to learn the following skills and use them to address “big questions” of social importance:

Understand the structure of data and how to work with big and complex datasets.
Understand the workflows of acquiring and managing data.
Be able to conduct data-intensive and replicable social science research.

Each class has two sections: discussion of reading materials and hands-on programming. The programming environment will be JupyterHub with Python 3. Prior programming experience is helpful but not required. You are expected to have knowledge of college-level statistics (e.g., you know what “mean”, “standard deviation”, and “normal distribution” are, and you can use Excel to draw line charts). Advanced programming and statistics skills are helpful but not required because they are not the focus of the assignments, and the final project can be customized to individual needs. A tentative syllabus is available:

The class will also have four lunch seminars with guest speakers from academia, the nonprofit sector, and the intelligence community.