Statistical Modeling for Policy Analysis
Causality is fundamental to policy analysis. That is, it’s important to know what will (probably) happen if we implement a program, change a rule, or take some other action to solve a problem, before we decide what to do. A pilot or experimental program can often help, but even then it’s hard to isolate the effects of our actions from the effects of everything else that causes public problems to get better or worse. And sometimes even pilot programs are too expensive, time-consuming, or ethically challenging; if so, the best we can do is to use our past experience and natural variation to work out the extent to which solution X reduces problem Y.
For better or worse, the basic tool for estimating causal relationships is linear regression. In this class, we’ll apply regression to a variety of real-world problems and data sets to estimate the causal relationships that inform good policy decisions. We haven’t time to be comprehensive, but we can consider the kinds of problems and situations policy analysts encounter most often. In particular, we’ll consider a variety of data structures (cross-sectional, time-series, and panel), research designs (randomized controlled trials, quasi-experiments, and nonexperimental studies), dependent variables (amounts, counts, and binary outcomes), and relationship types (nonlinear, contingent, and bidirectional). We’ll also consider how to draw reasonable conclusions from previous studies.
Requirements: Four problem sets, a final exam, and a final paper. Students are encouraged to work in groups on problem sets, and may take the exam by themselves or with one (only one) other student. Two or more students may collaborate on a paper with permission of the instructor.
Readings: I recommend Hill, Griffiths, and Lim, Principles of Econometrics, 3rd ed. (2008). Since most basic econometrics texts (e.g., Wooldridge, Gujarati, Stock & Watson) cover the same material, students should feel free to use the book they’re most comfortable with. Likewise, I’ll use Stata in class, but students who are already comfortable with another statistical package (e.g., SPSS, SAS, R, gretl) may use it instead; virtually any of them will do the job.