Chapter 4 Part 3: IV Details


Angrist & Pischke, Mostly Harmless Econometrics — Section 4.6

Core Message

This section covers practical pitfalls of IV: common mistakes in manual 2SLS, the difficulty of identifying peer effects, the relationship between 2SLS and nonlinear models (bivariate Probit), and the finite-sample bias of 2SLS when instruments are many or weak.

4.6.1 Common 2SLS Mistakes

Mistake 1: Covariate Ambivalence

The mistake: Including different covariates in the first and second stages.

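Griliches & Mason (1972) included age in the second stage but not in the first. This is wrong because the first-stage residual (si − ŝi) is guaranteed to be uncorrelated only with variables included in the first stage. Leaving age out lets that residual, which ends up in the second-stage error, be correlated with age, so the second-stage estimates are inconsistent.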

Rule: Always include the same exogenous covariates in both stages. If a covariate is good enough for the second stage, it's good enough for the first.
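The rule can be checked with a small simulation. The sketch below is illustrative only: the data-generating process, the coefficient values, and the variable names (x for the covariate, z for the instrument) are assumptions and do not come from Griliches & Mason. It runs manual 2SLS twice, once with the covariate in both stages and once with the covariate dropped from the first stage.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
rho, delta = 1.0, 0.5                  # assumed true causal effect and covariate effect

x = rng.normal(size=n)                 # exogenous covariate (think: age)
z = 0.5 * x + rng.normal(size=n)       # instrument, correlated with the covariate
eta = rng.normal(size=n)               # first-stage error
eps = 0.8 * eta + rng.normal(size=n)   # structural error, correlated with eta
s = z + x + eta                        # endogenous regressor
y = rho * s + delta * x + eps

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Correct: the covariate x appears in both stages.
X1 = np.column_stack([const, z, x])
s_hat = X1 @ ols(X1, s)
rho_ok = ols(np.column_stack([const, s_hat, x]), y)[1]

# Covariate ambivalence: x is in the second stage but not in the first.
X1_bad = np.column_stack([const, z])
s_hat_bad = X1_bad @ ols(X1_bad, s)
rho_bad = ols(np.column_stack([const, s_hat_bad, x]), y)[1]

print(f"same covariates in both stages: {rho_ok:.3f}")
print(f"covariate missing from stage 1: {rho_bad:.3f}")
```

In this design the first estimate lands close to the assumed true value of 1, while the second converges to about 1/1.4 ≈ 0.71, because the omitted covariate is correlated with both the instrument and the first-stage residual.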

Mistake 2: Forbidden Regressions

The mistake: Using nonlinear first-stage fitted values (e.g., Probit) as plug-in replacements in the second stage.

Suppose di is a binary endogenous variable. You might think: "Since di is 0/1, use Probit for the first stage instead of OLS."

Why it's wrong: Only OLS residuals are guaranteed to be uncorrelated with fitted values and covariates (by the normal equations). Probit residuals lack this property unless the Probit model is correctly specified — which we cannot verify.

Correct alternatives:

  • Standard 2SLS: Use a linear first stage estimated by OLS. The resulting 2SLS estimates are consistent whether or not the first-stage CEF is actually linear
  • Nonlinear fits as instruments: Use d̂probit as an instrument (not a plug-in) in standard 2SLS. This can improve efficiency if the Probit model is a good approximation

Caveat: Using nonlinear fits as instruments implicitly treats the nonlinearity itself as identifying information. If the instruments Zi belonged in the causal equation, a linear model would be unidentified, yet the nonlinear first stage would still deliver an estimate. This is "back-door" identification through functional form alone, which is hard to defend.
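The three options (standard 2SLS, the forbidden plug-in, and the Probit fit used as an instrument) can be compared side by side. This is a minimal sketch under stated assumptions: the data-generating process is invented, the first-stage error is deliberately skewed so that the Probit is misspecified, and probit_fitted is a hypothetical helper that fits the Probit by Fisher scoring using only NumPy and the standard library.

```python
import math
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
rho = 2.0                                            # assumed true effect of d on y

x = rng.normal(size=n)                               # exogenous covariate
z = rng.normal(size=n)                               # instrument
v = (rng.normal(size=n) ** 2 - 1) / math.sqrt(2)     # skewed error: Probit is misspecified
eps = 0.7 * v + rng.normal(size=n)                   # endogeneity: eps correlated with v
d = (0.3 + z + 0.5 * x > v).astype(float)            # binary endogenous regressor
y = 1.0 + rho * d + 0.5 * x + eps

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def iv(X, Z, y):
    """Just-identified IV estimator: solves Z'X b = Z'y."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)

_erf = np.vectorize(math.erf)

def norm_cdf(t):
    return 0.5 * (1.0 + _erf(t / math.sqrt(2.0)))

def probit_fitted(X, d, iters=25):
    """Hypothetical helper: Probit fitted probabilities via Fisher scoring."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        idx = X @ b
        p = np.clip(norm_cdf(idx), 1e-10, 1 - 1e-10)
        pdf = np.exp(-0.5 * idx ** 2) / math.sqrt(2 * math.pi)
        score = X.T @ (pdf * (d - p) / (p * (1 - p)))
        info = X.T @ (X * (pdf ** 2 / (p * (1 - p)))[:, None])
        b = b + np.linalg.solve(info, score)
    return norm_cdf(X @ b)

const = np.ones(n)
X1 = np.column_stack([const, z, x])                  # first-stage regressors
d_ols = X1 @ ols(X1, d)
d_probit = probit_fitted(X1, d)
X = np.column_stack([const, d, x])                   # structural regressors

rho_2sls = iv(X, X1, y)[1]                                    # standard 2SLS
rho_plug = ols(np.column_stack([const, d_probit, x]), y)[1]   # forbidden regression
rho_inst = iv(X, np.column_stack([const, d_probit, x]), y)[1] # Probit fit as instrument

orth_ols = abs((d - d_ols) @ x) / n        # zero by the normal equations
orth_probit = abs((d - d_probit) @ x) / n  # not guaranteed to be zero

print(rho_2sls, rho_plug, rho_inst, orth_ols, orth_probit)
```

Standard 2SLS and the fit-as-instrument version should both land near the assumed true effect, since each uses only functions of exogenous variables as instruments. The plug-in number is printed for comparison; how far it strays depends on how badly the Probit is misspecified.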

Mistake 3: Forbidden Nonlinear Second Stage

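With a quadratic model yi = δ'Xi + ρ₁si + ρ₂si² + εi, do not plug in ŝ and ŝ² from a single first stage. Instead, treat si and si² as two separate endogenous variables, each with its own first-stage equation, and estimate by proper 2SLS. This requires at least two instruments, for example zi and zi² when the instrument is not a dummy, or interactions of zi with covariates.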
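A simulated example makes the difference concrete. The sketch below assumes a design chosen for illustration (a continuous instrument z, a heteroskedastic first-stage error, no covariates Xi); none of these choices come from the source. It compares squaring a single fitted value with treating s and s² as two endogenous variables instrumented by z and z².

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
rho1, rho2 = 1.0, 0.5                         # assumed true coefficients

z = rng.normal(size=n)                        # continuous instrument
eta = (1 + 0.5 * z) * rng.normal(size=n)      # heteroskedastic first-stage error
eps = 0.8 * eta + rng.normal(size=n)          # structural error, correlated with eta
s = z + eta
y = rho1 * s + rho2 * s ** 2 + eps

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Forbidden: one first stage, then plug in s_hat and its square.
Z1 = np.column_stack([const, z])
s_hat = Z1 @ ols(Z1, s)
plug = ols(np.column_stack([const, s_hat, s_hat ** 2]), y)[1:]

# Proper 2SLS: s and s**2 each get their own first stage on (1, z, z**2).
Z = np.column_stack([const, z, z ** 2])
X = np.column_stack([const, s, s ** 2])
X_hat = Z @ ols(Z, X)                         # fitted values for every column of X
proper = ols(X_hat, y)[1:]

print("plug-in:", plug)
print("proper :", proper)
```

The heteroskedastic error matters here. With a homoskedastic first-stage error, the square of the fitted value differs from the conditional mean of s² only by a constant, and the plug-in shortcut would happen to work; that is not something to rely on in practice.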

4.6.2 Peer Effects

Type 1: Effect of Group Average of One Variable on Individual Outcome of Another

Example: Does average schooling in a state (S̄jt) affect individual wages? (Acemoglu & Angrist 2000)

Yijt = αj + λt + ρsi + ψS̄jt + ujt + εijt

Problem: If OLS and 2SLS (using state dummies) give different estimates of ρ, then ψ̂ ≠ 0 mechanically, even without true externalities.

  • If 2SLS > OLS (e.g., measurement error correction): spurious positive externality
  • If 2SLS < OLS (e.g., ability bias removed): spurious negative externality

→ OLS estimates of an equation like this are therefore very hard to interpret as peer effects.
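The mechanical link can be reproduced in a simulation with no externality at all. In the sketch below everything is an assumption made for illustration: there are no state or year effects, schooling is measured with classical error (so 2SLS exceeds OLS), and the names rho_ols, rho_iv and r2 are mine. The last line uses the algebraic fact that, when S̄ is the in-sample group mean, the coefficient on S̄ equals the 2SLS-minus-OLS gap divided by one minus the first-stage R².

```python
import numpy as np

rng = np.random.default_rng(1)
J, m = 200, 200                               # groups and members per group
N = J * m
group = np.repeat(np.arange(J), m)

g = rng.normal(size=J)[group]                 # group component of true schooling
s_true = g + rng.normal(size=N)
s = s_true + rng.normal(size=N)               # measured schooling, classical error
y = 1.0 * s_true + rng.normal(size=N)         # private return 1, externality 0

s_bar = (np.bincount(group, weights=s) / m)[group]   # group average schooling

def slope(x, y):
    """Bivariate OLS slope of y on x."""
    xc = x - x.mean()
    return xc @ (y - y.mean()) / (xc @ xc)

rho_ols = slope(s, y)                         # attenuated by measurement error
rho_iv = slope(s_bar, y)                      # 2SLS using group dummies as instruments
X = np.column_stack([np.ones(N), s, s_bar])
psi_hat = np.linalg.lstsq(X, y, rcond=None)[0][2]    # "externality" coefficient

r2 = np.var(s_bar) / np.var(s)                # first-stage R-squared
psi_from_gap = (rho_iv - rho_ols) / (1 - r2)

print(rho_ols, rho_iv, psi_hat, psi_from_gap)
```

Even though the true externality is zero by construction, psi_hat comes out clearly positive, and it matches psi_from_gap up to floating-point error.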

Type 2: Effect of Group Average on Same Individual Variable

"Does the average graduation rate of my classmates affect whether I graduate?"

A regression of sij on S̄j always yields a coefficient of exactly 1, because S̄j is literally the fitted value from regressing sij on school dummies. The regression is tautological and tells us nothing about causality.

Even using leave-one-out means S̄(−i)j is problematic because school-level common shocks (e.g., a good principal) create spurious correlation between individual and peer outcomes.
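Both points are easy to verify numerically. The following sketch uses made-up data with no peer effects whatsoever; the school sizes, the graduation probability, and the common-shock design are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
J, m = 100, 30                                # schools and students per school
N = J * m
school = np.repeat(np.arange(J), m)

def slope(x, y):
    """Bivariate OLS slope of y on x."""
    xc = x - x.mean()
    return xc @ (y - y.mean()) / (xc @ xc)

# Case 1: independent graduation indicators, regressed on the school mean.
s = (rng.random(N) < 0.7).astype(float)
s_bar = (np.bincount(school, weights=s) / m)[school]
coef_mean = slope(s_bar, s)                   # equals 1 by construction

# Case 2: a school-level common shock, regressed on the leave-one-out mean.
shock = rng.normal(size=J)[school]            # e.g. a good principal
s2 = shock + rng.normal(size=N)               # still no peer effects
loo = (np.bincount(school, weights=s2)[school] - s2) / (m - 1)
coef_loo = slope(loo, s2)                     # large and positive anyway

print(coef_mean, coef_loo)
```

The first coefficient is 1 up to rounding even though the outcomes are independent draws. The second is large and positive purely because of the shared school-level shock.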

Better Approaches

Use ex ante peer characteristics that predate the outcome:

  • Ammermueller & Pischke (2006): Books in peers' homes → student test scores (books are a pre-determined home characteristic)
  • Angrist & Lang (2004): Number of bused-in low-achievers → resident students' test scores (determined by students outside the sample)

4.6.3 Limited Dependent Variables Reprise

The Case for 2SLS over Bivariate Probit

When the dependent variable is binary (e.g., employment), should we use bivariate Probit instead of 2SLS?

Arguments for sticking with 2SLS:

  • 2SLS captures LATE regardless of whether the dependent variable is binary, non-negative, or continuous
  • 2SLS requires no distributional assumptions
  • 2SLS estimates the causal effect directly — no need to compute marginal effects from latent-index coefficients
  • Bivariate Probit can estimate ATE (not just LATE), but only under joint normality — a strong assumption

Bivariate Probit Setup

First stage: di = 1[Xi'δ₀ + δ₁zi > vi]
Second stage: yi = 1[Xi'β₀ + β₁di > εi]

Endogeneity arises from Corr(vi, εi) ≠ 0.
Identified by assuming zi ⊥ (vi, εi) and joint normality.
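To see what 2SLS recovers under this model, one can simulate from it. The sketch below is a simplified version with assumptions of my own: no covariates Xi, a binary instrument, and arbitrary parameter values. Because both potential treatments and both potential outcomes are generated, the ATE and the complier LATE can be computed directly and compared with the Wald (2SLS) estimate.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400_000

z = rng.integers(0, 2, size=n)                # binary instrument, independent of (v, eps)
v = rng.normal(size=n)
eps = 0.5 * v + np.sqrt(0.75) * rng.normal(size=n)   # jointly normal, Corr(v, eps) = 0.5

delta0, delta1 = 0.0, 1.0                     # assumed first-stage index parameters
beta0, beta1 = 0.3, -0.5                      # assumed outcome index parameters

# Potential treatments and potential outcomes from the latent-index model.
d0 = (delta0 > v).astype(float)
d1 = (delta0 + delta1 > v).astype(float)
y0 = (beta0 > eps).astype(float)
y1 = (beta0 + beta1 > eps).astype(float)

d = np.where(z == 1, d1, d0)                  # observed treatment
y = np.where(d == 1, y1, y0)                  # observed outcome

ate = np.mean(y1 - y0)                        # average effect in the whole population
late = np.mean((y1 - y0)[d1 > d0])            # average effect among compliers

# With one binary instrument and no covariates, 2SLS is the Wald estimator.
wald = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())

print(f"ATE {ate:.3f}  LATE {late:.3f}  Wald/2SLS {wald:.3f}")
```

The Wald estimate tracks the complier LATE on the probability scale without using normality anywhere. Recovering the ATE would require fitting the full bivariate Probit, which leans on the distributional assumption.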

Empirical Comparison: Effect of Third Child on Female Employment

| Specification | 2SLS | Abadie | Biprobit MFX | Biprobit ATE |
|---|---|---|---|---|
| No covariates | −0.138 | −0.138 | −0.138 | −0.139 |
| Some covariates | −0.132 | −0.132 | −0.135 | −0.135 |
| + linear age term | −0.120 | −0.121 | −0.171 | −0.171 |

With no covariates or a limited set of covariates, all four estimators give nearly identical results. When a linear age term is used in place of a dummy, however, the bivariate Probit estimates jump to −0.171 while 2SLS and Abadie stay near −0.12. This reflects extrapolation into sparse cells, which is exactly the fragility that nonlinear models introduce.

Bottom line: 2SLS is robust to functional form. Bivariate Probit can give you ATE instead of LATE, but at the cost of strong distributional assumptions that may not hold. In practice, the two usually agree unless the Probit model is extrapolating.

4.6.4 The Bias of 2SLS

OLS Is Unbiased, 2SLS Is Not

OLS is unbiased for the population regression coefficient in any sample size (although that coefficient need not be the causal effect). 2SLS is only consistent: it converges to the causal effect in large samples but can be systematically off in finite samples.

The Bias Formula

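E[β̂2SLS − β] ≈ (σεη / ση²) × 1/(F + 1)

where σεη / ση² is the OLS bias when the first stage is zero (ε is the structural error, η the first-stage error), and F is the population analog of the first-stage F-statistic on the excluded instruments.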

Key Implications

| Scenario | 2SLS Bias |
|---|---|
| F → ∞ (strong instruments) | Bias → 0 ✓ |
| F → 0 (no first stage) | Bias → OLS bias (worst case) |
| More weak instruments (higher q) | F falls → Bias increases |
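As a quick worked example of the 1/(F + 1) factor, the snippet below tabulates the approximate 2SLS bias as a share of the OLS bias for a few values of F. The function name relative_bias and the chosen F values are illustrative; the formula is an approximation stated in terms of the population F, so the numbers are a rough guide rather than exact bias figures.

```python
def relative_bias(F):
    """Approximate 2SLS bias as a fraction of the OLS bias: 1 / (F + 1)."""
    return 1.0 / (F + 1.0)

# Illustrative values of F, including the conventional threshold of 10.
shares = {F: relative_bias(F) for F in (0.0, 1.0, 10.0, 100.0)}

for F, share in shares.items():
    print(f"F = {F:6.1f}  ->  2SLS bias is about {100 * share:5.1f}% of the OLS bias")
```

At F = 10 the approximate 2SLS bias is about 9% of the OLS bias (1/11), which gives a sense of why 10 is a popular threshold.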

Source of Bias

The bias arises because the first stage is estimated, not known. Fitted values ŝi = Zπ̂ contain sampling error (Pzη) that is correlated with the second-stage error ε. When instruments are weak, this sampling correlation dominates, pulling 2SLS toward OLS.

LIML: A Bias-Reducing Alternative

LIML (Limited Information Maximum Likelihood) is approximately median-unbiased even with over-identification, while having the same large-sample distribution as 2SLS.

  • LIML is essentially a bias-corrected linear combination of OLS and 2SLS
  • Available in Stata and SAS
  • Monte Carlo evidence (Flores-Lagunes 2007) supports LIML across a wide range of scenarios

Monte Carlo Evidence

| Setup (true β=1) | OLS Median | 2SLS Median | LIML Median |
|---|---|---|---|
| q=2 (1 useful + 1 useless) | ~1.79 | 1.07 | ~1.0 |
| q=20 (1 useful + 19 useless) | ~1.79 | 1.53 | ~1.0 |
| q=20 (all useless) | ~1.79 | ~1.79 | widely dispersed |

LIML stays centered on β=1 even with many weak instruments, while 2SLS is pulled toward OLS. With truly irrelevant instruments, LIML's wide distribution correctly reflects the lack of information.
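A Monte Carlo along these lines can be sketched in a few lines of NumPy. The design below is an assumption chosen for illustration (one instrument with first-stage coefficient 0.1, nineteen irrelevant ones, error correlation 0.8, n = 1000, 500 replications) and is not claimed to be the exact design behind the table. OLS, 2SLS and LIML are computed as k-class estimators with k = 0, k = 1 and k equal to the LIML eigenvalue.

```python
import numpy as np

def simulate_once(rng, n=1000, q=20, pi1=0.1, beta=1.0, corr=0.8):
    """One draw of (OLS, 2SLS, LIML) for y = beta * s + eps with q instruments."""
    Z = rng.normal(size=(n, q))
    eta = rng.normal(size=n)
    eps = corr * eta + np.sqrt(1 - corr ** 2) * rng.normal(size=n)
    s = pi1 * Z[:, 0] + eta                  # only the first instrument is relevant
    y = beta * s + eps

    # Demeaning partials out the constant, the only exogenous covariate here.
    Z = Z - Z.mean(axis=0)
    W = np.column_stack([y - y.mean(), s - s.mean()])
    resid = W - Z @ np.linalg.lstsq(Z, W, rcond=None)[0]   # M_Z W
    A = W.T @ W                              # cross-products of (y, s)
    B = resid.T @ resid                      # same, after projecting out Z
    kappa = np.linalg.eigvals(np.linalg.solve(B, A)).real.min()

    def k_class(k):
        return (A[1, 0] - k * B[1, 0]) / (A[1, 1] - k * B[1, 1])

    return k_class(0.0), k_class(1.0), k_class(kappa)      # OLS, 2SLS, LIML

rng = np.random.default_rng(7)
draws = np.array([simulate_once(rng) for _ in range(500)])
med_ols, med_2sls, med_liml = np.median(draws, axis=0)

print(f"median OLS  {med_ols:.2f}")
print(f"median 2SLS {med_2sls:.2f}")
print(f"median LIML {med_liml:.2f}")
```

Medians are reported rather than means because LIML has no finite moments, so a handful of extreme draws would dominate an average. In this design the 2SLS median should sit between the LIML and OLS medians.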

Practical Recommendations

  1. Report the first stage. Check sign, magnitude, and plausibility.
  2. Report the F-statistic on excluded instruments. Rule of thumb: F > 10 is safe (Stock, Wright & Yogo 2002), though not an absolute theorem.
  3. Report just-identified estimates using your single best instrument. Just-identified IV is approximately median-unbiased and therefore much less exposed to the weak- and many-instruments problem.
  4. Compare 2SLS and LIML. If they agree, be reassured. If they disagree, worry — and look for stronger instruments.
  5. Look at the reduced form. The reduced-form regression (y on z) is OLS and therefore unbiased. If you can't see the causal relation in the reduced form, it's probably not there.
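Several items on this list can be computed with plain least squares. The sketch below uses simulated data and a hypothetical helper, first_stage_F, which computes the classical (homoskedastic) F-statistic for the excluded instruments from restricted and unrestricted residual sums of squares. It also confirms that a just-identified IV estimate is the ratio of the reduced form to the first stage.

```python
import numpy as np

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

def first_stage_F(s, Z_excl, X_incl):
    """Classical F-statistic for the excluded instruments in the first stage."""
    def rss(X):
        return np.sum((s - X @ ols(X, s)) ** 2)
    X_full = np.column_stack([X_incl, Z_excl])
    q = Z_excl.shape[1]                       # number of excluded instruments
    dof = len(s) - X_full.shape[1]
    return ((rss(X_incl) - rss(X_full)) / q) / (rss(X_full) / dof)

# Simulated data (assumed design: one instrument, true effect 1).
rng = np.random.default_rng(6)
n = 5_000
z = rng.normal(size=n)
eta = rng.normal(size=n)
eps = 0.8 * eta + 0.6 * rng.normal(size=n)
s = 0.3 * z + eta
y = 1.0 * s + eps

const = np.ones((n, 1))
Z = np.column_stack([const, z])
X = np.column_stack([const, s])

first_stage = ols(Z, s)[1]                    # item 1: sign and magnitude
F = first_stage_F(s, z[:, None], const)       # item 2: F on the excluded instrument
iv_est = np.linalg.solve(Z.T @ X, Z.T @ y)[1] # item 3: just-identified IV
reduced_form = ols(Z, y)[1]                   # item 5: y on z

print(first_stage, F, iv_est, reduced_form, reduced_form / first_stage)
```

With real data, a heteroskedasticity-robust or clustered F is often preferable to this classical version; the helper is only meant to show what the statistic measures.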

Application: Angrist & Krueger (1991) Revisited

| Instruments | q | F-stat | 2SLS | LIML |
|---|---|---|---|---|
| 3 QOB dummies | 3 | 32.3 | 0.105 (0.020) | 0.106 (0.020) |
| QOB×YOB interactions | 30 | 4.9 | 0.089 (0.016) | 0.093 (0.018) |
| + QOB×SOB interactions | 180 | 2.6 | 0.093 (0.009) | 0.091 (0.011) |

With 3 instruments and F=32, 2SLS and LIML agree closely. With 180 instruments and F=2.6, the F-statistic is low but LIML still agrees with 2SLS, suggesting the bias may not be fatal here despite the mechanical rule of thumb.

Part 3 Summary

| Topic | Key Lesson |
|---|---|
| Covariate ambivalence | Same covariates in both stages; otherwise the first-stage residual can be correlated with covariates left out of the first stage |
| Forbidden regression | Never plug nonlinear fitted values into a second stage; use them as instruments instead |
| Peer effects (Type 1) | OLS estimates of externalities are confounded by OLS-vs-IV differences in private returns |
| Peer effects (Type 2) | Regressing individual outcome on group mean of same outcome is tautological; use ex ante peer characteristics |
| 2SLS vs. Bivariate Probit | 2SLS is robust; Biprobit needs normality and is sensitive to how covariates enter the model |
| 2SLS bias | Bias ≈ OLS bias / (F+1); many weak instruments → bias toward OLS |
| LIML | Approximately median-unbiased alternative to 2SLS; use for robustness checks |
| F > 10 rule | Rule of thumb for instrument strength; not an absolute theorem |

The five-point IV checklist:

  1. Report and inspect the first stage
  2. Report the F-statistic (aim for > 10)
  3. Report just-identified estimates with your best instrument
  4. Compare 2SLS and LIML
  5. Check the reduced form — if the causal effect isn't visible there, it's probably not real
This note was written with the assistance of LLM (Claude).