Angrist & Pischke, Mostly Harmless Econometrics — Chapter 7
"Here's a prayer for you... Protect me from knowing what I don't need to know." — Douglas Adams
Suhyeon Lee
95% of applied econometrics is concerned with averages. But many variables have continuous distributions that can change in ways not revealed by averages — they can spread out or compress. Quantile regression lets us model entire distributions, not just means.
Key insight: Just as OLS fits a linear model to the conditional mean, quantile regression fits a linear model to conditional quantiles — allowing us to see whether treatment affects different parts of the distribution differently.
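The starting point is the conditional quantile function (CQF): Qτ(yi | Xi) = F⁻¹(τ | Xi), where F(y | Xi) is the distribution function of yi conditional on Xi.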
| τ value | Meaning |
|---|---|
| τ = 0.10 | Lower decile |
| τ = 0.50 | Median |
| τ = 0.90 | Upper decile |
| | CEF (OLS) | CQF (Quantile Reg) |
|---|---|---|
| Solves | min E[(y - m(X))²] | min E[ρτ(y - q(X))] |
| Loss function | Squared error | Check function ρτ |
| Estimates | Conditional mean | Conditional quantile |
The check function ρτ(u) = u·(τ - 1[u < 0]) weights positive and negative residuals asymmetrically: a positive residual gets weight τ, a negative one gets weight 1 - τ.
| τ | Weight on positive | Weight on negative | Result |
|---|---|---|---|
| 0.5 | 0.5 | 0.5 | Median (LAD) |
| 0.9 | 0.9 | 0.1 | Upper quantile |
| 0.1 | 0.1 | 0.9 | Lower quantile |
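A minimal sketch of the check function and of the fact that minimizing it picks out a sample quantile. The five data values are made up for illustration, and `check_loss` / `total_check_loss` are hypothetical helper names, not from the book.

```python
def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_check_loss(q, ys, tau):
    """Total check loss when q is used as the candidate tau-quantile."""
    return sum(check_loss(y - q, tau) for y in ys)


# Toy data with one large outlier (illustrative only).
ys = [1.0, 2.0, 3.0, 4.0, 100.0]

# A minimizer can always be found among the data points themselves.
median_fit = min(ys, key=lambda q: total_check_loss(q, ys, 0.5))
upper_fit = min(ys, key=lambda q: total_check_loss(q, ys, 0.9))

print(check_loss(2.0, 0.9))    # positive residual: weight tau
print(check_loss(-2.0, 0.9))   # negative residual: weight 1 - tau
print(median_fit, upper_fit)   # the median ignores the outlier; the 0.9 fit does not
```

With τ = 0.5 the loss is proportional to the absolute residual, which is why the median fit is also called least absolute deviations (LAD).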
Case 1, homoskedastic (location shift):
Model: yi ~ N(Xi'β, σ²)
CQF: Qτ(yi | Xi) = Xi'β + σ·Φ⁻¹(τ)
Key feature: Slope β is identical across all quantiles. Only the intercept changes with τ.
Case 2, heteroskedastic (location-scale):
Model: yi ~ N(Xi'β, (Xi'γ)²), with Xi'γ > 0
CQF: Qτ(yi | Xi) = Xi'[β + γ·Φ⁻¹(τ)]
Key feature: Slope varies with τ. When γ > 0, upper quantiles have larger coefficients → inequality increases with X.
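Both normal models can be checked numerically. The sketch below takes Xi = (1, x), so the homoskedastic model is the special case γ = (σ, 0). All coefficient values are hypothetical, and `quantile_coefs` is an illustrative helper name.

```python
from statistics import NormalDist


def quantile_coefs(beta, gamma, tau):
    """Coefficients of Q_tau(y | X) = X'[beta + gamma * Phi^{-1}(tau)]."""
    z = NormalDist().inv_cdf(tau)  # standard normal quantile
    return [b + g * z for b, g in zip(beta, gamma)]


beta = (1.0, 0.10)           # (intercept, slope), made-up values
gamma_homo = (0.5, 0.0)      # X'gamma = sigma = 0.5: constant variance
gamma_hetero = (0.5, 0.02)   # the standard deviation grows with x

taus = (0.10, 0.50, 0.90)
homo = {tau: quantile_coefs(beta, gamma_homo, tau) for tau in taus}
hetero = {tau: quantile_coefs(beta, gamma_hetero, tau) for tau in taus}

for tau in taus:
    print(tau, "homo slope:", round(homo[tau][1], 4),
          "hetero slope:", round(hetero[tau][1], 4))
```

In the first model only the intercept moves with τ. In the second the slope rises with τ, because the spread of y grows with x.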
Data: 1980, 1990, 2000 U.S. Census. White/Black men aged 40-49. Controls: race, quadratic in potential experience. Entries are the schooling coefficient at each quantile; the last column is the OLS estimate.
| Census | 0.10 | 0.25 | 0.50 | 0.75 | 0.90 | OLS |
|---|---|---|---|---|---|---|
| 1980 | .074 | .074 | .068 | .070 | .079 | .072 |
| 1990 | .112 | .110 | .106 | .111 | .137 | .114 |
| 2000 | .092 | .105 | .111 | .120 | .157 | .114 |
1980: Coefficients similar across quantiles (~0.07) → Location shift
2000: Upper decile (15.7%) >> Lower decile (9.2%) → Fanning out
Interpretation: "Among the educated, the rich get even richer" — education increases both mean wages and inequality.
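One quick way to read the table is the gap between the upper-decile and lower-decile schooling coefficients in each census. The short calculation below uses only the table's numbers; the 90-10 gap is an illustrative summary, not a statistic reported in the chapter.

```python
# Schooling coefficients at the 0.10 and 0.90 quantiles, copied from the table.
returns = {
    1980: {0.10: 0.074, 0.90: 0.079},
    1990: {0.10: 0.112, 0.90: 0.137},
    2000: {0.10: 0.092, 0.90: 0.157},
}

# 90-10 gap in the return to schooling, by census year.
gaps = {year: round(q[0.90] - q[0.10], 3) for year, q in returns.items()}

for year, gap in gaps.items():
    print(year, gap)
```

A gap near zero is what a location shift looks like. A gap that widens across census years is the fanning out described above.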
Problem: Some outcome values are only recorded up to a limit (e.g., CPS top-coding, duration censoring).
Key insight: Censoring from above doesn't affect quantiles below the censoring point.
Example: If the top 10% of the distribution is censored, estimates for τ ≤ 0.90 are unaffected.
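A small simulation of that claim for an unconditional distribution. The earnings draws are simulated (lognormal with made-up parameters), the cap is set at the sample's 0.90 quantile, and `sample_quantile` is a hypothetical helper. In a regression the same logic applies to conditional quantiles, cell by cell.

```python
import math
import random


def sample_quantile(values, tau):
    """Order statistic at rank ceil(tau * n): a simple sample quantile."""
    ordered = sorted(values)
    rank = max(1, math.ceil(tau * len(ordered)))
    return ordered[rank - 1]


random.seed(0)
earnings = [random.lognormvariate(10.0, 0.8) for _ in range(1000)]

# Top-code everything above the 0.90 sample quantile.
cap = sample_quantile(earnings, 0.90)
top_coded = [min(y, cap) for y in earnings]

for tau in (0.10, 0.50, 0.90, 0.95):
    print(tau, round(sample_quantile(earnings, tau)),
          round(sample_quantile(top_coded, tau)))

# The mean, by contrast, is pulled down by top-coding.
print(round(sum(earnings) / 1000), round(sum(top_coded) / 1000))
```

Quantiles at or below the cap are identical in the two samples. Only the quantile above the cap, and the mean, change.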
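Powell (1986) solution: model the censored CQF as Qτ(yi | Xi) = min(c, Xi'βτ), where c is the censoring point, and estimate βτ by minimizing E[ρτ(yi - min(c, Xi'b))] over b.
Buchinsky (1994) iterative algorithm:
1. Estimate βτ ignoring the censoring.
2. Find the cells with Xi'βτ < c.
3. Re-estimate the quantile regression using only those cells.
4. Repeat steps 2 and 3 until the set of cells stops changing.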
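A rough sketch of the iterative idea: fit ignoring the censoring, keep only the cells whose fitted quantile lies below the censoring point c, refit on those cells, and repeat until the set of cells stops changing. This is an illustration under strong simplifications, not Buchinsky's implementation. It uses one regressor with three cells, made-up data, and a brute-force solver (`fit_qr_line`, a hypothetical helper) that is only practical for tiny samples.

```python
from itertools import combinations


def check_loss(u, tau):
    """Check function: weight tau on positive residuals, 1 - tau on negative."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def fit_qr_line(points, tau):
    """Brute-force quantile regression of y on (1, x).

    Some minimizer of the check loss passes through two data points,
    so it is enough to try every pair with distinct x values.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        loss = sum(check_loss(y - (a + b * x), tau) for x, y in points)
        if best is None or loss < best[0]:
            best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def buchinsky_sketch(points, c, tau, max_iter=20):
    """Iterate between selecting uncensored cells and refitting."""
    a, b = fit_qr_line(points, tau)            # fit ignoring the censoring
    kept = None
    for _ in range(max_iter):
        new_kept = sorted({x for x, _ in points if a + b * x < c})
        if new_kept == kept:                   # the cell set stopped changing
            break
        kept = new_kept
        subset = [(x, y) for x, y in points if x in kept]
        a, b = fit_qr_line(subset, tau)        # refit on the kept cells only
    return a, b, kept


# Made-up data: the latent conditional median is 1 + 2x, top-coded at c = 6.
c = 6.0
latent = [(x, 1.0 + 2.0 * x + e) for x in (1, 2, 3) for e in (-1.0, 0.0, 1.0)]
observed = [(x, min(y, c)) for x, y in latent]

naive = fit_qr_line(observed, 0.5)
a_hat, b_hat, cells = buchinsky_sketch(observed, c, 0.5)
print("naive (intercept, slope):", naive)
print("iterated (intercept, slope):", (a_hat, b_hat), "cells kept:", cells)
```

In this toy example the naive median fit is flattened by the top-coded cell, while the iterated fit drops that cell and recovers the latent median line.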
"Training raised the lower decile" ≠ "Poor people became richer"
Quantile regression tells us about the distribution's shape, not about specific individuals. Unless we assume rank preservation (treatment doesn't change ranks), we can't interpret effects as individual-level.
For means: E[y | X] = X'β ⟹ E[y] = E[X]'β ✓
For quantiles: Qτ(y | X) = X'βτ ⟹ Qτ(y) ≠ E[X]'βτ ✗
Quantiles are nonlinear operators. Extracting marginal quantiles requires integrating over the entire distribution of X (Machado & Mata, 2005).
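To see the failure concretely, take a location-scale normal model with a scalar regressor that has two equally likely values, and compare the plug-in E[X]·βτ with the true marginal quantile, obtained by averaging the conditional CDFs over X and inverting. The model and all numbers are hypothetical. Machado and Mata work by simulation; the exact integration here is a simplification that is possible only because X has two points.

```python
from statistics import NormalDist

beta, gamma = 1.0, 0.5      # hypothetical: y | x ~ N(x*beta, (x*gamma)^2)
x_values = (1.0, 3.0)       # X takes each value with probability 1/2
tau = 0.90

z_tau = NormalDist().inv_cdf(tau)
beta_tau = beta + gamma * z_tau         # conditional quantile coefficient
mean_x = sum(x_values) / len(x_values)
plug_in = mean_x * beta_tau             # the tempting shortcut E[X] * beta_tau


def marginal_cdf(y):
    """F(y) = E_X[F(y | X)]: average the conditional CDFs over X."""
    cdfs = [NormalDist(x * beta, x * gamma).cdf(y) for x in x_values]
    return sum(cdfs) / len(cdfs)


def marginal_quantile(tau, lo=-50.0, hi=50.0):
    """Invert the marginal CDF by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if marginal_cdf(mid) < tau:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


true_q = marginal_quantile(tau)
print("E[X] * beta_tau:   ", round(plug_in, 3))
print("marginal quantile: ", round(true_q, 3))
print("mean check, E[X] * beta:", mean_x * beta)  # means do aggregate linearly
```

The two numbers differ by a wide margin at τ = 0.90, even though the conditional model is exactly linear in x.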
Just like OLS, quantile regression suffers from omitted variable bias when treatment is endogenous.
| | Exogenous d | Endogenous d |
|---|---|---|
| Mean | OLS | 2SLS |
| Quantile | QR | QTE |
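Abadie, Angrist, and Imbens (2002) extend the LATE framework to quantiles, modeling the CQF for compliers as Qτ(yi | Xi, di, complier) = ατ·di + Xi'βτ.
ατ = effect on τ-quantile for compliers
Abadie kappa: κi = 1 - di(1 - zi)/(1 - P(zi = 1 | Xi)) - (1 - di)zi/P(zi = 1 | Xi)
Properties: E[κ | complier] = 1, E[κ | non-complier] = 0, so weighting by κ picks out the compliers on average.
QTE Estimator: (ατ, βτ) = argmin over (a, b) of E[κi·ρτ(yi - a·di - Xi'b)]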
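A simulation sketch of how kappa weighting picks out compliers, using the weight κ = 1 - d(1 - z)/(1 - p) - (1 - d)z/p with p = P(z = 1). Everything in the setup is an assumption made for illustration: no covariates, a coin-flip instrument, made-up type shares, and a made-up "trait" that differs by type.

```python
import random


def kappa(d, z, p):
    """Abadie's kappa for treatment d, instrument z, and p = P(z = 1)."""
    return 1.0 - d * (1 - z) / (1.0 - p) - (1 - d) * z / p


random.seed(0)
p = 0.5                                                  # randomized offer
shares = {"complier": 0.6, "always": 0.1, "never": 0.3}  # hypothetical mix
trait = {"complier": 1.0, "always": 5.0, "never": -3.0}  # hypothetical trait

rows = []
for _ in range(50000):
    kind = random.choices(list(shares), weights=list(shares.values()))[0]
    z = 1 if random.random() < p else 0
    # Compliers follow the offer; always-takers and never-takers ignore it.
    d = z if kind == "complier" else (1 if kind == "always" else 0)
    rows.append((kappa(d, z, p), trait[kind]))

mean_kappa = sum(k for k, _ in rows) / len(rows)
raw_mean = sum(w for _, w in rows) / len(rows)
complier_mean = sum(k * w for k, w in rows) / sum(k for k, _ in rows)

print("mean kappa (about the complier share):", round(mean_kappa, 3))
print("raw mean of trait:", round(raw_mean, 3))
print("kappa-weighted mean of trait:", round(complier_mean, 3))
```

Individual κ values equal -1 whenever d ≠ z, so the raw weights are not all positive. They only average out to the complier population.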
Setting: Job Training Partnership Act (1980s US). z = randomized training offer, d = actual participation (~60%), y = 30-month earnings.
| | OLS | τ=0.15 | τ=0.25 | τ=0.50 | τ=0.75 | τ=0.85 |
|---|---|---|---|---|---|---|
| Training | 3,754 | 1,187 | 2,510 | 4,420 | 4,678 | 4,806 |
| % Impact | 21% | 136% | 75% | 35% | 17% | 13% |
| | 2SLS | τ=0.15 | τ=0.25 | τ=0.50 | τ=0.75 | τ=0.85 |
|---|---|---|---|---|---|---|
| Training | 1,593 | 121 | 702 | 1,544 | 3,131 | 3,378 |
| % Impact | 9% | 5% | 12% | 10% | 11% | 9% |
Key finding: QR shows huge effect at τ=0.15 ($1,187, 136%). But QTE shows nearly zero ($121, 5%)!
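Interpretation: The gap points to positive selection at the bottom of the distribution: low earners who take up training would likely have done better anyway (for example, because they are more motivated), which inflates QR estimates at lower quantiles. The QTE estimates suggest the training effect was concentrated at the upper quantiles.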
Q: How does QR differ from OLS, and when should you use it?
A: OLS estimates conditional means; QR estimates conditional quantiles. Use QR when: (1) analyzing inequality, (2) detecting heterogeneous effects, (3) distinguishing location shift vs fanning out, (4) robustness to outliers.
Q: What does it mean when quantile coefficients differ across τ?
A: Identical coefficients → location shift (distribution shifts uniformly). Increasing coefficients → fanning out (inequality increases with X). 2000 Census: upper decile return (15.7%) >> lower decile (9.2%) → education increases inequality.
Q: Why might QR estimates be biased, and how does QTE fix this?
A: QR suffers from selection bias when treatment is endogenous. QTE applies IV logic: it uses the Abadie kappa to weight observations so that, on average, only compliers contribute. JTPA example: the estimated effect at τ = 0.15 dropped from $1,187 to $121 (about a 90% reduction) after correcting for selection.
| Method | Estimand | Selection bias | Distribution |
|---|---|---|---|
| OLS | E[y \| X, d] | Present | Mean only |
| 2SLS | E[y \| X, d] for compliers | Removed | Mean only |
| QR | Qτ(y \| X, d) | Present | Full distribution |
| QTE | Qτ(y \| X, d) for compliers | Removed | Full distribution |