Chapter 7: Quantile Regression

Angrist & Pischke, Mostly Harmless Econometrics — Chapter 7

"Here's a prayer for you... Protect me from knowing what I don't need to know." — Douglas Adams

Core Message

95% of applied econometrics is concerned with averages. But many variables have continuous distributions that can change in ways not revealed by averages — they can spread out or compress. Quantile regression lets us model entire distributions, not just means.

Key insight: Just as OLS fits a linear model to the conditional mean, quantile regression fits a linear model to conditional quantiles — allowing us to see whether treatment affects different parts of the distribution differently.

7.1 The Quantile Regression Model

Conditional Quantile Function (CQF)

The starting point is the conditional quantile function:

Qτ(yi | Xi) = F_y⁻¹(τ | Xi)

τ value     Meaning
τ = 0.10    Lower decile
τ = 0.50    Median
τ = 0.90    Upper decile

CEF vs CQF

                 CEF (OLS)             CQF (Quantile Reg)
Solves           min E[(y - m(X))²]    min E[ρτ(y - q(X))]
Loss function    Squared error         Check function ρτ
Estimates        Conditional mean      Conditional quantile

The Check Function

The check function weights positive and negative residuals asymmetrically:

ρτ(u) = u · (τ - 1(u ≤ 0))
       = τ·u       if u > 0
       = (τ-1)·u  if u ≤ 0

τ      Weight on positive residuals (τ)    Weight on negative residuals (1 - τ)    Result
0.5    0.5                                 0.5                                     Median (LAD)
0.9    0.9                                 0.1                                     Upper quantile
0.1    0.1                                 0.9                                     Lower quantile
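
To make the asymmetric weighting concrete, here is a minimal sketch showing that the constant minimizing the total check loss is a sample quantile, while the constant minimizing squared error is the mean. The data and the helper name best_constant are illustrative assumptions, not part of the chapter.

```python
def check_loss(u, tau):
    """Check function: rho_tau(u) = u * (tau - 1(u <= 0))."""
    return u * (tau - (1.0 if u <= 0 else 0.0))


def best_constant(ys, tau):
    """Data value q that minimizes the total check loss sum(rho_tau(y - q))."""
    return min(sorted(ys), key=lambda q: sum(check_loss(y - q, tau) for y in ys))


ys = [1.0, 2.0, 3.0, 4.0, 100.0]     # illustrative data with one outlier
median_fit = best_constant(ys, 0.5)  # symmetric weights: the median (LAD)
upper_fit = best_constant(ys, 0.9)   # positive residuals cost 0.9, negative 0.1
mean_fit = sum(ys) / len(ys)         # squared-error minimizer, for comparison
print(median_fit, upper_fit, mean_fit)  # 3.0 100.0 22.0
```

The outlier moves the mean to 22 but leaves the median at 3, which is the robustness property usually associated with LAD.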

Location Shift vs Heteroskedasticity

Case 1: Location Shift (Homoskedastic)

Model: yi ~ N(Xi'β, σ²)

CQF: Qτ(yi | Xi) = Xi'β + σ·Φ⁻¹(τ)

Key feature: Slope β is identical across all quantiles. Only the intercept changes with τ.

Case 2: Heteroskedasticity (Location-Scale Model)

Model: yi ~ N(Xi'β, (Xi'γ)²)

CQF: Qτ(yi | Xi) = Xi'[β + γ·Φ⁻¹(τ)]

Key feature: Slope varies with τ. Because Φ⁻¹(τ) increases in τ, a positive γ means upper quantiles have larger coefficients → inequality increases with X.
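
The two cases can be compared numerically. The sketch below evaluates the coefficient vector β + γ·Φ⁻¹(τ) for a model with an intercept and one regressor; Case 1 is obtained by setting the slope element of γ to zero, so that Xi'γ = σ. All parameter values are hypothetical.

```python
from statistics import NormalDist


def cqf_coefficients(beta, gamma, tau):
    """Coefficients beta + gamma * Phi^{-1}(tau) of the tau-th conditional quantile."""
    z = NormalDist().inv_cdf(tau)  # standard normal quantile
    return [b + g * z for b, g in zip(beta, gamma)]


beta = [1.0, 0.10]          # hypothetical (intercept, slope) of the conditional mean
gamma_shift = [0.5, 0.0]    # Case 1: X'gamma = 0.5 for every x, so sigma is constant
gamma_scale = [0.5, 0.02]   # Case 2: the standard deviation rises with x

taus = (0.10, 0.50, 0.90)
shift_slopes = {tau: cqf_coefficients(beta, gamma_shift, tau)[1] for tau in taus}
scale_slopes = {tau: cqf_coefficients(beta, gamma_scale, tau)[1] for tau in taus}
print("location shift:", shift_slopes)   # same slope at every tau
print("location-scale:", scale_slopes)   # slope rises with tau
```

In the first case only the intercept moves with τ; in the second the slope at τ = 0.90 exceeds the slope at τ = 0.10 by 0.02·[Φ⁻¹(0.90) - Φ⁻¹(0.10)].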

Empirical Example: Returns to Schooling (Table 7.1.1)

Data: 1980, 1990, 2000 U.S. Census. White/Black men aged 40-49. Controls: race, quadratic in potential experience.

Schooling coefficient by quantile τ, with OLS for comparison:

Census    τ=0.10    τ=0.25    τ=0.50    τ=0.75    τ=0.90    OLS
1980      .074      .074      .068      .070      .079      .072
1990      .112      .110      .106      .111      .137      .114
2000      .092      .105      .111      .120      .157      .114

1980: Coefficients similar across quantiles (~0.07) → Location shift

2000: Upper decile (15.7%) >> Lower decile (9.2%) → Fanning out

Interpretation: "Among the educated, the rich get even richer" — education increases both mean wages and inequality.
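
One simple way to summarize the fanning out is the gap between the τ = 0.90 and τ = 0.10 schooling coefficients in each census. The snippet below only redoes that arithmetic on the Table 7.1.1 values quoted above; the variable names are illustrative.

```python
# Schooling coefficients at the lower and upper deciles, from Table 7.1.1
schooling = {
    1980: {0.10: 0.074, 0.90: 0.079},
    1990: {0.10: 0.112, 0.90: 0.137},
    2000: {0.10: 0.092, 0.90: 0.157},
}

# 90-10 gap in the return to schooling, by census year
spread = {year: coefs[0.90] - coefs[0.10] for year, coefs in schooling.items()}
for year, gap in spread.items():
    print(year, round(gap, 3))
```

The gap grows from about half a percentage point in 1980 to 6.5 percentage points in 2000.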

7.1.1 Censored Quantile Regression

Problem: Some outcomes are observed only up to a limit (e.g., CPS top-coding, duration censoring).

Key insight: Censoring from above doesn't affect quantiles below the censoring point.

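Example: If the top 10% of outcomes is top-coded → quantiles of that distribution at τ ≤ 0.90 are unaffected. For conditional quantiles, the same holds wherever X'βτ lies below the censoring point c.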
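
A small simulation illustrates the point. It assumes hypothetical lognormal "earnings" and an order-statistic definition of the sample quantile; neither comes from the chapter.

```python
import math
import random


def sample_quantile(ys, tau):
    """Order-statistic quantile: the k-th smallest value with k = ceil(tau * n)."""
    ordered = sorted(ys)
    k = max(1, math.ceil(tau * len(ordered) - 1e-9))  # guard against float round-off
    return ordered[k - 1]


random.seed(0)
wages = [random.lognormvariate(0.0, 1.0) for _ in range(1000)]  # hypothetical latent earnings
c = sample_quantile(wages, 0.90)           # top-code at the 90th percentile
topcoded = [min(w, c) for w in wages]      # values above c are recorded as c

for tau in (0.10, 0.50, 0.90, 0.95):
    print(tau, sample_quantile(wages, tau), sample_quantile(topcoded, tau))
print("means:", sum(wages) / len(wages), sum(topcoded) / len(topcoded))
```

Quantiles up to τ = 0.90 are identical in the two samples, the τ = 0.95 quantile collapses to c, and the mean falls.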

Powell (1986) solution:

  • Model: Qτ(y | X) = min(c, X'βτ)
  • Only use observations where X'βτ < c

Buchinsky (1994) iterative algorithm:

  1. Estimate β̂τ ignoring censoring
  2. Find cells with X'β̂τ < c
  3. Re-estimate using only those cells
  4. Repeat until convergence
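
The four steps above can be sketched for a model with an intercept and a single regressor. This is a toy illustration, not Buchinsky's implementation: fit_line_qr is a hypothetical brute-force fitter that searches lines through pairs of data points (a quantile regression with two parameters has an optimum of that form), the data are simulated, and the loop stops when the kept set no longer changes or after max_iter passes, since convergence is not guaranteed.

```python
import random
from itertools import combinations


def check_loss(u, tau):
    return u * (tau - (1.0 if u <= 0 else 0.0))


def fit_line_qr(xs, ys, tau):
    """tau-quantile fit of y = a + b*x, searching lines through pairs of points."""
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # two points with the same x do not define a line
        b = (ys[j] - ys[i]) / (xs[j] - xs[i])
        a = ys[i] - b * xs[i]
        loss = sum(check_loss(y - a - b * x, tau) for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def censored_qr(xs, ys, c, tau, max_iter=20):
    """Fit, keep observations whose fitted quantile is below c, refit, repeat."""
    keep = list(range(len(xs)))
    a, b = fit_line_qr(xs, ys, tau)                 # step 1: ignore censoring
    for _ in range(max_iter):
        new_keep = [i for i in range(len(xs)) if a + b * xs[i] < c]  # step 2
        if new_keep == keep:
            return a, b, True                       # step 4: kept set is stable
        if len({xs[i] for i in new_keep}) < 2:
            return a, b, False                      # too little data left to refit
        keep = new_keep
        a, b = fit_line_qr([xs[i] for i in keep], [ys[i] for i in keep], tau)  # step 3
    return a, b, False


random.seed(1)
xs = [i / 6 for i in range(60)]                     # hypothetical regressor on [0, 10)
latent = [1.0 + 2.0 * x + random.gauss(0.0, 1.0) for x in xs]
c = 15.0
ys = [min(y, c) for y in latent]                    # top-coded outcome
print("naive median fit:   ", fit_line_qr(xs, ys, 0.5))
print("censored median fit:", censored_qr(xs, ys, c, 0.5))
```

The third element of the returned tuple reports whether the kept set stabilized. The pairwise search is cubic in the sample size and is only meant for small illustrations; a linear-programming quantile regression routine would replace it in practice.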

7.1.3 Tricky Points

Tricky Point 1: Individual vs Distributional Effects

"Training raised the lower decile" ≠ "Poor people became richer"

Quantile regression tells us about the distribution's shape, not about specific individuals. Unless we assume rank preservation (treatment doesn't change ranks), we can't interpret effects as individual-level.

Tricky Point 2: Conditional ≠ Marginal Quantiles

For means: E[y | X] = X'β ⟹ E[y] = E[X]'β ✓

For quantiles: Qτ(y | X) = X'βτ ⇏ Qτ(y) = E[X]'βτ ✗

Quantiles are nonlinear operators. Extracting marginal quantiles requires integrating over the entire distribution of X (Machado & Mata, 2005).
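
A worked example makes the gap visible. Assume, purely for illustration, that x is 0 or 1 with equal probability and y | x ~ N(2x, 1), so the conditional quantile is Φ⁻¹(τ) + 2x. The sketch compares the plug-in value E[X]'βτ with the true marginal quantile, found by averaging the conditional CDF over x and inverting it by bisection. This is not the Machado and Mata procedure itself, only the integration step it relies on.

```python
from statistics import NormalDist

std = NormalDist()
tau = 0.9
beta_tau = [std.inv_cdf(tau), 2.0]     # Q_tau(y | x) = Phi^{-1}(tau) + 2x
mean_x = 0.5                           # x is 0 or 1 with equal probability
plug_in = beta_tau[0] + beta_tau[1] * mean_x   # E[X]'beta_tau


def marginal_cdf(q):
    """P(y <= q): the conditional CDF averaged over the distribution of x."""
    return 0.5 * std.cdf(q) + 0.5 * std.cdf(q - 2.0)


lo, hi = -10.0, 10.0
for _ in range(100):                   # bisection for marginal_cdf(q) = tau
    mid = (lo + hi) / 2
    if marginal_cdf(mid) < tau:
        lo = mid
    else:
        hi = mid
marginal_q = (lo + hi) / 2

print("plug-in E[X]'beta_tau:", round(plug_in, 3))
print("marginal quantile:    ", round(marginal_q, 3))
```

Under these assumptions the plug-in value is about 2.28 while the marginal 0.90 quantile is about 2.85, even though the same plug-in logic works exactly for the mean (E[y] = 2 · 0.5 = 1).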

7.2 Quantile Treatment Effects (QTE)

The Problem: Selection Bias

Just like OLS, quantile regression suffers from omitted variable bias when treatment is endogenous.

            Exogenous d    Endogenous d
Mean        OLS            2SLS
Quantile    QR             QTE

QTE: Extending LATE to Quantiles

Abadie, Angrist, and Imbens (2002) extend the LATE framework to quantiles:

Qτ(y | X, d, complier) = ατ·d + X'βτ

ατ = effect on τ-quantile for compliers

The Abadie Kappa

κi = 1 - di(1-zi)/(1-p(Xi)) - (1-di)zi/p(Xi), where p(Xi) = P(zi = 1 | Xi)

Properties: E[κ | complier] = 1, E[κ | non-complier] = 0
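
These two properties can be checked by direct enumeration. The sketch assumes the instrument is assigned independently of compliance type with P(z = 1) = p and that there are no defiers; the value of p and the function names are illustrative.

```python
def kappa(d, z, p):
    """Abadie kappa for one observation with treatment d, instrument z, and p = P(z=1 | X)."""
    return 1 - d * (1 - z) / (1 - p) - (1 - d) * z / p


def expected_kappa(d_if_z0, d_if_z1, p):
    """E[kappa] for a type that chooses d_if_z0 when z=0 and d_if_z1 when z=1."""
    return (1 - p) * kappa(d_if_z0, 0, p) + p * kappa(d_if_z1, 1, p)


p = 0.6  # hypothetical offer probability
types = {"complier": (0, 1), "always-taker": (1, 1), "never-taker": (0, 0)}
for name, (d0, d1) in types.items():
    print(name, round(expected_kappa(d0, d1, p), 10))

# Individual values can be negative even though the averages are 0 or 1:
print(kappa(1, 0, p))  # a treated observation with z = 0
```

Because single observations can carry negative κ, the κ-weighted check-function objective is not convex, which is why the implementation works with a smoothed and trimmed version of κ.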

QTE Estimator:

(ατ, βτ) = arg min over (α, b) of E[κi · ρτ(yi - α·di - Xi'b)]

QTE Implementation Steps

  1. Probit of z on y and X in the d=1 subsample → save Ê[z | y, d=1, X]
  2. Probit of z on y and X in the d=0 subsample → save Ê[z | y, d=0, X]
  3. Probit of z on X in the full sample → save P̂(z=1 | X)
  4. Compute Ê[κ | y, d, X] = 1 - d(1 - Ê[z | y, d=1, X])/(1 - P̂(z=1 | X)) - (1-d)Ê[z | y, d=0, X]/P̂(z=1 | X); trim to [0, 1]
  5. Run the quantile regression of y on d and X, weighted by Ê[κ | y, d, X]
  6. Bootstrap the entire procedure for standard errors
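
Step 4 is the only step that is not an off-the-shelf regression, so here is a minimal sketch of it. It assumes the three fitted probabilities from the probit steps are already available for an observation; the function name kappa_weight and the numbers passed to it are hypothetical.

```python
def kappa_weight(d, ez_d1, ez_d0, pz):
    """Estimated E[kappa | y, d, X], trimmed to the unit interval.

    d      treatment indicator (0 or 1)
    ez_d1  fitted E[z | y, d=1, X] from the d=1 probit
    ez_d0  fitted E[z | y, d=0, X] from the d=0 probit
    pz     fitted P(z=1 | X) from the full-sample probit
    """
    raw = 1 - d * (1 - ez_d1) / (1 - pz) - (1 - d) * ez_d0 / pz
    return min(1.0, max(0.0, raw))  # fitted values can stray outside [0, 1]


# Hypothetical fitted values for three observations
print(kappa_weight(d=1, ez_d1=0.9, ez_d0=0.0, pz=0.5))  # treated, very likely offered
print(kappa_weight(d=0, ez_d1=0.0, ez_d0=0.2, pz=0.5))  # untreated, probably not offered
print(kappa_weight(d=1, ez_d1=0.3, ez_d0=0.0, pz=0.5))  # raw value is negative, trimmed to 0
```

Only the probit matching the observation's own d enters its weight; the other argument is multiplied by zero. The resulting weights lie in [0, 1], so the weighted quantile regression in step 5 is an ordinary convex problem.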

Empirical Example: JTPA Training (Table 7.2.1)

Setting: Job Training Partnership Act (1980s US). z = randomized training offer, d = actual participation (~60%), y = 30-month earnings.

Panel A: OLS & Quantile Regression (Selection bias present)

                OLS      τ=0.15    τ=0.25    τ=0.50    τ=0.75    τ=0.85
Training ($)    3,754    1,187     2,510     4,420     4,678     4,806
% Impact        21%      136%      75%       35%       17%       13%

Panel B: 2SLS & QTE (Selection bias removed)

                2SLS     τ=0.15    τ=0.25    τ=0.50    τ=0.75    τ=0.85
Training ($)    1,593    121       702       1,544     3,131     3,378
% Impact        9%       5%        12%       10%       11%       9%

Key finding: QR shows huge effect at τ=0.15 ($1,187, 136%). But QTE shows nearly zero ($121, 5%)!

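Interpretation: The QR estimates are contaminated by positive selection bias, which is largest in relative terms at the lower quantiles (one possible story: low-earning participants are the more motivated ones). Once selection is removed, the dollar gains from JTPA are concentrated in the upper quantiles.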
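
The size of the correction at each quantile follows directly from the two panels. The snippet below recomputes it from the Table 7.2.1 dollar figures quoted above; the variable names are illustrative.

```python
# Training coefficients in dollars, from Table 7.2.1
qr = {0.15: 1187, 0.25: 2510, 0.50: 4420, 0.75: 4678, 0.85: 4806}   # Panel A
qte = {0.15: 121, 0.25: 702, 0.50: 1544, 0.75: 3131, 0.85: 3378}    # Panel B

# Share of the QR estimate that disappears once selection is removed
reduction = {tau: 1 - qte[tau] / qr[tau] for tau in qr}
for tau, share in reduction.items():
    print(f"tau={tau:.2f}: QR {qr[tau]:>5}, QTE {qte[tau]:>5}, reduction {share:.0%}")
```

The reduction is about 90% at τ = 0.15 and shrinks steadily to about 30% at τ = 0.85.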

Three Key Questions

Q1. Quantile Regression vs OLS

Q: How does QR differ from OLS, and when should you use it?

A: OLS estimates conditional means; QR estimates conditional quantiles. Use QR when: (1) analyzing inequality, (2) detecting heterogeneous effects, (3) distinguishing location shift from fanning out, (4) seeking robustness to outliers.

Q2. Location Shift vs Fanning Out

Q: What does it mean when quantile coefficients differ across τ?

A: Identical coefficients → location shift (distribution shifts uniformly). Increasing coefficients → fanning out (inequality increases with X). 2000 Census: upper decile return (15.7%) >> lower decile (9.2%) → education increases inequality.

Q3. Why QTE?

Q: Why might QR estimates be biased, and how does QTE fix this?

A: QR suffers from selection bias when treatment is endogenous. QTE applies IV logic: uses Abadie kappa to weight observations by probability of being a complier. JTPA example: QR lower quantile effect dropped from $1,187 to $121 (90% reduction) after correcting for selection.

Summary: OLS vs QR vs 2SLS vs QTE

Method    Estimand                    Selection bias    Distribution
OLS       E[y|X,d]                    Present           Mean only
2SLS      E[y|X,d] for compliers      Removed           Mean only
QR        Qτ(y|X,d)                   Present           Full distribution
QTE       Qτ(y|X,d) for compliers     Removed           Full distribution