Angrist Ch.7 - Quantile Regression

Chapter 7: Quantile Regression

Angrist & Pischke, Mostly Harmless Econometrics — Chapter 7

"Here's a prayer for you... Protect me from knowing what I don't need to know." — Douglas Adams

Core Message

95% of applied econometrics is concerned with averages. But many variables have continuous distributions that can change in ways not revealed by averages — they can spread out or compress. Quantile regression lets us model entire distributions, not just means.

Key insight: Just as OLS fits a linear model to the conditional mean, quantile regression fits a linear model to conditional quantiles — allowing us to see whether treatment affects different parts of the distribution differently.

7.1 The Quantile Regression Model

Conditional Quantile Function (CQF)

The starting point is the conditional quantile function:

Q_τ(y_i | X_i) = F_Y^-1(τ | X_i)

τ value	Meaning
τ = 0.10	Lower decile
τ = 0.50	Median
τ = 0.90	Upper decile

CEF vs CQF

	CEF (OLS)	CQF (Quantile Reg)
Solves	min E[(y - m(X))²]	min E[ρ_τ(y - q(X))]
Loss function	Squared error	Check function ρ_τ
Estimates	Conditional mean	Conditional quantile

The Check Function

The check function weights positive and negative residuals asymmetrically: positive residuals get weight τ, negative residuals get weight 1 - τ.

ρ_τ(u) = u · (τ - 1(u ≤ 0))
       = τ·u        if u > 0
       = (τ - 1)·u  if u ≤ 0

τ	Weight on positive	Weight on negative	Result
0.5	0.5	0.5	Median (LAD)
0.9	0.9	0.1	Upper quantile
0.1	0.1	0.9	Lower quantile
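
To make the asymmetry concrete, here is a minimal sketch in plain Python. The data and the helper names (`check_loss`, `quantile_by_check_loss`) are illustrative assumptions, not something from the book. It evaluates the check function and confirms that the value minimizing the total check loss over a sample is the corresponding sample quantile.

```python
def check_loss(u, tau):
    """Check function: rho_tau(u) = u * (tau - 1(u <= 0))."""
    return u * (tau - (1.0 if u <= 0 else 0.0))

def quantile_by_check_loss(ys, tau):
    """Return the data point q that minimizes sum_i rho_tau(y_i - q)."""
    return min(ys, key=lambda q: sum(check_loss(y - q, tau) for y in ys))

ys = list(range(1, 12))  # illustrative data: 1, 2, ..., 11

lower = quantile_by_check_loss(ys, 0.1)
median = quantile_by_check_loss(ys, 0.5)
upper = quantile_by_check_loss(ys, 0.9)
print(lower, median, upper)  # 2 6 10
```

With τ = 0.9, under-predicting costs nine times as much as over-predicting, so the minimizer is pushed up until roughly 90% of the data lie at or below it.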

Location Shift vs Heteroskedasticity

Case 1: Location Shift (Homoskedastic)

Model: y_i ~ N(X_i'β, σ²)

CQF: Q_τ(y_i | X_i) = X_i'β + σ·Φ^-1(τ)

Key feature: Slope β is identical across all quantiles. Only the intercept changes with τ.

Case 2: Heteroskedasticity (Location-Scale Model)

Model: y_i ~ N(X_i'β, (X_i'γ)²)

CQF: Q_τ(y_i | X_i) = X_i'[β + γ·Φ^-1(τ)]

Key feature: Slope varies with τ. For a regressor with a positive γ coefficient, upper quantiles have larger coefficients than lower quantiles, so the spread of y (inequality) increases with X.
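
The two cases can be compared numerically straight from the closed-form CQF. The sketch below assumes X = (1, x) and uses made-up parameter values. Case 1 is the special case where γ loads only on the constant, so X'γ = σ.

```python
from statistics import NormalDist

def cqf_coefficients(beta, gamma, tau):
    """Coefficients of Q_tau(y | X) = X'[beta + gamma * Phi^{-1}(tau)]."""
    z = NormalDist().inv_cdf(tau)
    return [b + g * z for b, g in zip(beta, gamma)]

# X = (1, x). All parameter values are illustrative assumptions.
beta = [1.0, 0.10]
gamma_case1 = [0.5, 0.00]  # scale X'gamma = 0.5 for every x (location shift)
gamma_case2 = [0.5, 0.02]  # scale X'gamma = 0.5 + 0.02 x (rises with x)

slopes_case1 = {}
slopes_case2 = {}
for tau in (0.10, 0.50, 0.90):
    slopes_case1[tau] = cqf_coefficients(beta, gamma_case1, tau)[1]
    slopes_case2[tau] = cqf_coefficients(beta, gamma_case2, tau)[1]
    print(f"tau={tau:.2f}  case 1 slope={slopes_case1[tau]:.4f}  "
          f"case 2 slope={slopes_case2[tau]:.4f}")
```

In case 1 the slope is 0.10 at every τ and only the intercept moves. In case 2 the slope rises with τ, which is the fanning-out pattern.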

Empirical Example: Returns to Schooling (Table 7.1.1)

Data: 1980, 1990, 2000 U.S. Census. White/Black men aged 40-49. Controls: race, quadratic in potential experience.

Census	0.10	0.25	0.50	0.75	0.90	OLS
1980	.074	.074	.068	.070	.079	.072
1990	.112	.110	.106	.111	.137	.114
2000	.092	.105	.111	.120	.157	.114

1980: Coefficients similar across quantiles (~0.07) → Location shift

2000: Upper decile (15.7%) >> Lower decile (9.2%) → Fanning out

Interpretation: "Among the educated, the rich get even richer" — education increases both mean wages and inequality.
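
One simple way to summarize the table is the gap between the 0.90 and 0.10 quantile coefficients in each census year. The snippet below only does arithmetic on the numbers in Table 7.1.1 as reproduced above. The 90-10 gap is a convenience summary chosen here, not a statistic reported in the source.

```python
# Schooling coefficients copied from Table 7.1.1 above.
returns = {
    1980: {0.10: 0.074, 0.25: 0.074, 0.50: 0.068, 0.75: 0.070, 0.90: 0.079},
    1990: {0.10: 0.112, 0.25: 0.110, 0.50: 0.106, 0.75: 0.111, 0.90: 0.137},
    2000: {0.10: 0.092, 0.25: 0.105, 0.50: 0.111, 0.75: 0.120, 0.90: 0.157},
}

# Gap between the upper-decile and lower-decile schooling coefficient.
gap_90_10 = {year: round(q[0.90] - q[0.10], 3) for year, q in returns.items()}
print(gap_90_10)  # {1980: 0.005, 1990: 0.025, 2000: 0.065}
```

The gap grows from about half a log point per year of schooling in 1980 to 6.5 log points in 2000.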

7.1.1 Censored Quantile Regression

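Problem: Some outcome values are censored, that is, recorded only as a bound (e.g., top-coded earnings in the CPS, incomplete spells in duration data).

Key insight: Censoring from above doesn't affect conditional quantiles that lie below the censoring point.

Example: If the top 10% of the outcome distribution is censored, estimates for τ ≤ 0.90 are unaffected wherever the conditional quantile stays below the censoring point.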
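
A quick simulation illustrates the point for unconditional quantiles. The wage data are simulated (lognormal, an arbitrary choice for illustration) and top-coded at the sample's 90th percentile.

```python
import random
from statistics import mean, quantiles

random.seed(0)
# Simulated "wages"; the lognormal parameters are arbitrary assumptions.
wages = [random.lognormvariate(3.0, 0.6) for _ in range(10_000)]

deciles = quantiles(wages, n=10)   # 9 cut points: 10%, 20%, ..., 90%
topcode = deciles[8]               # censor everything above the 90% cut point
censored = [min(w, topcode) for w in wages]
deciles_cens = quantiles(censored, n=10)

print("deciles up to 80% unchanged:", deciles_cens[:8] == deciles[:8])
print("mean before/after top-coding:", round(mean(wages), 2), round(mean(censored), 2))
```

The lower deciles are identical before and after top-coding, while the mean falls because every value in the top tail has been pulled down to the top-code.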

Powell (1986) solution:

Model: Q_τ(y | X) = min(c, X'β_τ)
Only observations with X'β_τ < c are informative about β_τ
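
The effect of fitting min(c, X'b) rather than X'b can be seen on a toy data set. This is a minimal sketch, not a practical estimator. The data are constructed by hand so that the true median line is 1 + x. The coarse grid search (`grid_argmin`, a hypothetical helper) stands in for a real optimizer.

```python
def check_loss(u, tau):
    return u * (tau - (1.0 if u <= 0 else 0.0))

def grid_argmin(objective):
    """Brute-force search over a coarse (a, b) grid; a toy stand-in for an optimizer."""
    grid = [(a / 10, b / 10) for a in range(0, 31) for b in range(5, 16)]
    return min(grid, key=objective)

# Toy data: latent y* = 1 + x + e with e in {-1, 0, 1}, top-coded at c.
c = 8.0
tau = 0.5
xs = [float(x) for x in range(10) for _ in range(3)]
latent = [1.0 + x + e for x in range(10) for e in (-1, 0, 1)]
ys = [min(y, c) for y in latent]

def naive_objective(params):
    """Ordinary median regression that ignores the censoring."""
    a, b = params
    return sum(check_loss(y - (a + b * x), tau) for x, y in zip(xs, ys))

def powell_objective(params):
    """Censored objective: the fitted value is min(c, a + b x)."""
    a, b = params
    return sum(check_loss(y - min(c, a + b * x), tau) for x, y in zip(xs, ys))

a_naive, b_naive = grid_argmin(naive_objective)
a_cqr, b_cqr = grid_argmin(powell_objective)
print("naive slope:", b_naive, " censored-QR fit:", (a_cqr, b_cqr))
```

In this example the censored objective recovers the line 1 + x exactly, while the naive fit has a slope below 1 because the top-coded observations flatten it.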

Buchinsky (1994) iterative algorithm:

Step 1: Estimate β̂_τ ignoring censoring
Step 2: Find cells with X'β̂_τ < c
Step 3: Re-estimate using only those cells
Step 4: Repeat Steps 2 and 3 until convergence

7.1.3 Tricky Points

Tricky Point 1: Individual vs Distributional Effects

"Training raised the lower decile" ≠ "Poor people became richer"

Quantile regression tells us about the distribution's shape, not about specific individuals. Unless we assume rank preservation (treatment doesn't change ranks), we can't interpret effects as individual-level.

Tricky Point 2: Conditional ≠ Marginal Quantiles

For means: E[y | X] = X'β ⟹ E[y] = E[X]'β ✓

For quantiles: Q_τ(y | X) = X'β_τ ⟹ Q_τ(y) ≠ E[X]'β_τ ✗

Quantiles are nonlinear operators. Extracting marginal quantiles requires integrating over the entire distribution of X (Machado & Mata, 2005).
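
A two-point example shows how far apart the two quantities can be. It assumes x is 0 or 1 with equal probability and y | x ~ N(4x, 1), values chosen purely for illustration. The conditional quantile is then linear in x, but plugging E[x] into it does not give the marginal quantile of y.

```python
from statistics import NormalDist

std = NormalDist()
tau = 0.90

def cond_quantile(x, tau):
    """Q_tau(y | x) = Phi^{-1}(tau) + 4 x under the assumed design."""
    return std.inv_cdf(tau) + 4.0 * x

def marginal_cdf(q):
    """CDF of y: an equal-weight mixture of N(0, 1) and N(4, 1)."""
    return 0.5 * std.cdf(q) + 0.5 * std.cdf(q - 4.0)

def marginal_quantile(tau, lo=-10.0, hi=20.0):
    """Invert the mixture CDF by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if marginal_cdf(mid) < tau:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

plug_in = cond_quantile(0.5, tau)   # E[X]'beta_tau, about 3.28
marginal = marginal_quantile(tau)   # actual Q_tau(y), about 4.84
mean_y = 0.5 * 0.0 + 0.5 * 4.0      # for means, E[y] = E[X]'beta = 2 does hold
print(round(plug_in, 2), round(marginal, 2), mean_y)
```

The marginal 0.90 quantile is driven almost entirely by the x = 1 group, which is why averaging the conditional quantile function over x misses it.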

7.2 Quantile Treatment Effects (QTE)

The Problem: Selection Bias

Just like OLS, quantile regression suffers from omitted variable bias when treatment is endogenous.

	Exogenous d	Endogenous d
Mean	OLS	2SLS
Quantile	QR	QTE

QTE: Extending LATE to Quantiles

Abadie, Angrist, and Imbens (2002) extend the LATE framework to quantiles:

Q_τ(y_i | X_i, d_i, complier) = α_τ·d_i + X_i'β_τ

α_τ = effect on τ-quantile for compliers

The Abadie Kappa

κ_i = 1 - d_i(1 - z_i)/(1 - p(X_i)) - (1 - d_i)z_i/p(X_i),   where p(X_i) = P(z_i = 1 | X_i)

Properties: E[κ | complier] = 1, E[κ | non-complier] = 0
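
These properties can be checked by direct enumeration. The sketch below assumes a single covariate cell with p = P(z = 1 | X) = 0.25 (an arbitrary value) and an instrument that is independent of compliance type. The expectation for each type is then a weighted average over z = 0 and z = 1.

```python
def kappa(d, z, p):
    """Abadie kappa for treatment d, instrument z, and p = P(z = 1 | X)."""
    return 1 - d * (1 - z) / (1 - p) - (1 - d) * z / p

p = 0.25  # illustrative instrument probability

# Treatment status d implied by each compliance type at z = 0 and z = 1.
types = {
    "complier":     {0: 0, 1: 1},
    "always-taker": {0: 1, 1: 1},
    "never-taker":  {0: 0, 1: 0},
}

expected_kappa = {
    name: (1 - p) * kappa(d_of_z[0], 0, p) + p * kappa(d_of_z[1], 1, p)
    for name, d_of_z in types.items()
}
print(expected_kappa)
```

Compliers always get κ = 1. Always-takers and never-takers get κ = 1 in some cells and a negative κ in others, and these average out to zero.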

QTE Estimator:

(α_τ, β_τ) = arg min over (a, b) of E[κ_i · ρ_τ(y_i - a·d_i - X_i'b)]

QTE Implementation Steps

Step 1: Probit z ~ y, X in d=1 subsample → save Ê[z | y, d=1, X]
Step 2: Probit z ~ y, X in d=0 subsample → save Ê[z | y, d=0, X]
Step 3: Probit z ~ X in full sample → save P̂(z=1 | X)
Step 4: Compute Ê[κ | y, d, X] = 1 - d(1 - Ê[z | y, d=1, X])/(1 - P̂(z=1 | X)) - (1 - d)Ê[z | y, d=0, X]/P̂(z=1 | X); trim to [0, 1]
Step 5: Run κ-weighted quantile regression
Step 6: Bootstrap entire procedure for standard errors
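
Step 4 is the least obvious of these, so here is a minimal sketch of it. It replaces z in κ with its fitted conditional expectation and trims the result. The function name `kappa_hat` and the numeric inputs are hypothetical. In practice the inputs would be the fitted probabilities saved in Steps 1 to 3.

```python
def kappa_hat(d, ez_given_y_d, pz_given_x):
    """Estimated E[kappa | y, d, X], trimmed to [0, 1].

    ez_given_y_d: fitted E[z | y, d, X] (Step 1 fit if d == 1, Step 2 fit if d == 0)
    pz_given_x:   fitted P(z = 1 | X) from Step 3
    """
    if d == 1:
        k = 1 - (1 - ez_given_y_d) / (1 - pz_given_x)
    else:
        k = 1 - ez_given_y_d / pz_given_x
    return min(1.0, max(0.0, k))

# Hypothetical fitted values for three observations.
w_treated = kappa_hat(d=1, ez_given_y_d=0.9, pz_given_x=0.5)    # about 0.8
w_untreated = kappa_hat(d=0, ez_given_y_d=0.2, pz_given_x=0.5)  # about 0.6
w_trimmed = kappa_hat(d=1, ez_given_y_d=0.3, pz_given_x=0.5)    # raw value -0.4, trimmed to 0
```

Because the trimmed weights are non-negative, Step 5 can be run with any quantile regression routine that accepts observation weights.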

Empirical Example: JTPA Training (Table 7.2.1)

Setting: Job Training Partnership Act (1980s US). z = randomized training offer, d = actual participation (about 60% of those offered training took it up), y = 30-month earnings.

Panel A: OLS & Quantile Regression (Selection bias present)

	OLS	τ=0.15	τ=0.25	τ=0.50	τ=0.75	τ=0.85
Training	3,754	1,187	2,510	4,420	4,678	4,806
% Impact	21%	136%	75%	35%	17%	13%

Panel B: 2SLS & QTE (Selection bias removed)

	2SLS	τ=0.15	τ=0.25	τ=0.50	τ=0.75	τ=0.85
Training	1,593	121	702	1,544	3,131	3,378
% Impact	9%	5%	12%	10%	11%	9%

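Key finding: QR shows a large effect at τ = 0.15 ($1,187, 136%), but the QTE estimate at the same quantile is close to zero ($121, 5%).

Interpretation: The gap points to positive selection into training, which inflates the QR estimates most at the lower quantiles (for example, if low-earning trainees are more motivated than comparable non-trainees). In dollar terms, the QTE gains from JTPA are concentrated in the upper half of the distribution.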
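
The size of the correction at each quantile can be read off the two panels by dividing the QTE estimate by the QR estimate. The snippet below does only that arithmetic on the Table 7.2.1 numbers above. The ratio itself is a convenience summary chosen here, not a statistic from the source.

```python
taus = [0.15, 0.25, 0.50, 0.75, 0.85]
qr_estimates = [1187, 2510, 4420, 4678, 4806]   # Panel A, training coefficient
qte_estimates = [121, 702, 1544, 3131, 3378]    # Panel B, training coefficient

# Share of the QR estimate that remains after instrumenting.
qte_to_qr = {
    tau: round(qte / qr, 2)
    for tau, qr, qte in zip(taus, qr_estimates, qte_estimates)
}
print(qte_to_qr)  # {0.15: 0.1, 0.25: 0.28, 0.5: 0.35, 0.75: 0.67, 0.85: 0.7}
```

Only about a tenth of the QR estimate survives at τ = 0.15, compared with roughly 70% at τ = 0.85.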

Three Key Questions

Q1. Quantile Regression vs OLS

Q: How does QR differ from OLS, and when should you use it?

A: OLS estimates conditional means; QR estimates conditional quantiles. Use QR when: (1) analyzing inequality, (2) detecting heterogeneous effects, (3) distinguishing location shift from fanning out, (4) guarding against outliers.

Q2. Location Shift vs Fanning Out

Q: What does it mean when quantile coefficients differ across τ?

A: Identical coefficients → location shift (distribution shifts uniformly). Increasing coefficients → fanning out (inequality increases with X). 2000 Census: upper decile return (15.7%) >> lower decile (9.2%) → education increases inequality.

Q3. Why QTE?

Q: Why might QR estimates be biased, and how does QTE fix this?

A: QR suffers from selection bias when treatment is endogenous. QTE applies IV logic: it uses the Abadie kappa to weight observations by their probability of being a complier. JTPA example: the estimated effect at τ = 0.15 falls from $1,187 (QR) to $121 (QTE), a reduction of about 90%, after correcting for selection.

Summary: OLS vs QR vs 2SLS vs QTE

Method	Estimand	Selection bias	Distribution
OLS	E[y|X,d]	Present	Mean only
2SLS	E[y|X,d] for compliers	Removed	Mean only
QR	Q_τ(y|X,d)	Present	Full distribution
QTE	Q_τ(y|X,d) for compliers	Removed	Full distribution