Angrist & Pischke, Mostly Harmless Econometrics — Sections 4.4–4.5
Chapter 4 Part 2: LATE & Heterogeneous Effects
Core Message
When treatment effects are heterogeneous (different people benefit differently), IV estimates the Local Average Treatment Effect (LATE) — the causal effect specifically for compliers, the subpopulation whose treatment status is changed by the instrument.
Key questions this part answers:
- What does IV estimate with heterogeneous effects? → LATE (effect on compliers)
- Who are compliers? → People whose treatment changes with the instrument
- How does LATE relate to ATE and ATT? → Generally different, unless special cases apply
- How does 2SLS generalize? → Weighted average of covariate-specific LATEs
4.4 IV with Heterogeneous Potential Outcomes
Why Heterogeneity Matters
The constant-effects assumption (y1i − y0i = ρ for all i) is unrealistic: different people benefit differently from treatment. This raises two concerns:
- Internal validity: What exactly is IV estimating?
- External validity: Do the results generalize to other populations?
Setup: Generalized Potential Outcomes
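Define potential outcomes indexed by both treatment and instrument, and potential treatment status indexed by the instrument:
yi(d, z) = outcome if di = d and zi = z
d1i = treatment status if zi = 1
d0i = treatment status if zi = 0
Observed treatment is di = d0i + (d1i − d0i)zi.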
4.4.1 The LATE Theorem (Imbens & Angrist, 1994)
Four Assumptions
| Assumption | Formal Statement | Intuition |
|---|---|---|
| A1: Independence | {yi(d,z), d1i, d0i} ⊥ zi | Instrument is as good as randomly assigned |
| A2: Exclusion | yi(d, 0) = yi(d, 1) for d = 0, 1 | Instrument affects outcome only through treatment |
| A3: First stage | E[d1i − d0i] ≠ 0 | Instrument affects treatment on average |
| A4: Monotonicity | d1i ≥ d0i for all i (or vice versa) | No one is pushed away from treatment by the instrument |
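The LATE Theorem: under A1–A4,
(E[yi|zi=1] − E[yi|zi=0]) / (E[di|zi=1] − E[di|zi=0]) = E[y1i − y0i | d1i > d0i]
The IV (Wald) estimand = average causal effect for compliers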
Proof Sketch
Numerator (reduced form). Writing yi = y0i + (y1i − y0i)di and using exclusion and independence:
E[yi|z=1] − E[yi|z=0] = E[(y1i−y0i)(d1i−d0i)]
By monotonicity, (d1i−d0i) is 0 or 1, so this equals:
= E[y1i−y0i | d1i>d0i] × P[d1i>d0i]
Denominator (first stage): E[d1i−d0i] = P[d1i>d0i]
Dividing cancels the compliance probability, leaving LATE.
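The proof sketch can be checked numerically. The simulation below is a minimal sketch under assumed values: the group shares (50% compliers, 30% never-takers, 20% always-takers) and the group-specific effects (2, 0, 5) are hypothetical and chosen only for illustration. It shows the Wald ratio landing near the complier effect rather than the population average.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical population: 50% compliers, 30% never-takers, 20% always-takers.
group = rng.choice(["complier", "never", "always"], size=n, p=[0.5, 0.3, 0.2])
z = rng.integers(0, 2, size=n)            # randomly assigned binary instrument

d0 = (group == "always").astype(int)      # treatment status if z = 0
d1 = (group != "never").astype(int)       # treatment status if z = 1
d = np.where(z == 1, d1, d0)              # observed treatment

# Heterogeneous effects (assumed): compliers 2, always-takers 5, never-takers 0.
effect = np.select([group == "complier", group == "always"], [2.0, 5.0], default=0.0)
y = rng.normal(size=n) + effect * d       # the instrument enters only through d

reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = d[z == 1].mean() - d[z == 0].mean()
wald = reduced_form / first_stage

print(f"first stage {first_stage:.3f}, Wald {wald:.3f}")
```

With these assumed shares the population average effect is 0.5·2 + 0.2·5 = 2.0 as well as the complier effect, so changing the always-taker effect (say to 10) is a quick way to see that the Wald ratio tracks the compliers and not the ATE.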
Why Monotonicity?
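Without monotonicity, some people are "defiers" (d1i < d0i). The reduced form becomes:
E[yi|z=1] − E[yi|z=0] = E[y1i−y0i | d1i>d0i] × P[d1i>d0i] − E[y1i−y0i | d1i<d0i] × P[d1i<d0i]
Positive effects on compliers could be canceled by positive effects on defiers, so the reduced form could be zero even if everyone benefits from treatment. Monotonicity rules out this possibility.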
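A small arithmetic sketch of the cancellation problem, using made-up shares and effects: the reduced form is the complier contribution minus the defier contribution, so it can vanish even when every individual effect is positive.

```python
def reduced_form(complier_share, complier_effect, defier_share=0.0, defier_effect=0.0):
    """E[y|z=1] - E[y|z=0] when both compliers and defiers may be present."""
    return complier_share * complier_effect - defier_share * defier_effect

# Hypothetical numbers, not from the source.
with_monotonicity = reduced_form(0.3, 2.0)          # compliers only
with_defiers = reduced_form(0.3, 2.0, 0.2, 3.0)     # defier term offsets the complier term

print(with_monotonicity, with_defiers)
```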
4.4.2 The Compliant Subpopulation
The instrument partitions the population into three groups:
| Group | Definition | Draft Lottery Example |
|---|---|---|
| Compliers | d1i = 1, d0i = 0 | Served because of draft eligibility |
| Always-takers | d1i = d0i = 1 | Volunteered regardless |
| Never-takers | d1i = d0i = 0 | Exempted / deferred regardless |
LATE ≠ ATE ≠ ATT in general:
- ATT (effect on the treated) = weighted average of effects on always-takers and compliers
- ATE (average treatment effect) = weighted average of effects on all three groups
- LATE = effect on compliers only
Special Cases: LATE = ATT or LATE = Effect on Non-treated
| Scenario | Example | Why |
|---|---|---|
| No always-takers: E[d\|z=0]=0 | JTPA training experiment | Treated = compliers only → LATE = ATT |
| No never-takers: d1i=1 for all i | Twins instrument, Minneapolis DV experiment | Non-treated = compliers only → LATE = E[y₁−y₀\|d=0] |
4.4.3 IV in Randomized Trials (Bloom 1984)
In a randomized trial with one-sided non-compliance (some offered treatment decline, but no control subject gets treatment), the IV estimand is the effect of treatment on the treated.
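Bloom's Result: If E[di|zi=0] = 0 (no always-takers), then:
(E[yi|zi=1] − E[yi|zi=0]) / P[di=1|zi=1] = E[y1i − y0i | di=1]
That is, ITT divided by the take-up rate equals the effect on the treated.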
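Bloom's result can be illustrated with a simulated trial in which nobody in the control group is treated. This is only a sketch: the take-up rule (people with larger gains are more likely to accept) and the distribution of gains are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

z = rng.integers(0, 2, size=n)                   # randomized offer of treatment
gain = rng.normal(loc=1.0, scale=1.0, size=n)    # individual effect y1i - y0i
# Hypothetical take-up rule: larger gains make acceptance more likely.
would_take = rng.random(n) < 1 / (1 + np.exp(-gain))
d = (z == 1) & would_take                        # one-sided non-compliance
y = rng.normal(size=n) + gain * d

itt = y[z == 1].mean() - y[z == 0].mean()
take_up = d[z == 1].mean()                       # E[d | z = 1]; E[d | z = 0] is 0 here
bloom = itt / take_up                            # ITT scaled by the take-up rate
att = gain[d].mean()                             # true average effect on the treated

print(f"ITT {itt:.3f}, take-up {take_up:.3f}, ITT/take-up {bloom:.3f}, ATT {att:.3f}")
```

Because take-up is selective here, the ATT exceeds the population average gain of 1.0, and the scaled ITT follows the ATT.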
Example: JTPA Training Experiment
| Group | By Training Status (OLS) | By Assignment (ITT) | IV Estimate (ATT) |
|---|---|---|---|
| Men | $3,970 | $1,117 | $1,825 |
| Women | $2,133 | $1,243 | $1,942 |
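OLS (by actual training) overstates the effect because of selection into training. ITT understates the effect of training itself because only about 60% of those offered training took it up. Dividing ITT by the take-up rate gives the IV estimate, which is the causal effect on compliers and, with no always-takers, the ATT.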
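As a rough check on the table, the take-up rate implied by each row can be backed out, assuming the IV column is read as ITT divided by take-up (the published estimates may come from a regression with additional adjustments, so this is only an approximation).

```python
# Figures copied from the JTPA table above (dollars of earnings).
itt = {"men": 1117, "women": 1243}
iv = {"men": 1825, "women": 1942}

# With one-sided non-compliance, IV = ITT / take-up, so take-up = ITT / IV.
implied_take_up = {g: itt[g] / iv[g] for g in itt}

for g, rate in implied_take_up.items():
    print(f"{g}: implied take-up rate {rate:.2f}")
```

Both implied rates are close to the "about 60%" figure quoted in the text.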
4.4.4 Counting and Characterizing Compliers
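Size of Complier Group
P[d1i > d0i] = E[d1i − d0i] = E[di|zi=1] − E[di|zi=0] (the first stage)
Proportion of Treated Who Are Compliers
P[d1i > d0i | di=1] = P[zi=1] × (E[di|zi=1] − E[di|zi=0]) / P[di=1]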
Complier Characteristics
Although individual compliers can't be identified, the distribution of characteristics can be described:
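Complier-characteristics ratio: For a binary characteristic x1i,
P[x1i=1 | d1i>d0i] / P[x1i=1] = (E[di|zi=1, x1i=1] − E[di|zi=0, x1i=1]) / (E[di|zi=1] − E[di|zi=0])
That is, the first stage among people with x1i = 1 divided by the overall first stage. If this ratio > 1, compliers are disproportionately likely to have characteristic x₁.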
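The three complier-counting quantities in this subsection are simple functions of first stages and marginal probabilities. The helper names and the input values below are hypothetical; they are meant only to show how the pieces fit together.

```python
def complier_share(e_d_z1, e_d_z0):
    """P[d1i > d0i]: the first stage E[d|z=1] - E[d|z=0], under monotonicity."""
    return e_d_z1 - e_d_z0


def complier_share_among_treated(first_stage, p_z1, p_d1):
    """P[d1i > d0i | di = 1] = P[z = 1] * first stage / P[d = 1]."""
    return p_z1 * first_stage / p_d1


def characteristic_ratio(first_stage_x1, first_stage_all):
    """P[x = 1 | complier] / P[x = 1]: subgroup first stage over overall first stage."""
    return first_stage_x1 / first_stage_all


# Hypothetical inputs, not taken from the source.
fs = complier_share(e_d_z1=0.45, e_d_z0=0.35)
share_treated = complier_share_among_treated(fs, p_z1=0.5, p_d1=0.4)
ratio = characteristic_ratio(first_stage_x1=0.07, first_stage_all=fs)

print(round(fs, 3), round(share_treated, 3), round(ratio, 2))
```

A ratio below 1, as in this made-up case, would mean compliers are less likely than the average person to have the characteristic.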
Example: Complier Characteristics for Twins vs. Same-Sex Instruments
| Characteristic | Sample Mean | Twins Ratio | Same-Sex Ratio |
|---|---|---|---|
| Age ≥ 30 at first birth | 0.003 | 1.39 | 1.00 |
| College graduate | 0.132 | 1.14 | 0.70 |
Twins compliers are older and more educated; same-sex compliers are less educated. This helps explain why twins IV gives smaller labor supply effects (labor supply consequences of childbearing decline with education).
4.5 Generalizing LATE
4.5.1 Multiple Instruments
With two instruments z1i and z2i, each having its own complier group, 2SLS produces:
ρ2SLS = λ·ρ1 + (1 − λ)·ρ2
where ρj is the LATE using instrument j alone, and λ is a weight between 0 and 1 that depends on the relative strength of each instrument in the first stage. Instruments with a stronger first stage get more weight.
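The weighted-average form follows from the algebra of 2SLS and can be verified on simulated data. In the sketch below the data-generating process is invented, and the weight is written as π1·Cov(d, z1) / (π1·Cov(d, z1) + π2·Cov(d, z2)), where π1 and π2 are the first-stage coefficients. The code checks only the algebraic identity; reading each single-instrument estimate as a LATE still requires the assumptions of the LATE theorem.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

z1 = rng.integers(0, 2, size=n)
z2 = rng.integers(0, 2, size=n)
u = rng.normal(size=n)
# Hypothetical data: z1 is a strong instrument, z2 a weak one; u makes d endogenous.
d = (0.8 * z1 + 0.2 * z2 + u > 0.5).astype(float)
y = 1.0 * d + u + rng.normal(size=n)


def cov(a, b):
    """Sample covariance of two 1-D arrays."""
    return np.cov(a, b)[0, 1]


# First stage: regress d on a constant, z1 and z2.
Z = np.column_stack([np.ones(n), z1, z2])
pi = np.linalg.lstsq(Z, d, rcond=None)[0]
d_hat = Z @ pi

# Second stage: slope from regressing y on the first-stage fitted values.
rho_2sls = cov(y, d_hat) / cov(d_hat, d_hat)

# Single-instrument IV estimates and the weight on instrument 1.
rho1 = cov(y, z1) / cov(d, z1)
rho2 = cov(y, z2) / cov(d, z2)
weight1 = pi[1] * cov(d, z1) / (pi[1] * cov(d, z1) + pi[2] * cov(d, z2))

print(f"2SLS {rho_2sls:.4f} vs weighted average {weight1 * rho1 + (1 - weight1) * rho2:.4f}")
```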
4.5.2 Covariates in the Heterogeneous-Effects Model
When the instrument is only valid conditional on covariates Xi (e.g., draft eligibility conditional on year of birth):
Conditional independence: {y1i, y0i, d1i, d0i} ⊥ zi | Xi
Saturate and Weight Theorem (Angrist & Imbens 1995)
With a fully saturated first stage (separate effect of z for each value of X) and saturated covariates in the second stage, 2SLS estimates:
ρ2SLS = E[ω(Xi)·ρ(Xi)], where ρ(Xi) = E[y1i − y0i | Xi, d1i > d0i] and ω(Xi) = V{E[di|Xi, zi] | Xi} / E[V{E[di|Xi, zi] | Xi}]
This is a weighted average of covariate-specific LATEs, with weights proportional to the variance of first-stage fitted values at each X value. Covariate values where the instrument creates more variation in treatment get more weight.
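For a binary instrument, the variance of the first-stage fitted values within a covariate cell is (first stage at x)² × P[z=1|x] × (1 − P[z=1|x]), which makes the weighting easy to compute by hand. The function and the two cells below are hypothetical and serve only to illustrate how a stronger first stage pulls the 2SLS estimand toward that cell's LATE.

```python
def saturate_and_weight(cells):
    """Weighted average of covariate-specific LATEs.

    Each cell is (P[X = x], P[z = 1 | X = x], first stage at x, LATE at x).
    The weight of a cell is proportional to
    P[X = x] * first_stage**2 * p * (1 - p), with p = P[z = 1 | X = x].
    """
    weights = [px * fs**2 * pz * (1 - pz) for px, pz, fs, _ in cells]
    total = sum(weights)
    return sum(w * cell[3] for w, cell in zip(weights, cells)) / total


# Hypothetical cells, not taken from the source.
cells = [
    (0.5, 0.5, 0.10, 1.0),   # weak first stage: little weight
    (0.5, 0.5, 0.30, 3.0),   # strong first stage: most of the weight
]
estimand = saturate_and_weight(cells)
print(round(estimand, 2))
```

The two cells are equally common, yet the result sits much closer to 3.0 than to the simple average of 2.0, because the second cell's first stage is three times as large.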
Abadie's Kappa Weighting (Abadie 2003)
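2SLS approximates the causal response function for compliers: E[yi | di, Xi, complier]. The kappa-weighting function:
κi = 1 − di(1−zi)/(1 − P(zi=1|Xi)) − (1−di)zi/P(zi=1|Xi)
"finds" compliers by giving negative weight to observations that must be always-takers (d=1, z=0) or never-takers (d=0, z=1), so that non-compliers average out: E[κi | Xi] = P[d1i > d0i | Xi]. With a linear model for P(z=1|X), Abadie's estimator equals 2SLS.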
4.5.3 Average Causal Response with Variable Treatment Intensity
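When treatment is multi-valued (e.g., years of schooling s ∈ {0, 1, …, S}), let s1i and s0i denote the schooling person i would get with zi = 1 and zi = 0, and let yi(s) denote the potential outcome at schooling level s. The Wald estimand becomes:
(E[yi|zi=1] − E[yi|zi=0]) / (E[si|zi=1] − E[si|zi=0])
ACR Theorem (Angrist & Imbens 1995): under independence, exclusion, a first stage, and monotonicity (s1i ≥ s0i), the Wald estimand equals
Σ_{s=1}^{S} ωs · E[yi(s) − yi(s−1) | s1i ≥ s > s0i]
A weighted average of unit causal effects along the causal response function, with weights:
ωs = P[s1i ≥ s > s0i] / Σ_{j=1}^{S} P[s1i ≥ j > s0i]
Key insight: The weight at each point s is proportional to the shift in the CDF of treatment at that point, which can be estimated from data:
P[s1i ≥ s > s0i] = P[si < s | zi=0] − P[si < s | zi=1]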
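The CDF-difference formula translates directly into code. The function name and the tiny schooling data set below are made up for illustration; in the toy data the instrument moves one person from 9 to 10 years and one from 11 to 12, and nobody beyond 12.

```python
import numpy as np


def acr_weights(s, z, levels):
    """Normalized ACR weights from the instrument-induced CDF shift.

    The unnormalized weight at level j is P[s < j | z = 0] - P[s < j | z = 1].
    """
    s, z = np.asarray(s), np.asarray(z)
    shift = np.array([(s[z == 0] < j).mean() - (s[z == 1] < j).mean() for j in levels])
    return shift / shift.sum()


# Hypothetical schooling data (years of schooling, instrument status).
s = [9, 9, 11, 11, 12, 14, 10, 9, 12, 11, 12, 14]
z = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

levels = list(range(10, 15))
weights = acr_weights(s, z, levels)
for level, w in zip(levels, weights):
    print(f"weight on the step to {level} years: {w:.2f}")
```

All the weight falls on the steps to 10 and to 12 years, so a Wald estimate from these data would say nothing about the return to years 13 and 14.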
Application: Compulsory Schooling Laws
Acemoglu & Angrist (2000) show that child labor and compulsory attendance laws shift the schooling distribution mainly in grades 8–12, with no effect on post-secondary schooling. Therefore, IV estimates using these instruments capture returns to schooling in the high-school range, not the college range.
Continuous Treatment: Average Derivative
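When treatment is continuous (e.g., price), the IV estimand is a weighted average derivative of the response function yi(t), where s1i and s0i are the treatment levels with the instrument switched on and off:
∫ E[yi′(t) | s1i ≥ t > s0i] · P[s1i ≥ t > s0i] dt / ∫ P[s1i ≥ t > s0i] dt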
Example: Angrist, Graddy & Imbens (2000) estimate the demand for fish at Fulton Fish Market using weather instruments. Stormy weather drives up prices, and IV recovers the demand elasticity averaged over the range of storm-induced price shifts.
Applied: Angrist & Evans (1998) — Fertility & Labor Supply
Research question: Does having a third child causally reduce female labor supply?
The Identification Problem
Simple OLS comparison of mothers with 2 vs. 3+ children confounds causation with selection: women who have more children may have inherently stronger family-orientation preferences, leading to both more children and less labor supply.
Two Instruments for a Third Child
Among mothers with ≥2 children, Angrist & Evans use two sources of exogenous variation:
| | Twins at second birth | Same-sex (first two children) |
|---|---|---|
| Logic | Twins mechanically create ≥3 children | Parents prefer mixed-sex sibship → more likely to try for a third |
| First stage | 0.625 (very strong) | 0.067 (modest) |
| Validity | Twin births are essentially random | Child sex composition is random |
Results
| Outcome | OLS | Twins IV | Same-sex IV |
|---|---|---|---|
| Employment | −0.167 | −0.083 | −0.135 |
| Weeks worked | −8.05 | −3.83 | −6.23 |
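Since IV is the reduced form divided by the first stage, the reduced-form effects on employment can be backed out from the two tables above. The values printed here are implied by that arithmetic and are not figures reported in these notes; the published estimates may include covariates, so treat them as approximations.

```python
# First stages and employment IV estimates copied from the tables above.
first_stage = {"twins": 0.625, "same_sex": 0.067}
iv_employment = {"twins": -0.083, "same_sex": -0.135}

# IV = reduced form / first stage, so reduced form = IV * first stage.
implied_reduced_form = {k: iv_employment[k] * first_stage[k] for k in first_stage}

for k, rf in implied_reduced_form.items():
    print(f"{k}: implied reduced form {rf:.3f}")
```

The same-sex instrument has the larger IV estimate but by far the smaller reduced form, because its first stage is roughly a tenth of the twins first stage.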
Why Estimates Differ: Different Compliers
Each instrument identifies effects for a different complier subpopulation:
Twins compliers = mothers who would not have had a third child without twins
- Older, more educated, established careers
- Planned for 2 children → forced into 3 by twins
- → Labor supply impact is smaller (career attachment buffers the shock)
Same-sex compliers = mothers who had a third child due to sex-mix preference
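- Less educated than the average mother; about average in age at first birth (ratio 1.00 in the complier-characteristics table)
- Strong family composition preferences
- → Labor supply impact is larger (less career attachment, so a lower opportunity cost of leaving work)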
Mapping to ATE / ATT / ITT / LATE
| Estimand | Definition | In This Study |
|---|---|---|
| ATE | E[Y(1)−Y(0)] for entire population | Effect of 3rd child on all mothers with 2 children — not directly observed |
| ATT | E[Y(1)−Y(0) \| D=1] for treated | Effect on mothers who actually had a 3rd child; OLS (−0.167) targets this but is biased by selection |
| ITT | E[Y\|Z=1]−E[Y\|Z=0] by assignment | Effect of being assigned twins/same-sex (the reduced form); unbiased when the instrument is as good as randomly assigned |
| LATE | E[Y(1)−Y(0) \| compliers] | Twins: −0.083; same-sex: −0.135. Different compliers give different LATEs |
Mathematical Relationships
ATE = E[Y₁−Y₀|C]·πC + E[Y₁−Y₀|AT]·πAT + E[Y₁−Y₀|NT]·πNT
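ATT = E[Y₁−Y₀|C]·πC·P(Z=1)/P(D=1) + E[Y₁−Y₀|AT]·πAT/P(D=1), where P(D=1) = πC·P(Z=1) + πAT (only compliers with Z=1 are treated)
ITT = LATE × πC (unbiased when the instrument is as good as randomly assigned; |ITT| ≤ |LATE|)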
LATE = E[Y₁−Y₀ | Compliers] = ITT / First stage
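These relationships can be evaluated together for any set of group shares and group-average effects. The function and its inputs are hypothetical. One modelling choice is worth flagging: the ATT line weights compliers by P(Z=1), because only compliers who receive the instrument end up treated.

```python
def estimands(pi_c, pi_at, pi_nt, eff_c, eff_at, eff_nt, p_z1):
    """ATE, ATT, ITT and LATE from group shares and group-average effects."""
    ate = pi_c * eff_c + pi_at * eff_at + pi_nt * eff_nt
    p_d1 = pi_at + pi_c * p_z1               # treated = always-takers + compliers with z = 1
    att = (pi_c * p_z1 * eff_c + pi_at * eff_at) / p_d1
    late = eff_c                             # effect on compliers
    itt = late * pi_c                        # reduced form = LATE x first stage
    return {"ATE": ate, "ATT": att, "ITT": itt, "LATE": late}


# Hypothetical shares and effects, not taken from the source.
result = estimands(pi_c=0.5, pi_at=0.2, pi_nt=0.3,
                   eff_c=2.0, eff_at=5.0, eff_nt=0.0, p_z1=0.5)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this made-up case always-takers gain the most, so the ATT exceeds both the LATE and the ATE, while the ITT is the LATE scaled down by the complier share.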
Size Relationships
| Relationship | Condition |
|---|---|
| \|ITT\| < \|LATE\| | Whenever the first stage is below 1 in absolute value and LATE ≠ 0 |
| ATT ≥ ATE (typically) | High-benefit individuals self-select into treatment |
| LATE = ATT | No always-takers (Bloom 1984) |
| LATE₁ ≠ LATE₂ | Different IVs → different compliers (Angrist & Evans) |
| LATE = ATE | Homogeneous treatment effects (constant effect) |
Method → Estimand Mapping
| Method | Estimates | Generalizability |
|---|---|---|
| RCT (full compliance) | ATE | Broad |
| RCT (non-compliance) + IV | LATE | Compliers only |
| DID / Matching / PSM | ATT | Groups similar to treated |
| RDD | LATE at cutoff | Near cutoff only |
Key lessons from Angrist & Evans:
- LATE ≠ ATE ≠ ATT. OLS (−0.167), Twins IV (−0.083), Same-sex IV (−0.135) all give different numbers for the same research question.
- Different instruments → different compliers → different LATEs. The choice of instrument determines whose effect you estimate.
- Complier characteristics explain the gap. The difference is systematic, not random — it traces back to demographics of each complier group.
- Policy implications change. Employment effects of −8 vs. −17 percentage points lead to very different childcare policy conclusions.
Part 2 Summary
| Concept | Key Point |
|---|---|
| LATE | IV = E[y₁−y₀ \| compliers], not ATE or ATT in general |
| Four Assumptions | Independence, Exclusion, First stage, Monotonicity |
| Monotonicity | No defiers; all affected people are pushed in the same direction |
| Complier size | = First stage; characteristics via first-stage ratio across subgroups |
| Bloom's Result | One-sided non-compliance → LATE = ATT (e.g., JTPA) |
| Multiple instruments | 2SLS = weighted average of instrument-specific LATEs |
| Covariates | 2SLS = weighted average of covariate-specific LATEs |
| ACR Theorem | Multi-valued treatment → weighted average of unit causal effects along the response function |
Practical takeaway: Different instruments estimate effects for different subpopulations. Understanding who the compliers are is crucial for interpreting what your IV estimate means and whether it generalizes.
Suhyeon Lee