Treatment Effects Guide - ATE, ATT, ITT, LATE



Applied guide with Angrist & Evans (1998) case study — Companion to MHE Chapter 4

Core Message

Not "what is the effect?" but "the effect for whom?" — The same treatment can yield different estimates (ATE, ATT, ITT, LATE) depending on the target population. Understanding which estimand your method identifies is essential for correct interpretation and policy design.

1. Treatment Effect Estimands

ATE Average Treatment Effect

The average causal effect across the entire population.

ATE = E[Y_i(1) − Y_i(0)]

Compares: everyone treated vs. everyone untreated
Relevant when: considering universal policy (e.g., mandatory program for all)
Challenge: counterfactual is never observed → requires strong assumptions or perfect RCT

ATT Average Treatment Effect on the Treated

The average causal effect among those who actually received treatment.

ATT = E[Y_i(1) − Y_i(0) | D_i = 1]

Compares: treated group's actual outcome vs. what they would have experienced without treatment
Relevant when: evaluating a voluntary program for its participants
Typically ATT > ATE when high-benefit individuals self-select into treatment

ATE vs ATT: With heterogeneous effects and self-selection, these differ. If people with larger treatment benefits tend to participate, then ATT > ATE.
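The gap between ATE and ATT under self-selection can be illustrated with a small simulation. This is a minimal sketch under an assumed data-generating process (normally distributed gains, take-up driven by a noisy signal of one's own gain); the names and parameter values are illustrative and do not come from any study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumed (hypothetical) individual gains Y_i(1) - Y_i(0): mean 1, sd 1
gain = rng.normal(loc=1.0, scale=1.0, size=n)

# Self-selection: people take treatment when a noisy signal of their own
# gain exceeds a threshold, so high-benefit individuals are overrepresented.
signal = gain + rng.normal(scale=0.5, size=n)
d = signal > 1.0

ate = gain.mean()      # average gain over everyone
att = gain[d].mean()   # average gain among the treated only

print(f"ATE = {ate:.2f}, ATT = {att:.2f}")
```

Because take-up is positively correlated with the gain, the treated group's average gain exceeds the population average. With gains unrelated to take-up the two numbers would coincide.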

ITT Intent-to-Treat

The effect of being assigned to treatment, regardless of actual take-up.

ITT = E[Y_i | Z_i = 1] − E[Y_i | Z_i = 0]

Z is assignment, D is actual treatment receipt
Unbiased for the effect of assignment when Z is randomized: comparing groups by assignment preserves randomization even with non-compliance
Reflects the realistic effect of offering a program (including non-participation)
|ITT| ≤ |LATE| because ITT = LATE × compliance rate, and the compliance rate (the first stage) is at most 1

LATE Local Average Treatment Effect

The average causal effect for compliers — those whose treatment status is changed by the instrument.

LATE = Cov(Y, Z) / Cov(D, Z) = ITT / First Stage

Only for compliers — excludes always-takers and never-takers
Requires monotonicity assumption (no defiers)
Different instruments → different compliers → different LATEs
RDD estimates are also interpretable as LATE at the cutoff
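The Wald formula can be checked on simulated data where the complier effect is known. The sketch below assumes a randomized binary instrument, no defiers, and made-up shares and effects for each compliance type; none of these values come from the case study.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400_000

# Hypothetical compliance types: 0 = never-taker, 1 = complier, 2 = always-taker
types = rng.choice([0, 1, 2], size=n, p=[0.5, 0.3, 0.2])
z = rng.integers(0, 2, size=n)  # randomized binary instrument

# Treatment receipt: always-takers take it, compliers follow Z, never-takers do not
d = np.where(types == 2, 1, np.where(types == 1, z, 0))

# Illustrative heterogeneous effects by type (assumed values)
effect = np.select([types == 0, types == 1, types == 2], [0.5, 2.0, 3.0])
y = rng.normal(size=n) + effect * d

itt = y[z == 1].mean() - y[z == 0].mean()          # reduced form
first_stage = d[z == 1].mean() - d[z == 0].mean()  # share of compliers
wald = itt / first_stage                           # ITT / first stage
cov_ratio = np.cov(y, z)[0, 1] / np.cov(d, z)[0, 1]

print(f"ITT = {itt:.3f}, first stage = {first_stage:.3f}, Wald = {wald:.3f}")
```

In this setup the Wald ratio recovers the complier effect (2.0) rather than the always-taker or never-taker effects, and it matches the covariance ratio because Z is binary.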

Summary Comparison

Estimand	Target Population	Primary Context	Method
ATE	Entire population	Universal policy effect	RCT (full compliance)
ATT	Treated group	Voluntary program evaluation	DID, Matching/PSM
ITT	Assigned group	RCT with non-compliance	Reduced form
LATE	Compliers	IV / RDD estimation	2SLS, Wald estimator

2. Case Study: Angrist & Evans (1998)

Research question: Does having a third child causally reduce female labor supply?

The Identification Problem

Simple OLS comparison of mothers with 2 vs. 3+ children confounds causation with selection: women who have more children may have inherently stronger family-orientation preferences, leading to both more children and less labor supply.

Core problem: Fertility is endogenous — unobservable preferences drive both the number of children and labor supply decisions simultaneously.

Two Instruments for a Third Child

Among mothers with ≥2 children, Angrist & Evans use two sources of exogenous variation in the probability of having a third child:

	Twins at second birth	Same-sex (first two children)
Logic	Twins mechanically create ≥3 children	Parents prefer a mixed-sex sibship → more likely to try for a third
First stage	0.625 (very strong)	0.067 (modest)
Validity	Twin births are essentially random	Child sex composition is random

Results

Outcome	OLS	Twins IV	Same-sex IV
Employment	−0.167	−0.083	−0.135
Weeks worked	−8.05	−3.83	−6.23

Key observation: |OLS| > |Same-sex IV| > |Twins IV|. Same treatment, same outcome, but different estimates. Why?

Why Estimates Differ: Different Compliers

Each instrument identifies effects for a different complier subpopulation:

Characteristic	Sample Mean	Twins Ratio	Same-sex Ratio
Age ≥ 30 at first birth	0.003	1.39 (overrepresented)	1.00 (average)
College graduate	0.132	1.14 (overrepresented)	0.70 (underrepresented)

Ratio > 1 means the characteristic is overrepresented among compliers relative to the population.
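The guide does not spell out how these ratios are computed. One standard approach for a binary characteristic x is to divide the first stage within the x = 1 subgroup by the overall first stage. The sketch below applies that approach to simulated data; the helper names `first_stage` and `complier_ratio` and all parameter values are hypothetical.

```python
import numpy as np

def first_stage(d, z):
    """Difference in treatment rates between Z = 1 and Z = 0."""
    return d[z == 1].mean() - d[z == 0].mean()

def complier_ratio(d, z, x):
    """Relative likelihood that a complier has x = 1:
    first stage among x = 1 divided by the overall first stage."""
    return first_stage(d[x], z[x]) / first_stage(d, z)

rng = np.random.default_rng(2)
n = 500_000
x = rng.random(n) < 0.3              # binary characteristic (boolean array)
z = rng.integers(0, 2, size=n)       # randomized binary instrument

# Assumed: compliers are more common when x is true (0.4 vs 0.2)
complier = rng.random(n) < np.where(x, 0.4, 0.2)
always = (~complier) & (rng.random(n) < 0.25)
d = np.where(always, 1, np.where(complier, z, 0))

ratio = complier_ratio(d, z, x)
true_ratio = complier[x].mean() / complier.mean()  # known only in a simulation
print(f"estimated ratio = {ratio:.2f}, true ratio = {true_ratio:.2f}")
```

In real data complier status is never observed for individuals, which is why the ratio has to be backed out from first stages.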

Twins compliers = mothers who would not have had a third child without twins

Older, more educated, established careers
Planned for 2 children → forced into 3 by twins
→ Labor supply impact is smaller (career attachment buffers the shock)

Same-sex compliers = mothers who had a third child due to sex-mix preference

Younger, less educated, early career stage
Strong family composition preferences
→ Labor supply impact is larger (less career attachment, lower opportunity cost of leaving work)

Mapping to Treatment Effect Concepts

Estimand	Interpretation in This Study	Value / Status
ATE	Effect of 3rd child on all mothers with 2 children	Not identified by either instrument; it need not lie between the two LATEs
ATT	Effect on mothers who actually had a 3rd child	OLS (−0.167) tries to estimate this but is biased by selection
ITT	Effect of being "assigned" twins / same-sex	Reduced form: e.g., twins RF on employment = −0.052
LATE	Effect for mothers pushed into 3rd child by the instrument	Twins: −0.083; Same-sex: −0.135
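The reduced-form (ITT) values implied by the tables above follow from ITT = LATE × first stage. The short check below uses only the rounded first-stage and IV estimates reported in this guide, so the results are approximate; the same-sex reduced form is an implied value, not one quoted from the paper.

```python
# Rounded values taken from the tables in this guide
first_stage = {"twins": 0.625, "same_sex": 0.067}
iv_employment = {"twins": -0.083, "same_sex": -0.135}

# Implied reduced form: ITT = LATE x first stage
implied_rf = {k: iv_employment[k] * first_stage[k] for k in first_stage}

for name, value in implied_rf.items():
    print(f"{name}: implied reduced form on employment = {value:.3f}")
# twins: implied reduced form on employment = -0.052
# same_sex: implied reduced form on employment = -0.009
```

The same-sex reduced form is much smaller than the twins one even though its LATE is larger, because so few mothers change their fertility in response to sibling sex composition.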

Lessons from This Study

LATE ≠ ATE ≠ ATT. OLS (−0.167), Twins IV (−0.083), Same-sex IV (−0.135) all give different numbers for the same research question.
Different instruments → different compliers → different LATEs. The choice of instrument determines whose effect you estimate.
Complier characteristics explain the gap. The difference is systematic, not random — it traces back to the demographics of each complier group.
Policy implications change. An employment effect of −8 percentage points (Twins IV) versus −17 percentage points (OLS) leads to very different childcare policy conclusions.

3. Mathematical Relationships

Population Subgroups Under Monotonicity

The instrument partitions the population into three groups (assuming no defiers):

Group	Definition	Share
Compliers (C)	d_1i = 1, d_0i = 0	π_C = E[D|Z=1] − E[D|Z=0] = First stage
Always-takers (AT)	d_1i = d_0i = 1	π_AT = E[D|Z=0]
Never-takers (NT)	d_1i = d_0i = 0	π_NT = 1 − E[D|Z=1]
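The three shares in the table can be estimated from observed D and Z alone. The sketch below assumes a randomized binary instrument and no defiers; `compliance_shares` is a hypothetical helper, and the simulated type shares are made up for illustration.

```python
import numpy as np

def compliance_shares(d, z):
    """Estimate group shares from observed D and a binary, randomized Z."""
    p1 = d[z == 1].mean()  # E[D | Z = 1]
    p0 = d[z == 0].mean()  # E[D | Z = 0]
    return {"compliers": p1 - p0, "always_takers": p0, "never_takers": 1 - p1}

rng = np.random.default_rng(3)
n = 400_000

# Assumed true shares: 50% never-takers, 30% compliers, 20% always-takers
types = rng.choice(["NT", "C", "AT"], size=n, p=[0.5, 0.3, 0.2])
z = rng.integers(0, 2, size=n)
d = np.where(types == "AT", 1, np.where(types == "C", z, 0))

shares = compliance_shares(d, z)
print({k: round(float(v), 3) for k, v in shares.items()})
```

Only the shares are identified this way. Which individuals are compliers remains unknown, because a treated person with Z = 1 could be either a complier or an always-taker.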

Decomposition of Each Estimand

ATE: Weighted average across all groups

ATE = E[Y₁−Y₀|C]·π_C + E[Y₁−Y₀|AT]·π_AT + E[Y₁−Y₀|NT]·π_NT
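As a quick arithmetic illustration of this weighted average, the snippet below plugs in made-up shares and group-specific effects. These numbers are assumptions chosen for the example, not estimates from any study.

```python
# Hypothetical shares and average effects for each group
shares = {"C": 0.3, "AT": 0.2, "NT": 0.5}
effects = {"C": 2.0, "AT": 3.0, "NT": 0.5}

# ATE is the share-weighted average of the group effects
ate = sum(effects[g] * shares[g] for g in shares)
late = effects["C"]  # LATE is the complier effect only

print(f"ATE = {ate:.2f}, LATE = {late:.2f}")
# ATE = 1.45, LATE = 2.00
```

IV recovers only the complier term, so going from LATE to ATE requires information or assumptions about always-takers and never-takers.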

ATT: Always-takers + compliers assigned Z = 1

ATT = E[Y₁−Y₀|C] · π_C·p/(π_AT+π_C·p) + E[Y₁−Y₀|AT] · π_AT/(π_AT+π_C·p), where p = P(Z=1)

Treated = always-takers + compliers who were assigned Z = 1, so P(D=1) = π_AT + π_C·p. Never-takers are excluded (they don't get treated).

ITT: LATE × Compliance rate

ITT = LATE × (E[D|Z=1] − E[D|Z=0]) = LATE × π_C

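Unbiased for the effect of assignment when Z is randomized (OLS of Y on Z). No larger than LATE in magnitude because the compliance rate is at most 1, and strictly smaller whenever compliance is imperfect and LATE ≠ 0.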

LATE: Compliers only

LATE = E[Y₁−Y₀ | Compliers] = ITT / First Stage

Excludes always-takers and never-takers entirely.

Special Case: LATE = ATT (Bloom 1984)

When there are no always-takers (one-sided non-compliance), i.e., E[D|Z=0] = 0:

Always-takers = 0 → Treated = Compliers only → LATE = ATT

Example: JTPA training experiment — you can't access training without assignment, so everyone who trained was a complier. IV = ITT ÷ compliance rate = ATT.
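The Bloom result can be checked in a simulation with one-sided non-compliance. This is a sketch under assumed conditions (randomized assignment, take-up possible only when assigned, and take-up correlated with individual gains); it does not use JTPA data.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400_000

z = rng.integers(0, 2, size=n)                 # randomized assignment
gain = rng.normal(loc=1.0, scale=1.0, size=n)  # hypothetical Y_i(1) - Y_i(0)

# Take-up requires assignment, and people with larger gains are more likely to take up
would_take = gain + rng.normal(scale=0.5, size=n) > 1.0
d = (z == 1) & would_take                      # nobody with Z = 0 is treated

y = rng.normal(size=n) + gain * d

itt = y[z == 1].mean() - y[z == 0].mean()
take_up = d[z == 1].mean()                     # first stage, since E[D | Z = 0] = 0
bloom = itt / take_up                          # IV = ITT / compliance rate

att = gain[d].mean()                           # true effect on the treated
ate = gain.mean()                              # true population average effect
print(f"Bloom IV = {bloom:.2f}, ATT = {att:.2f}, ATE = {ate:.2f}")
```

With no always-takers, every treated person is a complier, so the IV estimate tracks the ATT. It still differs from the ATE here because take-up is selected on gains.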

Size Relationships

Relationship	Condition	Example
|ITT| < |LATE|	Whenever compliance < 1 and LATE ≠ 0	ITT = LATE × compliance rate
ATT ≥ ATE (typically)	High-benefit individuals self-select	Voluntary job training, college
LATE = ATT	No always-takers	JTPA experiment (Bloom 1984)
LATE₁ ≠ LATE₂	Different IVs → different compliers	Angrist & Evans: Twins ≠ Same-sex
LATE = ATE	Homogeneous treatment effects	Constant effect for everyone

Methodology → Estimand Connection

Method	Estimates	Generalizability
RCT (full compliance)	ATE	Broad
RCT (non-compliance) + IV	LATE	Compliers only
DID	ATT	Groups similar to treated
RDD	LATE at cutoff	Near cutoff only
Matching / PSM	ATT	Groups similar to treated

Takeaway

When reading or writing empirical research, always ask:

What estimand does this method identify? (ATE, ATT, ITT, or LATE?)
Who are the compliers? (If IV/RDD — whose effect are we learning about?)
Does the estimand match the policy question? (Universal program → ATE; voluntary → ATT; nudge → LATE)
Are the compliers relevant for the intended policy? (Pilot enthusiasts ≠ general population)