Core Concepts
The author discusses how to determine a sufficient sample size for estimating conditional counterfactual means within data-driven subgroups, recasting the original goal as a simultaneous inference problem.
Abstract
The paper addresses sample size planning for estimating conditional counterfactual means in randomized experiments. It covers feature-space partitioning, the main theoretical results, learning partitions from data, and empirical evaluation on publicly available datasets.
Randomized experiments are highlighted as the gold standard for establishing causality, and the focus is on sample size planning when contrasting multiple treatment groups. Because individual-level treatment effect estimation is difficult, the discussion shifts to studying counterfactuals at the subgroup level.
The paper explains how policy trees can be used to learn subgroups and how the nominal guarantees are evaluated on large randomized-experiment datasets. It also stresses that sample size determination requires specifying parameters such as the margin of error, the confidence level, the model complexity, and bounds on outcome variation.
Key results include propositions giving sufficient sample sizes per treatment group and per subset of the partition that guarantee valid simultaneous inference. Practical considerations include bounded outcomes, variance constraints, and application on a standardized scale.
Empirical evaluations on real-world datasets demonstrate these methods in practice. The paper concludes with a discussion of limitations, sensitivity analysis, external validity, and directions for future research.
Stats
$$\min_{w,\ell}\, n_{w\ell} \;\ge\; \frac{(b-a)^2}{2\epsilon^2}\,\log\!\left(\frac{2}{1-(1-\alpha)^{1/K}}\right)$$

$$\min_{w,\ell}\, n_{w\ell} \;\ge\; \frac{(b-a)^2}{2\epsilon^2}\,\log\!\left(\frac{2}{1-(1-\alpha)^{1/(KL)}}\right)$$
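The bound above can be turned into a small planning calculation. The sketch below is illustrative, not the authors' code: it assumes a Hoeffding-style bound for outcomes in [a, b] with a Šidák-style correction spread across K treatment arms and L subgroups (the function name and its parameters are hypothetical).

```python
import math

def sufficient_n_per_cell(alpha, epsilon, a, b, K, L=1):
    """Sufficient sample size per (treatment, subgroup) cell.

    Illustrative sketch of the stated bound, assuming:
      - outcomes bounded in [a, b] (Hoeffding-style concentration),
      - a Sidak-style split of the overall level alpha across K*L cells,
      - epsilon is the desired margin of error per cell.
    Returns the smallest integer n satisfying
      n >= (b - a)^2 / (2 * epsilon^2) * log(2 / (1 - (1 - alpha)^(1/(K*L)))).
    """
    # Per-cell error level under the Sidak-style correction.
    per_cell_level = 1.0 - (1.0 - alpha) ** (1.0 / (K * L))
    n = (b - a) ** 2 / (2.0 * epsilon ** 2) * math.log(2.0 / per_cell_level)
    return math.ceil(n)

# Example: outcomes in [0, 1], margin of error 0.1, simultaneous 95%
# confidence over K = 2 treatment arms and L = 4 learned subgroups.
n = sufficient_n_per_cell(alpha=0.05, epsilon=0.1, a=0.0, b=1.0, K=2, L=4)
```

Note how the requirement grows only logarithmically in the number of cells K·L, but quadratically as the margin of error ε shrinks.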