
Chiseling: Interactive Machine Learning for Powerful and Valid Subgroup Selection

Presented During: Using Statistics, Data Science and AI to Enrich the Assessment of Treatment Effect Heterogeneity in Drug Development

Nathan Cheng Co-Author
Harvard University

Asher Spector Co-Author

Lucas Janson Co-Author
Harvard University

Nathan Cheng Speaker
Harvard University

Monday, Aug 4: 9:15 AM - 9:35 AM
Topic-Contributed Paper Session

Music City Center

In regression and causal inference, controlled subgroup selection aims to identify, with inferential guarantees, a subgroup (defined as a subset of the covariate space) on which the average response or treatment effect is above a given threshold. For instance, in a clinical trial, it may be of interest to find a subgroup with a positive average treatment effect. However, existing methods either lack inferential guarantees, heavily restrict the search for the subgroup, or sacrifice efficiency through naive data splitting. We propose a novel framework that allows the analyst to interactively refine and test a candidate subgroup by iteratively shrinking it. The sole restriction is that the shrinkage direction may depend only on the points outside the current subgroup; otherwise, the analyst may leverage any prior information or machine learning algorithm. Despite this flexibility, our method controls the probability that the discovered subgroup is null (e.g., has a non-positive average treatment effect) under minimal assumptions: for example, in randomized experiments, our method controls the error rate under only bounded moment conditions. Empirically, our method identifies substantially better subgroups than existing methods with inferential guarantees.
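
To make the efficiency cost of naive data splitting concrete, the following is a minimal sketch of that baseline, not of the chiseling procedure itself, whose details are not given in this abstract. It assumes a single scalar covariate, subgroups of the form {x > c}, and a one-sided z-test based on a normal approximation. The function name `split_and_test`, the decile cutoff rule, and the default `alpha` are hypothetical choices made for illustration only.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def split_and_test(xs, ys, threshold=0.0, alpha=0.05, seed=0):
    """Naive data-splitting baseline (illustrative; NOT the chiseling method).

    One random half of the data selects a candidate subgroup {x > cutoff};
    the other half tests whether its mean response exceeds `threshold`.
    Returns (cutoff, p_value, rejected).
    """
    if len(xs) != len(ys) or len(xs) < 40:
        raise ValueError("need equal-length inputs with at least 40 points")

    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    half = len(idx) // 2
    explore, confirm = idx[:half], idx[half:]

    # Exploration half: try the minimum and the first eight deciles of x as
    # cutoffs and keep the one with the largest mean response above it.
    # (Hypothetical rule; any ML model could be used here instead.)
    sorted_x = sorted(xs[i] for i in explore)
    cutoffs = [sorted_x[k * half // 10] for k in range(9)]

    def explore_mean(c):
        vals = [ys[i] for i in explore if xs[i] > c]
        return mean(vals) if vals else float("-inf")

    cutoff = max(cutoffs, key=explore_mean)

    # Confirmation half: one-sided z-test of H0: subgroup mean <= threshold.
    # Only these held-out points are used, which is the source of the
    # efficiency loss: the exploration half contributes nothing to the test.
    held = [ys[i] for i in confirm if xs[i] > cutoff]
    if len(held) < 2:
        return cutoff, 1.0, False
    se = stdev(held) / math.sqrt(len(held))
    if se == 0:
        p_value = 0.0 if mean(held) > threshold else 1.0
    else:
        z = (mean(held) - threshold) / se
        p_value = 1.0 - NormalDist().cdf(z)
    return cutoff, p_value, p_value <= alpha
```

Because the subgroup is chosen on one half and tested on the other, the test is valid without further correction, but each half sees only part of the sample. The abstract's framework is motivated by avoiding this trade-off.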

Keywords

Subgroup Analysis

Causal Inference

Machine Learning