Risk-inclusive Contextual Bandits for Early Phase Clinical Trials
Wednesday, Aug 6: 11:50 AM - 12:05 PM
2370
Contributed Papers
Music City Center
Early-phase clinical trials face the challenge of selecting drug doses that balance safety and efficacy under uncertain dose-response relationships and heterogeneous participant characteristics. Traditional randomized dose allocation ignores individual covariates, exposing participants to sub-optimal doses and inflating sample sizes and trial durations. This paper introduces a risk-inclusive contextual bandit algorithm that leverages multi-armed bandit (MAB) strategies to optimize dosing using participant-specific data. The algorithm improves the balance of dose allocation by integrating separate Thompson samplers for efficacy and safety. Effect sizes are estimated robustly with a generalized version of the asymptotic confidence sequence (AsympCS) method (Waudby-Smith et al., 2024), ensuring uniform coverage for effect sizes over time; AsympCS validity is also established in the MAB framework. Empirical results show the method outperforms both randomized allocation and an efficacy-only Thompson sampler, and an application to real data from a Phase IIb study aligns with the trial's actual findings.
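The core idea of combining separate Thompson samplers for efficacy and safety can be sketched as follows. This is a minimal illustrative sketch, not the authors' actual algorithm: it assumes binary efficacy/toxicity outcomes with Beta-Bernoulli posteriors, ignores participant covariates (the contextual part), and uses an assumed combination rule in which arms whose sampled toxicity exceeds a threshold are screened out before the efficacy draw decides the allocation.

```python
import numpy as np

rng = np.random.default_rng(0)

class RiskInclusiveTS:
    """Illustrative risk-inclusive Thompson sampler: one Beta-Bernoulli
    posterior per dose arm for efficacy, and a separate one for toxicity."""

    def __init__(self, n_arms, tox_threshold=0.3):
        # Columns: [alpha (successes + 1), beta (failures + 1)], Beta(1, 1) priors.
        self.eff = np.ones((n_arms, 2))
        self.tox = np.ones((n_arms, 2))
        self.tox_threshold = tox_threshold

    def select_arm(self):
        # Draw one sample from each posterior.
        eff_draw = rng.beta(self.eff[:, 0], self.eff[:, 1])
        tox_draw = rng.beta(self.tox[:, 0], self.tox[:, 1])
        safe = tox_draw < self.tox_threshold
        if safe.any():
            # Among arms deemed safe by the sampled toxicity,
            # pick the one with the highest sampled efficacy.
            return int(np.argmax(np.where(safe, eff_draw, -np.inf)))
        # If no arm looks safe this round, fall back to the safest arm.
        return int(np.argmin(tox_draw))

    def update(self, arm, efficacy, toxicity):
        # Conjugate updates: column 0 counts events, column 1 non-events.
        self.eff[arm, 0 if efficacy else 1] += 1
        self.tox[arm, 0 if toxicity else 1] += 1

# Toy simulation with three hypothetical doses (true rates are made up):
# the highest dose is most efficacious but also most toxic.
true_eff, true_tox = [0.2, 0.5, 0.7], [0.05, 0.10, 0.40]
bandit = RiskInclusiveTS(n_arms=3)
for _ in range(200):
    arm = bandit.select_arm()
    bandit.update(arm,
                  efficacy=rng.random() < true_eff[arm],
                  toxicity=rng.random() < true_tox[arm])
```

The screening rule is one of several ways to couple the two samplers; a weighted utility of the efficacy and safety draws would be another. The paper's method additionally conditions on participant covariates and pairs the allocation rule with AsympCS-based anytime-valid effect-size estimation, neither of which is shown here.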
Anytime-valid policy evaluation
Dose-ranging studies
Efficacy and safety
Model-assisted inference
Sequential causal inference
Main Sponsor
Biopharmaceutical Section