Merging Versus Ensembling: An Adaptive Blending Approach for Handling Domain Heterogeneity
Kevin Lane
Co-Author
Boston University - Department of Environmental Health
Monday, Aug 4: 11:45 AM - 11:50 AM
1598
Contributed Speed
Music City Center
In multi-domain settings, where observations come from distinct but related data sources, heterogeneity often exists across domains due to shifts in data distributions. In cases of high heterogeneity, (1) training individual models on each domain and ensembling their predictions (ensemble approach) has been shown to outperform (2) combining domain datasets and fitting a single model (merged approach). However, it is less clear when to choose each approach. This paper presents Multi-Study Adaptive Blend (MSAB), a method that adaptively combines predictions from the ensemble and merged approaches across varying levels of heterogeneity. First, we provide theoretical insights on optimizing the combination weight in a linear model setting. Second, we propose a domain-wise cross-validation strategy for estimating the optimal blending weight as a practical, data-driven approach for broader applications. For a given heterogeneity level, MSAB performs comparably to or better than the best individual strategy (merged or ensemble), offering robust performance across low and high heterogeneity settings. MSAB offers potential improvements in predictive performance and mitigates the risk of selecting a suboptimal approach in multi-domain settings.
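The blending idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the base learner, the ensemble weights, or how the blending weight is searched. The sketch assumes ordinary least squares as the base model, an equal-weight average for the ensemble, a grid search over a single weight w in [0, 1], and leave-one-domain-out cross-validation as the "domain-wise" scheme. All function names (`simulate_domains`, `choose_blend_weight`, and so on) are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    # Least-squares coefficients, with an intercept column prepended.
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return beta

def predict_ols(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

def merged_predict(domains, X_new):
    # Merged approach: pool all domains and fit a single model.
    X = np.vstack([X for X, _ in domains])
    y = np.concatenate([y for _, y in domains])
    return predict_ols(fit_ols(X, y), X_new)

def ensemble_predict(domains, X_new):
    # Ensemble approach: one model per domain, predictions averaged.
    # Equal weights are an assumption of this sketch.
    preds = [predict_ols(fit_ols(X, y), X_new) for X, y in domains]
    return np.mean(preds, axis=0)

def blended_predict(domains, X_new, w):
    # w = 1 recovers the ensemble, w = 0 recovers the merged model.
    return (w * ensemble_predict(domains, X_new)
            + (1.0 - w) * merged_predict(domains, X_new))

def choose_blend_weight(domains, grid=np.linspace(0.0, 1.0, 21)):
    # Leave-one-domain-out CV: hold out each domain in turn, train both
    # approaches on the rest, and accumulate held-out squared error per w.
    errors = np.zeros(len(grid))
    for k, (X_val, y_val) in enumerate(domains):
        train = domains[:k] + domains[k + 1:]
        p_merged = merged_predict(train, X_val)
        p_ens = ensemble_predict(train, X_val)
        for i, w in enumerate(grid):
            blend = w * p_ens + (1.0 - w) * p_merged
            errors[i] += np.mean((y_val - blend) ** 2)
    return float(grid[np.argmin(errors)])

def simulate_domains(n_domains=4, n=50, p=3, sigma_beta=1.0, seed=0):
    # Hypothetical toy data: each domain's coefficients are a shared
    # vector plus domain-specific noise; sigma_beta sets heterogeneity.
    rng = np.random.default_rng(seed)
    beta0 = rng.normal(size=p)
    domains = []
    for _ in range(n_domains):
        beta_k = beta0 + sigma_beta * rng.normal(size=p)
        X = rng.normal(size=(n, p))
        y = X @ beta_k + rng.normal(size=n)
        domains.append((X, y))
    return domains
```

A possible usage, again under the assumptions above, is `domains = simulate_domains(sigma_beta=2.0)` followed by `w = choose_blend_weight(domains)` and `blended_predict(domains, X_new, w)`. The leave-one-domain-out loop needs at least three domains to be informative, since with two domains the training set is a single domain and the merged and ensemble predictions coincide.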
machine learning
domain generalization
ensemble learning
multi-study prediction
Main Sponsor
Section on Statistical Learning and Data Science