Transfer Learning and Data Integration, a session from the ASA Wisconsin Chapter

Conference: Symposium on Data Science and Statistics (SDSS) 2026
04/29/2026: 3:45 PM - 5:15 PM CDT
Refereed 

Tracks

AI and LLM Applications

Presentations

SADA: Safe and Adaptive Aggregation of Multiple Black-Box Predictions in Semi-Supervised Learning

Presenting Author

Jiwei Zhao, University of Wisconsin-Madison

Beyond the One-Size-Fits-All: A Deep Learning Method for Equitable Data Integration and Subgroup-Specific Biomarker Identification - An Application to COPD

The heterogeneity of chronic obstructive pulmonary disease (COPD) and other complex diseases has spurred efforts to leverage multiomics and phenotypic data to identify biomarkers of disease risk and progression and to better understand the underlying physiology. These efforts focus mainly on the general population, use few molecular factors, rarely account for social determinants of health (SDoH), and establish only simple associations, which limits their ability to characterize health in disadvantaged populations. We propose a broader, systems-level perspective centered on the totality of SDoH, multiomics, and phenotypic data, using innovative interpretable deep learning (DL) methods to better understand and help address health disparities in COPD and other complex diseases. Our proposed DL method jointly integrates data from multiple sources and predicts a clinical outcome while selecting both common and subgroup-specific variables and encouraging fairness with respect to sensitive attributes (e.g., race). Simulations are used to compare the effectiveness of the proposed method with that of other methods in the literature. Real data analyses are conducted to identify race-specific multiomics markers of COPD.

Presenting Author

Sandra Safo, University of Minnesota

Flexible Deep Survival Learning for Kidney Transplantation: Knowledge Distillation and Data Integration

Prognostic prediction using survival analysis faces challenges due to complex relationships between risk factors and time-to-event outcomes. Deep learning methods have shown promise in overcoming these challenges, but their effectiveness often relies on large datasets. When applied to moderate or small datasets, these methods often suffer from severe problems, such as insufficient training data, overfitting, and difficulty in tuning hyperparameters. To address these issues and improve prognostic predictions, this talk introduces a flexible deep learning framework for integrating external risk models with internal time-to-event data using a generalized Kullback-Leibler divergence penalty. Applied to the Scientific Registry of Transplant Recipients (SRTR), the method improves prediction of short-term mortality and graft failure after kidney transplant. These gains enable transplant-specific applications such as donor risk reclassification and early post-transplant triage, supporting more reliable, data-driven decision-making across the kidney transplant pathway.

Presenting Author

Kevin (Zhi) He, University of Michigan