Advances in Transfer Learning

Chair: Jingbo Liu, UIUC

Tuesday, Aug 5: 10:30 AM - 12:20 PM
Session 4097, Contributed Papers
Music City Center
Room: CC-103B

Main Sponsor

Section on Statistical Learning and Data Science

Presentations

Adaptive Deep Transfer Learning in High-Dimensional Regression

In high-dimensional regression, transfer learning provides an effective framework for leveraging knowledge from a related source domain to improve learning in a target domain. We propose an adaptive deep transfer learning approach for nonparametric regression, in which the target function is approximated by leveraging a neural network pre-trained on the source dataset. Specifically, we first train a deep neural network to learn the functional relationship in the source domain. Then, instead of fine-tuning the source model directly, we model the difference between the source and target functions with a second neural network trained on the target dataset. The final approximation of the target function is the sum of these two networks, enabling adaptive knowledge transfer while preserving flexibility in high-dimensional settings. This framework enhances generalization by efficiently capturing both shared structure and domain-specific variation. We demonstrate the effectiveness of our method through theoretical analysis and empirical evaluations on synthetic and real-world datasets, showing improved predictive performance.
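
As a concrete picture of the additive scheme described above, the minimal PyTorch sketch below trains a source network, freezes it, fits a second network to the target residuals, and predicts with the sum. The data, architecture, and optimizer settings are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 10
# Toy stand-ins for the source and target samples (illustrative only).
X_source, X_target = torch.randn(500, d), torch.randn(100, d)
shared = lambda X: torch.sin(X[:, :1])                 # shared structure
y_source = shared(X_source) + 0.1 * torch.randn(500, 1)
y_target = shared(X_target) + 0.5 * X_target[:, 1:2] + 0.1 * torch.randn(100, 1)

def mlp(d_in, width=64):
    # Feed-forward regressor; the architecture is an illustrative choice.
    return nn.Sequential(nn.Linear(d_in, width), nn.ReLU(),
                         nn.Linear(width, width), nn.ReLU(),
                         nn.Linear(width, 1))

def fit(model, X, y, epochs=300, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.mse_loss(model(X), y).backward()
        opt.step()
    return model

# Step 1: learn the source regression function on the source data.
f_source = fit(mlp(d), X_source, y_source)

# Step 2: freeze the source network and fit a second network on the target
# data to the residual y - f_source(x), i.e., the function difference.
for p in f_source.parameters():
    p.requires_grad_(False)
with torch.no_grad():
    resid = y_target - f_source(X_target)
delta = fit(mlp(d), X_target, resid)

# Final target predictor: the sum of the two networks.
def predict(X):
    with torch.no_grad():
        return f_source(X) + delta(X)
```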

Keywords

Neural networks

Nonparametric regression

Transfer learning 

Co-Author(s)

Jie Hu, University of Pennsylvania
Yudong Wang, University of Pennsylvania, Perelman School of Medicine
Yong Chen, University of Pennsylvania, Perelman School of Medicine

First Author

Yue WU

Presenting Author

Yue WU

Domain Adaptation Optimized for Robustness in Mixture Populations

Integrative analysis of multi-institutional biobank-linked EHR data can advance precision medicine by leveraging large, diverse datasets. Yet, generalizing findings to target populations is difficult due to inherent demographic and clinical heterogeneity. Existing transfer learning methods often assume the target shares an outcome model with at least one source, overlooking mixture populations. Additional challenges arise when the outcome of interest is not directly observed and population mixtures must be explained using a broader set of clinical characteristics. To address these challenges under shifts in both covariates and outcome models, we propose Domain Adaptation Optimized for Robustness in Mixture populations (DORM). Leveraging partially labeled source data, DORM builds an initial target model under a joint source-mixture assumption, then applies group adversarial learning to optimize worst-case performance in a neighborhood of this initial model. A tuning strategy refines the approach when limited target labels are available. Asymptotic results confirm statistical convergence and predictive accuracy, and simulations and real-world studies show DORM surpasses existing methods.
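
The group adversarial step can be pictured with a generic group-DRO-style update: an adversary reweights source groups toward the worst-performing one while the learner descends the weighted loss. This is a standard group distributionally robust recipe, not the DORM estimator; the linear models and data below are placeholders.

```python
import torch

torch.manual_seed(0)
d, n, K = 5, 200, 3                      # features, samples per source, sources
# Placeholder source datasets whose linear outcome models differ.
betas = [torch.randn(d) for _ in range(K)]
Xs = [torch.randn(n, d) for _ in range(K)]
ys = [X @ b + 0.1 * torch.randn(n) for X, b in zip(Xs, betas)]

theta = torch.zeros(d, requires_grad=True)   # target model parameters
q = torch.ones(K) / K                        # adversarial mixture weights
opt = torch.optim.SGD([theta], lr=0.05)
eta = 0.5                                    # adversary step size

for _ in range(500):
    losses = torch.stack([((X @ theta - y) ** 2).mean()
                          for X, y in zip(Xs, ys)])
    with torch.no_grad():
        # Adversary: exponentiated-gradient ascent on the group weights,
        # shifting mass toward the worst-performing mixture component.
        q = q * torch.exp(eta * losses)
        q = q / q.sum()
    # Learner: gradient descent on the adversarially weighted loss.
    opt.zero_grad()
    (q * losses).sum().backward()
    opt.step()
```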

Keywords

Domain adaptation

Multi-source data

Mixture population

Group distributional robustness 

Co-Author(s)

Zijian Guo, Rutgers University
Tianxi Cai, Harvard University
Molei Liu, Columbia University
Xin Xiong, Harvard University

First Author

Keyao Zhan, Harvard University

Presenting Author

Keyao Zhan, Harvard University

FASTER: Feature Alignment and Structured Transfer via Efficient Regularization

Most existing transfer learning methods assume identical feature spaces for all domains. However, differences in data collection often create feature variations across domains, making the feature space heterogeneous. To address this, we propose FASTER (Feature Alignment and Structured Transfer via Efficient Regularization), a novel two-step transfer learning framework that integrates regularized feature alignment with structured modeling to enhance knowledge transfer. FASTER first aligns source and target domains by learning structured feature mappings through covariance-regularized optimization, ensuring effective information transfer despite feature differences. In the second step, a joint predictive model is trained on the mapped source and target data by minimizing a regularized loss function, followed by an adaptive correction term that refines task-specific differences. Our approach reduces domain disparity while preserving interpretability through structured regularization. Extensive simulations and real-data experiments validate the effectiveness of FASTER in heterogeneous feature adaptation, providing a principled solution for transfer learning across diverse domains. 
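
A rough sketch of the two-step recipe, under strong simplifying assumptions: a ridge map learned on a small paired calibration set stands in for the covariance-regularized alignment, and a pooled lasso plus a target-only lasso correction stands in for the joint fit with adaptive correction. None of these choices are claimed to match FASTER's actual objective.

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
p_s, p_t, n_s, n_t, n_pair = 12, 8, 500, 80, 60

# The source measures the same latent signal through a different,
# higher-dimensional feature representation (heterogeneous feature spaces).
A = rng.normal(size=(p_t, p_s))
Z_s = rng.normal(size=(n_s, p_t))                    # latent target-space values
X_s = Z_s @ A + 0.1 * rng.normal(size=(n_s, p_s))    # source features
X_t = rng.normal(size=(n_t, p_t))                    # target features
beta = np.zeros(p_t)
beta[:3] = 1.0
y_s = Z_s @ beta + 0.1 * rng.normal(size=n_s)
y_t = X_t @ beta + 0.1 * rng.normal(size=n_t)

# Step 1: regularized feature alignment. A ridge map from the source feature
# space into the target space, learned on a small paired calibration set
# (an illustrative assumption standing in for covariance regularization).
mapper = Ridge(alpha=1.0).fit(X_s[:n_pair], Z_s[:n_pair])
X_s_aligned = mapper.predict(X_s)

# Step 2: a joint penalized fit on pooled aligned-source and target data,
# followed by an adaptive correction for target-specific differences.
joint = Lasso(alpha=0.05).fit(np.vstack([X_s_aligned, X_t]),
                              np.concatenate([y_s, y_t]))
corr = Lasso(alpha=0.05).fit(X_t, y_t - joint.predict(X_t))
beta_hat = joint.coef_ + corr.coef_                  # final target coefficients
```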

Keywords

Transfer Learning

Heterogeneous Feature Space

Feature Alignment

Regularization 

Co-Author

Yang Feng, New York University

First Author

Iris Zhang, New York University

Presenting Author

Iris Zhang, New York University

Heterogeneous Transfer Learning for High-Dimensional Regression with Feature Mismatch

We study transferring knowledge from a source domain to a target domain to learn a high-dimensional regression model when the two domains have possibly different features. The statistical properties of homogeneous transfer learning have recently been investigated, but most homogeneous transfer and multi-task learning methods assume that the target and proxy domains share the same feature space. In practice, the target and proxy feature spaces are often not fully matched because some variables cannot be measured in data-poor target environments. Existing heterogeneous transfer learning methods do not provide statistical error guarantees. We propose a two-stage method that learns the relationship between the missing and observed features through a projection step and then solves a joint penalized optimization problem. We develop upper bounds on the method's parameter estimation and prediction risks, assuming that the proxy and target domain parameters are sparsely different. Our results elucidate how the estimation and prediction errors depend on model complexity, sample size, the extent of feature overlap, and the correlation between matched and mismatched features.
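
The two-stage method might be sketched as follows, under toy assumptions: the projection step is a linear regression of the mismatched features on the observed ones fitted in the source, and the joint penalized problem is approximated by a pooled lasso plus a sparse target-only correction. The paper's actual joint objective may differ.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso

rng = np.random.default_rng(1)
p_obs, p_miss, n_s, n_t = 8, 4, 400, 60
B = rng.normal(size=(p_obs, p_miss)) / np.sqrt(p_obs)  # cross-feature dependence

def draw(n):
    # The observed block drives the unmeasured block, so projection is informative.
    X_obs = rng.normal(size=(n, p_obs))
    X_miss = X_obs @ B + 0.3 * rng.normal(size=(n, p_miss))
    return np.hstack([X_obs, X_miss])

beta = np.zeros(p_obs + p_miss)
beta[[0, 1, p_obs]] = 1.0
X_s = draw(n_s)                                 # source observes everything
X_t_full = draw(n_t)                            # target's last block is unmeasured
y_s = X_s @ beta + 0.1 * rng.normal(size=n_s)
y_t = X_t_full @ beta + 0.1 * rng.normal(size=n_t)
X_t_obs = X_t_full[:, :p_obs]

# Stage 1 (projection step): on the source, learn the projection of the
# mismatched features onto the observed ones, then fill in the target block.
proj = LinearRegression().fit(X_s[:, :p_obs], X_s[:, p_obs:])
X_t_hat = np.hstack([X_t_obs, proj.predict(X_t_obs)])

# Stage 2: joint penalized fit on pooled data plus a sparse target-only
# correction, reflecting the "sparsely different parameters" assumption.
pooled = Lasso(alpha=0.05).fit(np.vstack([X_s, X_t_hat]),
                               np.concatenate([y_s, y_t]))
delta = Lasso(alpha=0.05).fit(X_t_hat, y_t - pooled.predict(X_t_hat))
beta_target = pooled.coef_ + delta.coef_
```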

Keywords

Transfer Learning

Data integration

Trustworthy AI

Feature Mismatch

Penalized regression 

Co-Author(s)

Massimiliano Russo, The Ohio State University
Subhadeep Paul, The Ohio State University

First Author

Jae Ho Chang, The Ohio State University

Presenting Author

Jae Ho Chang, The Ohio State University

MTL-MICE: Multi-Task Learning in Multiple Imputation by Chained Equations

High-dimensional datasets are common in healthcare and public health, where multi-center electronic health records (EHRs) and national surveys pose complex missing data challenges. Traditional imputation methods struggle in these settings, as they handle missing values independently for each task. To address this, we propose Multi-Task Learning via Multiple Imputation by Chained Equations (MTL-MICE), a novel approach that enhances imputation by leveraging shared information across tasks.
MTL-MICE integrates multi-task learning into the MICE framework, capturing correlations among tasks to improve accuracy and robustness. Instead of treating missing data separately, it utilizes shared relationships across features. Additionally, we incorporate a transferable source detection technique to identify informative tasks, refining imputation further.
Through simulations and real-world studies, we show that MTL-MICE significantly reduces imputation error and bias compared to single-task methods while preserving MICE's flexibility. These findings highlight the potential of multi-task learning to improve missing data methodologies for large-scale, high-dimensional studies. 
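
A bare-bones sketch of the pooled-fitting idea: in each chained-equations sweep, the model for a given column is fit on rows pooled across all tasks instead of within one task. Proper multiple imputation would also draw from a predictive distribution and repeat over several imputed datasets; this sketch keeps only the point-imputation skeleton, and the data, column model, and missingness mechanism are illustrative.

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(2)

def make_task(n, p=5):
    X = rng.normal(size=(n, p))
    X[:, 1] += 0.8 * X[:, 0]                  # dependence shared by all tasks
    mask = rng.random(X.shape) < 0.2          # 20% missing completely at random
    X_obs = X.copy()
    X_obs[mask] = np.nan
    return X_obs, mask

tasks = [make_task(150) for _ in range(3)]    # three related datasets (tasks)

# Initialize with column means, then run chained-equations sweeps in which
# each column's model is fit on rows pooled across tasks (the multi-task
# idea) rather than within a single task as standard MICE would.
imputed = [np.where(m, np.nanmean(X, axis=0), X) for X, m in tasks]
for _ in range(5):
    for j in range(imputed[0].shape[1]):
        obs = [~m[:, j] for _, m in tasks]    # rows where column j is observed
        pool_X = np.vstack([np.delete(Xi, j, axis=1)[r]
                            for Xi, r in zip(imputed, obs)])
        pool_y = np.concatenate([Xi[r, j] for Xi, r in zip(imputed, obs)])
        shared_model = RidgeCV().fit(pool_X, pool_y)
        for Xi, (_, m) in zip(imputed, tasks):
            miss = m[:, j]
            if miss.any():
                # Point imputation; proper multiple imputation would add a
                # stochastic draw here.
                Xi[miss, j] = shared_model.predict(np.delete(Xi, j, axis=1)[miss])
```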

Keywords

Multi-Task Learning

Missing Data

Multiple Imputation

Transfer learning

High-dimensional inference

Lasso 

Co-Author

Yang Feng, New York University

First Author

Yuyu (Ruby) Chen, New York University

Presenting Author

Yuyu (Ruby) Chen, New York University

Privacy-Preserving Transfer Learning Approach

The underrepresentation of diverse populations in clinical research undermines the generalizability of predictive models, particularly for groups with small sample sizes. Data-sharing restrictions across institutions further complicate collaborative analysis. To address this issue, we propose META-TL, a federated transfer learning framework that integrates heterogeneous data from multiple healthcare institutions to improve predictive accuracy for underrepresented target populations. META-TL leverages informative features, handles high-dimensional data efficiently, reduces computational costs, and maintains robust performance. Theoretical analysis and simulations show META-TL performs comparably to pooled analysis, despite data-sharing constraints, and remains resilient to noisy or biased data. We demonstrate its practical utility by applying it to electronic health records (EHR) for predicting type 2 diabetes risk in underrepresented groups. 
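
The abstract gives no algorithmic detail, so the sketch below shows only one generic federated-transfer pattern consistent with it: each institution shares a fitted coefficient vector rather than patient-level data, and the target site aggregates the shared models weighted by their fit to its small local sample. This is a common baseline recipe, not META-TL itself; all names and data are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
p = 10

def site(n, shift=0.0):
    # Heterogeneous institutions: related but shifted outcome models.
    beta = np.zeros(p)
    beta[:3] = 1.0
    beta[0] += shift
    X = rng.normal(size=(n, p))
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta))).astype(int)
    return X, y

sources = [site(1000, s) for s in (0.0, 0.3, -0.2)]   # large source institutions
X_t, y_t = site(80)                                    # small target cohort

# Privacy constraint: each institution shares only a fitted sparse
# coefficient vector, never patient-level records.
coefs = [LogisticRegression(penalty='l1', solver='liblinear')
         .fit(X, y).coef_.ravel() for X, y in sources]

# Target side: weight each shared model by its log-likelihood on the small
# local sample, then aggregate into a federated estimate.
def loglik(b):
    eta = X_t @ b
    return np.mean(y_t * eta - np.log1p(np.exp(eta)))

w = np.exp([loglik(b) for b in coefs])
w /= w.sum()
beta_fed = sum(wi * b for wi, b in zip(w, coefs))
```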

Keywords

transfer learning

privacy-preserving

feature selection

high-dimensional data 

Presenting Author

Tingyin Wang

Transfer Learning Under High-Dimensional Graph Convolutional Regression for Node Classification

Node classification is a fundamental task for network data, but obtaining node labels can be challenging and expensive in many real-world scenarios. Transfer learning has emerged as a promising solution, leveraging knowledge from source domains to enhance learning in a target domain. Existing transfer learning methods for node classification primarily focus on integrating graph convolutional networks (GCNs) with various transfer learning techniques. While these approaches have shown promising results, they often lack theoretical guarantees, rely on restrictive conditions, and are highly sensitive to hyperparameter tuning. To address these limitations, we introduce the Graph Convolutional Multinomial Logistic Lasso Regression (GCR) model, a simplified version of a GCN, and propose a novel transfer learning method, Trans-GCR, with theoretical guarantees. Trans-GCR demonstrates superior empirical performance, has low computational cost, and requires fewer hyperparameters than existing methods. We also illustrate how Trans-GCR enhances Alzheimer's disease risk assessment in smaller target cohorts by transferring knowledge from a larger, well-characterized biobank.
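
The base GCR model admits a compact sketch: treat graph convolution as fixed feature propagation through the normalized adjacency, then fit a multinomial logistic regression with an l1 (lasso) penalty on the propagated features. The graph, labels, and propagation depth below are illustrative assumptions, and the transfer step of Trans-GCR is omitted.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n, p, k = 200, 20, 3
# Toy graph: sparse symmetric adjacency with self-loops, then the usual
# symmetric normalization D^{-1/2} (A + I) D^{-1/2}.
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.maximum(A, A.T) + np.eye(n)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_norm = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

X = rng.normal(size=(n, p))
y = rng.integers(0, k, size=n)               # placeholder node labels

# GCR idea: graph convolution as fixed feature propagation (two hops here),
# followed by a multinomial logistic regression with an l1 penalty.
H = A_norm @ (A_norm @ X)
gcr = LogisticRegression(penalty='l1', solver='saga', C=1.0,
                         max_iter=5000).fit(H, y)
```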

Keywords

Transfer learning

Graph convolutions

High-dimensional

Node classification 

Co-Author(s)

Danyang Huang, Renmin University of China
Liyuan Wang, Renmin University of China
Kathryn Lunetta, Boston University School of Public Health
Debarghya Mukherjee, Boston University
Huimin Cheng, Boston University

First Author

Jiachen Chen, Boston University School of Public Health

Presenting Author

Jiachen Chen, Boston University School of Public Health