Heterogeneous transfer learning for high dimensional regression with feature mismatch
Tuesday, Aug 5: 11:20 AM - 11:35 AM
1887
Contributed Papers
Music City Center
We study how to transfer knowledge from a source (proxy) domain to a target domain in order to learn a high-dimensional regression model when the two domains may have different features. The statistical properties of homogeneous transfer learning have recently been investigated, but most homogeneous transfer and multi-task learning methods assume that the target and proxy domains share the same feature space. In practice, the target and proxy feature spaces are often not fully matched, because some variables cannot be measured in data-poor target environments. Existing heterogeneous transfer learning methods do not provide statistical error guarantees. We propose a two-stage method that first learns the relationship between the missing and observed features through a projection step and then solves a joint penalized optimization problem. We derive upper bounds on the method's parameter estimation and prediction risks, assuming that the proxy and target domain parameters are sparsely different. Our results elucidate how estimation and prediction error depend on the complexity of the model, the sample size, the extent of feature overlap, and the correlation between matched and mismatched features.
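The abstract does not state the estimator explicitly, so the following is only an illustrative sketch of what a two-stage procedure of this kind might look like, not the authors' method. It assumes the proxy data contain both the observed and the missing features, that the projection step is a ridge-stabilized linear regression of the missing features on the observed ones, and that the joint problem is a lasso over a shared coefficient vector `w` plus a sparse target-specific shift `delta`. All function names, the penalty levels `lam_w` and `lam_d`, and the `ridge` constant are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding, the proximal operator of the l1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(Z, y, lam, n_iter=2000):
    # Proximal gradient (ISTA) for (1/2n)||y - Z theta||^2 + sum_j lam_j |theta_j|.
    # `lam` may be a scalar or a vector of per-coordinate penalty levels.
    n, p = Z.shape
    L = np.linalg.norm(Z, 2) ** 2 / n  # Lipschitz constant of the gradient
    theta = np.zeros(p)
    for _ in range(n_iter):
        grad = Z.T @ (Z @ theta - y) / n
        theta = soft_threshold(theta - grad / L, lam / L)
    return theta

def two_stage_transfer(Xs_obs, Xs_mis, ys, Xt_obs, yt,
                       lam_w=0.05, lam_d=0.1, ridge=1e-3):
    # Stage 1 (assumed form): learn a linear map G from observed to missing
    # features on the proxy data, then impute the target's missing features.
    p_obs = Xs_obs.shape[1]
    G = np.linalg.solve(Xs_obs.T @ Xs_obs + ridge * np.eye(p_obs),
                        Xs_obs.T @ Xs_mis)
    Xt_hat = np.hstack([Xt_obs, Xt_obs @ G])

    # Stage 2 (assumed form): joint lasso with proxy coefficients w and
    # target coefficients w + delta, where delta is penalized to be sparse.
    Xs = np.hstack([Xs_obs, Xs_mis])
    p = Xs.shape[1]
    Z = np.block([[Xs, np.zeros_like(Xs)],
                  [Xt_hat, Xt_hat]])
    y = np.concatenate([ys, yt])
    lam = np.concatenate([np.full(p, lam_w), np.full(p, lam_d)])
    theta = lasso_ista(Z, y, lam)
    w, delta = theta[:p], theta[p:]
    return w, w + delta, G
```

In this sketch the returned triple is the proxy coefficient estimate, the target coefficient estimate, and the learned projection matrix. A larger `lam_d` relative to `lam_w` pulls the target estimate toward the proxy estimate, which mirrors the assumption that the two parameter vectors are sparsely different.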
Transfer Learning
Data integration
Trustworthy AI
Feature Mismatch
Penalized regression
Main Sponsor
Section on Statistical Learning and Data Science