A Robust Method for Integrating Heterogeneous and Summary-Level Data from Various Data Sources

Andriy Derkach Co-Author
Memorial Sloan Kettering Cancer Center
 
Farimah Shamsi First Author
 
Farimah Shamsi Presenting Author
 
Monday, Aug 4: 2:05 PM - 2:20 PM
1289 
Contributed Papers 
Music City Center 
The dramatic increase in data sources for scientific research has highlighted the need for statistical methods that efficiently combine data at different levels into a comprehensive model. In our previous work, we demonstrated that the parameters of a full model can be estimated from summary-level data by integrating straightforward score equations, provided that the random sampling assumptions hold. In this research, we propose an extended method that combines data from potentially heterogeneous populations with summary-level data while accounting for this heterogeneity using the Fisher information matrix. The technique uses this information to estimate the sampling weights of each study, which are then used to recalibrate the estimating equations for the full model coefficients. The performance of the proposed method will be evaluated under various sampling designs using simulation studies, and the method will be applied to a reanalysis of data from U.S. cancer registries and summary-level odds ratio estimates for selected colorectal cancer (CRC) risk factors while relaxing the random sampling assumption.
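As a rough illustration of what recalibrating estimating equations with sampling weights can look like, the sketch below solves a generic weighted logistic score equation, sum_i w_i (1, x_i)^T (y_i - expit(b0 + b1 x_i)) = 0, by Newton-Raphson. It is not the authors' estimator: the weights are assumed to be already known, the step that estimates them from the Fisher information matrix is not shown, and the function names (expit, weighted_logistic_fit) and the toy data are hypothetical.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def weighted_logistic_fit(x, y, w, n_iter=25, tol=1e-10):
    """Solve the weighted score equations of a logistic model with an
    intercept and one covariate by Newton-Raphson.

    x, y, w are equal-length sequences: covariate, binary outcome, and
    sampling weight (assumed given here, not estimated).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s0 = s1 = 0.0            # weighted score vector
        i00 = i01 = i11 = 0.0    # weighted information matrix (2x2, symmetric)
        for xi, yi, wi in zip(x, y, w):
            p = expit(b0 + b1 * xi)
            r = wi * (yi - p)
            v = wi * p * (1.0 - p)
            s0 += r
            s1 += r * xi
            i00 += v
            i01 += v * xi
            i11 += v * xi * xi
        # Newton step: solve the 2x2 system (information) * delta = score
        det = i00 * i11 - i01 * i01
        d0 = (i11 * s0 - i01 * s1) / det
        d1 = (-i01 * s0 + i00 * s1) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Toy data (hypothetical): a binary covariate and a binary outcome.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 0, 0, 0, 1, 1, 1, 0]

# Unit weights reproduce the ordinary maximum likelihood fit.
unweighted = weighted_logistic_fit(x, y, [1.0] * 8)

# Up-weighting one observation shifts the solution of the score equations.
weighted = weighted_logistic_fit(x, y, [3.0, 1, 1, 1, 1, 1, 1, 1])
```

In this toy example the single up-weighted case raises the weighted outcome proportion in the x = 0 group from 1/4 to 1/2, so the fitted intercept and slope both change relative to the unweighted fit.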

Keywords

Data integration

Information synthesis

Summary level information

Sampling weight calibration

Propensity score

Main Sponsor

Biometrics Section