32 Generalized Propensity using Computer Learning Methods in High Dimensional and Nonlinear Data

Kai Ding Co-Author
University of Oklahoma Health Sciences Center
 
Sixia Chen Co-Author
 
Jonathan Baldwin First Author
 
Jonathan Baldwin Presenting Author
 
Tuesday, Aug 6: 10:30 AM - 12:20 PM
3187 
Contributed Posters 
Oregon Convention Center 
Introduction: Few studies have examined the performance of the generalized propensity score (GPS) in estimating average treatment effects (ATE) when the GPS is estimated with computer learning methods in high-dimensional and nonlinear data.

Objective: Use simulation to assess bias in causal effect estimates when the GPS is estimated with multiple computer learning methods in high-dimensional and nonlinear data.

Methods: A large population was simulated with four covariates associated with a continuous treatment and a continuous outcome. Extraneous covariates were added to create a total of four dimensionality scenarios. Treatment associations were simulated in both linear and nonlinear forms. From this population, 1000 Monte Carlo datasets were randomly sampled, and the GPS was estimated using multiple linear and computer learning algorithms (including but not limited to random forest, SVM, and deep learning). The ATE was estimated for each model type, and the models were compared using bias and absolute percent relative bias against the known population effects.

Expected Results: Common linear model methods will perform well in linear, low-dimensional scenarios, whereas computer learning methods will outperform them under high dimensionality and nonlinearity.
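The simulation design described in the Methods can be illustrated with a minimal sketch, using NumPy only. Everything specific here is an assumption made for illustration and is not taken from the abstract: the coefficients, the sample sizes, the choice to treat the four covariates as confounders, the use of a normal linear model for the GPS, and the stabilized inverse-density weighting used to estimate the treatment effect. For brevity, each replicate is drawn directly from the data-generating model rather than sampled from one fixed population. The function names (`simulate`, `gps_weights`, `wls_slope`, `monte_carlo`) are hypothetical. A computer learning GPS model such as a random forest would replace the linear fit inside `gps_weights`.

```python
import numpy as np


def simulate(n, n_noise, rng, true_effect=1.0):
    # Assumed design: 4 covariates affect treatment and outcome;
    # n_noise extraneous covariates are unrelated to both.
    X = rng.normal(size=(n, 4 + n_noise))
    conf = X[:, :4].sum(axis=1)
    T = 0.25 * conf + rng.normal(size=n)             # linear treatment model
    Y = true_effect * T + conf + rng.normal(size=n)  # continuous outcome
    return X, T, Y


def normal_pdf(x, mean, sd):
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def wls_slope(T, Y, w):
    # Weighted least squares of Y on (1, T); returns the slope on T.
    A = np.column_stack([np.ones_like(T), T])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], Y * sw, rcond=None)
    return beta[1]


def gps_weights(X, T):
    # GPS: estimated conditional density of T given X (normal linear model).
    A = np.column_stack([np.ones(len(T)), X])
    beta, *_ = np.linalg.lstsq(A, T, rcond=None)
    mu = A @ beta
    gps = normal_pdf(T, mu, np.std(T - mu))
    # Stabilized weight: marginal density of T divided by the GPS.
    return normal_pdf(T, T.mean(), T.std()) / gps


def monte_carlo(n_reps=100, n=2000, n_noise=0, true_effect=1.0, seed=0):
    rng = np.random.default_rng(seed)
    est = {"naive": [], "gps": []}
    for _ in range(n_reps):
        X, T, Y = simulate(n, n_noise, rng, true_effect)
        est["naive"].append(wls_slope(T, Y, np.ones(n)))       # unadjusted
        est["gps"].append(wls_slope(T, Y, gps_weights(X, T)))  # GPS-weighted
    out = {}
    for name, vals in est.items():
        bias = float(np.mean(vals) - true_effect)
        # (bias, absolute percent relative bias)
        out[name] = (bias, 100.0 * abs(bias) / abs(true_effect))
    return out
```

Calling `monte_carlo(n_noise=20)` would mimic one higher-dimensional scenario under these assumptions. The confounding strength is kept deliberately modest because inverse-density weights for a continuous treatment can become very heavy-tailed when the covariates explain much of the treatment variance.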

Keywords

Generalized Propensity Score

Machine Learning

High Dimensional

Non-Linear 


Main Sponsor

Section on Statistics in Epidemiology