Covariate-Informed Identification of Heterogeneity and Outliers in Longitudinal Data

Jeremy Gaskins Co-Author
University of Louisville
 
Anish Mukherjee First Author
University of Louisville
 
Jeremy Gaskins Presenting Author
University of Louisville
 
Thursday, Aug 7: 11:05 AM - 11:20 AM
1580 
Contributed Papers 
Music City Center 
We often observe heterogeneity in longitudinal data, where the mean and variance for certain profiles differ meaningfully from the rest. Some profiles may also exhibit outliers at a limited number of measurements. Using a standard mixed-effects model, which assumes homogeneity, can lead to overestimation of the residual variance and to inefficient estimation. In this work, we identify and account for three sources of heterogeneity in longitudinal data: incompatible mean trajectories, increased residual variance, and outliers at individual measurements. Our Bayesian mixture model incorporates binary indicators of heterogeneity for each of these features, modeled through logistic regression using covariates. We perform statistical inference using Markov chain Monte Carlo and implement model selection to evaluate the inclusion of the various heterogeneous components. Simulations demonstrate that our model can accurately identify heterogeneity and produce efficient estimates of the fixed-effects parameters. We further validate our approach using the CD4 data and DHEAS hormone data from the SWAN study.
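
The three sources of heterogeneity described above can be illustrated with a small simulation. This is a minimal sketch under assumed settings, not the authors' model or code: the function names (`simulate`, `pooled_residual_variance`), the logistic coefficients, the inflation factors, and the sample sizes are all hypothetical choices made for illustration. It draws covariate-driven binary indicators for an incompatible mean trajectory, increased residual variance, and measurement-level outliers, then compares the residual variance from a naive homogeneous fit with the same estimate restricted to unaffected profiles.

```python
import numpy as np

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))

def simulate(n=300, T=6, sigma=1.0, seed=0):
    """Simulate n profiles of length T with three covariate-driven
    sources of heterogeneity. All coefficients are hypothetical."""
    rng = np.random.default_rng(seed)
    t = np.arange(T, dtype=float)
    x = rng.normal(size=n)                      # subject-level covariate
    # Logistic-regression probabilities for each indicator (assumed values)
    p_mean = expit(-2.0 + 1.0 * x)              # incompatible mean trajectory
    p_var = expit(-2.0 - 1.0 * x)               # increased residual variance
    p_out = expit(-3.5 + 0.5 * x)               # outlier at a single measurement
    d_mean = rng.random(n) < p_mean
    d_var = rng.random(n) < p_var
    d_out = rng.random((n, T)) < p_out[:, None]
    b = rng.normal(scale=0.5, size=n)           # random intercept
    mu = 2.0 + 0.5 * t[None, :] + b[:, None]    # common fixed-effects trajectory
    mu = mu + d_mean[:, None] * (3.0 - 1.0 * t[None, :])  # shifted trajectory
    sd = sigma * np.where(d_var, 3.0, 1.0)[:, None] * np.where(d_out, 5.0, 1.0)
    y = mu + rng.normal(size=(n, T)) * sd
    return t, x, y, d_mean, d_var, d_out

def pooled_residual_variance(t, y):
    """Residual variance from a homogeneous fit: subject-specific
    intercepts (removed by centering) plus one common slope in time."""
    n, T = y.shape
    yc = y - y.mean(axis=1, keepdims=True)
    tc = t - t.mean()
    slope = (yc * tc).sum() / (n * (tc ** 2).sum())
    resid = yc - slope * tc
    return (resid ** 2).sum() / (n * (T - 1) - 1)

t, x, y, d_mean, d_var, d_out = simulate()
naive = pooled_residual_variance(t, y)
# Profiles with no heterogeneity indicator switched on
clean_rows = ~d_mean & ~d_var & ~d_out.any(axis=1)
clean = pooled_residual_variance(t, y[clean_rows])
print(f"naive residual variance: {naive:.2f}")
print(f"clean-profile residual variance: {clean:.2f} (true value 1.00)")
```

In this sketch the indicators are known because the data are simulated; in the approach described in the abstract they are latent and inferred by MCMC, so the comparison only shows why ignoring them inflates the residual variance.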

Keywords

longitudinal data

outliers

Bayesian 

Main Sponsor

Biometrics Section