Thursday, Aug 7: 10:30 AM - 12:20 PM
4222
Contributed Papers
Music City Center
Room: CC-103A
In this session, the latest research on statistical and learning methods for random effects and mixed modeling will be presented across various areas of research and study.
Main Sponsor
Biometrics Section
Presentations
Chronic pain is a major public health issue that imposes substantial health, emotional, and economic burdens on the population. Pain, an inherently subjective experience, is typically measured by patient-reported scores, often on an 11-point scale (0–10). Recent studies assess pain using ecological momentary assessment (EMA), with one or more assessments daily over multiple days, and longitudinally (e.g., pre- and post-intervention). The data often exhibit zero (no pain) or one (maximum pain) inflation, and there is substantial within-person variability both within and across days. Statistical modeling of pain trajectories thus presents challenges. We propose a beta-binomial (BB) model to estimate potentially zero- or one-inflated pain scores over time, using a Bayesian approach via the Hamiltonian Monte Carlo algorithm implemented in Stan. The model accounts for within-person variability using random effects in both the location and dispersion parameters of the BB distribution. A simulation study shows that our method provides valid posterior inference on all model parameters for sufficiently rich data. The method offers a powerful framework for studying the mechanisms underlying patient-reported pain scores.
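As a rough illustration of the kind of distribution described above, the following sketch evaluates a beta-binomial probability mass function with extra point masses at the lowest and highest scores. The mean/precision parameterization (`mu`, `phi`), the inflation probabilities (`p_zero`, `p_max`), and all function names are assumptions made for this example; the abstract does not specify the parameterization, and the random effects and the Stan implementation are not shown.

```python
from math import exp, lgamma

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(y, n, mu, phi):
    """Beta-binomial pmf with mean parameter mu in (0, 1) and precision phi > 0
    (assumed parameterization: a = mu * phi, b = (1 - mu) * phi)."""
    a, b = mu * phi, (1.0 - mu) * phi
    log_choose = lgamma(n + 1) - lgamma(y + 1) - lgamma(n - y + 1)
    return exp(log_choose + log_beta(y + a, n - y + b) - log_beta(a, b))

def inflated_pmf(y, n, mu, phi, p_zero, p_max):
    """Mixture of a point mass at 0, a point mass at the maximum score n,
    and a beta-binomial component for everything else."""
    prob = (1.0 - p_zero - p_max) * beta_binomial_pmf(y, n, mu, phi)
    if y == 0:
        prob += p_zero
    if y == n:
        prob += p_max
    return prob

# Hypothetical parameter values for an 11-point scale (scores 0 to 10).
probs = [inflated_pmf(y, 10, mu=0.4, phi=5.0, p_zero=0.15, p_max=0.05)
         for y in range(11)]
```

In a model like the one in the abstract, `mu` and `phi` would presumably depend on person-level random effects; here they are fixed numbers so that the shape of the mixture is easy to inspect.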
Keywords
Chronic Pain
Beta-Binomial
Zero Inflation
Bayesian
Trajectory
EMA
Co-Author(s)
Martin Lindquist, Johns Hopkins University
Andrew Leroux, Department of Biostatistics & Informatics, University of Colorado, Denver, CO
First Author
Yanxi Liu, Johns Hopkins University
Presenting Author
Yanxi Liu, Johns Hopkins University
We introduce a Bayesian approach to checking the conditional independence
assumptions made when evaluating the accuracy of diagnostic tests. The
approach is based on the sampling distributions of pivotal quantities,
which are distributionally invariant under the posterior predictive distribution
(Johnson, 2004). Specifically, we use posterior samples of chi-square
deviates derived from latent variables and obtained from an MCMC framework
to assess conditional independence.
Our method provides a Bayesian alternative to traditional marginal
likelihood-based approaches that maintains coherence with Bayesian inference
principles. We demonstrate the method's ability to detect subtle
dependence structures in complex datasets through simulations and
real-world applications, offering a computationally scalable solution for
conditional independence testing.
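To make the idea of a chi-square deviate computed from latent variables more concrete, here is a minimal sketch for two binary tests. It assumes a single posterior draw is available, consisting of latent disease indicators and the two sensitivities, and computes a Pearson-type statistic for the 2x2 table of test results among subjects currently labeled diseased. The function name, the inputs, and the choice of a 2x2 table are illustrative assumptions, not the authors' procedure; the reference distribution and the way draws are combined follow Johnson (2004) and are not reproduced here.

```python
def pairwise_chisq(test1, test2, diseased, sens1, sens2):
    """Pearson-type statistic for the 2x2 table of two binary tests among
    subjects whose sampled latent status is diseased (1). Expected counts
    assume the tests are independent given disease status."""
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for t1, t2, d in zip(test1, test2, diseased):
        if d == 1:
            counts[(t1, t2)] += 1
    n = sum(counts.values())
    if n == 0:
        return 0.0  # no subjects labeled diseased in this draw
    stat = 0.0
    for (a, b), observed in counts.items():
        p1 = sens1 if a == 1 else 1.0 - sens1
        p2 = sens2 if b == 1 else 1.0 - sens2
        expected = n * p1 * p2
        stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical data: 100 subjects, all labeled diseased in this draw.
latent = [1] * 100
t1 = [1, 1, 0, 0] * 25
t2_independent = [1, 0, 1, 0] * 25   # results balanced across all four cells
t2_dependent = list(t1)              # second test copies the first

stat_indep = pairwise_chisq(t1, t2_independent, latent, 0.5, 0.5)
stat_dep = pairwise_chisq(t1, t2_dependent, latent, 0.5, 0.5)
```

Repeating this calculation over many MCMC draws would give a sample of deviates; large values relative to the reference distribution would point toward conditional dependence between the tests.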
Keywords
Conditional independence
Bayesian
Pivotal quantities
Latent variables
Chi-Squared Test
We often observe heterogeneity in longitudinal data, where the mean and variance for certain profiles differ meaningfully from the rest. Some profiles may also exhibit outliers at a limited number of measurements. Using a standard mixed effects model, which assumes homogeneity, can lead to overestimation of the residual variance and inefficient estimation. In this work, we identify and account for three sources of heterogeneity in longitudinal data: incompatible mean trajectories, increased residual variance, and outliers at individual measurements. Our Bayesian mixture model incorporates binary indicators of heterogeneity for each of these features, modeled through logistic regression using covariates. We perform statistical inference using Markov chain Monte Carlo and implement model selection to evaluate the inclusion of the various heterogeneous components. Simulations demonstrate that our model can accurately identify heterogeneity and produce efficient estimates of the fixed-effects parameters. We further validate our approach using the CD4 data and DHEAS hormone data from the SWAN study.
Keywords
longitudinal data
outliers
Bayesian
Data privacy has increasingly become a daunting challenge because it limits data availability, which is essential for estimating statistical models such as generalized linear models. Access to personal data often involves considerable time, effort, and paperwork, which can impede research progress and collaboration. Federated learning has emerged as a means to estimate models without accessing individual observations from multiple data providers, such as hospitals or health organizations. However, this strategy requires communicating parameter estimate updates to a central server until convergence to produce a global model. In this research, we propose an approach to estimate mixed linear, logistic, and Poisson models based on summary statistics requested only once from each data provider. Our strategy involves generating pseudo-data whose summary statistics match those of the actual but unavailable data and using them in the model estimation process. The estimates we achieve are identical to, or at least as good as, those derived from the actual data in terms of bias and coverage. Generalizability and communication efficiency distinguish our approach from existing methods.
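The pseudo-data idea can be illustrated in its simplest setting, ordinary least squares with one predictor, which is a deliberate simplification of the mixed models in the abstract. In this sketch the data provider shares only the cross-product (Gram) matrix of the columns (1, x, y); the analyst takes a Cholesky factor of that matrix and treats its transpose as three pseudo-observations with the same cross-products. All function names and data values are hypothetical, and the Cholesky construction is one possible way to match the summary statistics, not necessarily the authors' method.

```python
from math import sqrt

def gram(rows):
    """Cross-product matrix Z'Z for a list of equal-length rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]

def cholesky(g):
    """Lower-triangular factor L with L L' = g (g must be positive definite)."""
    k = len(g)
    low = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = g[i][j] - sum(low[i][m] * low[j][m] for m in range(j))
            low[i][j] = sqrt(s) if i == j else s / low[j][j]
    return low

def ols_from_rows(rows):
    """Least squares of column 2 on columns 0 and 1 via the normal equations.
    No extra intercept is added: column 0 plays that role."""
    g = gram(rows)
    det = g[0][0] * g[1][1] - g[0][1] ** 2
    b0 = (g[1][1] * g[0][2] - g[0][1] * g[1][2]) / det
    b1 = (g[0][0] * g[1][2] - g[0][1] * g[0][2]) / det
    return b0, b1

# Hypothetical private data held by one provider: rows of (1, x, y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
actual_rows = [(1.0, xi, yi) for xi, yi in zip(x, y)]

# The provider shares only the 3x3 summary matrix, requested once.
summary = gram(actual_rows)

# Pseudo-data: rows of L' reproduce the shared cross-products exactly.
low = cholesky(summary)
pseudo_rows = [[low[j][i] for j in range(3)] for i in range(3)]

beta_actual = ols_from_rows(actual_rows)
beta_pseudo = ols_from_rows(pseudo_rows)
```

Because least squares depends on the data only through these cross-products, the two sets of estimates agree up to floating-point error, even though the pseudo-data consist of three rows that look nothing like the original five observations.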
Keywords
federated learning
mixed effects models
data privacy
pseudo-data
aggregate data
statistical sufficiency