Monday, Aug 4: 10:30 AM - 12:20 PM
0798
Topic-Contributed Paper Session
Music City Center
Room: CC-209A
Applied
Yes
Main Sponsor
Biometrics Section
Co Sponsors
Committee on Applied Statisticians
Section on Statistics in Epidemiology
Presentations
Longitudinal studies are prone to missing values that may threaten the validity of statistical inferences. This project proposes a novel trajectory model for multivariate categorical longitudinal outcomes with non-ignorable missing values. The proposed model identifies trajectory groups based on both the categorical outcome variables and their missing patterns, which allows it to handle missing outcome values that are possibly non-ignorable. To achieve this, it introduces two types of categorical latent variables: latent class variables, which summarize response patterns and missing patterns, and latent trajectories, which summarize the longitudinal patterns of the latent classes. In addition, the proposed model investigates associations between latent trajectories and patient-level time-independent covariates to provide a better understanding of resilience trajectories. We employ the Expectation-Maximization algorithm to obtain the maximum likelihood estimates. We demonstrate the novelty of the proposed model via simulation studies and by analyzing the YUCAN data set.
Keywords
Latent class analysis
Missing not at random
EM algorithm
Longitudinal data
Cancer resilience
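The latent class idea in the abstract above can be illustrated with a minimal sketch of the EM algorithm for a single-time-point latent class model with binary items. This is not the authors' model: it omits the missingness indicators, the second layer of latent trajectories, and the covariates, and the function name, simulated data, and parameter values are all assumptions chosen for illustration.

```python
import numpy as np

def latent_class_em(x, n_classes, n_iter, rng):
    """EM for a basic latent class model with binary items (illustrative only).

    x is an (n_subjects, n_items) array of 0/1 responses.
    Returns class weights, item-response probabilities, and the
    observed-data log-likelihood recorded at each iteration.
    """
    n_subjects, n_items = x.shape
    weights = np.full(n_classes, 1.0 / n_classes)
    probs = rng.uniform(0.25, 0.75, size=(n_classes, n_items))
    trace = []
    for _ in range(n_iter):
        # E-step: log joint probability of each subject's responses and each class
        log_joint = (np.log(weights)
                     + x @ np.log(probs).T
                     + (1.0 - x) @ np.log(1.0 - probs).T)
        log_marginal = np.logaddexp.reduce(log_joint, axis=1)
        trace.append(log_marginal.sum())
        posterior = np.exp(log_joint - log_marginal[:, None])
        # M-step: update class weights and item probabilities
        weights = posterior.mean(axis=0)
        probs = (posterior.T @ x) / posterior.sum(axis=0)[:, None]
        probs = np.clip(probs, 1e-6, 1.0 - 1e-6)  # guard against log(0)
    return weights, probs, np.array(trace)

# Hypothetical data: two classes with opposite response tendencies on four items
rng = np.random.default_rng(1)
true_probs = np.array([[0.9, 0.8, 0.9, 0.8],
                       [0.1, 0.2, 0.1, 0.2]])
labels = rng.integers(0, 2, size=500)
x = (rng.uniform(size=(500, 4)) < true_probs[labels]).astype(float)

weights, probs, trace = latent_class_em(x, n_classes=2, n_iter=50, rng=rng)
```

One way to bring missingness into a sketch like this would be to treat "missing" as an additional response category for each item, so that class membership also reflects missing patterns, in the spirit of what the abstract describes.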
Degradation models are commonly used in engineering to analyze the deterioration of systems over time. These models offer an alternative to standard longitudinal methods because they explicitly account for within-subject temporal variability through a latent stochastic process, which captures random fluctuations within a patient. This work investigates Wiener process-based degradation models with linear drift (i.e., slope) that include a diffusion term to represent within-subject temporal variability, a random-effects term to capture between-subject variability of the slope, and a time-invariant term to account for measurement error. Consistent first-difference estimators that stabilize covariance matrix inversion and remove the influence of time-invariant confounders are presented and validated in clinically relevant settings, along with profile likelihood methods that reduce the dimensionality of the parameter search. As a proof of concept, we applied these models to amyotrophic lateral sclerosis (ALS) data from the Pooled Resource Open-Access ALS Clinical Trials Database (PRO-ACT). We observed steeper slopes of the revised ALS Functional Rating Scale (ALSFRS-R) in individuals who died compared to those who survived, indicating that degradation model estimates are consistent with expected patterns of ALS decline. Our results demonstrate that these stochastic models provide accurate and efficient estimates of longitudinal deterioration. Future work aims to incorporate Wiener process degradation models into a joint modeling framework.
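The structure of such a degradation model can be sketched in a small simulation. The sketch assumes equally spaced visits and the form Y_i(t) = b_i t + sigma_w W_i(t) + e_i(t), with a normally distributed random slope b_i and independent measurement error. The drift estimate below is a simple moment estimate from first differences, not the estimators presented in the talk, and all names and parameter values are hypothetical.

```python
import numpy as np

def simulate_degradation(n_subjects, n_times, dt, mu, sigma_b, sigma_w, sigma_e, rng):
    """Simulate paths Y_i(t) = b_i * t + sigma_w * W_i(t) + e_i(t) on an equally spaced grid."""
    t = dt * np.arange(n_times)
    # Random slope per subject: between-subject variability of the drift
    slopes = rng.normal(mu, sigma_b, size=(n_subjects, 1))
    # Wiener process built from independent Gaussian increments, starting at 0
    increments = rng.normal(0.0, sigma_w * np.sqrt(dt), size=(n_subjects, n_times - 1))
    wiener = np.concatenate(
        [np.zeros((n_subjects, 1)), np.cumsum(increments, axis=1)], axis=1
    )
    # Independent measurement error at each visit
    noise = rng.normal(0.0, sigma_e, size=(n_subjects, n_times))
    return t, slopes * t + wiener + noise

def first_difference_drift(y, dt):
    """Moment estimate of the mean slope from first differences.

    Differencing cancels any additive subject-level constant, which is
    the sense in which time-invariant terms drop out.
    """
    return np.diff(y, axis=1).mean() / dt

rng = np.random.default_rng(0)
t, y = simulate_degradation(
    n_subjects=2000, n_times=10, dt=1.0,
    mu=-1.0, sigma_b=0.3, sigma_w=0.5, sigma_e=0.2, rng=rng,
)
mu_hat = first_difference_drift(y, dt=1.0)
print(f"estimated mean slope: {mu_hat:.3f}")
```

Adding a subject-specific constant to every visit leaves the first differences, and therefore the estimate, unchanged.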
Alzheimer's disease (AD) is a growing health concern, projected to affect 13.8 million Americans by 2050. Exposures such as amyloid-β (Aβ) in cerebrospinal fluid and genetic factors like the APOE ε4 allele are crucial not only for understanding AD progression but also for understanding why current treatments do not alter it. The Alzheimer's Disease Neuroimaging Initiative (ADNI) tracks longitudinal data on multiple cognitive and neuroimaging outcomes. However, current studies use standard methods that are limited in estimating the global effect of exposures on these outcomes while accounting for their complex interrelations. We propose a multivariate generalized linear mixed model (mGLMM) framework to jointly model multiple outcomes, providing global and individual effects of exposures. This approach accommodates different random-effect specifications and error terms across outcomes. We assessed the performance of mGLMM by simulating 1000 balanced datasets under a scenario with 1000 patients, 5 time points, and 5 multivariate Gaussian outcomes. We applied the proposed method to examine the relationship between the exposures (APOE ε4 allele status and baseline Aβ+ status) and 8 outcomes from the motivating ADNI study, while adjusting for age, sex, and education. We show that mGLMM improves the efficiency of covariate effect estimates while reducing Type-I error risk.
In recent decades, multivariate Generalized Linear Mixed Models (mGLMMs) have become essential for analyzing complex data structures, particularly in longitudinal studies. These models provide a flexible framework for handling correlated responses across subjects and time points, integrating both fixed and random effects. However, the assumption of a multivariate normal distribution is often violated. We proposed an mGLMM for data from a multivariate skew-normal distribution (mGLMM-SKN), which incorporates skewness. Parameter estimation was performed using the Expectation-Maximization (EM) algorithm, iteratively updating fixed effects, random effects, covariance matrices, and skewness parameters. The model's goodness-of-fit was assessed through residual analysis. We demonstrated the new approach using both simulated and real datasets in R. The simulation scenarios included N = 1000, j = 5 time points, and k = 8 outcomes. We applied the proposed method to examine the relationship between the exposures (APOE ε4 allele status and baseline Aβ+ status) and 8 outcomes from the motivating ADNI study, while adjusting for age, sex, and education. By incorporating skewness effects, we enable a more flexible analysis that accommodates the asymmetry and heavy tails often present in real-world datasets.
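To illustrate the kind of asymmetry a skew-normal specification allows, the following sketch draws from a multivariate skew-normal distribution using a standard stochastic representation: each component is delta_j |U0| + sqrt(1 - delta_j^2) U_j, where U0 is a shared standard normal variable and the U_j are independent standard normal variables. This is not the mGLMM-SKN estimation procedure from the abstract, and the abstract's own software is R. The function name, the identity correlation among the U_j, and the delta values are assumptions made for illustration.

```python
import numpy as np

def sample_multivariate_skew_normal(n, delta, rng):
    """Draw n vectors whose j-th component is delta_j*|U0| + sqrt(1 - delta_j^2)*U_j.

    U0 is a standard normal shared across components; the U_j are
    independent standard normals (an illustrative simplification).
    Each delta_j must lie strictly between -1 and 1.
    """
    delta = np.asarray(delta, dtype=float)
    u0 = np.abs(rng.standard_normal((n, 1)))      # shared half-normal term
    u = rng.standard_normal((n, delta.size))      # component-specific normal terms
    return delta * u0 + np.sqrt(1.0 - delta**2) * u

rng = np.random.default_rng(42)
delta = np.array([0.0, 0.6, 0.95])                # 0 gives an ordinary normal component
z = sample_multivariate_skew_normal(200_000, delta, rng)

sample_mean = z.mean(axis=0)
theory_mean = delta * np.sqrt(2.0 / np.pi)        # known mean of each marginal
centered = z - sample_mean
sample_skew = (centered**3).mean(axis=0) / z.std(axis=0) ** 3
print("sample means:", sample_mean)
print("sample skewness:", sample_skew)
```

Components with larger delta show a larger positive mean shift and more pronounced sample skewness, while the component with delta equal to zero behaves like an ordinary normal variable.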
Speaker
AKASH ROY, MEDICAL UNIVERSITY OF SOUTH CAROLINA