Wednesday, Aug 6: 8:30 AM - 10:20 AM
4147
Contributed Papers
Music City Center
Room: CC-102B
Presenters in this session will demonstrate a wide variety of tools they have developed or used to handle longitudinal and/or correlated data. These data originate from sources such as longitudinal studies and smart devices.
Main Sponsor
Biometrics Section
Presentations
Change-point models are important tools in cognitive-aging research. Specifically, event-based change-point models enable differentiation of moderator effects on cognitive decline relative to a pre-specified event. Here, we explore and compare methods for implementing event-based change-point models on novel harmonized cognitive measures from the National Alzheimer's Coordinating Center (NACC). The cognitive measures span three domains: memory, executive function, and language. We implement three methods using the R nlive package: a sigmoidal mixed model, a piecewise linear mixed model with abrupt change, and a piecewise linear mixed model with a smooth polynomial transition. Each method is implemented for two cognitive events: mild cognitive impairment (MCI) diagnosis and Alzheimer's dementia (AD) diagnosis. We characterize each method's model fit and applied utility, especially when multiple moderators are included, to guide future modeling frameworks for the harmonized cognitive scores.
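For concreteness, the piecewise linear mixed model with abrupt change can be sketched in a generic form (notation ours; the parameterization used by nlive may differ):

\[
Y_{ij} = \beta_0 + b_{0i} + (\beta_1 + b_{1i})\,\min(t_{ij} - \tau, 0) + (\beta_2 + b_{2i})\,\max(t_{ij} - \tau, 0) + \varepsilon_{ij},
\]

where \(Y_{ij}\) is the harmonized cognitive score for participant \(i\) at visit \(j\), \(t_{ij}\) is time relative to the event (MCI or AD diagnosis), \(\tau\) is the change point, \(\beta_1\) and \(\beta_2\) are the pre- and post-change slopes that moderators may shift, and \(b_{0i}, b_{1i}, b_{2i}\) are participant-level random effects. The sigmoidal and smooth-transition variants replace the broken-line term with a smooth curve around \(\tau\).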
Keywords
time-to-event
harmonized cognitive scores
change-point analysis
Alzheimer's dementia
mild cognitive impairment
Many biomedical studies collect longitudinal clinical and lifestyle data of mixed types (continuous and discrete) to examine their associations with key health outcomes. However, inconsistencies in measurement timing and missing follow-ups pose challenges in linking these predictors to a binary outcome at a specific time point, such as cancer diagnosis. While Lim et al. (2022) proposed a joint model to impute standardized longitudinal values for mixed-type covariates, their approach did not incorporate variable selection, limiting its ability to identify the most relevant predictors.
Building on this work, we introduce two structured Bayesian variable selection strategies within the joint modeling framework. The first is a one-level strategy that identifies important covariates for the binary outcome, including higher-order interactions. We then extend this to a two-level strategy, which simultaneously selects covariates for both the outcome and the longitudinal trajectories. The two-level approach allows for the inclusion of a large number of predictors, including higher-order interactions, without overfitting or excessive computational burden by leveraging a shrinkage-based selection method. Furthermore, it accounts for model uncertainty and facilitates model averaging, improving imputation accuracy and predictive performance.
We apply our method to the LILAC study, using longitudinal Women's Health Initiative (WHI) data to identify factors associated with post-treatment insomnia among female cancer survivors. Our results demonstrate the benefits of integrating variable selection into joint modeling, offering a robust and interpretable framework for high-dimensional, time-dependent biomedical data analysis.
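As an illustrative sketch only (the abstract does not give the exact prior specification), the Bayesian lasso named in the keywords can be written hierarchically for the outcome-level coefficients as

\[
\beta_j \mid \tau_j^2 \sim \mathcal{N}(0, \tau_j^2), \qquad \tau_j^2 \sim \text{Exponential}(\lambda^2 / 2), \qquad j = 1, \dots, p,
\]

which marginally places a Laplace (double-exponential) prior on each \(\beta_j\), shrinking weak main effects and higher-order interactions toward zero. In the two-level strategy, analogous shrinkage priors would apply to the coefficients of both the binary-outcome model and the longitudinal trajectory models.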
Keywords
Bayesian joint models
variable selection
interaction
Bayesian lasso
imputation
Bayesian inference
Addressing health disparities across demographic groups remains a critical challenge in public health, with significant gaps in understanding how these disparities evolve over time. This paper extends the traditional Peters-Belson decomposition to a longitudinal setting, highlighting the impact of specific explanatory variables, termed modifiers, that account for complex interactions among the explanatory variables. The proposed method partitions disparities into three components: (1) the explained disparity associated with differences in the conditional distribution of explanatory variables, assuming identical modifier distributions for the majority and minority groups; (2) the explained disparity arising from unequal distributions of the modifiers and their interaction with the remaining covariates; and (3) the unexplained disparity. Instead of aggregating the first two components into a single overall explained disparity, the proposed method allows for a detailed analysis of temporal dynamics, both associated and unassociated with the modifiers. We demonstrate the utility of the method through a fetal growth study, examining disparities in fetal development among racial/ethnic groups.
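One plausible formalization of this three-way split (ours, not necessarily the paper's exact counterfactual construction) applies the Peters-Belson idea of fitting the outcome model in the majority group and predicting for the minority group:

\[
\Delta(t) = \underbrace{\bar{Y}_{\text{maj}}(t) - \hat{\mu}_{\text{c}}(t)}_{\text{explained by covariates}} \;+\; \underbrace{\hat{\mu}_{\text{c}}(t) - \hat{\mu}_{\text{min}}(t)}_{\text{explained by modifiers}} \;+\; \underbrace{\hat{\mu}_{\text{min}}(t) - \bar{Y}_{\text{min}}(t)}_{\text{unexplained}},
\]

where \(\hat{\mu}_{\text{min}}(t)\) averages the majority-group model's predictions over the minority group's covariates and modifiers, and \(\hat{\mu}_{\text{c}}(t)\) is the counterfactual analogue combining the minority covariate distribution with the majority modifier distribution; the three terms telescope back to the raw disparity \(\Delta(t)\).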
Keywords
Disparity Decomposition Analysis
The Fetal Growth - Singletons Study
Longitudinal Analysis
Peters-Belson Approach
Analyzing continuous glucose monitor (CGM) data is challenging. Threshold-based approaches (e.g., time in range) oversimplify CGM patterns, rely on predefined cutoffs, and fail to address measurement error biases or capture blood glucose (BG) variability. Error-correction methods suited to continuous data are inadequate for binary functional predictors prone to misclassification, such as CGM-derived nocturnal glucose dips measured every 15 minutes over ten days. Scalar outcomes like birth weight are also influenced by error-prone factors such as dietary intake (DI) and physical activity (PA). We propose generalized functional linear regression models that account for misclassification in nocturnal glucose dips and measurement error in scalar (e.g., DI) and functional (e.g., PA) predictors while accommodating individual variability and diurnal patterns. Simulations show that ignoring misclassification and measurement error leads to biased estimates. We applied our methods to a Singapore-based cohort of 277 pregnant women to examine nocturnal glucose dips and their relationship with birth weight, accounting for DI and PA.
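As a hedged illustration (notation ours), one way to encode the misclassification of the binary functional predictor is through its conditional mean: with \(X_i(t) \in \{0,1\}\) the true nocturnal dip status and \(W_i(t)\) its observed, misclassified version,

\[
E\big[W_i(t) \mid X_i(t)\big] = (1 - \pi_0) + (\pi_1 + \pi_0 - 1)\, X_i(t),
\]

where \(\pi_1\) and \(\pi_0\) denote sensitivity and specificity. The outcome model is then a generalized functional linear regression of the form

\[
g\big(E[Y_i]\big) = \alpha + \int_{\mathcal{T}} \beta(t)\, X_i(t)\, dt + \gamma\, D_i + \int_{\mathcal{T}} \theta(t)\, P_i(t)\, dt,
\]

with \(Y_i\) the scalar outcome (e.g., birth weight) and \(D_i\) and \(P_i(t)\) the error-prone scalar and functional covariates; estimation replaces the true \(X_i(t)\), \(D_i\), and \(P_i(t)\) with corrected versions of their observed surrogates.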
Keywords
Continuous glucose monitor
functional logistic regression
measurement error
nocturnal glucose dip
Understanding changes over time in population blood pressure (BP) from a nationally representative survey (e.g., NHANES) requires accurate modeling of both the center and the tail of the BP distribution; a shift in the entire distribution may indicate socioeconomic or cultural trends affecting the whole population, while changes in the high-BP tail may reflect changing access to clinical care. The complex survey design used in NHANES introduces challenges for modeling the bulk and tail of the systolic BP distribution. We propose the peaks-over-threshold approach, widely used in climate science and recently adopted in health contexts, in which the distribution tail is modeled by a Generalized Pareto Distribution (GPD). We employ pseudo maximum likelihood estimation (PMLE) to accommodate the survey weights. Analytically, we determine conditions under which neglecting survey weights may or may not lead to bias in GPD parameter estimates; in particular, estimates of the shape parameter may still be unbiased if tail observations share similar weights. We demonstrate the PMLE approach through simulations and an application to BP data in NHANES.
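A minimal base-R sketch of the pseudo maximum likelihood step, assuming exceedances `y` above a chosen threshold and survey weights `w` (illustrative names, not the authors' code):

```r
# Survey-weighted (pseudo) maximum likelihood for the Generalized Pareto
# Distribution, using only base R. par = (log sigma, xi); the log link
# keeps the scale parameter positive.
gpd_pmle <- function(y, w, init = c(0, 0.1)) {
  nll <- function(par) {
    sigma <- exp(par[1]); xi <- par[2]
    if (abs(xi) < 1e-6)                       # exponential limit as xi -> 0
      return(sum(w * (log(sigma) + y / sigma)))
    z <- 1 + xi * y / sigma
    if (any(z <= 0)) return(Inf)              # outside the GPD support
    sum(w * (log(sigma) + (1 / xi + 1) * log(z)))
  }
  fit <- optim(init, nll)                     # Nelder-Mead by default
  c(sigma = exp(fit$par[1]), xi = fit$par[2])
}

# Example with simulated exceedances and unit weights:
# set.seed(1); y <- rexp(200); gpd_pmle(y, w = rep(1, 200))
```

Setting all weights equal recovers ordinary maximum likelihood, which is the regime in which the abstract notes the shape estimate may remain unbiased.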
Keywords
extreme value analysis
peaks over threshold
maximum likelihood estimation
survey weights
blood pressure
The University of California, Irvine Consent-to-Contact (C2C) registry initiated a recruitment strategy study (C2C-RSS) with an interrupted time series (ITS) design to assess the effectiveness of interventions in recruiting individuals from disadvantaged neighborhoods in Orange County, California. To define disadvantage, we utilized the Area Deprivation Index (ADI) (Kind and Buckingham, 2018). The C2C-RSS aims to estimate a marginal intervention effect on recruitment across ADI deciles and to assess effect modification by ADI strata. We employed a non-random sampling design to ensure uniform inclusion across ADI deciles. To adjust for the resulting sampling bias, we extend the Robust-Multiple ITS model (Cruz et al., 2019) to incorporate inverse-probability weights, based on the known sampling probabilities, in estimating a marginal mean function. We additionally propose two variance estimators: the first quantifies uncertainty in the unknown change point associated with the intervention, and the second additionally accounts for misspecification of the mean model. We demonstrate the performance of our methods through empirical simulation studies. We further use our proposed methods to assess power to achieve the aims of the C2C-RSS.
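A minimal sketch of one ingredient, an inverse-probability-weighted segmented fit with the change point profiled over a grid, assuming a single series with outcome `recruit`, time index `time`, and weights `w` equal to the inverse sampling probabilities (illustrative names; the Robust-Multiple ITS extension and the proposed variance estimators are beyond this sketch):

```r
# Profile a weighted segmented regression over candidate change points,
# keeping the one that minimizes the weighted residual sum of squares.
fit_its <- function(time, recruit, w, grid = time[-c(1, length(time))]) {
  wsse <- sapply(grid, function(cp) {
    post <- pmax(time - cp, 0)                    # slope change after cp
    fit  <- lm(recruit ~ time + I(time >= cp) + post, weights = w)
    sum(w * resid(fit)^2)                         # weighted SSE profile
  })
  cp_hat <- grid[which.min(wsse)]
  post   <- pmax(time - cp_hat, 0)
  list(change_point = cp_hat,
       fit = lm(recruit ~ time + I(time >= cp_hat) + post, weights = w))
}
```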
Keywords
interrupted time series
intervention assessment
multiple units
change point variability
sampling weights
Co-Author(s)
Joshua Grill, University of California, Irvine
Daniel Gillen, University of California, Irvine
Maricela Cruz, Kaiser Permanente Washington Health Research Institute
First Author
Thuy Lu, University of California, Irvine
Presenting Author
Thuy Lu, University of California, Irvine
Analysis of physical activity data from accelerometry is of great interest in the context of healthcare and quality of life. Moderate to Vigorous Physical Activity (MVPA) occurs in 'bouts' of continuous MVPA minutes. For these bouts, joint analysis of the intensity, which is a continuous variable, and the timing, which is a circular variable, presents a unique challenge: the two measures constitute multi-modal data that require sophisticated methods for analysis. Existing approaches transform the available polar coordinates to continuous rectangular coordinates that can be modelled using conventional methods, but the interpretation of results remains a challenge. We build upon existing methods by approximating the timing of a bout as a circular average of its constituent minutes, weighted by their intensity so as to better reflect the concentration of physical activity, and propose models for repeated measures analysis via Generalized Least Squares (GLS), Linear Mixed Effects (LME), and Bayesian Linear Mixed Effects (BLME). We compare the relative strengths and weaknesses of these models via simulation studies, and report results of their application to data from the NHANES cohort.
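A minimal base-R sketch of the proposed timing summary, an intensity-weighted circular mean of a bout's constituent minutes (variable names are illustrative):

```r
# Summarize the timing of an MVPA bout as the circular mean of its
# constituent minutes, weighted by minute-level intensity. `minutes` is
# clock time in minutes after midnight; `intensity` is the activity
# measure for each minute.
bout_timing <- function(minutes, intensity) {
  theta <- 2 * pi * minutes / 1440          # map minutes to angles
  s <- sum(intensity * sin(theta))
  c <- sum(intensity * cos(theta))
  mean_angle <- atan2(s, c) %% (2 * pi)     # weighted circular mean
  mean_angle * 1440 / (2 * pi)              # back to minutes after midnight
}

# Example: a 22-minute bout spanning midnight
# bout_timing(minutes = c(1430:1439, 0:11), intensity = runif(22, 3, 6))
```

Wrapping clock time onto the circle ensures that a bout spanning midnight is summarized near midnight rather than near noon, which is the failure mode of a naive arithmetic mean.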
Keywords
Moderate to vigorous physical activity
Joint modelling
Circular statistics
Bayesian
Longitudinal data
Repeated measures