Longitudinal and Correlated Data Latest Research Topics

Krishna Saha Chair
Central Connecticut State University
 
Wednesday, Aug 6: 8:30 AM - 10:20 AM
4147 
Contributed Papers 
Music City Center 
Room: CC-102B 
Presenters in this session will demonstrate a wide variety of tools they have developed or used to handle longitudinal and/or correlated data. These data originate from a variety of sources, such as longitudinal studies and smart devices.

Main Sponsor

Biometrics Section

Presentations

A Comparison of Event-Based Change-Point Models Using Novel Cognitive Domain Measures

Change-point models are important tools in cognitive-aging research. Specifically, event-based change-point models enable differentiation in moderator effects on cognitive decline in relation to a pre-specified event. Here, we explore and compare methods for implementing event-based change-point models on novel harmonized cognitive measures from the National Alzheimer's Coordinating Center (NACC). The cognitive measures include three domains: memory, executive function, and language. We implement three methods using the R nlive package: a sigmoidal mixed model, a piecewise linear mixed model with abrupt change, and a piecewise linear mixed model with smooth polynomial transition. Each method is implemented for two cognitive events: mild cognitive impairment (MCI) diagnosis and Alzheimer's dementia (AD) diagnosis. We characterize each method's model fit and applied utility, especially when multiple moderators are included, to guide future modeling frameworks of the harmonized cognitive scores. 
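To make the difference between the two piecewise linear variants concrete, here is a minimal sketch of their mean functions. It is an illustration only and is not the parameterization used by the nlive package: the function names, the argument names, and the quadratic form of the smooth transition are all assumptions. Time `t` is taken to be years relative to the diagnosis event, and `tau` is the change point on that scale.

```python
def abrupt_hinge(t, tau):
    """Hinge term for an abrupt change: zero before tau, linear after."""
    return max(t - tau, 0.0)


def smooth_hinge(t, tau, half_width):
    """Hinge term with a quadratic bridge on [tau - half_width, tau + half_width].

    The bridge matches the value and slope of both linear pieces at the
    window edges, so the trajectory bends gradually instead of kinking.
    (Assumed form, chosen for illustration.)
    """
    g = half_width
    if t <= tau - g:
        return 0.0
    if t >= tau + g:
        return t - tau
    return (t - tau + g) ** 2 / (4.0 * g)


def mean_trajectory(t, intercept, slope_before, slope_change, tau, half_width=0.0):
    """Population mean score at time t (years relative to the event).

    slope_before is the slope ahead of the change point; slope_change is
    the additional slope after it. half_width=0 gives the abrupt model.
    """
    if half_width == 0:
        hinge = abrupt_hinge(t, tau)
    else:
        hinge = smooth_hinge(t, tau, half_width)
    return intercept + slope_before * t + slope_change * hinge


if __name__ == "__main__":
    # Hypothetical values: change point 3 years before diagnosis.
    for year in (-6, -4, -3, -2, 0):
        abrupt = mean_trajectory(year, 0.0, -0.1, -0.5, tau=-3.0)
        smooth = mean_trajectory(year, 0.0, -0.1, -0.5, tau=-3.0, half_width=1.0)
        print(year, round(abrupt, 3), round(smooth, 3))
```

In a mixed model, the intercept, slopes, and change point would additionally carry subject-level random effects and moderator terms; the sketch shows only the fixed-effect shape.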

Keywords

time-to-event

harmonized cognitive scores

change-point analysis

Alzheimer's dementia

mild cognitive impairment 

Co-Author(s)

David Fardo, University of Kentucky
Shubhabrata Mukherjee, University of Washington
Christopher McLouth, University of Kentucky
Yuriko Katsumata, University of Kentucky
Jai Broome, University of Washington
Inori Tsuchiya, University of Kentucky

First Author

Megan Hall

Presenting Author

Megan Hall

Bayesian Variable Selection for Joint Models of Heterogeneous Longitudinal Variables and a Binary Outcome

Many biomedical studies collect longitudinal clinical and lifestyle data of mixed types (continuous and discrete) to examine their associations with key health outcomes. However, inconsistencies in measurement timing and missing follow-ups pose challenges in linking these predictors to a binary outcome at a specific time point, such as cancer diagnosis. While Lim et al. (2022) proposed a joint model to impute standardized longitudinal values for mixed-type covariates, their approach did not incorporate variable selection, limiting its ability to identify the most relevant predictors.

Building on this framework, we introduce two structured Bayesian variable selection strategies for joint modeling. The first approach is a one-level strategy that identifies important covariates for the binary outcome, including higher-order interactions. We then extend this to a two-level strategy, which simultaneously selects covariates for both the outcome and longitudinal trajectories. By leveraging a shrinkage-based selection method, the two-level approach allows for the inclusion of a large number of predictors, including higher-order interactions, without overfitting or excessive computational burden. Furthermore, it accounts for model uncertainty and facilitates model averaging, improving imputation accuracy and predictive performance.

We apply our method to the LILAC study, using longitudinal WHI data to identify factors associated with post-treatment insomnia among female cancer survivors. Our results demonstrate the benefits of integrating variable selection into joint modeling, offering a robust and interpretable framework for high-dimensional, time-dependent biomedical data analysis. 

Keywords

Bayesian joint models

variable selection

interaction

Bayesian lasso

imputation

Bayesian inference 

Co-Author

Michael Pennell, The Ohio State University

First Author

LINGPENG SHAN, The Ohio State University

Presenting Author

LINGPENG SHAN, The Ohio State University

Decomposition of Longitudinal Disparities: An Application to the Fetal Growth-Singletons Study

Addressing health disparities across demographic groups remains a critical challenge in public health, with significant gaps in understanding how these disparities evolve over time. This paper extends the traditional Peters-Belson decomposition to a longitudinal setting, highlighting the impact of specific explanatory variables, which we call modifiers, that account for complex interactions among the explanatory variables. The proposed method partitions disparities into three components: (1) the explained disparity associated with differences in the conditional distribution of explanatory variables, assuming identical modifier distributions for majority and minority groups; (2) the explained disparity arising from unequal distributions of the modifiers and their interaction with the rest of the covariates; and (3) the unexplained disparity. Instead of aggregating the first two components into a single overall explained disparity, the proposed method allows for a detailed analysis of the temporal dynamics, both associated and unassociated with the modifiers. We demonstrate the utility of the method through a fetal growth study, examining disparities in fetal development among racial/ethnic groups.
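For orientation, the traditional cross-sectional Peters-Belson decomposition that this work extends can be written as follows. This is a sketch of the classical two-component form only, not the three-component longitudinal method of the abstract. The notation is assumed for illustration: A is the majority group, B is the minority group, and \hat{\mu}_A is a regression of the outcome on covariates fitted in group A alone.

```latex
\[
\bar{Y}_A - \bar{Y}_B
= \underbrace{\Big(\bar{Y}_A - \frac{1}{n_B}\sum_{i \in B} \hat{\mu}_A(X_i)\Big)}_{\text{explained by covariate differences}}
+ \underbrace{\Big(\frac{1}{n_B}\sum_{i \in B} \hat{\mu}_A(X_i) - \bar{Y}_B\Big)}_{\text{unexplained}}
\]
```

The middle quantity is the average outcome that group B members would be predicted to have under the group A model, so the second term measures the gap that remains after covariate distributions are accounted for.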

Keywords

Disparity Decomposition Analysis

The Fetal Growth - Singletons Study

Longitudinal Analysis

Peters-Belson Approach 

Co-Author(s)

Sang Kyu Lee, National Cancer Institute
Mi-Ok Kim, UCSF
Katherine Grantz, Eunice Kennedy Shriver National Institute of Child Health and Human Development
Hyokyoung Hong, NIH

First Author

Seonjin Kim, Miami University

Presenting Author

Seonjin Kim, Miami University

Generalized functional linear regression models with measurement and misclassification errors

Analyzing continuous glucose monitor (CGM) data is challenging. Threshold-based approaches (e.g., time in range) oversimplify CGM patterns, rely on predefined cutoffs, and fail to address measurement error biases or capture blood glucose (BG) variability. Error correction methods suited for continuous data are inadequate for binary functional predictors prone to misclassification, such as CGM-derived nocturnal glucose dips measured every 15 minutes over ten days. Scalar outcomes like birth weight are also influenced by error-prone factors like dietary intake (DI) and physical activity (PA). We propose generalized functional linear regression models that account for misclassification in nocturnal glucose dips and measurement error in scalar (e.g., DI) and functional (PA) predictors while considering individual variability and diurnal patterns. Simulations show that ignoring misclassification and measurement errors leads to biased estimates. We applied our methods to a Singapore-based cohort of 277 pregnant women to examine nocturnal glucose dips and their relationship with birth weight, accounting for DI and PA. 

Keywords

Continuous glucose monitor

functional logistic regression

measurement error

nocturnal glucose dip 

Co-Author(s)

Roger S Zoh, Indiana University
Lan Xue, Oregon State University
See Ling Loy, Duke-NUS Medical School
Carmen Tekwe, Indiana University

First Author

Ashley Obeng, Indiana University Bloomington

Presenting Author

Ashley Obeng, Indiana University Bloomington

Incorporating survey weights in tail models for population blood pressure distribution

Understanding changes over time in population blood pressure (BP) from a nationally representative survey (e.g., NHANES) requires accurate modeling of both the center and the tail of the BP distribution; a shift in the entire distribution may indicate socioeconomic or cultural trends affecting the whole population, while changes in the high-BP tail may represent changing access to clinical care. The complex survey design used in the NHANES study introduces challenges for modeling the bulk and tail distributions of systolic BP. We propose using the peaks-over-threshold approach, widely used in climate science and recently adopted in health contexts, where the distribution tail is modeled by a Generalized Pareto Distribution (GPD). We employ pseudo maximum likelihood estimation (PMLE) to accommodate the survey weights. Analytically, we determine conditions under which neglecting survey weights may or may not lead to bias in GPD parameter estimates. In particular, estimates of the shape parameter may still be unbiased if tail observations share similar weights. We demonstrate the PMLE approach through simulations and application to BP data in NHANES.
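The idea of a survey-weighted pseudo-likelihood for the GPD tail can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: each excess over the threshold contributes its GPD log density multiplied by its survey weight, and a coarse grid search stands in for a proper numerical optimizer. All function names, the grid, and the simulated data are hypothetical; weights are assumed strictly positive.

```python
import math
import random


def gpd_logpdf(y, sigma, xi):
    """Log density of a GPD excess y >= 0 with scale sigma > 0 and shape xi."""
    if sigma <= 0 or y < 0:
        return -math.inf
    if abs(xi) < 1e-9:  # exponential limit as xi -> 0
        return -math.log(sigma) - y / sigma
    z = 1.0 + xi * y / sigma
    if z <= 0:  # outside the support
        return -math.inf
    return -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(z)


def weighted_neg_loglik(excesses, weights, sigma, xi):
    """Negative pseudo log-likelihood: survey weights multiply each term."""
    total = 0.0
    for y, w in zip(excesses, weights):
        lp = gpd_logpdf(y, sigma, xi)
        if lp == -math.inf:
            return math.inf
        total -= w * lp
    return total


def fit_gpd_grid(excesses, weights, sigmas, xis):
    """Return the (sigma, xi) grid point minimizing the weighted objective."""
    best = None
    for sigma in sigmas:
        for xi in xis:
            nll = weighted_neg_loglik(excesses, weights, sigma, xi)
            if best is None or nll < best[0]:
                best = (nll, sigma, xi)
    return best[1], best[2]


def simulate_gpd(n, sigma, xi, seed=0):
    """Draw n GPD excesses by inverse-CDF sampling (requires xi != 0)."""
    rng = random.Random(seed)
    return [sigma / xi * ((1.0 - rng.random()) ** (-xi) - 1.0) for _ in range(n)]


if __name__ == "__main__":
    ys = simulate_gpd(3000, sigma=10.0, xi=0.2, seed=1)
    ws = [1.0] * len(ys)  # equal weights reduce to ordinary maximum likelihood
    sigmas = [5.0 + 0.5 * k for k in range(21)]
    xis = [-0.2 + 0.05 * k for k in range(17)]
    print(fit_gpd_grid(ys, ws, sigmas, xis))
```

With real survey data, `excesses` would be systolic BP values minus the chosen threshold and `weights` the corresponding sampling weights. The sketch yields point estimates only; standard errors under a complex design would need a design-based variance estimator.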

Keywords

extreme value analysis

peaks over threshold

maximum likelihood estimation

survey weights

blood pressure 

Co-Author(s)

Rebecca Betensky, NYU College of Global Public Health
Yajun Mei, New York University

First Author

Zoe Haskell-Craig

Presenting Author

Zoe Haskell-Craig

Interrupted time series methods for non-random sampling study designs with known sampling weights

The University of California Irvine Consent-to-Contact (C2C) registry initiated an interrupted time series (ITS) design recruitment strategy study (C2C-RSS) to assess the effectiveness of interventions in recruiting individuals from disadvantaged neighborhoods in Orange County, California. To define disadvantage, we utilized the Area Deprivation Index (ADI) (Kind and Buckingham, 2018). The C2C-RSS aims to estimate a marginal intervention effect across ADI deciles on recruitment and assess effect modification by ADI strata. We employed a non-random sampling design to ensure uniform inclusion across ADI deciles. To adjust for sampling bias, we extend the Robust-Multiple ITS model (Cruz et al., 2019) to incorporate known inverse probability sampling weights in estimating a marginal mean function. We additionally propose two variance estimators: the first quantifies uncertainty in the unknown change point associated with the intervention, and the second additionally accounts for misspecification of the mean model. We demonstrate the performance of our methods through empirical simulation studies. We further use our proposed methods to assess power to achieve the aims of the C2C-RSS.

Keywords

interrupted time series

intervention assessment

multiple units


change point variability

sampling weights 

Co-Author(s)

Joshua Grill, University of California, Irvine
Daniel Gillen, University of California-Irvine
Maricela Cruz, Kaiser Permanente Washington Health Research Institute

First Author

Thuy Lu, University of California, Irvine

Presenting Author

Thuy Lu, University of California, Irvine

Investigation of methods for joint modelling of timing and intensity of accelerometry data

Analysis of physical activity data from accelerometry is of great interest in the context of healthcare and quality of life. Moderate to Vigorous Physical Activity (MVPA) occurs in 'bouts' of continuous MVPA minutes. For these bouts, joint analysis of the intensity, which is a continuous variable, and the timing, which is a circular variable, presents a unique challenge, as the two measures constitute multi-modal data that require sophisticated methods for analysis. Existing approaches transform the available polar coordinates to continuous rectangular coordinates that can be modelled using conventional methods, but the interpretation of results remains a challenge. We build upon existing methods by approximating the timing of a bout as a circular average of its constituent minutes, weighted by their intensity so as to better reflect the concentration of physical activity, and propose models for repeated measures analysis via Generalized Least Squares (GLS), Linear Mixed Effects (LME), and Bayesian Linear Mixed Effects (BLME). We compare the relative strengths and weaknesses of these models via simulation studies, and report results of their application to data from the NHANES cohort.
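The intensity-weighted circular average of a bout's minutes can be sketched as follows. This is one plausible reading of the summary described in the abstract, not the authors' code: each minute of the day is mapped to an angle on a 24-hour clock, the unit vectors are averaged with intensity weights, and the resulting angle is converted back to clock hours. The function name and inputs are assumptions.

```python
import math

MINUTES_PER_DAY = 1440


def weighted_circular_mean_hour(minutes_of_day, intensities):
    """Intensity-weighted circular mean of clock times, in hours on a 24 h clock.

    minutes_of_day: minutes since midnight for each minute in the bout.
    intensities: non-negative weights, e.g. activity counts for each minute.
    """
    sin_sum = 0.0
    cos_sum = 0.0
    for minute, weight in zip(minutes_of_day, intensities):
        theta = 2.0 * math.pi * minute / MINUTES_PER_DAY
        sin_sum += weight * math.sin(theta)
        cos_sum += weight * math.cos(theta)
    # atan2 returns an angle in (-pi, pi]; wrap it onto the 24-hour clock.
    mean_angle = math.atan2(sin_sum, cos_sum) % (2.0 * math.pi)
    return 24.0 * mean_angle / (2.0 * math.pi)


if __name__ == "__main__":
    # Hypothetical bout: 10:00 at low intensity, 11:00 at three times that.
    print(weighted_circular_mean_hour([600, 660], [1.0, 3.0]))
```

The circular form matters for bouts that straddle midnight: two equally intense minutes at 23:55 and 00:05 average to approximately midnight, whereas an arithmetic mean of the minute indices would place the bout at noon.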

Keywords

Moderate to vigorous physical activity

Joint modelling

Circular statistics

Bayesian

Longitudinal data

Repeated measures 

Co-Author

Roger Zoh, Indiana University

First Author

Omkar Khandpekar, Indiana University Bloomington

Presenting Author

Omkar Khandpekar, Indiana University Bloomington