Section on Medical Devices and Diagnostics: Statistical Modeling, Data Analysis, and Forecasting Techniques

Guangxing Wang, Chair
 
Wednesday, Aug 6: 10:30 AM - 12:20 PM
4186 
Contributed Papers 
Music City Center 
Room: CC-202C 

Main Sponsor

Section on Medical Devices and Diagnostics

Presentations

A Proposed Method for Conducting a Comprehensive Assessment of Joint Modeling Prediction Accuracy

Joint modeling of longitudinal and time-to-event data (JM) is a valuable tool in predicting outcomes. Furthermore, predictions are enhanced using super learner joint models (SLJM). Currently, little advice exists regarding comparing prediction accuracy (PA) between models, as PA depends on updated biomarker information and the length of the forecast prediction timeframe. Thus, instead of one measure, multiple PA measures need to be considered. We propose an approach to compare PA that accounts for these measures. Our approach is illustrated by analyzing a cohort of 251 patients with advanced non-small cell lung cancer from the RADIOHEAD study. Guardant Reveal, an assay that extracts epigenomic data, produced the temporal biomarkers. Specifically, a JM using an aggregate epigenomic score and an SLJM leveraging multiple epigenomic components were compared. Comparisons were made using a matrix of Brier scores for each model, generated using combinations of current biomarker information and forecasted prediction timeframe windows. Guided by optimism-controlled bootstrapped confidence intervals, results highlight our ability to pinpoint instances where SLJM PA outperforms the JM PA and vice versa.
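As a rough illustration of the comparison described above, the following Python sketch computes one Brier score per combination of landmark time (the time up to which biomarker information is used) and forecast window. It is not the authors' implementation: the function `brier_matrix`, the `predict_survival` callback, and the toy data are hypothetical, and censoring is ignored for simplicity. A real analysis would need censoring weights and the bootstrapped confidence intervals mentioned in the abstract.

```python
from typing import Callable, Dict, Sequence, Tuple


def brier_matrix(
    event_times: Sequence[float],
    predict_survival: Callable[[int, float, float], float],
    landmarks: Sequence[float],
    horizons: Sequence[float],
) -> Dict[Tuple[float, float], float]:
    """Brier score for every (landmark, horizon) pair.

    predict_survival(i, s, w) is a hypothetical callback returning the
    predicted probability that subject i survives past s + w, given
    biomarker data up to time s. Assumption: no censoring, so every
    event time is fully observed.
    """
    matrix: Dict[Tuple[float, float], float] = {}
    for s in landmarks:
        # Only subjects still event-free at the landmark are scored.
        at_risk = [i for i, t in enumerate(event_times) if t > s]
        for w in horizons:
            if not at_risk:
                matrix[(s, w)] = float("nan")
                continue
            squared_errors = [
                ((1.0 if event_times[i] > s + w else 0.0)
                 - predict_survival(i, s, w)) ** 2
                for i in at_risk
            ]
            matrix[(s, w)] = sum(squared_errors) / len(squared_errors)
    return matrix


# Toy data: an uninformative model that always predicts 0.5.
times = [2.0, 5.0, 9.0, 12.0]
coin_flip = brier_matrix(times, lambda i, s, w: 0.5, [1.0, 4.0], [3.0, 6.0])
```

Computing one such matrix per model and differencing them cell by cell shows where one model predicts better than the other; lower Brier scores indicate better accuracy.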

Keywords

Joint Modeling of Longitudinal and Time-to-Event Data

Super Learner Joint Model

Model Prediction Accuracy

Genomic and Epigenomic ctDNA

Liquid Biopsy Biomarkers

Model Validation 

Co-Author(s)

Daniel Hintz, University of Wyoming
Aaron Hardin, Guardant Health
Sara Wienke, Guardant Health
Samantha Liang, Parker Institute for Cancer Immunotherapy
Enjun Yang, Parker Institute for Cancer Immunotherapy
Amar Das, Guardant Health

First Author

Christopher Pretz, Guardant Health

Presenting Author

Christopher Pretz, Guardant Health

Counterfactual Forecasting For Panel Data

We address the challenge of forecasting counterfactual outcomes in panel data characterized by missing observations and latent factor structures with temporal dependencies. Such scenarios are common in causal inference, where estimating unobserved potential outcomes is essential. Our approach extends traditional matrix completion methods by integrating time series dynamics into the latent factors, enhancing the accuracy of counterfactual predictions. Building upon the estimator proposed by Xiong and Pelger [2023], we accommodate both stochastic and deterministic components within the factors, providing a flexible framework for various applications. In the special case of a stationary autoregressive model for the factors, we derive probabilistic error bounds for each unit and forecast horizon, and additionally provide confidence intervals for the forecast values. Empirical evaluations demonstrate that our method outperforms existing techniques, such as multivariate singular spectrum analysis (Agarwal et al. [2020]), particularly when latent factors exhibit autoregressive behavior. We apply our methodology to the HeartSteps V1 mHealth study, illustrating its effectiveness in forecasting step counts for users receiving activity prompts, thereby leveraging temporal patterns in user behavior.
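To make the autoregressive special case concrete, here is a minimal Python sketch of the forecasting step only. It assumes a single latent factor series and a unit's loading have already been estimated (the estimation from incomplete panel data is not shown), and that the factor follows a zero-mean AR(1) model. The function names `fit_ar1` and `forecast_unit` and the toy numbers are hypothetical; this is not the authors' estimator.

```python
from typing import Sequence


def fit_ar1(factor: Sequence[float]) -> float:
    """Least-squares AR(1) coefficient for a zero-mean series:
    phi = sum(F_t * F_{t-1}) / sum(F_{t-1}^2)."""
    numerator = sum(factor[t] * factor[t - 1] for t in range(1, len(factor)))
    denominator = sum(factor[t - 1] ** 2 for t in range(1, len(factor)))
    return numerator / denominator


def forecast_unit(loading: float, factor: Sequence[float], horizon: int) -> float:
    """h-step-ahead forecast for one unit: loading * phi^h * F_T,
    where F_T is the last estimated factor value."""
    phi = fit_ar1(factor)
    return loading * (phi ** horizon) * factor[-1]


# Toy example: a factor that halves each period, so phi is 0.5.
estimated_factor = [1.0, 0.5, 0.25, 0.125]
two_step_forecast = forecast_unit(2.0, estimated_factor, horizon=2)
```

With several factors, the same idea would apply to each factor (or to a vector autoregression), with the unit forecast formed as the loading-weighted sum of the factor forecasts.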

Keywords

Counterfactual forecast

Causal forecast

Factor models

Time series forecast

Missing data 

Co-Author(s)

Raaz Dwivedi, Cornell University
Sumanta Basu, Cornell University

First Author

Navonil Deb

Presenting Author

Navonil Deb

Evaluating Phase and Amplitude in Functional Data-Based Biomarkers: Cluster and Reliability Analyses

Adopting a new biomarker requires rigorous evaluation of its discriminability and reliability. Discriminability assesses how well a biomarker differentiates individuals with varying disease risks, while reliability measures its ability to reproduce measurements under the same conditions. With advances in medical technology, biomarkers increasingly take the form of functional data, where each observation consists of dense measurements over time, treated as smooth functions. Amplitude (vertical) and phase (horizontal) variations in functional data often provide key insights into disease mechanisms, yet existing evaluation tools, such as cluster and reliability analyses, often rely on the L2 metric, which fails to separate these variations. We introduce cluster and reliability analysis methods that assess functional data-based biomarkers based on amplitude and phase features. Specifically, for cluster analysis, we introduce a K-means-type algorithm that groups functional data using amplitude and phase distance metrics. For reliability analysis, we propose agreement indices that measure how well the amplitude and phase features of functional data are reproduced on the same unit. 

Keywords

agreement

amplitude variation

biomarker evaluation

functional data clustering

phase variation

reliability 

First Author

Jeong Hoon Jang, University of Texas Medical Branch

Presenting Author

Jeong Hoon Jang, University of Texas Medical Branch

High-Dimensional Variable Selection: an Ensemble-based Method

Variable selection in high-dimensional data analysis poses substantial methodological challenges. While numerous penalized variable selection methods and machine learning approaches exist, many demonstrate instability in real-world applications. We developed a novel ensemble algorithm for variable selection in competing risks modeling and conducted a comprehensive stability analysis of established variable selection methods. Our method, the Random Approximate Elastic Net (RAEN), offers a stable and generalizable solution for large-p-small-n variable selection in competing risks data. RAEN's flexible framework enables its application across various time-to-event regression models, including competing risks quantile regression and accelerated failure time models. In a numerical study, we demonstrate that our computationally intensive algorithm substantially improves both variable selection accuracy and parameter estimation. We have implemented RAEN in a user-friendly R package. To demonstrate its practical utility, we apply RAEN to a cancer study.

Keywords

variable selection

high-dimensional

flexible object function 

Co-Author

Xiaofeng Wang, The Cleveland Clinic Foundation

First Author

Han Sun, Cleveland Clinic

Presenting Author

Han Sun, Cleveland Clinic

Multi-reader multi-case AUC analysis methodology for Artificial Intelligence Applications

As artificial intelligence (AI) applications become more frequently employed, it is important to be able to statistically compare the performance of such systems used by themselves with the performance of human readers using the AI system as an aid, as well as with the performance of unaided human readers.

In this talk I discuss how the Obuchowski-Rockette (OR) method, which treats both cases and human readers as random samples, can be easily adapted for comparing the usefulness of the following three modalities: (1) AI standalone; (2) AI-unaided human readers; and (3) AI-aided human readers. The adaptation results from using a "workaround" that involves a straightforward rearrangement of the data. A real-data example is presented to illustrate this method. A simulation study shows acceptable performance for this approach.
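For orientation, the quantity being compared across modalities is the area under the ROC curve. The Python sketch below computes the empirical (Mann-Whitney) AUC and a reader-averaged AUC for one modality. It does not implement the OR analysis or the data rearrangement mentioned above, whose details are not given here; the function names and toy ratings are hypothetical.

```python
from typing import Sequence, Tuple


def empirical_auc(diseased: Sequence[float], nondiseased: Sequence[float]) -> float:
    """Mann-Whitney estimate of AUC: the fraction of (diseased,
    non-diseased) case pairs ranked correctly, counting ties as 1/2."""
    total = 0.0
    for x in diseased:
        for y in nondiseased:
            if x > y:
                total += 1.0
            elif x == y:
                total += 0.5
    return total / (len(diseased) * len(nondiseased))


def reader_averaged_auc(
    readers: Sequence[Tuple[Sequence[float], Sequence[float]]]
) -> float:
    """Mean of per-reader AUCs for a single modality. Each entry holds
    one reader's ratings of the diseased and non-diseased cases."""
    aucs = [empirical_auc(d, n) for d, n in readers]
    return sum(aucs) / len(aucs)


# Toy ratings for two hypothetical readers under one modality.
unaided = [([3, 4], [1, 2]), ([1, 2], [1, 2])]
unaided_auc = reader_averaged_auc(unaided)
```

A standalone AI system contributes a single AUC rather than a reader average, which is part of why a standard multi-reader multi-case analysis needs adapting before it can compare all three modalities.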

Keywords

Artificial intelligence

Obuchowski-Rockette

Diagnostic studies

Area under the ROC curve (AUC)

ROC

Multi-reader multi-case study design

First Author

Stephen Hillis, University of Iowa

Presenting Author

Stephen Hillis, University of Iowa

Statistical approaches to chronic disease preventive behaviors profiling

Promoting positive lifestyle behaviors to attenuate lifetime risk of cancers and related chronic diseases is of great interest to public health researchers. However, given growing population diversity, statistical modeling of these complex data is challenging because of unobserved or undefined individual heterogeneity in multiple highly correlated measurements of various disease-preventive behaviors. Biomarkers that identify high-risk individuals may improve our understanding of heterogeneity in risk behavioral patterns, but there is a lack of validated approaches that can properly link biomarkers to multiple behaviors by determining their dynamic relations with cancer risk. Doing so requires a validation process and advanced statistical methodology that can address various challenges in analyzing biomarker data, including left-censoring due to detection limits. We propose a new statistical approach to disease-preventive behavior profiling that addresses these statistical challenges while providing greater flexibility to characterize risk-specific lifestyle behavioral patterns. We evaluate the performance of the proposed method through simulations and real data applications.

Keywords

Biomarkers

Quantile Regression

Lifestyle behaviors

Left-censoring

Longitudinal data 

Co-Author(s)

Belinda Reininger, University of Texas Health Science Center at Houston, School of Public Health
Kelley Gabriel, University of Alabama at Birmingham
Nalini Ranjit, University of Texas Health Science Center at Houston, School of Public Health
Larkin Strong, University of Texas MD Anderson Cancer Center

First Author

MinJae Lee, UTHealth-Houston

Presenting Author

MinJae Lee, UTHealth-Houston

Using Wearable Device Data for Step Measurement On Parkinson’s Disease Population

Parkinson's disease (PD) is a progressive neurodegenerative disorder with various motor symptoms. Home detection and monitoring of such symptoms is valuable, as it enables more continuous monitoring at the patient's convenience. Wearable devices equipped with inertial measurement unit (IMU) sensors are particularly useful for objective monitoring of symptom progression at home. Some studies identify gait features, which characterize a person's walking or running movement, as important predictors for detecting PD symptoms. Such gait features can be derived from IMU signals. In this work, we propose a step measurement methodology based on a convolutional neural network architecture; step measurement is an integral part of deriving important gait features. Given the limited accessibility of such gait features, an open-source step-measurement model that translates raw IMU signals into gait features would be valuable to researchers in Parkinson's disease. We demonstrate the use of the proposed model on the WearGait-PD dataset.

Keywords

Digital health

Convolutional Neural Network

Parkinson's Disease

Wearable Device

Inertial Measurement Unit

Time-series 

Co-Author(s)

Derek Hansen, University of Michigan
Meyeon Lee, Food and Drug Administration
Kimberly Kontson, U.S. Food and Drug Administration
Rajesh Nair, FDA
Guangxing Wang

First Author

Howon Ryu, UC San Diego, Department of Family Medicine & Public Health

Presenting Author

Howon Ryu, UC San Diego, Department of Family Medicine & Public Health