Sunday, Aug 4: 2:00 PM - 3:50 PM
1601
Topic-Contributed Paper Session
Oregon Convention Center
Room: CC-B112
Commercial wearable devices and smartphone apps for monitoring health-related behaviors have proliferated rapidly. Analyzing the data generated by commercial wearables and apps has the potential to alter how we study human behavior and how we intervene to improve health. These datasets are larger and more complex than those from traditional research studies and bring new statistical challenges.
Applied
No
Main Sponsor
Section on Risk Analysis
Co Sponsors
Committee on Women in Statistics
ENAR
Presentations
Wearable devices such as accelerometers are increasingly used in observational and interventional studies. However, different studies often make inconsistent choices of device brands, wear positions, and processing pipelines, making it challenging to compare and combine data across studies. Because accelerometry data are often recorded continuously over multiple days and have complex time-dependency structures, existing data harmonization methods are not applicable. We propose a new method to integrate multiday minute-level physical activity datasets from two different studies and model the shared information by common eigenvalues and eigenfunctions while allowing for batch-specific scale and rotation. The method is applied to different batches of NHANES accelerometry data, and the results demonstrate the superior performance of our proposed method in removing batch effects while preserving biological signals compared to existing approaches.
This paper introduces functional quantile principal component analysis (FQPCA), a dimensionality reduction technique that extends the concept of functional principal components to the quantile regression framework, obtaining a model that can explain subject-specific quantiles conditional on a set of principal component functions. FQPCA is able to capture shifts in the scale and distribution of the data that may affect the quantiles but not the mean, and it is also a robust methodology suitable for dealing with outliers, heteroscedastic data, or skewed data. The need for such a methodology is exemplified by our motivating example: using accelerometer data from the National Health and Nutrition Examination Survey (NHANES), we analyze the physical activity level of over 3,600 people during one day. The proposed methodology can deal with sparse and irregular time measurements, is evaluated on synthetic and real data, and is available as an R package.
Modern longitudinal data from wearable devices consist of biological signals at high-frequency time points and offer unparalleled opportunities for discovering new health insights. Distributed statistical methods have emerged as a powerful tool to overcome the computational burden of estimation and inference with these intensively measured outcomes, but methodology for distributed functional regression remains limited. Developing functional regression tools is critical to appropriately modeling and understanding these data. We propose distributed estimation and inference procedures that efficiently estimate functional parameters for intensively measured longitudinal outcomes and overcome computational difficulties by leveraging recent developments in high-performance computing platforms. We demonstrate the practicality of our approach by applying it to accelerometer data from NHANES.
Mobile health has emerged as a major success in tracking individual health status, owing to the popularity and power of smartphones and wearable devices. It has also brought great challenges in handling the heterogeneous, multi-resolution data that arise ubiquitously in mobile health from irregular multivariate measurements collected from individuals. We propose an individualized dynamic latent factor model for irregular multi-resolution time series data to interpolate unsampled measurements of low-resolution time series. A major advantage of the proposed method is its ability to integrate multiple irregular time series and multiple subjects by mapping the multi-resolution data to a latent space. Moreover, the proposed model can capture heterogeneous longitudinal information through individualized dynamic latent factors. On the theoretical side, we provide an integrated interpolation error bound for the proposed estimator and derive its convergence rate under B-spline approximation. Simulation studies and an application to smartwatch data demonstrate the superior performance of the proposed method compared to existing methods.