Novel Statistical Methods for Mobile and Wearable Device Data

Xiaojing Sun Chair
Purdue University
 
Fei Xue Organizer
Purdue University
 
Sunday, Aug 4: 2:00 PM - 3:50 PM
1601 
Topic-Contributed Paper Session 
Oregon Convention Center 
Room: CC-B112 
Commercial wearable devices and smartphone apps for monitoring health-related behaviors have proliferated rapidly. Analyzing the data generated by commercial wearables and apps has the potential to alter how we study human behavior and how we intervene to improve health. These datasets are larger and more complex than those from traditional research studies and pose new statistical challenges.

Applied

No

Main Sponsor

Section on Risk Analysis

Co Sponsors

Committee on Women in Statistics
ENAR

Presentations

Statistical methods for integrating accelerometry data from multiple sources

Wearable devices such as accelerometers are increasingly used in observational and interventional studies. However, different studies often make inconsistent choices of device brands, wear positions, and processing pipelines, making it challenging to compare and combine data across studies. Because accelerometry data are often recorded continuously over multiple days and have complex time-dependency structures, existing data harmonization methods are inapplicable. We propose a new method to integrate multiday minute-level physical activity datasets from two different studies and model the shared information by common eigenvalues and eigenfunctions while allowing for batch-specific scale and rotation. The method is applied to different batches of NHANES accelerometry data, and the results demonstrate the superior performance of our proposed method in removing batch effects while preserving biological signals compared to existing approaches.

Speaker

Haochang Shou, University of Pennsylvania

Detailed activity assessment using participant-level daily quantile trajectories

This paper introduces functional quantile principal component analysis (FQPCA), a dimensionality reduction technique that extends the concept of functional principal components to the quantile regression framework, yielding a model that can explain subject-specific quantiles conditional on a set of principal component functions. FQPCA is able to capture shifts in the scale and distribution of the data that may affect the quantiles but not the mean, and it is also a robust methodology suitable for dealing with outliers and with heteroscedastic or skewed data. The need for such a methodology is exemplified by our motivating example: using accelerometer data from the National Health and Nutrition Examination Survey (NHANES), we analyze the physical activity levels of over 3,600 people during one day. The proposed methodology can deal with sparse and irregular time measurements, is evaluated on synthetic and real data, and is available as an R package.
 

Speaker

Jeff Goldsmith, Columbia University

New Methods For Analyzing Wearable Device Data

Modern longitudinal data from wearable devices consist of biological signals at high-frequency time points and offer unparalleled opportunities for discovering new health insights. Distributed statistical methods have emerged as a powerful tool to overcome the computational burden of estimation and inference with these intensively measured outcomes, but methodology for distributed functional regression remains limited. Developing functional regression tools is critical to appropriately modeling and understanding these data. We propose distributed estimation and inference procedures that efficiently estimate functional parameters for intensively measured longitudinal outcomes and overcome computational difficulties by leveraging recent developments in high-performance computing platforms. We demonstrate the practicality of our approach by applying our methods to accelerometer data from NHANES.

Speaker

Emily Hector, North Carolina State University

Individualized Dynamic Model for Multi-resolutional Data

Mobile health has emerged as a major success in tracking individual health status, due to the popularity and power of smartphones and wearable devices. It has also brought great challenges in handling the heterogeneous, multi-resolution data that arise ubiquitously in mobile health from irregular multivariate measurements collected from individuals. We propose an individualized dynamic latent factor model for irregular multi-resolution time series data to interpolate unsampled measurements of low-resolution time series. A major advantage of the proposed method is its ability to integrate multiple irregular time series and multiple subjects by mapping the multi-resolution data to the latent space. Moreover, the proposed model can capture heterogeneous longitudinal information through individualized dynamic latent factors. On the theoretical side, we provide the integrated interpolation error bound of the proposed estimator and derive the convergence rate with B-spline approximation methods. Simulation studies and an application to smartwatch data demonstrate the superior performance of the proposed method compared to existing methods.

Speaker

Fei Xue, Purdue University