Sunday, Aug 4: 2:00 PM - 3:50 PM
5009
Contributed Papers
Oregon Convention Center
Room: CC-G129
Main Sponsor
Section on Nonparametric Statistics
Presentations
Electronic health records (EHRs) are promising but challenging resources for investigating and monitoring disease progression. Motivated by hospitalized COVID-19 patient data from West Bengal, India, we aim to dynamically predict the chance of "discharge" or "death" for these patients based on their longitudinal laboratory measurements. In total, there are 147,805 hospitalized COVID-19 patients with 1,091,322 laboratory measurements, and the high volume of data poses a computational challenge for dynamic prediction. In addition, features of EHR data such as sparsity, irregularity, and non-linearity make modelling difficult. To address these issues, we propose a two-step landmark competing risk model that summarizes the historical laboratory measurements using functional principal component analysis (FPCA) and then uses the landmark competing risk model for prediction. The proposed method is easy to implement using existing software. All estimated model parameters, the longitudinal history, and the at-risk population vary with the landmark time. The whole dataset was randomly split into training and testing sets at a ratio of 1:1. Different approaches for handling longitudinal observations, including the baseline measure, the mean, the most recent measure (last value carried forward), and linear regression, are adopted in the two-stage estimation and compared with the proposed method via the weighted Harrell's C-index and the Brier score. The proposed method outperforms all comparison methods at the distant landmark time.
Using the proposed model, we dynamically predict "death" or "discharge" at different landmark times and depict the association of these outcomes with COVID-19 medication according to patients' historical laboratory measurements. The results provide evidence that this model has the potential to assist clinicians in understanding patients' disease progression at different times and to suggest medication use based on their historical information.
Keywords
landmark model
competing risk
dynamic prediction
longitudinal data
survival
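The abstract above compares its FPCA summary against four simpler ways of condensing a patient's laboratory history at a landmark time: the baseline measure, the mean, the last value carried forward, and a linear regression. A minimal sketch of those four comparator summaries follows. The function name `landmark_summaries`, the use of the least-squares slope as the regression summary, and the example numbers are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from statistics import mean


def landmark_summaries(times, values, landmark):
    """Summarize one patient's lab history observed at or before `landmark`.

    `times` must be sorted in increasing order and aligned with `values`.
    Returns None when the patient has no measurement by the landmark time.
    """
    history = [(t, v) for t, v in zip(times, values) if t <= landmark]
    if not history:
        return None

    ts = [t for t, _ in history]
    vs = [v for _, v in history]
    t_bar, v_bar = mean(ts), mean(vs)

    # Least-squares slope of value on time. With a single time point the
    # slope is undefined, so it is set to 0.0 here (an assumption).
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in history)
    slope = sxy / sxx if sxx > 0 else 0.0

    return {
        "baseline": vs[0],   # first recorded value
        "mean": v_bar,       # average of the history so far
        "last": vs[-1],      # last value carried forward
        "slope": slope,      # linear trend up to the landmark
    }


# Hypothetical patient: measurements on days 0, 1, 2 and 5, landmark at day 3.
summary = landmark_summaries([0, 1, 2, 5], [10.0, 12.0, 14.0, 30.0], landmark=3)
print(summary)
```

Only measurements taken at or before the landmark enter the summary, which is why the day-5 value is ignored here. Recomputing the summaries at each landmark is what lets the history and the at-risk population vary with the landmark time.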
Causal inference methods play a pivotal role in elucidating the effects of interventions and treatments in various domains including healthcare. This research proposes a novel framework that integrates double machine learning and targeted minimum loss-based estimation with Gaussian process regression to estimate treatment effects. The approach dynamically selects inducing points and model parameters based on the complexity of the data and the estimated treatment effects. We illustrate the application of our framework in the domain of medical testing where accurate estimation of treatment effects is crucial for assessing the efficacy of diagnostic tests and medical interventions. Through simulations and real-world data, we demonstrate the effectiveness of our adaptive approach in providing efficient estimates of treatment effects and improving decision-making. The research contributes to advancing the field of causal inference by introducing an adaptive approach that dynamically adjusts to the data characteristics, thereby addressing complex challenges in medical testing and intervention evaluation.
Keywords
Causal inference
Artificial Intelligence (AI)
Double Machine Learning (DML)
Treatment effects
Adaptive
Inducing points
In the realm of generalized functional regression, interpreting results from multivariate functional principal component analysis (MFPCA) applied to diverse, multi-dimensional functional data can be complex. This study introduces an advanced model selection technique that leverages a forward selection approach in MFPCA. Here, functional variables are incrementally integrated, with their inclusion in the model being determined by a user-selected criterion. This method is adaptable to sparse data or data plagued with measurement errors. We benchmark the effectiveness of this novel approach against existing methods. A key application of this methodology is demonstrated in a study of neonate metabolites, with the goal of understanding the relationship between longitudinal trajectories and a binary morbidity outcome. This research marks a significant step forward in refining model selection strategies within generalized functional regression frameworks using MFPCA.
Keywords
Functional principal component analysis
Model selection
Generalized functional regression
Longitudinal data
Multivariate functional principal component analysis
Forward selection
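The forward selection idea in the abstract above, where functional variables enter one at a time according to a user-selected criterion, can be sketched generically. The sketch below is a plain greedy loop, not the authors' procedure: `forward_select`, `toy_criterion`, and the variable names are hypothetical, and the criterion is assumed to be "lower is better" (as with AIC or BIC). In practice the criterion would refit the MFPCA-based regression on each candidate subset.

```python
def forward_select(candidates, criterion, min_gain=0.0):
    """Greedy forward selection.

    candidates: list of variable names.
    criterion:  function mapping a list of selected names to a score,
                where lower is better.
    min_gain:   a variable is added only if it lowers the score by more
                than this amount.
    """
    selected = []
    best = criterion(selected)
    remaining = list(candidates)

    while remaining:
        # Score every one-variable extension of the current model.
        scored = [(criterion(selected + [c]), c) for c in remaining]
        score, choice = min(scored, key=lambda pair: pair[0])
        if best - score <= min_gain:
            break  # no candidate improves the criterion enough
        selected.append(choice)
        remaining.remove(choice)
        best = score

    return selected, best


def toy_criterion(subset):
    """Stand-in for a fitted-model criterion (purely illustrative)."""
    fit = 10.0
    fit -= 3.0 if "glucose" in subset else 0.0
    fit -= 2.0 if "lactate" in subset else 0.0
    return fit + 0.5 * len(subset)  # complexity penalty per variable


chosen, score = forward_select(["noise", "lactate", "glucose"], toy_criterion)
print(chosen, score)
```

The loop stops as soon as the best remaining candidate fails to improve the criterion, so the uninformative variable is never added in this toy setting.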
This work studies a year of posting behavior of social media users interacting with bot accounts and how their behavior differs from that of users who do not interact with bots. The posting behavior is described by a combination of the user's weekly number of posts, words, and ats. We propose a flexible functional regression model for the posting behavior of users, not only to provide a framework to describe and interpret how susceptible accounts differ from those that are not, but also to assess whether there is evidence that a new user, whose posting behavior has been observed repeatedly, is susceptible to bot interaction. The proposed methodology is investigated in finite samples through simulations, including scenarios that mimic the data application.
Keywords
Functional Data Analysis
Social Media
Testing
Social Bot Interaction
Posting Behavior
The primary aim of dose-finding studies is to pinpoint the optimal dose level based on subjects' responses, focusing on 'Efficacy' and 'Toxicity.' The optimal dose is identified as the point of maximum probability of efficacy without toxicity. While some studies use Emax, quadratic, or non-linear models, these are unsuitable for non-monotonic curves. Crippa & Orsini (2016) proposed regression splines, but they may not sufficiently describe reasonable dose-response distributions. This paper introduces functional data models for dose-finding studies, presenting a novel approach by applying them to meta-analysis data. We focus on three outcome probabilities: P(Efficacy), P(Toxicity), and P(Efficacy but No Toxicity), guided by monotonic and unimodal assumptions. Our functional data models estimate these probability distributions and introduce adjusted confidence intervals. Finally, we apply these models to analyze data on alcohol consumption and colorectal cancer.
Keywords
Functional Data
Dose Study
Meta-Analysis Data
Efficacy and Toxicity
Functional Anova
Smoothing methods
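The dose-finding abstract above identifies the optimal dose as the point where P(Efficacy but No Toxicity) is largest. Once that probability curve has been estimated on a dose grid, the final step is a simple argmax, sketched below. The function name `optimal_dose` and the grid values are hypothetical illustrations; they are not estimates from the paper.

```python
def optimal_dose(doses, p_eff_no_tox):
    """Return the (dose, probability) pair that maximizes the estimated
    probability of efficacy without toxicity over a dose grid."""
    if not doses or len(doses) != len(p_eff_no_tox):
        raise ValueError("doses and probabilities must be non-empty and aligned")
    best_index = max(range(len(doses)), key=lambda i: p_eff_no_tox[i])
    return doses[best_index], p_eff_no_tox[best_index]


# Hypothetical unimodal curve evaluated on a grid of five dose levels.
dose_grid = [0, 10, 20, 30, 40]
probabilities = [0.05, 0.30, 0.45, 0.35, 0.10]
print(optimal_dose(dose_grid, probabilities))
```

Under the unimodal assumption mentioned in the abstract, the curve has a single peak, so the grid argmax approximates the optimum as closely as the grid spacing allows.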
For testing hypotheses on a multi-dimensional parameter associated with a time series, the self-normalization (SN) method avoids the bandwidth choice and is asymptotically distribution-free under the null. So far, the literature has not provided a way of using SN for inference on an infinite-dimensional parameter. In this talk, I will propose an SN-based inference method for a functional parameter via the idea of sample splitting. The proposed statistics avoid the bandwidth choice and are asymptotically distribution-free. Our method has wide applicability and can be used for many time series testing problems in which an infinite-dimensional parameter is of main interest. Through simulations, we examine the finite-sample performance of the proposed methods in comparison with some existing methods, and show that they typically lead to more accurate size with a mild loss of power.
Keywords
Time Series
Infinite Dimensional Parameter
Sample Splitting
Inference
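For context on the self-normalization idea mentioned in the abstract above, the sketch below computes an SN statistic in the simplest finite-dimensional case, a test on the mean of a scalar time series. This is the scalar form as it is commonly presented in the time series literature, not the sample-splitting method proposed in the talk. The normalizer is built from recursive partial sums, so no bandwidth is chosen. The function name `sn_mean_statistic` and the example data are assumptions for illustration.

```python
def sn_mean_statistic(x, mu0):
    """Self-normalized statistic for H0: E[X_t] = mu0 (scalar series).

    T_n = n * (xbar - mu0)^2 / V_n, with
    V_n = n^(-2) * sum_{t=1..n} (S_t - t * xbar)^2,
    where S_t is the partial sum of the first t observations.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(x) / n

    # Accumulate the squared, centered partial sums in a single pass.
    partial_sum = 0.0
    v_sum = 0.0
    for t, value in enumerate(x, start=1):
        partial_sum += value
        v_sum += (partial_sum - t * xbar) ** 2
    v_n = v_sum / n ** 2
    if v_n == 0.0:
        raise ValueError("normalizer is zero (constant series)")

    return n * (xbar - mu0) ** 2 / v_n


# Tiny illustrative series; real use needs a much longer sample.
print(sn_mean_statistic([1.0, 2.0, 3.0, 4.0], mu0=2.0))
```

The limiting null distribution of this statistic is non-standard, so critical values are typically taken from simulated tables rather than from a chi-square distribution.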
We propose a new method, which we call Multivariate Functional Deep Neural Network (MFDNN), for classifying multivariate functional data across diverse domains. In contrast to existing approaches limited to Gaussian settings and uniform dimensional domains, MFDNN accommodates non-Gaussian data functions on varying dimensional domains (e.g., functions and images). The proposed classifier attains minimax optimality, substantiated by theoretical justifications. Demonstrations on simulated and real-world datasets underscore the versatility and efficacy of MFDNN. This approach complements recent advancements and extends previous results by exploring deep neural network procedures on multivariate functional data across different domains. Comparisons highlight the favorable performance of our method.
Keywords
Functional data
Deep neural network
Classification
Multivariate functional data