Monday, Aug 5: 2:00 PM - 3:50 PM
1520
Topic-Contributed Paper Session
Oregon Convention Center
Room: CC-E141
Student paper award and John M. Chambers statistical software award winner presentations.
Applied
No
Main Sponsor
Section on Statistical Computing
Co Sponsors
Section on Statistical Graphics
Presentations
We propose a reproducible pipeline for extracting representative signals from 2D topographic scans of the tips of cut wires. The process fully addresses many potential problems in the quality of wire cuts, including edge effects, extreme values, trends, missing values, angles, and warping. The resulting signals can then be used in source determination, which plays an important role in forensic examinations. With commonly used measures such as the cross-correlation function, the procedure holds the false positive and false negative rates at the same desirable levels as the manual extraction pipeline while outperforming it in robustness and objectivity.
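To make the comparison step concrete, here is a minimal sketch of a maximum cross-correlation between two 1D signals over a range of integer lags. It is not the authors' pipeline: the function name `max_ccf`, the `max_lag` parameter, and the synthetic signals are assumptions made purely for illustration.

```python
import numpy as np

def max_ccf(x, y, max_lag):
    """Return (best_corr, best_lag): the largest Pearson correlation between
    x and a shifted copy of y over integer lags in [-max_lag, max_lag]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_corr, best_lag = -np.inf, 0
    for lag in range(-max_lag, max_lag + 1):
        # Align the two signals for this lag, then trim to a common length.
        if lag >= 0:
            a, b = x[lag:], y[:len(y) - lag]
        else:
            a, b = x[:len(x) + lag], y[-lag:]
        n = min(len(a), len(b))
        if n < 3:
            continue  # too little overlap to correlate
        a, b = a[:n], b[:n]
        if a.std() == 0 or b.std() == 0:
            continue  # correlation is undefined for a constant segment
        r = np.corrcoef(a, b)[0, 1]
        if r > best_corr:
            best_corr, best_lag = r, lag
    return best_corr, best_lag

# Synthetic example (hypothetical data): y is x shifted by 20 samples.
t = np.linspace(0, 10, 500)
base = np.sin(t) + 0.5 * np.sin(3.7 * t)
x_sig, y_sig = base[:480], base[20:]
corr, lag = max_ccf(x_sig, y_sig, max_lag=50)
print(corr, lag)
```

In a source-determination setting, a score like `corr` would be compared against a threshold chosen to balance false positives and false negatives; how that threshold is set is not described in the abstract.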
Co-Author(s)
Heike Hofmann, Iowa State University
Yuhang Lin, Center for Statistics and Applications in Forensic Evidence (CSAFE), Iowa State University
Speaker
Yuhang Lin, Center for Statistics and Applications in Forensic Evidence (CSAFE), Iowa State University
Multivariate spatio-temporal data have a spatial component referring to the location of each observation, a temporal component recorded at regular or irregular time intervals, and multiple variables measured at each spatial and temporal value. Often, such data are fragmented, reflecting a common practice of focusing on either spatial or temporal aspects separately. This fragmentation makes it difficult to handle them coherently and comprehensively. This work introduces a new data structure to facilitate the study of different portions or combinations of spatio-temporal data for exploratory data analysis. The proposed structure, implemented in the R package cubble, organizes spatial and temporal variables as two facets of a single data object, allowing them to be wrangled separately or combined while ensuring synchronization.
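The idea of two synchronized facets can be sketched in a few lines. The following is a Python illustration of the concept only, not the cubble API (cubble is an R package); the class name `SpatioTemporalCube`, its methods, and the station data are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SpatioTemporalCube:
    # Spatial facet: one record per site, keyed by site id.
    spatial: dict
    # Temporal facet: one record per (site id, time) pair.
    temporal: list

    def filter_sites(self, predicate):
        """Subset on the spatial facet; the temporal rows of dropped sites
        are removed too, so the two facets stay synchronized."""
        keep = {k: v for k, v in self.spatial.items() if predicate(v)}
        rows = [r for r in self.temporal if r["id"] in keep]
        return SpatioTemporalCube(keep, rows)

    def joined(self):
        """Combine both facets: attach each site's spatial fields to
        every one of its temporal rows."""
        return [{**self.spatial[r["id"]], **r} for r in self.temporal]

# Hypothetical weather-station data.
cube = SpatioTemporalCube(
    spatial={
        "A": {"lon": 144.9, "lat": -37.8},
        "B": {"lon": 151.2, "lat": -33.9},
    },
    temporal=[
        {"id": "A", "time": 1, "temp": 20.1},
        {"id": "A", "time": 2, "temp": 21.3},
        {"id": "B", "time": 1, "temp": 24.0},
    ],
)
south = cube.filter_sites(lambda site: site["lat"] < -35)
print(list(south.spatial), len(south.temporal))
```

Keeping the site-level fields in one place and the time-indexed measurements in another avoids repeating coordinates on every row, while `joined` produces the combined view when both are needed.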
We propose a novel method to study properties of graph-structured data by means of a geometric construction called the Dowker complex. We study this simplicial complex using persistent homology, which has been shown to be a prominent tool for uncovering relevant geometric and topological information in data. A positively weighted graph induces a distance on its set of vertices. A classical approach in persistent homology is to construct a filtered Vietoris-Rips complex on the vertices of the graph. However, when the vertex set of the graph is large, the resulting simplicial complex may be computationally hard to handle. A solution is the Dowker complex, which is constructed on a sample of the vertices of the graph called landmarks. One way to guarantee both sparsity of the set of landmarks and its proximity to all the vertices of the graph is to use ε-nets. We provide theoretical proofs of the stability of the Dowker construction and a comparison with the Vietoris-Rips construction. We perform experiments showing that the neural network model based on the Dowker complex performs well relative to baseline methods.
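The landmark idea can be illustrated with a small sketch. It assumes an undirected, connected graph with positive weights stored as a nested dict, a simple greedy pass to pick an ε-net under shortest-path distance, and the standard Dowker definition in which every vertex acts as a witness. The abstract does not specify the authors' exact construction or filtration, so all function names here are hypothetical.

```python
import heapq
from itertools import combinations

def shortest_paths(graph, source):
    """Dijkstra distances from source; graph is {u: {v: weight}}.
    Vertex labels must be mutually comparable (e.g. all ints)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def greedy_epsilon_net(graph, eps):
    """Pick landmarks so that every vertex is within eps of some landmark
    and any two landmarks are more than eps apart."""
    landmarks = []
    nearest = {v: float("inf") for v in graph}  # distance to closest landmark
    for v in graph:
        if nearest[v] > eps:
            landmarks.append(v)
            for u, d in shortest_paths(graph, v).items():
                nearest[u] = min(nearest[u], d)
    return landmarks, nearest

def dowker_simplices(graph, landmarks, r, max_dim=2):
    """Simplices (as tuples of landmarks) of the Dowker complex at scale r:
    a set of landmarks spans a simplex if some vertex is within r of all."""
    dist = {l: shortest_paths(graph, l) for l in landmarks}
    simplices = set()
    for v in graph:
        close = [l for l in landmarks if dist[l].get(v, float("inf")) <= r]
        for k in range(1, max_dim + 2):
            simplices.update(combinations(close, k))
    return simplices

# Hypothetical example: a path graph 0-1-2-3-4-5-6 with unit weights.
path = {i: {} for i in range(7)}
for i in range(6):
    path[i][i + 1] = 1.0
    path[i + 1][i] = 1.0

landmarks, nearest = greedy_epsilon_net(path, eps=1.0)
simplices = dowker_simplices(path, landmarks, r=1.0)
print(landmarks, sorted(simplices))
```

Because the complex is built on the landmarks only, its size depends on the number of landmarks rather than on the full vertex set, which is the computational motivation given above.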
Speaker
Jae Choi, University of Texas at Dallas
Medical research often involves the study of composite endpoints that combine multiple clinical events to assess the efficacy of treatments. When constructing composite endpoints, it is common practice to analyze the time to the first event. However, this approach overlooks outcomes that occur after the first event, resulting in information loss. Furthermore, the terminal event is not only of interest in itself but also acts as a competing risk for other types of outcomes. While regression models exist that analyze all such outcomes, not just the first event, and properly address the terminal event, they do not account for nonlinear relationships between covariates and composite endpoints. To address these issues, we introduce Random FORest for Composite Endpoints (Rforce), a method for endpoints consisting of non-fatal composite events and terminal events. The proposed method handles the dependent censoring due to the terminal events with the concept of pseudo-risk time. In simulation studies, Rforce demonstrates performance comparable to existing regression-based models under linear settings and outperforms competing methods under non-linear settings.
Speaker
Yu Wang, Medical College of Wisconsin
Functional mixed models are widely useful for regression analysis with dependent functional data, including longitudinal functional data with scalar predictors. However, existing algorithms for Bayesian inference with these models provide either scalable computing or accurate approximations to the posterior distribution, but not both. We introduce a new MCMC sampling strategy for highly efficient and fully Bayesian regression with longitudinal functional data. Using a novel blocking structure paired with an orthogonalized basis reparametrization, our algorithm jointly samples the fixed effects regression functions together with all subject- and replicate-specific random effects functions. Crucially, the joint sampler optimizes sampling efficiency for these key parameters while preserving computational scalability. Perhaps surprisingly, our new MCMC sampling algorithm even surpasses state-of-the-art algorithms for frequentist estimation and variational Bayes approximations for functional mixed models, while also providing accurate posterior uncertainty quantification, and is orders of magnitude faster than existing Gibbs samplers.