Statistical Methods for Functional Data

Uditha Wijesuriya Chair
University of Southern Indiana
 
Monday, Aug 4: 2:00 PM - 3:50 PM
4079 
Contributed Papers 
Music City Center 
Room: CC-102B 

Main Sponsor

Section on Statistical Learning and Data Science

Presentations

A data-driven way to compute vector summaries of persistence diagrams using functional data analysis

Vectorization plays a crucial role in Topological Data Analysis (TDA), bridging topological descriptors with conventional machine learning models. While numerous vectorization techniques exist, their effectiveness varies across datasets. We propose adaptive vectorization methods that adjust to the structure of the given data, optimizing representation for downstream tasks. Our approach refines vectorization using iterative optimization tailored to classification and regression settings. Extensive simulations demonstrate that these adaptive methods can outperform existing techniques in specific cases, yielding improved predictive accuracy and robustness. These findings highlight the importance of dataset-specific vectorization strategies in TDA. 
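
For orientation, the sketch below shows one common fixed (non-adaptive) vectorization, the Betti curve, which counts how many persistence intervals are alive at each value of a grid. It is not the adaptive method proposed in this abstract; the function name `betti_curve`, the toy diagram, and the grid are illustrative assumptions.

```python
def betti_curve(diagram, grid):
    """For each grid value t, count the (birth, death) pairs with birth <= t < death."""
    return [sum(1 for birth, death in diagram if birth <= t < death) for t in grid]


# Toy persistence diagram of (birth, death) pairs, made up for illustration.
diagram = [(0.0, 1.0), (0.2, 0.5), (0.4, 0.9)]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]

# Fixed-length vector summary that a standard classifier or regressor can consume.
vector = betti_curve(diagram, grid)
```

An adaptive scheme in the spirit of the abstract would presumably tune quantities such as the grid or the weighting to the data rather than fixing them in advance.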

Keywords

Topological Data Analysis

Functional Data Analysis

Data-Driven Optimization

Classification and Regression

Feature Engineering 

Co-Author

Aleksei Luchinskii

First Author

Umar Islambekov

Presenting Author

Aleksei Luchinskii

A Sparse Functional SVD Method for Clustering Functional Data

This work investigates the Sparse Multivariate Functional SVD (SMFSVD) method for clustering multivariate functional data. SMFSVD aims to construct a sparse, low-rank structured representation of multivariate functional data, serving as a novel exploratory tool for identifying interpretable clusters of subjects and functional variables. Within the SMFSVD framework, we introduce two approaches: the bicluster approach and the tricluster approach.

In the bicluster approach, adaptive Lasso and adaptive group Lasso penalties are applied to achieve sparsity in both subjects and functional variables. The tricluster approach extends this framework by introducing an additional adaptive Lasso penalty to select meaningful subregions within each functional variable, thereby capturing finer-grained structures.
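
As a rough illustration of how such penalties produce sparsity, the sketch below implements the shrinkage operators usually associated with Lasso-type and group-Lasso-type penalties in iterative shrinkage-thresholding schemes. It is a generic sketch, not the SMFSVD algorithm; the function names, the weight form 1 / |initial estimate|^gamma, and the default `gamma` are assumptions.

```python
import math


def soft_threshold(x, threshold):
    """Proximal operator of threshold * |x|: shrink x toward zero."""
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0


def adaptive_lasso_step(values, initial, lam, gamma=1.0, eps=1e-8):
    """Soft-threshold each entry with its own weight 1 / |initial estimate|**gamma.

    Entries with a small initial estimate get a large threshold and are
    more likely to be set exactly to zero.
    """
    return [
        soft_threshold(v, lam / (abs(v0) + eps) ** gamma)
        for v, v0 in zip(values, initial)
    ]


def group_soft_threshold(group, threshold):
    """Proximal operator of threshold * ||group||_2: shrink or zero the whole group."""
    norm = math.sqrt(sum(g * g for g in group))
    if norm <= threshold:
        return [0.0 for _ in group]
    scale = 1.0 - threshold / norm
    return [scale * g for g in group]
```

Entrywise thresholding removes individual coefficients, while group thresholding removes a whole block at once, which is the kind of behavior one would want when dropping an entire functional variable.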

Furthermore, real-world data are often sparsely and irregularly sampled, conditions that traditional functional data analysis techniques struggle to handle. We therefore incorporate a best-approximation computation within the SMFSVD framework. This enhancement ensures robust and effective performance when analyzing sparse and irregular functional data.

Keywords

functional data analysis

sparse group lasso

functional SVD

iterative shrinkage-thresholding algorithm 

Co-Author(s)

Sandra Safo, University of Minnesota
Thierry Chekouo Tekougang, University of Minnesota

First Author

Yue Zhao

Presenting Author

Yue Zhao

WITHDRAWN Boosting AI-Generated Biomedical Images with Confidence through Advanced Statistical Inference

Generative artificial intelligence (AI) has transformed the biomedical imaging field through image synthesis, addressing challenges of data availability, privacy, and diversity in biomedical research. This paper proposes a novel nonparametric method within the functional data framework to discern significant differences between the mean and covariance functions of original and synthetic biomedical imaging data, thereby enhancing the fidelity and utility of synthetic data. Focusing on surface-based synthetic imaging data, our approach employs triangulated spherical splines to address spatial heterogeneity. A key contribution is the construction of simultaneous confidence regions (SCRs) to rigorously quantify uncertainty in original-synthetic differences. The asymptotic properties of the proposed SCRs are established, providing exact coverage probabilities and demonstrating equivalence to those derived from noise-free imaging data. Simulation studies validate the coverage properties of the SCRs and evaluate the size and power of the associated hypothesis tests. The proposed method is applied to compare the original and synthetic brain imaging data from the Human Connectome Project.

Keywords

Biomedical imaging synthesis

Functional principal component analysis

Simultaneous confidence regions

Surface-based imaging data

Triangulated spherical splines 

Co-Author(s)

Shan Yu, University of Virginia
Guannan Wang, College of William and Mary
Lily Wang, George Mason University

First Author

Zhiling Gu, Yale University

WITHDRAWN Change Points Detection for Spherical Functional Autoregressive Processes

Every phenomenon can potentially experience transitions in its behavior, making change detection essential for understanding its evolution over time and space. The change point framework is a valuable tool for identifying shifts in dynamic processes and involves estimating the number and location of time points where transitions occur.

A growing area of interest is the study of random fields on the sphere, relevant in astrophysics and climate science, among others. Notably, spherical functional autoregressions (SPHAR(p)) effectively capture random behavior by integrating spatial and temporal dependencies. Detecting structural breaks in spherical random processes is crucial, especially in climate science, where changes in global surface temperature could help describe global warming.

Thanks to the change point framework, we generalize the SPHAR(p) model by relaxing the stationarity assumption. We also introduce a Lasso-based change point detection technique in this setting and assess its effectiveness on both synthetic and real data. 

Keywords

Change-point detection

Spherical random fields

Autoregressive processes

Functional analysis

Lasso 

Co-Author(s)

Alessia Caponera, LUISS Guido Carli
Pierpaolo Brutti, Sapienza Università di Roma

First Author

Federica Spoto, Harvard University

WITHDRAWN Functional Differential Equation Model for Dynamic System

Ordinary Differential Equations (ODEs) are commonly used in modeling dynamic systems. However, one major limitation of the ODE model is that it assumes the derivatives of the system depend only on the concurrent values. This concurrent assumption may oversimplify the mechanisms of dynamic systems and limit the applicability of differential equations. To address this, we propose a general Functional Differential Equation (FDE) model, which allows the derivative to depend explicitly on both the current value and a historical segment of the system through an unknown operator that maps historical curves to scalars. To estimate the FDE model from noisy observations, we propose Functional Neural Networks (FNNs) with a smooth hidden layer and establish their universal approximation property: the FNNs can universally approximate the operator in the FDE, and the solution to the approximate FDE can be uniformly and arbitrarily close to the solution to the original FDE. We propose a new method, based on the changes of the dynamic system on moving windows, to construct the FNN, and then make forecasts by solving the approximate FDE.
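
To make the model structure concrete, the sketch below integrates an equation of the form x'(t) = f(x(t), H(history)) with forward Euler, where H maps the most recent window of the curve to a scalar. In the abstract H is unknown and approximated by an FNN; here it is simply fixed to a window mean, and `solve_fde_euler`, `window_mean`, `decay`, the step size, and the window length are all illustrative assumptions.

```python
def solve_fde_euler(rhs, history_functional, initial_history, dt, n_steps):
    """Forward Euler for x'(t) = rhs(x(t), history_functional(window)).

    initial_history holds the values on the initial window, oldest first.
    The window length (in grid points) stays fixed as the solution advances.
    """
    window = len(initial_history)
    path = list(initial_history)
    for _ in range(n_steps):
        segment = path[-window:]               # historical segment of the curve
        summary = history_functional(segment)  # operator: curve segment -> scalar
        path.append(path[-1] + dt * rhs(path[-1], summary))
    return path


def window_mean(segment):
    """Hypothetical history operator: average of the curve over the window."""
    return sum(segment) / len(segment)


def decay(x, h):
    """Hypothetical right-hand side that depends on the current value and the history."""
    return -0.5 * x - 0.5 * h


path = solve_fde_euler(decay, window_mean, [1.0] * 5, dt=0.1, n_steps=50)
```

Setting the coefficient on `h` to zero recovers an ordinary differential equation, which is the concurrent special case the abstract argues can be too restrictive.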

Keywords

differential equation

dynamic systems

functional differential equation

functional universal approximation theorem

functional neural networks 

Co-Author

Xin Qi

First Author

Ruiyan Luo

Generalized projection-based shape outlier detection in functional data

Shape outliers, or abnormally shaped functional data, are difficult to detect when masked by surrounding functions. Detection attempts range from visualization tricks to quantification techniques. They typically summarize high-dimensional shape information into a finite set of indices using tools like statistical depths or functional principal component analysis (FPCA). However, existing approaches overlook the varying importance of the derived indices. To address this, we propose the Generalized Trimmed Functional Score (GTFS), an outlyingness index that automatically reweights the extracted indices. It is computed as the weighted sum of eigenscores, the projections of the curves onto FPCA eigenfunctions. The weighting scheme we designed leverages the extreme value distribution of the squared eigenscore maxima to adaptively select only the eigenfunctions helpful for detection. We also introduce a specialized centering scheme that makes the index magnitude-invariant by unmasking the shape outliers. We further provide a thresholding rule, based on the asymptotic distribution of GTFS, that controls the false-positive rate. Theoretical studies explore the statistical power and some asymptotic properties. Finally, we validate the practicality of the method via extensive simulations and a real-world application using smartphone human activity signal data.
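
The sketch below illustrates the general idea of a projection-based index: project each centered curve onto a set of eigenfunctions and combine the squared scores with weights. It is not GTFS. The weights are hand-picked rather than derived from an extreme value distribution, squaring the scores is an assumption, and a sine basis stands in for eigenfunctions that would normally be estimated by FPCA.

```python
import math


def eigenscores(curve, mean, eigenfunctions, dt):
    """Project a centered curve onto each eigenfunction (Riemann-sum inner product)."""
    centered = [x - m for x, m in zip(curve, mean)]
    return [dt * sum(c * p for c, p in zip(centered, phi)) for phi in eigenfunctions]


def weighted_score_index(scores, weights):
    """Weighted sum of squared scores; a zero weight drops that component."""
    return sum(w * s * s for w, s in zip(weights, scores))


# Toy setup: an orthonormal sine basis on [0, 1], evaluated at grid midpoints,
# stands in for estimated FPCA eigenfunctions (an assumption).
n = 200
dt = 1.0 / n
grid = [(i + 0.5) * dt for i in range(n)]
basis = [[math.sqrt(2) * math.sin(k * math.pi * t) for t in grid] for k in (1, 2, 3)]
mean = [0.0] * n

typical = [0.5 * b for b in basis[0]]        # varies only along component 1
shape_outlier = [0.5 * b for b in basis[2]]  # same size, different shape

weights = [0.0, 1.0, 1.0]                    # hypothetical: ignore component 1
idx_typical = weighted_score_index(eigenscores(typical, mean, basis, dt), weights)
idx_outlier = weighted_score_index(eigenscores(shape_outlier, mean, basis, dt), weights)
```

Both toy curves have the same overall size, but only the second one scores high, because the weights ignore the component along which typical curves vary.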

Keywords

Functional data analysis

Shape outlier

Outlier detection

Generalized extreme value distribution

Functional principal component analysis

Reweighting 

Co-Author

Arlene Kim, Korea University

First Author

Hyungjun Lim

Presenting Author

Hyungjun Lim

Multivariate Sparse Functional Data Classification via Bayesian Aggregation

Multivariate functional data arise in a wide range of applications, from medical diagnostics to economic time series. However, classification becomes notably difficult when data are sparsely and irregularly observed. To address this challenge, we propose a novel Bayesian ensemble framework that integrates multivariate functional principal component analysis (MFPCA) with probabilistic aggregation. Our method first extracts key features from the multivariate functional observations using MFPCA, then generates multiple bootstrap samples to capture variability in the data. Rather than relying on conventional ensemble heuristics, the proposed approach employs Bayesian generalized linear models (Bayesian GLMs) to systematically calibrate and combine predicted probabilities across bootstrap iterations. This principled treatment of uncertainty leads to more accurate and reliable classification outcomes. Extensive simulations and real-world case studies demonstrate that our framework consistently outperforms standard single classifiers and traditional ensemble techniques. 

Keywords

Multivariate Functional Principal Component Analysis (MFPCA)

Sparse Longitudinal Data


Functional Principal Component Analysis (FPCA)

Bootstrap Aggregating

Classification

Statistical learning 

First Author

Ahmad Talafha, St. Edward's University

Presenting Author

Ahmad Talafha, St. Edward's University