Statistical Methods at the Interface of Statistics and Machine Learning/Artificial Intelligence

Jeffrey Morris, Chair
University of Pennsylvania, Perelman School of Medicine
 
Jeffrey Morris, Organizer
University of Pennsylvania, Perelman School of Medicine
 
Giles Hooker, Organizer
University of Pennsylvania
 
Monday, Aug 4: 8:30 AM - 10:20 AM
Session 0815: Topic-Contributed Paper Session
Music City Center
Room: CC-106A

Applied: No

Main Sponsor

Section on Nonparametric Statistics

Co-Sponsors

Section on Statistical Learning and Data Science
Section on Statistics in Imaging

Presentations

Deep Neural Network for Functional Graphical Models Structure Learning

We propose a novel and flexible method to estimate the neighborhood of each node using a deep neural network-based functional data regression and feature selection approach with an arbitrary nonparametric form. The full graph structure is then recovered by combining the estimated neighborhoods. Our approach avoids common distributional assumptions on the random functions and circumvents the need for a well-defined precision operator, which may not exist in the functional data context. Furthermore, we establish model consistency for the proposed algorithm, whose convergence rate attains the classical nonparametric regression rate up to a logarithmic factor. We also identify a critical sampling frequency that governs the convergence rates of the deep neural network estimator. The empirical performance of our method is demonstrated through simulation studies and a real data application.
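The final graph-recovery step described above, combining per-node neighborhood estimates into one graph, is commonly done with an OR or AND symmetrization rule, as in classical neighborhood selection. A minimal sketch of that combining step (the rule names and function are illustrative; the paper's exact rule may differ):

```python
import numpy as np

def combine_neighborhoods(neighborhoods, rule="or"):
    """Combine per-node neighborhood estimates into an adjacency matrix.

    neighborhoods: list of sets; neighborhoods[j] holds the indices of
    variables selected as predictors of node j.
    rule: "or" keeps edge (j, k) if either node selects the other;
          "and" requires both. These are the standard symmetrization
    choices from neighborhood selection, used here for illustration.
    """
    p = len(neighborhoods)
    A = np.zeros((p, p), dtype=bool)
    for j, nb in enumerate(neighborhoods):
        for k in nb:
            A[j, k] = True
    A = (A | A.T) if rule == "or" else (A & A.T)
    np.fill_diagonal(A, False)  # no self-loops in the recovered graph
    return A
```

The OR rule is more liberal (higher recall), the AND rule more conservative; the asymptotic guarantees for neighborhood selection typically hold for either.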

Keywords

Deep Neural Network

Functional graphical model

Feature selection

Functional data analysis

Non-parametric statistics 

Speaker

Guanqun Cao, Michigan State University

Functional Mixed Model using Autoencoder Representations

Latent feature representations (e.g., PCA) are widely used for dimensionality reduction and statistical modeling of high-dimensional functional and multivariate data. Traditional functional regression models typically rely on linear basis expansions (e.g., PCA, B-splines, wavelets), but modern non-linear machine learning methods (e.g., autoencoders, GANs) offer more flexible alternatives.
In this work, we present two main contributions:
1. We propose CLaRe (Compact near-Lossless Latent Representations), a flexible evaluation framework for selecting among linear and non-linear latent feature representations in high-dimensional functional and multivariate data. CLaRe provides a principled set of criteria to assess methods based on their dimensionality reduction (compactness) and the information they preserve (near-losslessness).
2. We demonstrate how, when non-linear methods such as autoencoders are selected, they can be embedded within the Functional Mixed Model (FMM) framework of Morris and Carroll (2006). This integration enables flexible modeling of complex functional structures while retaining the interpretability and inference capabilities of FMMs. We illustrate the utility of this approach on multidimensional functional imaging data.
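The compactness vs. near-losslessness trade-off at the heart of the CLaRe criteria can be illustrated with a linear representation (truncated PCA) standing in for a generic encoder/decoder pair; the function name and the 1% threshold below are illustrative choices, not part of the framework itself:

```python
import numpy as np

def reconstruction_error(X, k):
    """Relative reconstruction error of a rank-k PCA representation.

    PCA is a linear stand-in for a generic encoder/decoder: the same
    trade-off between compactness (small k) and near-losslessness
    (small error) applies to nonlinear autoencoders.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Xhat = (U[:, :k] * s[:k]) @ Vt[:k]
    return np.linalg.norm(Xc - Xhat) / np.linalg.norm(Xc)

# Scan latent dimensions on synthetic rank-3 functional data and look
# for the smallest k that is "near-lossless" (threshold is a user choice).
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 3))
basis = rng.normal(size=(3, 50))
X = scores @ basis + 0.01 * rng.normal(size=(200, 50))
errs = [reconstruction_error(X, k) for k in range(1, 6)]
```

In this toy example the error drops sharply at k = 3 (the true latent dimension) and then flattens, which is the signature CLaRe-style criteria look for when selecting a representation.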

Speaker

Edward Gunning, University of Pennsylvania

High-Dimensional Multivariate Mediation Analysis for Brain Imaging: A Dimension Reduction Approach

Causal mediation analysis is critical for understanding how changes in the brain mediate the effects of environmental and genetic factors on neurological outcomes in neuroimaging studies. However, traditional mediation methods often face challenges when dealing with high-dimensional multivariate mediators, such as complex brain imaging data, due to the curse of dimensionality and reduced statistical power. This study introduces a novel methodology leveraging envelope methods to enhance dimensionality reduction, pathway identification, and statistical power in detecting indirect effects. The proposed approach is applied to Alzheimer's Disease Neuroimaging Initiative (ADNI) data to examine how structural changes in brain regions mediate the impact of genetic factors on cognitive decline. Simulation studies validate the asymptotic properties of the estimators and demonstrate that the method outperforms existing techniques, with improved power and reduced estimation variance. This approach advances mediation analysis in neuroimaging and extends to other high-dimensional multivariate contexts, offering a robust framework for disease detection and intervention strategies.
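The reduce-then-mediate logic can be sketched with a product-of-coefficients indirect effect computed after projecting the high-dimensional mediator onto a low-dimensional subspace. Here PCA directions stand in for the envelope subspace estimated in the paper (envelope estimation itself is more involved), so this is an illustrative pipeline, not the authors' estimator:

```python
import numpy as np

def indirect_effect(X, M, Y, k=1):
    """Product-of-coefficients indirect effect with a reduced mediator.

    M (n x p) is reduced to k directions (PCA here, as a stand-in for
    an envelope basis); the indirect effect is sum_j a_j * b_j, where
    a_j: X -> reduced mediator Z_j, and b_j: Z_j -> Y adjusting for X.
    """
    Mc = M - M.mean(axis=0)
    _, _, Vt = np.linalg.svd(Mc, full_matrices=False)
    Z = Mc @ Vt[:k].T                                   # reduced mediator scores
    Xc = X - X.mean()
    a = np.linalg.lstsq(Xc[:, None], Z, rcond=None)[0][0]      # X -> Z
    D = np.column_stack([Xc, Z])
    b = np.linalg.lstsq(D, Y - Y.mean(), rcond=None)[0][1:]    # Z -> Y | X
    return float(a @ b)

# Toy check: one true mediating direction, indirect effect = 2 * 1.5 = 3.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=n)
Zt = 2.0 * X + 0.5 * rng.normal(size=n)          # true mediator score
w = rng.normal(size=20)                          # loading onto 20 "regions"
M = np.outer(Zt, w) + 0.1 * rng.normal(size=(n, 20))
Y = 1.5 * Zt + 0.5 * rng.normal(size=n)
est = indirect_effect(X, M, Y, k=1)
```

Note the product a_j * b_j is invariant to the sign and scale ambiguity of the estimated directions, which is why the dimension-reduced indirect effect is identifiable even though individual directions are not.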

Co-Author

Kwun Chuen Gary Chan, University of Washington

Speaker

Yuexuan Wu, University of South Carolina

Learning Image Manifolds Using Functional and Shape Analysis

Machine learning and AI methods have been truly impressive in their performance on image and computer vision tasks. What are the reasons for this success? One reason is that despite image data being ultra-high-dimensional, most images lie on very low-dimensional manifolds. We speculate that AI methods can learn and exploit geometries of these low-dimensional manifolds to result in efficient procedures for vision tasks. In this paper, we investigate manifolds formed by images of 3D objects using tools from functional and shape analysis. First, we take individual 3D objects, say a chair, a car, or a sofa, and we form their pose image manifold -- the set of images formed by all 3D rotations of that object. To visualize and analyze these pose manifolds, we use a geometry-preserving transformation (e.g., the well-known multi-dimensional scaling) to map them to a smaller Euclidean space called the latent space. In the smaller latent space, we study the shapes of these image manifolds and compare them across different 3D objects. For example, allowing only 1D rotation of objects, we get curves (parameterized by the rotation angle). Allowing 2D rotations, we get surfaces (parameterized by two rotation angles). For complete 3D rotations, we get hypersurfaces (parameterized by three rotation angles) in latent spaces. We study the shapes of these functions (curves, surfaces, and hypersurfaces) using Kendall's shape analysis and obtain some clustering results.  
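The 1D case above (a pose manifold traced out by a single rotation angle, mapped to a latent space by a geometry-preserving transformation) can be mimicked with classical multidimensional scaling on synthetic high-dimensional points parameterized by an angle; the "images" below are a toy stand-in, not rendered 3D objects:

```python
import numpy as np

def classical_mds(D2, dim=2):
    """Classical MDS: embed points from a matrix of squared distances D2."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D2 @ J                    # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]          # top eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

# A toy "pose manifold": 40-dimensional features depending only on a
# rotation angle, so the underlying manifold is a closed curve.
theta = np.linspace(0, 2 * np.pi, 60, endpoint=False)
rng = np.random.default_rng(1)
W = rng.normal(size=(2, 40))
imgs = np.column_stack([np.cos(theta), np.sin(theta)]) @ W
D2 = ((imgs[:, None, :] - imgs[None, :, :]) ** 2).sum(-1)
Z = classical_mds(D2, dim=2)                 # latent-space curve
```

Because these toy points lie in a two-dimensional subspace, the 2D MDS embedding reproduces the pairwise geometry exactly, and the latent curve (here an ellipse parameterized by the rotation angle) is the object whose shape would then be compared across classes via Kendall's shape analysis.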

Co-Author(s)

Benjamin Beaudett
Shenyuan Liang, Florida State University
Anuj Srivastava, Florida State University

Speaker

Benjamin Beaudett

Representation Learning of Dynamic Networks

This study introduces a novel representation learning model for dynamic networks, capturing evolving relationships within a population. Framing the problem within functional data analysis, we represent dynamic networks as matrix-valued functions and embed them into a lower-dimensional functional space. This space preserves network topology while enabling attribute learning, community detection, and link prediction. Our model accommodates asymmetric embeddings to distinguish nodes' regulatory and receiving roles, ensuring continuity over time. Unlike discrete-time methods, our approach leverages a functional representation to infer network structures at unobserved time points. We validate our model through simulations and real-world applications, demonstrating superior link prediction accuracy compared to existing approaches. Applying our method to dynamic social networks in ant colonies, we uncover meaningful patterns in interactions and role transitions. Our findings align with known ant colony behaviors, highlighting the model's interpretability and utility in analyzing evolving networks. This work provides a statistical framework balancing representation learning capacity with interpretability, offering insights into dynamic network structures. 
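Two ingredients of the abstract, asymmetric sender/receiver embeddings and prediction at unobserved time points via a continuous-time representation, can be sketched in miniature. Linear interpolation below stands in for the smooth functional representation, and inner-product scores stand in for the model's link function; all names are illustrative:

```python
import numpy as np

def predict_links(times, U, V, t):
    """Score directed links at a (possibly unobserved) time t.

    U[k], V[k]: sender ("regulatory") and receiver embeddings of all
    nodes at observed time times[k], each of shape (n_nodes, dim).
    Embeddings are linearly interpolated in time and directed links
    scored by inner products: score[i, j] = <U_i(t), V_j(t)>.
    """
    U = np.asarray(U, dtype=float)
    V = np.asarray(V, dtype=float)
    n, d = U.shape[1], U.shape[2]
    Ut = np.array([[np.interp(t, times, U[:, i, j]) for j in range(d)]
                   for i in range(n)])
    Vt = np.array([[np.interp(t, times, V[:, i, j]) for j in range(d)]
                   for i in range(n)])
    return Ut @ Vt.T            # (n, n) scores; row = sender, column = receiver
```

Keeping separate sender and receiver trajectories is what lets the score matrix be asymmetric, distinguishing a node's regulatory role from its receiving role as in the abstract.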

Keywords

Functional data analysis

Representation learning

Graph analysis

Dimension reduction

Community detection 

Speaker

Haixu Wang, University of Calgary