Innovations in Statistical, Machine Learning, and Deep Learning Methods for Complex Data

Zhengjun Zhang Chair
University of Chinese Academy of Sciences
 
Chunming Zhang Organizer
University of Wisconsin-Madison
 
Monday, Aug 4: 10:30 AM - 12:20 PM
0311 
Invited Paper Session 
Music City Center 
Room: CC-104E 

Applied

Yes

Main Sponsor

Section on Nonparametric Statistics

Co Sponsors

International Statistical Institute
Section on Statistical Learning and Data Science

Presentations

Complex-time Representation of Longitudinal Processes and Topological Kime-Surface Analysis

Complex-time (kime) extends the traditional representation of temporal processes into the complex plane and captures the dynamics of both classical longitudinal time and repeated-sampling process variability. Building on 2D parametric manifold representations of time-varying processes that are repeatedly observed under controlled conditions, novel approaches for analyzing longitudinal data can be developed. Longitudinal processes that are typically modeled as time series are transformed into multidimensional surfaces called kime-surfaces, which jointly encode the internal dynamics of the processes and their sampling variability. Several alternative strategies exist for transforming classical time courses into kime-surfaces. The spacekime framework facilitates the application of advanced topological methods, such as persistent homology, to these kime-surfaces. Topological kime-surface analysis studies features of kime-surfaces, such as connected components, loops, and voids, that remain invariant under continuous deformations. These topological invariants can be used to classify different types of time-varying processes, detect anomalies, and uncover hidden patterns that are not apparent in traditional time-series analysis.
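
As a rough illustration of one possible time-course-to-kime-surface mapping, the sketch below places each repeated trial at its own phase and each sample at a complex-time point kappa = t * exp(i * phi). The evenly spaced phases, the function name kime_surface, and the toy trials are assumptions made for this example only; the abstract does not specify which transformation strategy the speaker uses.

```python
import cmath
import math


def kime_surface(trials, dt=1.0):
    """Map repeated trials to (kappa, value) pairs, with kappa = t * exp(i * phi).

    Assumption: trial j is assigned the evenly spaced phase
    phi_j = -pi + 2 * pi * j / n_trials. Other phase assignments are possible.
    """
    n_trials = len(trials)
    surface = []
    for j, series in enumerate(trials):
        phi = -math.pi + 2.0 * math.pi * j / n_trials
        for k, value in enumerate(series):
            t = (k + 1) * dt  # start at t > 0 so different phases give different kappa
            kappa = cmath.rect(t, phi)  # polar (t, phi) -> complex number
            surface.append((kappa, value))
    return surface


# Toy data: two repeated observations of the same three-sample process.
trials = [[1.0, 2.0, 3.0], [1.5, 2.5, 3.5]]
surface = kime_surface(trials, dt=0.5)
for kappa, value in surface:
    print(f"kappa = {kappa.real:+.3f}{kappa.imag:+.3f}i  value = {value}")
```

The resulting point cloud over the complex plane is the kind of object to which surface interpolation or topological summaries could then be applied.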

New AI models can be developed to predict, classify, tessellate, and forecast the behavior of high-dimensional longitudinal data, such as functional magnetic resonance imaging (fMRI), by leveraging the complex-time representation of time-varying processes and topological analysis. Kime-surfaces are mathematically rich and computationally tractable data objects that can be interrogated via statistical learning and artificial intelligence techniques. Spacekime analytics has broad applicability, ranging from personalized medicine and environmental monitoring to statistical obfuscation of sensitive information.
 

Keywords

complex-time, kime

spacekime analytics

AI

statistical learning

topological analysis 

Co-Author

Ivo Dinov, Statistics Online Computational Resource

Speaker

Ivo Dinov, Statistics Online Computational Resource

Stabilizing black-box model selection with the inflated argmax

Model selection is the process of choosing from a class of candidate models given data. For instance, methods such as the LASSO and sparse identification of nonlinear dynamics (SINDy) formulate model selection as finding a sparse solution to a linear system of equations determined by training data. However, absent strong assumptions, such methods are highly unstable: if a single data point is removed from the training set, a different model may be selected. This paper presents a new approach to stabilizing model selection that leverages a combination of bagging and an "inflated" argmax operation. Our method selects a small collection of models that all fit the data, and it is stable in that, with high probability, the removal of any training point will result in a collection of selected models that overlaps with the original collection. In addition to developing theoretical guarantees, we illustrate this method in (a) a simulation in which strongly correlated covariates make standard LASSO model selection highly unstable and (b) a Lotka–Volterra model selection problem focused on identifying how competition in an ecosystem influences species' abundances. In both settings, the proposed method yields stable and compact collections of selected models, outperforming a variety of benchmarks.
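
The idea of combining bagging with a set-valued argmax can be sketched in a few lines. The example below is a simplified stand-in, not the method from the talk: it replaces the inflated argmax with a plain margin rule (keep every model whose bagged selection frequency is within eps of the top frequency), and it uses a hypothetical black-box selector, select_model, that picks the single covariate most correlated with the response. The synthetic data, with two nearly identical covariates, are also an assumption chosen to mimic the unstable correlated-covariate setting.

```python
import random
from collections import Counter


def select_model(rows):
    """Hypothetical black-box selector: index of the covariate most correlated with y."""
    n = len(rows)
    ys = [r[-1] for r in rows]
    my = sum(ys) / n
    syy = sum((y - my) ** 2 for y in ys)
    best_j, best_score = 0, -1.0
    for j in range(len(rows[0]) - 1):
        xs = [r[j] for r in rows]
        mx = sum(xs) / n
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        score = abs(sxy) / ((sxx * syy) ** 0.5 + 1e-12)
        if score > best_score:
            best_j, best_score = j, score
    return best_j


def bagged_frequencies(rows, n_models, n_bags, rng):
    """Fraction of bootstrap resamples on which each candidate model is selected."""
    counts = Counter()
    for _ in range(n_bags):
        bag = [rng.choice(rows) for _ in rows]  # resample with replacement
        counts[select_model(bag)] += 1
    return [counts[j] / n_bags for j in range(n_models)]


def margin_argmax(weights, eps):
    """Simplified set-valued argmax: all indices within eps of the maximum weight."""
    top = max(weights)
    return {j for j, w in enumerate(weights) if w >= top - eps}


def make_data(n, rng):
    """Synthetic rows (x0, x1, x2, y); x0 and x1 are nearly identical copies of z."""
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x0 = z + 0.05 * rng.gauss(0, 1)
        x1 = z + 0.05 * rng.gauss(0, 1)
        x2 = rng.gauss(0, 1)
        y = z + 0.5 * rng.gauss(0, 1)
        rows.append((x0, x1, x2, y))
    return rows


rng = random.Random(0)
rows = make_data(60, rng)
freqs = bagged_frequencies(rows, n_models=3, n_bags=200, rng=rng)
selected = margin_argmax(freqs, eps=0.2)
print("bagged selection frequencies:", freqs)
print("selected set of models:", sorted(selected))
```

Returning a small set rather than a single winner is what allows the output to overlap before and after a training point is removed; the margin rule here only approximates that behavior and carries none of the theoretical guarantees mentioned in the abstract.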

This is joint work with Jake Soloff and Rina Barber. 

Keywords

Stability

Model selection

Bagging 

Speaker

Rebecca Willett, University of Chicago

Dynamic Causal Modelling using Chen-Fliess Expansion

Dynamic causal modelling (DCM) provides a powerful framework for studying the dynamics of large neural populations by using a neural mass model, a set of differential equations. Although DCM has increasingly been developed into a useful clinical tool in computational psychiatry and neurology, inferring the hidden neuronal states in the model from neurophysiological data remains challenging. Many existing approaches rely on a bilinear approximation to the neural mass model, which can misspecify the model and thus compromise accuracy. In this talk, we introduce the Chen-Fliess expansion for the neural mass model. The Chen-Fliess expansion is a type of Taylor series that converts the problem of estimating differential equations into an ill-posed nonlinear regression problem. We develop a maximum likelihood estimator based on the Chen-Fliess approximation. Both simulations and real data analyses are conducted to evaluate the proposed approach.
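
For orientation, the textbook form of the Chen-Fliess series for a generic control-affine system is written out below. The symbols (state x, inputs u_i, vector fields g_i, observation map h, words \eta) are generic notation assumed here for illustration; the abstract does not give the specific neural mass model or the truncation used in the talk.

```latex
% Control-affine system with inputs u_1, ..., u_m and the convention u_0 \equiv 1:
\dot{x}(t) = g_0\bigl(x(t)\bigr) + \sum_{i=1}^{m} g_i\bigl(x(t)\bigr)\, u_i(t),
\qquad x(0) = x_0, \qquad y(t) = h\bigl(x(t)\bigr).

% Chen-Fliess series: a sum over all words \eta = i_k \cdots i_1 with letters in \{0, \dots, m\}
y(t) = \sum_{\eta} \bigl(L_{g_\eta} h\bigr)(x_0)\, E_\eta[u](t),
\qquad
L_{g_\eta} h = L_{g_{i_1}} \cdots L_{g_{i_k}} h, \qquad L_{g_\emptyset} h = h.

% Iterated integrals, defined recursively on word length:
E_\emptyset[u](t) = 1,
\qquad
E_{i\eta}[u](t) = \int_0^t u_i(\tau)\, E_\eta[u](\tau)\, d\tau .
```

With no inputs (m = 0) the series reduces to the ordinary Taylor (Lie) series y(t) = \sum_k (L_{g_0}^k h)(x_0) t^k / k!, which is the sense in which the expansion generalizes a Taylor series. Truncating at a fixed word length gives a regression of the observed output on iterated integrals of the inputs, with coefficients that depend nonlinearly on the model parameters.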

Keywords

Dynamic causal modelling

Neural differential equations

Chen-Fliess expansion

Maximum likelihood estimation

Hidden state model

Computational psychiatry and neurology 

Speaker

Jian Zhang, University of Kent