Efficient Estimation and Change Point Detection: From Kernel Methods to Neural Networks

Chair: Lyudmila Sakhanenko
Michigan State University
 
Wednesday, Aug 6: 10:30 AM - 12:20 PM
Session 4180
Contributed Papers
Music City Center
Room: CC-103A

Main Sponsor

Section on Nonparametric Statistics

Presentations

Efficient Estimation for Constrained Optimization Problems

Many estimands can be defined through constrained optimization problems with a stochastic component, for instance principal components analysis, constrained maximum likelihood estimation, and many penalized estimation problems. To obtain asymptotic theory when an estimand is on the boundary of the constraint set, researchers have leveraged significant insight from the perturbation analysis of optimization problems, which studies how optimization problems vary under small changes in auxiliary parameters. Despite this previously developed asymptotic theory, the literature on the efficiency of such estimators has focused on finite-dimensional settings and convex objective functions. We help fill this gap by showing how to derive efficient influence functions for general estimands defined through constrained optimization problems with potentially infinite-dimensional nuisance parameters. We again lean on perturbation theory, offering general results for practitioners interested in deriving influence functions for their own estimands, and we describe when pathwise differentiability may fail to hold. We provide examples showing how this theory can be applied to calculate influence functions for several specific estimands in both semiparametric and nonparametric settings, allowing for efficient root-n estimation.

Keywords

Efficient influence function

M-estimation

Constrained optimization

Asymptotic theory

Perturbation analysis of optimization problems 

Co-Author

Robert Kass, Carnegie Mellon University

First Author

Konrad Urban, Carnegie Mellon University

Presenting Author

Konrad Urban, Carnegie Mellon University

Asymptotics for Kernel Estimators of the CDF in Autoregressive Models

In this talk, I will discuss recent advances in the strong convergence of kernel estimators of the cumulative distribution function (CDF). The first part of the talk focuses on the law of the iterated logarithm (LIL) for L1-norms of kernel CDF estimators based on independent and identically distributed (i.i.d.) data. The second part extends the LIL from the i.i.d. case to Lp-norms of the residual-based kernel estimators of the error CDF in autoregressive models. Simulation results illustrating the estimators' performance are also presented.
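
As a rough illustration of the object studied above, the following minimal sketch (not the speaker's implementation; the Gaussian kernel, bandwidth, and grid are illustrative choices) computes a kernel CDF estimator from i.i.d. data and approximates the L1-norm of its deviation from the true CDF, the quantity whose almost-sure fluctuation rate the LIL describes.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Standard normal CDF, vectorized; serves as the integrated kernel G.
Phi = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

def kernel_cdf(x, data, h):
    """Kernel CDF estimator: F_hat(x) = (1/n) * sum_i G((x - X_i) / h)."""
    return Phi((x[:, None] - data[None, :]) / h).mean(axis=1)

n = 500
data = rng.standard_normal(n)
h = n ** (-1.0 / 3.0)                 # illustrative bandwidth, not a tuned choice
grid = np.linspace(-4.0, 4.0, 801)

F_hat = kernel_cdf(grid, data, h)
F_true = Phi(grid)                    # here the data are truly standard normal

# L1 distance ||F_hat - F||_1 approximated by a Riemann sum on the grid.
dx = grid[1] - grid[0]
l1 = float(np.sum(np.abs(F_hat - F_true)) * dx)
```

For the residual-based version in the autoregressive setting, `data` would be replaced by estimated model residuals.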

Keywords

Kernel Estimator

Autoregressive Models

Lp-norm

LIL 

First Author

Fuxia Cheng, Illinois State University

Presenting Author

Fuxia Cheng, Illinois State University

Deconvolving Kernel Regression Function Estimation Based On Right Censored Data

In this study, we propose a novel regression function estimator for scenarios involving errors-in-variables within a convolution model, particularly when the data are subject to right-censoring. By leveraging the tail behavior of the characteristic function of the error distribution, we establish the optimal local and global convergence rates for the kernel estimators. Our results reveal that the convergence rate depends on the smoothness of the error distribution: it is slower for supersmooth errors and faster for ordinary smooth errors, both locally and globally. Importantly, we demonstrate that while the choice of kernel K has a negligible impact on the optimality of the mean square error (MSE), the bandwidth h plays a critical role. Through simulations across varying sample sizes and 100 replications per setting, we validate the theoretical findings. Finally, we apply the proposed estimator to analyze the relationship between advanced lung cancer cases and Karnofsky Performance Scores, offering practical insight into this medical context.
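
To illustrate the deconvoluting-kernel idea (without censoring), the sketch below treats the ordinary smooth case of Laplace measurement error, where the deconvoluting version of a Gaussian kernel K has the closed form L(u) = K(u) - (b^2/h^2) K''(u), obtained from the error characteristic function 1/(1 + b^2 t^2). The error scale, bandwidth, and regression function are illustrative assumptions, not the authors' settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def deconv_kernel(u, b, h):
    """Deconvoluting Gaussian kernel for Laplace(b) measurement error:
    L(u) = K(u) - (b^2/h^2) K''(u), with K the standard normal density,
    i.e. L(u) = phi(u) * (1 + (b^2/h^2) * (1 - u^2))."""
    phi = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return phi * (1.0 + (b ** 2 / h ** 2) * (1.0 - u ** 2))

def deconv_nw(x, W, Y, b, h):
    """Nadaraya-Watson regression with the deconvoluting kernel,
    evaluated at points x given contaminated covariates W = X + eps."""
    L = deconv_kernel((x[:, None] - W[None, :]) / h, b, h)
    return (L * Y[None, :]).sum(axis=1) / L.sum(axis=1)

n, b, h = 2000, 0.3, 0.5
X = rng.standard_normal(n)
W = X + rng.laplace(scale=b, size=n)          # covariate observed with error
Y = X ** 2 + 0.1 * rng.standard_normal(n)     # illustrative regression function

grid = np.linspace(-1.0, 1.0, 21)
m_hat = deconv_nw(grid, W, Y, b, h)
```

The censored-data estimator in the talk would additionally reweight the responses (e.g. by inverse probability of censoring), which this sketch omits.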

Keywords

Kernel Regression

Deconvolution

Right Censored Data

Additive Measurement Errors 

Co-Author(s)

Shan Sun, Univ of Texas At Arlington, Dept. of Mathematics
Dengdeng Yu, University of Texas at San Antonio
Qiang Zheng

First Author

Erol Ozkan, University of Texas at Arlington

Presenting Author

Will Chen, University of Texas at Arlington

Kernel Estimation for Nonlinear Dynamics

Many problems involve data exhibiting both temporal and cross-sectional dependencies. While linear dependencies have been extensively studied, the theoretical analysis of estimators under nonlinear dependencies remains scarce. This work studies a kernel-based estimation procedure for nonlinear dynamics within the reproducing kernel Hilbert space framework, focusing on nonlinear stochastic regression and nonlinear vector autoregressive models. We derive nonasymptotic probabilistic bounds on the deviation between a kernel estimator and the true nonlinear regression function. A key technical contribution is a concentration bound for quadratic forms of stochastic matrices in the presence of dependent data, which may be of independent interest. Additionally, we characterize conditions on multivariate kernels required to achieve optimal convergence rates. 
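A minimal sketch of the kind of RKHS procedure described above, for a scalar nonlinear autoregression: kernel ridge regression of X_{t+1} on X_t with a Gaussian kernel. The dynamics map, kernel bandwidth, and regularization level are illustrative assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss_kernel(a, b, gamma=1.0):
    """Gaussian RKHS kernel k(a, b) = exp(-gamma * (a - b)^2)."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

# Simulate a nonlinear AR(1): X_{t+1} = f(X_t) + noise, with an
# illustrative nonlinear dynamics map f.
f = lambda x: 0.8 * np.sin(np.pi * x / 2.0)
T = 400
X = np.zeros(T)
for t in range(T - 1):
    X[t + 1] = f(X[t]) + 0.1 * rng.standard_normal()

x_in, y_out = X[:-1], X[1:]

# Kernel ridge regression in the RKHS: alpha = (K + lam * I)^{-1} y,
# with fitted function f_hat(x) = k(x, x_in) @ alpha.
lam = 1e-3
K = gauss_kernel(x_in, x_in)
alpha = np.linalg.solve(K + lam * np.eye(T - 1), y_out)

grid = np.linspace(-1.0, 1.0, 41)
f_hat = gauss_kernel(grid, x_in) @ alpha
```

The talk's theory concerns nonasymptotic deviation bounds for estimators of this type under temporal dependence, including the vector-valued (VAR) case.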

Keywords

nonlinear dynamics

vector autoregressive model

reproducing kernel Hilbert space

concentration inequality

time series

machine learning 

Co-Author

Adam Waterbury, Denison University

First Author

Marie-Christine Duker, Cornell University

Presenting Author

Adam Waterbury, Denison University

Distributional Change Point Detection via Dense ReLU Networks

We study the problem of detecting changes in conditional distributions over time, where the relationship between inputs and responses shifts at unknown time points, referred to as change points. The conditional distributions are assumed to belong to a structured class of hierarchical models and remain piecewise constant between change points. Our objective is to estimate the locations of these changes and analyze the conditions under which they can be reliably detected. To address this task, we introduce a novel method, Deep Distributional Change Point Detection, which combines a Dense ReLU network-based estimation algorithm with a Seeded Binary Segmentation procedure to efficiently identify and localize changes in conditional distributions. Our theoretical analysis examines the impact of varying model parameters as the number of observations increases, including the minimum spacing between consecutive change points and the smallest detectable shift in distributions. We establish fundamental limits on localization accuracy and derive the minimum signal strength required for consistent detection. Extensive numerical experiments demonstrate the effectiveness of the proposed method.
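
The two segmentation ingredients named above can be sketched in a simplified univariate mean-change setting: a CUSUM statistic scanned over seeded intervals (after Kovács et al.). The interval construction, threshold, and single-change search below are illustrative simplifications, not the authors' distributional method, which replaces the mean contrast with a ReLU-network-based distributional fit.

```python
import numpy as np

rng = np.random.default_rng(0)

def cusum_stat(x, s, e):
    """CUSUM statistics for a mean change on the segment x[s:e],
    one value per candidate split point."""
    seg = x[s:e]
    m = len(seg)
    t = np.arange(1, m)
    csum = np.cumsum(seg)[:-1]
    left = csum / t
    right = (seg.sum() - csum) / (m - t)
    return np.sqrt(t * (m - t) / m) * np.abs(left - right)

def seeded_intervals(n, decay=0.5, min_len=20):
    """Simplified seeded intervals: at each scale, overlapping intervals
    of geometrically shrinking length covering [0, n)."""
    intervals, length = [], n
    while length >= min_len:
        step = max(1, int(length * decay))
        for s in range(0, n - length + 1, step):
            intervals.append((s, s + length))
        length = int(length * decay)
    return intervals

def detect_single_cp(x, threshold):
    """Best change point candidate over all seeded intervals, or None."""
    best_val, best_cp = 0.0, None
    for s, e in seeded_intervals(len(x)):
        stats = cusum_stat(x, s, e)
        k = int(np.argmax(stats))
        if stats[k] > max(best_val, threshold):
            best_val, best_cp = stats[k], s + k + 1
    return best_cp

n, cp = 200, 100
x = np.concatenate([rng.standard_normal(cp),
                    3.0 + rng.standard_normal(n - cp)])
est = detect_single_cp(x, threshold=2.0 * np.sqrt(np.log(n)))
```

Multiple change points would be found by recursing on the sub-segments to the left and right of each detected split, as in standard binary segmentation.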

Keywords

Change Point

Dense ReLU Network

CUSUM estimator

Seeded Binary Segmentation 

Co-Author(s)

Carlos Misael Madrid Padilla, Washington University in St. Louis
Xuming He, Washington University in St. Louis

First Author

Shourjo Chakraborty

Presenting Author

Shourjo Chakraborty

Testing for latent structure via the Wilcoxon--Wigner random matrix

This paper considers the problem of testing for latent structure in large symmetric data matrices. The goal here is to develop statistically principled methodology that is flexible in its applicability and insensitive to data variation, thereby overcoming limitations facing existing approaches. To do so, we introduce and systematically study symmetric matrices, called Wilcoxon--Wigner random matrices, whose entries are normalized rank statistics derived from an underlying independent and identically distributed sample of absolutely continuous random variables. These matrices naturally arise as the matricization of one-sample problems in statistics and conceptually lie at the interface of nonparametrics, multivariate analysis, and data reduction. Among our results, we establish that the leading eigenvalue and corresponding eigenvector of Wilcoxon--Wigner random matrices admit asymptotically Gaussian fluctuations with explicit centering and scaling terms. These asymptotic results, which are parameter-free and distribution-free, enable rigorous spectral methodology for addressing two hypothesis testing problems, namely community detection and principal submatrix localization. 
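
The construction can be sketched numerically under the null of pure noise. The sketch below (a hedged approximation; the exact normalization in the paper may differ) fills a symmetric matrix with centered, scaled ranks of an i.i.d. continuous sample, one draw per entry on and above the diagonal, and checks that the largest eigenvalue sits near the semicircle edge 2*sqrt(n), consistent with Bai--Yin scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200
N = n * (n + 1) // 2                 # number of entries on/above the diagonal
sample = rng.standard_normal(N)      # any continuous law yields the same ranks
ranks = np.argsort(np.argsort(sample)) + 1   # ranks 1..N

# Normalize the ranks to mean zero and unit variance; a rank uniform on
# {1, ..., N} has mean (N + 1) / 2 and variance (N^2 - 1) / 12.
entries = (ranks - (N + 1) / 2.0) / np.sqrt((N ** 2 - 1) / 12.0)

# Symmetrize: fill the upper triangle (including diagonal), mirror it down.
A = np.zeros((n, n))
iu = np.triu_indices(n)
A[iu] = entries
A = A + A.T - np.diag(np.diag(A))

# Under the null the spectrum follows the semicircle law, and the largest
# eigenvalue concentrates near 2 * sqrt(n).
lam_max = float(np.linalg.eigvalsh(A)[-1])
```

Latent structure (e.g. a planted community or submatrix) would push an outlier eigenvalue above this bulk edge, which is what the paper's tests exploit.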

Keywords

Rank statistic

Hypothesis testing

Semicircle distribution

Bai--Yin law

Outlier eigenvalue and eigenvector

Spectral method 

Co-Author

Joshua Cape, University of Wisconsin-Madison

First Author

Jonquil Liao

Presenting Author

Jonquil Liao

A swarm of regressions from regression data with one and two predictors

In a simple linear regression model, one has a numeric response variable Y and a numeric predictor X. The model has an unknown intercept β_0 and slope β_1. We observe data (Y1, X1), (Y2, X2), …, (Yn, Xn), where Yi is drawn from the conditional distribution of Y given X = Xi. Using the data, one can estimate the intercept and slope by the least squares method; these estimators are linear in the Yi's and are unbiased with minimum variance. Assume all Xi's are distinct. A unique line passes through any two points (Yi, Xi) and (Yj, Xj) with i ≠ j, so we obtain a swarm of lines, each of which provides unbiased estimators of β_0 and β_1. The swarm is used to develop a nonparametric regression model of Y on X. We show that a suitably combined subset of the swarm reproduces the least squares estimators. We extend this result to the case of two predictors.
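
One concrete instance of combining the swarm (an illustrative sketch with simulated data, not necessarily the authors' combination): weighting each pairwise slope by the squared spacing of its predictors recovers the least squares slope exactly, since the weighted sum telescopes to S_xy / S_xx.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
x = rng.uniform(0.0, 10.0, n)        # distinct predictors with probability 1
y = 1.5 + 0.7 * x + rng.standard_normal(n)

# The swarm: the slope of the line through every pair of data points,
# each an unbiased estimator of beta_1.
i, j = np.triu_indices(n, k=1)
pair_slopes = (y[i] - y[j]) / (x[i] - x[j])

# Combine the swarm with weights proportional to (x_i - x_j)^2; this
# weighted average equals the least squares slope exactly.
w = (x[i] - x[j]) ** 2
slope_swarm = float(np.sum(w * pair_slopes) / np.sum(w))

slope_ols = float(np.polyfit(x, y, 1)[0])
```

The intercept follows from passing the combined line through the point of means, as in ordinary least squares.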

Keywords

simple linear regression

linear regression with two predictors

nonparametric simple linear regression

swarm of regressions 

Co-Author(s)

Marepalli Rao, University of Cincinnati
Tianyuan Guan

First Author

Zhaochong Yu

Presenting Author

Zhaochong Yu