Innovations in Statistical Learning and Inference for Complex Data Structures

Mitra Devkota Chair
University of North Georgia
 
Sunday, Aug 3: 2:00 PM - 3:50 PM
4007 
Contributed Papers 
Music City Center 
Room: CC-102B 

Main Sponsor

Isolated Statisticians

Presentations

A Monte Carlo Simulation Comparison of Some Nonparametric Survival Functions for Incomplete Data

This article presents a comparative analysis of a novel piecewise exponential estimator (NPEE) for censored data against three widely recognized estimators: the Kaplan-Meier estimator (KME), the Nelson estimator (NE), and an empirical Bayes type estimator (EBE). The NPEE, characterized by continuity on [0, ∞) and an exponential tail with a hazard rate derived through a novel nonparametric approach, retains the core advantages of the KME while addressing shortcomings that restrict the broader applicability of the KME, NE, and EBE. To evaluate estimator performance, a simulation study was conducted using absolute bias and relative efficiency as quality metrics. Comparisons were performed across three levels of censoring, two sample sizes, and various quantiles. Results demonstrate that the NPEE, which is asymptotically equivalent to the KME, outperforms the other estimators for finite sample sizes, providing a robust alternative in survival function estimation.
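For readers unfamiliar with the baseline estimators named above, the following is a minimal sketch of the Kaplan-Meier product-limit estimate and a Nelson-type estimate for right-censored data. It assumes the NE refers to the exponential of the negative Nelson-Aalen cumulative hazard, and it treats subjects censored at time t as still at risk at t; both are common conventions, not details taken from the abstract. The function name `km_and_nelson` is hypothetical, and the NPEE itself is not reproduced because the abstract does not specify its construction.

```python
import math


def km_and_nelson(times, events):
    """Return [(t, S_KM(t), S_NA(t))] at each distinct observed event time.

    times  -- observed times (event or right-censoring)
    events -- 1 if the event was observed, 0 if the time is censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv_km = 1.0      # Kaplan-Meier product-limit estimate
    cum_hazard = 0.0   # Nelson-Aalen cumulative hazard
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv_km *= 1.0 - deaths / n_at_risk
            cum_hazard += deaths / n_at_risk
            out.append((t, surv_km, math.exp(-cum_hazard)))
        # Events and censored subjects both leave the risk set after t.
        n_at_risk -= removed
    return out


# Illustrative data only (not from the study).
curve = km_and_nelson([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Both estimates are step functions that stay flat after the last observed event, which is the kind of tail behavior a continuous estimator with an exponential tail is meant to avoid.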

Keywords

survival function

censored data

piecewise exponential estimator

Kaplan-Meier estimator

simulation study

nonparametric methods 

First Author

Ganesh Malla, University of Cincinnati - Clermont College

Presenting Author

Ganesh Malla, University of Cincinnati - Clermont College

A Win Odds Approach to Advancing Dose Optimization in Drug Development

Choosing the optimal dose is critical for drug development. The FDA's Project Optimus emphasizes that dose optimization should be based on the totality of safety, efficacy, PK, and PD. Current methods, such as the MTD approach, focus solely on safety, while others, like the Clinical Utility Score (CUS), combine weighted endpoints with obscure weight assignments. In contrast, the win odds method offers a comprehensive benefit-risk assessment, which integrates different efficacy, safety, and other endpoints into one composite endpoint. Specifically, the win odds method determines a winner by comparing the overall outcomes of pairs of patients on two different doses, from the most to the least important outcome. Additionally, it can assess benefit and risk across multiple candidate doses. Therefore, the win odds test can identify the optimal dose that has significantly more winners than the other doses. Extensive simulations are conducted to explore the robustness of the win odds method across various scenarios. It is also compared with the CUS method and applied to a clinical trial. Overall, the win odds method provides an effective and straightforward approach to dose optimization.
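The pairwise, prioritized comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each patient is a tuple of outcomes ordered from most to least important, exact equality counts as a tie on an outcome (real analyses often use clinically meaningful thresholds), and the win odds is taken as (wins + 0.5 × ties) / (losses + 0.5 × ties), the usual definition. The function names and the toy data are hypothetical.

```python
def compare_pair(a, b, higher_is_better):
    """Return 1 if patient a wins, -1 if patient b wins, 0 if fully tied.

    Outcomes are compared in priority order; the first outcome on which
    the two patients differ decides the winner.
    """
    for x, y, hib in zip(a, b, higher_is_better):
        if x == y:
            continue  # tied on this outcome, move to the next one
        a_better = (x > y) if hib else (x < y)
        return 1 if a_better else -1
    return 0


def win_odds(dose_a, dose_b, higher_is_better):
    """Win odds of dose A versus dose B over all cross-dose patient pairs."""
    wins = losses = ties = 0
    for a in dose_a:
        for b in dose_b:
            result = compare_pair(a, b, higher_is_better)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
            else:
                ties += 1
    denominator = losses + 0.5 * ties
    if denominator == 0:
        return float("inf")  # dose A wins every pair
    return (wins + 0.5 * ties) / denominator


# Toy data: (response indicator, worst toxicity grade) per patient.
# Response is more important; higher response and lower toxicity are better.
dose_a = [(1, 0), (1, 2), (0, 1)]
dose_b = [(1, 1), (0, 1)]
print(win_odds(dose_a, dose_b, higher_is_better=(True, False)))
```

A value above 1 favors dose A. The sketch covers only the point estimate; the significance test mentioned in the abstract is not shown.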

Keywords

Dose Optimization

Benefit-Risk Analysis

Win Odds 

Co-Author

Rachael Wen, Bristol-Myers Squibb Company

First Author

Cong Cao, Bristol-Myers Squibb Company

Presenting Author

Cong Cao, Bristol-Myers Squibb Company

Comparisons of Variable Selection and Inference Methods in High-dimensional Mediation Analysis

Mediation analysis is a framework to understand how a treatment affects the outcome through intermediate variables, namely mediators. Over the past decades, large and high-dimensional datasets have become easily stored and publicly available. This has led to many recent advances in mediation analysis, including developing models to fit more complex data structures and methods for mediator selection in high-dimensional settings. The statistical inference procedure following mediator selection is also an important step in mediation analysis. We study the effect of different variable selection and inference procedures through simulation studies. In this talk, I will discuss our simulation settings and the findings to provide guidelines that help distinguish among various approaches, highlight the advantages and disadvantages of each, and identify ones that perform better in certain scenarios.

Keywords

Linear structural equation modeling

Penalization

Bootstrap 

Co-Author(s)

Yuan Huang, Yale University
Yeying Zhu, University of Waterloo

First Author

Xizhen Cai, Williams College

Presenting Author

Xizhen Cai, Williams College

Dynamic Latent Space Models for Relational Data

Latent space models are powerful tools for analyzing relational data, offering low-dimensional representations of interactions. However, many real-world relationships evolve over time, requiring more flexible models. With the increasing availability of dynamic interaction data, capturing these changes is crucial. We extend the latent space model to embed actor trajectories in Euclidean space, enabling better inference of evolving relationships. This framework is particularly useful for studying complex networks, where uncovering latent structures provides critical insights. By tracking how entities' latent positions evolve, we can better understand shifting interaction patterns, emerging structures, and long-term trends, offering valuable perspectives for various domains. This is joint work with Dr. Owen Ward (Simon Fraser University). 

Keywords

network science

latent space models

dynamic networks

spatial embeddings 

Co-Author

Owen Ward, Simon Fraser University

First Author

Jie Jian, University of Chicago

Presenting Author

Jie Jian, University of Chicago

Model selection for big multivariate time series data using emulators

Order identification for models of big time series data presents computational challenges. Results from previous studies on big univariate time series suggest that methods based on kriging and optimization can reduce the computing time substantially while providing adequately plausible model orders. In today's world, however, one must often analyze multiple big time series simultaneously, such as the prices of multiple stocks or humidity measurements in various rooms of a house. This becomes a much bigger computational challenge to address, as one must take into account the cross-correlation between the individual time series. The goal of this work is to detail a method to fit big multivariate time series. The results show that the proposed technique can substantially decrease computing time while still providing reasonably accurate model orders.

Keywords

Big data

Kriging

Optimization

Order identification

ARMA 

First Author

Brian Wu, Xavier University

Presenting Author

Brian Wu, Xavier University

Online Quantile Regression

This paper tackles the challenge of integrating sequentially arriving data within the quantile regression framework, where the number of covariates is allowed to grow with the number of observations, the horizon is unknown, and memory is limited. We employ stochastic sub-gradient descent to minimize the empirical check loss and study its statistical properties and regret performance. In our analysis, we unveil the delicate interplay between updating iterates based on individual observations versus batches of observations, revealing distinct regularity properties in each scenario. Our method ensures long-term optimal estimation irrespective of the chosen update strategy. Importantly, our contributions go beyond prior works by achieving exponential-type concentration inequalities and attaining optimal regret and error rates that exhibit only short-term sensitivity to initial errors. A key insight from our statistical analysis is that appropriate stepsize schemes significantly mitigate the impact of initial errors on subsequent errors and regret. This underscores the robustness of stochastic sub-gradient descent in handling initial uncertainties.
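As a rough illustration of the single-observation update, the sketch below runs stochastic sub-gradient descent on the check loss ρ_τ(u) = u(τ − 1{u < 0}) for a linear quantile model. It keeps only the current coefficient vector in memory and does not need to know the horizon in advance. The stepsize η_t = step0 / √t is an assumption chosen for simplicity and is not the scheme analyzed in the paper; the function name, its parameters, and the simulated stream are likewise hypothetical.

```python
import random


def online_quantile_sgd(stream, dim, tau, step0=0.5):
    """One pass of stochastic sub-gradient descent on the check loss.

    stream -- iterable of (x, y) pairs, x a sequence of length dim
    tau    -- target quantile level in (0, 1)
    step0  -- base stepsize; eta_t = step0 / sqrt(t) (illustrative choice)
    """
    beta = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        resid = y - sum(b * xi for b, xi in zip(beta, x))
        # Negative sub-gradient of the check loss in beta is
        # (tau - 1{resid < 0}) * x.
        weight = tau - (1.0 if resid < 0 else 0.0)
        eta = step0 / t ** 0.5
        beta = [b + eta * weight * xi for b, xi in zip(beta, x)]
    return beta


def uniform_stream(n, seed=0):
    """Intercept-only toy stream with y ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    for _ in range(n):
        yield (1.0,), rng.random()


# With an intercept only, the estimate should drift toward the median 0.5.
estimate = online_quantile_sgd(uniform_stream(20000), dim=1, tau=0.5)
```

Because each update depends on the observation only through the sign of the residual, a badly chosen starting value moves the iterate by at most the stepsize times the covariate norm per step, which gives some intuition for why the stepsize schedule governs how quickly initial errors are forgotten.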

Keywords

online linear regression

quantile regression

nonsmooth optimization

sub-gradient descent

batch learning 

Co-Author(s)

Dong Xia, Hong Kong University of Science and Technology
Wenxin Zhou, University of Illinois Chicago

First Author

Yinan Shen, University of Southern California

Presenting Author

Yinan Shen, University of Southern California