Contributed Poster Presentations: Section on Nonparametric Statistics

Shirin Golchi Chair
McGill University
 
Tuesday, Aug 5: 10:30 AM - 12:20 PM
4100 
Contributed Posters 
Music City Center 
Room: CC-Hall B 

Main Sponsor

Section on Nonparametric Statistics

Presentations

13: Distance-based Repeated Measures MANOVA for Longitudinal Network on Spherical Surface

Longitudinal network data reflects the dynamic evolution of network structures and attributes over time, offering a unique opportunity to explore temporal dynamics, uncover trends, and identify the mechanisms driving network evolution. These insights are particularly valuable in areas such as social networks, biological systems, communication networks, and neuroscience/neurology. In this study, we introduce a novel nonparametric hypothesis-testing method specifically tailored for longitudinal network data on a spherical surface. The proposed method begins with the construction of a network distance matrix on the manifold and accounts for serial correlation across multiple time points, so that temporal dependencies are appropriately addressed. Experiments on both synthetic and real-world data demonstrate that the proposed method effectively controls type I error while maintaining robust statistical power to detect group or time effects and their interactions in network data.
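The distance-matrix idea can be illustrated with a much simpler, single-time-point analogue. The sketch below is not the authors' repeated-measures procedure: it assumes points on the unit sphere, great-circle distances, a PERMANOVA-style pseudo-F statistic, and a plain label permutation. All function names and the simulated data are hypothetical.

```python
import math
import random


def geodesic(u, v):
    """Great-circle distance between two unit vectors on the sphere."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp guards against rounding


def pseudo_f(dist, labels):
    """PERMANOVA-style pseudo-F statistic computed from pairwise distances only."""
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        ss_within += sum(
            dist[i][j] ** 2 for a, i in enumerate(idx) for j in idx[a + 1:]
        ) / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permutation_pvalue(dist, labels, n_perm=999, seed=0):
    """Monte Carlo permutation p-value for the group effect."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def sample_near(center, spread, rng):
    """Simulated point: perturb a center and project back onto the unit sphere."""
    v = [c + rng.gauss(0.0, spread) for c in center]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


# Simulated data (an assumption for illustration): two groups of 10 points
# scattered around directions that are 90 degrees apart.
rng = random.Random(1)
points = [sample_near((0, 0, 1), 0.2, rng) for _ in range(10)]
points += [sample_near((1, 0, 0), 0.2, rng) for _ in range(10)]
labels = [0] * 10 + [1] * 10
dist = [[geodesic(p, q) for q in points] for p in points]
p_value = permutation_pvalue(dist, labels)
```

Because the statistic uses only pairwise distances, the same code would run with any other network or manifold distance substituted for `geodesic`. Handling serial correlation across time points, as the abstract describes, would need a restricted permutation scheme that this sketch does not attempt.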

Keywords

Longitudinal Network Analysis

Distance-based Repeated Measures MANOVA

Manifold Learning 

Co-Author(s)

Xing Qiu
Hongyu Miao, Florida State University

First Author

Heling Tong

Presenting Author

Heling Tong

14: Impact of Arrests on Human Trafficking Ad Volume Using a Nonparametric Change-Point Model

Human trafficking is a critical issue, with online advertisements serving as proxies for illicit activity within the trafficking network. Law enforcement works diligently to disrupt these networks, but the long-term effectiveness of arrests in reducing online advertisements is unclear. Existing research highlights immediate impacts of law enforcement intervention but lacks consensus on sustained reductions. This study explores the relationship between arrests and fluctuations in trafficking ads using data from five cities. By analyzing time-series data, we investigate whether arrests trigger significant changes in ad volumes and identify potential changepoints associated with enforcement activity. Descriptive statistics reveal short-term declines in ad activity following arrests, though long-term patterns are ambiguous. Applying a nonparametric changepoint model, we observe short-term decreases but limited evidence for lasting impact on ad activity. These findings suggest that while arrests disrupt trafficking activity, they may not produce sustained reductions. This research emphasizes the importance of data-informed strategies and coordinated interventions to combat human trafficking.
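The abstract does not say which nonparametric changepoint model was used. As one possibility, a rank-based, Pettitt-style single-changepoint test with a permutation p-value can be sketched as follows. The series is simulated and the function names are hypothetical.

```python
import random


def sign(x):
    return (x > 0) - (x < 0)


def pettitt_statistic(series):
    """Return (split index, max |U_t|), where U_t compares ranks before and after t."""
    n = len(series)
    best_t, best_u = 0, 0
    for t in range(1, n):
        u = sum(sign(series[i] - series[j]) for i in range(t) for j in range(t, n))
        if abs(u) > abs(best_u):
            best_t, best_u = t, u
    return best_t, abs(best_u)


def changepoint_test(series, n_perm=199, seed=0):
    """Estimate one changepoint and a permutation p-value for its existence."""
    rng = random.Random(seed)
    t_hat, k_obs = pettitt_statistic(series)
    shuffled = list(series)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # shuffling destroys any time ordering
        if pettitt_statistic(shuffled)[1] >= k_obs:
            hits += 1
    return t_hat, (hits + 1) / (n_perm + 1)


# Simulated weekly ad counts (not real data): the level drops after week 20.
rng = random.Random(42)
before = [rng.gauss(100, 5) for _ in range(20)]
after = [rng.gauss(60, 5) for _ in range(20)]
t_hat, p_value = changepoint_test(before + after)
```

Here `t_hat` is the number of observations in the first regime. A plain shuffle treats observations as exchangeable under the null, so autocorrelated ad counts would call for a block permutation or a different reference distribution.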

Keywords

Nonparametric Changepoint Model

Time Series Analysis

Law Enforcement Impact

Human Trafficking 

Co-Author(s)

Arthur Graham, The University of Alabama
Subhabrata Chakraborti, The University of Alabama
Jason Parton, The University of Alabama
Nickolas Freeman, The University of Alabama

First Author

Sawyer Griffy

Presenting Author

Sawyer Griffy

15: Machine Learning Adjustment Boosts Efficiency of Exact Inference in Randomized Controlled Trials

In this work, we proposed a novel inferential procedure assisted by machine-learning-based adjustment for randomized controlled trials (RCTs). The method was developed under Rosenbaum's framework of exact tests in randomized experiments with covariate adjustment. Through extensive simulation experiments, we showed that the proposed method can robustly control the type I error and boost the statistical efficiency of an RCT. This advantage was further demonstrated in a real-world example. The simplicity, flexibility, and robustness of the proposed method make it a competitive candidate as a routine inference procedure for RCTs, especially when nonlinear association or interaction among covariates is expected. Its application may remarkably reduce the required sample size and cost of RCTs, such as phase III clinical trials.
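One way to read the covariate-adjusted exact test is sketched below. It assumes a complete randomization, a sharp null of no effect for any unit, and a k-nearest-neighbour regressor standing in for whatever machine learning model the authors used. The data are simulated and all names are hypothetical.

```python
import random


def knn_predict(x_train, y_train, x0, k=5):
    """k-nearest-neighbour regression: a simple stand-in for any ML learner."""
    nearest = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x0))[:k]
    return sum(y_train[i] for i in nearest) / k


def mean_difference(values, treat):
    treated = [v for v, z in zip(values, treat) if z == 1]
    control = [v for v, z in zip(values, treat) if z == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def adjusted_randomization_test(x, y, treat, n_perm=999, seed=0):
    """Permutation test on residuals from a treatment-blind outcome model."""
    # The learner never sees the treatment labels, so under the sharp null
    # (no effect for any unit) the residuals are fixed across re-randomizations.
    residuals = [yi - knn_predict(x, y, xi) for xi, yi in zip(x, y)]
    observed = abs(mean_difference(residuals, treat))
    rng = random.Random(seed)
    shuffled = list(treat)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # re-draw the assignment under the same design
        if abs(mean_difference(residuals, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Simulated trial (an assumption for illustration): nonlinear covariate
# effect x**2 plus a constant treatment effect of 1.0.
rng = random.Random(7)
n = 60
x = [rng.uniform(-2.0, 2.0) for _ in range(n)]
treat = [1] * (n // 2) + [0] * (n // 2)
rng.shuffle(treat)
y = [xi ** 2 + 1.0 * zi + rng.gauss(0.0, 0.3) for xi, zi in zip(x, treat)]
p_value = adjusted_randomization_test(x, y, treat)
```

The efficiency gain comes from the residuals varying less than the raw outcomes when the learner captures the nonlinear covariate effect. The test stays valid whatever the learner's quality, because only the treatment labels are permuted.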

Keywords

Machine learning

Randomized controlled trial

Exact inference 

Co-Author(s)

Alan Hutson, Roswell Park Cancer Institute
Xiaoyi Ma, Roswell Park Comprehensive Cancer Center

First Author

Han Yu, Roswell Park Comprehensive Cancer Center

Presenting Author

Han Yu, Roswell Park Comprehensive Cancer Center

16: Nonparametric Within-Between Model: Extending BART for Multilevel Modeling

The within-between model is a robust approach that addresses the constraints inherent in both fixed effects and random effects models by distinctly modeling within-group and between-group effects. This paper introduces a nonparametric extension of the Within-Between model for the analysis of hierarchical data using Bayesian Additive Regression Trees. Our extension permits flexible nonlinear relationships while preserving the interpretability benefits of the linear Within-Between framework. We establish theoretical guarantees on posterior concentration rates under appropriate conditions and present a framework for deriving interpretable summaries of the intricate nonparametric effects using surrogate models. Through simulation studies, we demonstrate the superior performance of our approach compared to existing methods, including linear fixed effects, random effects, and standard BART extensions, particularly when the true relationships are nonlinear. We illustrate the practical applicability of our method through its application to the National Education Longitudinal Study, wherein we analyze student dropout status while accounting for both student-level and school-level effects. 
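For orientation, the linear within-between model is commonly written as y_ij = b0 + bW (x_ij - xbar_j) + bB xbar_j + u_j + e_ij, where xbar_j is the mean of x in group j. The sketch below shows only the covariate decomposition that this formula relies on. How the authors pass these components to BART is not specified in the abstract, and the example values are hypothetical.

```python
from collections import defaultdict


def within_between_split(x, groups):
    """Return (between, within): group means and deviations from them."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, g in zip(x, groups):
        totals[g] += value
        counts[g] += 1
    between = [totals[g] / counts[g] for g in groups]
    within = [value - b for value, b in zip(x, between)]
    return between, within


# Hypothetical example: one student-level covariate observed in two schools.
hours = [2.0, 4.0, 6.0, 10.0, 12.0]
school = ["A", "A", "A", "B", "B"]
between, within = within_between_split(hours, school)
```

The `within` component sums to zero inside each group, so it carries no between-group information, which is what lets the two effects be estimated separately.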

Keywords

BART

Multilevel Modelling

Within-Between Model

Nonparametric Regression 

Co-Author(s)

Antonio Linero, Florida State University
Jared Murray

First Author

Soumyabrata Bose

Presenting Author

Soumyabrata Bose

17: Profiling Functional Effects of Long-Term Physical Activity on Risk of Diabetes Onset with All-of-US

Diabetes is a leading chronic condition that affects the glucose regulatory mechanism. Preventive care, such as physical activity, is essential to reduce the risk of diabetes onset. The All-of-US Research Program, launched by the NIH, records the daily active zone minutes of over 15,620 diverse participants across time. We conducted a retrospective study on All-of-US participants with data collected before the outbreak of COVID-19 in March 2020, when physical activity patterns began to shift. This project assessed the functional association of long-term physical activity with the risk of diabetes onset, using logistic regression with time-varying effects of daily activity durations. Individuals' long-term activity duration curves and effect curves are decomposed by shared orthonormal basis functions. We adopt the fused lasso to cluster individuals based on their latent projection features. Participants in the same subgroup share characteristic activity duration curves and functional effects of long-term physical activity. The subgroup functional effects are estimated through the alternating direction method of multipliers (ADMM). The details of the data analysis results are presented.

Keywords

Functional effects

Subgroup analysis

Time-varying effects

All-of-US research program

Fitbit 

Co-Author

Peter Song, University of Michigan

First Author

Rui Nie

Presenting Author

Rui Nie

18: Reassessing Estrogen Receptor Expression Thresholds for Prognosis Using Shape Restricted Modeling

We used a novel shape-restricted Cox model to determine the desirable ER expression cutoff for predicting breast cancer prognosis. Our model treats ER as a continuous variable, using a flexible monotone-shaped Cox regression to assess its association with survival outcomes holistically. The study included 3055 patients with stage II/III HER2-negative breast cancer. The primary outcomes were time to recurrence or death (TTR) and overall survival (OS). The shape-restricted Cox model identified 10% ER as the preferred cutoff to predict TTR. The log-rank test and a standard Cox model confirmed this finding: patients with ER ≥ 10% had a TTR benefit over those with ER < 10% (log-rank p < 0.001). No OS or TTR benefit of adjuvant endocrine therapy was observed in patients with 1% ≤ ER < 10% (HR 0.877, 95% CI 0.481 – 1.600, p = 0.668 for TTR and HR 0.698, 95% CI 0.337 – 1.446, p = 0.333 for OS). Using the shape-restricted Cox model, this study suggests a potential preferred threshold of 10% for predicting TTR, assisting physicians in effectively weighing the benefits and risks of adjuvant endocrine therapy for patients with ER < 10% disease, particularly in cases with severe adverse events.

Keywords

Estrogen receptor

Threshold

Survival

Modelling

Endocrine therapy

Breast cancer 

Co-Author(s)

Takeo Fujii, Center for Cancer Research, National Cancer Institute
Jing Ning, University of Texas, MD Anderson Cancer Center
Toshiaki Iwase, University of Hawaiʻi Cancer Center
Jing Qin, National Institute of Allergy and Infectious Diseases, NIH
Naoto Ueno, University of Hawaiʻi Cancer Center
Yu Shen, UT M.D. Anderson Cancer Center

First Author

Wenli Dong, UT MD Anderson Cancer Center

Presenting Author

Wenli Dong, UT MD Anderson Cancer Center

19: Testing for the geometric distribution in multi-sample settings

This paper deals with simultaneously testing whether k count variables, observed from independent samples, each have geometric laws, where the parameters of these k geometric laws may differ. A test statistic is proposed and shown to be asymptotically distribution-free under the null hypothesis, where asymptotic means k→∞. For moderate values of k, this asymptotic null distribution yields conservative tests, so a bootstrap procedure is suggested to approximate the null distribution, and this approximation is shown to be consistent. The asymptotic power of the test is also derived, allowing us to determine the alternatives that the new procedure is able to detect. The finite-sample performance of the proposal is studied via numerical simulation. The test is also applied to the 2024 PGA golf Championship data set. Finally, we observe that the proposed procedure can be adapted to build goodness-of-fit tests for other distributions in multi-sample settings.
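The abstract does not give the test statistic, so the following is only a sketch of the general recipe: a parametric bootstrap for a multi-sample geometric goodness-of-fit test. It assumes the geometric law on {0, 1, 2, ...}, for which the variance equals m(1 + m) with m the mean, and uses a hypothetical stand-in statistic built on that identity. The data are simulated.

```python
import math
import random


def geometric_draw(p, rng):
    """Failures before the first success, so the support is {0, 1, 2, ...}."""
    if p >= 1.0:
        return 0
    return int(math.log(1.0 - rng.random()) / math.log(1.0 - p))


def dispersion_gap(samples):
    """Average over samples of (sample variance - m(1 + m)).

    For geometric data this is close to zero in large samples.
    """
    total = 0.0
    for s in samples:
        m = sum(s) / len(s)
        var = sum((v - m) ** 2 for v in s) / len(s)
        total += var - m * (1.0 + m)
    return total / len(samples)


def bootstrap_pvalue(samples, n_boot=499, seed=0):
    """Two-sided parametric bootstrap p-value for the stand-in statistic."""
    rng = random.Random(seed)
    observed = dispersion_gap(samples)
    # Each sample keeps its own fitted parameter, as the k laws may differ.
    p_hats = [1.0 / (1.0 + sum(s) / len(s)) for s in samples]
    boot = []
    for _ in range(n_boot):
        resampled = [
            [geometric_draw(p, rng) for _ in s] for p, s in zip(p_hats, samples)
        ]
        boot.append(dispersion_gap(resampled))
    centre = sum(boot) / n_boot
    hits = sum(abs(b - centre) >= abs(observed - centre) for b in boot)
    return (hits + 1) / (n_boot + 1)


rng = random.Random(3)
# Simulated alternative: discrete uniform counts, which are underdispersed
# relative to a geometric law with the same mean.
samples = [[rng.randint(0, 6) for _ in range(15)] for _ in range(40)]
p_value = bootstrap_pvalue(samples)
```

Centring on the bootstrap mean accounts for the finite-sample bias of the stand-in statistic. A statistic sensitive to more than the first two moments would be needed to detect alternatives that match the geometric variance.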

Keywords

goodness-of-fit

count data

many samples

bootstrap

consistency

asymptotic power 

Co-Author(s)

Maria Dolores Jiménez-Gamero, Universidad de Sevilla
Leonard Santana, North-West University

First Author

James Allison, North-West University

Presenting Author

Leonard Santana, North-West University

20: Testing Separability of High-Dimensional Covariance Matrices

Due to their parsimony, separable covariance models have been popular in modeling matrix-variate data. However, the inference from such a model may be misleading if the population covariance matrix is not actually separable. This suggests the use of statistical tests of covariance separability. Likelihood ratio tests have tractable null distributions and good power when the sample size $n$ is not less than the number of variables $p$, but are not well-defined otherwise. Other existing separability tests for the $p>n$ case have low power for small sample sizes, and have null distributions that depend on unknown parameters, preventing exact error rate control. To address these issues, we propose novel invariant tests leveraging the core covariance matrix, a complementary notion to a separable covariance matrix. We show that testing separability of a covariance matrix is equivalent to testing sphericity of its core component. Based on this observation, we construct test statistics that are well-defined in high-dimensional settings and have distributions that are invariant under the null hypothesis of separability, allowing for exact simulation of null distributions. We study asymptotic null distributions and show consistency of our tests in a $p/n\rightarrow\gamma\in(0,\infty)$ asymptotic regime. Via simulation studies, we illustrate the high power of our proposed tests as compared to existing procedures.

Keywords

Core covariance matrix

eigenvalues

hypothesis testing

invariance

separable covariance matrix

separable covariance expansion 

Co-Author

Peter Hoff, Duke University

First Author

Bongjung Sung, Duke University

Presenting Author

Bongjung Sung, Duke University