Foundational Estimation and Inference Techniques: New Developments and Applications

Hanxuan Ye Chair
Texas A&M University
 
Wednesday, Aug 6: 10:30 AM - 12:20 PM
4181 
Contributed Papers 
Music City Center 
Room: CC-Davidson Ballroom A2 

Main Sponsor

International Chinese Statistical Association

Presentations

Assigning Important Factors to Good Design Columns for Better Experimental Analysis

Design criteria related to effect aliasing are commonly used to evaluate and select optimal designs of the same size in factorial experiments. However, limited attention has been given to how experimental factors should be assigned to design columns once an optimal design is selected. This study introduces a modified version of the Summary of Effect Aliasing Structure (SEAS) to assess the severity of aliasing for each design column relative to the others within the design matrix. In addition, a systematic approach is proposed to guide the assignment of important experimental factors to the design columns with minimal aliasing. Simulation results demonstrate improved performance in experimental analysis when factors are assigned using the proposed method.

Keywords

Summary of Effect Aliasing Structure (SEAS)

supersaturated design (SSD)

factorial designs

factor assignment 

Co-Author(s)

Frederick Kin Hing Phoa, Academia Sinica
Dave C. Woods, Southampton Statistical Sciences Research Institute, University of Southampton

First Author

Yi-Hua Liao, Institute of Statistics, National Tsing Hua University

Presenting Author

Frederick Kin Hing Phoa, Academia Sinica

Characterizing and comparing order-of-addition orthogonal arrays

Given a set of parameters, several non-isomorphic order-of-addition orthogonal arrays can be generated to design an order-of-addition experiment. Under resource constraints, selecting the best of these candidate designs helps extract as much information as possible from the observed data. In this talk, I will introduce a series of numerical indices, called the centralized generalized wordlength pattern, to characterize and compare order-of-addition orthogonal arrays. First, the J-characteristics will be justified for pairwise order matrices. Next, the centralized generalized wordlength pattern will be defined based on the sums of squared differences between the normalized J-characteristics of the pairwise order matrices determined by the fractional and full designs. Some theoretical and computational results will also be presented to motivate future work.
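The pairwise order matrices mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual pairwise-order coding (+1 if component j is added before component k, -1 otherwise, for each pair j < k); the function names `pwo_row` and `pwo_matrix` are hypothetical, and the sketch does not compute the J-characteristics or the centralized generalized wordlength pattern from the talk.

```python
from itertools import combinations, permutations

def pwo_row(order):
    """Pairwise-order coding of one run.

    `order` is a permutation of 0..m-1 giving the addition sequence.
    Returns one entry per pair j < k: +1 if j is added before k, else -1.
    """
    position = {component: idx for idx, component in enumerate(order)}
    m = len(order)
    return [1 if position[j] < position[k] else -1
            for j, k in combinations(range(m), 2)]

def pwo_matrix(design):
    """Stack the pairwise-order rows of every run in a design."""
    return [pwo_row(run) for run in design]

# Full design for m = 3 components: all 3! = 6 addition orders.
full_design = list(permutations(range(3)))
Z = pwo_matrix(full_design)

# In the full design each pairwise-order column is balanced (sums to zero).
column_sums = [sum(col) for col in zip(*Z)]
```

A fractional design would be passed to `pwo_matrix` in the same way, as a list of permutations, so that its columns can be compared with those of the full design.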

Keywords

Experimental design

Hadamard matrix

Hamming distance

Inversion

J-characteristic

Projection property 

First Author

Shin-Fu Tsai, National Taiwan University

Presenting Author

Shin-Fu Tsai, National Taiwan University

Fast approximation of Shapley values through factorial designs

The Shapley value is a well-known concept in cooperative game theory that provides a fair way to distribute revenues or costs among players. Recently, it has been widely applied in data science for data quality evaluation and model interpretation. There are also other applications beyond economics, such as marketing and biology. However, the computation of the Shapley value is an NP-hard problem. For a cooperative game with $n$ players, calculating Shapley values for all players requires calculating the values for $2^n$ different coalitions, which makes it infeasible for a large $n$. In this paper, we find that the value function of a cooperative game can be viewed as the expected response of a two-level factorial experiment. Based on this perspective, we derive a factorial effects representation of the Shapley value. Then, a fast approximation approach for Shapley values based on fractional factorial designs is proposed. Under certain conditions, our approach can obtain true Shapley values by calculating values of fewer than $4n^2-4$ different coalitions. Generally, highly accurate approximations of Shapley values can also be obtained by calculating values of additional
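To make the $2^n$ cost concrete, the following is a minimal sketch of the exact, brute-force Shapley computation that the abstract describes as infeasible for large $n$. It is the textbook definition, not the fractional factorial approximation proposed in the paper; the names `shapley_values` and `toy_value` and the three-player game are hypothetical illustrations.

```python
from itertools import combinations
from math import factorial

def shapley_values(n, value):
    """Exact Shapley values by enumerating all 2^n coalitions.

    `value` maps a frozenset of player indices to the coalition's worth.
    """
    players = range(n)
    # Evaluate the value function once for every coalition (2^n calls).
    v = {frozenset(s): value(frozenset(s))
         for r in range(n + 1) for s in combinations(players, r)}
    phi = [0.0] * n
    for i in players:
        others = [p for p in players if p != i]
        for r in range(n):
            # Weight of a coalition of size r that does not contain player i.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for s in combinations(others, r):
                s = frozenset(s)
                phi[i] += weight * (v[s | {i}] - v[s])
    return phi

# Hypothetical toy game: a coalition is worth 1 only if it holds players 0 and 1.
def toy_value(coalition):
    return 1.0 if {0, 1} <= coalition else 0.0

phi = shapley_values(3, toy_value)  # players 0 and 1 share the worth; player 2 gets 0
```

The dictionary `v` has $2^n$ entries, which is the bottleneck that motivates evaluating only a designed subset of coalitions.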

Keywords

interpretable artificial intelligence

design of experiments

computation

game theory 

Co-Author

Robert Mee, University of Tennessee

First Author

Zheng Zhou, Beijing University of Technology

Presenting Author

Wei Zheng

Generalized Method of Moments Approaches for Analyzing Recurrent Event Data

In this presentation, we introduce generalized method of moments (GMM) approaches for analyzing recurrent event data with informative censoring. Our framework employs a shared frailty model to account for the correlation between the recurrent event process and censoring time, allowing the frailty variable to be covariate-dependent. Unlike traditional shared-frailty proportional intensity models, our approach is based on rate models, enabling non-proportional rate functions across different covariate groups over time. The proposed GMM methods are robust, as they do not rely on Poisson process assumptions for recurrent events or specific distributional assumptions for frailty and censoring times. We establish the large-sample properties of our methods and evaluate their finite-sample performance through extensive simulation studies. Finally, we apply the proposed methods to a real dataset. 

Keywords

Generalized method of moments

Recurrent event data

Informative censoring

Covariate-dependent frailty 

First Author

Yu-Jen Cheng, National Tsing Hua University, Taiwan

Presenting Author

Yu-Jen Cheng, National Tsing Hua University, Taiwan

Principal Components Decomposition of Fraction of Variance in High Dimensional Linear Models

The fraction of variance explained (FVE) in a linear model quantifies the extent to which predictors account for outcome variability. In high dimensional settings, where traditional FVE estimators do not apply, modern FVE estimators such as GWASH struggle with strong correlation among predictors, as is often found, for example, in brain imaging data. We propose a decomposition framework that partitions the FVE into two components: a low dimensional component capturing the strong correlation, estimable by low dimensional methods, and a high dimensional component with the remaining weak correlation, estimable by high dimensional methods. Simulations demonstrate that decomposing out the dominant principal components (PCs) reduces bias in FVE estimation compared with standard approaches such as GCTA. Our method shows consistent performance asymptotically. Application to the Adolescent Brain Cognitive Development (ABCD) dataset validates its real-world applicability, capturing nuanced heritability signals in high-resolution brain imaging data. This work offers a robust framework for unbiased FVE estimation in high-dimensional models.
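As a point of reference for the quantity being estimated, here is a minimal low dimensional sketch. It assumes the common definition FVE = β'Σβ / (β'Σβ + σ²) and a toy setup with two independent standard normal predictors; all numbers are hypothetical, and the code shows only the classical adjusted R² estimate, not the proposed decomposition, GWASH, or GCTA.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta, sigma2 = 5000, np.array([1.0, 1.0]), 2.0

# Population FVE under an identity predictor covariance (toy assumption).
Sigma = np.eye(2)
signal = beta @ Sigma @ beta
fve_true = signal / (signal + sigma2)  # 2 / (2 + 2) = 0.5

# Simulate data from the linear model y = X beta + noise.
X = rng.standard_normal((n, 2))
y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)

# Classical low dimensional estimate: adjusted R^2 from least squares
# with an intercept. This is only usable when p is much smaller than n.
Xd = np.column_stack([np.ones(n), X])
coef = np.linalg.lstsq(Xd, y, rcond=None)[0]
resid = y - Xd @ coef
p = X.shape[1]
r2_adj = 1 - (resid @ resid / (n - p - 1)) / np.var(y, ddof=1)
```

With many more predictors than observations the residual sum of squares collapses to zero and this estimate breaks down, which is the setting the abstract addresses.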

Keywords

The fraction of variance explained (FVE)

brain imaging

high dimensional

principal component decomposition 

Co-Author(s)

Chun Chieh Fan, Laureate Institute for Brain Research
David Azriel, Technion
Armin Schwartzman, University of California, San Diego

First Author

Man Luo, UC San Diego, Department of Family Medicine & Public Health

Presenting Author

Man Luo, UC San Diego, Department of Family Medicine & Public Health

Unified robust estimation

Robust estimation is primarily concerned with providing reliable parameter estimates in the presence of outliers. Numerous robust loss functions have been proposed in regression and classification, along with various computing algorithms. In modern penalised generalised linear models (GLMs), however, there is limited research on robust estimation that can provide weights to determine the outlier status of the observations. This article proposes a unified framework based on a large family of loss functions, a composite of concave and convex functions (CC-family). Properties of the CC-family are investigated, and CC-estimation is innovatively conducted via the iteratively reweighted convex optimisation (IRCO), which is a generalisation of the iteratively reweighted least squares in robust linear regression. For robust GLM, the IRCO becomes the iteratively reweighted GLM. The unified framework contains penalised estimation and robust support vector machine (SVM) and is demonstrated with a variety of data applications.
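For readers unfamiliar with the special case that IRCO is said to generalise, the following is a minimal sketch of iteratively reweighted least squares for robust linear regression. It assumes Huber weights with a MAD-based scale, which is one classical choice and not the CC-family or the IRCO algorithm itself; the function name `huber_irls`, the tuning constant, and the simulated data are hypothetical.

```python
import numpy as np

def huber_irls(X, y, delta=1.345, n_iter=50, tol=1e-8):
    """Robust linear regression via iteratively reweighted least squares.

    Returns the coefficient vector and the final observation weights;
    small weights flag likely outliers.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    w = np.ones(len(y))
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale: normalised median absolute deviation.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, delta/u for large ones.
        w = np.where(u <= delta, 1.0, delta / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        # Weighted least squares step, solved as OLS on rescaled rows.
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, w

# Simulated line y = 1 + 2x with small noise and one gross outlier.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(50)
y[-1] += 10.0

beta, w = huber_irls(X, y)
```

The returned weights play the role described in the abstract: observations with weights well below 1 are candidates for outlier status.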

Keywords

robust

MM algorithm

variable selection

SVM

iteratively reweighted

GLM 

First Author

Zhu Wang

Presenting Author

Zhu Wang