Data-Driven Decision Making: Because Your Gut Isn't a Reliable Source

Chair: Leslie Moore
 
Tuesday, Aug 5: 2:00 PM - 3:50 PM
Session 4127: Contributed Papers
Music City Center
Room: CC-103B

Main Sponsor

Quality and Productivity Section

Presentations

Adaptive Design and Inference for Step-Stress Accelerated Life Testing

The adaptive step-stress accelerated life test (ada-ssALT) was developed to address several practical shortcomings of the conventional simple step-stress ALT (ssALT). While ada-ssALT demonstrates superior performance over ssALT in terms of estimate bias and precision, particularly when the lifetime of experimental units follows an exponential distribution, the constant hazard function of the exponential model restricts its applicability in real-world scenarios. To overcome this limitation, the log-location-scale family, which includes widely used distributions such as Weibull, log-normal, and log-logistic, provides greater flexibility through the incorporation of a shape parameter. This study extends ada-ssALT to a generalized form, allowing the test unit's lifetime at each stress level to follow a log-location-scale distribution. Here we present the model formulation, maximum likelihood estimation, and derivation of the information matrix, assuming a linear relationship between the standardized stress level and the location parameter. A simulation study compares the performance of ada-ssALT with ssALT across various design criteria. 
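
As a rough illustration of the setup described in this abstract (not the authors' implementation), the sketch below writes down a Type-I censored log-likelihood for a simple two-step step-stress test under a lognormal member of the log-location-scale family, with a linear link between the standardized stress level and the location parameter and a cumulative-exposure-style splice at the stress-change time. The function and variable names, the lognormal choice, and all numerical values are assumptions for illustration only; the adaptive stress-change rule of ada-ssALT is not shown.

# Minimal sketch (not the authors' code): log-likelihood for a simple
# step-stress ALT under a log-location-scale lifetime (lognormal member),
# with a linear stress-location link and a cumulative-exposure-type splice
# at the stress-change time tau1 (assumes tau1 < censoring time tau_c).
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def neg_loglik(params, t, delta, x1, x2, tau1, tau_c):
    """params = (b0, b1, log_sigma); t: observed times; delta: 1=failure, 0=censored at tau_c.
    x1, x2: standardized stress levels; tau1: stress-change time; tau_c: Type-I censoring time."""
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)                   # keep the scale parameter positive
    mu1, mu2 = b0 + b1 * x1, b0 + b1 * x2       # linear stress-location relationship
    # cumulative-exposure shift: time s at stress 2 "equivalent" to tau1 at stress 1
    s = tau1 * np.exp(mu2 - mu1)
    z1 = (np.log(t) - mu1) / sigma              # standardized log-times, segment 1
    t2 = np.where(t >= tau1, t - tau1 + s, s)   # shifted times for segment 2 (dummy value before tau1)
    z2 = (np.log(t2) - mu2) / sigma
    ll = np.where(
        delta == 1,
        np.where(t < tau1,
                 norm.logpdf(z1) - np.log(sigma * t),        # failure under stress 1
                 norm.logpdf(z2) - np.log(sigma * t2)),      # failure under stress 2
        norm.logsf((np.log(tau_c - tau1 + s) - mu2) / sigma) # Type-I censored at tau_c
    )
    return -np.sum(ll)

# illustrative use with made-up data (all quantities hypothetical)
rng = np.random.default_rng(1)
t = rng.lognormal(mean=2.0, sigma=0.5, size=50)
delta = (t < 15.0).astype(int)
t = np.minimum(t, 15.0)
fit = minimize(neg_loglik, x0=[2.0, -0.5, 0.0],
               args=(t, delta, 0.0, 1.0, 5.0, 15.0), method="Nelder-Mead")
print(fit.x)  # MLEs of (beta0, beta1, log sigma)

Maximizing such a likelihood yields the MLEs whose bias, precision, and information matrix the abstract studies; the paper's contribution is the generalized ada-ssALT formulation and its analysis, which this toy example does not reproduce.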

Keywords

accelerated life tests

adaptive design

Fisher information

maximum likelihood estimator

step-stress loading

Type-I censoring 

Abstract 903

Co-Author

Haifa Ismail-Aldayeh, University of Texas at San Antonio

First Author

David Han, University of Texas at San Antonio

Presenting Author

David Han, University of Texas at San Antonio

Revisiting the np(x) Control Chart: Performance Insights and the Impact of Parameter Estimation

The np(x) control chart, introduced by Wu et al. (2009), was designed to monitor the mean of a continuous variable using attribute inspection. While this approach offers advantages such as simplicity and cost-effectiveness, our re-examination reveals inconsistencies in prior performance assessments and provides new insights. Specifically, we demonstrate that for a two-sided np(x) chart to outperform the traditional Xbar chart, the required sample size must be significantly larger than previously recommended. Additionally, in practice, control charts are typically designed using estimated parameters, yet prior studies on the np(x) chart assume known parameters. We extend the analysis to this more realistic setting, showing that parameter estimation inflates the in-control average run length (ARL0), leading to a higher-than-expected false alarm rate. Through theoretical derivations and numerical studies, we identify conditions where the np(x) chart remains competitive and propose strategies to mitigate estimation effects. These findings refine our understanding of attribute-based control charts for mean monitoring and offer practical guidance for their implementation. 
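
For readers unfamiliar with attribute charts for the mean, the sketch below (a simplification for illustration, not the exact design of Wu et al. 2009) computes the average run length of a one-sided np(x)-style chart from the binomial model and then shows how replacing the known in-control mean with a Phase I estimate spreads the conditional in-control ARL. The classification rule, limits, and sample sizes are assumed values.

# Minimal sketch (simplified np(x)-style chart; details are assumptions):
# monitor the process mean by attribute inspection. Each of n items is
# flagged if its measurement falls below a lower warning limit
# LWL = mu0 - k*sigma; the flagged count D ~ Binomial(n, p), and the chart
# signals when D exceeds an upper control limit UCL.
import numpy as np
from scipy.stats import norm, binom

def arl_npx(n, k, ucl, mu, mu0=0.0, sigma=1.0):
    """Average run length of the (one-sided) attribute chart at true mean mu."""
    p = norm.cdf((mu0 - k * sigma - mu) / sigma)   # P(item flagged) at mean mu
    signal_prob = binom.sf(ucl, n, p)              # P(D > UCL)
    return np.inf if signal_prob == 0 else 1.0 / signal_prob

# in-control and shifted ARLs for one hypothetical design
n, k, ucl = 30, 1.0, 10
print(arl_npx(n, k, ucl, mu=0.0))    # ARL0 (in control)
print(arl_npx(n, k, ucl, mu=-0.5))   # ARL1 after a downward mean shift of 0.5 sigma

# conditional ARL0 when mu0 is replaced by a Phase I estimate (illustrative only)
rng = np.random.default_rng(0)
m, n_phase1 = 25, 5                               # hypothetical Phase I design
mu0_hats = rng.normal(0.0, 1.0 / np.sqrt(m * n_phase1), size=5_000)
cond_arl0 = np.array([arl_npx(n, k, ucl, mu=0.0, mu0=mh) for mh in mu0_hats])
print(cond_arl0.mean(), np.median(cond_arl0))     # spread shows the estimation effect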

Keywords

Parameter Estimation

False Alarm Rate

Average Run Length

Attribute control charts

np(x) control chart

Abstract 1918

Co-Author(s)

Mariana Oliveira, São Paulo State University (UNESP)
Marcela Machado, São Paulo State University (UNESP)
Subhabrata Chakraborti, The University of Alabama

First Author

Felipe Schoemer Jardim, Fluminense Federal University (UFF)

Presenting Author

Felipe Schoemer Jardim, Fluminense Federal University (UFF)

What is Analytic Fluency? A Thematic Content Analysis of Interviews with Expert Data Analysts

It is common sense that data should be analyzed well rather than badly. Despite this, the actual criteria by which we judge the quality of an analysis are opaque, intuitive, and heavily influenced by the uncertain standards of disciplinary norms, routines, or subjective judgments of what "feels right" or "seems off". This lack of explicit criteria is problematic not just for analysts facing real challenges in their work, but also for hiring, program evaluation, and teaching. Indeed, many analysts report that their training left them unprepared for the challenges faced in real-world analytic settings. To better understand what good analysis looks like, we conducted a qualitative study using grounded theory methodology in a sample of highly experienced analysts from diverse professional backgrounds. Our aim was to more explicitly identify the content of what we call analytic fluency, or the "soft skills" of data analysis used in real-world settings. Our analysis uncovered 5 rich, higher-order themes (i.e., families of skills) along with 11 lower-order sub-themes. We present these findings and consider their implications for data analysis practice. 

Keywords

analytic fluency

data analysis practice

psychology of data analysis

data analysis expertise

industrial-organizational psychology

data science 

Abstract 2328

Co-Author

Roger Peng, University of Texas at Austin

First Author

Matthew Vanaman, University of Texas at Austin

Presenting Author

Matthew Vanaman, University of Texas at Austin

Optimal Sparse Projection Design for Systems with Treatment Cardinality Constraint

Modern experimental designs often face the so-called treatment cardinality constraint, which limits the number of factors that can be included in each treatment. Experiments with such constraints are commonly encountered in engineering simulation, AI system tuning, and large-scale system verification, and they call for designs that retain statistical efficiency for modeling and analysis within the feasible region. In this work, we propose an optimal sparse projection (OSP) design for systems with treatment cardinality constraints. We introduce a tailored optimal projection (TOP) criterion that ensures good space-filling properties in subspaces and promotes orthogonality or near-orthogonality among factors. To construct the proposed OSP design, we develop an efficient algorithm based on orthogonal arrays and employ parallel-level permutation and expansion techniques to explore the constrained design space. Numerical examples demonstrate the merits of the proposed method.
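
The sketch below is only a toy illustration of what a treatment cardinality constraint means in practice; it is not the OSP construction or the TOP criterion. Candidate two-level designs whose runs each activate at most a fixed number of factors are scored by a simple surrogate that rewards near-orthogonal columns and well-separated runs, and the best of a random batch is kept. All names and settings are hypothetical.

# Toy sketch (not the authors' OSP/TOP construction): random search for a
# two-level design in which each run (treatment) includes at most `cap` active
# factors, scored by a simple surrogate criterion. Illustrative only.
import numpy as np

def feasible_design(n_runs, n_factors, cap, rng):
    """Random 0/1 design matrix whose rows each contain at most `cap` ones."""
    X = np.zeros((n_runs, n_factors), dtype=int)
    for i in range(n_runs):
        n_active = rng.integers(1, cap + 1)       # number of active factors in this run
        X[i, rng.choice(n_factors, size=n_active, replace=False)] = 1
    return X

def score(X):
    """Smaller is better: average |correlation| between factor columns,
    minus a small reward for the minimum Hamming distance between runs."""
    Xc = X - X.mean(axis=0)
    norms = np.linalg.norm(Xc, axis=0)
    if np.any(norms == 0):                        # a never-used or always-used factor: reject
        return np.inf
    C = (Xc.T @ Xc) / np.outer(norms, norms)
    avg_corr = np.abs(C[np.triu_indices_from(C, k=1)]).mean()
    D = (X[:, None, :] != X[None, :, :]).sum(axis=2).astype(float)
    np.fill_diagonal(D, np.inf)
    return avg_corr - 0.1 * D.min() / X.shape[1]

rng = np.random.default_rng(7)
best = min((feasible_design(16, 8, cap=3, rng=rng) for _ in range(2000)), key=score)
print(best, score(best), sep="\n")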

Keywords

Experimental designs

Space-filling design

Orthogonal arrays

Constraint space

Treatment constraint 

Abstract 2129

Co-Author(s)

Ryan Lekivetz, JMP
Xinwei Deng, Virginia Tech

First Author

Kexin Xie, Virginia Tech

Presenting Author

Kexin Xie, Virginia Tech

Joint Modeling of Disengagement and Collision Events from Autonomous Driving Study

As the popularity of artificial intelligence (AI) continues to grow, AI systems have become increasingly embedded in daily life, transforming industries and lifestyles. One typical application of AI systems is autonomous vehicles (AVs). In AVs, the relationship between the level of autonomy and safety is an important research question, and studying it involves two types of recurrent events data: disengagement and collision events. This paper proposes a joint modeling approach with multivariate random effects to analyze these two types of recurrent events data. The proposed model captures the intercorrelation between the levels of autonomy and safety in AVs. We apply an expectation-maximization (EM) algorithm to obtain the maximum likelihood estimates for the functional form of the fixed effects, the variance-covariance components, and the tuning parameter for the penalty term. The proposed joint modeling approach can be useful for recurrent events data with multiple event types in a variety of applications. We analyze disengagement and collision data from California's AV testing program to demonstrate its application.
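
As a structural illustration only (not the paper's model or its EM algorithm), the simulation below generates two correlated recurrent-event counts per unit from Poisson intensities that share correlated log-normal frailties, which is the kind of multivariate random-effects structure the abstract describes. All parameter values are made up.

# Minimal simulation sketch: two correlated recurrent-event processes per
# unit ("disengagements" and "collisions") whose Poisson intensities share
# correlated log-normal frailties, inducing dependence between event types.
import numpy as np

rng = np.random.default_rng(42)
n_units, follow_up = 500, 1.0                      # hypothetical fleet size and exposure time
beta_d, beta_c = 2.0, -1.0                         # baseline log-rates (assumed values)
cov = np.array([[0.5, 0.3],                        # frailty variance-covariance matrix:
                [0.3, 0.4]])                       # positive covariance links the two event types
b = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n_units)
rate_d = np.exp(beta_d + b[:, 0]) * follow_up      # disengagement intensity per unit
rate_c = np.exp(beta_c + b[:, 1]) * follow_up      # collision intensity per unit
n_diseng = rng.poisson(rate_d)                     # recurrent-event counts over follow-up
n_collis = rng.poisson(rate_c)
# the shared frailty correlation shows up as correlation between the two counts
print(np.corrcoef(n_diseng, n_collis)[0, 1])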

Keywords

AI reliability

AI robustness

Correlated frailty

EM algorithm

Recurrent events data

Survival models 

Abstract 1944

Co-Author(s)

Jared Clark
Jie Min
Yili Hong

First Author

Simin Zheng

Presenting Author

Simin Zheng

High-dimensional Quickest Change Detection with Adaptive Window-Based Subset Estimation

A large-scale multichannel sequential detection problem is considered, in which an event occurs at some unknown time and affects the distributions of an unknown subset of independent data streams, possibly at a different time for each of them. The goal is to detect this change as quickly as possible for any possible affected set of streams, while controlling the false alarm rate. A computationally scalable adaptive CuSum procedure is proposed. Its performance is analyzed in various high-dimensional regimes where the number of streams, the unknown number of affected streams, and the unknown delays in the emergence of the change all go to infinity as the false alarm rate goes to zero. Analytically, the procedure compares favorably to existing schemes of similar computational complexity and is shown to enjoy various asymptotic optimality properties in certain sparse and moderately high-dimensional regimes. Finally, the performance of the proposed procedure is compared with that of other methods in a simulation study of a Gaussian mean-shift problem.
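
The following sketch is a generic thresholded sum-of-CuSums detector for the Gaussian mean-shift setting, included only to make the multistream setup concrete; it is not the paper's window-based subset-estimation procedure, and the assumed shift size and thresholds are illustrative rather than calibrated to a false-alarm constraint.

# Generic multistream change-detection sketch for a Gaussian mean shift
# (thresholded aggregation of per-stream CuSums; thresholds are not tuned).
import numpy as np

def detect(X, delta=0.5, h=5.0, b=30.0):
    """X: (time, streams) observations, N(0,1) pre-change and N(delta,1)
    post-change in an unknown subset of streams. Per-stream CuSums are updated
    recursively; at each time, streams whose CuSum exceeds h form the estimated
    affected subset, and an alarm is raised when their summed CuSums exceed b."""
    n_streams = X.shape[1]
    W = np.zeros(n_streams)                       # per-stream CuSum statistics
    for t, x in enumerate(X, start=1):
        W = np.maximum(0.0, W + delta * x - delta**2 / 2)   # log-likelihood-ratio increments
        active = W > h                            # estimated affected subset at time t
        if W[active].sum() > b:
            return t, np.flatnonzero(active)      # alarm time and estimated subset
    return None, np.array([], dtype=int)

# illustrative run: 100 streams, 5 of which shift upward at time 200
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 100))
X[200:, :5] += 0.5
print(detect(X))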

Keywords

High-dimensional

Sequential change detection

Adaptive

CuSum

Multistream

Window-based 

Abstract 2210

Co-Author

Georgios Fellouris, University of Illinois Urbana-Champaign

First Author

Arghya Chakraborty, University of Illinois Urbana-Champaign

Presenting Author

Arghya Chakraborty, University of Illinois Urbana-Champaign