Wednesday, Aug 6: 8:30 AM - 10:20 AM
4144
Contributed Papers
Music City Center
Room: CC-Davidson Ballroom A3
This session focuses on advances in design and analysis of experiments as well as statistical inference for complex physical systems.
Main Sponsor
Section on Physical and Engineering Sciences
Presentations
The analysis of screening designs is often based on a second-order model with linear main effects, two-factor interactions, and quadratic effects. When the main effect columns are orthogonal to all the second-order terms, a two-stage analysis may be conducted, starting with fitting a main-effects-only model. A popular technique to achieve this orthogonality is to take any design and append its foldover runs. In this talk, we show that this foldover technique is even more powerful than originally thought because it also includes opportunities for unbiased estimation of the variance either by pure error or lack of fit. We find optimal foldover designs for main effect estimation and other designs that balance main effect estimation and model selection for the important factors. A real-life implementation of our new designs involving 8 factors and 20 runs is discussed.
Keywords
Optimal design
Response surface design
Experimental design
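The orthogonality property mentioned in the foldover abstract above can be checked numerically. The following is a minimal sketch, assuming a coded design with levels -1, 0, and 1; the base design is an arbitrary illustrative one, and the function names `foldover` and `second_order_columns` are hypothetical helpers, not part of the presented work.

```python
import itertools

import numpy as np


def foldover(design):
    """Stack a coded design on top of its sign-reversed (foldover) runs."""
    design = np.asarray(design, dtype=float)
    return np.vstack([design, -design])


def second_order_columns(design):
    """Build the two-factor interaction and quadratic columns of a design."""
    n_factors = design.shape[1]
    interactions = [
        design[:, i] * design[:, j]
        for i, j in itertools.combinations(range(n_factors), 2)
    ]
    quadratics = [design[:, i] ** 2 for i in range(n_factors)]
    return np.column_stack(interactions + quadratics)


# Arbitrary illustrative base design: 6 runs, 4 factors (not from the talk).
rng = np.random.default_rng(0)
base = rng.choice([-1.0, 0.0, 1.0], size=(6, 4))

folded = foldover(base)

# Inner products between every main effect column and every second-order column.
cross = folded.T @ second_order_columns(folded)
print(np.allclose(cross, 0.0))
```

The check holds for any base design because each product of a main effect with a second-order term is an odd function of the factor settings, so the contribution of a run cancels with that of its foldover run.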
High-throughput screening, in which large numbers of compounds are traditionally studied one-at-a-time in multiwell plates, is widely used across many areas of the biological and chemical sciences including drug discovery. To improve the efficiency of these screens, we propose a new class of supersaturated designs that guide the construction of pools of compounds in each well. Because the size of the pools is typically limited by the particular application, the new designs accommodate this constraint and are part of a larger procedure that we call Constrained Row Screening, or CRowS. We introduce the designs and their construction, and study their behavior as a function of the constraint. Via simulation, we show that CRowS is statistically superior to the traditional one-compound-one-well approach as well as an existing pooling method and, as time permits, provide results from two separate applications, both related to the search for solutions to antibiotic-resistant bacteria.
Keywords
drug discovery
screening
experimental design
Lasso
Co-Author(s)
Stephen Wright, Miami University
Isaac Williams, Miami University
Richard Page, Miami University
Andor Kiss, Miami University
Surendra Bikram Silwal, Miami University
Maria Weese, Miami University
David Edwards, The Citadel, The Military College of South Carolina
Brian Ahmer, The Ohio State University
Meng Wu, The Ohio State University
Emily Rego, The Ohio State University
Zhihong Lin, The Ohio State University
First Author
Byran Smucker, Henry Ford Health
Presenting Author
Byran Smucker, Henry Ford Health
In this short talk, we survey the literature on binary maximin distance and minimax distance designs for both regular and nonregular designs. For the class of regular 2^(n-p) fractions, we find that all minimum aberration designs with 10 or fewer factors are maximin distance designs with minimum index. For 11 or more factors, there are exceptions to this rule, since there are cases where the dual of a minimum aberration design does not have minimum aberration. For nonregular fractions, we show examples where minimum G-aberration designs perform very poorly with respect to space-filling properties. Finally, we show how to reduce the computational burden for determining binary minimax distance designs.
Keywords
maximin distance
minimax distance
error-correcting codes
binary design
fractional factorial design
orthogonal array
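The distance criteria named in the abstract above can be illustrated with a brute-force sketch. It assumes Hamming distance on 0/1-coded runs and takes the index to be the number of run pairs that attain the minimum distance. The function names are hypothetical, and the exhaustive minimax search shown here is the naive approach, not the reduced-burden method the talk describes.

```python
from itertools import combinations, product

import numpy as np


def min_distance_and_index(design):
    """Return the smallest pairwise Hamming distance and how many pairs attain it."""
    design = np.asarray(design)
    distances = [int(np.sum(a != b)) for a, b in combinations(design, 2)]
    d_min = min(distances)
    return d_min, distances.count(d_min)


def minimax_distance(design):
    """Largest distance from any point of the full binary cube to its nearest run."""
    design = np.asarray(design)
    n_factors = design.shape[1]
    worst = 0
    for point in product([0, 1], repeat=n_factors):
        nearest = int(np.min(np.sum(design != np.array(point), axis=1)))
        worst = max(worst, nearest)
    return worst


# Regular half fraction of the 2^3 design, with the third factor set to a XOR b.
half_fraction = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]

# A poorer four-run alternative that holds the first factor fixed at 0.
fixed_factor = [(0, b, c) for b in (0, 1) for c in (0, 1)]

print(min_distance_and_index(half_fraction))
print(min_distance_and_index(fixed_factor))
print(minimax_distance(half_fraction))
```

A maximin distance design makes the first returned value as large as possible, with ties broken in favor of a smaller index. The exhaustive loop over all 2^n points grows quickly with the number of factors, which is why reducing that cost matters.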
In this work, we propose an automatic method for the analysis of experiments that incorporates hierarchical relationships between the experimental variables. We use a modified version of the nonnegative garrote method for variable selection which can incorporate hierarchical relationships. The nonnegative garrote method requires a good initial estimate of the regression parameters for it to work well. To obtain the initial estimate, we use generalized ridge regression with the ridge parameters estimated from a Gaussian process prior placed on the underlying input-output relationship. The proposed method, called HiGarrote, is fast, easy to use, and requires no manual tuning. Analyses of several real experiments are presented to demonstrate its benefits over existing methods.
Keywords
Gaussian process
Generalized ridge regression
Nonnegative garrote
Variable selection
Gaussian processes (GPs) are popular as nonlinear regression models for expensive computer simulations. Yet, GP performance relies heavily on estimation of unknown kernel hyperparameters. Maximum likelihood estimation (MLE) is the most common tool, but it can be plagued by numerical issues in small data settings. Penalized likelihood methods attempt to overcome optimization challenges, but their success depends on tuning parameter selection. Common approaches select the penalty weight using leave-one-out cross validation (CV) with prediction error. Although straightforward, this approach is computationally expensive and ignores the uncertainty quantification (UQ) provided by the GP. We propose a novel tuning parameter selection scheme which combines k-fold CV with a score metric that accounts for GP accuracy and UQ. Additionally, we incorporate a one-standard-error rule to encourage smoother predictions in the face of limited data, which remedies flat likelihood issues. Our proposed tuning parameter selection for GPs matches the performance of standard MLE when no penalty is warranted, excels in settings where regularization is preferred, and outperforms the benchmark leave-one-out CV.
Keywords
Gaussian processes
Computer experiments
Penalized likelihood
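The one-standard-error rule mentioned in the abstract above can be sketched generically. This is a minimal illustration under stated assumptions, not the authors' procedure: the score is taken to be lower-is-better, a larger penalty is assumed to give smoother predictions, and the per-fold scores below are hypothetical numbers rather than results from any GP fit. The function name `one_standard_error_choice` is likewise hypothetical.

```python
import numpy as np


def one_standard_error_choice(penalties, fold_scores):
    """Pick the largest penalty whose mean CV score is within one SE of the best.

    penalties   : candidate penalty weights, one per row of fold_scores
    fold_scores : array of shape (n_penalties, k) of lower-is-better CV scores
    """
    penalties = np.asarray(penalties, dtype=float)
    fold_scores = np.asarray(fold_scores, dtype=float)
    k = fold_scores.shape[1]

    means = fold_scores.mean(axis=1)
    std_errors = fold_scores.std(axis=1, ddof=1) / np.sqrt(k)

    best = int(np.argmin(means))
    threshold = means[best] + std_errors[best]

    # Among penalties that are statistically close to the best, take the largest.
    eligible = penalties[means <= threshold]
    return float(eligible.max())


# Hypothetical 4-fold scores for four candidate penalties (illustration only).
penalties = [0.0, 0.1, 1.0, 10.0]
fold_scores = [
    [1.00, 1.20, 0.80, 1.00],
    [0.90, 1.10, 0.70, 0.90],
    [0.95, 1.15, 0.75, 0.95],
    [1.50, 1.70, 1.30, 1.50],
]

chosen = one_standard_error_choice(penalties, fold_scores)
print(chosen)
```

With these illustrative scores the penalty 0.1 has the lowest mean, but the penalty 1.0 falls within one standard error of it and is selected instead, which is the sense in which the rule favors heavier regularization when the data cannot distinguish the candidates.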
A time series is second-order stationary if both its mean and covariance structure remain constant over time. Many existing methods test for second-order stationarity, as it is a crucial assumption in the analysis of classical time series and certain stationary nonlinear time series. However, few methods are available to determine whether a time series is semi-stationary. If a time series is semi-stationary, it can be analyzed much more easily than a general non-stationary time series. In this paper, we propose a new time-domain test to assess whether the normalized frequency pattern of a non-stationary time series remains unchanged over time. A robust statistical method is developed, and its asymptotic distribution is derived. A simulation study is conducted to evaluate the finite-sample performance of the proposed method. Finally, we apply the proposed method to vibrational data to assess whether a mechanical system exhibits linear behavior within a certain range of inputs.
Keywords
Dynamics
Periodogram
Spectral method
Robust
Semi-stationary
Vibration data
First Author
Lei Jin, Texas A&M University-Corpus Christi
Presenting Author
Lei Jin, Texas A&M University-Corpus Christi
Searches for new physics involve detecting the presence of a specific signal in data that is contaminated by a background. This is particularly challenging when a reliable description of the background is unavailable. Our aim is to develop a statistical method to test for the presence of the signal in the data and estimate the signal proportion even when the background is unknown. Moreover, we carry out the signal search using a single physics dataset generated from the experiments that may or may not contain the signal of interest. Our approach relies on using an orthonormal expansion to model the deviation between a proposal density and the unknown data-generating density. We propose choosing the proposal density in a way that ensures a conservative estimate of the signal proportion to avoid false discovery. The reliability of this approach is demonstrated through simulation studies and through applications to realistic simulated data from the Fermi Large Area Telescope and to data from the ATLAS experiment. We also perform a comparative analysis of our method with the so-called "safeguard" method commonly employed in particle physics and explore cases where the latter leads to false discoveries.
Keywords
signal detection
background
orthonormal expansion
false discovery
safeguard
ATLAS experiment