05/24/2023: 4:30 PM - 4:35 PM CDT
Lightning
In statistics, truncation occurs when the values of a probability distribution are restricted to lie above or below a specific threshold, or within a specific range, and no information is available for values outside the truncation bounds. Truncated data appear in a wide variety of settings, including reliability and econometrics. Truncated distributions can also be used to model proportion data. For example, when the arcsine square root transformation, sin⁻¹(√p), is applied to a proportion p, the transformed data can be modeled using a truncated Gaussian distribution, where the region of truncation is 0 ≤ sin⁻¹(√p) ≤ π/2. One area of statistical modeling where the truncated Gaussian distribution has been used for proportion data is small area estimation (SAE). For example, area-level SAE models fit via Markov chain Monte Carlo (MCMC) have used truncated Gaussian distributions to model arcsine square root transformed county-level proportions of various health outcomes.
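As a rough illustration of the transformation described above, the following Python sketch maps a proportion to the arcsine square root scale and draws from a Gaussian truncated to [0, π/2] by simple rejection. The function names, the 10% proportion, and the standard deviation of 0.2 are hypothetical choices made for illustration only; they are not taken from the authors' models.

```python
import math
import random


def arcsine_sqrt(p):
    """Arcsine square root transform of a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))


def back_transform(theta):
    """Inverse transform: map theta in [0, pi/2] back to a proportion."""
    return math.sin(theta) ** 2


def sample_truncated_gaussian(mu, sigma, lower, upper, n, rng):
    """Draw n values from N(mu, sigma^2) restricted to [lower, upper].

    Uses plain rejection sampling, which is adequate when most of the
    Gaussian mass lies inside the truncation region.
    """
    draws = []
    while len(draws) < n:
        x = rng.gauss(mu, sigma)
        if lower <= x <= upper:
            draws.append(x)
    return draws


rng = random.Random(2023)           # fixed seed for reproducibility
theta = arcsine_sqrt(0.10)          # hypothetical proportion of 10%
draws = sample_truncated_gaussian(theta, 0.2, 0.0, math.pi / 2, 1000, rng)
proportions = [back_transform(x) for x in draws]
```

Because every draw lies in [0, π/2], each back-transformed value is a valid proportion between 0 and 1.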
An essential step in MCMC modeling is determining whether the MCMC sample has converged to a stationary distribution. Convergence can be evaluated graphically (e.g., trace plots, autocorrelation plots, density plots) and statistically (e.g., Geweke, Heidelberger-Welch, Gelman-Rubin, and Raftery-Lewis tests), but there has been limited research into the impact truncation may have on these methods. In this work, we will focus primarily on the statistical tests most commonly used to assess MCMC convergence and determine how the statistical derivation of each diagnostic is impacted by truncation. In addition, simulations will be used to evaluate how the type and degree of truncation affect these tests.
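To make one of the diagnostics named above concrete, here is a minimal Python sketch of the basic Gelman-Rubin potential scale reduction factor, which compares between-chain and within-chain variance. It is a simplified form without the degrees-of-freedom correction, and the "chains" are independent truncated Gaussian draws standing in for MCMC output. The means, standard deviation, and chain lengths are hypothetical and do not reproduce the authors' simulation design.

```python
import math
import random
import statistics


def truncated_gaussian_chain(mu, sigma, lower, upper, n, rng):
    """Stand-in for an MCMC chain: n independent truncated Gaussian draws."""
    chain = []
    while len(chain) < n:
        x = rng.gauss(mu, sigma)
        if lower <= x <= upper:
            chain.append(x)
    return chain


def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain sample variance
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


rng = random.Random(42)
upper = math.pi / 2

# Four chains that target the same truncated Gaussian
same_target = [truncated_gaussian_chain(0.3, 0.2, 0.0, upper, 2000, rng)
               for _ in range(4)]

# Three chains on the same target plus one centered elsewhere
mixed_target = same_target[:3] + [
    truncated_gaussian_chain(1.0, 0.2, 0.0, upper, 2000, rng)
]

r_hat_same = gelman_rubin(same_target)
r_hat_mixed = gelman_rubin(mixed_target)
```

Values near 1 suggest the chains agree, while values well above 1 suggest they have not settled on a common distribution. Varying `lower`, `upper`, and `mu` in a sketch like this is one simple way to explore how the type and degree of truncation affect the statistic.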
MCMC
Small area estimation
Truncation
Geweke, Heidelberger-Welch
Gelman-Rubin, Raftery-Lewis
Presenting Author
John Pleis, National Center for Health Statistics
First Author
John Pleis, National Center for Health Statistics
CoAuthor(s)
Diba Khan
Benmei Liu, National Cancer Institute
Yulei He, National Center for Health Statistics
Van Parsons, National Center for Health Statistics
Bill Cai, National Center for Health Statistics
Target Audience
Mid-Level
Tracks
Practice and Applications
Symposium on Data Science and Statistics (SDSS) 2023