Tuesday, Aug 6: 2:00 PM - 3:50 PM
1717
Topic-Contributed Paper Session
Oregon Convention Center
Room: CC-F151
Applied: Yes
Main Sponsor: Government Statistics Section
Co Sponsors: International Indian Statistical Association; Survey Research Methods Section
Presentations
The complexity of survey data and the availability of data from auxiliary sources motivate researchers to explore estimation methods that extend beyond traditional survey-based estimation. The U.S. Centers for Disease Control and Prevention's Behavioral Risk Factor Surveillance System (BRFSS) collects a wide range of health information. While the BRFSS focuses on state-level estimation, there is demand for county-level estimation of health indicators using BRFSS data. A hierarchical Bayes small area estimation model is developed to combine county-level BRFSS survey data with county-level data from auxiliary sources, while accounting for various sources of error and nested geographical levels. To mitigate extreme proportions and unstable survey variances, a transformation is applied to the survey data. Model-based county-level predictions of the prevalence of having a personal doctor are constructed for all counties in the U.S., including those where BRFSS survey data were not available. An evaluation study is also presented that compares fitting the model using only the counties with large BRFSS sample sizes against fitting it using all the counties with BRFSS data.
The small-area framework we consider includes instances where data are available from both probability-based surveys and non-probability samples, and where the sample sizes from these two sources may be extremely imbalanced. We present a Bayesian algorithm and some alternatives for small-area prediction in this context. Several technical and algorithmic advancements related to the proposed technique considerably broaden the scope for using data with selection bias. We present theoretical advancements as well as results from numerical studies.
The recent proliferation of computers and the internet has opened new opportunities for collecting and processing data. However, such data are often obtained without a well-planned probability survey design, so the resulting non-probability samples cannot automatically be regarded as representative of the population of interest. Several methods for estimation and inference from non-probability samples have been developed in recent years. The methods assume that non-probability sample selection is governed by an underlying latent random mechanism. The basic idea is to use information collected from a probability ("reference") sample to uncover latent non-probability survey participation probabilities (also known as "propensity scores") and use them in estimation of target finite population parameters. In this paper, we review several recently developed methods for estimating non-probability survey participation probabilities, compare their theoretical properties, and study their relative performance in simulations.
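The basic idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: it assumes one particular approach from the literature, a logistic participation model fitted by maximizing a pseudo log-likelihood in which the design-weighted reference sample stands in for the unobserved population total, followed by a Hajek-type inverse-probability-weighted mean. The function names, the single covariate `x`, and the simulated population are all hypothetical and chosen only for illustration.

```python
import math
import random


def expit(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_participation_model(x_np, x_ref, d_ref, n_iter=50, tol=1e-10):
    """Newton-Raphson for (a, b) in the assumed model pi(x) = expit(a + b*x).

    Maximizes the pseudo log-likelihood
        sum_{i in A} (a + b*x_i) - sum_{j in B} d_j * log(1 + exp(a + b*x_j)),
    where A is the non-probability sample and B is the reference sample
    with design weights d_j.
    """
    n_a = len(x_np)
    sum_x_a = sum(x_np)
    n_hat = sum(d_ref)                          # estimated population size
    a, b = math.log(n_a / (n_hat - n_a)), 0.0   # start at the overall rate
    for _ in range(n_iter):
        g0, g1 = float(n_a), sum_x_a            # score vector components
        h00 = h01 = h11 = 0.0                   # negative Hessian entries
        for x, d in zip(x_ref, d_ref):
            p = expit(a + b * x)
            g0 -= d * p
            g1 -= d * p * x
            w = d * p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        step_a = (h11 * g0 - h01 * g1) / det    # solve the 2x2 Newton system
        step_b = (h00 * g1 - h01 * g0) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


def ipw_mean(y_np, x_np, a, b):
    """Hajek-type mean: weight each respondent by 1 / estimated pi(x)."""
    weights = [1.0 / expit(a + b * x) for x in x_np]
    return sum(w * y for w, y in zip(weights, y_np)) / sum(weights)


def simulate_demo(seed=0, pop_size=20000, n_ref=1000):
    """Hypothetical population in which participation depends on x."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(pop_size)]
    ys = [2.0 + x + rng.gauss(0.0, 1.0) for x in xs]
    # Non-probability sample: units with larger x participate more often.
    in_a = [rng.random() < expit(-3.0 + x) for x in xs]
    x_np = [x for x, s in zip(xs, in_a) if s]
    y_np = [y for y, s in zip(ys, in_a) if s]
    # Reference sample: simple random sample with equal weights N / n.
    ref_idx = rng.sample(range(pop_size), n_ref)
    x_ref = [xs[i] for i in ref_idx]
    d_ref = [pop_size / n_ref] * n_ref
    a, b = fit_participation_model(x_np, x_ref, d_ref)
    return {
        "true_mean": sum(ys) / pop_size,
        "naive_mean": sum(y_np) / len(y_np),
        "ipw_mean": ipw_mean(y_np, x_np, a, b),
        "coef": (a, b),
    }
```

Calling `simulate_demo()` returns the population mean, the unweighted mean of the non-probability sample, and the weighted estimate, so the effect of the estimated participation probabilities can be compared directly. The sketch uses only the covariate from the reference sample, never its outcome values, which reflects the role the abstract assigns to the reference survey.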
Motivated by Census Bureau research on re-weighting of American Community Survey (ACS) 1-year estimates, this talk considers how to evaluate the effectiveness of a weighting adjustment using short time series of successive weighted 1-year estimates of selected outcome variables at the national, state and county levels. The variables to which this method is applied must be carefully selected, using external subject-matter knowledge, to be stable and smoothly varying in the population at each geographic level. The talk will show how to build an evaluation metric from this idea, establish mathematical properties of the metric under ideal conditions of correct and misspecified weighting, and illustrate the metric for several proposed weighting schemes for ACS estimates for the years 2018 through 2021.
Classified mixed model prediction (CMMP) is a new method that gives traditional mixed model prediction (MMP) a modern flavor. In this work, we consider estimation of the mean squared prediction error (MSPE) of CMMP. We implement the recently proposed Sumca method, which combines analytic and Monte Carlo approaches and leads to a second-order unbiased estimator of the MSPE. The performance of Sumca is investigated via simulation studies, and comparisons are made with alternative methods. The simulations show that a brute-force bootstrap method performs almost as well as Sumca, while a naive approach and a Prasad-Rao estimator at the matched index are significantly inferior to Sumca. A real-data application is also considered. This work is joint with Jiming Jiang of the University of California, Davis, USA and J. Sunil Rao of the University of Miami, USA.