Monday, Aug 5: 10:30 AM - 12:20 PM
1726
Topic-Contributed Paper Session
Oregon Convention Center
Room: CC-258
Applied
Yes
Main Sponsor
Section on Statistical Graphics
Co Sponsors
Government Statistics Section
Scientific and Public Affairs Advisory Committee
Social Statistics Section
Presentations
The Census Bureau strives to serve as the leading source of quality data about the nation's people and economy. The Census Statistical Quality Standards ensure the utility, objectivity, and integrity of the statistical information provided by the Bureau to Congress, to federal policy makers, to sponsors, and to the public. Visualizations created by the Census Bureau need to meet these high standards, like other Census statistical products. Yet creating visualizations that are clear and concise is often viewed as incompatible with the requirement to include information on statistical uncertainty. The additional information required can clutter a visualization and confuse people who are unfamiliar with measures of statistical significance. This presentation will discuss graphical representations of uncertainty and conditional relationships, to (i) help analysts communicate principal results – and limitations on those results – in ways that are clear and resonate with a wide range of stakeholder groups; and (ii) reduce problems related to blurred, exaggerated, or outright misleading interpretations of statistical analyses.
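As a rough illustration of the uncertainty information such a visualization has to carry, the sketch below turns an estimate and its margin of error (MOE) into interval bounds and checks whether two estimates differ significantly. It assumes MOEs published at the 90 percent confidence level (critical value 1.645) and independent estimates; the function names and the example numbers are hypothetical and are not taken from the presentation.

```python
import math

# Assumption: margins of error are stated at the 90 percent confidence level.
Z_90 = 1.645


def standard_error(moe, z=Z_90):
    """Convert a margin of error back to a standard error."""
    return moe / z


def interval(estimate, moe):
    """Lower and upper bounds to draw as an error bar around an estimate."""
    return (estimate - moe, estimate + moe)


def differ_significantly(est_a, moe_a, est_b, moe_b, z=Z_90):
    """Z-test for the difference of two estimates, assumed independent."""
    se_diff = math.sqrt(standard_error(moe_a, z) ** 2 + standard_error(moe_b, z) ** 2)
    return abs(est_a - est_b) / se_diff > z


# Hypothetical numbers: two estimates of 50 and 56, each with an MOE of 3.
bounds = interval(50.0, 3.0)
significant = differ_significantly(50.0, 3.0, 56.0, 3.0)
```

Note that two error bars can overlap even when the difference is significant under this test, which is one reason the choice of graphical encoding matters.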
Dimension reduction techniques, such as principal component analysis, have been a staple of multivariate exploratory data analysis, modeling, and visualization for more than 100 years. The size and complexity of data are often the main factors that pose challenges to effective exploration. These challenges have been exacerbated in recent decades, as the cost of storing data has steadily decreased and, in most fields of study, instrumentation and data collection have become increasingly cost efficient. In response, more complex and sophisticated dimension reduction techniques, such as uniform manifold approximation and projection (UMAP), have been introduced by the research community. While these techniques provide value, they often lack interpretability and transparency. Here, we introduce several visualization tools that allow decision-makers and data consumers to investigate large datasets in detail. We demonstrate applications of these tools, focusing on the use of Tufte's small multiples concept, coupling these with metrics to aid in data exploration, and linked interactive plots. We will further discuss practical mechanisms for sharing and disseminating data visualizations generated by these tools.
Speaker
Lisa Bramer, Pacific Northwest National Laboratory
Linked micromaps were developed to display geographically indexed statistics in an intuitive way by linking them to a sequence of small maps. The approach integrates several visualization design principles, such as small multiples, discrete color indexing, and ordering. Linked micromaps allow for other types of data displays that are connected to geography, including scatterplots, boxplots, time series plots, confidence intervals, and more. Initial applications of micromaps used data from the National Cancer Institute and the Environmental Protection Agency. In this presentation, we will show how linked micromaps can be used to better understand and explore relationships and distributions of statistics linked to US states and DC. We will compare linked micromaps with other popular data displays, such as bubble charts, choropleth maps, and bar charts. We will illustrate how linked micromaps can be used for evidence-based decision-making using data from the Bureau of Labor Statistics (e.g., Quarterly Census of Employment and Wages, Occupational Employment and Wage Statistics) and the Census Bureau (e.g., Building Permits Survey, Community Resilience Estimates).
The Census Bureau's Community Resilience Estimates (CRE) provide estimates of socioeconomic vulnerability to disasters for every county and Census tract in the United States. To produce the CRE, Census Bureau statisticians employ small area estimation techniques to generate counts and percentages of the population with specific vulnerabilities, such as physical disability, advanced age, and lack of access to vehicles and broadband internet. The audience for the CRE is diverse and includes public and private sector data users, state and local governments, members of the media, and federal partner agencies involved in emergency management. The data dissemination strategy for the CRE must meet the needs of this diverse constituency, striking a balance between rich, in-depth data and ease of use. To meet this challenge, the CRE program maintains a public-facing interactive tool that distills complex information on U.S. communities into a simple risk classification scheme. This presentation will describe the motivations and design choices for the interactive data visualization(s) associated with the CRE, along with emerging research on socioeconomic vulnerability to disasters.
Price indices that place quantity information on the same time scale as prices can capture real-time phenomena such as substitution and, more generally, evolving consumer preferences. But because BLS quantity information (obtained from households) comes with a lag of approximately one year, it is not possible to produce such an index until that year has passed. To provide a timely statistic, BLS issues a preliminary version of the index, currently using a constant elasticity of substitution model. This approach applies a fixed level of substitution across all items, areas, and months. We propose instead to forecast the weights needed by the index via a seasonal multivariate time series model of item-area expenditure shares, incorporating the item-area prices, which are presumed to drive much of substitution behavior.
The choice of time series model and its parameters has a profound effect on the behavior of the index. Visualization plays an important role in this selection. To choose the model that best captures the dynamic consumer behavior underlying the changes in the index, graphs displaying how parameter choices affect the fit of the index are necessary.
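To make the "fixed level of substitution" in the constant elasticity of substitution approach concrete, here is a minimal sketch of the textbook CES (Lloyd-Moulton) index form, in which a single elasticity parameter is applied to every item. It is an illustration under stated assumptions, not the BLS production method: the function name, the two-item shares, and the price relatives are hypothetical.

```python
import math


def ces_index(base_shares, price_relatives, sigma):
    """Textbook CES (Lloyd-Moulton) price index.

    base_shares: base-period expenditure shares, summing to 1.
    price_relatives: current price divided by base price, per item.
    sigma: one elasticity of substitution applied to all items.
    """
    if len(base_shares) != len(price_relatives):
        raise ValueError("shares and price relatives must have equal length")
    if abs(sum(base_shares) - 1.0) > 1e-9:
        raise ValueError("expenditure shares must sum to 1")
    if sigma == 1:
        # Limiting case: share-weighted geometric mean of price relatives.
        return math.exp(sum(s * math.log(r) for s, r in zip(base_shares, price_relatives)))
    power = 1.0 - sigma
    return sum(s * r ** power for s, r in zip(base_shares, price_relatives)) ** (1.0 / power)


# Hypothetical two-item example: one price unchanged, one up 20 percent.
shares = [0.5, 0.5]
relatives = [1.0, 1.2]
no_substitution = ces_index(shares, relatives, sigma=0.0)    # Laspeyres-type arithmetic mean
some_substitution = ces_index(shares, relatives, sigma=0.6)  # lies between the two limits
unit_elasticity = ces_index(shares, relatives, sigma=1.0)    # geometric mean
```

Because `sigma` is a single constant here, every item, area, and month is assumed to respond to relative price changes in the same way, which is the restriction the proposed forecast-based weights are meant to relax.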