Tuesday, Aug 5: 2:00 PM - 3:50 PM
0489
Invited Paper Session
Music City Center
Room: CC-202B
Generative AI
Return on Investment
Machine learning
Large Language Models
Applied
Yes
Main Sponsor
Government Statistics Section
Co Sponsors
General Methodology
Survey Research Methods Section
Presentations
The USDA's National Agricultural Statistics Service (NASS) has integrated traditional AI into a number of its production processes. Privacy-Preserving Record Linkage (PPRL) using Natural Language Processing (NLP) provides the foundation for integrating survey and non-survey data. Response propensity models have been used to inform sampling and data collection. Traditional AI methods are also used in the development of geospatial products. For example, the Cropland Data Layer (CDL) displays where each of 114 crops is grown across the contiguous U.S. each year, and it forms the foundation for identifying the impacts of natural disasters on agriculture. Other models based on high-order Markov chains and neural networks are used to predict what crops will be planted for an upcoming growing season, and maps of the prediction uncertainty are based on normalized Shannon entropy. Some machine learning models provide insights for imputation. Other models provide the foundation for producing official statistics. NASS has hundreds of programs, many of which were written in languages that are no longer supported, are not recommended for use, or are too expensive. Although the agency lacks the resources to pay someone to convert this code to a more modern language, generative AI offers a feasible solution. The progress that NASS has made in adopting traditional and generative AI methods and the future of generative AI within the agency will be discussed.
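As an illustration of the normalized Shannon entropy mentioned in the abstract above, the following minimal sketch scores the uncertainty of one map pixel from its predicted crop-class probabilities. The abstract does not describe how NASS computes its uncertainty maps, so the function name, the log(K) normalization, and the example probabilities are assumptions made for illustration only.

```python
import math


def normalized_shannon_entropy(probs):
    """Shannon entropy of a probability vector, scaled to the range [0, 1].

    Hypothetical helper, not NASS code. The entropy is divided by log(K),
    where K is the number of classes, so 0 means one class is certain and
    1 means all classes are equally likely.
    """
    k = len(probs)
    if k < 2:
        return 0.0  # a single class carries no uncertainty
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probs:
        q = p / total  # renormalize in case the inputs do not sum to 1
        if q > 0:  # the term for q == 0 is taken as 0
            entropy -= q * math.log(q)
    return entropy / math.log(k)


# Made-up class probabilities for a single pixel (for example corn, soybeans, wheat)
confident_pixel = [0.90, 0.05, 0.05]
ambiguous_pixel = [0.34, 0.33, 0.33]
print(normalized_shannon_entropy(confident_pixel))
print(normalized_shannon_entropy(ambiguous_pixel))
```

Under these assumptions, a pixel whose prediction is dominated by one crop scores near 0 and a pixel whose candidate crops are almost equally likely scores near 1, which is what makes the measure convenient for mapping.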
This presentation explores several incremental steps taken to integrate AI and machine learning into the National Health Interview Survey program and assesses their impact on the efficiency of the survey's processes. Examples will span most major survey stages and may include developing questions to ask about new concepts, providing initial translations from English to Spanish, streamlining the coding of responses to classify health insurance coverage types, migrating a data processing codebase while maintaining functionality, optimizing sample weighting and nonresponse adjustments, and utilizing AI-driven interpretation of results. By examining these examples, this presentation highlights the benefits and challenges associated with incorporating AI into health survey programs and shows how the culture of a federal statistical agency and the weight of tradition can slow acceptance of new technologies.
Keywords
Efficiency
Health surveys
Translations
Coding
Insurance coverage
Data weighting
The Department of Homeland Security is exploring ways to take advantage of new AI tools to improve its ability to meet its mission to protect public security. When the Department develops and uses AI tools internally, it must take intentional care to make sure that these uses are free from bias, maximize transparency for the public, and comply with policies and best practices. When external AI applications and tools like commercial generative AI produce answers for the public that touch on the Department's mission spaces, they should be fed with a healthy diet of accurate, objective information. This means making sure that the Department's own data is palatable and digestible for commercial AI products.
Keywords
AI
AI ready
machine understandable
Department of Homeland Security
Office of Homeland Security Statistics
DHS