Using Data for Evidence Building: incorporating AI tools and techniques

Lisa Mirel Speaker
National Science Foundation, National Center for Science and Engineering Statistics
 
Tuesday, Aug 5: 11:15 AM - 11:35 AM
Topic-Contributed Paper Session 
Music City Center 
Linking data from disparate sources supports using data for evidence building. In March of 2023, the White House Office of Science and Technology Policy released the National Strategy to Advance Privacy-Preserving Data Sharing and Analytics, which notes several key strategic priorities, including "cultivating an ecosystem that promotes a timely translation of theoretical results into real-world implementation and deployment." As part of the National Secure Data Service (NSDS) Demonstration Project, NCSES and its partners are deploying and evaluating privacy-preserving record linkage (PPRL) tools to inform efforts to develop a shared services ecosystem. Because PPRL technology supports linking individual data records without exposing personal information, the results of this project will inform ways to streamline and innovate data sharing and linking across sources. The NSDS is currently evaluating two PPRL tools: HealthVerity, a commercial tool, and Anonlink, an open-source Python application. These evaluation projects will produce linked data files that provide the opportunity to implement machine learning models that could help minimize bias in linkage results. The talk will provide background information on the NSDS Demonstration Project and an overview of the PPRL process. It will then describe techniques for using machine learning to optimize the utility of the linked data, followed by a summary of the initial findings and a discussion of next steps.
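
The idea of linking records without exposing personal information, as described above, can be illustrated with a minimal Python sketch. It assumes one approach that is common in the privacy-preserving record linkage literature: each party encodes an identifier's character bigrams into bit positions with a keyed hash (a Bloom-filter style encoding), and encodings are compared with a Dice coefficient. It is not taken from HealthVerity or Anonlink, and the function names, the filter length, the number of hashes, and the sample names and key are all hypothetical.

```python
import hashlib
import hmac

FILTER_BITS = 1024  # hypothetical filter length
NUM_HASHES = 4      # hypothetical number of hash positions per bigram


def bigrams(value: str) -> set[str]:
    """Split a normalized, padded string into overlapping 2-character tokens."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(value: str, secret: bytes) -> set[int]:
    """Return the set of bit positions produced by keyed hashing of each bigram.

    Only these positions are shared for linkage, never the raw identifier.
    """
    positions = set()
    for token in bigrams(value):
        for k in range(NUM_HASHES):
            message = f"{k}:{token}".encode()
            digest = hmac.new(secret, message, hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % FILTER_BITS)
    return positions


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient between two encodings (1.0 means identical bit sets)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Both data holders must use the same secret key (hypothetical value here).
shared_key = b"example-shared-secret"
enc_a = encode("Smith", shared_key)
enc_b = encode("Smyth", shared_key)
enc_c = encode("Garcia", shared_key)

print(f"Smith vs Smyth:  {dice(enc_a, enc_b):.2f}")
print(f"Smith vs Garcia: {dice(enc_a, enc_c):.2f}")
```

In this sketch, similar spellings share most bigrams and therefore most bit positions, so a score threshold can flag likely matches while tolerating typos. A production tool would add further safeguards and would typically encode several fields per record.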