Countering Misinformation and Fostering Data-Driven Decision-Making: A Multi-Sector collaborative a

Xiaojing Wang Chair
University of Connecticut
 
Satrajit Roychoudhury Organizer
Pfizer
 
Xiaojing Wang Organizer
University of Connecticut
 
Tuesday, Aug 6: 8:30 AM - 10:20 AM
01419 
Invited Paper Session 
Oregon Convention Center 
Room: CC-C123 

Applied

Yes

Main Sponsor

Stats. Partnerships Among Academe Indust. & Govt. Committee

Co Sponsors

Committee on Applied Statisticians

Presentations

Building Trust: Using Data & Statistics in International Development

Statistics Without Borders (SWB) is a global not-for-profit organization that leads pro bono projects in partnership with international not-for-profit and non-governmental organizations. To date, SWB has completed over 250 projects, with recent projects supporting organizations in Afghanistan, Canada, Haiti, India, Nigeria, the United States, and Zimbabwe. One challenge of these projects is to build trust with our client organizations and the communities we aim to help. We'll walk through two pro bono projects that SWB has completed, emphasizing how we built trust with these clients and communities. We'll also cover systems that SWB has put into place to sustainably build trust for every project. Our goal is that you will come away with specific ideas of how you can build trust in the context of international development. 

Speaker

Matthew Brems, DataRobot

Detecting and Mitigating Algorithmic Bias in Online Misinformation

There is perhaps no bigger issue facing our field right now than misinformation, and the advent of tools like ChatGPT has increased this risk. A central reason is bias in large language models (LLMs), which can lead to misleading or incorrect information that disproportionately affects certain communities. NORC is developing a model of online information to better understand how to detect and mitigate bias in such models. The data focus specifically on COVID vaccine misinformation, a topic the study team chose because of the strong historical record of misinformation across social media platforms and issues related to health equity. NORC collected more than 10 terabytes of data from Twitter and Instagram from 2020 to 2023. The study team hand-coded a training sample, building upon several open-source misinformation indexes, and then trained and deployed the model. This presentation will describe that process, share the model developed, and discuss the lessons learned about bias in LLM development.
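
As a rough illustration of what a bias check on a misinformation classifier can look like, the sketch below compares false positive rates across groups in a hand-coded evaluation sample. It is not NORC's method: the record layout, the group labels, and the function names `false_positive_rate_by_group` and `max_fpr_gap` are all assumptions made for this example.

```python
from collections import defaultdict

def false_positive_rate_by_group(records):
    """Per-group false positive rate.

    records: iterable of (group, true_label, predicted_label) tuples,
    where label 1 means "misinformation" and 0 means "not misinformation".
    The tuple layout is a hypothetical format chosen for this sketch.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, truth, predicted in records:
        if truth == 0:  # only true negatives can become false positives
            negatives[group] += 1
            if predicted == 1:
                false_positives[group] += 1
    # Groups with no true-negative posts are omitted (rate is undefined).
    return {g: false_positives[g] / negatives[g] for g in negatives}

def max_fpr_gap(rates):
    """Largest difference in false positive rate between any two groups."""
    return max(rates.values()) - min(rates.values())

# Hypothetical hand-coded evaluation sample (not real data).
sample = [
    ("A", 0, 1), ("A", 0, 0), ("A", 1, 1),
    ("B", 0, 0), ("B", 0, 0), ("B", 0, 1), ("B", 0, 0),
]
rates = false_positive_rate_by_group(sample)
gap = max_fpr_gap(rates)
```

A large gap would suggest that posts from one group are wrongly flagged more often than posts from another. False positive rate is only one of several possible fairness metrics, and which one is appropriate depends on the application.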

Co-Author(s)

Joshua Y. Lerner, NORC at the University of Chicago
Chandler Carter, NORC at the University of Chicago
Erin Cutroneo, NORC at the University of Chicago
Hy Tran, NORC at the University of Chicago
Sara Lafia, NORC at the University of Chicago
Amelia Burke-Garcia, NORC at the University of Chicago

Speaker

Brandon Sepulvado

Guarding Against Misinformation Produced in Generative AI Models


The quality of output from Generative AI models is limited by the quality of the data used to train them. Input data that are inaccurate, outdated, or incomplete can lead to bad output or hallucinations, where the model confidently asserts that a falsehood is real. We discuss challenges in the estimation of Generative AI models that can cause misinformation, including inheriting biases present in the training data and producing outputs that are plausible but fundamentally incorrect or nonsensical. We also discuss mitigation strategies such as the curation of training data, meticulous algorithm design, and continuous monitoring to minimize biases. Additionally, we present an illustrative example of establishing mechanisms for rigorous model evaluation and quality control.

 

Speaker

Ginger Holt, Databricks

Statistical Leadership in Mitigating Potential Patient Harm in Clinical Trials

In a six-year Phase III non-inferiority trial, we evaluated dietary interventions for neutropenic patients: a liberalized diet versus a standard one. Our primary goal was to compare the incidence of major infections. Two interim analyses (IAs) were prespecified to guide the continuation or termination of the study. The first IA, in 2021, allowed the study to continue, though with some reservations because the difference in infection rates closely approached the prespecified 10% margin. Although the p-value from that IA supported continued patient enrollment based on the primary endpoint, we exercised caution: our statistical team proactively proposed a second IA earlier than originally planned. The results from this newly added IA in Spring 2023 indicated that the infection rate exceeded the threshold, leading to an immediate halt in patient enrollment and trial termination in June 2023. This presentation underscores the vital role of statistical leadership in multidisciplinary research involving multiple stakeholders, showcasing how statisticians prioritized patient care, made data-driven decisions, provided clear recommendations, and influenced clinical trial outcomes.
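
To make the 10% margin concrete, the sketch below shows one common way a non-inferiority comparison of two infection rates can be computed: a Wald upper confidence bound on the risk difference (liberalized minus standard), compared against the margin. This is an illustration only and not the trial's actual analysis. The counts are invented, the one-sided alpha of 0.025 is an assumption, and no group-sequential adjustment for repeated interim looks is applied.

```python
from math import sqrt
from statistics import NormalDist

def noninferiority_check(events_new, n_new, events_std, n_std,
                         margin=0.10, alpha=0.025):
    """Wald-type non-inferiority check on a risk difference.

    Compares the infection rate in the new arm against the standard arm.
    margin=0.10 mirrors the 10% margin in the abstract; alpha and the
    choice of a Wald interval are assumptions made for this sketch.
    """
    p_new = events_new / n_new
    p_std = events_std / n_std
    diff = p_new - p_std
    # Unpooled standard error of the difference in proportions.
    se = sqrt(p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std)
    z = NormalDist().inv_cdf(1 - alpha)
    upper = diff + z * se  # one-sided upper confidence bound
    return {
        "diff": diff,
        "upper": upper,
        # Non-inferiority is supported only if the whole interval
        # lies below the margin.
        "noninferior": upper < margin,
        # A point estimate above the margin is a warning sign.
        "estimate_exceeds_margin": diff > margin,
    }

# Hypothetical interim counts (not the trial's data).
equal_rates = noninferiority_check(20, 200, 20, 200)
worse_rates = noninferiority_check(30, 100, 15, 100)
```

With the invented counts, equal infection rates (20 of 200 in each arm) give an upper bound below the margin, while 30 of 100 versus 15 of 100 gives a point estimate of 0.15, already above it. A real interim analysis would follow the stopping boundaries prespecified in the protocol.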

Speaker

Ji-Hyun Lee, University of Florida

The University of North Carolina and Merck Statistics Research Collaboration: A Win-Win Partnership

Over the last 15+ years, the University of North Carolina Department of Biostatistics and Merck's Department of Biostatistics and Research Decision Sciences (BARDS) have collaborated closely on a number of important statistical research projects. These projects have resulted in significant methodological contributions to the statistical field with applications to real-world problems in pharmaceutical R&D. They cover a wide scope of research topics: from frequentist to Bayesian approaches, from trial-level design to aggregate/meta-analyses, and from traditional statistical modelling to machine learning methodologies. The areas of pharmaceutical application include, to name a few, safety signaling and evaluation, the efficacy evaluation of innovative medicines (e.g., lipid-lowering therapies), and post-marketing assessment of rare but serious safety events for the rotavirus vaccine. Case studies will be highlighted during the presentation. This collaboration won the 2023 SPAIG Award.

Co-Author

William Wang, BARDS, Merck Research Labs

Speaker

Amarjot Kaur, Merck & Co., Inc.