11. Large Language Model for Detecting Unreported Cases of Foodborne Illnesses

Conference: Women in Statistics and Data Science 2024
10/16/2024: 4:00 PM - 5:00 PM EDT
Speed 

Description

Foodborne outbreaks pose a serious yet preventable threat to public health, often leading to lost worker productivity, fatalities, and significant economic costs. Traditional detection methods typically suffer delays between the onset of initial infections and the public notification of an outbreak. Recently, Twitter data has been used to identify unreported foodborne illnesses with models such as BERTweet, which show promise but remain limited in accuracy and cost efficiency. This study explores whether large language models can improve the accuracy and efficiency of early detection of foodborne outbreaks. We developed and assessed a GPT-4 zero-shot model and a GPT-4 few-shot model to detect cases of unreported foodborne illness. The BERTweet model attained an accuracy of 0.88 and an F1-score of 0.85. The GPT-4 zero-shot model achieved an accuracy of 0.89 and an F1-score of 0.86. The GPT-4 few-shot model achieved an accuracy of 0.92 and an F1-score of 0.90. Our results indicate that the GPT-4 zero-shot model performs comparably to the BERTweet model, with marginal improvements of 0.01 in both accuracy and F1-score. More notably, the GPT-4 few-shot model outperforms the BERTweet model by 0.04 in accuracy and 0.05 in F1-score. Additionally, the few-shot model does not require extensive human labeling, saving time and money. Large language models like GPT-4 thus provide a more accurate and resource-efficient method for the early detection of foodborne outbreaks, underscoring their potential for real-time, precise, and cost-efficient public health surveillance.
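To make the zero-shot versus few-shot distinction and the two reported metrics concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the abstract does not publish its prompts, labeled examples, or label encoding, so the instruction text, the example tweets, the 1/0 labels, and the function names `build_prompt` and `accuracy_and_f1` are all illustrative assumptions. The sketch builds the prompt text only and makes no model API call.

```python
from typing import List, Sequence, Tuple

# Hypothetical instruction; the wording used in the study is not given.
INSTRUCTION = (
    "Decide whether the tweet describes a personal case of foodborne illness. "
    "Answer with 1 for yes or 0 for no."
)

# Hypothetical labeled examples for the few-shot prompt (1 = illness, 0 = not).
FEW_SHOT_EXAMPLES: List[Tuple[str, int]] = [
    ("Got food poisoning from the tacos last night, sick all day.", 1),
    ("This new burger place is sick, highly recommend!", 0),
]


def build_prompt(tweet: str, examples: Sequence[Tuple[str, int]] = ()) -> str:
    """Build a zero-shot prompt (no examples) or a few-shot prompt (with examples)."""
    parts = [INSTRUCTION]
    for text, label in examples:
        parts.append(f"Tweet: {text}\nLabel: {label}")
    # The tweet to classify comes last, with the label left open for the model.
    parts.append(f"Tweet: {tweet}\nLabel:")
    return "\n\n".join(parts)


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute accuracy and F1 for the positive class (label 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    tweet = "Ate at the buffet yesterday and now my stomach is wrecked."
    print(build_prompt(tweet))                      # zero-shot prompt
    print(build_prompt(tweet, FEW_SHOT_EXAMPLES))   # few-shot prompt
    print(accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 1]))
```

In this sketch the few-shot variant differs from the zero-shot one only by a handful of labeled examples placed ahead of the tweet, which is why it needs far less labeled data than fine-tuning a classifier.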

Keywords

Machine Learning

Large Language Model

Twitter

Foodborne Disease 

Presenting Author

Sophia Yuan, Parkview High School

First Author

Sophia Yuan, Parkview High School

CoAuthor(s)

Kevin Bui, Parkview High School
Alexis Solorzano, Parkview High School

Target Audience

Expert

Tracks

Knowledge
Women in Statistics and Data Science 2024