05/01/2025: 4:15 PM - 4:40 PM MDT
Refereed
The increasing adoption of artificial intelligence (AI) across regulatory and healthcare domains highlights its transformative potential in addressing critical public health challenges. The U.S. Food and Drug Administration (FDA) has identified adverse drug event (ADE) detection as a priority area for innovation, as outlined in its strategic initiatives. Timely and accurate identification of ADEs is critical for ensuring patient safety and informing regulatory decisions. However, leveraging the FDA Adverse Event Reporting System (FAERS) for ADE detection remains fraught with challenges, including data heterogeneity, reporting inconsistencies, and scalability issues.
Recent advances in generative AI, machine learning (ML), and large language models (LLMs) offer a promising path forward. A recent study demonstrated the efficacy of fine-tuned LLMs, such as GPT-3.5, in analyzing detailed vaccine adverse event reports in the Vaccine Adverse Event Reporting System (VAERS) (Li et al., 2024). Using 91 annotated reports, the authors developed AE-GPT, a tool for automatically extracting and categorizing adverse events, setting a new benchmark in ADE detection.
Our research builds on this precedent, aiming to enhance ADE detection by fine-tuning LLMs on FAERS datasets. FAERS contains millions of masked case reports spanning 2004 to 2024, with data fields covering demographic, administrative, drug, reaction, and patient outcome information. We use embeddings from LLMs to classify case severity and to identify features predictive of severity, providing a multi-strata classification scheme for ADE detection. We use logistic regression as a baseline and compare its results with those of standard ML models, including a random forest classifier, DBSCAN, and XGBoost. Our framework achieved notable results, demonstrating the potential of LLMs to process complex medical data and highlighting their ability to enhance early ADE detection.
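The baseline step described above (a logistic regression fitted to report embeddings to predict severity) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are synthetic random vectors standing in for LLM embeddings of FAERS reports, the binary serious/non-serious label is an assumed simplification of the multi-strata scheme, and all function names are hypothetical. It uses only the Python standard library so that it runs without extra dependencies.

```python
import math
import random


def make_synthetic_embeddings(n, dim, seed=0):
    """Hypothetical stand-in for LLM embeddings of FAERS case reports."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # assumed labels: 1 = serious, 0 = non-serious
        center = 1.0 if label == 1 else -1.0  # shift the class mean per label
        X.append([rng.gauss(center, 1.0) for _ in range(dim)])
        y.append(label)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_regression(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(dim):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, X):
    """Return 1 where the predicted probability of 'serious' is at least 0.5."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
        for xi in X
    ]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Simple hold-out split: first 300 synthetic reports for training, rest for testing.
X, y = make_synthetic_embeddings(n=400, dim=8, seed=0)
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]

w, b = fit_logistic_regression(X_train, y_train)
test_accuracy = accuracy(y_test, predict(w, b, X_test))
print(f"hold-out accuracy on synthetic data: {test_accuracy:.3f}")
```

In practice the synthetic vectors would be replaced by real embeddings, and the same train/test split could be reused to compare the baseline against the other models named above. The accuracy printed here reflects only the artificial separation built into the synthetic data and says nothing about performance on FAERS.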
health surveillance
large language models
machine learning
adverse drug events
Presenting Author
John Riddles, Westat
First Author
Joshua Turner, Westat
CoAuthor(s)
John Riddles, Westat
Julianna Lee, Westat
Jeremy Corry, Westat
Rashi Saluja
Sean Chickery, Westat
Gizem Korkmaz, Westat
Marcelo Simas, Westat
Kevin Wilson, Westat
Tracks
Practice and Applications
Symposium on Data Science and Statistics (SDSS) 2025