Statistical Analysis Plan (SAP) Automation with Generative AI

Sheraz Khan Co-Author
 
Giannis Manousaridis Co-Author
Pfizer, Inc.
 
Griffith Bell Co-Author
 
Anna Plotka Co-Author
 
Amy Lauren Ashworth Co-Author
Pfizer, Inc.
 
Donal Neilson Gorman Co-Author
Pfizer, Inc.
 
Feng Dai Co-Author
Pfizer
 
Lili Jiang Co-Author
Pfizer, Inc.
 
Yi-Chien Lee Co-Author
Pfizer, Inc.
 
Gordon Siu Co-Author
Pfizer, Inc.
 
Alexandra Thiry Co-Author
Pfizer, Inc.
 
Subha Madhavan Co-Author
 
Richard Zhang Co-Author
Pfizer
 
Birol Emir Co-Author
 
Rogier Landman First Author
 
Rogier Landman Presenting Author
 
Monday, Aug 4: 2:05 PM - 2:20 PM
1738 
Contributed Papers 
Music City Center 
A Statistical Analysis Plan (SAP) is typically written by statisticians, is based primarily on the study protocol, and can take several weeks to finalize. We present our efforts to build tools that harness generative AI to augment and automate SAP writing.

A key step in SAP writing is parsing the protocol and mapping its information to the appropriate SAP sections. This information may be copied, summarized, split, or fused with additional sources. Protocols are ingested into a knowledge graph, and Large Language Models (LLMs) are used at several stages. GPT-o1 is queried through a secure Pfizer API, in combination with vector search and Retrieval-Augmented Generation (RAG).
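
As a rough illustration of the retrieval step described above, the following Python sketch ranks protocol excerpts against a SAP section title and assembles a grounded prompt. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the sample excerpts, and the prompt wording are hypothetical, and a simple bag-of-words cosine similarity stands in for the embedding-based vector search a production system would more likely use. The knowledge graph and the call to the LLM through the secure API are not shown.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:k]


def build_prompt(section_title, chunks, k=2):
    """Assemble a prompt that restricts the model to retrieved excerpts."""
    context = retrieve(section_title, chunks, k)
    lines = [
        f"Draft the SAP section '{section_title}' "
        "using only the protocol excerpts below.",
        "",
    ]
    lines += [f"- {c}" for c in context]
    return "\n".join(lines)


# Hypothetical protocol excerpts, invented for illustration only.
protocol_chunks = [
    "The primary endpoint is change from baseline in pain score at week 12.",
    "Subjects will be randomized 1:1 to treatment or placebo.",
    "Sample size of 200 subjects provides 90% power for the primary endpoint.",
]

prompt = build_prompt("Primary endpoint analysis", protocol_chunks, k=2)
print(prompt)
```

In this sketch, the randomization excerpt shares no terms with the section title and is left out of the prompt, so the model would see only the passages relevant to the section being drafted.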

Early versions of our automation tools have been deployed for Non-Interventional, Clinical Pharmacology, and Interventional SAPs. Subject Matter Expert (SME) statisticians have played a pivotal role in the development and evaluation of these tools. Our evaluation framework focuses on truthfulness, eloquence, and reasoning, and combines automated and human-assessed metrics. User feedback indicates that the automation has increased efficiency.

Keywords

Generative AI

Statistical Analysis Plan

Clinical Trials

Regulatory documents

Automation

Writing 

Main Sponsor

Biopharmaceutical Section