AI-driven Information Extraction from Unstructured Documents to Facilitate Decision Making in Clinical Development
Tuesday, Aug 5: 11:15 AM - 11:35 AM
Invited Paper Session
Music City Center
This project aims to develop an AI-driven automated database to facilitate decision-making in oncology clinical trials. The downstream decision-making framework requires extensive historical data, such as tumor indication, biomarker information, Objective Response Rate (ORR), Overall Survival (OS), and Progression-free Survival (PFS). Currently, the in-house historical data come from manual collection, which is inefficient and laborious.
To automate this process, a variable extraction tool was developed that retrieves essential information from various sources, including external websites and internal documents. The tool leverages recent advances in large language models (LLMs) to transform unstructured data into structured data through key steps such as data pre-processing, context compression, multiple extraction phases, and extraction validation.
This approach ensures high-quality data extraction comparable to human efforts. The presentation will focus on the pipeline for automated data collection.
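The pipeline steps named above (pre-processing, context compression, extraction, validation) can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, the keyword list, and the data class are hypothetical, and a simple regular expression stands in for the LLM call so that the example runs without any external service. In a real system the `extractor` argument would presumably wrap a model call.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical keyword list used to decide which sentences to keep.
ENDPOINT_TERMS = (
    "objective response rate", "orr",
    "overall survival", "os",
    "progression-free survival", "pfs",
)


@dataclass
class Extraction:
    variable: str
    value: Optional[float]
    valid: bool


def preprocess(text: str) -> str:
    """Collapse whitespace left over from PDF or HTML conversion."""
    return re.sub(r"\s+", " ", text).strip()


def compress_context(text: str, terms=ENDPOINT_TERMS) -> str:
    """Keep only sentences that mention an endpoint term (whole-word match)."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    kept = [
        s for s in sentences
        if any(re.search(rf"\b{re.escape(t)}\b", s, re.I) for t in terms)
    ]
    return " ".join(kept)


def regex_extractor(context: str) -> Optional[str]:
    """Stand-in for an LLM call: return the ORR percentage as a raw string."""
    m = re.search(
        r"(?:objective response rate|ORR)\D{0,40}?(\d+(?:\.\d+)?)\s*%",
        context,
        re.I,
    )
    return m.group(1) if m else None


def validate_percentage(raw: Optional[str]) -> Extraction:
    """Accept the value only if it parses as a number between 0 and 100."""
    try:
        value = float(raw) if raw is not None else None
    except ValueError:
        value = None
    valid = value is not None and 0.0 <= value <= 100.0
    return Extraction("ORR", value, valid)


def run_pipeline(
    document: str,
    extractor: Callable[[str], Optional[str]] = regex_extractor,
) -> Extraction:
    """Chain the four steps: pre-process, compress, extract, validate."""
    context = compress_context(preprocess(document))
    return validate_percentage(extractor(context))


if __name__ == "__main__":
    sample = (
        "Patients were enrolled   at 12 sites.\n"
        "The objective response rate (ORR) was 42.5% in the overall population.\n"
        "Adverse events were mostly grade 1 or 2."
    )
    print(run_pipeline(sample))
```

Passing the extractor as a parameter keeps the validation step independent of how the value was obtained, so the same range check could be applied whether the raw string comes from a rule, a model, or a human annotator.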
LLM, genAI, clinical trials, structured database, information retrieval, variable extraction