Anomaly detection through mixture of agents Generative AI

Yuxi Zhao (Speaker)
Pfizer
 
Sunday, Aug 3: 4:05 PM - 4:25 PM
Topic-Contributed Paper Session 
Music City Center 
In clinical trials, ensuring the quality and validity of data for downstream analysis and results is paramount, which makes thorough data monitoring necessary. Among data monitoring strategies, targeted monitoring is appealing because it uses key risk indicators and statistical monitoring to identify potential issues or anomalies in the data, and it aims for real-time remediation of potential errors based on critical risk assessments rather than passive monitoring of past events. However, the majority of tools for risk-based monitoring concentrate on overseeing and managing data entry errors and alterations and are descriptive in nature, e.g., TargeteCRF (Mitchel et al., 2011). Similarly, the available tools for statistical monitoring (e.g., Bauer and Johnson (2000), JM et al. (2001), Carstensen et al. (2024)) are mostly descriptive as well, which may not serve our purpose well. Advances in AI/ML provide powerful techniques for feature/subgroup characterization and pattern recognition, which can potentially be used to identify anomalous patterns in single endpoints, in multiple endpoints or multi-modal data considered collectively, or in temporal data. This project uses these advances for anomaly detection and advocates adapting AI/ML methods to monitoring strategy. We examine popular generative AI methods such as autoencoders and generative adversarial networks. When multiple agents such as these are involved, a Bayesian ensemble model can combine their results to provide a reliable classification of anomalous data and potentially to assess overall risk factors such as site. Furthermore, once pseudo labels have been generated, the Bayesian counterpart of the deep neural network, the Bayesian neural network (BNN), can be implemented for self-training and automatic classification.
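The idea of combining several detectors can be illustrated with a minimal sketch. It is not the method presented in the talk: the abstract does not specify the ensemble, so this example assumes a simple naive-Bayes combination of binary flags. A PCA reconstruction error stands in for an autoencoder, a robust z-score serves as a hypothetical second detector, and the sensitivities, specificities, prior anomaly rate, and simulated data are all assumed values chosen for illustration.

```python
import numpy as np

def pca_reconstruction_error(X, n_components):
    """Linear stand-in for an autoencoder: squared error after projecting
    onto the top principal components and reconstructing."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    recon = Xc @ V @ V.T
    return ((Xc - recon) ** 2).sum(axis=1)

def robust_zscore(X):
    """Hypothetical second detector: largest absolute robust z-score
    (median/MAD based) across features for each record."""
    med = np.median(X, axis=0)
    mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    return np.abs((X - med) / mad).max(axis=1)

def flag_top(scores, quantile):
    """Flag records whose score exceeds the given empirical quantile."""
    return scores > np.quantile(scores, quantile)

def posterior_anomaly(flags, sens, spec, prior):
    """Posterior probability of anomaly per record, assuming detectors are
    conditionally independent with known (here: assumed) sensitivity and
    specificity. `flags` has shape (n_detectors, n_records)."""
    flags = np.asarray(flags, dtype=bool)
    sens = np.asarray(sens, dtype=float)[:, None]
    spec = np.asarray(spec, dtype=float)[:, None]
    like_anom = np.where(flags, sens, 1 - sens).prod(axis=0)
    like_norm = np.where(flags, 1 - spec, spec).prod(axis=0)
    return prior * like_anom / (prior * like_anom + (1 - prior) * like_norm)

# Simulated data (assumption): 200 records, 5 correlated features,
# with the first 5 records perturbed to act as anomalies.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(200, 5))
X[:5] += rng.normal(scale=5.0, size=(5, 5))

flags = np.stack([
    flag_top(pca_reconstruction_error(X, n_components=2), 0.95),
    flag_top(robust_zscore(X), 0.95),
])
posterior = posterior_anomaly(flags, sens=[0.8, 0.8], spec=[0.9, 0.9], prior=0.05)
```

Records flagged by both detectors receive a higher posterior than records flagged by one or none. The resulting posteriors could in principle be thresholded to produce pseudo labels, though the thresholds and detector characteristics would need to be estimated rather than assumed as they are here.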