Batch effect mitigation via stratification for survival risk prediction using transcriptomic sequencing data

Ai Ni Speaker
 
Monday, Aug 4: 3:25 PM - 3:45 PM
Topic-Contributed Paper Session 
Music City Center 
Survival risk prediction is an important task in clinical cancer research. By virtue of simultaneously measuring the transcription of thousands of markers, transcriptomic sequencing holds the potential for predicting survival risk based on patients' transcriptomic profiles. Like many high-throughput platforms, transcriptomic sequencing suffers from the ubiquitous presence of batch effects. We previously developed the BATch MitigAtion via stratificatioN (BatMan) method to adjust for batch effects in transcriptomic microarray data. In this study, we extend BatMan to sequencing data. The discrete nature of sequencing count data and the presence of sequencing depth variation make it challenging to simulate batch effects. We use a Gamma-Poisson model to introduce batch effects into expression data and extensively assess the performance of BatMan in comparison with ComBat-Seq and sequencing depth normalization methods. We found that BatMan outperforms ComBat-Seq in all simulation scenarios. We applied BatMan to a dataset from sarcoma patients at Memorial Sloan Kettering Cancer Center to demonstrate its performance in survival risk prediction with real-world data.
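As an illustration of how a Gamma-Poisson model can inject batch effects into count data, the following is a minimal sketch and not the simulation design used in the study. The function name, the log-normal gene baselines, the multiplicative log-scale batch shifts, the uniform depth factors, and all parameter values are assumptions made for this example. Each count is drawn as a Poisson variable whose rate is Gamma-distributed around a mean that combines sequencing depth, gene baseline, and batch shift, which yields negative binomial counts marginally.

```python
import numpy as np

def simulate_counts_with_batch(n_samples=60, n_genes=200, n_batches=3,
                               dispersion=0.2, batch_sd=0.5, seed=0):
    """Hypothetical helper: Gamma-Poisson counts with batch and depth effects."""
    rng = np.random.default_rng(seed)
    # Assign each sample to a batch (assumed random assignment).
    batch = rng.integers(0, n_batches, size=n_samples)
    # Gene-level baseline expression (assumed log-normal).
    base_mean = rng.lognormal(mean=3.0, sigma=1.0, size=n_genes)
    # Batch effect as a per-batch, per-gene shift on the log scale (assumed).
    batch_shift = rng.normal(0.0, batch_sd, size=(n_batches, n_genes))
    # Per-sample sequencing depth factor (assumed uniform on [0.5, 1.5)).
    depth = rng.uniform(0.5, 1.5, size=n_samples)
    # Expected count for each sample and gene.
    mu = depth[:, None] * base_mean[None, :] * np.exp(batch_shift[batch])
    # Gamma with shape 1/dispersion and scale mu*dispersion has mean mu,
    # so the Poisson draw has variance mu + dispersion * mu**2.
    lam = rng.gamma(shape=1.0 / dispersion, scale=mu * dispersion)
    counts = rng.poisson(lam)
    return counts, batch, depth

counts, batch, depth = simulate_counts_with_batch()
```

The returned `counts` matrix has one row per sample and one column per gene. Setting `batch_sd=0` in this sketch removes the batch shift while keeping depth variation, which gives a simple reference for comparison.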