Ensemble Models for Differential Analysis

Erina Paul Co-Author
 
Piyali Basak Co-Author
Merck & Co., Inc.
 
Ziyu Liu Co-Author
Cornell University
 
Jialin Gao Co-Author
Cornell University
 
Arinjita Bhattacharyya Co-Author
Merck & Co., Inc.
 
Chitrak Banerjee Co-Author
Michigan State University
 
Himel Mallick Co-Author
Cornell University
 
Suvo Chatterjee First Author
Indiana University, Bloomington
 
Arinjita Bhattacharyya Presenting Author
Merck & Co., Inc.
 
Monday, Aug 4: 2:35 PM - 2:50 PM
1808 
Contributed Papers 
Music City Center 

Description

Inspired by ensemble models in machine learning, we propose a general framework that aggregates multiple diverse base models to boost the power of published differential association analysis (DAA) methods. We demonstrate this approach by augmenting popular DAA models with one or more biologically motivated alternatives. The resulting ensemble sidesteps the challenge of selecting a single optimal model and instead combines the strengths of complementary statistical models to achieve superior performance. The approach is platform-agnostic and can augment any existing DAA method, providing a flexible framework for downstream modeling tasks across domains and data types. In extensive benchmarking on simulated and experimental datasets, ranging from single-cell and bulk ribonucleic acid sequencing (RNA-Seq) to microbiome profiles, the ensemble strategy substantially outperformed non-ensemble methods, identified more differential patterns than competing methods, and maintained good control of false positive and false discovery rates across diverse scenarios. Repository: https://github.com/himelmallick/DAssemble
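
As a rough illustration of what aggregating base models can look like, the sketch below combines per-feature p-values from two base models and then applies a false discovery rate correction. The abstract does not state which aggregation rule the framework uses; the Cauchy combination rule and the Benjamini-Hochberg adjustment here are assumptions chosen for illustration, and the function names and p-values are hypothetical rather than part of DAssemble.

```python
import math


def cauchy_combine(pvalues, weights=None):
    """Combine p-values for one feature with the Cauchy combination rule.

    Assumption: this rule is a stand-in; the abstract does not name
    the aggregation method used by the authors.
    """
    if weights is None:
        weights = [1.0 / len(pvalues)] * len(pvalues)
    eps = 1e-15
    total = 0.0
    for w, p in zip(weights, pvalues):
        p = min(max(p, eps), 1.0 - eps)  # keep tan() away from its poles
        # Map each p-value to a standard Cauchy variate and take a weighted sum.
        total += w * math.tan((0.5 - p) * math.pi)
    # Cauchy survival function maps the statistic back to a p-value.
    return 0.5 - math.atan(total) / math.pi


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-feature p-values from two base DAA models.
model_a = [0.001, 0.20, 0.04, 0.70]
model_b = [0.004, 0.35, 0.01, 0.55]

ensemble_p = [cauchy_combine([a, b]) for a, b in zip(model_a, model_b)]
ensemble_q = benjamini_hochberg(ensemble_p)

for idx, (p, q) in enumerate(zip(ensemble_p, ensemble_q)):
    print(f"feature {idx}: combined p = {p:.4g}, BH-adjusted = {q:.4g}")
```

A combination rule of this kind needs only the p-values from each base model, which is one way an ensemble could stay agnostic to the underlying DAA method.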

Keywords

Tweedie

differential expression

omics

data science 

Main Sponsor

Biopharmaceutical Section