A multiple imputation method for compositional microbiome data
Sunday, Aug 3: 4:25 PM - 4:45 PM
Topic-Contributed Paper Session
Music City Center
High sparsity (i.e., excessive zeros) in microbiome data is unavoidable and can significantly alter analysis results. However, efforts to address this sparsity have been limited, in part because zeros in microbiome data can arise from multiple sources, which makes it impossible to justify the validity of any single method. In this study, we first demonstrate theoretically and empirically that, when the source of zeros is unknown, treating all zeros as missing values is more robust than treating them as structural zeros (i.e., true absence) or rounded zeros (i.e., undetected due to a detection limit). We then introduce a novel multiple imputation method developed specifically for highly sparse, high-dimensional compositional data. The robustness of the proposed approach, along with its beneficial effects on downstream analyses, is demonstrated through extensive simulation studies. Finally, we reanalyze a type 2 diabetes (T2D) dataset to identify differentially abundant species between T2D patients and non-diabetic controls.
Excess zeros
Composition
High dimension
Microbiome
Multiple imputation
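The three treatments of zeros contrasted in the abstract can be illustrated on a toy sample. The sketch below is not the proposed imputation method; the count vector, the pseudo-count of 0.5, and the helper name closure are assumptions made only for illustration.

```python
import numpy as np

# Hypothetical taxon counts for a single sample (illustrative values only).
counts = np.array([120.0, 0.0, 30.0, 0.0, 50.0])

def closure(x):
    """Rescale the observed parts to sum to 1; NaN entries stay NaN."""
    return x / np.nansum(x)

# 1) Structural zeros: zeros are kept as true absences.
structural = closure(counts)

# 2) Rounded zeros: zeros are replaced by a small value below an assumed
#    detection limit of 1 (here 0.5), then the composition is re-closed.
rounded = closure(np.where(counts == 0, 0.5, counts))

# 3) Missing values: zeros are marked as NaN so that a later imputation
#    step (not shown) can fill them in.
missing = closure(np.where(counts == 0, np.nan, counts))

print(structural)
print(rounded)
print(missing)
```

Under the third treatment, the NaN entries are what a multiple imputation procedure would fill in several times, producing several completed datasets whose downstream results are then combined.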