Cn-RNN: a Supervised Learning Framework for CNV Detection with Sequencing Data
Tuesday, Aug 5: 10:05 AM - 10:20 AM
1653
Contributed Papers
Music City Center
Copy number variants (CNVs), which involve genomic duplications or deletions, play a critical role in various human diseases. Accurate CNV detection is essential but challenging because of high dimensionality, technical biases, and low signal-to-noise ratios, which lead to inconsistent calls and high false-positive rates. Existing deep learning-based methods employ Convolutional Neural Networks (CNNs), which rely on image-based recognition and are prone to domain shift. In addition, accurate supervised learning requires a large, validated variant set to distinguish true CNVs from false positives.
Therefore, we developed a novel deep learning model, cn-RNN, that uses Recurrent Neural Networks (RNNs) to estimate copy number from sequencing data. Unlike CNNs, RNNs inherently preserve the sequential structure of genomic data, enabling more accurate and biologically meaningful processing of sequencing data. In addition, we used a publicly available trio dataset to construct a large, high-confidence CNV training set. Compared to CNN-based methods, cn-RNN achieved a 20% higher F1-score with significantly fewer false positives. Our work enables more reliable CNV detection from sequencing data.
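To illustrate what processing sequencing data sequentially can look like, the following is a minimal sketch of a plain (Elman-style) RNN forward pass over a series of genomic bins. It is not the cn-RNN architecture, which the abstract does not specify. The input representation (one normalized read-depth value per bin), the three output classes (deletion, neutral, duplication), the hidden size, and all function names are assumptions made for illustration, and the weights are random and untrained, so the outputs carry no biological meaning.

```python
import math
import random


def init_params(hidden_size, n_classes, seed=0):
    """Random, untrained weights (hypothetical sizes, for illustration only)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "w_xh": [rng.uniform(-0.5, 0.5) for _ in range(hidden_size)],  # scalar input -> hidden
        "w_hh": mat(hidden_size, hidden_size),                         # hidden -> hidden
        "b_h": [0.0] * hidden_size,
        "w_hy": mat(n_classes, hidden_size),                           # hidden -> class logits
        "b_y": [0.0] * n_classes,
    }


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rnn_forward(depths, params):
    """Walk the bins in genomic order, carrying a hidden state between bins.

    depths: assumed normalized read depth per bin (1.0 = expected coverage).
    Returns one probability vector per bin over the assumed classes
    (deletion, neutral, duplication).
    """
    hidden_size = len(params["b_h"])
    n_classes = len(params["b_y"])
    h = [0.0] * hidden_size
    outputs = []
    for x in depths:
        # New hidden state depends on the current bin and the previous state.
        h = [
            math.tanh(
                params["w_xh"][i] * x
                + sum(params["w_hh"][i][j] * h[j] for j in range(hidden_size))
                + params["b_h"][i]
            )
            for i in range(hidden_size)
        ]
        logits = [
            sum(params["w_hy"][k][j] * h[j] for j in range(hidden_size)) + params["b_y"][k]
            for k in range(n_classes)
        ]
        outputs.append(softmax(logits))
    return outputs


params = init_params(hidden_size=4, n_classes=3)
depths = [1.0, 1.02, 0.55, 0.5, 0.48, 1.0]  # made-up values, not real data
probs = rnn_forward(depths, params)
```

Because the hidden state is carried from bin to bin, the prediction at each position depends on the bins that precede it, which is the property the abstract contrasts with image-based CNN inputs. A practical model would be trained on labeled CNVs and would likely use a gated or bidirectional recurrent layer.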
Copy Number Variants (CNV) Detection
Recurrent Neural Networks (RNN)
Supervised Learning
Statistical Genetics
Deep Learning in Genomics
Main Sponsor
Section on Statistics in Genomics and Genetics