Biological causes and impacts of rugged tree landscapes in phylodynamic inference
Monday, Aug 4: 8:55 AM - 9:15 AM
Topic-Contributed Paper Session
Music City Center
Phylodynamic analysis has been instrumental in elucidating the spread and evolutionary dynamics of pathogens and cells. The Bayesian approach to phylodynamics integrates out phylogenetic uncertainty, which is typically substantial in phylodynamic datasets due to low genetic diversity. Bayesian phylodynamic analysis does not, however, scale to modern datasets, partly due to difficulties in traversing tree space. Here, we set out to characterize phylodynamic tree space and assess its impacts on analysis difficulty and key biological inferences. By running extensive Bayesian analyses of 15 classic large phylodynamic datasets and carefully analyzing the posteriors, we find that the posterior landscape in tree space ("tree landscape") is diffuse yet rugged, leading to widespread tree sampling problems that usually stem from a small part of the tree. We develop clade-specific diagnostics to show that a few sequences, including putative recombinants and recurrent mutants, frequently drive the ruggedness and sampling problems, although existing data-quality tests show limited power to detect such sequences. The sampling problems can significantly impact phylodynamic inferences or even distort major biological conclusions; the impact is usually stronger on "local" estimates (e.g., introduction history of a focal clade) than on "global" parameters (e.g., demographic trajectory) that are governed by the general tree shape. In addition, we demonstrate that heterochronous sampling dates contain considerable information about tree topology, which can conflict with genetic data at a local scale, leading to further complexity in the tree space and systematic discrepancies between Bayesian and commonly used stepwise phylodynamic approaches. We evaluate existing and newly developed MCMC diagnostics, and offer strategies for optimizing MCMC settings and mitigating the impacts of the sampling problems.
Our findings highlight the need for efficient traversal of the rugged tree landscape and suggest directions for developing it, ultimately advancing scalable and reliable phylodynamics.
Bayesian phylodynamics
phylogenetic inference
Markov chain Monte Carlo
viral evolution
heterochronous sequences
single-cell sequencing