BANGS: A tool for Bayesian analysis of next-generation sequencing data (2012)


Human Health


This SPARK project will develop robust statistical data analysis tools for next-generation sequencing (NGS), a set of high-throughput technologies for measuring signals across the genome. Those signals may represent which genes are active, where certain regulatory molecules bind to the DNA, or the state of the DNA itself. However, NGS technologies do not measure genomic signals perfectly: the measurements contain omissions, uncertainty, and in some cases bias. Dr. Theodore Perkins’ project through the Ottawa Hospital Research Institute proposes a novel approach to reconstructing the genomic signals represented in NGS data using Bayesian statistics. The main features of this approach are that it produces a best estimate of the signal and also quantifies the uncertainty in that estimate. Quantifying uncertainty is useful for visualizing genomic signals and is critical for comparing them under different conditions. The approach will be implemented in efficient, open-source, well-documented software for the benefit of the NGS community.
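
To illustrate what "a best estimate plus its uncertainty" can mean in practice, here is a minimal sketch using a textbook Gamma-Poisson model for read counts at a single genomic position. The model, the prior values, and every name below (GammaPosterior, posterior_for_counts, the example counts) are assumptions made for illustration only; the summary above does not describe the project's actual statistical model or software interface.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class GammaPosterior:
    """Gamma(shape, rate) posterior over an underlying signal intensity."""
    shape: float
    rate: float

    @property
    def mean(self) -> float:
        # Posterior mean, used here as the "best estimate" of the signal.
        return self.shape / self.rate

    @property
    def sd(self) -> float:
        # Posterior standard deviation, one way to quantify uncertainty.
        return math.sqrt(self.shape) / self.rate

    def credible_interval(self, level=0.95, draws=20000, seed=0):
        # Monte Carlo approximation of an equal-tailed credible interval.
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        rng = random.Random(seed)
        samples = sorted(
            rng.gammavariate(self.shape, 1.0 / self.rate) for _ in range(draws)
        )
        lo_idx = int((1.0 - level) / 2.0 * draws)
        hi_idx = int((1.0 + level) / 2.0 * draws) - 1
        return samples[lo_idx], samples[hi_idx]


def posterior_for_counts(counts, prior_shape=1.0, prior_rate=1.0):
    """Conjugate update: Poisson counts with a Gamma(prior_shape, prior_rate) prior."""
    return GammaPosterior(prior_shape + sum(counts), prior_rate + len(counts))


# Hypothetical read counts: six replicate measurements versus a single one.
deep = posterior_for_counts([12, 9, 11, 10, 8, 10])
shallow = posterior_for_counts([10])

for name, post in [("six replicates", deep), ("one replicate", shallow)]:
    lo, hi = post.credible_interval()
    print(f"{name}: estimate={post.mean:.2f}, sd={post.sd:.2f}, "
          f"95% interval=({lo:.2f}, {hi:.2f})")
```

With more measurements the posterior narrows, which is the sense in which quantified uncertainty matters when comparing signals under different conditions: two estimates that differ by less than their combined uncertainty should not be read as a real difference.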