Earth and Planetary Science

Banfield Lab Creates dRep, a Program for Fast and Accurate Genome De-replication

Wednesday, March 22, 2017

Matthew Olm, Christopher Brown, Brandon Brooks and Jill Banfield are co-authors of “dRep: A tool for fast and accurate genome de-replication that enables tracking of microbial genotypes and improved genome recovery from metagenomes.” The article was posted on bioRxiv: The Preprint Server for Biology, a free online archive and distribution service in the life sciences.

The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here the Banfield Lab presents dRep, a program that sequentially applies a fast but inaccurate estimation of genome distance and a slow but accurate measure of average nucleotide identity, reducing the computational time for pair-wise genome set comparisons by orders of magnitude. The researchers demonstrate its use in a study in which they separately assembled each metagenome from time-series datasets. dRep identified groups of essentially identical genomes, and the best genome from each group was selected. This resulted in the recovery of significantly more, and higher-quality, genomes than the typical co-assembly method.
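The two-stage idea described above can be illustrated with a small Python sketch. This is not dRep's code or its interface. The k-mer Jaccard score and the position-wise identity are toy stand-ins for the fast estimate and the slow average nucleotide identity measure, and the sequences, quality scores, cutoffs and function names are all hypothetical, chosen only to show why a cheap first pass reduces the number of expensive comparisons.

```python
from itertools import combinations

def kmer_set(seq, k=4):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def coarse_similarity(a, b, k=4):
    """Toy stand-in for the fast, rough estimate: Jaccard index of k-mer sets."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0

def fine_identity(a, b):
    """Toy stand-in for the slow, accurate measure: fraction of matching
    positions, assuming the two sequences are already aligned."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))

def cluster(names, similar):
    """Single-linkage clustering: every pair is tested once with similar()."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if similar(a, b):
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())

def dereplicate(genomes, quality, coarse_cutoff=0.5, fine_cutoff=0.9):
    """Return (best genome per group, number of fine comparisons made)."""
    counter = {"fine": 0}

    def coarse_close(a, b):
        return coarse_similarity(genomes[a], genomes[b]) >= coarse_cutoff

    def fine_close(a, b):
        counter["fine"] += 1
        return fine_identity(genomes[a], genomes[b]) >= fine_cutoff

    winners = []
    # Stage 1: cheap comparison of all pairs gives rough primary clusters.
    for primary in cluster(list(genomes), coarse_close):
        # Stage 2: the expensive comparison runs only inside each primary cluster.
        for group in cluster(primary, fine_close):
            # Keep the highest-scoring genome of each near-identical group.
            winners.append(max(group, key=quality.get))
    return sorted(winners), counter["fine"]

# Hypothetical toy data: two unrelated "families" of 20-base sequences.
GENOMES = {
    "a1": "AACACCCAACCACAAACCAA",
    "a2": "AACACCCAACCACAAACCAC",  # 1 mismatch vs a1
    "a3": "CCAACCCAACCACAAACCAA",  # 3 mismatches vs a1
    "b1": "GGTGTTTGGTTGTGGGTTGG",
    "b2": "GGTGTTTGGTTGTGGGTTGT",  # 1 mismatch vs b1
}
# Hypothetical quality scores (higher is better).
QUALITY = {"a1": 90, "a2": 95, "a3": 80, "b1": 99, "b2": 70}

if __name__ == "__main__":
    print(dereplicate(GENOMES, QUALITY))
```

With these five toy sequences, the coarse pass separates the two families, so the fine measure runs on 4 pairs rather than all 10, and one representative is kept for each group of near-identical sequences. On real genome sets the same principle applies at a much larger scale, although the actual distance measures, thresholds and scoring criteria differ from this sketch.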

All four authors are members of the Banfield Lab at UC Berkeley; Matthew Olm, Christopher Brown and Brandon Brooks are graduate students working with Professor Jill Banfield.

For more information, please click here (free access).