Identifying bacterial genes and endosymbiont DNA with Glimmer.
Journal: 2007/April - Bioinformatics
ISSN: 1367-4811
Abstract:
BACKGROUND
The Glimmer gene-finding software has been successfully used for finding genes in bacteria, archaea and viruses representing hundreds of species. We describe several major changes to the Glimmer system, including improved methods for identifying both coding regions and start codons. We also describe a new module of Glimmer that can distinguish host and endosymbiont DNA. This module was developed in response to the discovery that eukaryotic genome sequencing projects sometimes inadvertently capture the DNA of intracellular bacteria living in the host.
RESULTS
The new methods dramatically reduce the rate of false-positive predictions, while maintaining Glimmer's 99% sensitivity rate at detecting genes in most species, and they find substantially more correct start sites, as measured by comparisons to known and well-curated genes. We show that our interpolated Markov model (IMM) DNA discriminator correctly separated 99% of the sequences in a recent genome project that produced a mixture of sequences from the bacterium Prochloron didemni and its sea squirt host, Lissoclinum patella.
AVAILABILITY
Glimmer is OSI Certified Open Source and available at http://cbcb.umd.edu/software/glimmer.
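
The host/endosymbiont discriminator described above can be illustrated with a minimal sketch. It is not Glimmer's implementation: Glimmer uses an interpolated Markov model that blends several context lengths, whereas this sketch assumes a single fixed-order Markov chain per organism and classifies a read by the sign of the log-likelihood ratio. The function names, the order of 2, the pseudocount and the toy training sequences are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"


def train_markov(seqs, order=2, pseudocount=1.0):
    """Estimate log P(base | previous `order` bases) from training sequences.

    Simplified stand-in for an IMM: one fixed context length, add-one smoothing.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        seq = seq.upper()
        for i in range(order, len(seq)):
            ctx, base = seq[i - order:i], seq[i]
            # Skip windows containing ambiguous characters such as N.
            if base in ALPHABET and all(c in ALPHABET for c in ctx):
                counts[ctx][base] += 1
    model = {}
    for ctx_tuple in product(ALPHABET, repeat=order):
        ctx = "".join(ctx_tuple)
        total = sum(counts[ctx][b] for b in ALPHABET) + pseudocount * len(ALPHABET)
        model[ctx] = {
            b: math.log((counts[ctx][b] + pseudocount) / total) for b in ALPHABET
        }
    return model


def log_likelihood(seq, model, order=2):
    """Sum the log-probabilities of each base given its preceding context."""
    seq = seq.upper()
    total = 0.0
    for i in range(order, len(seq)):
        ctx, base = seq[i - order:i], seq[i]
        if ctx in model and base in model[ctx]:
            total += model[ctx][base]
    return total


def classify(seq, host_model, symbiont_model, order=2):
    """Label a sequence by the log-likelihood ratio of the two models."""
    score = log_likelihood(seq, symbiont_model, order) - log_likelihood(
        seq, host_model, order
    )
    return ("endosymbiont" if score > 0 else "host"), score


# Toy training data (hypothetical): an AT-rich "host" and a GC-rich "endosymbiont".
host_model = train_markov(["ATATTTAAATATTAATTTATAAATATTTAA" * 3])
symbiont_model = train_markov(["GCGCCGGCGGCCGCGGGCCGCGCCGGCGCC" * 3])

for read in ("GCGGCCGCGCGG", "ATTTAATATAAT"):
    label, score = classify(read, host_model, symbiont_model)
    print(read, label, round(score, 2))
```

In practice the two models would be trained on sequences of known origin, and a read whose score lies close to zero is better left unclassified than forced into either class.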
Bioinformatics 23(6): 673-679

Center for Bioinformatics & Computational Biology, University of Maryland, College Park, MD 20742, USA
Smurfit Institute of Genetics, University of Dublin, Trinity College Dublin, Dublin 2, Ireland
Department of Chemical & Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
To whom correspondence should be addressed.

Contact: adelcher@umiacs.umd.edu

Footnotes

Associate Editor: Alfonso Valencia

Availability: Glimmer is OSI Certified Open Source and available at http://cbcb.umd.edu/software/glimmer

Conflict of Interest: none declared.
