ARB: a software environment for sequence data
Abstract
The ARB (from Latin arbor, tree) project was initiated almost 10 years ago. The ARB program package comprises a variety of directly interacting software tools for sequence database maintenance and analysis, all controlled by a common graphical user interface. Although it was initially designed for ribosomal RNA data, it can be used for any nucleic acid or amino acid sequence data as well. A central database contains processed (aligned) primary structure data. Any additional descriptive data can be stored in database fields assigned to the individual sequences or linked via local or worldwide networks. A phylogenetic tree visualized in the main window can be used for data access and visualization. The package comprises additional tools for data import and export, sequence alignment, primary and secondary structure editing, profile and filter calculation, phylogenetic analyses, specific hybridization probe design and evaluation, and other components for data analysis. Currently, the package is used by numerous working groups worldwide.
Datasets differing in the number of sequence entries were used for PT server generation. The most comprehensive of these datasets, comprising 25 743 entries and 40 000 alignment positions, was used as a template for inserting the specified numbers of sequences into the database alignment or tree. For treeing, 2141 alignment columns were included.
ACKNOWLEDGEMENTS
The authors gratefully acknowledge F. O. Glöckner and R. Amann (Max Planck Institute for Marine Microbiology, Bremen, Germany) for redistributing the ARB database and software, answering user queries, and hosting and organizing ARB workshops. The authors thank Laura Schulz (Ludwig Maximilian University, Munich, Germany) for critical reading of the manuscript. ARB software development and database maintenance were partly supported by the European Union within the HRAMI project, by the German Research Foundation and by the German Ministry of Research and Education BIOLOG project.





