STRING: a database of predicted functional associations between proteins.
Journal: 2003/March - Nucleic Acids Research
ISSN: 1362-4962
PUBMED: 12519996
Abstract:
Functional links between proteins can often be inferred from genomic associations between the genes that encode them: groups of genes that are required for the same function tend to show similar species coverage, are often located in close proximity on the genome (in prokaryotes), and tend to be involved in gene-fusion events. The database STRING is a precomputed global resource for the exploration and analysis of these associations. Since the three types of evidence differ conceptually, and the number of predicted interactions is very large, it is essential to be able to assess and compare the significance of individual predictions. Thus, STRING contains a unique scoring-framework based on benchmarks of the different types of associations against a common reference set, integrated in a single confidence score per prediction. The graphical representation of the network of inferred, weighted protein interactions provides a high-level view of functional linkage, facilitating the analysis of modularity in biological processes. STRING is updated continuously, and currently contains 261 033 orthologs in 89 fully sequenced genomes. The database predicts functional interactions at an expected level of accuracy of at least 80% for more than half of the genes; it is online at http://www.bork.embl-heidelberg.de/STRING/.
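The abstract says that the three evidence types are integrated into a single confidence score per prediction, but it does not give the formula. As a minimal sketch only, the snippet below shows one common way to merge independent confidence values: treat each as the probability that the association is real, and take the probability that at least one line of evidence is correct. The independence assumption, the function name `combined_confidence`, the evidence labels, and the numeric values are all illustrative assumptions, not taken from the source.

```python
from math import prod


def combined_confidence(scores):
    """Merge per-evidence confidences, ASSUMING they are independent.

    Each score is read as the probability that the predicted association
    is correct; the result is the probability that at least one evidence
    type is correct: 1 - prod(1 - s_i).
    """
    scores = list(scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"confidence must lie in [0, 1], got {s}")
    return 1.0 - prod(1.0 - s for s in scores)


# Hypothetical per-evidence confidences for one predicted protein pair.
evidence = {
    "gene_neighbourhood": 0.6,   # close proximity on the genome
    "gene_fusion": 0.5,          # involvement in a gene-fusion event
    "species_cooccurrence": 0.0, # similar species coverage (no signal here)
}

score = combined_confidence(evidence.values())

# Illustrative cutoff echoing the 80% accuracy level quoted in the abstract.
passes_cutoff = score >= 0.8 - 1e-9
```

Under this assumption a pair with two moderately confident lines of evidence can reach a higher combined confidence than either line alone, which is the practical point of scoring all evidence types against a common reference set.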
Nucleic Acids Res 31(1): 258-261

European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany; Max-Delbrück-Centre for Molecular Medicine, Robert-Rössle-Strasse 10, 13092 Berlin, Germany; Nijmegen Centre for Molecular Life Sciences p/a Centre of Molecular and Biomolecular Informatics, Toernooiveld 1, 6525 ED Nijmegen, The Netherlands
To whom correspondence should be addressed. Email: bork@embl-heidelberg.de
Received 2002 Aug 14; Accepted 2002 Sep 11.

ACKNOWLEDGEMENTS

This work was supported in part by grants from the Netherlands Organization for Scientific Research (NWO), from the Deutsche Forschungsgemeinschaft, and from the Bundesministerium für Forschung und Bildung, Germany, through its contribution to the Helmholtz Network for Bioinformatics.
