Abstract

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) incorporates, organizes and distributes nucleotide sequences from all available public sources. The database is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis to achieve optimal synchronization. Webin is the preferred web-based submission system for individual submitters, while automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via FTP, Email and World Wide Web interfaces. EBI's Sequence Retrieval System (SRS) integrates and links the main nucleotide and protein databases plus many other specialized molecular biology databases. For sequence similarity searching, a variety of tools (e.g. Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. All resources can be accessed via the EBI home page at http://www.ebi.ac.uk.
