Cytoscape: a software environment for integrated models of biomolecular interaction networks.
Journal: 2004/January - Genome Research
ISSN: 1088-9051
Abstract:
Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
Genome Res 13(11): 2498-2504

Institute for Systems Biology, Seattle, Washington 98103, USA
Whitehead Institute for Biomedical Research, Cambridge, Massachusetts 02142, USA
Department of Bioengineering, University of California-San Diego, La Jolla, California 92093, USA
Present address: University of California, San Diego, Department of Bioengineering, La Jolla, California 92093, USA.
Corresponding authors. E-MAIL trey@bioeng.ucsd.edu; FAX (858) 534-5722. E-MAIL benno@systemsbiology.org; FAX (206) 732-1299.
Received 2003 Feb 1; Accepted 2003 Aug 22.

Computer-aided models of biological networks are a cornerstone of systems biology. A variety of modeling environments have been developed to simulate biochemical reactions and gene transcription kinetics (Endy and Brent 2001), cellular physiology (Loew and Schaff 2001), and metabolic control (Mendes 1997). Such models promise to transform biological research by providing a framework to (1) systematically interrogate and experimentally verify knowledge of a pathway; (2) manage the immense complexity of hundreds or potentially thousands of cellular components and interactions; and (3) reveal emergent properties and unanticipated consequences of different pathway configurations.

Typically, models are directed toward a cellular process or disease pathway of interest (Gilman and Arkin 2002) and are built by formulating existing literature as a system of differential and/or stochastic equations. However, pathway-specific models are now being supplemented with global data gathered for an entire cell or organism, by use of two complementary approaches. First, recent technological developments have made it feasible to measure pathway structure systematically, using high-throughput screens for protein-protein (Ito et al. 2001; von Mering et al. 2002), protein-DNA (Lee et al. 2002), and genetic interactions (Tong et al. 2001). To complement these data, a second set of high-throughput methods is available to characterize the molecular and cellular states induced by pathway interactions under different experimental conditions. For instance, global changes in gene expression are measured with DNA microarrays (DeRisi et al. 1997), whereas changes in protein abundance (Gygi et al. 1999), protein phosphorylation state (Zhou et al. 2001), and metabolite concentrations (Griffin et al. 2001) may be quantified with mass spectrometry, NMR, and other advanced techniques. High-throughput data pertaining to molecular interactions and states are well matched, in that both data types are global (providing information for all components or interactions in an organism); high-level (outlining relationships among pathway components without detailed information on reaction rates, binding constants, or diffusion coefficients); and coarse-grained (yielding qualitative data, such as the presence or absence of an interaction or the direction of an expression change, more readily than precise quantitative readouts).

Motivated by the explosion in experimental technologies for characterizing molecular interactions and states, researchers have turned to a variety of software tools to process and analyze the resulting large-scale data. For molecular interactions, general-purpose graph viewers such as Pajek (Batagelj and Mrvar 1998), Graphlet (www.infosun.fmi.uni-passau.de/Graphlet/), and daVinci (www.informatik.uni-bremen.de/daVinci/) are available to organize and display the data as a two-dimensional network; specialized tools such as Osprey (http://biodata.mshri.on.ca) and PIMrider (pim.hybrigenics.com) provide these capabilities and also link the network to molecular interaction and functional databases such as BIND (Bader et al. 2001), DIP (Xenarios and Eisenberg 2001), or TRANSFAC (Wingender et al. 2001). Similarly, for gene expression profiles and other molecular states, numerous programs such as GeneCluster (Tamayo et al. 1999), TreeView (Eisen et al. 1998), and GeneSpring (www.silicongenetics.com) are available for clustering, classification, and visualization. However, a pressing need remains for software that is able to integrate both molecular interactions and state measurements together in a common framework, and to then bridge these data with a wide assortment of model parameters and other biological attributes. Moreover, a flexible and open system will be required to facilitate general and extensible computations on the interaction network (Karp 2001). It is through these computations that high-level interaction data may ultimately interface with, and drive development of, low-level physico-chemical models.

To address these needs, we have developed Cytoscape, a general-purpose modeling environment for integrating biomolecular interaction networks and states. We first provide an overview of Cytoscape's core functionality for representation and integration of biomolecular network models. We then describe three case studies of existing research projects in which the Cytoscape platform is applied to concrete biological problems or extended to implement new algorithms and network computations.

Acknowledgments

While this work was in review, the Cytoscape core development team expanded to include the laboratory of Dr. Chris Sander at the Memorial Sloan-Kettering Cancer Research Center. We are particularly indebted to Gary Bader and Ethan Cerami in that lab for their recent efforts. Many thanks also go to Iliana Avila-Campillo, Andrew Finney, Mike Hucka, and Vesteinn Thorsson. T.I. is the David Baltimore Whitehead Fellow and was funded through a grant from Pfizer; A.M., P.S., and B.S. were funded through NIH Grant P20 GM64361; N.B. was funded through NSF Grant 0220153. Lastly, we gratefully acknowledge the MIT UROP office for support of J.W., N.A., and D.R.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.

Footnotes

Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1239303.

[The Cytoscape v1.1 Core runs on all major operating systems and is freely available for download from http://www.cytoscape.org/ as an open source Java application.]

Biological semantics vary widely within the biological community as well as from project to project. If biological semantics were built into the Cytoscape Core, we would be faced with a difficult question: which semantics should we use? Cytoscape avoids this problem by leaving it to plug-in writers to adopt semantics adequate to the problem at hand. Of course, there is often substantial biological significance associated with data in the Core. For example, the Core may represent a node (an abstract concept free of biological semantics) whose label is GCN4 (a text string with significance to yeast biologists as an important transcription factor), or we might use the Core to define node attributes entitled “expression ratio” or “cellular compartment.” In this way, great freedom and flexibility (the ability to accommodate new biological problems) is gained by not inscribing the notion of specific biological entities directly into the semantics of Cytoscape's Core.
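
The separation described in this footnote can be illustrated with a minimal Java sketch. It is not the Cytoscape API: the class AttributeGraph and its methods are hypothetical names invented for this illustration, and the attribute values assigned to GCN4 are placeholders rather than measured data. The sketch only shows the idea that a node is an opaque label, while any biological meaning is carried by named attributes that the caller (for example, a plug-in) chooses.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical illustration only; this is not a class from Cytoscape. */
class AttributeGraph {
    // Nodes are plain labels; the structure knows nothing about genes or proteins.
    private final Set<String> nodes = new LinkedHashSet<>();
    // attribute name -> (node label -> value); attribute names are chosen by the caller.
    private final Map<String, Map<String, Object>> nodeAttributes = new LinkedHashMap<>();

    void addNode(String label) {
        nodes.add(label);
    }

    void setNodeAttribute(String node, String attributeName, Object value) {
        if (!nodes.contains(node)) {
            throw new IllegalArgumentException("unknown node: " + node);
        }
        nodeAttributes.computeIfAbsent(attributeName, k -> new HashMap<>()).put(node, value);
    }

    Optional<Object> getNodeAttribute(String node, String attributeName) {
        Map<String, Object> table = nodeAttributes.get(attributeName);
        return table == null ? Optional.empty() : Optional.ofNullable(table.get(node));
    }

    public static void main(String[] args) {
        AttributeGraph graph = new AttributeGraph();
        graph.addNode("GCN4");
        // Placeholder values for illustration, not experimental results.
        graph.setNodeAttribute("GCN4", "expression ratio", 2.5);
        graph.setNodeAttribute("GCN4", "cellular compartment", "nucleus");
        System.out.println(graph.getNodeAttribute("GCN4", "cellular compartment").orElse("unset"));
    }
}
```

Because the attribute names are ordinary strings, a new kind of measurement can be attached without changing the graph class, which is the flexibility the footnote attributes to keeping biological entities out of the Core.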

Although attribute-based layout was initially implemented as a plug-in, its general applicability and tight integration with several of Cytoscape's core features (graph layout and node attribute mapping) ultimately led us to incorporate it into the Cytoscape Core platform. Thus, plug-ins also provide a general means of introducing and testing new features.
