Bacteriophage protein-protein interactions.

Phage	Proteins	Phage–phage PPIs	Phage–host PPIs
T7	55	~43 (including 7 HTS)	15
λ	73	33 (+ 81 HTS)	26
P22	64	38	9
P2/P4	42/13	14	3
ϕ29	30	20	3
Dp-1	72	156 (HTS)	38 (HTS)
Cp-1	28	17 (HTS)	11 (HTS)

A. Genetic detection of interactions

Methods used to study protein–protein interactions in phages have evolved over the decades. A few important methods are illustrated in Figure 1. In the early days of phage molecular biology, the detection of genetic interactions dominated. For instance, certain mutants in the λ J gene abolished virion binding to host Escherichia coli cells. Similarly, mutants of the E. coli gene lamB made the cells resistant to λ infection. From such experiments it was concluded that the J protein binds to LamB and that LamB is the λ receptor. This hypothesis was confirmed by isolating compensatory mutations in both genes that restore the interaction (Werts et al., 1994).

FIGURE 1

Selected methods for the study of protein–protein interactions. (A) The yeast two-hybrid (Y2H) system is based on the levels of two fusion proteins (typically in yeast but any cell type can be used). One of the proteins contains a DNA-binding domain (DBD), which can bind to the promoter of a reporter gene (here: HIS3), and a second protein X, the bait. The second fusion protein consists of a transcription-activation domain (AD) and a second protein, Y. If proteins X and Y interact, a transcription factor is formed and the reporter gene is activated. In this case, that means that the cell can grow on histidine-free medium. A yeast colony growing on such medium thus indicates an interaction of the two inserted proteins. (B) Protein complementation assay (PCA), e.g., split-YFP. As in the Y2H assay, two interacting proteins bring together two protein fragments that are inactive when separate but active when in close proximity. Here, fragments of yellow-fluorescent protein (YFP) reassociate and fluoresce when reassembled. Other fluorescent proteins, such as the green fluorescent protein (GFP), have been used in a similar way. (C) LUMIER (LUMInescence-based mammalian intERactome). Two fusion proteins are purified by means of an epitope tag (here: FLAG tag), usually on an antibody-coated matrix. Interactions between X and Y can be detected using Luciferase, whose gene is fused to Y and which emits light when luciferin is added. (D) Affinity purification. Protein complexes can be purified from cellular lysates using an affinity epitope, as in (C) with a FLAG tag. Components of the complex can then be identified using mass spectrometry, using the unique mass of peptides when the protein is digested by trypsin.

B. Electron microscopy

Another important method for the investigation of physical relationships between phage components is electron microscopy, especially in combination with genetic analysis. Mutants often have an aberrant morphology. If certain structures are missing in the mutant virion, it can be concluded that the mutated gene encodes this structure or is at least involved in its assembly. For instance, many laboratory strains of λ harbor a frameshift mutation in the stf gene, and these mutants lack side tail fibers (Hendrix and Duda, 1992). Many such studies not only showed which genes encode which proteins, but also indicated how the proteins interact (e.g., when particular structures could be “removed” from the virion by defined gene knockouts).

More recently, many bacteriophage capsids and phage protein complexes have been analyzed by cryoelectron microscopy and three-dimensional reconstruction. This technique gives resolutions approaching those of X-ray crystallography, especially for proteins present in symmetric arrays in the virion. It has proven ideal for analyzing protein–protein interactions in large complexes, particularly in combination with high-resolution structural data from other techniques such as X-ray crystallography.

C. Yeast two-hybrid screens

More recently, the yeast two-hybrid system (Y2H) has become an important method to detect pairwise protein interactions. Interaction screens of phage proteins can be carried out easily by cloning all open reading frames in appropriate vectors and then testing these in a systematic pairwise fashion. This has been done for several phages, including coliphages T7 (Bartel et al., 1996) and λ (Rajagopala et al., 2011), as well as Streptococcus pneumoniae phages Cp-1 and Dp-1 (Häuser et al., 2011; Sabri et al., 2010). The whole process can also be automated.
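The size of such a systematic pairwise screen can be sketched as follows. This is a minimal illustration, not the protocol of any cited study: the ORF names are placeholders, the count of 28 is taken from the Cp-1 row of the table above, and the 96-well plate format is an assumption. Pairs are ordered because a protein behaves differently as bait (DBD fusion) and as prey (AD fusion), and self-pairs are kept so that homomeric interactions can be tested.

```python
import math
from itertools import product


def array_screen_pairs(orfs):
    """Return every ordered (bait, prey) combination, including self-pairs."""
    return list(product(orfs, repeat=2))


def plates_needed(n_tests, wells_per_plate=96):
    """Number of microtiter plates for n_tests, one test per well (assumed layout)."""
    return math.ceil(n_tests / wells_per_plate)


# Placeholder ORF names; 28 matches the protein count listed for Cp-1
orfs = [f"orf{i}" for i in range(1, 29)]
pairs = array_screen_pairs(orfs)

print(len(pairs))                 # 784 bait/prey combinations
print(plates_needed(len(pairs)))  # 9 plates of 96 wells
```

The quadratic growth in the number of tests is the reason automation becomes attractive for larger phage genomes.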

One disadvantage of the Y2H system is that a typical pairwise screen, also called an “array screen” because of the systematic arrangement of clones in microtiter plates, may only detect 20–30% of all interactions. That is, the rate of false negatives is on the order of 70–80%. This can, in theory, be overcome by using multiple vectors that differ in the structure of the fusion proteins, for instance in the production of N- or C-terminal fusion proteins (Stellberger et al., 2010). A combination of multiple vectors can achieve interaction detection rates of up to 80% (i.e., false negative rates as low as 20%) (Chen et al., 2010).
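The arithmetic behind these detection and false-negative rates can be made explicit with a short sketch. The function `combined_detection_rate` rests on a simplifying assumption that is not taken from the cited studies: each vector combination is treated as missing an interaction independently of the others, which real fusion constructs are unlikely to satisfy exactly.

```python
from typing import Sequence


def false_negative_rate(detection_rate: float) -> float:
    """Fraction of true interactions missed, given the fraction detected."""
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    return 1.0 - detection_rate


def combined_detection_rate(per_vector_rates: Sequence[float]) -> float:
    """Detection rate when an interaction counts if ANY vector combination finds it.

    Assumption (illustrative model only): the vector combinations miss
    interactions independently of one another.
    """
    missed = 1.0
    for rate in per_vector_rates:
        missed *= false_negative_rate(rate)
    return 1.0 - missed


for rate in (0.20, 0.30, 0.80):
    print(f"detected {rate:.0%} -> missed {false_negative_rate(rate):.0%}")

# Five hypothetical vector combinations, each detecting 25% of interactions
print(f"{combined_detection_rate([0.25] * 5):.2f}")
```

Under the independence assumption, five combinations that each detect a quarter of the interactions would together detect roughly three quarters of them, which shows qualitatively why adding vectors raises coverage.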

Y2H also suffers from false positives that can only be understood with additional experimentation. Nonetheless, a multivector Y2H strategy can help in the recognition of false positives. For instance, if multiple vector combinations are used, interactions found in all or most cases are more likely to be “true positives,” whereas interactions detected in a lower fraction of cases are less reliable and may be false positives.
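One way to turn this idea into a ranking is to record, for each protein pair, the fraction of vector combinations in which it was detected. The sketch below is only an illustration: the proteins X, Y, and Z, the four vector combinations (NN, NC, CN, CC), and the 0.5 cutoff are all hypothetical and are not taken from the cited work.

```python
from collections import defaultdict


def support_fraction(hits, n_combinations):
    """Map each (bait, prey) pair to the fraction of vector combinations that detected it."""
    combos_by_pair = defaultdict(set)
    for bait, prey, combo in hits:
        combos_by_pair[(bait, prey)].add(combo)
    return {
        pair: len(combos) / n_combinations
        for pair, combos in combos_by_pair.items()
    }


# Hypothetical hits: (bait, prey, vector combination in which the pair scored positive)
hits = [
    ("X", "Y", "NN"), ("X", "Y", "NC"), ("X", "Y", "CN"), ("X", "Y", "CC"),
    ("X", "Z", "NC"),
]

scores = support_fraction(hits, n_combinations=4)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    # The 0.5 cutoff is an arbitrary illustrative threshold
    label = "more reliable" if score >= 0.5 else "needs confirmation"
    print(pair, f"{score:.2f}", label)
```

A low score does not prove that an interaction is false; it only flags the pair for the additional experimentation mentioned above.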

D. Protein complex purification and mass spectrometry

Whole phage virions can be purified, typically by CsCl density gradient centrifugation. Proteins from purified phage can then be denatured and separated on polyacrylamide gels or subjected directly to trypsin digestion and subsequent liquid chromatography (LC) and mass spectrometry (MS). The masses of peptides determined by MS can usually identify the corresponding proteins unambiguously, and thus the composition of the phage particles. Once all proteins in a virion are identified, at least some can be assigned easily to known structures, such as the icosahedral capsid or tail sheath. Typically, only a fraction of all phage-encoded proteins are represented in the mature virion (e.g., fewer than 20 of the ≈70 phage λ proteins). Especially for little-studied phages it can be extremely helpful to know which proteins are in the virion. For example, in an analysis of Streptococcus phage Dp-1, only 8 of the 72 predicted proteins were identified by LC/MS and were thus confirmed as structural proteins (see later).
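The principle of identifying proteins from tryptic peptide masses can be illustrated with a small in-silico digest. This is a simplified sketch under stated assumptions: trypsin is modeled as cleaving after K or R unless the next residue is P, with no missed cleavages or modifications; the residue masses are standard monoisotopic values; and the two toy sequences and the 0.01 Da tolerance are invented for illustration. Real LC/MS workflows rely on fragment spectra and dedicated search software.

```python
# Standard monoisotopic residue masses in daltons
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cleave after K or R, except when the following residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def matching_proteins(observed_masses, proteome, tolerance=0.01):
    """Names of proteins with at least one tryptic peptide near an observed mass."""
    matched = set()
    for name, sequence in proteome.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
        if any(abs(t - m) <= tolerance for t in theoretical for m in observed_masses):
            matched.add(name)
    return matched


# Toy sequences, invented for illustration only
toy_proteome = {"protA": "MAGKRPSTKLLER", "protB": "GGSWKAAFR"}
observed = [peptide_mass("LLER")]  # pretend this mass came from the instrument

print(tryptic_digest(toy_proteome["protA"]))
print(matching_proteins(observed, toy_proteome))
```

In a small phage proteome, a handful of matching peptide masses is often enough to single out one protein, which is why the approach works well for virion composition.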

E. Biochemistry of protein complexes

Biochemical approaches often involve the purification of protein complexes and thus yield protein interaction information. This approach can be made technically easier if individual proteins are affinity tagged. For instance, phage-encoded proteins can often be fused to affinity tags such as glutathione-S-transferase or oligohistidine without affecting their function. Such tagged proteins and proteins bound to them can then be purified using affinity reagents. The untagged proteins in the complex can then be identified by mass spectrometry or immunoblotting (if antibodies against specific proteins are available). A range of experimental techniques is available for the characterization of phage protein complexes and their activities; a comprehensive overview cannot be given here. For instance, chemical cross-linking of interacting proteins is used frequently to spatially locate physically associated proteins, and cross-linkers with variable spacer lengths can determine the steric proximity of the binding partners. Other tests include assays of enzymatic activity (e.g., proteases or other modifying enzymes that may be required for phage assembly), replication, or other processes. Some examples of such activities are described here.

F. X-ray crystallography and nuclear magnetic resonance

Only atomic structures of all phage proteins and their interactions will give a mechanistic understanding of phage biology, and progress in this area has been accelerating. Individual proteins and protein complexes, if not the whole phage particle, can be crystallized and the structure of such proteins solved by X-ray analysis. Alternatively, protein structures can be solved by nuclear magnetic resonance (NMR) if high enough concentrations can be achieved. Unfortunately, size restrictions on NMR structural determinations limit the analysis of large protein complexes, but monomeric structures can be extremely useful in devising testable models for the interactions proteins may undergo. However, many phage proteins are refractory to structural analysis (especially those involved in virion assembly, as they often assemble only at high concentrations), and even in classical model systems such as phage λ fewer than a third of all phage-encoded proteins have been crystallized, most of these as pure proteins rather than as protein complexes (Rajagopala et al., 2011; Yura et al., 2006).

A. Coliphage T7

Bacteriophage T7 was defined in 1945 as a member of the seven Type (“T”) phages that grow lytically on E. coli B (Demerec and Fano, 1945), although it is probably identical to phage δ, used earlier by Delbrück; a close relative of T7 was likely studied by d’Herelle in the 1920s (d’Herelle, 1926). Relatives of T7 exhibit high synteny across their entire genomes, with the closest relatives differing primarily only by the number and location of homing endonucleases or other selfish elements not known to have any role in phage development. These phages therefore will exhibit life cycles and PPIs that parallel T7 itself, modified only by the resources available in their host organisms.

1. Growth physiology

T7 grows on rough strains of E. coli (i.e., those without full-length O-antigen polysaccharide on their surface) and some other enteric bacteria, but close relatives have been described that infect smooth and even capsulated strains (Molineux, 2006a). In such phages, tail fibers present on T7 are replaced with tailspikes with enzymatic activity that degrades the O- or K-antigens on the cell surface. T7 has a short latent period of 17 min at 37°C, which has resulted in most physiological studies being conducted at 30°C where infected cells lyse after 30 min. However, adaptation of T7 to high fitness with an excess of an E. coli K-12 strain growing under optimal conditions at 37°C in rich media results in a phage with a latent period of only ≈11 min. This adapted phage can undergo an effective expansion of its population by more than 10^[…] in 1 hr of growth (Heineman and Bull, 2007). There is no obvious subtlety in the T7 developmental pathway: the early region of its genome is transcribed efficiently by the host RNA polymerase, leading to synthesis of a phage-specific RNA polymerase and the concurrent inhibition of the host enzyme. Thus, T7 rapidly takes control of the transcriptional capacity of the infected cell. As E. coli mRNAs are also intrinsically unstable, T7 therefore also gains control of the translational capacity of the cell. Further, because T7 uses nucleotides derived from the host chromosome for its own replication, more than 80% of the DNA in progeny is derived from the bacterial chromosome and only a small burst is observed after infection of DNA-free minicells. T7 development is not known to be dependent on host chaperones, although it does code for two proteins that have been suggested to have chaperone activity in capsid and tail assembly. Aside from host components used for phage adsorption, plus general biosynthetic processes and the translational apparatus, T7 development is remarkably independent of host proteins: only E. coli RNA polymerase, thioredoxin (the cofactor for T7 DNA polymerase), and (deoxy)cytidine monophosphate kinase are known to be essential.

2. Early steps in infection

T7 is a member of Podoviridae, possessing only a short stubby tail that needs to be extended functionally in order to span the cell envelope, which thus allows its genome to be translocated into the cell. Tail extension is effected through internal head proteins that are ejected into the cell at the initiation of infection prior to or concomitant with transport of the leading end of the genome (Chang et al., 2010; Kemp et al., 2005). A schematic of the T7 virion is shown in Figure 2. In T7, the core consists of several copies each of three proteins (gp14, gp15, and gp16), which must interact, albeit perhaps loosely, with themselves and each other, in the phage capsid. However, during their ejection into the cell they must at least partially dissociate and also partially unfold in order to pass through the narrow channel through the portal and tail (Molineux, 2006b). The N-terminal domain of gp16 refolds spontaneously in the periplasm of the infected cell to form an enzyme with lytic transglycosylase activity (Moak and Molineux, 2000, 2004). This muralytic activity is only conditionally essential for T7 infection (Moak and Molineux, 2000).

FIGURE 2

Phage T7: genome, virion, and interactome. ORFeome and protein interaction map of bacteriophage T7. Cloned open reading frames (ORFs) and random fragments were used to generate a two-dimensional interaction map. ORFeomes can also be used by structural genomics projects to derive three-dimensional structures. A combination of interaction maps and crystal structures often allows reconstruction of protein complexes as in viral particles or their subunits. (Top) T7 genome with each ORF represented as a box. ORFs indicated by integral numbers were identified originally as essential by genetic screens. Other genes have decimal numbers; of these, only 2.5 and 7.3 are essential. Proteins found to interact with other proteins are indicated by colored boxes; colors indicate a simplified assignment to functional classes as shown. Interactions between proteins are shown as lines, with self-interactions shown as dimers. Hatched lines indicate expected interactions that have not been found by two-hybrid analysis. Gray proteins correspond to host proteins (E. coli). Blue proteins are involved in virus assembly and structure, and, where known, their location in the virus particle is shown on the right. Proteins with thick borders have been crystallized and their structure determined. Genetic map modified after Dunn and Studier (1983). Figure modified after Uetz et al. (2004). Interactions based primarily on data in Table II (see references therein).

The C-terminal residues of gp16, likely together with a more central portion of gp15, also refold in the periplasm and then insert spontaneously into the cytoplasmic membrane, completing the channel across the cell envelope for DNA translocation into the cell. Likely the last of the core proteins to exit the phage head, gp14 is found exclusively within the outer membrane of the infected cell. Using energy derived from the membrane potential, these proteins then ratchet the leading 1 kb of the T7 genome into the cell (Chang et al., 2010; Kemp et al., 2005). In both T7 and T3, genome internalization by this process stops, and the remainder of the DNA is normally pulled into the cytoplasm as a consequence of host and then phage RNA polymerase-catalyzed transcription (Garcia and Molineux, 1995, 1996; Kemp et al., 2004). Several strong promoters lie on this leading 1-kb segment of the T7 genome.

3. Early genes

Transcription from T7 early promoters not only internalizes the remainder of the genome but also leads to expression of early T7 genes, some of which interact with and inhibit or modulate the activity of host proteins (Molineux, 2006a). Of the early proteins, only T7 RNA polymerase is essential for phage growth. The first T7 protein to be synthesized is gp0.3, an anti-type I restriction protein (Studier, 1975). The structure of gp0.3 reveals that it resembles double-stranded DNA, and the protein binds to the specificity subunit of the heteropentameric EcoKI enzyme, preventing both it and the cognate modification enzyme from recognizing its target sequence (Atanasiu et al., 2001, 2002; Kennaway et al., 2009; Murray, 2000; Stephanou et al., 2009). T7 gp0.3 is active against all type I enzymes and thus T7 grows normally on B, C, and K-12 strains of E. coli regardless of its previous host. However, the protein affords no protection against type II and little, if any, against type III enzymes. Gp0.3 proteins of different T7-like phages fall into two distinct groups with no sequence similarity; both have anti-restriction activity but the group typified by T3 gp0.3 also has SAMase activity (Studier and Movva, 1976). The second major early protein whose function is known is gp0.7, a serine-threonine kinase (Rahmsdorf et al., 1974) that is also responsible for the shut off of host transcription. The two activities are separable (Simon and Studier, 1973). Gp0.7 phosphorylates several host proteins, including RNA polymerase (Severinova and Severinov, 2006; Zillig et al., 1975), the translational apparatus including EF-G and the small ribosomal subunit protein S6 (Robertson and Nicholson, 1990, 1992; Robertson et al., 1994), and both RNase III (Mayer and Schweiger, 1983) and RNase E (Marchand et al., 2001). T7 shuts down translation of host mRNAs but it remains unclear whether EF-G or S6 is involved directly. An unknown T7 late gene also modulates the specificity of RNase E (Savalia et al., 2010). Although it is likely that others do so, only 1 other of the 10 early T7 proteins is currently known to interact with host proteins. Gp1.2 binds to E. coli dGTPase, converting it into a rGTP-binding protein (Huber et al., 1988; Nakai and Richardson, 1990); gp1.2 also binds to the membrane protein FxsA and the F plasmid protein PifA (Cheng et al., 2004; Schmitt and Molineux, 1991; Wang et al., 1999).

4. T7 RNA polymerase

The hallmark of T7 is of course its phage-encoded RNA polymerase, which, together with its regulator T7 lysozyme, is responsible for most T7 gene expression. It is also essential for synthesizing RNA primers at the primary origin of replication (Fujiyama et al., 1981) and for initiation of DNA packaging (Zhang and Studier, 1997, 2004). In addition to several complexes containing a promoter and product RNAs (Cheetham and Steitz, 1999; Cheetham et al., 1999; Kennedy et al., 2007; Tahirov et al., 2002; Yin and Steitz, 2002, 2004), the crystal structure of the RNA polymerase–lysozyme complex has been determined (Jeruzalmi and Steitz, 1998). Many mutants in both RNA polymerase and lysozyme have been characterized that affect their functional interaction, but it is not obvious from the crystal structures how the mutations affect the regulation of transcription. Lysozyme is thought to target the initial transcribing complex in a promoter strength-dependent manner and have no effect on transcription elongation (Huang et al., 1999; Stano and Patel, 2004; Zhang and Studier, 1997, 2004). Lysozyme is, however, important in pausing or terminating transcription at the class II terminator CJ, where it probably interacts directly with the T7 terminase protein gp19 and thus directs the packaging machinery to create the DNA terminus (Lyakhov et al., 1997, 1998; Zhang and Studier, 2004).

After T7 RNA polymerase has been synthesized, there is no further requirement for the host polymerase and it actually becomes inhibitory to phage development. Inactivation is caused by the T7 gp2 protein, which binds to the β′ subunit and inhibits transcription by interfering with strand separation of the open complex at most promoters (Camara et al., 2010; Nechaev and Severinov, 1999). In vivo, the only T7 promoter that must be inhibited by gp2 is A3 (Savalia et al., 2010). Gp2 was found to interact with T7 endonuclease gp3 by a yeast two-hybrid screen (Bartel et al., 1996). An interaction between gp2 and gp3 (albeit not necessarily direct) was suggested earlier (Mooney et al., 1980), but the idea has not yet been critically evaluated experimentally.

5. DNA replication

T7 has been one of the primary model systems for understanding the biochemistry of DNA replication. Central to T7 replication is T7 DNA polymerase, which comprises a heterodimer of gp5 and the host thioredoxin. Binding of gp5 to thioredoxin is strong, with the latter serving as the processivity factor for DNA replication, increasing the length of DNA synthesized per binding event 100-fold. The interface between the two proteins also provides the docking site for binding of the other two proteins, SSB (gp2.5) and the primase–helicase gp4, which are central to T7 replication in vivo. The process of T7 replication in vitro has been reviewed recently (Hamdan and Richardson, 2009).

Two other proteins in the DNA replication cluster are known to interact with host proteins. The gene 5.5 product binds to the abundant nucleoid-associated protein H-NS (Ali et al., 2011) and in vitro relieves its inhibition of both E. coli and T7 RNA polymerase-catalyzed transcription (Studier, 1981). In vivo, T7 5.5 mutants direct reduced rates of DNA replication and exhibit a small plaque phenotype (Lin, 1992; Liu and Richardson, 1993; Studier, 1981). However, the phenotype persists in hns-defective strains, although an hns mutant that renders gene 5.5 essential has been isolated (Liu and Richardson, 1993). Gp5.5 also interacts with the proteins of the λ RexAB system; it is not known whether this is a direct protein–protein interaction or whether gp5.5 interacts with a product of RexAB activity. A gene 5.5 missense, but not a null mutant, makes T7 sensitive to Rex-mediated exclusion. The product of gene 5.9 inhibits the E. coli RecBCD nuclease in a comparable manner to λ Gam protein; this interaction is also nonessential as mutants lacking gene 5.9 grow normally in rec(BCD)⁺ hosts (Lin, 1992).

The assembly of T7 virions involves several symmetry mismatches and many protein–protein interactions, some of which may be helped by other T7 proteins that potentially have a chaperone-like activity (Kemp et al., 2005). Note that, GenBank annotations notwithstanding, gp13 is not a structural component of the virion (Kemp et al., 2005). Assembly is thought to initiate with oligomerization of the portal protein gp8 to form a dodecameric ring, although both 12- and 13-mers are observed following overexpression of gene 8 (Cerritelli and Studier, 1996b; Kemp et al., 2005; Valpuesta et al., 1992, 2000). The pathway of prohead formation may then involve assembly of the internal core (gp14, gp15, and gp16) on the inner surface of the portal, perhaps concomitant with the recruitment of capsid protein gp10 hexamers bound to scaffolding protein gp9. Subsequent recruitment of gp10 pentamers and hexamers results in shell formation around a scaffold of gp9 (Cerritelli et al., 2003a). The 12-fold symmetrical portal protein faces a symmetry mismatch with the 5-fold vertex of the icosahedral capsid shell. The internal core of T7 contains 4 copies of gp16, 8 copies of gp15, and either 10 or 12 copies of gp14 (Agirrezabala et al., 2007; Cerritelli et al., 2003b; Kemp et al., 2005). These proteins are ejected into the infected cell in order to transport the phage genome into the cytoplasm, and the protein–protein interactions in the mature virion are likely quite different than those found in the infected cell. Gp14 lies at the base of the internal core and is therefore juxtaposed to the 12-fold symmetrical portal protein. Complicating our understanding of the T7 virion is the identification of ≈18 copies of the small gp6.7 protein, which is ejected into the outer membrane of the infected cell at the initiation of infection (Kemp et al., 2005). It is not known where gp6.7 is located; it likely forms part of the internal core or lies within the portal channel.

6. Tail

The outer face of the portal is the site of assembly of the sixfold symmetrical stubby tail. The tail is composed of three proteins (gp7.3, gp11, and gp12), to which six trimers of the tail fiber protein gp17 are attached. Each trimer consists of three domains. The N-terminal third binds to the tail; this domain is conserved in phages related to T7 that have a different host range. It is followed by a triple-stranded coiled-coil of ≈120 residues and then a C-terminal half consisting of globular domains that interact with the cell surface lipopolysaccharide (Steven et al., 1988). The body of the tail consists of 12 copies of gp11, 6 copies of gp12, and ≈30 copies of the small protein gp7.3 (Kemp et al., 2005). How these proteins are arranged is unknown. It has been suggested that gp7.3 functions as a tail assembly protein that remains with the mature virion; the protein is ejected into the outer membrane of the infected cell (Chang et al., 2010; Kemp et al., 2005). Host range mutants of T7 suggest that all three proteins interact directly with the lipopolysaccharide of the host cell.

7. Interactome screening

T7 was the focus of the first genome-wide screen for phage protein–protein interactions using the yeast two-hybrid assay (Bartel et al., 1996). As proof of principle, T7 was a good choice: the extensive genetic, molecular biology, and biochemical studies that had been performed on T7 provided a solid basis for evaluating the accuracy and completeness of the approach. Twenty-five interactions were identified in the screen, 15 of which were intramolecular. Many of these are to be expected as even monomeric polypeptides fold into a three-dimensional structure. Virion assembly clearly involves extensive protein–protein interactions, but as discussed by the authors, many of those that were known or plausibly expected from known structures were missed in the screen (Bartel et al., 1996). Conversely, screen-predicted interactions involving gp1.7 were confirmed subsequently by its purification as an oligomer (Tran et al., 2010), and the predicted interactions between the spanin proteins gp18.5 and gp18.7 predated the discovery of a heterodimer even in phage λ (Casjens et al., 1989). Six intermolecular interactions were predicted that have not yet found confirmatory support, and our current understanding of T7 suggests that some are implausible. Improved screening procedures should help resolve these outstanding issues and, if more intermolecular interactions can be found that are consistent with, in particular, the pathways of virion assembly, the utility of those screening procedures in predicting the protein interactome of lesser-studied phages will be enhanced greatly. All published T7 interactions are summarized in Table II.

TABLE II

Protein–protein interactions of phage T7

Protein	Function	Interacts with	References
Class I
gp0.3	B-DNA mimic	Host HsdRMS	Atanasiu et al., 2001, 2002; Kennaway et al., 2009; Murray, 2000; Stephanou et al., 2009
gp0.3		gp0.3 (dimer)	Atanasiu et al., 2001
gp0.3		gp4.5	Bartel et al., 1996
gp0.7	Protein kinase	Host RNAP (activity change)	Rahmsdorf et al., 1974; Severinova and Severinov, 2006; Zillig et al., 1975
gp0.7		RNase III (activity change)	Mayer and Schweiger, 1983
gp0.7		RNase E (activity change)	Marchand et al., 2001
gp0.7		Translation components	Robertson et al., 1994; Robertson and Nicholson, 1990, 1992
gp0.7		gp4.5	Bartel et al., 1996
gp1	T7 RNA polymerase	gp3.5	Jeruzalmi and Steitz, 1998; Zhang and Studier, 2004
gp1		gp19 (genetic)	Zhang and Studier, 1995, 2004
gp1.2	E. coli dGTPase inhibitor; F-exclusion	Host Dgt (dGTPase)	Huber et al., 1988; Nakai and Richardson, 1990; Saito and Richardson, 1981
gp1.2		F plasmid PifA	Cheng et al., 2004; Schmitt and Molineux, 1991; Wang et al., 1999
gp1.2		Host FxsA	Cheng et al., 2004; Wang et al., 1999
Class II
gp1.7	Nucleotide kinase	gp1.7 (oligomer)	Tran et al., 2010; Bartel et al., 1996
gp2	E. coli RNAP inhibitor	Host RNAP	Camara et al., 2010; Nechaev and Severinov, 1999
gp2		gp3	Bartel et al., 1996
gp2.5	SSB	gp5 (with thioredoxin)	Hamdan et al., 2005; He et al., 2003; Ghosh et al., 2008, 2010; Marintcheva et al., 2009
gp2.5		gp4	Ghosh et al., 2008; Hamdan et al., 2005; He and Richardson, 2004
gp2.5		gp2.5 (dimer)	Rezende et al., 2002; Bartel et al., 1996
gp3	Endonuclease I, HJ resolvase	gp3 (dimer)	Hadden et al., 2007; Parkinson and Lilley, 1997
gp3.5	Amidase (lysozyme)	gp1	Jeruzalmi and Steitz, 1998; Zhang and Studier, 2004
gp4	gp4A primase-helicase; gp4B helicase	gp5 (with thioredoxin)	Hamdan et al., 2005; He and Richardson, 2004
gp4		gp2.5	Hamdan et al., 2005; He and Richardson, 2004
gp4		gp4 (hexamer)	Singleton et al., 2000; Bartel et al., 1996
gp4.5		gp0.7	Bartel et al., 1996
gp5	DNA polymerase	gp4 (with thioredoxin)	Ghosh et al., 2008; Hamdan et al., 2005; Bartel et al., 1996
gp5		gp2.5 (with thioredoxin)	Ghosh et al., 2008, 2010; He et al., 2003; Marintcheva et al., 2009; Hamdan et al., 2005
gp5		gp4.7	Bartel et al., 1996
gp5		gp6.5	Bartel et al., 1996
gp5		Host thioredoxin	Ghosh et al., 2008
gp5.3		gp5.3	Bartel et al., 1996
gp5.5		Host H-NS	Lin, 1992; Liu and Richardson, 1993
gp5.9	Inhibits RecBCD nuclease	Host RecBCD	Lin, 1992
Class III
gp6.7	Head completion; ejected virion protein	Other head proteins likely. gp13?	Kemp et al., 2005
gp7.3	Tail chaperone; ejected structural virion protein	gp7.3, gp11, gp12	Kemp et al., 2005
gp7.3		Host LPS	Kemp et al., 2005
gp8	Head–tail connector (portal) protein	gp19	Fujisawa et al., 1991; Morita et al., 1995
gp8		gp9/10/gp6.7/7.3/11/12/14 (structure)	Agirrezabala et al., 2005a, 2007; Cerritelli and Studier, 1996a; Liu et al., 2010
gp8		gp8 (dodecamer) (structure)	Agirrezabala et al., 2005b; Cerritelli and Studier, 1996b
gp9	Scaffolding protein	gp8, gp10 (structure)	Agirrezabala et al., 2007; Cerritelli and Studier, 1996a,b; Bartel et al., 1996
gp9		gp14/15/16 (structure)	—
gp10	Major capsid protein (-1 frame-shift -> minor gp10B)	gp8, gp9, gp14 (structure)	Agirrezabala et al., 2005a,b, 2007; Cerritelli and Studier, 1996a; Cerritelli and Studier, 1996b
gp10	Major capsid protein (-1 frame-shift -> minor gp10B)	gp10 (pentamer/hexamer) (structure)	Cerritelli and Studier, 1996a; Bartel et al., 1996
gp10		Host FxsA; F plasmid PifA	Cheng et al., 2004; Schmitt and Molineux, 1991; Wang et al., 1999
gp11	Tail protein	gp7.3, gp8, gp11, gp12 (structure)	Kemp et al., 2005; Agirrezabala et al., 2007; Liu et al., 2010
gp11		Host LPS	Kemp et al., 2005
gp11		gp2.5	Bartel et al., 1996
gp12	Tail protein	gp7.3, gp8, gp11, gp12 (structure)	Kemp et al., 2005; Liu et al., 2010
gp12		Host LPS	Kemp et al., 2005; Qimron et al., 2006
gp13	Virion chaperone	gp6.7, gp8, gp12, gp14, gp17, gp19 (genetics)	Kemp et al., 2005; Stone and Miller, 1985
gp14	Ejected core protein	gp8, gp14, gp15, gp16 (structure)	Agirrezabala et al., 2007; Liu et al., 2010
gp15	Ejected core protein	gp14, gp15, gp16 (structure)	Agirrezabala et al., 2007; Chang et al., 2010; Liu et al., 2010; Bartel et al., 1996
gp16	Ejected core protein	gp14, gp15, gp16 (structure)	Agirrezabala et al., 2007; Chang et al., 2010; Liu et al., 2010
gp16		Peptidoglycan	Moak and Molineux, 2000, 2004
gp17	Tail fiber protein	gp7.3/11/12 (structure)	Agirrezabala et al., 2007; Chang et al., 2010; Liu et al., 2010
gp17		gp17 (trimer)	Steven et al., 1988; Bartel et al., 1996
gp17		Host LPS	Steven et al., 1988; Qimron et al., 2006
gp18	Small terminase	gp18 (octamer)	White and Richardson, 1987; Hamada et al., 1986
gp18.5–gp18.7	Rz-Rz1 homologues; spanin	gp18.5–gp18.7	Casjens et al., 1989; Summer et al., 2007; Bartel et al., 1996
gp19	Large terminase	gp8	Fujisawa et al., 1991; Hamada et al., 1986; Morita et al., 1994, 1995
gp19		gp19 (hexamer in the presence of prohead)	Fujisawa et al., 1991
gp19		gp4.7	Bartel et al., 1996

B. Coliphage λ

Phage λ may be the best-studied temperate phage and has been under investigation since the early 1950s. Probably thousands of papers have been published on phage λ and its closer relatives N15, HK97, and P22. For instance, there are more than 400 citations in a review of lambdoid phages that does not even focus on the genetic switch of λ, the paradigm for gene regulation (Hendrix and Casjens, 2006). The switch has been described in detail in many other papers and in its own book (Court et al., 2007; Ptashne, 2004). This section summarizes current knowledge of the interactions among phage λ proteins. Finally, a summary of interactions with its host, E. coli, is given. For a more detailed description of phage λ biology and regulation, the authors refer the reader to other reviews (Calendar, 2006; Court et al., 2007; Hendrix and Casjens, 2006; Hendrix et al., 1983; Oppenheim et al., 2005; Ptashne, 2004).

1. The λ virion

The phage λ genome and virion are shown in Figure 3. Note that the virion contains tail fibers that are not visible in most λ micrographs. The variant of λ used in most laboratories around the world, λPaPa, has a frameshift mutation in the structural stf gene and therefore lacks the long “side tail fibers” of Ur-λ, the λ that comes directly from the original E. coli K-12 lysogen. It is still not entirely clear how many different proteins are in the λ particle, but 12 are known with confidence and 2 more (gpM and gpL) are possibly present in the particle in very low numbers. A current model of the particle with its known proteins is shown in Figure 3. Based on this model, we can predict at least 23 PPIs in the virion (Table III), although several are inferred from genetic experiments and remain to be detected by biochemical or biophysical methods. The virion adsorbs to sensitive cells through an interaction between the tail tip protein gpJ and the host LamB outer membrane protein. DNA is then released through the tail into the cytoplasm.

FIGURE 3

Genome and virion of phage λ. (A) Genome of phage λ. Colored ORFs correspond to colored proteins in (B). Main transcripts are shown as arrows. After Hendrix and Casjens in Calendar (2006). (B) Schematic model of λ virion. Numbers indicate the number of protein copies in the particle. It is unclear whether gpM and gpL proteins are in the final particle or only required for assembly. (C) Electron micrograph of phage λ. Modified after Hendrix and Casjens (2006).

TABLE III

Published protein–protein interactions of phage λ

λ 1	λ 2	Notes	References
gpA	gpNu1	gpA (N-terminal) – gpNu1 (C-terminal)	Catalano, 2000; de Beer et al., 2002; Hang et al., 1999
gpA	gpB	gpA (C-terminal) – gpB (= portal)	Catalano, 2000; Duffy and Feiss, 2002
gpW	gpB	gpW (NMR, head–tail joining)	Casjens, 1974; Casjens et al., 1972
gpW	gpFII	gpW required for gpFII binding	Casjens, 1974; Maxwell et al., 2001
gpB	gpB	12-mer (22 amino acids removed from gpB N-terminal)	Kochan, Carrascosa and Murialdo, 1984; Tsui and Hendrix, 1980
gpC	gpE	Covalent PPI (in virion?)	Hendrix and Casjens, 1974; Hendrix and Duda, 1998
gpC′	gpB		Murialdo, 1979
gpC	gpNu3	gpC may degrade gpNu3 (before DNA packaging)	Hendrix and Casjens, 1975; Hohn et al., 1975; Medina et al., 2010; Murialdo and Becker, 1978
gpD	gpD	gpD forms trimers	Mikawa et al., 1996; Sternberg and Hoess, 1995; Yang et al., 2000
gpE	gpE	Major capsid protein	Casjens and Hendrix, 1974; Dokland and Murialdo, 1993; Lander et al., 2008
gpU	gpU	“Probably a hexamer”	Katsura and Tsugita, 1977
gpD	gpE		Casjens and Hendrix, 1974; Dokland and Murialdo, 1993; Lander et al., 2008
gpU	gpV	gpU is head-proximal tail protein	Pell et al., 2009
gpV	gpV		Buchwald et al., 1970a,b; Casjens and Hendrix, 1974; Katsura, 1983
gpO	gpP		Wickner and Zahn, 1986; Zahn and Blattner, 1985, 1987; Zahn and Landy, 1996
gpO	gpO	gpO–gpO interactions when bound to ori DNA	Alfano and McMacken, 1989
gpB	gpA	Genetic evidence	Frackman et al., 1985; Sippy and Feiss, 1992
gpFI	gpA	Genetic evidence	Catalano and Tomka, 1995
gpFI	gpE	Genetic evidence	Murialdo and Tzamtzis, 1997
gpH	gpG/gpGT	gpG/gpGT holds gpH in an extended fashion	Hendrix and Casjens, 2006
gpV	gpGT	T domain binds soluble gpV	Hendrix and Casjens, 2006
gpH	gpV	gpV probably assembles around gpH, displacing gpG/gpGT	Xu et al., 2004
CI	CI	Forms octamer that links O_R to O_L	Bell and Lewis, 2001; Dodd et al., 2001
Exo	Bet		Radding et al., 1971
SieB	Esc	Esc is encoded in frame in SieB and inhibits SieB	Ranade and Poteete, 1993a,b
CII	CII	Homotetramers	Ho and Rosenberg, 1982
Xis	Int		Sam et al., 2002
Xis	Xis	Xis-Xis binding mediates cooperative DNA binding	Sam et al., 2004
Int	Int	Dimer	Warren et al., 2003
Stf	Stf	Likely homotrimer by homology with T4 fibers
gpS	gpS	Large ring in inner membrane	Savva et al., 2008
gpS	gpS′	gpS′ inhibits gpS ring formation	Grundling et al., 2000
gpB	gpE	Copurify in procapsid	Hendrix and Casjens, 1975
gpNu3	gpNu3	Monomer–dimer equilibrium	C. Catalano, personal communication
CIII	CIII	Dimer	Halder et al., 2007
Cro	Cro	Dimer; X-ray structure	Anderson et al., 1979
gpRz	gpRz1	Heteromultimer supposed to span the periplasm	Summer et al., 2007
gpNu3	gpB	gpNu3 required for gpB incorporation into procapsid	Ray and Murialdo, 1975

2. Assembly of virion head (Fig. 4)

FIGURE 4

Assembly of the phage λ head. Head assembly has been subdivided into five steps, although most steps are not very well understood in mechanistic terms. Note that the tail is assembled independently. GpC protease, scaffolding protein gpNu3, and portal protein gpB form an ill-defined initiator structure. Protein gpE joins this complex in a step requiring the chaperonins GroES and GroEL. Proteins gpNu1, gpA, and gpFI are required for DNA packaging. GpD joins and stabilizes the capsid as a structural protein; gpFII and gpW connect the head to the tail, which joins once the head is completed. Modified after Georgopoulos et al. (1983) and Lander et al. (2008).

λ encodes 10 proteins required for head assembly, of which 6 end up in the mature virion. Head assembly starts with assembly of the portal, which requires the host GroEL/S chaperone for folding. It is thought that the portal serves as the nucleus for assembling the remaining capsid. However, the mechanistic details of assembly are not well understood. GpC is involved as a head maturation protease that becomes attached covalently to gpE, the main capsid protein. The resulting fusion proteins are subsequently trimmed to make slightly shortened products named X1 and X2 (Hendrix and Casjens, 1974). The authors note, however, that attempts to identify X1 and X2 were unsuccessful and thus X1 and X2 may be artifacts (Medina et al., 2010). GpNu3 acts as a putative scaffolding protein that aids coat shell assembly and is removed from the capsid by proteolysis before DNA is packaged. Structures of the procapsid and mature λ capsid have been solved by cryo-EM, showing the T=7 arrangement of capsid protein gpE clustered into hexamers and pentamers (Dokland and Murialdo, 1993; Lander et al., 2008). Trimers of the decoration protein gpD bind to the capsid at trivalent interaction points between the capsomers and stabilize the capsid (Sternberg and Weisberg, 1977). Comparison of the procapsid with the mature capsid reveals the considerable structural transitions that occur upon capsid maturation and DNA packaging, but this process is not well understood mechanistically.

3. Assembly of virion tail (Fig. 5)

FIGURE 5

Assembly of the phage λ tail. The λ tail is made of at least six proteins (gpU, V, J, H, tfa, stf) with another seven required for assembly (gpI, M, L, K, G/T, Z). Assembly starts with protein gpJ, which then, in a poorly characterized fashion, requires proteins gpI, gpL, gpK, and gpG/T to add the tape measure protein gpH. GpM joins and then gpG and gpG/T leave the complex so that the main tail protein gpV can assemble on the gpJ/H scaffold. Finally, gpU is added to the head-proximal end of the tail. GpZ is required to connect the tail to the preassembled head. Protein gpH is cleaved by the action of gpU and gpZ (Tsui and Hendrix, 1983). It remains unclear if proteins gpM and gpL are part of the final particle (Hendrix and Casjens, 2006). From http://www.pitt.edu/~duda/lambdatail.html.

Tail assembly begins with the antireceptor protein gpJ. Proteins gpI, gpK, gpL, and gpM assemble around gpJ in that order, although no mechanistic or structural details are known about this process. In parallel events, gpH, gpG, and gpG-T (the frameshifted form of gpG including the T reading frame) form a complex that binds the tail tip and the major tail protein (gpV). The tape measure protein gpH determines the length of the phage tail (Katsura, 1990). Once the correct length is achieved, gpV assembly stops and the proximal tail protein gpU is added. Protein gpZ somehow facilitates head–tail joining without being assembled into the final virion.

4. Lysogeny and induction

After phage DNA ejection, circularization, and ligation by the host DNA ligase, the earliest genes to be expressed include N, O, and P, as well as cII and cIII, which play a key role in the lysis/lysogeny decision (Echols, 1986). Both CII and CIII can interact with and be cleaved by the bacterial FtsH (HflB) protease directly, which in turn is complexed with the host HflK and HflC proteins at the cytoplasmic membrane. HflD, a cytoplasmic membrane protein, also binds CII, making it both more susceptible to FtsH and less able to bind DNA (Kihara et al., 2001; Parua et al., 2010). As a substrate for FtsH, CIII acts as a competitive inhibitor of CII degradation and, as the half-life of CII protein increases, leads to an increase in CI repressor synthesis (Datta et al., 2005a; Herman et al., 1997; Kobiler et al., 2007). Except when cIII is overexpressed, lysogeny is only established when the activity of these proteases (FtsH, Lon, and perhaps others) is low. Once established, lysogeny is a stable state until lysogens are exposed to ultraviolet light or other DNA-damaging agents that lead to the activation of RecA, which in turn accelerates autocleavage of the λ repressor CI. Loss of CI leads to prophage induction and the lytic cycle. The mechanism by which phage λ decides whether to grow lytically or become a lysogen is reviewed elsewhere (Court et al., 2007; Friedman and Court, 2001; Gottesman and Weisberg, 2004; Ptashne, 2006). However, Table IV contains the host–phage protein–protein interactions critical for gene expression and thus the genetic switch of phage λ.

TABLE IV

Published phage λ–host interactions^a

λ	Host	Description	References
Transcription
CI	RecA	RecA degrades CI^b	Kobiler et al., 2004
CI	RpoA		Kędzierska et al., 2007
CI	RpoD		Li et al., 1994; Nickels et al., 2002
CII	ClpYQ	ClpYQ degrades CII in vitro	Kobiler et al., 2004
CII	ClpAP	ClpAP degrades CII in vitro	Kobiler et al., 2004
CII	HflD	HflD makes CII more vulnerable to FtsH (hfl, high-frequency lysogenization)	Kihara et al., 2001
CII	HflB^c	HflB (= FtsH) protease degrades CII	Kobiler et al., 2002; Shotland et al., 2000
CII	RpoA		Kędzierska et al., 2004; Marr et al., 2004
CII	RpoD		Kędzierska et al., 2004
CIII	HflB	CIII inhibits HflB; HflB degrades CIII	Herman et al., 1997; Kornitzer et al., 1991
gpN	NusA	Transcriptional regulation	Friedman and Court, 2001
gpN	Lon	Lon degrades gpN	Kobiler et al., 2004; Gottesman et al., 1981
gpQ	σ⁷⁰	gpQ makes RNAP insensitive to cis terminal	Marr et al., 2001; Nickels et al., 2002; Roberts and Roberts, 1996; Roberts et al., 1998
Head
gpB	GroE	genetic interaction; E. coli GroE	Ang et al., 2000; Georgopoulos et al., 1973; Murialdo, 1979
gpE	GroE	genetic interaction	Georgopoulos et al., 1973
Tail
gpJ	LamB	LamB is the E. coli receptor	Buchwald and Siminovitch, 1969; Clement et al., 1983; Mount et al., 1968; Wang et al., 2000a; Werts et al., 1994
Stf	OmpC	OmpC is a secondary E. coli receptor	Hendrix and Duda, 1992
Recombination
Xis	Lon	Xis is degraded by E. coli Lon protease	Leffers and Gottesman, 1998
Xis	FtsH	Xis is degraded by E. coli FtsH protease	Leffers and Gottesman, 1998
Xis	Fis	both required for excision	Ball and Johnson, 1991; Cho et al., 2002; Esposito and Gerard, 2003
Int	IHF	Both catalyze recombination at attP/attB	Campbell et al., 2002; Crisona et al., 1999
Gam	RecB	Gam inhibits RecBCD	Marsic et al., 1993
NinB	SSB	NinB also binds ssDNA	Maxwell et al., 2005
Replication
gpO	ClpXP	ClpXP degrades gpO	Kobiler et al., 2004
gpP	DnaA		Datta et al., 2005b,c
gpP	DnaB		Mallory et al., 1990

^a CbpA perhaps interacts with DnaA and SeqA.

^b After DNA damage, RecA bound to single-stranded DNA stimulates autocleavage of the CI repressor.

^c HflB = FtsH.

5. Anti-termination

Escherichia coli RNA polymerase transcribes λ DNA starting from the immediate early promoters P_L and P_R, which leads to expression of the N and cro genes, among others. As soon as the gpN-binding sites (nut sites) are transcribed, gpN modifies the transcription complex by recognizing a stem-loop structure in the mRNA and by binding to host Nus proteins [note that gpN interacts with nut RNA (not DNA)]. This modified complex is capable of overriding the termination signal that normally stops immediate early transcription. Because gpN is continually being degraded by the Lon protease, ongoing anti-termination requires continual transcription of the N gene (Fig. 6). In addition to the anti-terminator for early lytic genes (gpN), a second anti-terminator, gpQ, is required for late gene expression. GpQ anti-termination differs from that of gpN in that gpQ binds to the “Q binding element” (QBE, also known as the qut site) of the P_R′ promoter (i.e., unlike the gpN protein, gpQ binds DNA). When RNA polymerase (RNAP) initiates transcription at the P_R′ promoter, it almost immediately pauses at a pause-inducing sequence. Pausing allows gpQ to contact σ⁷⁰ in the RNAP, which causes a conformational change in the enzyme that enables gpQ to bind the β subunit (RpoB). The gpQ-containing RNAP is then able to override the termination signal at t_R′ (Deighan et al., 2008; Nickels et al., 2002).

FIGURE 6

Interactions of phage λ with its host, E. coli. (A) Overview of λ activities in E. coli: (a) free phage, (b) binding and entry, (c) DNA circularization and supercoiling, (d) transcription, (e) replication, and (f) virion assembly. Details are shown in B–E. (B) Infection and DNA ejection, (C) DNA ligation, (D) gene regulation. An asterisk means weakly bound to the RNA polymerase. (E) DNA replication. Phage proteins are labeled in red, host proteins in black. Modified after Das (1992) and Mason and Greenblatt (1991).

6. Replication

After the λ gpO and gpP proteins have been synthesized, replication of the phage genome is initiated (Fig. 6). Replication initiation starts with dimeric gpO bound to the four 18-bp direct repeats of the λ origin, located within the sequence of the O gene. Each repeat binds a gpO dimer, leading to a total of eight monomers bound directly to the DNA, in addition to non-DNA-bound gpO that associates with the complex solely by PPIs (Dodson et al., 1986). This large nucleoprotein complex is called the O-some. GpP extracts the host DnaB helicase from its native complex with DnaC (Zylicz et al., 1989). The DnaB helicase is inactive as long as it is tightly associated with gpP. The gpP–DnaB pair is then added to the O-some through interactions between gpP and gpO. Three host heat shock chaperone proteins (DnaK, DnaJ, and GrpE) are subsequently recruited (DnaK through direct interaction with gpP and gpO) and activate the DnaB helicase by liberating it from its inhibitory interaction with gpP. Finally, gpP and the chaperones are released, and replication starts with the help of the host DnaG primase, DNA polymerase III, gyrase, and single-stranded DNA-binding (SSB) proteins (Dodson et al., 1986; LeBowitz and McMacken, 1986). The RNA polymerase β subunit was shown to interact directly with gpO (Szambowska et al., 2010). Although not yet fully understood, the mechanism by which E. coli RNA polymerase activates λ ori seems to be far more complex than the original idea that it simply changes DNA topology.

Studies on λ ori-dependent plasmids have revealed interactions with even more host proteins, including the DnaJ homolog CbpA (Wegrzyn et al., 1996), and through transcriptional activation of origin activity, perhaps also with DnaA and SeqA (Potrykus et al., 2002; Wegrzyn and Wegrzyn, 2001, 2002). As λ ori-dependent plasmid maintenance and copy number are very sensitive to changes in host physiology, interactions of the O-some with additional proteins may yet be revealed.

7. The λ interactome

A λ Y2H interactome analysis has been completed (Rajagopala et al., 2011). All λ open reading frames (ORFs) were cloned and tested in all possible pairs for interactions, using three different Y2H vectors. These screens identified 97 interactions among 68 tested λ proteins, of which 16 were known previously. Notably, because only 33 interactions had been reported previously, this screen recovered almost half (16 of 33) of all previously published interactions in a single experiment. At least 18 of the newly observed Y2H interactions seem to be plausible, based on the functions of the interacting proteins, indicating that even in λ many new discoveries can be made. Interestingly, relatively few interactions involving morphogenetic proteins were revealed, and possible reasons for the still significant fraction of apparent false negatives are discussed later.
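
The bookkeeping behind these figures can be checked with a short Python sketch. The counts (68 proteins, 97 Y2H interactions, 16 of them previously known, 33 published interactions) are taken from the text. The function name and the assumption that every unordered pair, including a protein paired with itself, counts once as a testable combination are illustrative only.

```python
from math import comb


def screen_summary(n_proteins, n_detected, n_rediscovered, n_published):
    """Summarize a pairwise interaction screen against the literature.

    Assumption (not from the source): each unordered pair, including a
    protein paired with itself, is counted once as a testable combination.
    """
    testable_pairs = comb(n_proteins, 2) + n_proteins   # hetero + homo pairs
    novel = n_detected - n_rediscovered                 # not reported before
    recovered_fraction = n_rediscovered / n_published   # literature recovered
    return {
        "testable_pairs": testable_pairs,
        "novel": novel,
        "recovered_fraction": recovered_fraction,
    }


summary = screen_summary(n_proteins=68, n_detected=97,
                         n_rediscovered=16, n_published=33)
print(summary["testable_pairs"])               # 2346
print(summary["novel"])                        # 81
print(f"{summary['recovered_fraction']:.1%}")  # 48.5%
```

The 81 novel interactions correspond to the high-throughput count listed for λ in the overview table at the start of this chapter.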

C. Salmonella phage P22

P22 infects Salmonella enterica serovar Typhimurium and was originally isolated by Zinder and Lederberg (1952). It is a temperate member of Podoviridae. Its virion proteins are not particularly closely related to other tailed phage groups, but its early genes and overall lifestyle are very closely related to phage λ. For this reason, it is often placed in the “lambdoid” phage group. Phage P22 is also a very important tool for those who use Salmonella genetics, as it is particularly easy to handle in the laboratory and is a very good generalized transducing phage (i.e., it packages a significant amount of host DNA by accident and can be used to move host alleles from one Salmonella strain to another).

1. P22 life cycle

The early operons of P22 are organizationally similar to those of λ and carry parallel prophage repressor, Cro, recombination, DNA replication, and transcription anti-termination functions (Susskind and Botstein, 1978). The P22 repressor has a different DNA operator target specificity from that of λ repressor, and P22 encodes nonhomologous types of recombination and replication proteins, but its late operon anti-termination protein is 97% identical to that of λ and has the same target specificity. Unlike λ, in addition to its prophage repressor (C2) and lysogeny control proteins (C1, C3, Cro), P22 carries an “immunity I” region that encodes an anti-repressor protein, Ant, that binds to the prophage repressor and blocks its ability to bind its operator. Two additional repressors, Mnt and Arc, regulate Ant synthesis (Susskind and Botstein, 1975; Vershon et al., 1987a,b). Also, instead of a protein that recruits the host DnaB protein to the replication origin, P22 encodes its own DnaB homologue (Backhaus and Petri, 1984; Wickner, 1984). The repressed P22 prophage expresses several “lysogenic conversion” genes, among them a set of three that encode the Gtr (glucose transfer) proteins, which modify the host O-antigen polysaccharide by glucosylation (Broadbent et al., 2010; Makela, 1973; Vander Byl and Kropinski, 2000). Finally, P22 has been studied as a prototypical headful DNA packaging phage, which is different from the packaging strategies used by the other phages discussed here; unlike these other phages, which package DNA between two specific nucleotide sequences in the DNA substrate, P22 packages DNA until the head is full, and the nucleotide sequence at the termination point does not matter (Casjens and Hayden, 1988; Jackson et al., 1978; Tye et al., 1974).
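
The difference between sequence-specific and headful packaging can be illustrated with a toy Python sketch. It treats the replicated DNA as a concatemer of identical genome copies and cuts it either at the same position in every copy or after a fixed "headful" slightly longer than one genome. The genome length of 100 units and the headful of 104 units are arbitrary illustrative values, not measurements, and the function names are hypothetical.

```python
def site_specific_packaging(genome_len, n_heads):
    """Cut a concatemer at the same sequence position in every genome copy.

    Returns (start, end) genome coordinates of the DNA in each head.
    """
    return [(0, genome_len) for _ in range(n_heads)]


def headful_packaging(genome_len, headful_len, n_heads, start=0):
    """Package sequentially along a concatemer, cutting when the head is full.

    Coordinates are reported modulo the genome length, so they show where
    on the genome map each packaged molecule begins and ends.
    """
    heads = []
    position = start  # absolute position along the concatemer
    for _ in range(n_heads):
        end = position + headful_len
        heads.append((position % genome_len, end % genome_len))
        position = end  # the next head starts where this one was cut
    return heads


GENOME = 100   # illustrative genome length (arbitrary units)
HEADFUL = 104  # illustrative headful, slightly more than one genome

print(site_specific_packaging(GENOME, 3))
# [(0, 100), (0, 100), (0, 100)]
print(headful_packaging(GENOME, HEADFUL, 3))
# [(0, 4), (4, 8), (8, 12)]
```

In this toy model every headful molecule starts at a different point on the genome map, which is the sense in which the sequence at the cut site does not matter; only the amount of DNA packaged determines where the next cut falls.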

2. P22 virion assembly

The P22 virion is assembled by proteins encoded by 14 genes in its late operon (Fig. 7) (Botstein et al., 1973; Eppler et al., 1991; King et al., 1973; Poteete and King, 1977; Youderian and Susskind, 1980). Some P22-like phages, for example, phages ε34 and L, have a 15th virion assembly gene, dec, at the promoter proximal end of the cluster, whose encoded protein “decorates” the outside of the head and stabilizes it (Gilcrease et al., 2005; Tang et al., 2006; Villafane et al., 2008). The P22 late operon encodes lysis proteins very similar to those of λ; however, the virion assembly genes are not closely related to λ or to any other tailed phage type. P22 virion assembly is best known for the fact that its head-assembling scaffolding protein was the first to be discovered and studied (Cortines et al., 2011; King and Casjens, 1974; Prevelige et al., 1988; Weigele et al., 2005). The P22 procapsid assembles as a complex of 415 coat protein molecules, 12 portal ring subunits, and about 250 scaffolding protein molecules. At about the time when DNA is packaged, all 250 scaffolding proteins are released intact from the procapsid and are reused with newly synthesized coat protein in subsequent rounds of procapsid assembly. Thus, the P22 scaffolding protein acts catalytically in head assembly and individual molecules can participate in at least five rounds of procapsid assembly. A particularly high-resolution asymmetric cryo-EM reconstruction of the virion has been determined (Lander et al., 2006; Tang et al., 2011), and because assembly of the four proteins of its short tail has also been quite well studied [X-ray structures are known for three tail proteins and the portal protein (Olia et al., 2006, 2007; Steinbacher et al., 1997)], the P22 virion is perhaps the best understood of all tailed phage virions. 
The details of P22 virion assembly have been reviewed in light of its evolution and genome mosaicism (Casjens and Thuman-Commike, 2011), and the authors refer the reader to this review for further details. The published PPIs of phage P22 are summarized in Table V. Interactions between P22 and its host are much less well studied than those of λ and are mostly understood by homology arguments in the case of proteins that are similar in the two phages. Known P22–host interactions are listed in Table VI.
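
The figure of 415 coat protein molecules quoted above is consistent with simple icosahedral bookkeeping: a T=7 shell has 60 × 7 = 420 subunit positions, and the portal ring takes the place of the five subunits at one fivefold vertex. The Python sketch below spells out this arithmetic. The text gives only the totals, so the single portal vertex is stated here as an explicit assumption of the calculation, and the function names are illustrative.

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    return h * h + h * k + k * k


def coat_subunits(h, k, portal_vertices=1):
    """Coat subunits in an icosahedral shell with some vertices given to a portal.

    A complete shell has 60*T subunits; each fivefold vertex replaced by a
    portal removes one pentamer (5 subunits).
    """
    t = triangulation_number(h, k)
    return 60 * t - 5 * portal_vertices


print(triangulation_number(2, 1))  # 7
print(coat_subunits(2, 1))         # 415
```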

FIGURE 7

Genome and virion of phage P22. (A) Genome of phage P22 with a scale in kbp below. Green open reading frames (ORFs; rectangles) are transcribed left to right; red ORFs are transcribed right to left. Known RNA polymerase promoters and transcripts are shown as black flags and arrows, respectively. (B) The asymmetric (not icosahedrally averaged) P22 virion three-dimensional cryo-EM reconstruction from Tang et al. (2011). Numbers in parentheses indicate molecules/virion, and bold numbers indicate proteins for which X-ray structures are known. (C) Negatively stained electron micrograph of P22 virions.

TABLE V

P22 intraphage protein interactions

P22 1	P22 2	Description	References
gp3	gp3	Small terminase subunit; homononamer in solution	Nemecek et al., 2007, 2008
gp3	gp2	Purify as complex; bind each other in vitro	Nemecek et al., 2007; Poteete and Botstein, 1979
gp2	gp1	gp2 is large terminase subunit; assembles in packaging motor to gp1 by analogy with other phages (gp2 by itself is monomer in solution); evolutionary argument for P22 gp2–gp1 interaction	Casjens, 2011
gp1	gp1	Portal protein; homododecamer; EM of 12-mer; EM of virion; X-ray structure	Bazinet et al., 1988; Lander et al., 2006; Olia et al., 2011
gp1	gp5	Purify as complex (procapsids); three-dimensional (3D) reconstructions of virion and procapsid	Botstein et al., 1973; Chen et al., 2011; Lander et al., 2006
gp1	gp8	gp8 required for assembly of gp1 into procapsid; interaction seen in procapsid 3D reconstruction	Chen et al., 1989; Greene and King, 1996; Weigele et al., 2005
gp8	gp8	Scaffolding protein; forms homodimers and tetramers in solution	Parker et al., 1997
gp8	gp5	Purify as complex (procapsid); bind in vitro	Fuller and King, 1982; Greene and King, 1994; Parker et al., 1998
gp5	gp5	Coat (major capsid) protein; forms T=7 l icosahedral shell of virion	Botstein et al., 1973; Casjens, 1979; Lander et al., 2006
gp5	gp1	Purify as complex (procapsids); interaction seen in 3D reconstruction	Chen et al., 2011; Lander et al., 2006
gp4	gp4	Tail protein; monomer in solution (naturally unfolded?); gp4s barely touching each other in dodecamer ring in virion (i.e., contacts not large); EM, X-ray structure	Olia et al., 2011; Tang et al., 2011
gp4	gp5	Contact in high-resolution 3D reconstruction of virion	Olia et al., 2006, 2011; Tang et al., 2011
gp4	gp1	Order of assembly, EM, 3D reconstruction; X-ray cocrystal structure	Olia et al., 2006, 2011; Tang et al., 2011
gp4	gp10	12 gp4s contact 6 gp10s; order of assembly, EM, 3D reconstruction	Lander et al., 2006; Olia et al., 2007
gp4	gp9	gp4 ring contacts 6 gp9 trimers; 3D reconstruction	Lander et al., 2006
gp10	gp10	Tail protein; probably hexamer in solution; 3D reconstruction	Lander et al., 2006; Olia et al., 2007
gp10	gp9	6 gp10s contact 6 gp9 trimers; 3D reconstruction	Lander et al., 2006
gp10	gp26	6 gp10s contact one gp26 trimer	Lander et al., 2006
gp26	gp26	Tail needle protein; trimer in solution - crystal structure; one trimer in virion	Olia et al., 2007
gp16	gp8	Mutants of gp8 do not incorporate gp16 into virion	Chen et al., 1989; Greene and King, 1996; Weigele et al., 2005
C1	C1	C1 is homologue of λ CII; homotetramer	Ho et al., 1992
Mnt	Mnt	Repressor of antirepressor gene; tetramer	Nooren et al., 1999
Arc	Arc	Repressor of antirepressor gene; dimer in solution, tetramer on DNA	Bowie and Sauer, 1989; Breg et al., 1990; Brown, Bowie and Sauer, 1990
Ant	C2	Antirepressor; binds to and inhibits C2 repressor	Susskind and Botstein, 1975
gp9	gp9	Tailspike protein; trimer in solution, 6 trimers in virion; crystal structure; binds to polysaccharide on cell surface	Lander et al., 2006; Steinbacher et al., 1994, 1997
Int	Xis	Form heteromultimer	Cho et al., 2002
Xis	Xis	Excisionase; homomultimers	Mattis et al., 2008
Erf	Erf	Recombination protein; homomultimeric ring of 10–14 subunits	Poteete et al., 1983
C3	C3	C3 is homologue of λ CIII; λ CIII is dimer by disulfide bond; P22 interaction by homology with λ CIII	Halder et al., 2007
SieB	Esc	Similar to λ but not studied in detail	Ranade and Poteete, 1993a
C2	C2	Prophage repressor homologue of λ CI; homodimer	De Anda et al., 1983
Cro	Cro	Dimer; solution studies and crystal structure	Darling et al., 2000; Newlove et al., 2004
gp18	gp12	Origin binding protein gp18 recruits DnaB-like gp12 to replication origin by analogy with λ (note that gp12 is not a homologue of λ P protein; gp18 is homologous to λ O protein only in N-terminal domain)	Backhaus and Petri, 1984
Rz	Rz1	Rz and Rz1 are supposed to bind to one another and form a bridge that spans the periplasm in phage λ; P22 gp15 and Rz1 should do the same by homology.	Summer et al., 2007
gp13	gp13	Holin forms rings of large diameter in λ; by homology same should be true of P22 holin	Savva et al., 2008
gp13	gp13′	Antiholin–holin interactions delay the formation of holin rings in λ; by homology same should be true in P22	Rennell and Poteete, 1985
Dec	Dec	Trimer; note that P22 has no Dec, but its very close relative phage L does, and L Dec binds to P22 capsids.	Tang et al., 2006
Dec	gp5	Dec occupies a subset of the quasi-equivalent local threefold positions on the shell	Tang et al., 2006

TABLE VI

P22–host protein interactions

P22	Host	Description	References
gp5	GroEL	Solution studies	de Beus et al., 2000
gp24	NusA? etc?	Homologue/analogue of λ gpN; likely binds same host factors as λ gpN	Franklin, 1985; Hendrix and Casjens, 2006
C3	FtsH(HflB)	C3 is homologue of λ CIII; inhibition of FtsH by homology with λ CIII	Semerjian et al., 1989
C1	FtsH(HflB)	C1 is homologue of λ CII; CII is degraded by FtsH	Backhaus and Petri, 1984
C2	RecA	P22 repressor autocleavage stimulated by RecA binding like λ	Prell and Harvey, 1983; Tokuno and Gough, 1976
Xis, gp18, gp24, C1	Host protein degradation machinery	These may be degraded analogously to their λ homologues, but direct information is not available	Backhaus and Petri, 1984
gp23	RNA polymerase?	By homology to λ; gp23 is nearly identical to λ gpQ	Pedulla et al., 2003
gp12	Host replication machinery	Possible interaction with host DnaG primase	Wickner, 1984
Abc2	RecC	Solution studies	Murphy, 2000

3. Comparison of P22 and λ interactomes

As noted earlier, comparison of P22 and λ is a particularly informative way to understand the kinds of relationships among the proteins of different tailed phages. The “lambdoid” phages all encode similar functions and have similar gene organization, but these “similar functions” can be of several different types. Because these phages undergo a significant amount of horizontal transfer of genetic information (especially if they are “closely” related), their functions can be homologous or even nearly identical (Casjens, 2005; Hendrix, 2002). For example, the two “gpQ” late operon control anti-termination proteins of the two phages are nearly identical and can replace one another in function. A second kind of relationship is represented by the prophage repressors. These proteins from the two phages have weak but recognizable homology and similar folds, and probably interact with host proteins the same way, but have different operator-binding specificity. Thus, one repressor cannot functionally replace the other without additional changes in the DNA-binding sites of the protein. A third kind of relationship is that of the virion coat, terminase, and portal proteins, λ gpE, gpA, and gpB and P22 gp5, gp2, and gp1, respectively. These proteins have the same functions in the two phages (coat protein, DNA packaging nuclease/motor ATPase, and procapsid DNA entry channel, respectively; summarized in Hendrix and Casjens, 2006). These cognate proteins are not recognizably similar in overall amino acid sequence between the two phages, although the two ATPase proteins have similar ATP-binding motifs. However, the two members of each pair almost certainly have the same basic fold. Thus, gpE and gp5 and gpA and gp2, as well as gpB and gp1, are all very likely ancient homologous pairs that have diverged beyond the point of having recognizable sequence similarity. Finally, similar functions are performed by nonhomologues in the two phages. 
Examples of this relationship are (i) the nonhomologous Exo (λ) and Erf (P22) proteins that catalyze homologous recombination in completely different ways (Little, 1967; Poteete and Fenton, 1983); (ii) the endolysin proteins gpR and gp19, where the λ protein is a transglycosylase and the P22 protein is a true lysozyme, although both cleave the same peptidoglycan bond (Bienkowska-Szewczyk et al., 1981; Rennell and Poteete, 1989); (iii) the gpW and gp4 proteins, which bind between the portal ring of the head and the tail and yet have no recognizable homology or structural similarity; and (iv) the λ receptor binding “fiber” (gpJ and Stf) and P22 tailspike (gp9), which have no homology, recognize different structures, and act in different ways [the P22 gp9 enzyme cleaves its polysaccharide receptor (Iwashita and Kanegasaki, 1976)]. Thus, there are potentially a number of different ways in which parallel λ and P22 functions might interact with the hosts of these two phages, and it will be interesting to understand these in detail in both cases. But are there very similar phage proteins that have different host interactions? Do very different phage proteins that have the same function have equally different host interactions? Although no such situations have been identified to date, they remain possible. A comparison of PPIs in λ and P22 is given in Tables VII and VIII.
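
One way to keep track of such a comparison is to treat each interaction as an unordered pair and to record, for every λ pair, its P22 counterpart if one is listed. The Python sketch below does this for a handful of rows copied from Table VII. The data structure and function name are illustrative, and the selection of rows is deliberately incomplete.

```python
# Rows copied from Table VII: (lambda pair, P22 pair).
# None marks a side for which the table lists no counterpart.
ROWS = [
    (("gpA", "gpB"), ("gp2", "gp1")),
    (("gpA", "gpNu1"), ("gp2", "gp3")),
    (("gpB", "gpB"), ("gp1", "gp1")),
    (("CI", "CI"), ("C2", "C2")),
    (("gpC", "gpE"), None),
    (("gpU", "gpV"), None),
    (None, ("Ant", "C2")),
    (None, ("Arc", "Arc")),
]


def classify(rows):
    """Sort interaction rows into shared, lambda-only, and P22-only groups."""
    groups = {"shared": [], "lambda_only": [], "p22_only": []}
    for lam, p22 in rows:
        if lam and p22:
            groups["shared"].append((frozenset(lam), frozenset(p22)))
        elif lam:
            groups["lambda_only"].append(frozenset(lam))
        else:
            groups["p22_only"].append(frozenset(p22))
    return groups


groups = classify(ROWS)
for name, members in groups.items():
    print(name, len(members))
# shared 4
# lambda_only 2
# p22_only 2
```

Using a `frozenset` for each pair makes the order of the two partners irrelevant and collapses a homomeric interaction such as gpB–gpB to a single-member set.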

TABLE VII

Comparison of λ and P22 interactions

λ 1	λ 2	P22 1	P22 2	Description
gpA	gpB	gp2	gp1	Likely similar contacts—by evolutionary argument^a
gpA	gpNu1	gp2	gp3	Likely similar contacts—by evolutionary argument ^a
gpNu1	gpNu1	gp3	gp3	gpNu1 N-terminal fragment is dimer; gp3 is nonomer; not known if they have the same fold (i.e., are ancient homologues)
gpW	gpB	gp4	gp1	gpW and gp4 have no known homology, but are analogous in that they are the first proteins to bind the underside of the the portal ring; gp4 has no structural homology to λ gpFII or gpW (Cardarelli et al., 2010; Olia et al., 2011)
gpW	gpFII			No clear gpFII homologue in P22 (gp4 and gp10 of P22 have no sequence similarity)
gpB	gpB	gp1	gp1	Both form dodecamer portal rings^a
gpC	gpE			No gpC analogue in P22
gpC′	gpB			No gpC analogue in P22
gpC	gpNu3			No gpC analogue in P22
gpNu3	gpNu3	gp8	gp8	Monomer/dimer/tetramer for gp8; monomer/dimer for gpNu3
gpB	gpE	gp1	gp5	Seen in reconstructions^a
gpNu3	gpB	gp8	gp1	Scaffolds required for portal assembly into procapsid
		gp8	gp5	Scaffold–coat interaction not studied for λ
gpD	gpD	Dec	Dec	Both trimers; bind slightly different sites. (P22 has no Dec, but its very close relative phage L does, and L Dec binds to P22 capsids.)
gpE	gpE	gp5	gp5	Both make icosahedral T=7 head shells^a
gpU	gpU			No analogue in P22
gpD	gpE	Dec	gp5	Dec is more discriminatory than D, i.e., unlike D, Dec does not occupy all of the quasi-equivalent local threefold positions on the shell
		gp4	gp5	Unknown in λ
		gp4	gp10	No gp10 analogue in λ (but see gpFII above)
		gp4	gp9	No gp9 analogue in λ (but see Stf below?)
		gp10	gp9	No gp9 analogue in λ (but see Stf below?)
		gp10	gp10	No gp10 analogue in λ (but see gpFII above)
		gp10	gp26	No gp10 or gp26 (?) analogue in λ
		gp26	gp26	No gp26 analogue in λ (maybe gpJ?)
Stf	Stf	gp9	gp9	Both trimers (Stf by homology to T4 fibers); analogues, not homologues
gpU	gpV			No analogues in P22
gpV	gpV			No analogue in P22
gpO	gpP	gp18	gp12	gpO and gp18 are homologous but gpP and gp12 are not; nonetheless, a good argument can be made that gp18 and gp12 must interact similarly to gpO and gpP to get replication started
gpW	gpW	gp4	gp4	Analogues—first proteins to bind below the portal ring (see gpW–gpB interaction above)
gpFI	gpA			No gpFI analogue in P22
gpFI	gpE			No gpFI analogue in P22
gpH	gpG/gpGT			No analogues in P22
gpV	gpGT			No analogues in P22
gpH	gpV			No analogues in P22
		Arc	Arc	No analogue in λ
		Ant	C2	No Ant in λ
		Mnt	Mnt	No analogue in λ
CI	CI	C2	C2	Homologues (Sevilla-Sierra et al., 1994)
Exo	Bet	Erf	Erf	λ Exo and P22 Erf proteins are not homologues and function in basically different ways, but they are both essential for homologous recombination
SieB	Esc	SieB	Esc	Good homologues in P22 but they are unstudied
CII	CII	C1	C1	Homotetramers in both cases
Xis	Int	Xis	Int	Xis proteins are quite different
Xis?	Xis?	Xis	Xis	Monomer in λ, but binds cooperatively so could interact on DNA (Abbani et al., 2007); multimer in P22
CIII	CIII	C3	C3	C3 is homologue of λ CIII; λ CIII is dimer (Halder et al., 2007); P22 interaction by homology with λ CIII
Cro	Cro	Cro	Cro	Both homodimers
gpRz	gpRz1	gp15	gp15	Predicted for P22 by homology

^a Not recognizably similar in amino acid sequence but very likely ancient homologues.

TABLE VIII

Comparison of phage λ–host and P22–host interactions

λ	Host	P22	Host	Notes
CII	HflB	C1	HflB	Should be true for P22 by homology
CIII	HflB	C3	HflB	Should be true for P22 by homology
gpJ	LamB			P22 tails recognize polysaccharide so no analogue in P22
Stf	OmpC			P22 tails recognize polysaccharide so no analogue in P22
Xis	Lon	Xis	Lon?	Might be true for P22 by homology
Xis	FtsH	Xis	FtsH?	Might be true for P22 by homology
gpN	RNAP	gp24	RNAP?	Likely true for P22 by similarity of function
Int	IHF			P22 does not interact with IHF (Cho et al., 1999)
Xis	Fis			Unknown for P22
gpQ	σ⁷⁰	gp23	σ⁷⁰	Likely true for P22 by similarity of function
P	DnaB			No analogue of P in P22
		gp12	DnaG?	No analogue of 12 in λ
gpE	GroE	gp5	GroE
gpB	GroE			Not known for P22
gpN	Lon	gp24	Lon	Not known for P22
CI	RecA	C2	RecA	Might be true for P22 by homology and induction by ultraviolet light
CII	ClpYQ	C1	ClpYQ	Might be true for P22 by homology
CII	ClpAP	C1	ClpAP	Might be true for P22 by homology
gpO	ClpXP	gp18	ClpXP	Might be true for P22 by homology
gpN	NusA	gp24	NusA	Likely true by similarity of function
CII	HflD	C1	HflD	Might be true for P22 by homology
Gam	RecB			No Gam in P22
		Abc2	RecC	No Abc2 in λ

D. Enterobacteriophage P2 and its satellite phage, P4

Bacteriophage P2 was originally isolated from the Lisbonne & Carrère strain of E. coli by Bertani in 1951 and is a member of the Myoviridae family of viruses. It has an icosahedral head, a contractile tail, and a 33.6-kb double-stranded DNA genome (Bertani, 1951; Bertani and Six, 1988; Nilsson and Haggård-Ljungquist, 2006). P2-like prophages are common in the environment (Breitbart et al., 2002) and are present in about 30% of strains in the E. coli reference collection (Nilsson et al., 2004).

Bacteriophage P2 has attracted much interest as a “helper” for bacteriophage P4, an 11.6-kb replicon that can exist either as a plasmid or integrated into the host genome as a prophage (Briani et al., 2001; Deho and Ghisotti, 2006; Lindqvist et al., 1993). P4 lacks genes encoding major structural proteins, but can utilize the structural gene products encoded by P2 for assembly of its own capsid (Christie and Calendar, 1990; Lindqvist et al., 1993; Six, 1975). An overview of the P2 structural biology and a summary of its interactions are shown in Figure 8 and Table IX.


FIGURE 8

(A) Schematic diagram of the 33.6-kb P2 genome. Open reading frames (ORFs) are indicated by arrows that reflect their direction and size and are color coded as follows: red/orange, capsid-associated proteins (gpQ, O, N, L); blue, terminase proteins (gpP, M); yellow, lysis proteins (gpY, K, lysA, lysB); green, tail-related proteins (gpR, S, FI, FII, T, U, D); blue, base plate and tail fiber-related proteins (gpV, W, J, H, G); pink, transcriptional control proteins (Ogr, C, Cox); brown, integrase (Int); and purple, replication-related proteins (gpA, B). Nonessential genes and ORFs of unknown function are white. Promoters are indicated by arrows above ORFs. (B) Diagram of protein–protein interactions listed in Table IX. Each P2 and P4 protein is shown as a circle, colored as given earlier. Host proteins are shown as white boxes. (C) Schematic diagram of the P2 virion, color coded as given earlier. Copy numbers of proteins, when known, are indicated in parentheses. (D) Electron micrograph of a P2 virion, stained negatively with uranyl acetate.

TABLE IX

Protein–protein interactions of phages P2 and P4

Protein	Function	Interactions	Notes	References
P2–P2 proteins
gpC	Immunity repressor	Dimer	Y2H; 3D structure	Eriksson et al., 2000; Massad et al., 2010
Cox	Transcriptional repressor, excisionase	Tetramer that can self-associate to octamers	In vitro; cross-linking and gel filtration	Eriksson and Haggard-Ljungquist, 2000
Int	Integrase	Dimer	In vitro; cross-linking; Y2H	Frumerie et al., 2005
gpF_I	Tail sheath	Polymer, six molecules per sheath striation	Electron microscopy	Lengyel et al., 1974
gpF_II	Tail tube	Polymer, six molecules per tube striation	Electron microscopy	Lengyel et al., 1974
gpM	Small terminase subunit	Polymer, 8- to 10-fold ring	Biochemistry	Dokland, unpublished data
gpM	Small terminase subunit	gpP	Biochemistry	Bowden and Modrich, 1985
gpN	Major capsid protein	T=7 icosahedron (415 copies)	Cryo-EM and 3D reconstruction	Dokland et al., 1992
		gpO	Biochemistry	Chang et al., 2008, 2009
gpO	Internal scaffolding protein	Dimer	Biochemistry, chromatography	Chang et al., 2008, 2009
		gpN
gpP	Large terminase subunit	gpP	Biochemistry	Bowden and Modrich, 1985
gpQ	Connector, portal protein	Dodecameric ring	EM, crystallography	Doan and Dokland, 2007; Rishovd et al., 1994
gpT	Tape measure	gpF_II (gpF_I ?)	Inside tail tube
gpV	Tailspike	Trimer	Sedimentation velocity and molecular mass determination	Haggard-Ljungquist et al., 1995; Kageyama et al., 2009
gpV	Tailspike	gpW, base plate component. Belongs to the T4 gene 25-like superfamily	Copurification	Haggard-Ljungquist et al., 1995
P2–other phage proteins
Tin	Block superinfection of T-even phages	T4 SSB protein (gp32)	Genetic evidence	Mosig et al., 1997
gpC	Immunity repressor	P4 anti-repressor ε	Genetic evidence. Y2H	Geisselsoder et al., 1981; Liu et al., 1998; Eriksson et al., 2000
gpN	Major capsid protein	P4 Sid; size determinator, external scaffold	Cryo-EM and 3D reconstruction	Marvik et al., 1995; Wang et al., 2000; Dokland et al., 2002
gpN	Major capsid protein	P4 Psu; polarity suppressor; decoration protein	Cryo-EM and 3D reconstruction	Dokland et al., 1993
P2–host proteins
gpB	Helicase loader; required for lagging strand synthesis	E. coli helicase DnaB	Genetic evidence; electron microscopy; in vitro binding	Sunshine et al., 1975; Funnell and Inman, 1983; Odegrip et al., 2000
Ogr	Transcriptional activator of late operons	E. coli RNA polymerase α C-terminal domain	Genetic evidence	Sunshine and Sauer, 1975; Fujiki et al., 1976; King et al., 1992; Ayers et al., 1994; Wood et al., 1997
P4–host proteins
Psu	Polarity suppressor; anti-terminator	Rho	Genetics; chemical cross-linking	Sauer et al., 1981; Pani et al., 2009
Delta	Transcriptional activator, contains two Ogr domains	E. coli RNA polymerase	Genetics; homology to Ogr	Julien et al., 1997

1. Gene control circuits

The decision whether to grow lytically or form a lysogen after P2 infection is dependent on the levels of two repressors: the immunity repressor gpC, which blocks lytic growth, and Cox, which blocks the expression of the genes required for lysogenization. GpC acts as a symmetric dimer, and its three-dimensional structure has been determined (Eriksson et al., 2000; Massad et al., 2010). Cox is multifunctional and acts as a repressor, as well as an excisionase, thereby coupling the transcriptional switch that controls lytic growth versus formation of lysogeny with the integration/excision of the phage genome in or out of the host chromosome. Cox is a tetramer in solution that self-associates to octamers in the absence of DNA (Eriksson and Haggard-Ljungquist, 2000). The transcriptional activator Ogr is required to turn on late gene transcription. Ogr is a Zn finger-containing protein that interacts with the α subunit of the E. coli RNA polymerase, recruiting it to the late promoters (Ayers et al., 1994; Fujiki et al., 1976; King et al., 1992; Sunshine and Sauer, 1975; Wood et al., 1997).

Some P2 prophage genes are not blocked by the action of gpC and instead are expressed from the prophage providing the host cell with new properties. Among these “lysogenic conversion” genes, P2 has three genes, old, fun, and tin, that block superinfection of unrelated phages, but only gpTin is known to have a PPI. P2 gpTin excludes phage T4 by binding to and poisoning T4 gp32 (SSB), thereby inhibiting DNA replication (Mosig et al., 1997).

Satellite phage P4 has to get access to the morphogenetic genes of its helper P2 to grow lytically. This is accomplished by P4-encoded functions during coinfection with P2. The P4 antirepressor Epsilon derepresses the P2 prophage by interacting with gpC, leading to the expression of P2 lytic genes (Eriksson et al., 2000; Geisselsoder et al., 1981; Liu et al., 1998). P4 also has the capacity to trans-activate P2 late genes in the absence of Ogr using the transcriptional activator Delta. Delta contains two Ogr-like domains and recognizes the same DNA sequence as Ogr. Like Ogr, Delta interacts with the α subunit of E. coli RNA polymerase (Christie et al., 2003; Julien et al., 1997; Souza et al., 1977).

2. P2 DNA replication

After infection, the linear DNA molecule is circularized by ligation of the cohesive (cos) ends by the host DNA ligase, and if P2 enters the lytic pathway, DNA replication is initiated by the cis-acting gpA that creates a single-stranded cut at the origin. GpA remains linked covalently to the 5′ end of the cleaved strand (Liu and Haggard-Ljungquist, 1994). Replication occurs via a modified rolling-circle mechanism that generates a double-stranded monomeric circle that is the packaging substrate (Pruss et al., 1975). The cleavage/joining reactions are promoted by two tyrosine residues interspaced by three amino acids in gpA (Odegrip and Haggard-Ljungquist, 2001). The host Rep helicase is required for all DNA replication (Calendar et al., 1970), while the phage-encoded gpB is a helicase loader that interacts with the E. coli DnaB helicase required for lagging strand synthesis (Funnell and Inman, 1983; Odegrip et al., 2000).

3. Head structure and assembly

The main structural proteins involved in P2 capsid assembly are gpN, capsid protein; gpO, internal scaffolding protein; and gpQ, connector or portal protein (Chang et al., 2008; Lengyel et al., 1973; Linderoth et al., 1991). P2 capsids are assembled as empty precursor procapsids consisting of 415 copies of the gpN capsid protein arranged with T=7 icosahedral symmetry and clustered in hexamers and pentamers that make trivalent interactions (Dokland et al., 1992) (Fig. 9). GpQ forms a dodecameric ring that is incorporated into the capsid at one fivefold vertex and through which the DNA is subsequently packaged (Doan and Dokland, 2007; Rishovd et al., 1994).
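The copy number quoted above follows from simple lattice arithmetic, which can be sketched as below. The sketch assumes the standard relation that a complete icosahedral shell with triangulation number T holds 60T subunits (12 pentamers plus 10(T − 1) hexamers) and that the portal replaces exactly one pentamer; the function names are illustrative and do not come from the cited work.

```python
def hexamer_count(t_number: int) -> int:
    """Number of hexamers in a complete icosahedral shell: 10 * (T - 1)."""
    if t_number < 1:
        raise ValueError("triangulation number must be >= 1")
    return 10 * (t_number - 1)


def capsid_protein_copies(t_number: int, portal_vertices: int = 1) -> int:
    """Copies of the major capsid protein in a shell with the given T number.

    A complete shell holds 60 * T subunits. Each portal vertex is assumed
    to replace one pentamer, which removes 5 subunits.
    """
    if t_number < 1:
        raise ValueError("triangulation number must be >= 1")
    if not 0 <= portal_vertices <= 12:
        raise ValueError("an icosahedron has only 12 fivefold vertices")
    return 60 * t_number - 5 * portal_vertices


# T=7 shell with one portal vertex, as described for the P2 procapsid
print(capsid_protein_copies(7))   # 415
print(hexamer_count(7))           # 60 hexamers, plus 11 remaining pentamers
```

Under the same assumptions, a T=4 shell with one portal vertex gives 60 × 4 − 5 = 235 copies, which matches the gpN copy number reported for the smaller P4 capsid.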


FIGURE 9

P2 assembly. Schematic diagram of the P2/P4 capsid assembly pathway. (A) Capsid protein gpN, scaffolding protein gpO, and connector (portal) protein gpQ are assembled into the T=7 P2 procapsid. (B) GpN, gpO, and gpQ are then processed to their mature forms, N*, O*, and Q*. DNA is packaged into the procapsid by the P2 terminase complex (gpM and gpP), accompanied by expansion into the mature, angular capsid (B). (C) In the presence of the P4-encoded external scaffolding protein Sid, P2 structural proteins are assembled into a small, T=4 P4 procapsid. Loss of Sid (D), protein processing, and DNA packaging lead to the mature, expanded P4 capsid (E). The decoration protein Psu is added to the mature capsid (F).

GpO plays a dual role in the assembly process as both a protease and a scaffolding protein (Chang et al., 2009; Dokland, 2011). The full-length, 284 residue gpO protein is associated with procapsids at early times (Marvik et al., 1994), but is cleaved autoproteolytically between residues 141 and 142, leaving an N-terminal fragment, gpO* (formerly known as h7), that remains inside the mature capsid (Chang et al., 2008; Lengyel et al., 1973). Concomitantly, gpN and gpQ are cleaved—presumably by gpO—to their mature forms, gpN* and gpQ*, by removal of 31 and 26 residues, respectively, from the N terminus (Rishovd and Lindqvist, 1992; Rishovd et al., 1994). The protease activity of gpO resides in its N-terminal domain, which binds tightly to gpN and constitutes a serine protease similar to that of the herpesviruses (Chang et al., 2009; Cheng et al., 2004; Dokland, 2011). The C-terminal 90 amino acids comprise a scaffolding domain, which is necessary and sufficient for promoting capsid assembly (Chang et al., 2008, 2009; Wang et al., 2006). It too presumably binds to gpN during assembly, but any interaction is transient (Chang et al., 2009). Both N- and C-terminal domains form dimers in solution (Chang et al., 2009).

Circular P2 genomic DNA substrates are packaged into empty procapsids by the terminase complex, which consists of the small and large terminase proteins (gpM and gpP, respectively). Terminase recognizes and cleaves P2 DNA at the cos sites (Bowden and Modrich, 1985; Ziermann and Calendar, 1990). DNA packaging is associated with major structural transitions that lead to a larger, more angular and thin-shelled, mature capsid (Chang et al., 2008; Dokland et al., 1992). A head completion protein, gpL, is added (Chang et al., 2008), and tails, which are assembled via a separate pathway, are attached to the unique gpQ-containing portal vertex (Lengyel et al., 1974).

4. Tail structure

The 135-nm-long tail is composed of an inner tube and a contractile sheath (Lengyel et al., 1974). The head-proximal end of the free tail has two disc-like structures of different thickness and diameter. In the complete phage, only the thinner disc is seen outside the head. The distal end of the tail contains a baseplate with a spike and six fibers. Sixteen essential tail genes, located in three transcription units, have been identified. GpFI and gpFII constitute the tail sheath and tail tube, respectively, and there are about 200 copies of each protein per tail (Temple et al., 1991). GpR and gpS are involved in tail completion, as amber mutations in the respective genes produce giant tails and giant tail sheaths (Linderoth et al., 1994). GpV is the tail spike, which forms a trimer in solution (Haggard-Ljungquist et al., 1995; Kageyama et al., 2009). GpW copurifies with gpV and is thus thought to be part of the baseplate, while gpJ localizes to the edge of the baseplate or proximal part of the tail fiber by immunoelectron microscopy (Haggard-Ljungquist et al., 1995). GpH is believed to be the distal part of the tail fiber because it contains regions similar to tail fiber proteins of phages Mu, P1, λ, K3, and T2 (Haggard-Ljungquist et al., 1992). A region of gene H is highly similar to the variable part of the S gene (Sv) of phage Mu and the variable part of gene 19 (19v) of phage P1, in both cases in the (+) orientation of the invertible segments. P2 does not have an invertible segment, but it contains a sequence identical to the left repeat of the Mu invertible G segment located within the H gene and upstream of the region with a high similarity to Mu Sv and P1 19v, indicating that the P2 H gene has been frozen in the (+) orientation, probably by a deletion covering the right inverted repeat and the invertase. This region is therefore inferred to constitute the receptor-binding tip of the tail fiber. 
Given its similarity to gpU and gpU′ of phages Mu and P1, and gp38 of phages T4, Tula, and Tulb, gpG is probably required for fiber assembly. GpT is likely the tape measure, as it is the only protein long enough to span the length of the tail shaft (Christie et al., 2002). The functions of the products of genes X, I, E, E′ (programmed -1 frameshift extension of E), U, and D remain unknown.

5. P4-induced redirection of P2 capsid assembly

Satellite phage P4 cannot form phage particles by itself, but in the presence of phage P2, P4 genomes are packaged into phage particles using structural proteins supplied by P2 (Six, 1975). However, the P4 capsid is smaller than that of P2 and contains only 235 copies of gpN, arranged on a T=4 icosahedral lattice (Dokland et al., 1992) (Fig. 9). This size control is effected by the P4 gene product Sid (size determination) (Barrett et al., 1976), which forms a dodecahedral cage-like scaffold around the outside of P4 procapsids (Marvik et al., 1995). Sid self-associates into trimers and dimers that interact with gpN only in two hexamer subunits near the twofold symmetry axes (Wang et al., 2000; Dokland et al., 2002). Gene N mutants, called sir (sid responsiveness), render the capsid protein unable to form small capsids even in the presence of Sid (Six et al., 1991). Conversely, the so-called super-sid or nms (N mutation sensitive) mutations in sid act as second-site suppressors of the sir mutations and enable the formation of small capsids (Kim et al., 2001). Although well-formed small procapsids can be assembled from Sid and gpN alone (Dokland et al., 2002), gpO is required for the formation of viable P4 phage (Six, 1975). Presumably this is because its protease activity is essential, and gpO may also facilitate incorporation of the portal. Indeed, sid complements the P2 Oam279 mutant, which produces a 237 residue N-terminal fragment of gpO that includes the protease domain but lacks scaffolding activity (Agarwal et al., 1990). The P4 genome contains P2 cos sites and is packaged into small capsids using the P2 terminase (Ziermann and Calendar, 1990). The P4-encoded protein Psu is a suppressor of Rho-dependent transcription termination (Pani et al., 2009; Sauer et al., 1981); it also functions as a decoration protein that binds to gpN hexamers and stabilizes the P4 capsid (Dokland et al., 1993; Isaksen et al., 1993).

E. Pneumococcus phage Dp-1 and Cp-1

The lytic bacteriophages Dp-1 and Cp-1 infect the Gram-positive bacterium Streptococcus pneumoniae, which colonizes the mucosal surface of the upper respiratory tract and causes serious invasive infections such as pneumonia, meningitis, and sepsis (van der Poll and Opal, 2009). Dp-1 was the first isolated Streptococcus phage and is an unassigned member of the Siphoviridae (Fig. 10) (McDonnell, Lain and Tomasz, 1975). It was reported that the Dp-1 virion surprisingly contains lipids that might be arranged as a bilayer surrounding the capsid (Lopez et al., 1977), a feature not reported for any other phage belonging to the Caudovirales. The lytic enzyme Pal of Dp-1 has been studied intensively because it kills virulent streptococci efficiently and has been used as an enzybiotic to cure pneumococcal infections in animals (Loeffler et al., 2001; Rodriguez-Cerrato et al., 2007). The podovirus Cp-1 lytic enzyme Cpl1 was also demonstrated to be an efficient enzybiotic (Loeffler et al., 2003). The Cp-1 genome is a linear, 20-kbp dsDNA predicted to encode 28 proteins and a packaging RNA (pRNA). Like ϕ29-like viruses, Cp-1 uses protein priming for the initiation of DNA replication (Martin et al., 1996). The biology of Dp-1 and Cp-1 is otherwise only poorly understood.


FIGURE 10

The Dp-1 genome and interactome. (A) Dp-1 genome map. Open reading frames (ORFs) are shown as arrows that indicate their transcription direction (genome map). Predicted transcripts and functional gene clusters are indicated. All protein–protein interactions (PPIs) identified by systematic Y2H screens among Dp-1 proteins are shown as lines that connect the corresponding ORF symbols. Proteins that bind themselves (homomers) are highlighted by gray ORF symbols. Other ORF colors: refer to panels C–E. (B–E) A Dp-1 virion model based on identified binary PPIs, mass spectrometry analysis (MS) of mature Dp-1 virions, and homology predictions. (B) Dp-1 virion proteins, including structural proteins identified by MS analysis (in red), or proteins with a homology-based annotation that indicates a virion-related function (in black). (C) Core model of the Dp-1 virion: proteins highlighted in red were identified by MS analysis as structural components. Proteins highlighted in blue represent hypothetical gene products that interact with structural components, thus presumably playing a transient role in virion assembly as they were not identified by MS analysis of Dp-1 virions. Proteins connected by red edges were shown to interact in Y2H tests. (D) Y2H screens do not identify all expected PPIs. Red lines represent PPIs that were expected to occur among the corresponding proteins known from homologues but were not identified in the systematic Y2H screens. (E) Systematic Y2H screens reveal unexpected PPIs. Proteins are included that were shown to bind to structural proteins and putative morphogenetic factors, e.g., DNA metabolism (DNA polymerase subunits, RecB, DNA ligase, DnaG), queuosine biosynthesis (Que proteins), lysis (Holin), and transcription (sigma factor). Interactions are symbolized by edges that connect the corresponding proteins (in the case of sigma factor gp69, interaction partners are highlighted by an asterisk). 
(F) Electron micrograph of a mature Dp-1 virion. C–F from Sabri et al. (2010), with permission.

The 56.5-kbp genome of Dp-1 is predicted to encode 72 proteins (Fig. 10A) (Sabri et al., 2011). Approximately half of the gene products could be functionally annotated by homology predictions. However, in this and hundreds of other completely sequenced phage genomes, more than half of phage gene products could not be annotated. Proteome-wide Y2H screens identified 156 unique PPIs among 57 Dp-1 proteins [a detailed list can be found in Sabri et al. (2011)], and mass spectrometry analysis (MS) of mature Dp-1 virions revealed that the virion is composed of eight structural proteins. A Dp-1 virion model (Figs. 10B and 10C) physically links many proteins with unknown function to the virion. Proteins that bind structural proteins in Y2H tests are either novel structural components (when they are also found by MS of the virion particles) or virion assembly facilitators (when they are absent from the mature virion). For instance, homology comparisons of gp40, gp41, and gp53 predict that these proteins are structural (Fig. 10B). However, MS analysis did not detect them as components of the virion. Even though functional predictions by homology comparisons are extremely helpful in proteome annotation, a certain fraction of proteins may be wrongly annotated in the absence of other experimental data (i.e., whether a protein is annotated as a structural component or an assembly facilitator can have an important impact on future experiments).
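The reasoning used to sort proteins into structural components and assembly facilitators can be expressed as a small decision rule. The following is a minimal sketch under stated assumptions: the function name, the labels, and the toy protein names are hypothetical and are not taken from the Dp-1 data set; the only inputs are a list of binary PPIs and the set of proteins detected by MS in mature virions.

```python
def classify_virion_proteins(interactions, ms_detected):
    """Label proteins from binary PPIs and MS results.

    interactions: iterable of (protein_a, protein_b) pairs (e.g., from Y2H).
    ms_detected:  set of proteins found by MS in mature virions.
    Returns a dict mapping each protein to one of three labels.
    """
    ms_detected = set(ms_detected)

    # Build an undirected partner map from the binary interactions.
    partners = {}
    for a, b in interactions:
        partners.setdefault(a, set()).add(b)
        partners.setdefault(b, set()).add(a)

    labels = {}
    for protein in set(partners) | ms_detected:
        if protein in ms_detected:
            labels[protein] = "structural component"
        elif partners.get(protein, set()) & ms_detected:
            # Binds a virion protein but is absent from the mature particle.
            labels[protein] = "candidate assembly facilitator"
        else:
            labels[protein] = "unassigned"
    return labels


# Hypothetical toy data, for illustration only.
toy_ppis = [("s1", "s2"), ("x1", "s1"), ("x2", "x3")]
toy_ms = {"s1", "s2"}
for name, label in sorted(classify_virion_proteins(toy_ppis, toy_ms).items()):
    print(name, label)
```

A rule of this kind only proposes candidates; as noted above, a protein predicted to be structural by homology may still be absent from the virion by MS, so each label needs experimental confirmation.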

Such a comprehensive analysis also reveals unexpected interactions that normally would not be tested in hypothesis-driven experiments (Fig. 10E). For example, Dp-1 proteins believed to be involved in DNA replication and recombination, transcriptional regulation, and other pathways exhibit interactions with structural components. This indicates that DNA packaging and DNA replication/repair might be coupled (as in phage T4), for example, by resolving intermediate DNA structures in parallel with packaging. Although the identified PPIs were helpful in obtaining the first hints about the function of the proteins, not all expected PPIs between the structural proteins were detected (Fig. 10D). Nevertheless, this study demonstrated that homology-based genome annotation complemented by proteomic approaches is a useful strategy to gain novel insights into the biology of poorly understood bacteriophages.

The same combination of genome annotation and proteomics was applied to the 28 putative Cp-1 proteins (Häuser et al., 2011). MS analysis detected a total of 12 structural proteins. More than half of the proteins showed at least one protein–protein interaction, with 17 binary interactions in total. Notably, nearly all were detected between proteins encoded in the structural gene cluster, reflecting a highly modular organization of the Cp-1 interactome. In contrast, Dp-1, with its larger proteome/interactome, shows more cross-communication between proteins belonging to different pathways, although a functional enrichment of interactions between proteins belonging to related cellular processes was still detected (Sabri et al., 2011). It is likely that larger phage proteomes/interaction networks exhibit a higher frequency of cross-communication at the protein interaction level due to regulation between various pathways (i.e., the larger the proteome, the more promiscuous the interaction network).
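One simple way to compare how modular two interactomes are is to compute the fraction of interactions that connect proteins of the same functional module. The sketch below is an illustration only: the function name is hypothetical, the module assignments and edges are invented toy values, and this is not the enrichment analysis used in the cited studies.

```python
def intra_module_fraction(interactions, module_of):
    """Fraction of PPIs whose two partners belong to the same module.

    interactions: iterable of (protein_a, protein_b) pairs.
    module_of:    dict mapping protein name to a module label.
    Pairs with an unannotated partner are skipped.
    """
    same = 0
    total = 0
    for a, b in interactions:
        if a not in module_of or b not in module_of:
            continue  # cannot score interactions with unknown modules
        total += 1
        if module_of[a] == module_of[b]:
            same += 1
    return same / total if total else 0.0


# Hypothetical toy network: three structural proteins, one replication protein.
toy_modules = {"a": "structural", "b": "structural",
               "c": "structural", "d": "replication"}
toy_edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(intra_module_fraction(toy_edges, toy_modules))
```

A value close to 1 would correspond to a highly modular network of the kind described for Cp-1, whereas a lower value would indicate more cross-communication between pathways, as described for Dp-1.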

F. Bacillus subtilis phage ϕ29-like viruses

ϕ29-like phages belong to the Picovirinae subfamily of the Podoviridae. Members include ϕ29, ϕ15, PZA, BS32, B103, M2Y (M2), Nf, and GA-1. All are lytic; most infect B. subtilis, but some grow in B. pumilus, B. amyloliquefaciens, and B. licheniformis (Meijer et al., 2001a). Their virions are characterized by a prolate capsid (e.g., 41.5 by 31.5 nm for ϕ29) with attached head fibers, a connector, a short noncontractile tail (e.g., 32.5 nm in length for ϕ29), and tail appendages (Anderson et al., 1966) (Fig. 11). GA-1 and M2Y lack the gene for head fibers (Yoshikawa et al., 1986), which may stabilize the capsid (Tao et al., 1998). The approximately 20-kbp genome of ϕ29-like phages is linear and encodes about 25 proteins. The 5′ ends of the genome are attached covalently to a terminal protein (TP). ϕ29-like viruses are among the smallest dsDNA phages isolated so far and thus represent minimal model systems (Meijer et al., 2001a).


FIGURE 11

Genome map and literature curated ϕ29 interactome. (A) Genome and simplified transcriptional map of ϕ29 (after Meijer et al., 2001a). Major promoters are indicated by their names, and major transcripts are shown as dashed (early) and solid (late) arrows. Other promoters are not indicated for clarity. Open reading frame (ORF) arrows indicate transcription direction. The color of ORF symbols corresponds to the color used for protein symbols in B, C, and D. PPIs known for ϕ29, organized by their cellular function (transcription, DNA replication, or virion). B. subtilis proteins are shown as rectangles, phage proteins as circles. Equal protein symbols contacting each other indicate homomers. Heteromeric PPIs are highlighted by protein symbols that are connected by lines. (D) Schematic cross view of the mature ϕ29 virion (after Xiang et al., 2008). UDG, uracil–DNA glycosylase; RNAP, RNA polymerase; TP, terminal protein.

1. Protein–protein interactions

A literature and database search on ϕ29-like viruses retrieved 30 PPIs (Tables X and XI) with evidence mainly from directed experiments and structure determination studies. Twenty-three are attributed to phage ϕ29 itself, five to GA-1, and two to Nf, emphasizing the role of ϕ29 as a model for the whole group. The biology of other ϕ29-like phages is poorly understood, and only a few proteins have been characterized experimentally. In some cases, ϕ29 PPIs are conserved among other ϕ29-like homologues, supporting a close relationship of these phages. However, there are also differences that reveal interesting insights into their evolution. Known ϕ29 protein–protein interactions include proteins involved in transcriptional regulation, DNA replication, and virion structure. Only three PPIs between ϕ29 and B. subtilis proteins have been identified so far.

TABLE X

Protein–protein interactions of phage ϕ29 including phage–host interactions^a

ϕ29 1	Function 1	ϕ29 2	Function 2	References
gp2	DNA polymerase	gp3	DNA terminal protein TP	Kamtekar et al., 2006; Blanco et al., 1987; Dufour et al., 2000; González-Huici et al., 2000
gp3	DNA terminal protein TP	gp1	Membrane-associated early protein	Bravo et al., 2000
gp3	DNA terminal protein TP	gp3	DNA terminal protein TP	Serna-Rico et al., 2003
gp3	DNA terminal protein	gp16	Packaging ATPase	Guo et al., 1987; Grimes and Anderson, 1997
gp4	Late gene activator	gp4	Late gene activator	Badia et al., 2006; Mencía et al., 1996b
gp4	Late gene activator	gp6	Histone-like, nonspecific DNA-binding protein	Calles et al., 2002; Camacho and Salas, 2001
gp6	Histone-like, nonspecific dsDNA-binding protein	gp6	Histone-like, nonspecific dsDNA-binding protein	Abril et al., 1999; Pastrana et al., 1985
gp7	Scaffolding protein	gp7	Scaffolding protein	Morais et al., 2003
gp7	Scaffolding protein	gp8	Major head protein	Lee and Guo, 1995
gp7	Scaffolding protein	gp10	Upper collar/connector protein	Guo et al., 1991; Lee and Guo, 1995
gp10	Upper collar/connector protein	gp10	Upper collar/connector protein	Guasch et al., 2002; Simpson et al., 2001
gp12	Preneck appendage protein (anti-receptor)	gp12	Preneck appendage protein (anti-receptor)	Xiang et al., 2009
gp13	Tail-associated, peptidoglycan-degrading enzyme	gp9	Tail knob protein	García et al., 1983
gp13	Tail-associated, peptidoglycan-degrading enzyme	gp13	Tail-associated, peptidoglycan-degrading enzyme	Xiang et al., 2008
gp16.7	Replication organizer; DNA-binding membrane protein p16.7	gp16.7	Replication organizer; DNA-binding membrane protein p16.7	Asensio et al., 2005; Albert et al., 2005; Crucitti et al., 1998; Serna-Rico et al., 2003; Meijer et al., 2001b
gp16.7	Replication organizer; DNA-binding membrane protein p16.7	gp3	DNA terminal protein TP	Serna-Rico et al., 2003
gp17	Early protein involved in early DNA replication	gp6	Histone-like, nonspecific dsDNA-binding protein	Crucitti et al., 2003
gp17	Early protein involved in early DNA replication	gp17	Early protein involved in early DNA replication	Crucitti et al., 2003
gp56 = gp0.8	Host uracil-DNA glycosylase inhibitor	gp56 = gp0.8	Host uracil-DNA glycosylase inhibitor	Serrano-Heras et al., 2007
ϕ29–Bacillus subtilis interactions
gp4	Late gene activator (1)	RpoA	RNA polymerase subunit α	Mencía et al., 1996a; Monsalve et al., 1996b
gp16.7	Replication organizer, DNA-binding membrane protein p16.7 (2)	MreB	Actin-like cytoskeleton protein	Muñoz-Espín et al., 2009
gp56 = gp0.8	UDG inhibitor (3)	UDG	Uracil-DNA glycosylase	Serrano-Heras et al., 2006

^a Data from BOND, IntAct, and PDB (ϕ29, Taxon id 186846). All proteins are termed “gp” (gene product), although they are also termed “p” (protein), e.g., gp4 = p4. In the phage–host interaction list, the first protein is from ϕ29, the second from B. subtilis. Footnote (1), on the interaction with the RNAP α subunit: host RNAP binds via its α subunit to p4 (i) bound at the late promoter PA3, thus activating late transcription, and (ii) bound to the promoter PA2c, thus blocking transcription by inhibiting promoter clearance. Footnote (2), on the interaction with MreB: the PPI plays a role in organizing ϕ29 DNA replication in peripheral helix-like structures. Footnote (3), on the interaction with UDG: inhibition of host UDG; avoids removal of uracil from ϕ29 DNA via UDG-initiated base excision repair.
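The phage–phage rows of Table X shown above can be read as an edge list. The following is a minimal sketch, assuming only those rows (the table may contain further entries); the names `EDGES`, `partner_map`, and `self_interactors` are hypothetical and chosen for illustration. It counts the distinct partners of each protein and lists the proteins reported to self-associate.

```python
from collections import defaultdict

# Phage-phage pairs transcribed from the Table X rows shown above (a subset
# of the full table). All names in this sketch are illustrative assumptions.
EDGES = [
    ("gp2", "gp3"), ("gp3", "gp1"), ("gp3", "gp3"), ("gp3", "gp16"),
    ("gp4", "gp4"), ("gp4", "gp6"), ("gp6", "gp6"), ("gp7", "gp7"),
    ("gp7", "gp8"), ("gp7", "gp10"), ("gp10", "gp10"), ("gp12", "gp12"),
    ("gp13", "gp9"), ("gp13", "gp13"), ("gp16.7", "gp16.7"),
    ("gp16.7", "gp3"), ("gp17", "gp6"), ("gp17", "gp17"), ("gp56", "gp56"),
]


def partner_map(edges):
    """Map each protein to the set of distinct partners other than itself."""
    partners = defaultdict(set)
    for a, b in edges:
        if a != b:
            partners[a].add(b)
            partners[b].add(a)
    return partners


def self_interactors(edges):
    """Return the proteins that appear in a homomeric (self-interaction) row."""
    return sorted({a for a, b in edges if a == b})


if __name__ == "__main__":
    partners = partner_map(EDGES)
    # List proteins by decreasing number of distinct partners.
    for protein in sorted(partners, key=lambda p: (-len(partners[p]), p)):
        print(protein, len(partners[protein]), sorted(partners[protein]))
    print("self-associating:", self_interactors(EDGES))
```

In this subset the terminal protein gp3 has the most distinct partners (gp1, gp2, gp16, and gp16.7), which matches its role in both replication and packaging described in the text.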

TABLE XI

Protein–protein interactions of ϕ29-related phages^a

Phage	Protein 1	Function	Protein 2	Function	References
GA-1	gp6	Histone-like, nonspecific dsDNA-binding protein	gp6	Histone-like, nonspecific dsDNA-binding protein	Freire et al., 1996
Nf	gp6	dsDNA-binding protein	gp6	dsDNA-binding protein	Freire et al., 1996
GA-1	gp2	DNA polymerase	gp3	DNA terminal protein	Illana et al., 1996; Longás et al., 2006
Nf	gp2	DNA polymerase	gp3	DNA terminal protein	Longás et al., 2006; González-Huici et al., 2000
GA-1	gp5	Single-stranded DNA-binding protein SSB (1)	gp5	Single-stranded DNA-binding protein SSB	Gascón et al., 2000
GA-1	gp12	Neck appendage protein (2)	gp12	Neck appendage protein	Schulz et al., 2010
GA-1	gp4_G	Transcriptional regulator (3)	gp4_G	Transcriptional regulator	Horcajadas et al., 1999

^a Data from BOND, IntAct, and PDB (searched for ϕ29, Taxon id 186846). (1) ϕ29 and Nf SSB were shown to be stable monomers in solution, whereas GA-1 SSB forms homohexamers. (2) Homotrimer; autocatalytic cleavage similar to ϕ29 gp12. PDB: 3GUD. (3) Homodimer; negative regulator of early transcription; unlike its ϕ29 homologue gp4, it was shown not to interact with the RNAP α subunit.

2. Protein-primed DNA replication

Phage ϕ29 DNA polymerase (DNAP, gp2) interacts with terminal protein (TP, gp3) to initiate genome replication. Interactions have also been shown between the corresponding GA-1 and Nf orthologs (Table XI). The structure of the ϕ29 TP–DNAP complex has been determined and suggests that the priming domain of TP occupies the dsDNA-binding tunnel, allowing the DNAP to bind negatively charged amino acid residues of TP (Kamtekar et al., 2006).

Phage ϕ29 replication depends on the presence of a single dAMP residue linked covalently via its 5′-phosphoryl group to the hydroxyl group of a Ser residue of TP. Formation of the phosphoester bond is catalyzed by the DNAP itself (Blanco and Salas, 1984; Hermoso et al., 1985); thus formation of a stable DNAP–TP heterodimeric protein complex is required (Blanco et al., 1987). The TP–dAMP is elongated by the same DNAP (Méndez et al., 1992). After the synthesis of six to nine nucleotides, DNAP dissociates from TP and starts continuous, linear synthesis (Méndez et al., 1997). This mechanism of protein-primed DNA replication initiation was also shown for Nf and GA-1 (Illana et al., 1996; Longás et al., 2008), S. pneumoniae phage Cp-1, E. coli phage PRD1, adenoviruses, and some plasmids and bacterial genomes (e.g., Streptomyces), although there may be differences in the various priming mechanisms (de Jong and van der Vliet, 1999; Martin et al., 1996; Salas, 1991).

The TP–DNAP interaction plays a central role in ϕ29 DNA replication initiation. However, other PPIs with TP are known, and thus protein-primed DNA replication may be more complicated than is currently known (Figs. 11C and 12). Chemical cross-linking shows that TP forms homodimers (Serna-Rico et al., 2003). Analysis of TP mutants suggests that the parental TP (TP bound to the parental genome) recruits the free TP–DNAP complex via a TP–TP interaction (Illana et al., 1999), but an additional interaction between parental TP and DNAP might be important for the specificity of replication initiation (González-Huici et al., 2000).

FIGURE 12

Interactions in ϕ29 genome replication and membrane localization. Proteins that contact each other have been shown to interact. For clarity, only one genome end is shown. Parental TP is highlighted by a black symbol, whereas primer TP is in white. The upper part of the figure includes PPIs that preferentially play a role in replication initiation. The lower part of the figure indicates elongated DNA with the associated ssDNA regions. Because p16.7 and p5 (SSB) both bind ssDNA, they could either redirect the parental ssDNA strand to the membrane via TP-p16.7 and p16.7–ssDNA interactions or stabilize it in the cytoplasm via SSB alone (Meijer et al., 2001b). After Bravo et al. (2000) and Serna-Rico et al. (2003).

TP interacts with the replication organizer gp16.7 (Serna-Rico et al., 2003) (Fig. 12). Gp16.7, which has the potential to bind to both ssDNA and dsDNA, is an integral membrane protein consisting of an N-terminal membrane anchor and a C-terminal cytoplasmic domain. It localizes ϕ29 DNA replication to the membrane (Meijer et al., 2000, 2001b; Serna-Rico et al., 2002, 2003). The C-terminal domain of gp16.7 dimerizes in solution but forms larger filaments in the presence of ssDNA (Asensio et al., 2005; Serna-Rico et al., 2003). Replication, which starts at both DNA ends, first produces a so-called type I replication intermediate consisting of dsDNA with ssDNA regions caused by displacement of the non-template parental DNA strand. Later, when the two replication forks have merged, the parental DNA strands become separated into two type II replication intermediates, still containing ssDNA regions in the parental template. A crystal structure of the gp16.7 DNA-binding C-terminal domain complexed with dsDNA reveals three gp16.7 dimers forming a deep, positively charged longitudinal cavity that interacts nonspecifically with the DNA phosphate backbone (Albert et al., 2005). Gp16.7 localizes in peripheral helix-like structures in a manner dependent on the bacterial MreB cytoskeleton (Muñoz-Espín et al., 2010). In fact, gp16.7 interacts directly with MreB; this interaction is required for efficient ϕ29 DNA replication, allowing simultaneous replication of multiple templates at various peripheral sites.

Phage ϕ29 gp1 is another early protein that binds TP (Bravo et al., 2000). Gp1, like gp16.7, is membrane associated and self-associates into long filamentous structures (Bravo and Salas, 1997, 1998). Its membrane association and polymerization are mediated through a C-terminal hydrophobic sequence, whereas binding to TP involves an N-terminal region. In contrast to gp16.7, gp1 does not bind DNA (Fig. 12) and is thought to be another component that directs ϕ29 DNA replication to the membrane (Bravo et al., 2000). Whether gp16.7 and gp1 act together in this process is unknown.

The ϕ29 histone-like, nonspecific dsDNA-binding protein gp6 forms a nucleoprotein complex with ϕ29 DNA that covers most of the genome, implicating it in genome organization (Serrano et al., 1994). Gp6 either unwinds the genome termini or interacts with TP directly, thus making a TP complex competent to bind the genome ends (Blanco et al., 1986; Serrano et al., 1994). Gp6 forms long oligomeric protein filaments thought to function as a protein backbone for DNA binding (Abril et al., 1999). GA-1 and Nf gp6 orthologues also self-associate (Freire et al., 1996). Gp6 also binds gp17, an early protein that stimulates the first viral replication round in vivo when other replication proteins are present at low levels (Crucitti et al., 1998, 2003). In contrast to gp6, gp17 does not bind DNA, but its interaction with gp6 enhances gp6 binding to the replication origins, thereby stimulating early rounds of replication.

A homomeric interaction was demonstrated for the single-stranded DNA-binding protein of GA-1. GA-1 SSB was surprisingly found as a homohexamer in solution, whereas SSBs of Nf and ϕ29 (p5) appear to be stable monomers (Gascón et al., 2000). The homohexamer binds to ssDNA with higher affinity than the monomer and exhibits a higher stimulatory effect on DNA replication.

Bacteriophages have evolved different mechanisms to protect their genomes from degradation by host nucleases in the cell. One protective strategy is the incorporation of noncanonical nucleotides into their DNA. For instance, B. subtilis phage PBS2 uses uracil instead of thymine (Takahashi and Marmur, 1963). PBS2 achieves this by increasing the dUTP pool, relative to dTTP, after infection. In addition, PBS2 encodes Ugi, a protein that specifically inhibits the host uracil–DNA glycosylase (UDG) (Cone et al., 1980), which carries out the first step of removing incorporated uracil residues via the base excision repair pathway. Normally, this repair avoids mutational effects in the host genome, but it would impair the incorporation of uracil into the PBS2 genome. Cocrystallization of Ugi and UDG revealed that Ugi mimics uracil-containing duplex DNA, thereby blocking efficient DNA binding of UDG (Putnam et al., 1999). ϕ29 blocks the host UDG by an interaction with the early protein gp56 (gp0.8) that competes with the DNA-binding ability of UDG (Serrano-Heras et al., 2006, 2007, 2008). Although the ϕ29 genome does not contain uracil (Serrano-Heras et al., 2006), ϕ29 DNAP is able to incorporate dUMP with a catalytic activity only twofold lower than that for dTMP. Because ϕ29 DNAP does not discriminate stringently between dTMP and dUMP, UDG inhibition via gp56 minimizes uracil incorporation.

3. Interactions during transcriptional regulation

Transcriptional regulation of the ϕ29 genome has been studied extensively (Meijer et al., 2001a; Rojo et al., 1998). The transition from early to late gene expression is controlled by gp4 (Fig. 11). Gp4 is a homodimer in solution (Mencía et al., 1996b); it forms a tetramer when bound to the intergenic region between gene 6 and gene 7, which contains the A2b, A2c, and A3 promoters (Fig. 11A). One binding site is upstream of A3, the late promoter that lacks a -35 box (Barthelemy and Salas, 1989). Once gp4 is bound, it recruits the host RNAP via a PPI, leading both to stabilization of the RNAP on the promoter and to transcriptional activation (Nuez et al., 1992). Although gp4 activates late gene expression, it represses early gene expression from the A2b and A2c promoters (Monsalve et al., 1997; Rojo and Salas, 1991). Interestingly, repression of A2c and activation of A3 involve interaction of the same residues of gp4 with the C-terminal domain of the RNAP α subunit (Mencía et al., 1996a; Monsalve et al., 1996a,b). Gp6 also interacts with gp4 in regulating the transcription of promoters A2b, A2c, and A3. Gp6 binds gp4 cooperatively to form a repression complex at promoters A2b and A2c while stimulating A3 (Calles et al., 2002; Elías-Arnanz and Salas, 1999).

4. Virion structure and morphogenesis

Cryo-EM analyses have revealed many details about the ϕ29 virion structure (Morais et al., 2003, 2005; Tao et al., 1998; Xiang et al., 2008, 2009), and many structures of virion proteins and factors are known (Table X). Assembly starts from a connector core particle consisting of 12 subunits of gp10 on which the scaffolding protein gp7 binds, followed by recruitment of the major capsid protein gp8 (Guo et al., 1991; Lee and Guo, 1995). The scaffolding protein is needed to create a lattice of concentric shells (Morais et al., 2003) and helps to organize the single capsomers built up by gp8 except for the basal connector capsomer. ϕ29 proheads were shown to consist of 235 copies of gp8 arranged as hexameric and pentameric capsomers, indicating that direct interactions between gp8 monomers occur (Morais et al., 2005). The capsid fiber protein gp8.5, which is present in ϕ29 proheads, is presumably connected via gp8 to the prohead and is suggested to be organized as homotrimers via interactions of a C-terminal coiled-coil region (Morais et al., 2005). In addition to gp8, gp8.5, and gp10, ϕ29 proheads contain a homopentameric packaging RNA (pRNA) that binds to the connector, forming a basket-like complex (Simpson et al., 2000). pRNA is needed to translocate DNA into the prohead. Furthermore, the DNA translocation machinery needs the ATPase gp16 to drive the packaging reaction (see also chapter 9 of this volume). Gp16 was shown to bind to the connector protein (Ibarra et al., 2001). While other phages use specific DNA sequences to recognize and recruit the genome to the procapsids, ϕ29 DNA packaging initiation depends on the TP gp3 present at the genome ends, indicating that an interaction between gp16 and DNA-bound gp3 recruits the DNA to proheads and initiates packaging (Guo et al., 1987). Release of the scaffolding protein is associated with DNA packaging, and pRNA and gp16 are released when there is no more DNA to package.

An interesting aspect of ϕ29 is that the virion contains several enzymes that are associated with structural proteins. For instance, gp12 is a homotrimer that participates in host cell recognition and entry. It also shows autocatalytic cleavage, although it remains unclear what the function of that reaction is. Similarly, gp13 is bound as a dimer at the distal end of the ϕ29 tail knob to gp9; gp13 cleaves both the polysaccharide backbone and peptide cross-links of the cell wall during infection.

1. Growth physiology

T7 grows on rough strains of E. coli (i.e., those without full-length O-antigen polysaccharide on their surface) and some other enteric bacteria, but close relatives have been described that infect smooth and even capsulated strains (Molineux, 2006a). In such phages, the tail fibers present on T7 are replaced with tailspikes whose enzymatic activity degrades the O- or K-antigens on the cell surface. T7 has a short latent period of 17 min at 37°C, which has resulted in most physiological studies being conducted at 30°C, where infected cells lyse after 30 min. However, adaptation of T7 to high fitness with an excess of an E. coli K-12 strain growing under optimal conditions at 37°C in rich media results in a phage with a latent period of only ≈11 min. This adapted phage can expand its population by many orders of magnitude in 1 hr of growth (Heineman and Bull, 2007). There is no obvious subtlety in the T7 developmental pathway: the early region of its genome is transcribed efficiently by the host RNA polymerase, leading to synthesis of a phage-specific RNA polymerase and the concurrent inhibition of the host enzyme. Thus, T7 rapidly takes control of the transcriptional capacity of the infected cell. Because E. coli mRNAs are intrinsically unstable, T7 thereby also gains control of the translational capacity of the cell. Further, because T7 uses nucleotides derived from the host chromosome for its own replication, more than 80% of the DNA in progeny is derived from the bacterial chromosome and only a small burst is observed after infection of DNA-free minicells. T7 development is not known to be dependent on host chaperones, although T7 does code for two proteins that have been suggested to have chaperone activity in capsid and tail assembly. Aside from host components used for phage adsorption, plus general biosynthetic processes and the translational apparatus, T7 development is remarkably independent of host proteins: only E. coli RNA polymerase, thioredoxin (the cofactor for T7 DNA polymerase), and (deoxy)cytidine monophosphate kinase are known to be essential.

2. Early steps in infection

T7 is a member of the Podoviridae, possessing only a short stubby tail that needs to be extended functionally in order to span the cell envelope and thus allow its genome to be translocated into the cell. Tail extension is effected through internal head proteins that are ejected into the cell at the initiation of infection prior to or concomitant with transport of the leading end of the genome (Chang et al., 2010; Kemp et al., 2005). A schematic of the T7 virion is shown in Figure 2. In T7, the core consists of several copies each of three proteins (gp14, gp15, and gp16), which must interact, albeit perhaps loosely, with themselves and each other in the phage capsid. However, during their ejection into the cell they must at least partially dissociate and also partially unfold in order to pass through the narrow channel through the portal and tail (Molineux, 2006b). The N-terminal domain of gp16 refolds spontaneously in the periplasm of the infected cell to form an enzyme with lytic transglycosylase activity (Moak and Molineux, 2000, 2004). This muralytic activity is only conditionally essential for T7 infection (Moak and Molineux, 2000).

FIGURE 2

Phage T7: genome, virion, and interactome. ORFeome and protein interaction map of bacteriophage T7. Cloned open reading frames (ORFs) and random fragments were used to generate a two-dimensional interaction map. ORFeomes can also be used by structural genomics projects to derive three-dimensional structures. A combination of interaction maps and crystal structures often allows reconstruction of protein complexes as in viral particles or their subunits. (Top) T7 genome with each ORF represented as a box. ORFs indicated by integral numbers were identified originally as essential by genetic screens. Other genes have decimal numbers; of these, only 2.5 and 7.3 are essential. Proteins found to interact with other proteins are indicated by colored boxes; colors indicate a simplified assignment to functional classes as shown. Interactions between proteins are shown as lines, with self-interactions shown as dimers. Hatched lines indicate expected interactions that have not been found by two-hybrid analysis. Gray proteins correspond to host proteins (E. coli). Blue proteins are involved in virus assembly and structure, and, where known, their location in the virus particle is shown on the right. Proteins with thick borders have been crystallized and their structure determined. Genetic map modified after Dunn and Studier (1983). Figure modified after Uetz et al. (2004). Interactions based primarily on data in Table II (see references therein).

The C-terminal residues of gp16, likely together with a more central portion of gp15, also refold in the periplasm and then insert spontaneously into the cytoplasmic membrane, completing the channel across the cell envelope for DNA translocation into the cell. Likely the last of the core proteins to exit the phage head, gp14 is found exclusively within the outer membrane of the infected cell. Using energy derived from the membrane potential, these proteins then ratchet the leading 1 kb of the T7 genome into the cell (Chang et al., 2010; Kemp et al., 2005). In both T7 and T3, genome internalization by this process stops, and the remainder of the DNA is normally pulled into the cytoplasm as a consequence of host and then phage RNA polymerase-catalyzed transcription (Garcia and Molineux, 1995, 1996; Kemp et al., 2004). Several strong promoters lie on this leading 1-kb segment of the T7 genome.

3. Early genes

Transcription from T7 early promoters not only internalizes the remainder of the genome but also leads to expression of early T7 genes, some of whose products interact with and inhibit or modulate the activity of host proteins (Molineux, 2006a). Of the early proteins, only T7 RNA polymerase is essential for phage growth. The first T7 protein to be synthesized is gp0.3, an anti-type I restriction protein (Studier, 1975). The structure of gp0.3 reveals that it resembles double-stranded DNA, and the protein binds to the specificity subunit of the heteropentameric EcoKI enzyme, preventing both it and the cognate modification enzyme from recognizing its target sequence (Atanasiu et al., 2001, 2002; Kennaway et al., 2009; Murray, 2000; Stephanou et al., 2009). T7 gp0.3 is active against all type I enzymes and thus T7 grows normally on B, C, and K-12 strains of E. coli regardless of its previous host. However, the protein affords no protection against type II and little, if any, against type III enzymes. Gp0.3 proteins of different T7-like phages fall into two distinct groups with no sequence similarity; both have anti-restriction activity but the group typified by T3 gp0.3 also has SAMase activity (Studier and Movva, 1976). The second major early protein whose function is known is gp0.7, a serine-threonine kinase (Rahmsdorf et al., 1974) that is also responsible for the shutoff of host transcription. The two activities are separable (Simon and Studier, 1973). Gp0.7 phosphorylates several host proteins, including RNA polymerase (Severinova and Severinov, 2006; Zillig et al., 1975), components of the translational apparatus including EF-G and the small ribosomal subunit protein S6 (Robertson and Nicholson, 1990, 1992; Robertson et al., 1994), and both RNase III (Mayer and Schweiger, 1983) and RNase E (Marchand et al., 2001). T7 shuts down translation of host mRNAs but it remains unclear whether EF-G or S6 is involved directly. An unknown T7 late gene also modulates the specificity of RNase E (Savalia et al., 2010). Although it is likely that others do so, only 1 other of the 10 early T7 proteins is currently known to interact with host proteins. Gp1.2 binds to E. coli dGTPase, converting it into an rGTP-binding protein (Huber et al., 1988; Nakai and Richardson, 1990); gp1.2 also binds to the membrane protein FxsA and the F plasmid protein PifA (Cheng et al., 2004; Schmitt and Molineux, 1991; Wang et al., 1999).

4. T7 RNA polymerase

The hallmark of T7 is of course its phage-encoded RNA polymerase, which, together with its regulator T7 lysozyme, is responsible for most T7 gene expression. It is also essential for synthesizing RNA primers at the primary origin of replication (Fujiyama et al., 1981) and for initiation of DNA packaging (Zhang and Studier, 1997, 2004). In addition to several complexes containing a promoter and product RNAs (Cheetham and Steitz, 1999; Cheetham et al., 1999; Kennedy et al., 2007; Tahirov et al., 2002; Yin and Steitz, 2002, 2004), the crystal structure of the RNA polymerase–lysozyme complex has been determined (Jeruzalmi and Steitz, 1998). Many mutants in both RNA polymerase and lysozyme have been characterized that affect their functional interaction, but it is not obvious from the crystal structures how the mutations affect the regulation of transcription. Lysozyme is thought to target the initial transcribing complex in a promoter strength-dependent manner and have no effect on transcription elongation (Huang et al., 1999; Stano and Patel, 2004; Zhang and Studier, 1997, 2004). Lysozyme is, however, important in pausing or terminating transcription at the class II terminator CJ, where it probably interacts directly with the T7 terminase protein gp19 and thus directs the packaging machinery to create the DNA terminus (Lyakhov et al., 1997, 1998; Zhang and Studier, 2004).

After T7 RNA polymerase has been synthesized, there is no further requirement for the host polymerase and it actually becomes inhibitory to phage development. Inactivation is caused by the T7 gp2 protein, which binds to the β′ subunit and inhibits transcription by interfering with strand separation of the open complex at most promoters (Camara et al., 2010; Nechaev and Severinov, 1999). In vivo, the only T7 promoter that must be inhibited by gp2 is A3 (Savalia et al., 2010). Gp2 was found to interact with T7 endonuclease gp3 by a yeast two-hybrid screen (Bartel et al., 1996). An interaction between gp2 and gp3 (albeit not necessarily direct) was suggested earlier (Mooney et al., 1980), but the idea has not yet been critically evaluated experimentally.

5. DNA replication

T7 has been one of the primary model systems for understanding the biochemistry of DNA replication. Central to T7 replication is T7 DNA polymerase, which comprises a heterodimer of gp5 and the host thioredoxin. Binding of gp5 to thioredoxin is strong, with the latter serving as the processivity factor for DNA replication, increasing the length of DNA synthesized per binding event 100-fold. The interface between the two proteins also provides the docking site for binding of the other two proteins, SSB (gp2.5) and the primase–helicase gp4, which are central to T7 replication in vivo. The process of T7 replication in vitro has been reviewed recently (Hamdan and Richardson, 2009).

Two other proteins in the DNA replication cluster are known to interact with host proteins. The gene 5.5 product binds to the abundant nucleoid-associated protein H-NS (Ali et al., 2011) and in vitro relieves its inhibition of both E. coli and T7 RNA polymerase-catalyzed transcription (Studier, 1981). In vivo, T7 5.5 mutants direct reduced rates of DNA replication and exhibit a small plaque phenotype (Lin, 1992; Liu and Richardson, 1993; Studier, 1981). However, the phenotype persists in hns-defective strains, although an hns mutant that renders gene 5.5 essential has been isolated (Liu and Richardson, 1993). Gp5.5 also interacts with the λ RexAB system; it is not known whether this is a direct protein–protein interaction or whether gp5.5 interacts with a product of RexAB activity. A gene 5.5 missense mutant, but not a null mutant, makes T7 sensitive to Rex-mediated exclusion. The product of gene 5.9 inhibits the E. coli RecBCD nuclease in a manner comparable to the λ Gam protein; this interaction is also nonessential, as mutants lacking gene 5.9 grow normally in rec(BCD)^+ hosts (Lin, 1992).

The assembly of T7 virions involves several symmetry mismatches and many protein–protein interactions, some of which may be helped by other T7 proteins that potentially have a chaperone-like activity (Kemp et al., 2005). Note that, GenBank annotations notwithstanding, gp13 is not a structural component of the virion (Kemp et al., 2005). Assembly is thought to initiate with oligomerization of the portal protein gp8 to form a dodecameric ring, although both 12- and 13-mers are observed following overexpression of gene 8 (Cerritelli and Studier, 1996b; Kemp et al., 2005; Valpuesta et al., 1992, 2000). The pathway of prohead formation may then involve assembly of the internal core (gp14, gp15, and gp16) on the inner surface of the portal, perhaps concomitant with the recruitment of capsid protein gp10 hexamers bound to scaffolding protein gp9. Subsequent recruitment of gp10 pentamers and hexamers results in shell formation around a scaffold of gp9 (Cerritelli et al., 2003a). The 12-fold symmetrical portal protein faces a symmetry mismatch with the 5-fold vertex of the icosahedral capsid shell. The internal core of T7 contains 4 copies of gp16, 8 copies of gp15, and either 10 or 12 copies of gp14 (Agirrezabala et al., 2007; Cerritelli et al., 2003b; Kemp et al., 2005). These proteins are ejected into the infected cell in order to transport the phage genome into the cytoplasm, and the protein–protein interactions in the mature virion are likely quite different than those found in the infected cell. Gp14 lies at the base of the internal core and is therefore juxtaposed to the 12-fold symmetrical portal protein. Complicating our understanding of the T7 virion is the identification of ≈18 copies of the small gp6.7 protein, which is ejected into the outer membrane of the infected cell at the initiation of infection (Kemp et al., 2005). It is not known where gp6.7 is located; it likely forms part of the internal core or lies within the portal channel.

6. Tail

The outer face of the portal is the site of assembly of the sixfold symmetrical stubby tail. The tail is composed of three proteins (gp7.3, gp11, and gp12) to which six trimers of the tail fiber protein gp17 are attached. Each trimer consists of three parts. The N-terminal third binds to the tail and is conserved in phages related to T7 that have a different host range. It is followed by a triple-stranded coiled-coil of ≈120 residues and then a C-terminal half consisting of globular domains that interact with the cell surface lipopolysaccharide (Steven et al., 1988). The body of the tail consists of 12 copies of gp11, 6 copies of gp12, and ≈30 copies of the small protein gp7.3 (Kemp et al., 2005). How these proteins are arranged is unknown. It has been suggested that gp7.3 functions as a tail assembly protein that remains with the mature virion; the protein is ejected into the outer membrane of the infected cell (Chang et al., 2010; Kemp et al., 2005). Host range mutants of T7 suggest that all three proteins interact directly with the lipopolysaccharide of the host cell.

7. Interactome screening

T7 was the focus of the first genome-wide screen for phage protein–protein interactions using the yeast two-hybrid assay (Bartel et al., 1996). As proof of principle, T7 was a good choice: the extensive genetic, molecular biology, and biochemical studies that had been performed on T7 provided a solid basis for evaluating the accuracy and completeness of the approach. Twenty-five interactions were identified in the screen, 15 of which were intramolecular. Many of these are to be expected, as even monomeric polypeptides fold into a three-dimensional structure. Virion assembly clearly involves extensive protein–protein interactions, but as discussed by the authors, many of those that were known or plausibly expected from known structures were missed in the screen (Bartel et al., 1996). Conversely, screen-predicted interactions involving gp1.7 were confirmed subsequently by its purification as an oligomer (Tran et al., 2010), and the predicted interactions between the spanin proteins gp18.5 and gp18.7 predated the discovery of a heterodimer even in phage λ (Casjens et al., 1989). Six intermolecular interactions were predicted that have not yet found confirmatory support, and our current understanding of T7 suggests that some are implausible. Improved screening procedures should help resolve these outstanding issues. If more intermolecular interactions can be found that are consistent with, in particular, the pathways of virion assembly, the utility of those screening procedures in predicting the protein interactome of lesser-studied phages will be enhanced greatly. All published T7 interactions are summarized in Table II.
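One way to make the comparison between the two-hybrid screen and the rest of the literature concrete is to sort interaction records by their supporting references. The sketch below is illustrative only: it uses a small subset of Table II rows, the names `ROWS` and `classify` are hypothetical, and its counts should not be compared with the totals quoted in the text because the subset is incomplete.

```python
# Reference string that marks detection in the yeast two-hybrid screen.
Y2H_REF = "Bartel et al., 1996"

# (protein, partner, references) transcribed from a subset of Table II rows.
ROWS = [
    ("gp0.3", "gp4.5", ["Bartel et al., 1996"]),
    ("gp0.7", "gp4.5", ["Bartel et al., 1996"]),
    ("gp1", "gp3.5", ["Jeruzalmi and Steitz, 1998", "Zhang and Studier, 2004"]),
    ("gp1.7", "gp1.7", ["Tran et al., 2010", "Bartel et al., 1996"]),
    ("gp2", "gp3", ["Bartel et al., 1996"]),
    ("gp2.5", "gp2.5", ["Rezende et al., 2002", "Bartel et al., 1996"]),
    ("gp4", "gp4", ["Singleton et al., 2000", "Bartel et al., 1996"]),
    ("gp5", "gp4", ["Ghosh et al., 2008", "Hamdan et al., 2005",
                    "Bartel et al., 1996"]),
    ("gp5", "gp4.7", ["Bartel et al., 1996"]),
    ("gp5", "gp6.5", ["Bartel et al., 1996"]),
    ("gp5.3", "gp5.3", ["Bartel et al., 1996"]),
]


def classify(rows, y2h_ref=Y2H_REF):
    """Group pairs by whether the screen and/or other studies support them."""
    groups = {"y2h_only": [], "y2h_corroborated": [], "other_only": []}
    for a, b, refs in rows:
        in_screen = y2h_ref in refs
        has_other = any(ref != y2h_ref for ref in refs)
        if in_screen and has_other:
            groups["y2h_corroborated"].append((a, b))
        elif in_screen:
            groups["y2h_only"].append((a, b))
        else:
            groups["other_only"].append((a, b))
    return groups


if __name__ == "__main__":
    for label, pairs in classify(ROWS).items():
        print(label, len(pairs), pairs)
```

Pairs that land in `y2h_only`, such as gp2 with gp3, are the ones that still await independent support; pairs in `other_only` illustrate interactions known from other methods that the screen did not report.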

TABLE II

Protein–protein interactions of phage T7

Protein	Function	Interacts with	References
Class I
gp0.3	B-DNA mimic	Host HsdRMS	Atanasiu et al., 2001, 2002; Kennaway et al., 2009; Murray, 2000; Stephanou et al., 2009
gp0.3		gp0.3 (dimer)	Atanasiu et al., 2001
gp0.3		gp4.5	Bartel et al., 1996
gp0.7	Protein kinase	Host RNAP (activity change)	Rahmsdorf et al., 1974; Severinova and Severinov, 2006; Zillig et al., 1975
gp0.7		RNase III (activity change)	Mayer and Schweiger, 1983
gp0.7		RNase E (activity change)	Marchand et al., 2001
gp0.7		Translation components	Robertson et al., 1994; Robertson and Nicholson, 1990, 1992
gp0.7		gp4.5	Bartel et al., 1996
gp1	T7 RNA polymerase	gp3.5	Jeruzalmi and Steitz, 1998; Zhang and Studier, 2004
gp1		gp19 (genetic)	Zhang and Studier, 1995, 2004
gp1.2	E. coli dGTPase inhibitor; F-exclusion	Host Dgt (dGTPase)	Huber et al., 1988; Nakai and Richardson, 1990; Saito and Richardson, 1981
gp1.2		F plasmid PifA	Cheng et al., 2004; Schmitt and Molineux, 1991; Wang et al., 1999
gp1.2		Host FxsA	Cheng et al., 2004; Wang et al., 1999
Class II
gp1.7	Nucleotide kinase	gp1.7 (oligomer)	Tran et al., 2010; Bartel et al., 1996
gp2	E. coli RNAP inhibitor	Host RNAP	Camara et al., 2010; Nechaev and Severinov, 1999
gp2		gp3	Bartel et al., 1996
gp2.5	SSB	gp5 (with thioredoxin)	Hamdan et al., 2005; He et al., 2003; Ghosh et al., 2008, 2010; Marintcheva et al., 2009
gp2.5		gp4	Ghosh et al., 2008; Hamdan et al., 2005; He and Richardson, 2004
gp2.5		gp2.5 (dimer)	Rezende et al., 2002; Bartel et al., 1996
gp3	Endonuclease I, HJ resolvase	gp3 (dimer)	Hadden et al., 2007; Parkinson and Lilley, 1997
gp3.5	Amidase (lysozyme)	gp1	Jeruzalmi and Steitz, 1998; Zhang and Studier, 2004
gp4	gp4A primase-helicase; gp4B helicase	gp5 (with thioredoxin)	Hamdan et al., 2005; He and Richardson, 2004
gp4		gp2.5	Hamdan et al., 2005; He and Richardson, 2004
gp4		gp4 (hexamer)	Singleton et al., 2000; Bartel et al., 1996
gp4.5		gp0.7	Bartel et al., 1996
gp5	DNA polymerase	gp4 (with thioredoxin)	Ghosh et al., 2008; Hamdan et al., 2005; Bartel et al., 1996
gp5		gp2.5 (with thioredoxin)	Ghosh et al., 2008, 2010; He et al., 2003; Marintcheva et al., 2009; Hamdan et al., 2005
gp5		gp4.7	Bartel et al., 1996
gp5		gp6.5	Bartel et al., 1996
gp5		Host thioredoxin	Ghosh et al., 2008
gp5.3		gp5.3	Bartel et al., 1996
gp5.5		Host H-NS	Lin, 1992; Liu and Richardson, 1993
gp5.9	Inhibits RecBCD nuclease	Host RecBCD	Lin, 1992
Class III
gp6.7	Head completion; ejected virion protein	Other head proteins likely. gp13?	Kemp et al., 2005
gp7.3	Tail chaperone; ejected structural virion protein	gp7.3, gp11, gp12	Kemp et al., 2005
gp7.3		Host LPS	Kemp et al., 2005
gp8	Head–tail connector (portal) protein	gp19	Fujisawa et al., 1991; Morita et al., 1995
gp8		gp9/10/gp6.7/7.3/11/12/14 (structure)	Agirrezabala et al., 2005a, 2007; Cerritelli and Studier, 1996a; Liu et al., 2010
gp8		gp8 (dodecamer) (structure)	Agirrezabala et al., 2005b; Cerritelli and Studier, 1996b
gp9	Scaffolding protein	gp8, gp10 (structure)	Agirrezabala et al., 2007; Cerritelli and Studier, 1996a,b
gp9	Scaffolding protein	gp8, gp10 (structure)	Bartel et al., 1996
gp9		gp14/15/16 (structure)	—
gp10	Major capsid protein (-1 frame-shift -> minor gp10B)	gp8, gp9, gp14 (structure)	Agirrezabala et al., 2005a,b, 2007; Cerritelli and Studier, 1996a; Cerritelli and Studier, 1996b
gp10	Major capsid protein (-1 frame-shift -> minor gp10B)	gp10 (pentamer/hexamer) (structure)	Cerritelli and Studier, 1996a; Bartel et al., 1996
gp10		Host FxsA; F plasmid PifA	Cheng et al., 2004; Schmitt and Molineux, 1991; Wang et al., 1999
gp11	Tail protein	gp7.3, gp8, gp11, gp12 (structure)	Kemp et al., 2005; Agirrezabala et al., 2007; Liu et al., 2010
gp11		Host LPS	Kemp et al., 2005
gp11		gp2.5	Bartel et al., 1996
gp12	Tail protein	gp7.3, gp8, gp11, gp12 (structure)	Kemp et al., 2005; Liu et al., 2010
gp12		Host LPS	Kemp et al., 2005; Qimron et al., 2006
gp13	Virion chaperone	gp6.7, gp8, gp12, gp14, gp17, gp19 (genetics)	Kemp et al., 2005; Stone and Miller, 1985
gp14	Ejected core protein	gp8, gp14, gp15, gp16 (structure)	Agirrezabala et al., 2007; Liu et al., 2010
gp15	Ejected core protein	gp14, gp15, gp16 (structure)	Agirrezabala et al., 2007; Chang et al., 2010; Liu et al., 2010; Bartel et al., 1996
gp16	Ejected core protein	gp14, gp15, gp16 (structure)	Agirrezabala et al., 2007; Chang et al., 2010; Liu et al., 2010
gp16		Peptidoglycan	Moak and Molineux, 2000, 2004
gp17	Tail fiber protein	gp7.3/11/12 (structure)	Agirrezabala et al., 2007; Chang et al., 2010; Liu et al., 2010
gp17		gp17 (trimer)	Steven et al., 1988; Bartel et al., 1996
gp17		Host LPS	Steven et al., 1988; Qimron et al., 2006
gp18	Small terminase	gp18 (octamer)	White and Richardson, 1987; Hamada et al., 1986
gp18.5–gp18.7	Rz-Rz1 homologues; spanin	gp18.5–gp18.7	Casjens et al., 1989; Summer et al., 2007; Bartel et al., 1996
gp19	Large terminase	gp8	Fujisawa et al., 1991; Hamada et al., 1986; Morita et al., 1994, 1995
gp19		gp19 (hexamer in the presence of prohead)	Fujisawa et al., 1991
gp19		gp4.7	Bartel et al., 1996

1. The λ virion

The phage λ genome and virion are shown in Figure 3. Note that the virion contains tail fibers that are not visible in most λ micrographs. The variant of λ used in most laboratories around the world, λPaPa, has a frameshift mutation in the structural stf gene and lacks the long “side tail fibers” of Ur-λ, the λ that comes directly from the original E. coli K-12 lysogen. It is still not entirely clear how many different proteins are in the λ particle, but 12 are known with confidence and 2 more (gpM and gpL) are possibly in the particle in very low numbers. A current model of the particle with its known proteins is shown in Figure 3. Based on this model, we can predict at least 23 PPIs in the virion (Table III), although several are inferred from genetic experiments and remain to be detected by biochemical or biophysical methods. The virion adsorbs to sensitive cells through an interaction between the tail tip protein gpJ and the host LamB outer membrane protein. DNA is then released through the tail into the cytoplasm.

FIGURE 3

Genome and virion of phage λ. (A) Genome of phage λ. Colored ORFs correspond to colored proteins in (B). Main transcripts are shown as arrows. After Hendrix and Casjens in Calendar (2006). (B) Schematic model of λ virion. Numbers indicate the number of protein copies in the particle. It is unclear whether gpM and gpL proteins are in the final particle or only required for assembly. (C) Electron micrograph of phage λ. Modified after Hendrix and Casjens (2006).

TABLE III

Published protein–protein interactions of phage λ

λ 1	λ 2	Notes	References
gpA	gpNu1	gpA (N-terminal) – gpNu1 (C-terminal)	Catalano, 2000; de Beer et al., 2002; Hang et al., 1999
gpA	gpB	gpA (C-terminal) – gpB (= portal)	Catalano, 2000; Duffy and Feiss, 2002
gpW	gpB	gpW (NMR, head–tail joining)	Casjens, 1974; Casjens et al., 1972
gpW	gpFII	gpW required for gpFII binding	Casjens, 1974; Maxwell et al., 2001
gpB	gpB	12-mer (22 amino acids removed from gpB N-terminal)	Kochan, Carrascosa and Murialdo, 1984; Tsui and Hendrix, 1980
gpC	gpE	Covalent PPI (in virion?)	Hendrix and Casjens, 1974; Hendrix and Duda, 1998
gpC′	gpB		Murialdo, 1979
gpC	gpNu3	gpC may degrade gpNu3 (before DNA packaging)	Hendrix and Casjens, 1975; Hohn et al., 1975; Medina et al., 2010; Murialdo and Becker, 1978
gpD	gpD	gpD forms trimers	Mikawa et al., 1996; Sternberg and Hoess, 1995; Yang et al., 2000
gpE	gpE	Major capsid protein	Casjens and Hendrix, 1974; Dokland and Murialdo, 1993; Lander et al., 2008
gpU	gpU	“Probably a hexamer”	Katsura and Tsugita, 1977
gpD	gpE		Casjens and Hendrix, 1974; Dokland and Murialdo, 1993; Lander et al., 2008
gpU	gpV	gpU is head-proximal tail protein	Pell et al., 2009
gpV	gpV		Buchwald et al., 1970a,b; Casjens and Hendrix, 1974; Katsura, 1983
gpO	gpP		Wickner and Zahn, 1986; Zahn and Blattner, 1985, 1987; Zahn and Landy, 1996
gpO	gpO	gpO–gpO interactions when bound to ori DNA	Alfano and McMacken, 1989
gpB	gpA	Genetic evidence	Frackman et al., 1985; Sippy and Feiss, 1992
gpFI	gpA	Genetic evidence	Catalano and Tomka, 1995
gpFI	gpE	Genetic evidence	Murialdo and Tzamtzis, 1997
gpH	gpG/gpGT	gpG/gpGT holds gpH in an extended fashion	Hendrix and Casjens, 2006
gpV	gpGT	T domain binds soluble gpV	Hendrix and Casjens, 2006
gpH	gpV	gpV probably assembles around gpH, displacing gpG/gpGT	Xu et al., 2004
CI	CI	Forms octamer that links O_R to O_L	Bell and Lewis, 2001; Dodd et al., 2001
Exo	Bet		Radding et al., 1971
SieB	Esc	Esc is encoded in frame in SieB and inhibits SieB	Ranade and Poteete, 1993a,b
CII	CII	Homotetramers	Ho and Rosenberg, 1982
Xis	Int		Sam et al., 2002
Xis	Xis	Xis-Xis binding mediates cooperative DNA binding	Sam et al., 2004
Int	Int	Dimer	Warren et al., 2003
Stf	Stf	Likely homotrimer by homology with T4 fibers
gpS	gpS	Large ring in inner membrane	Savva et al., 2008
gpS	gpS′	gpS′ inhibits gpS ring formation	Grundling et al., 2000
gpB	gpE	Copurify in procapsid	Hendrix and Casjens, 1975
gpNu3	gpNu3	Monomer–dimer equilibrium	C. Catalano, personal communication
CIII	CIII	Dimer	Halder et al., 2007
Cro	Cro	Dimer; X-ray structure	Anderson et al., 1979
gpRz	gpRz1	Heteromultimer supposed to span the periplasm	Summer et al., 2007
gpNu3	gpB	gpNu3 required for gpB incorporation into procapsid	Ray and Murialdo, 1975
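
A pairwise table such as Table III maps naturally onto an undirected interaction network. The following is only a minimal sketch of that idea: it uses a handful of rows copied from Table III, and the names `PAIRS` and `build_network` are illustrative assumptions, not part of any published analysis. It separates homomeric interactions (a protein binding itself) from heteromeric ones and collapses rows that list the same pair in reverse order.

```python
from collections import defaultdict

# A few rows copied from Table III as (protein 1, protein 2) pairs.
# This is a small illustrative subset, not the complete table.
PAIRS = [
    ("gpA", "gpNu1"),
    ("gpA", "gpB"),
    ("gpW", "gpB"),
    ("gpW", "gpFII"),
    ("gpB", "gpB"),
    ("gpD", "gpD"),
    ("gpE", "gpE"),
    ("gpD", "gpE"),
    ("gpB", "gpA"),  # genetic evidence; same pair as ("gpA", "gpB")
]


def build_network(pairs):
    """Return (partners, homomers) for a list of protein pairs.

    partners: undirected adjacency map for heteromeric interactions.
    homomers: set of proteins listed as interacting with themselves.
    """
    partners = defaultdict(set)
    homomers = set()
    for a, b in pairs:
        if a == b:
            homomers.add(a)
        else:
            partners[a].add(b)
            partners[b].add(a)
    return dict(partners), homomers


partners, homomers = build_network(PAIRS)

# Count each heteromeric pair once, regardless of the order it was listed in.
unique_hetero = {frozenset(pair) for pair in PAIRS if pair[0] != pair[1]}

print(sorted(partners["gpB"]))
print(sorted(homomers))
print(len(unique_hetero))
```

Treating the pairs as unordered matters here because the table lists gpA–gpB twice, once from biochemical and once from genetic evidence; a count based on raw rows would overstate the number of distinct interactions.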

2. Assembly of virion head (Fig. 4)

FIGURE 4

Assembly of the phage λ head. Head assembly has been subdivided into five steps, although most steps are not very well understood in mechanistic terms. Note that the tail is assembled independently. GpC protease, scaffolding protein gpNu3, and portal protein gpB form an ill-defined initiator structure. Protein gpE joins this complex in a step requiring the chaperonins GroES and GroEL. Proteins gpNu1, gpA, and gpFI are required for DNA packaging. GpD joins and stabilizes the capsid as a structural protein; gpFII and gpW connect the head to the tail, which joins once the head is completed. Modified after Georgopoulos et al. (1983) and Lander et al. (2008).

λ encodes 10 proteins required for head assembly, of which 6 end up in the mature virion. Head assembly starts with assembly of the portal, which requires the host GroEL/S chaperone for folding. It is thought that the portal serves as the nucleus for assembling the remaining capsid. However, the mechanistic details of assembly are not well understood. GpC is involved as a head maturation protease that becomes attached covalently to gpE, the main capsid protein. The resulting fusion proteins are subsequently trimmed to make slightly shortened products named X1 and X2 (Hendrix and Casjens, 1974). However, more recent attempts to identify X1 and X2 were unsuccessful, and thus X1 and X2 may be artifacts (Medina et al., 2010). GpNu3 acts as a putative scaffolding protein that aids coat shell assembly and is removed from the capsid by proteolysis before DNA is packaged. Structures of the procapsid and mature λ capsid have been solved by cryo-EM, showing the T=7 arrangement of capsid protein gpE clustered into hexamers and pentamers (Dokland and Murialdo, 1993; Lander et al., 2008). Trimers of the decoration protein gpD bind to the capsid at trivalent interaction points between the capsomers and stabilize the capsid (Sternberg and Weisberg, 1977). Comparison of the procapsid with the mature capsid reveals the considerable structural transitions that occur upon capsid maturation and DNA packaging, but this process is not well understood mechanistically.

3. Assembly of virion tail (Fig. 5)

FIGURE 5

Assembly of the phage λ tail. The λ tail is made of at least six proteins (gpU, V, J, H, tfa, stf) with another seven required for assembly (gpI, M, L, K, G/T, Z). Assembly starts with protein gpJ, which then, in a poorly characterized fashion, requires proteins gpI, gpL, gpK, and gpG/T to add the tape measure protein H. GpM joins and then gpG and gpG/T leave the complex so that the main tail protein gpV can assemble on the gpJ/H scaffold. Finally, gpU is added to the head-proximal end of the tail. GpZ is required to connect the tail to the preassembled head. Protein gpH is cleaved by the action of gpU and gpZ (Tsui and Hendrix, 1983). It remains unclear if proteins gpM and gpL are part of the final particle (Hendrix and Casjens, 2006). From http://www.pitt.edu/~duda/lambdatail.html.

Tail assembly begins with the antireceptor protein gpJ. Proteins gpI, gpK, gpL, and gpM assemble around gpJ in that order, although no mechanistic or structural details are known about this process. In parallel events, gpH, gpG, and gpG-T (the frameshifted form of gpG including the T reading frame) form a complex that binds the tail tip and the major tail protein (gpV). The tape measure protein gpH determines the length of the phage tail (Katsura, 1990). Once the correct length is achieved, gpV assembly stops and the proximal tail protein gpU is added. Protein gpZ somehow facilitates head–tail joining without being assembled into the final virion.

4. Lysogeny and induction

After phage DNA ejection, circularization, and ligation by the host DNA ligase, the earliest genes to be expressed include N, O, and P, as well as cII and cIII, which play a key role in the lysis/lysogeny decision (Echols, 1986). Both CII and CIII can interact with and be cleaved by the bacterial FtsH (HflB) protease directly, which in turn is complexed with the host HflK and HflC proteins at the cytoplasmic membrane. HflD, a cytoplasmic membrane protein, also binds CII, making it both more susceptible to FtsH and less able to bind DNA (Kihara et al., 2001; Parua et al., 2010). As a substrate for FtsH, CIII acts as a competitive inhibitor of CII degradation and, as the half-life of CII protein increases, leads to an increase in CI repressor synthesis (Datta et al., 2005a; Herman et al., 1997; Kobiler et al., 2007). Except when cIII is overexpressed, lysogeny is only established when the activity of these proteases (FtsH, Lon, and perhaps others) is low. Once established, lysogeny is a stable state until lysogens are exposed to ultraviolet light or other DNA-damaging agents that lead to the activation of RecA, which in turn accelerates autocleavage of the λ repressor CI. Loss of CI leads to prophage induction and the lytic cycle. The mechanism by which phage λ decides whether to grow lytically or become a lysogen is reviewed elsewhere (Court et al., 2007; Friedman and Court, 2001; Gottesman and Weisberg, 2004; Ptashne, 2006). However, Table IV contains the host–phage protein–protein interactions critical for gene expression and thus the genetic switch of phage λ.

TABLE IV

Published phage λ–host interactions^a

λ	Host	Description	References
Transcription
CI	RecA	RecA degrades CI^b	Kobiler et al., 2004
CI	RpoA		Kędzierska et al., 2007
CI	RpoD		Li et al., 1994; Nickels et al., 2002
CII	ClpYQ	ClpYQ degrades CII in vitro	Kobiler et al., 2004
CII	ClpAP	ClpAP degrades CII in vitro	Kobiler et al., 2004
CII	HflD	HflD makes CII more vulnerable to FtsH (hfl, high-frequency lysogenization)	Kihara et al., 2001
CII	HflB^c	HflB (= FtsH) protease degrades cII	Kobiler et al., 2002; Shotland et al., 2000
CII	RpoA		Kędzierska et al., 2004; Marr et al., 2004
CII	RpoD		Kędzierska et al., 2004
CIII	HflB	CIII inhibits HflB; HflB degrades CIII	Herman et al., 1997; Kornitzer et al., 1991
gpN	NusA	Transcriptional regulation	Friedman and Court, 2001
gpN	Lon	Lon degrades gpN	Kobiler et al., 2004; Gottesman et al., 1981
gpQ	σ⁷⁰	gpQ makes RNAP insensitive to cis terminal	Marr et al., 2001; Nickels et al., 2002; Roberts and Roberts, 1996; Roberts et al., 1998
Head
gpB	GroE	genetic interaction; E. coli GroE	Ang et al., 2000; Georgopoulos et al., 1973; Murialdo, 1979
gpE	GroE	genetic interaction	Georgopoulos et al., 1973
Tail
gpJ	LamB	LamB is the E. coli receptor	Buchwald and Siminovitch, 1969; Clement et al., 1983; Mount et al., 1968; Wang et al., 2000a; Werts et al., 1994
Stf	OmpC	OmpC is a secondary E. coli receptor	Hendrix and Duda, 1992
Recombination
Xis	Lon	Xis is degraded by E. coli Lon protease	Leffers and Gottesman, 1998
Xis	FtsH	Xis is degraded by E. coli FtsH protease	Leffers and Gottesman, 1998
Xis	Fis	both required for excision	Ball and Johnson, 1991; Cho et al., 2002; Esposito and Gerard, 2003
Int	IHF	Both catalyze recombination at attP/attB	Campbell et al., 2002; Crisona et al., 1999
Gam	RecB	Gam inhibits RecBCD	Marsic et al., 1993
NinB	SSB	NinB also binds ssDNA	Maxwell et al., 2005
Replication
gpO	ClpXP	ClpXP degrades gpO	Kobiler et al., 2004
gpP	DnaA		Datta et al., 2005b,c
gpP	DnaB		Mallory et al., 1990

^a CbpA (Wegrzyn et al., 1996) perhaps interacts with DnaA and SeqA (Potrykus et al., 2002; Wegrzyn and Wegrzyn, 2001, 2002).

^b After DNA damage, RecA bound to single-stranded DNA stimulates autocleavage of the CI repressor.

^c HflB = FtsH.

5. Anti-termination

Escherichia coli RNA polymerase transcribes λ DNA starting from the immediate early promoters P_L and P_R, which leads to expression of the N and cro genes, among others. As soon as the gpN-binding sites (nut sites) are transcribed, gpN modifies the transcription complex by recognizing a stem-loop structure in the mRNA and by binding to host Nus proteins [note that gpN interacts with nut RNA (not DNA)]. This modified complex is capable of overriding the termination signal that normally stops immediate early transcription. Because gpN is continually being degraded by the Lon protease, ongoing anti-termination requires continual transcription of the N gene (Fig. 6). In addition to the anti-terminator for early lytic genes (gpN), a second anti-terminator, gpQ, is required for late gene expression. GpQ anti-termination differs from that of gpN in that gpQ binds to the “Q binding element” (QBE, also known as the qut site) of the P_R′ promoter (i.e., unlike the gpN protein, gpQ binds DNA). When RNA polymerase (RNAP) initiates transcription at the P_R′ promoter, it almost immediately pauses at a pause-inducing sequence. Pausing allows gpQ to contact σ⁷⁰ in the RNAP, which causes a conformational change in the enzyme that enables gpQ to bind the β subunit (RpoB). The gpQ-containing RNAP is then able to override the termination signal at t_R′ (Deighan et al., 2008; Nickels et al., 2002).

FIGURE 6

Interactions of phage λ with its host, E. coli. (A) Overview of λ activities in E. coli: (a) free phage, (b) binding and entry, (c) DNA circularization and supercoiling, (d) transcription, (e) replication, and (f) virion assembly. Details are shown in B–E. (B) Infection and DNA ejection, (C) DNA ligation, (D) gene regulation. An asterisk means weakly bound to the RNA polymerase. (E) DNA replication. Phage proteins are labeled in red, host proteins in black. Modified after Das (1992) and Mason and Greenblatt (1991).

6. Replication

After synthesis of the λ O and P gene products (gpO and gpP), replication of the phage genome is initiated (Fig. 6). Replication initiation starts with dimeric gpO bound to the four 18-bp direct repeats of the λ origin, located within the sequence of the O gene. Each repeat binds a gpO dimer, leading to a total of eight monomers bound directly to the DNA, in addition to non-DNA-bound gpO that associates with the complex solely by PPIs (Dodson et al., 1986). This large nucleoprotein complex is called the O-some. GpP extracts the host DnaB helicase from its native complex with DnaC (Zylicz et al., 1989). The DnaB helicase is inactive as long as it is tightly associated with gpP. The gpP–DnaB pair is then added to the O-some through interactions between gpP and gpO. Three host heat shock chaperone proteins (DnaK, DnaJ, and GrpE) are subsequently recruited (DnaK through direct interaction with gpP and gpO) and activate the DnaB helicase by liberating it from its inhibitory interaction with gpP. Finally, gpP and the chaperones are released, and replication starts with the help of the host DnaG primase, DNA polymerase III, gyrase, and single-stranded DNA-binding (SSB) proteins (Dodson et al., 1986; LeBowitz and McMacken, 1986). The RNA polymerase β subunit was shown to interact directly with gpO (Szambowska et al., 2010). Although it is not yet fully understood, the way in which E. coli RNA polymerase activates λ ori seems to be far more complex than the original idea that it simply changes DNA topology.

Studies on λ ori-dependent plasmids have revealed interactions with even more host proteins, including the DnaJ homolog CbpA (Wegrzyn et al., 1996), and through transcriptional activation of origin activity, perhaps also with DnaA and SeqA (Potrykus et al., 2002; Wegrzyn and Wegrzyn, 2001, 2002). As λ ori-dependent plasmid maintenance and copy number are very sensitive to changes in host physiology, interactions of the O-some with additional proteins may yet be revealed.

7. The λ interactome

A λ Y2H interactome analysis has been completed (Rajagopala et al., 2011). All λ open reading frames (ORFs) were cloned and tested in all possible pairs for interactions, using three different Y2H vectors. These screens identified 97 interactions among 68 tested λ proteins, of which 16 were known previously. Notably, because only 33 interactions had been reported previously, this screen recovered nearly half (16 of 33) of all previously published interactions in a single experiment. At least 18 of the newly observed Y2H interactions seem to be plausible, based on the functions of the interacting proteins, indicating that even in λ many new discoveries can be made. Interestingly, relatively few interactions involving morphogenetic proteins were revealed, and possible reasons for the still significant fraction of apparent false negatives are discussed later.
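
The scale and coverage of such an all-against-all screen can be made concrete with a little arithmetic. The sketch below uses only the counts quoted in the text (68 proteins, 97 Y2H interactions, 16 previously known, 33 previously published). How pairs are counted is an assumption on our part: the text does not say whether self-pairs or both bait/prey orientations were included, so both conventions are shown. The variable and function names are illustrative.

```python
from fractions import Fraction

# Counts taken from the text.
N_PROTEINS = 68   # lambda proteins tested
Y2H_TOTAL = 97    # interactions found in the Y2H screens
Y2H_KNOWN = 16    # of those, previously published
PUBLISHED = 33    # previously published lambda-lambda interactions


def unordered_pairs(n):
    """Distinct protein pairs, counting self-pairs (possible homomers)."""
    return n * (n + 1) // 2


def ordered_pairs(n):
    """Bait/prey combinations if every protein is tested in both orientations."""
    return n * n


novel = Y2H_TOTAL - Y2H_KNOWN               # interactions new to the literature
missed = PUBLISHED - Y2H_KNOWN              # published interactions not recovered
recovered = Fraction(Y2H_KNOWN, PUBLISHED)  # fraction of published PPIs recovered

print(f"unordered pairs to test: {unordered_pairs(N_PROTEINS)}")
print(f"ordered bait/prey pairs: {ordered_pairs(N_PROTEINS)}")
print(f"novel Y2H interactions:  {novel}")
print(f"published PPIs missed:   {missed}")
print(f"fraction recovered:      {float(recovered):.1%}")
```

Under either counting convention the search space runs to a few thousand pairs. The 17 published interactions that the screen did not recover correspond to the apparent false negatives mentioned above.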


1. P22 life cycle

The early operons of P22 are organizationally similar to those of λ and carry parallel prophage repressor, Cro, recombination, DNA replication, and transcription anti-termination functions (Susskind and Botstein, 1978). The P22 repressor has a different DNA operator target specificity from that of the λ repressor, and P22 encodes nonhomologous types of recombination and replication proteins, but its late operon anti-termination protein is 97% identical to that of λ and has the same target specificity. Unlike λ, in addition to its prophage repressor (C2) and lysogeny control proteins (C1, C3, Cro), P22 carries an “immunity I” region that encodes an anti-repressor protein, Ant, that binds to the prophage repressor and blocks its ability to bind its operator. Two additional repressors, Mnt and Arc, regulate Ant synthesis (Susskind and Botstein, 1975; Vershon et al., 1987a,b). Also, instead of a protein that recruits the host DnaB protein to the replication origin, P22 encodes its own DnaB homologue (Backhaus and Petri, 1984; Wickner, 1984). The repressed P22 prophage expresses several “lysogenic conversion” genes, including three that encode the Gtr (glucose transfer) proteins, which modify the host O-antigen polysaccharide by glucosylation (Broadbent et al., 2010; Makela, 1973; Vander Byl and Kropinski, 2000). Finally, P22 has been studied as a prototypical headful DNA packaging phage, a strategy that differs from those used by the other phages discussed here: whereas these other phages package DNA between two specific nucleotide sequences in the DNA substrate, P22 packages DNA until the head is full, and the nucleotide sequence at the termination point does not matter (Casjens and Hayden, 1988; Jackson et al., 1978; Tye et al., 1974).

2. P22 virion assembly

The P22 virion is assembled by proteins encoded by 14 genes in its late operon (Fig. 7) (Botstein et al., 1973; Eppler et al., 1991; King et al., 1973; Poteete and King, 1977; Youderian and Susskind, 1980). Some P22-like phages, for example, phages ε34 and L, have a 15th virion assembly gene, dec, at the promoter proximal end of the cluster, whose encoded protein “decorates” the outside of the head and stabilizes it (Gilcrease et al., 2005; Tang et al., 2006; Villafane et al., 2008). The P22 late operon encodes lysis proteins very similar to those of λ; however, the virion assembly genes are not closely related to λ or to any other tailed phage type. P22 virion assembly is best known for the fact that its head-assembling scaffolding protein was the first to be discovered and studied (Cortines et al., 2011; King and Casjens, 1974; Prevelige et al., 1988; Weigele et al., 2005). The P22 procapsid assembles as a complex of 415 coat protein molecules, 12 portal ring subunits, and about 250 scaffolding protein molecules. At about the time when DNA is packaged, all 250 scaffolding proteins are released intact from the procapsid and are reused with newly synthesized coat protein in subsequent rounds of procapsid assembly. Thus, the P22 scaffolding protein acts catalytically in head assembly and individual molecules can participate in at least five rounds of procapsid assembly. A particularly high-resolution asymmetric cryo-EM reconstruction of the virion has been determined (Lander et al., 2006; Tang et al., 2011), and because assembly of the four proteins of its short tail has also been quite well studied [X-ray structures are known for three tail proteins and the portal protein (Olia et al., 2006, 2007; Steinbacher et al., 1997)], the P22 virion is perhaps the best understood of all tailed phage virions. 
The details of P22 virion assembly have been reviewed in light of its evolution and genome mosaicism (Casjens and Thuman-Commike, 2011), and the authors refer the reader to this review for further details. The published PPIs of phage P22 are summarized in Table V. Interactions between P22 and its host are much less well studied than those of λ and are mostly understood by homology arguments in the case of proteins that are similar in the two phages. Known P22–host interactions are listed in Table VI.
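
The figure of 415 coat protein molecules follows from the geometry of a T=7 icosahedral shell in which one vertex is occupied by the portal. The sketch below works through that count. It assumes the standard Caspar–Klug relations (T = h² + hk + k², 60T subunit positions, 12 pentameric vertices) and a single portal vertex replacing one coat pentamer; the function names are illustrative.

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number T = h^2 + hk + k^2."""
    return h * h + h * k + k * k


def coat_subunits(t, portal_vertices=1):
    """Coat protein copies in a shell with some vertices taken by a portal.

    A complete icosahedral shell has 60*T subunit positions, arranged as
    12 pentamers and 10*(T-1) hexamers. Each portal vertex is assumed to
    replace one coat protein pentamer (5 subunits).
    """
    return 60 * t - 5 * portal_vertices


T = triangulation_number(2, 1)           # the (h, k) = (2, 1) lattice gives T = 7
hexamers = 10 * (T - 1)                  # 60 hexamers
pentamers = 12 - 1                       # 11 pentamers plus one portal vertex
by_capsomer = 6 * hexamers + 5 * pentamers

print(T, coat_subunits(T), by_capsomer)  # 7 415 415
```

Both routes, total positions minus one pentamer and an explicit sum over capsomers, give the 415 copies quoted for the P22 procapsid.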

FIGURE 7

Genome and virion of phage P22. (A) Genome of phage P22 with a scale in kbp below. Green open reading frames (ORFs; rectangles) are transcribed left to right; red ORFs are transcribed right to left. Known RNA polymerase promoters and transcripts are shown as black flags and arrows, respectively. (B) The asymmetric (not icosahedrally averaged) P22 virion three-dimensional cryo-EM reconstruction from Tang et al. (2011). Numbers in parentheses indicate molecules/virion, and bold numbers indicate proteins for which X-ray structures are known. (C). Negatively stained electron micrograph of P22 virions.

TABLE V

P22 intraphage protein interactions

P22 1	P22 2	Description	References
gp3	gp3	Small terminase subunit; homononamer in solution	Nemecek et al., 2007, 2008
gp3	gp2	Purify as complex; bind each other in vitro	Nemecek et al., 2007; Poteete and Botstein, 1979
gp2	gp1	gp2 is large terminase subunit; assembles in packaging motor to gp1 by analogy with other phages (gp2 by itself is monomer in solution); evolutionary argument for P22 gp2–gp1 interaction	Casjens, 2011
gp1	gp1	Portal protein; homododecamer; EM of 12-mer; EM of virion; X-ray structure	Bazinet et al., 1988; Lander et al., 2006; Olia et al., 2011
gp1	gp5	Purify as complex (procapsids); three-dimensional (3D) reconstructions of virion and procapsid	Botstein et al., 1973; Chen et al., 2011; Lander et al., 2006
gp1	gp8	gp8 required for assembly of gp1 into procapsid; interaction seen in procapsid 3D reconstruction	Chen et al., 1989; Greene and King, 1996; Weigele et al., 2005
gp8	gp8	Scaffolding protein; forms homodimers and tetramers in solution	Parker et al., 1997
gp8	gp5	Purify as complex (procapsid); bind in vitro	Fuller and King, 1982; Greene and King, 1994; Parker et al., 1998
gp5	gp5	Coat (major capsid) protein; forms T=7 l icosahedral shell of virion	Botstein et al., 1973; Casjens, 1979; Lander et al., 2006
gp5	gp1	Purify as complex (procapsids); interaction seen in 3D reconstruction	Chen et al., 2011; Lander et al., 2006
gp4	gp4	Tail protein; monomer in solution (naturally unfolded?); gp4s barely touching each other in dodecamer ring in virion (i.e., contacts not large); EM, X-ray structure	Olia et al., 2011; Tang et al., 2011
gp4	gp5	Contact in high-resolution 3D reconstruction of virion	Olia et al., 2006, 2011; Tang et al., 2011
gp4	gp1	Order of assembly, EM, 3D reconstruction; X-ray cocrystal structure	Olia et al., 2006, 2011; Tang et al., 2011
gp4	gp10	12 gp4s contact 6 gp10s; order of assembly, EM, 3D reconstruction	Lander et al., 2006; Olia et al., 2007
gp4	gp9	gp4 ring contacts 6 gp9 trimers; 3D reconstruction	Lander et al., 2006
gp10	gp10	Tail protein; probably hexamer in solution; 3D reconstruction	Lander et al., 2006; Olia et al., 2007
gp10	gp9	6 gp10s contact 6 gp9 trimers; 3D reconstruction	Lander et al., 2006
gp10	gp26	6 gp10s contact one gp26 trimer	Lander et al., 2006
gp26	gp26	Tail needle protein; trimer in solution - crystal structure; one trimer in virion	Olia et al., 2007
gp16	gp8	Mutants of gp8 do not incorporate gp16 into virion	Chen et al., 1989; Greene and King, 1996; Weigele et al., 2005
C1	C1	C1 is homologue of λ CII; homotetramer	Ho et al., 1992
Mnt	Mnt	Repressor of antirepressor gene; tetramer	Nooren et al., 1999
Arc	Arc	Repressor of antirepressor gene; dimer in solution, tetramer on DNA	Bowie and Sauer, 1989; Breg et al., 1990; Brown, Bowie and Sauer, 1990
Ant	C2	Antirepressor; binds to and inhibits C2 repressor	Susskind and Botstein, 1975
gp9	gp9	Tailspike protein; trimer in solution, 6 trimers in virion; crystal structure; binds to polysaccharide on cell surface	Lander et al., 2006; Steinbacher et al., 1994, 1997
Int	Xis	Form heteromultimer	Cho et al., 2002
Xis	Xis	Excisionase; homomultimers	Mattis et al., 2008
Erf	Erf	Recombination protein; homomultimeric ring of 10–14 subunits	Poteete et al., 1983
C3	C3	C3 is homologue of λ CIII; λ CIII is dimer by disulfide bond; P22 interaction by homology with λ CIII	Halder et al., 2007
SieB	Esc	Similar to λ but not studied in detail	Ranade and Poteete, 1993a
C2	C2	Prophage repressor homologue of λ CI; homodimer	De Anda et al., 1983
Cro	Cro	Dimer; solution studies and crystal structure	Darling et al., 2000; Newlove et al., 2004
gp18	gp12	Origin binding protein gp18 recruits DnaB-like gp12 to replication origin by analogy with λ (note that gp12 is not a homologue of λ P protein; gp18 is homologous to λ O protein only in N-terminal domain)	Backhaus and Petri, 1984
Rz	Rz1	Rz and Rz1 are supposed to bind to one another and form a bridge that spans the periplasm in phage λ; P22 gp15 and Rz1 should do the same by homology.	Summer et al., 2007
gp13	gp13	Holin forms rings of large diameter in λ; by homology same should be true of P22 holin	Savva et al., 2008
gp13	gp13′	Antiholin–holin interactions delay the formation of holin rings in λ; by homology same should be true in P22	Rennell and Poteete, 1985
Dec	Dec	Trimer; note that P22 has no Dec, but its very close relative phage L does, and L Dec binds to P22 capsids.	Tang et al., 2006
Dec	gp5	Dec occupies a subset of the quasi-equivalent local threefold positions on the shell	Tang et al., 2006

TABLE VI

P22–host protein interactions

P22	Host	Description	References
gp5	GroEL	Solution studies	de Beus et al., 2000
gp24	NusA? etc?	Homologue/analogue of λ gpN; likely binds same host factors as λ gpN	Franklin, 1985; Hendrix and Casjens, 2006
C3	FtsH(HflB)	C3 is homologue of λ CIII; inhibition of FtsH by homology with λ CIII	Semerjian et al., 1989
C1	FtsH(HflB)	C1 is homologue of λ CII; CII is degraded by FtsH	Backhaus and Petri, 1984
C2	RecA	P22 repressor autocleavage stimulated by RecA binding like λ	Prell and Harvey, 1983; Tokuno and Gough, 1976
Xis, gp18, gp24, C1	Host protein degradation machinery	These may be degraded analogously to their λ homologues, but direct information is not available	Backhaus and Petri, 1984
gp23	RNA polymerase?	By homology to λ; gp23 is nearly identical to λ gpQ	Pedulla et al., 2003
gp12	Host replication machinery	Possible interaction with host DnaG primase	Wickner, 1984
Abc2	RecC	Solution studies	Murphy, 2000

3. Comparison of P22 and λ interactomes

As noted earlier, comparison of P22 and λ is a particularly informative way to understand the kinds of relationships among the proteins of different tailed phages. The “lambdoid” phages all encode similar functions and have similar gene organization, but these “similar functions” can be of several different types. Because these phages undergo a significant amount of horizontal transfer of genetic information (especially if they are “closely” related), their functions can be homologous or even nearly identical (Casjens, 2005; Hendrix, 2002). For example, the two “gpQ” late operon control anti-termination proteins of the two phages are nearly identical and can replace one another in function. A second kind of relationship is represented by the prophage repressors. These proteins from the two phages have weak but recognizable homology and similar folds, and probably interact with host proteins the same way, but have different operator-binding specificity. Thus, one repressor cannot functionally replace the other without additional changes in the DNA-binding sites of the protein. A third kind of relationship is that of the virion coat, terminase, and portal proteins, λ gpE, gpA, and gpB and P22 gp5, gp2, and gp1, respectively. These proteins have the same functions in the two phages—coat protein and DNA packaging nuclease/motor ATPase and procapsid DNA entry channel, respectively (summarized in Hendrix and Casjens, 2006). These cognate proteins are not recognizably similar in overall amino acid sequence between the two phages, although the two ATPase proteins have similar ATP-binding motifs. However, the two proteins almost certainly have the same basic fold. Thus, gpE and gp5 and gpA and gp2, as well as gpB and gp1, are all very likely ancient homologous pairs that have diverged beyond the point of having recognizable sequence similarity. Finally, similar functions are performed by nonhomologues in the two phages. 
Examples of this relationship are (i) the nonhomologous Exo (λ) and Erf (P22) proteins, which catalyze homologous recombination in completely different ways (Little, 1967; Poteete and Fenton, 1983); (ii) the endolysin proteins gpR and gp19, which cleave the same peptidoglycan bond although the λ protein is a transglycosylase and the P22 protein is a true lysozyme (Bienkowska-Szewczyk et al., 1981; Rennell and Poteete, 1989); (iii) the gpW and gp4 proteins, which bind between the portal ring of the head and the tail and yet have no recognizable homology or structural similarity; and (iv) the λ receptor-binding “fiber” (gpJ and Stf) and P22 tailspike (gp9), which have no homology, recognize different structures, and act in different ways [the P22 gp9 enzyme cleaves its polysaccharide receptor (Iwashita and Kanegasaki, 1976)]. Thus, there are potentially a number of different ways in which parallel λ and P22 functions might interact with the hosts of these two phages, and it will be interesting to understand these in detail in both cases. But are there very similar phage proteins that have different host interactions? Do very different phage proteins that have the same function have equally different host interactions? Although no such situations have been identified to date, they remain possible. A comparison of PPIs in λ and P22 is given in Tables VII and VIII.
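The four kinds of relationship described above can be summarized as a small lookup table. The following is a minimal sketch for illustration only: the category names, the `RELATIONSHIPS` mapping, and the helper `pairs_of_kind` are labels introduced here, not terms from the literature, and the pairs are limited to examples named in the text (gp23 is the P22 counterpart of λ gpQ listed in Table VI). The receptor-binding proteins (λ gpJ and Stf versus P22 gp9) are left out because they do not form a simple one-to-one pair.

```python
from enum import Enum


class Relationship(Enum):
    """Illustrative labels for the kinds of λ/P22 relationship described in the text."""
    NEARLY_IDENTICAL = "nearly identical; can replace one another in function"
    DIVERGED_HOMOLOGUE = "recognizable homology and similar fold, different specificity"
    ANCIENT_HOMOLOGUE = "no recognizable sequence similarity, very likely the same fold"
    ANALOGUE = "nonhomologous proteins performing a similar function"


# (λ protein, P22 protein) -> relationship, using only pairs named in the text.
RELATIONSHIPS = {
    ("gpQ", "gp23"): Relationship.NEARLY_IDENTICAL,   # late operon anti-terminators
    ("CI", "C2"): Relationship.DIVERGED_HOMOLOGUE,    # prophage repressors
    ("gpE", "gp5"): Relationship.ANCIENT_HOMOLOGUE,   # coat proteins
    ("gpA", "gp2"): Relationship.ANCIENT_HOMOLOGUE,   # large terminase subunits
    ("gpB", "gp1"): Relationship.ANCIENT_HOMOLOGUE,   # portal proteins
    ("Exo", "Erf"): Relationship.ANALOGUE,            # recombination proteins
    ("gpR", "gp19"): Relationship.ANALOGUE,           # endolysins
    ("gpW", "gp4"): Relationship.ANALOGUE,            # first proteins below the portal
}


def pairs_of_kind(kind):
    """Return the (λ, P22) pairs assigned to one relationship category."""
    return [pair for pair, rel in RELATIONSHIPS.items() if rel is kind]


if __name__ == "__main__":
    for kind in Relationship:
        print(f"{kind.name}: {pairs_of_kind(kind)}")
```

Grouping the pairs this way makes the open questions in the text concrete: one can ask, category by category, whether the known host interactions of each λ protein are shared by its P22 partner.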

TABLE VII

Comparison of λ and P22 interactions

λ 1	λ 2	P22 1	P22 2	Description
gpA	gpB	gp2	gp1	Likely similar contacts—by evolutionary argument^a
gpA	gpNu1	gp2	gp3	Likely similar contacts—by evolutionary argument ^a
gpNu1	gpNu1	gp3	gp3	gpNu1 N-terminal fragment is dimer; gp3 is nonamer; not known if they have the same fold (i.e., are ancient homologues)
gpW	gpB	gp4	gp1	gpW and gp4 have no known homology, but are analogous in that they are the first proteins to bind the underside of the portal ring; gp4 has no structural homology to λ gpFII or gpW (Cardarelli et al., 2010; Olia et al., 2011)
gpW	gpFII			No clear gpFII homologue in P22 (gp4 and gp10 of P22 have no sequence similarity)
gpB	gpB	gp1	gp1	Both form dodecamer portal rings^a
gpC	gpE			No gpC analogue in P22
gpC′	gpB			No gpC analogue in P22
gpC	gpNu3			No gpC analogue in P22
gpNu3	gpNu3	gp8	gp8	Monomer/dimer/tetramer for gp8; monomer/dimer for gpNu3
gpB	gpE	gp1	gp5	Seen in reconstructions^a
gpNu3	gpB	gp8	gp1	Scaffolds required for portal assembly into procapsid
		gp8	gp5	Scaffold–coat interaction not studied for λ
gpD	gpD	Dec	Dec	Both trimers; bind slightly different sites (note that P22 has no Dec, but its very close relative phage L does, and L Dec binds to P22 capsids)
gpE	gpE	gp5	gp5	Both make icosahedral T=7 head shells^a
gpU	gpU			No analogue in P22
gpD	gpE	Dec	gp5	Dec is more discriminatory than D, i.e., unlike D, Dec does not occupy all of the quasi-equivalent local threefold positions on the shell
		gp4	gp5	Unknown in λ
		gp4	gp10	No gp10 analogue in λ (but see gpFII above)
		gp4	gp9	No gp9 analogue in λ (but see Stf below?)
		gp10	gp9	No gp9 analogue in λ (but see Stf below?)
		gp10	gp10	No gp10 analogue in λ (but see gpFII above)
		gp10	gp26	No gp10 or gp26 (?) analogue in λ
		gp26	gp26	No gp26 analogue in λ (maybe gpJ?)
Stf	Stf	gp9	gp9	Both trimers (Stf by homology to T4 fibers); analogues, not homologues
gpU	gpV			No analogues in P22
gpV	gpV			No analogue in P22
gpO	gpP	gp18	gp12	gpO and gp18 are homologous but gpP and gp12 are not; nonetheless, a good argument can be made that gp18 and gp12 must interact similarly to gpO and gpP to get replication started
gpW	gpW	gp4	gp4	Analogues—first proteins to bind below the portal ring (see gpW–gpB interaction above)
gpFI	gpA			No gpFI analogue in P22
gpFI	gpE			No gpFI analogue in P22
gpH	gpG/gpGT			No analogues in P22
gpV	gpGT			No analogues in P22
gpH	gpV			No analogues in P22
		Arc	Arc	No analogue in λ
		Ant	C2	No Ant in λ
		Mnt	Mnt	No analogue in λ
CI	CI	C2	C2	Homologues (Sevilla-Sierra et al., 1994)
Exo	Bet	Erf	Erf	λ Exo and P22 Erf proteins are not homologues and function in basically different ways, but they are both essential for homologous recombination
SieB	Esc	SieB	Esc	Good homologues in P22 but they are unstudied
CII	CII	C1	C1	Homotetramers in both cases
Xis	Int	Xis	Int	Xis proteins are quite different
Xis?	Xis?	Xis	Xis	Monomer in λ, but binds cooperatively so could interact on DNA (Abbani et al., 2007); multimer in P22
CIII	CIII	C3	C3	C3 is homologue of λ CIII; λ CIII is dimer (Halder et al., 2007); P22 interaction by homology with λ CIII
Cro	Cro	Cro	Cro	Both homodimers
gpRz	gpRz1	gp15	gp15	Predicted for P22 by homology

^a Not recognizably similar in amino acid sequence but very likely ancient homologues.

TABLE VIII

Comparison of phage λ–host and P22–host interactions

λ	Host	P22	Host	Notes
CII	HflB	C1	HflB	Should be true for P22 by homology
CIII	HflB	C3	HflB	Should be true for P22 by homology
gpJ	LamB			P22 tails recognize polysaccharide so no analogue in P22
Stf	OmpC			P22 tails recognize polysaccharide so no analogue in P22
Xis	Lon	Xis	Lon?	Might be true for P22 by homology
Xis	FtsH	Xis	FtsH?	Might be true for P22 by homology
gpN	RNAP	gp24	RNAP?	Likely true for P22 by similarity of function
Int	IHF			P22 does not interact with IHF (Cho et al., 1999)
Xis	Fis			Unknown for P22
gpQ	σ⁷⁰	gp23	σ⁷⁰	Likely true for P22 by similarity of function
P	DnaB			No analogue of P in P22
		gp12	DnaG?	No analogue of 12 in λ
gpE	GroE	gp5	GroE
gpB	GroE			Not known for P22
gpN	Lon	gp24	Lon	Not known for P22
CI	RecA	C2	RecA	Might be true for P22 by homology and induction by ultraviolet light
CII	ClpYQ	C1	ClpYQ	Might be true for P22 by homology
CII	ClpAP	C1	ClpAP	Might be true for P22 by homology
gpO	ClpXP	gp18	ClpXP	Might be true for P22 by homology
gpN	NusA	gp24	NusA	Likely true by similarity of function
CII	HflD	C1	HflD	Might be true for P22 by homology
Gam	RecB			No Gam in P22
		Abc2	RecC	No Abc2 in λ

Protein	Function	Interactions	Notes	References
P2–P2 proteins
gpC	Immunity repressor	Dimer	Y2H; 3D structure	Eriksson et al., 2000; Massad et al., 2010
Cox	Transcriptional repressor, excisionase	Tetramer that can self-associate to octamers	In vitro; cross-linking and gel filtration	Eriksson and Haggard-Ljungquist, 2000
Int	Integrase	Dimer	In vitro; cross-linking; Y2H	Frumerie et al., 2005
gpF_I	Tail sheath	Polymer, six molecules per sheath striation	Electron microscopy	Lengyel et al., 1974
gpF_II	Tail tube	Polymer, six molecules per tube striation	Electron microscopy	Lengyel et al., 1974
gpM	Small terminase subunit	Polymer, 8- to 10-fold ring	Biochemistry	Dokland, unpublished data
gpM	Small terminase subunit	gpP	Biochemistry	Bowden and Modrich, 1985
gpN	Major capsid protein	T=7 icosahedron (415 copies)	Cryo-EM and 3D reconstruction	Dokland et al., 1992
		gpO	Biochemistry	Chang et al., 2008, 2009
gpO	Internal scaffolding protein	Dimer	Biochemistry, chromatography	Chang et al., 2008, 2009
		gpN
gpP	Large terminase subunit	gpP	Biochemistry	Bowden and Modrich, 1985
gpQ	Connector, portal protein	Dodecameric ring	EM, crystallography	Doan and Dokland, 2007; Rishovd et al., 1994
gpT	Tape measure	gpF_II (gpF_I ?)	Inside tail tube
gpV	Tailspike	Trimer	Sedimentation velocity and molecular mass determination	Haggard-Ljungquist et al., 1995; Kageyama et al., 2009
gpV	Tailspike	gpW, base plate component. Belongs to the T4 gene 25-like superfamily	Copurification	Haggard-Ljungquist et al., 1995
P2–other phage proteins
Tin	Block superinfection of T-even phages	T4 SSB protein (gp32)	Genetic evidence	Mosig et al., 1997
gpC	Immunity repressor	P4 anti-repressor ε	Genetic evidence. Y2H	Geisselsoder et al., 1981; Liu et al., 1998; Eriksson et al., 2000
gpN	Major capsid protein	P4 Sid; size determinator, external scaffold	Cryo-EM and 3D reconstruction	Marvik et al., 1995; Wang et al., 2000; Dokland et al., 2002
gpN	Major capsid protein	P4 Psu; polarity suppressor; decoration protein	Cryo-EM and 3D reconstruction	Dokland et al., 1993
P2–host proteins
gpB	Helicase loader; required for lagging strand synthesis	E. coli helicase DnaB	Genetic evidence; electron microscopy; in vitro binding	Sunshine et al., 1975; Funnell and Inman, 1983; Odegrip et al., 2000
Ogr	Transcriptional activator of late operons	E. coli RNA polymerase α C-terminal domain	Genetic evidence	Sunshine and Sauer, 1975; Fujiki et al., 1976; King et al., 1992; Ayers et al., 1994; Wood et al., 1997
P4–host proteins
Psu	Polarity suppressor; anti-terminator	Rho	Genetics; chemical cross-linking	Sauer et al., 1981; Pani et al., 2009
Delta	Transcriptional activator, contains two Ogr domains	E. coli RNA polymerase	Genetics; homology to Ogr	Julien et al., 1997

1. Gene control circuits

The decision whether to grow lytically or form a lysogen after P2 infection is dependent on the levels of two repressors: the immunity repressor gpC, which blocks lytic growth, and Cox, which blocks the expression of the genes required for lysogenization. GpC acts as a symmetric dimer, and its three-dimensional structure has been determined (Eriksson et al., 2000; Massad et al., 2010). Cox is multifunctional and acts as a repressor, as well as an excisionase, thereby coupling the transcriptional switch that controls lytic growth versus formation of lysogeny with the integration/excision of the phage genome in or out of the host chromosome. Cox is a tetramer in solution that self-associates to octamers in the absence of DNA (Eriksson and Haggard-Ljungquist, 2000). The transcriptional activator Ogr is required to turn on late gene transcription. Ogr is a Zn finger-containing protein that interacts with the α subunit of the E. coli RNA polymerase, recruiting it to the late promoters (Ayers et al., 1994; Fujiki et al., 1976; King et al., 1992; Sunshine and Sauer, 1975; Wood et al., 1997).

Some P2 prophage genes are not blocked by the action of gpC and instead are expressed from the prophage, providing the host cell with new properties. Among these “lysogenic conversion” genes, P2 has three genes, old, fun, and tin, that block superinfection by unrelated phages, but only gpTin is known to have a PPI. P2 gpTin excludes phage T4 by binding to and poisoning T4 gp32 (SSB), thereby inhibiting DNA replication (Mosig et al., 1997).

Satellite phage P4 has to get access to the morphogenetic genes of its helper P2 to grow lytically. This is accomplished by P4-encoded functions during coinfection with P2. The P4 antirepressor Epsilon derepresses the P2 prophage by interacting with gpC, leading to the expression of P2 lytic genes (Eriksson et al., 2000; Geisselsoder et al., 1981; Liu et al., 1998). P4 also has the capacity to trans-activate P2 late genes in the absence of Ogr using the transcriptional activator Delta. Delta contains two Ogr-like domains and recognizes the same DNA sequence as Ogr. Like Ogr, Delta interacts with the α subunit of E. coli RNA polymerase (Christie et al., 2003; Julien et al., 1997; Souza et al., 1977).

2. P2 DNA replication

After infection, the linear DNA molecule is circularized by ligation of the cohesive (cos) ends by the host DNA ligase, and if P2 enters the lytic pathway, DNA replication is initiated by the cis-acting gpA that creates a single-stranded cut at the origin. GpA remains linked covalently to the 5′ end of the cleaved strand (Liu and Haggard-Ljungquist, 1994). Replication occurs via a modified rolling-circle mechanism that generates a double-stranded monomeric circle that is the packaging substrate (Pruss et al., 1975). The cleavage/joining reactions are promoted by two tyrosine residues interspaced by three amino acids in gpA (Odegrip and Haggard-Ljungquist, 2001). The host Rep helicase is required for all DNA replication (Calendar et al., 1970), while the phage-encoded gpB is a helicase loader that interacts with the E. coli DnaB helicase required for lagging strand synthesis (Funnell and Inman, 1983; Odegrip et al., 2000).

3. Head structure and assembly

The main structural proteins involved in P2 capsid assembly are gpN, capsid protein; gpO, internal scaffolding protein; and gpQ, connector or portal protein (Chang et al., 2008; Lengyel et al., 1973; Linderoth et al., 1991). P2 capsids are assembled as empty precursor procapsids consisting of 415 copies of the gpN capsid protein arranged with T=7 icosahedral symmetry and clustered in hexamers and pentamers that make trivalent interactions (Dokland et al., 1992) (Fig. 9). GpQ forms a dodecameric ring that is incorporated into the capsid at one fivefold vertex and through which the DNA is subsequently packaged (Doan and Dokland, 2007; Rishovd et al., 1994).

FIGURE 9

P2 assembly. Schematic diagram of the P2/P4 capsid assembly pathway. (A) Capsid protein gpN, scaffolding protein gpO, and connector (portal) protein gpQ are assembled into the T=7 P2 procapsid. (B) GpN, gpO, and gpQ are then processed to their mature forms, N*, O*, and Q*. DNA is packaged into the procapsid by the P2 terminase complex (gpM and gpP), accompanied by expansion into the mature, angular capsid (B). (C) In the presence of the P4-encoded external scaffolding protein Sid, P2 structural proteins are assembled into a small, T=4 P4 procapsid. Loss of Sid (D), protein processing, and DNA packaging lead to the mature, expanded P4 capsid (E). The decoration protein Psu is added to the mature capsid (F).

GpO plays a dual role in the assembly process as both a protease and a scaffolding protein (Chang et al., 2009; Dokland, 2011). The full-length, 284 residue gpO protein is associated with procapsids at early times (Marvik et al., 1994), but is cleaved autoproteolytically between residues 141 and 142, leaving an N-terminal fragment, gpO* (formerly known as h7), that remains inside the mature capsid (Chang et al., 2008; Lengyel et al., 1973). Concomitantly, gpN and gpQ are cleaved—presumably by gpO—to their mature forms, gpN* and gpQ*, by removal of 31 and 26 residues, respectively, from the N terminus (Rishovd and Lindqvist, 1992; Rishovd et al., 1994). The protease activity of gpO resides in its N-terminal domain, which binds tightly to gpN and constitutes a serine protease similar to that of the herpesviruses (Chang et al., 2009; Cheng et al., 2004; Dokland, 2011). The C-terminal 90 amino acids comprise a scaffolding domain, which is necessary and sufficient for promoting capsid assembly (Chang et al., 2008, 2009; Wang et al., 2006). It too presumably binds to gpN during assembly, but any interaction is transient (Chang et al., 2009). Both N- and C-terminal domains form dimers in solution (Chang et al., 2009).

Circular P2 genomic DNA substrates are packaged into empty procapsids by the terminase complex, which consists of the small and large terminase proteins (gpM and gpP, respectively). Terminase recognizes and cleaves P2 DNA at the cos sites (Bowden and Modrich, 1985; Ziermann and Calendar, 1990). DNA packaging is associated with major structural transitions that lead to a larger, more angular and thin-shelled, mature capsid (Chang et al., 2008; Dokland et al., 1992). A head completion protein, gpL, is added (Chang et al., 2008), and tails, which are assembled via a separate pathway, are attached to the unique gpQ-containing portal vertex (Lengyel et al., 1974).

4. Tail structure

The 135-nm-long tail is composed of an inner tube and a contractile sheath (Lengyel et al., 1974). The head-proximal end of the free tail has two disc-like structures of different thickness and diameter. In the complete phage, only the thinner disc is seen outside the head. The distal end of the tail contains a baseplate with a spike and six fibers. Sixteen essential tail genes have been identified located in three transcription units. GpFI and gpFII constitute the tail sheath and tail tube, respectively, and there are about 200 copies of each protein per tail (Temple et al., 1991). GpR and gpS are involved in tail completion, as amber mutations in the respective genes produce giant tails and giant tail sheaths (Linderoth et al., 1994). GpV is the tail spike, which forms a trimer in solution (Haggard-Ljungquist et al., 1995; Kageyama et al., 2009). GpW copurifies with gpV and is thus thought to be part of the baseplate, while gpJ localizes to the edge of the baseplate or proximal part of the tail fiber by immunoelectron microscopy (Haggard-Ljungquist et al., 1995). GpH is believed to be the distal part of the tail fiber because it contains regions similar to tail fiber proteins of phages Mu, P1, λ, K3, and T2 (Haggard-Ljungquist et al., 1992). A region of gene H is highly similar to the variable part of the S gene (Sv) of phage Mu and the variable part of gene 19 (19v) of phage P1, in both cases in the (+) orientation of the invertible segments. P2 does not have an invertible segment, but it contains a sequence identical to the left repeat of the Mu invertible G segment located within the H gene and upstream of the region with a high similarity to Mu Sv and P1 19v, indicating that the P2 H gene has been frozen in the (+) orientation, probably by a deletion covering the right inverted repeat and the invertase. This region is therefore inferred to constitute the receptor-binding tip of the tail fiber. 
By its similarities to gpU and gpU′ of phages Mu and P1, and gp38 of phages T4, Tula, and Tulb, gpG is probably required for fiber assembly. GpT is likely the tape measure, as it is the only protein long enough to span the length of the tail shaft (Christie et al., 2002). Function of the products of genes X, I, E, E′ (programmed -1 frameshift extension of E), U, and D remains unknown.

5. P4-induced redirection of P2 capsid assembly

Satellite phage P4 cannot form phage particles by itself, but in the presence of phage P2, P4 genomes are packaged into phage particles using structural proteins supplied by P2 (Six, 1975). However, the P4 capsid is smaller than that of P2 and contains only 235 copies of gpN, arranged on a T=4 icosahedral lattice (Dokland et al., 1992) (Fig. 9). This size control is effected by the P4 gene product Sid (size determination) (Barrett et al., 1976), which forms a dodecahedral cage-like scaffold around the outside of P4 procapsids (Marvik et al., 1995). Sid self-associates into trimers and dimers that interact with gpN only in two hexamer subunits near the twofold symmetry axes (Wang et al., 2000; Dokland et al., 2002). Gene N mutants, called sir (sid responsiveness), render the capsid protein unable to form small capsids even in the presence of Sid (Six et al., 1991). Conversely, the so-called super-sid or nms (N mutation sensitive) mutations in sid act as second-site suppressors of the sir mutations and enable the formation of small capsids (Kim et al., 2001). Although well-formed small procapsids can be assembled from Sid and gpN alone (Dokland et al., 2002), gpO is required for the formation of viable P4 phage (Six, 1975). Presumably this is because its protease activity is essential, and gpO may also facilitate incorporation of the portal. Indeed, sid complements the P2 Oam279 mutant, which produces a 237 residue N-terminal fragment of gpO that includes the protease domain but lacks scaffolding activity (Agarwal et al., 1990). The P4 genome contains P2 cos sites and is packaged into small capsids using the P2 terminase (Ziermann and Calendar, 1990). The P4-encoded protein Psu is a suppressor of Rho-dependent transcription termination (Pani et al., 2009; Sauer et al., 1981); it also functions as a decoration protein that binds to gpN hexamers and stabilizes the P4 capsid (Dokland et al., 1993; Isaksen et al., 1993).
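The gpN copy numbers quoted for the two capsids (415 for P2, 235 for P4) follow from the triangulation number. A lattice with triangulation number T has 60T subunit positions, arranged as 12 pentamers and 10(T − 1) hexamers. The sketch below is a worked check rather than a published method: it assumes that the gpQ portal at the single portal vertex displaces exactly one gpN pentamer, and the function names are hypothetical.

```python
def capsid_protein_copies(t_number, portal_vertices=1):
    """Copies of major capsid protein on an icosahedral lattice of triangulation number T.

    A complete lattice has 60*T positions. Each portal-occupied fivefold
    vertex is assumed to displace one pentamer (5 copies) of capsid protein.
    """
    if t_number < 1:
        raise ValueError("triangulation number must be at least 1")
    if not 0 <= portal_vertices <= 12:
        raise ValueError("an icosahedron has only 12 fivefold vertices")
    return 60 * t_number - 5 * portal_vertices


def hexamers_and_pentamers(t_number, portal_vertices=1):
    """Number of capsid protein hexamers and pentamers under the same assumption."""
    hexamers = 10 * (t_number - 1)
    pentamers = 12 - portal_vertices
    return hexamers, pentamers


if __name__ == "__main__":
    for phage, t in (("P2", 7), ("P4", 4)):
        hexamers, pentamers = hexamers_and_pentamers(t)
        print(f"{phage} (T={t}): {capsid_protein_copies(t)} gpN copies "
              f"in {hexamers} hexamers and {pentamers} pentamers")
```

Under this assumption the T=7 P2 shell contains 60 hexamers and 11 pentamers (415 copies) and the T=4 P4 shell contains 30 hexamers and 11 pentamers (235 copies), matching the values given in the text. Sid therefore halves the number of hexamers while leaving the pentamers unchanged.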

1. Protein–protein interactions

A literature and database search on ϕ29-like viruses retrieved 30 PPIs (Tables X and XI) with evidence mainly from directed experiments and structure determination studies. Twenty-three are attributed to phage ϕ29 itself, five to GA-1, and two to Nf, emphasizing the role of ϕ29 as a model for the whole group. The biology of other ϕ29-like phages is poorly understood, and only a few proteins have been characterized experimentally. In some cases, ϕ29 PPIs are conserved among other ϕ29-like homologues, supporting a close relationship of these phages. However, there are also differences that reveal interesting insights into their evolution. Known ϕ29 protein–protein interactions include proteins involved in transcriptional regulation, DNA replication, and virion structure. Only three PPIs between ϕ29 and B. subtilis proteins have been identified so far.

TABLE X

Protein–protein interactions of phage ϕ29 including phage–host interactions^a

ϕ29 1	Function 1	ϕ29 2	Function 2	References
gp2	DNA polymerase	gp3	DNA terminal protein TP	Kamtekar et al., 2006; Blanco et al., 1987; Dufour et al., 2000; González-Huici et al., 2000
gp3	DNA terminal protein TP	gp1	Membrane-associated early protein	Bravo et al., 2000
gp3	DNA terminal protein TP	gp3	DNA terminal protein TP	Serna-Rico et al., 2003
gp3	DNA terminal protein	gp16	Packaging ATPase	Guo et al., 1987; Grimes and Anderson, 1997
gp4	Late gene activator	gp4	Late gene activator	Badia et al., 2006; Mencía et al., 1996b
gp4	Late gene activator	gp6	Histone-like, nonspecific DNA-binding protein	Calles et al., 2002; Camacho and Salas, 2001
gp6	Histone-like, nonspecific dsDNA-binding protein	gp6	Histone-like, nonspecific dsDNA-binding protein	Abril et al., 1999; Pastrana et al., 1985
gp7	Scaffolding protein	gp7	Scaffolding protein	Morais et al., 2003
gp7	Scaffolding protein	gp8	Major head protein	Lee and Guo, 1995
gp7	Scaffolding protein	gp10	Upper collar/connector protein	Guo et al., 1991; Lee and Guo, 1995
gp10	Upper collar/connector protein	gp10	Upper collar/connector protein	Guasch et al., 2002; Simpson et al., 2001
gp12	Preneck appendage protein (anti-receptor)	gp12	Preneck appendage protein (anti-receptor)	Xiang et al., 2009
gp13	Tail-associated, peptidoglycan-degrading enzyme	gp9	Tail knob protein	García et al., 1983
gp13	Tail-associated, peptidoglycan-degrading enzyme	gp13	Tail-associated, peptidoglycan-degrading enzyme	Xiang et al., 2008
gp16.7	Replication organizer; DNA-binding membrane protein p16.7	gp16.7	Replication organizer; DNA-binding membrane protein p16.7	Asensio et al., 2005; Albert et al., 2005; Crucitti et al., 1998; Serna-Rico et al., 2003; Meijer et al., 2001b
gp16.7	Replication organizer; DNA-binding membrane protein p16.7	gp3	DNA terminal protein TP	Serna-Rico et al., 2003
gp17	Early protein involved in early DNA replication	gp6	Histone-like, nonspecific dsDNA-binding protein	Crucitti et al., 2003
gp17	Early protein involved in early DNA replication	gp17	Early protein involved in early DNA replication	Crucitti et al., 2003
gp56 = gp0.8	Host uracil-DNA glycosylase inhibitor	gp56 = gp0.8	Host uracil-DNA glycosylase inhibitor	Serrano-Heras et al., 2007
ϕ29–Bacillus subtilis interactions
gp4	Late gene activator (1)	RpoA	RNA polymerase subunit α	Mencía et al., 1996a; Monsalve et al., 1996b
gp16.7	Replication organizer, DNA-binding membrane protein gp16.7 (2)	MreB	Actin-like cytoskeleton protein	Muñoz-Espín et al., 2009
gp56 = gp0.8	UDG inhibitor (3)	UDG	Uracil-DNA glycosylase	Serrano-Heras et al., 2006

^a Data from BOND, IntAct, and PDB (ϕ29, Taxon id 186846). All proteins are termed “gp” (gene product), although they are also termed “p” (protein), e.g., gp4 = p4. In the phage–host interaction list, the first protein is from ϕ29, the second from B. subtilis. (1) Host RNAP binds via α subunit to p4 (i) bound at the late promoter PA3, thus activating late transcription and (ii) bound to the promoter PA2c, thus blocking transcription through promoter clearance inhibition. (2) PPI plays a role in the organization of ϕ29 DNA replication as peripheral helix-like structures. (3) Inhibition of host UDG; avoids removal of uracil from ϕ29 DNA via UDG-base excision repair.

TABLE XI

Protein–protein interactions of ϕ29-related phages^a

Phage	Protein 1	Function	Protein 2	Function	References
GA-1	gp6	Histone-like, nonspecific dsDNA binding protein	gp6	Histone-like, nonspecific dsDNA-binding protein	Freire et al., 1996
Nf	gp6	dsDNA binding protein	gp6	dsDNA-binding protein	Freire et al., 1996
GA-1	gp2	DNA polymerase	gp3	DNA terminal protein	Illana et al., 1996; Longás et al., 2006
Nf	gp2	DNA polymerase	gp3	DNA terminal protein	Longás et al., 2006; González-Huici et al., 2000
GA-1	gp5	Single-stranded DNA-binding protein SSB (1)	gp5	Single-stranded DNA-binding protein SSB	Gascón et al., 2000
GA-1	gp12	Neck appendage protein (2)	gp12	Neck appendage protein	Schulz et al., 2010
GA-1	gp4_G	Transcriptional regulator (3)	gp4_G	Transcriptional regulator	Horcajadas et al., 1999

^a Data from BOND, IntAct, and PDB (searched for ϕ29, Taxon id 186846). (1) ϕ29 and Nf SSB were shown to be stable monomers in solution, whereas GA-1 SSB forms homohexamers. (2) Homotrimer; autocatalytic cleavage similar to ϕ29 gp12. PDB: 3GUD. (3) Homodimer; negative regulator of early transcription; unlike its ϕ29 homologue gp4, it was shown not to interact with the RNAP α subunit.

2. Protein-primed DNA replication

Phage ϕ29 DNA polymerase (DNAP, gp2) interacts with terminal protein (TP, gp3) to initiate genome replication. Interactions have also been shown between the corresponding GA-1 and Nf orthologs (Table XI). The structure of the ϕ29 TP–DNAP complex has been determined and suggests that the priming domain of TP occupies the dsDNA-binding tunnel, allowing the DNAP to bind negatively charged amino acid residues of TP (Kamtekar et al., 2006).

Phage ϕ29 replication depends on the presence of a single dAMP residue linked covalently via its 5′-phosphoryl group to the hydroxyl group of a Ser residue of TP. Formation of the phosphoester bond is catalyzed by the DNAP itself (Blanco and Salas, 1984; Hermoso et al., 1985); thus formation of a stable DNAP–TP heterodimeric protein complex is required (Blanco et al., 1987). The TP–dAMP is elongated by the same DNAP (Méndez et al., 1992). After the synthesis of six to nine nucleotides, DNAP dissociates from TP and starts continuous, linear synthesis (Méndez et al., 1997). This mechanism of protein-dependent DNA replication initiation was shown for Nf and GA-1 (Illana et al., 1996; Longás et al., 2008), S. pneumoniae phage Cp-1, E. coli phage PRD1, adenoviruses, and some plasmids and bacterial genomes (e.g., Streptomyces), although there may be differences in the various priming mechanisms (de Jong and van der Vliet, 1999; Martin et al., 1996; Salas, 1991).

The TP–DNAP interaction plays a central role in ϕ29 DNA replication initiation. However, other PPIs with TP are known, and thus protein-primed DNA replication may be more complicated than is currently known (Figs. 11C and 12). Chemical cross-linking shows that TP forms homodimers (Serna-Rico et al., 2003). Analysis of TP mutants suggests that the parental TP (TP bound to the parental genome) recruits the free TP–DNAP complex via a TP–TP interaction (Illana et al., 1999), but an additional interaction between parental TP and DNAP might be important for the specificity of replication initiation (González-Huici et al., 2000).

FIGURE 12

Interactions in ϕ29 genome replication and membrane localization. Proteins that contact each other have been shown to interact. For clarity, only one genome end is shown. Parental TP is highlighted by a black symbol, whereas primer TP is in white. The upper part of the figure includes PPIs that preferentially play a role in replication initiation. The lower part of the figure indicates elongated DNA with the associated ssDNA regions. Because p16.7 and p5 (SSB) both bind ssDNA, they could either redirect the parental ssDNA strand to the membrane via TP-p16.7 and p16.7–ssDNA interactions or stabilize it in the cytoplasm via SSB alone (Meijer et al., 2001b). After Bravo et al. (2000) and Serna-Rico et al. (2003).

TP interacts with the replication organizer gp16.7 (Serna-Rico et al., 2003) (Fig. 12). Gp16.7, which has the potential to bind to both ssDNA and dsDNA, is an integral membrane protein consisting of an N-terminal membrane anchor and a C-terminal cytoplasmic domain. It localizes ϕ29 DNA replication to the membrane (Meijer et al., 2000, 2001b; Serna-Rico et al., 2002, 2003). The C-terminal domain of gp16.7 dimerizes in solution but forms larger filaments in the presence of ssDNA (Asensio et al., 2005; Serna-Rico et al., 2003). Replication, which starts at both DNA ends, first produces a so-called type I replication intermediate consisting of dsDNA with ssDNA regions caused by displacement of the non-template DNA parental strand. Later, when the two replication forks have merged, the parental DNA strands become separated into two type II replication intermediates, still containing ssDNA regions in the parental template. A crystal structure of the gp16.7 DNA-binding C-terminal domain complexed with dsDNA reveals three gp16.7 dimers forming a deep, positively charged longitudinal cavity that interacts nonspecifically with the DNA phosphate backbone (Albert et al., 2005). Gp16.7 localizes in peripheral helix-like structures in a bacterial MreB cytoskeleton-dependent manner (Muñoz-Espín et al., 2010). In fact, gp16.7 interacts directly with MreB; this interaction is required for efficient ϕ29 DNA replication, allowing simultaneous replication of multiple templates at various peripheral sites.

Phage ϕ29 gp1 is another early protein that binds TP (Bravo et al., 2000). Gp1, like gp16.7, is membrane associated and self-associates into long filamentous structures (Bravo and Salas, 1997, 1998). Its membrane association and polymerization are mediated through a C-terminal hydrophobic sequence, whereas binding to TP involves an N-terminal region. In contrast to gp16.7, gp1 does not bind DNA (Fig. 12) and is thought to be another component that directs ϕ29 DNA replication to the membrane (Bravo et al., 2000). Whether gp16.7 and gp1 act together in this process is unknown.

The ϕ29 histone-like, nonspecific dsDNA-binding protein gp6 forms a nucleoprotein complex with ϕ29 DNA that covers most of the genome, implicating it in genome organization (Serrano et al., 1994). Gp6 either unwinds the genome termini or interacts with TP directly, thus making a TP complex competent to bind the genome ends (Blanco et al., 1986; Serrano et al., 1994). Gp6 forms long oligomeric protein filaments thought to function as a protein backbone for DNA binding (Abril et al., 1999). The gp6 orthologues of GA-1 and Nf also self-associate (Freire et al., 1996). Gp6 also binds gp17, an early protein that stimulates the first viral replication round in vivo when other replication proteins are present at low levels (Crucitti et al., 1998, 2003). In contrast to gp6, gp17 does not bind DNA, but its interaction with gp6 enhances gp6 binding to the replication origins, thereby stimulating early rounds of replication.

A homomeric interaction was demonstrated for the single-stranded DNA-binding protein of GA-1. GA-1 SSB was surprisingly found as a homohexamer in solution, whereas SSBs of Nf and ϕ29 (p5) appear to be stable monomers (Gascón et al., 2000). The homohexamer binds to ssDNA with higher affinity than the monomer and exhibits a higher stimulatory effect on DNA replication.

Bacteriophages have evolved different mechanisms to protect their genomes from degradation by host nucleases in the cell. One protective strategy is the incorporation of noncanonical nucleotides into their DNA. For instance, B. subtilis phage PBS2 uses uracil instead of thymine (Takahashi and Marmur, 1963). PBS2 achieves this by increasing the dUTP pool, relative to dTTP, after infection. In addition, PBS2 encodes Ugi, a protein that specifically inhibits the host uracil–DNA glycosylase (UDG) (Cone et al., 1980), which catalyzes the first step in removing incorporated uracil residues via the base excision repair pathway. Normally, this repair avoids mutational effects in the host genome, but it would impair the incorporation of uracil into the PBS2 genome. Cocrystallization of Ugi and UDG revealed that Ugi mimics uracil-containing duplex DNA, thereby blocking efficient DNA binding of UDG (Putnam et al., 1999). ϕ29 blocks the host UDG by an interaction with the early protein gp56 (gp0.8) that competes with the DNA-binding ability of UDG (Serrano-Heras et al., 2006, 2007, 2008). Although the ϕ29 genome does not contain uracil (Serrano-Heras et al., 2006), ϕ29 DNAP is able to incorporate dUMP with a catalytic efficiency only twofold lower than that for dTMP. Because ϕ29 DNAP does not discriminate stringently between dTMP and dUMP, inhibition of UDG by gp56 is thought to protect replicating ϕ29 DNA from the consequences of uracil excision.

3. Interactions during transcriptional regulation

Transcriptional regulation of the ϕ29 genome has been studied extensively (Meijer et al., 2001a; Rojo et al., 1998). The transition from early to late gene expression is controlled by gp4 (Fig. 11). Gp4 is a homodimer in solution (Mencía et al., 1996b); it forms a tetramer when bound to the intergenic region between gene 6 and gene 7, which contains the A2b, A2c, and A3 promoters (Fig. 11A). One binding site is upstream of A3, the late promoter that lacks a -35 box (Barthelemy and Salas, 1989). Once gp4 is bound, it recruits the host RNAP via a PPI, leading both to stabilization of RNAP on the promoter and to transcriptional activation (Nuez et al., 1992). Although gp4 activates late gene expression, it represses early gene expression from the A2b and A2c promoters (Monsalve et al., 1997; Rojo and Salas, 1991). Interestingly, repression of A2c and activation of A3 involve the interaction of the same gp4 residues with the C-terminal domain of the RNAP α subunit (Mencía et al., 1996a; Monsalve et al., 1996a,b). Gp6 also interacts with gp4 in regulating the transcription of promoters A2b, A2c, and A3. Gp6 binds cooperatively with gp4 to form a repression complex at promoters A2b and A2c while stimulating A3 (Calles et al., 2002; Elías-Arnanz and Salas, 1999).

4. Virion structure and morphogenesis

Cryo-EM analyses have revealed many details about the ϕ29 virion structure (Morais et al., 2003, 2005; Tao et al., 1998; Xiang et al., 2008, 2009), and many structures of virion proteins and factors are known (Table X). Assembly starts from a connector core particle consisting of 12 subunits of gp10 on which the scaffolding protein gp7 binds, followed by recruitment of the major capsid protein gp8 (Guo et al., 1991; Lee and Guo, 1995). The scaffolding protein is needed to create a lattice of concentric shells (Morais et al., 2003) and helps to organize the single capsomers built up by gp8 except for the basal connector capsomer. ϕ29 proheads were shown to consist of 235 copies of gp8 arranged as hexameric and pentameric capsomers, indicating that direct interactions between gp8 monomers occur (Morais et al., 2005). The capsid fiber protein gp8.5, which is present in ϕ29 proheads, is presumably connected via gp8 to the prohead and is suggested to be organized as homotrimers via interactions of a C-terminal coiled-coil region (Morais et al., 2005). In addition to gp8, gp8.5, and gp10, ϕ29 proheads contain a homopentameric packaging RNA (pRNA) that binds to the connector, forming a basket-like complex (Simpson et al., 2000). pRNA is needed to translocate DNA into the prohead. Furthermore, the DNA translocation machinery needs the ATPase gp16 to drive the packaging reaction (see also chapter 9 of this volume). Gp16 was shown to bind to the connector protein (Ibarra et al., 2001). While other phages use specific DNA sequences to recognize and recruit the genome to the procapsids, initiation of ϕ29 DNA packaging depends on the TP gp3 present at the genome ends, indicating that an interaction between gp16 and DNA-bound gp3 recruits the DNA to proheads and initiates packaging (Guo et al., 1987). Release of the scaffolding protein is associated with DNA packaging, and pRNA and gp16 are released when there is no more DNA to package.

An interesting aspect of ϕ29 is that the virion contains several enzymes that are associated with structural proteins. For instance, gp12 is a homotrimer that participates in host cell recognition and entry. It also shows autocatalytic cleavage, although it remains unclear what the function of that reaction is. Similarly, gp13 is bound as a dimer at the distal end of the ϕ29 tail knob to gp9; gp13 cleaves both the polysaccharide backbone and peptide cross-links of the cell wall during infection.

Phage	Protein 1	Function	Protein 2	Function	References
GA-1	gp6	Histone-like, nonspecific dsDNA binding protein	gp6	Histone-like, nonspecific dsDNA-binding protein	Freire et al., 1996
Nf	gp6	dsDNA binding protein	gp6	dsDNA-binding protein	Freire et al., 1996
GA-1	gp2	DNA polymerase	gp3	DNA terminal protein	Illana et al., 1996; Longás et al., 2006
Nf	gp2	DNA polymerase	gp3	DNA terminal protein	Longás et al., 2006; González-Huici et al., 2000
GA-1	gp5	Single-stranded DNA-binding protein SSB (1)	gp5	Single-stranded DNA-binding protein SSB	Gascón et al., 2000
GA-1	gp12	Neck appendage protein (2)	gp12	Neck appendage protein	Schulz et al., 2010
GA-1	gp4_G	Transcriptional regulator (3)	gp4_G	Transcriptional regulator	Horcajadas et al., 1999

A. High-throughput data vs small-scale interactions

It took several decades to collect the interactions of phage λ, even though it has only a few dozen genes. These interactions have been studied in painstaking individual experiments. Since the mid-1990s, new strategies have been developed to identify and analyze PPIs on a larger scale, particularly two-hybrid-related methods and techniques involving protein complex purification and mass spectrometry. Despite the pioneering efforts of Bartel et al. (1997) to map the interactome of phage T7 in a “single” experiment, their results remain somewhat contentious. Why is that? Studies show that each two-hybrid system produces system-specific data, and each system may only be able to identify as few as 20–30% of all interactions (Chen et al., 2010). It has been shown that the application of up to 10 different Y2H systems can identify up to 80% of all interactions, but there are still some that go undetected (Chen et al., 2010). This limited sensitivity can be explained by differences between the individual Y2H systems, expression levels, the way fusion proteins are constructed (e.g., N-terminal vs C-terminal fusions), or linkers between fusion proteins. Some interactions may be completely undetectable by Y2H assays because these experiments are typically done in yeast, which may lack the chaperones that bacterial or phage proteins need to fold properly, or because the proteins may localize differently than in their native organism.
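
The gap between single-system and combined coverage can be illustrated with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each Y2H system detects any given interaction independently and with the same probability p; this independence model is our assumption and is not taken from Chen et al. (2010). Under it, n systems would jointly detect a fraction 1 − (1 − p)^n of all interactions. The function name `combined_coverage` is hypothetical.

```python
def combined_coverage(p: float, n: int) -> float:
    """Fraction of interactions found by at least one of n assays.

    Illustrative assumption (not from the source): every assay detects
    each interaction independently with the same sensitivity p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


# Per-system sensitivities quoted in the text: 20-30%.
for p in (0.20, 0.30):
    print(f"p={p:.2f}: 10 systems -> {combined_coverage(p, 10):.2f}")
# prints:
# p=0.20: 10 systems -> 0.89
# p=0.30: 10 systems -> 0.97
```

Under this idealized model, 10 systems would reach roughly 89–97% coverage, which is above the reported figure of up to 80%. That difference is consistent with the point made above: some interactions appear to be missed systematically by Y2H assays rather than at random.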

Similar limitations apply to strategies that involve protein complex purification and mass spectrometry. Because many purifications are sensitive to small changes in protocols, different preparations of a complex often yield different protein compositions in subsequent MS analyses, even when they are made within the same laboratory (Gavin et al., 2006). If two different laboratories purify the same protein complexes using different methods, the results may differ even more dramatically (Hu et al., 2009).

Even if large-scale screens identify the majority of interactions, they will always also identify false positives. For instance, a screen of λ PPIs yielded 97 interactions among λ proteins (Rajagopala et al., 2011). While the authors detected about half of the 33 published interactions listed in Table III, they also found many new interactions. Many of these do not seem plausible and are likely false positives. However, there is no way of knowing which ones are in fact false. We can only make educated guesses and rank all interactions by their plausibility, selecting only the best candidates for more detailed studies.
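
The numbers in this example can be turned into a rough recall estimate. In the sketch below, the value 16 for the recovered interactions is our reading of "about half of the 33" and is an assumption; the exact count is not stated here. The function name `screen_summary` is hypothetical.

```python
def screen_summary(published: int, screen_total: int, recovered: int):
    """Return (recall, novel) for a screen compared with published data.

    recall: fraction of published interactions that the screen recovered.
    novel:  interactions reported by the screen that were not published.
    """
    if not 0 <= recovered <= min(published, screen_total):
        raise ValueError("recovered must not exceed either data set")
    recall = recovered / published
    novel = screen_total - recovered
    return recall, novel


# 33 published and 97 screened interactions are from the text;
# recovered=16 is an assumed value for "about half of the 33".
recall, novel = screen_summary(published=33, screen_total=97, recovered=16)
print(f"recall = {recall:.2f}, novel = {novel}")
# prints: recall = 0.48, novel = 81
```

With these inputs, about 81 of the 97 screened interactions would be new, which would match the "+ 81 HTS" entry for λ in the overview table at the start of the chapter. How many of those are false positives cannot be derived from such counts.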

B. Phages of the same host have different interactomes

Where it is known, all bacteria are infected by numerous unrelated phages. While unrelated phages cannot have identical intraphage interactions, their proteins may interact with the same targets in their host cells. Although interaction databases do not contain enough information about phage–host interactions to draw general conclusions, several lines of evidence indicate that this is not generally the case. For instance, the well-studied phages T7 and λ interact with different host proteins. They seem to share only one common interaction target, namely the RecBCD recombinase complex (see Tables II and IV). λ has about twice as many host–phage interactions as T7, but this is certainly due to λ being temperate. T4 probably does interact somewhat more extensively with the host than T7, and one could argue that this is due to genome size. Similarly, for Streptococcus phage Dp-1, Häuser et al. (2011) found 38 interactions between the 72 phage proteins and the host, while the 28 (unrelated) proteins of phage Cp-1 showed only 11 such interactions. Together, Dp-1 and Cp-1 target a total of 36 host proteins, of which only 2 are shared. More host–phage interactomes are needed to draw more general conclusions, though.
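
One simple way to express how little the host targets of two phages overlap is the Jaccard index, the size of the intersection divided by the size of the union. The sketch below uses placeholder host protein names and a hypothetical split of 28 Dp-1 targets and 10 Cp-1 targets; only the union (36) and the shared count (2) come from the text, and the index itself is our choice of summary, not a measure used by Häuser et al. (2011).

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |a & b| / |a | b|; defined as 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Placeholder identifiers; real host protein names are not given here.
shared = {f"host_{i}" for i in range(1, 3)}        # 2 shared targets
dp1_only = {f"host_{i}" for i in range(3, 29)}     # 26 (hypothetical split)
cp1_only = {f"host_{i}" for i in range(29, 37)}    # 8 (hypothetical split)

dp1_targets = shared | dp1_only
cp1_targets = shared | cp1_only

print(len(dp1_targets | cp1_targets),
      len(dp1_targets & cp1_targets),
      round(jaccard(dp1_targets, cp1_targets), 3))
# prints: 36 2 0.056
```

The index depends only on the union and the intersection, so the result of about 0.06 holds regardless of how the 34 unshared targets are divided between the two phages.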

C. Evolution of interactions [λ, P22]

We expect that related phages have similar interaction patterns, both among phage proteins and between phage and host proteins. This is certainly true for very closely related phages. However, phages evolve much faster than any other living system and thus may lose or gain interactions faster too. Unfortunately, there are not enough interaction data to make detailed comparisons, even among the well-studied lambdoid phages. However, there are many individual cases of proteins that are not obviously homologous by sequence but are believed to share a common ancestor, for example, the capsid proteins of λ and HK97. These proteins also have similar structures and make similar contacts in the capsid (Hendrix, 2005). Evolutionary relationships are also supported by the extensive similarities in parts of their genomes, that is, genome mosaicism (Casjens, 2011; Casjens and Thuman-Commike, 2011; Hendrix and Casjens, 2006). It remains to be seen how many interactions are conserved and how successful phages are in inventing new interactions (or discarding them). More systematic studies of multiple phages in parallel will be required to obtain comparable data sets in which differences in interaction patterns are not based on the use of noncomparable methods.

Work on this chapter was partially funded by NIH Grant R01GM79710 to P.U., the European Union (Grant HEALTH-F3-2009-223101) (P.U., R.H., and S.B.), NIH Grant RO1 AI074825 (S.R.C.), and the Spanish Ministry of Science and Innovation (Grant BFU2008-00215) to M.S.

Bacteriophage Protein–Protein Interactions

I. INTRODUCTION

TABLE I

II. METHODS USED TO DETECT PROTEIN–PROTEIN INTERACTIONS

A. Genetic detection of interactions

B. Electron microscopy

C. Yeast two-hybrid screens

D. Protein complex purification and mass spectrometry

E. Biochemistry of protein complexes

F. X-ray crystallography and nuclear magnetic resonance

III. PROTEIN–PROTEIN INTERACTIONS OF BACTERIOPHAGES

A. Coliphage T7

1. Growth physiology

2. Early steps in infection

3. Early genes

4. T7 RNA polymerase

5. DNA replication

6. Tail

7. Interactome screening

TABLE II

B. Coliphage λ

1. The λ virion

TABLE III

2. Assembly of virion head (Fig. 4)

3. Assembly of virion tail (Fig. 5)

4. Lysogeny and induction

TABLE IV

5. Anti-termination

6. Replication

7. The λ interactome

C. Salmonella phage P22

1. P22 life cycle

2. P22 virion assembly

TABLE V

TABLE VI

3. Comparison of P22 and λ interactomes

TABLE VII

TABLE VIII

D. Enterobacteriophage P2 and its satellite phage, P4

TABLE IX

1. Gene control circuits

2. P2 DNA replication

3. Head structure and assembly

4. Tail structure

5. P4-induced redirection of P2 capsid assembly

E. Pneumococcus phage Dp-1 and Cp-1

F. Bacillus subtilis phage ϕ29-like viruses

1. Protein–protein interactions

TABLE X

TABLE XI

2. Protein-primed DNA replication

3. Interactions during transcriptional regulation

4. Virion structure and morphogenesis
