Proc Natl Acad Sci U S A 105(34): 12248-12253

PMC: PMC2527897

PMID: 18713870

MS3D structural elucidation of the HIV-1 packaging signal

Results

Structural Probing.

The secondary structure of full-length Ψ-RNA was probed by monofunctional agents that alkylate the base-pairing faces of nucleotides exposed to the medium. Kethoxal (KT) was applied to probe N1 and N2 of unpaired guanines, 1-cyclohexyl-3-(2-morpholinoethyl) carbodiimide (CMCT) was used for N3-uridine, and dimethyl sulfate (DMS) for N1-adenine and N3-cytidine (17, 18). DMS was also used to assess the participation of guanine in helical structures, taking advantage of its ability to methylate N7-guanine in the absence of stacking interactions (22). Probing reactions completed under a variety of experimental conditions were terminated by ethanol precipitation to eliminate excess reagent and undesirable salts. Irreversibly modified products were then digested by specific nucleases (e.g., RNase A, T1, etc.) to enable mass mapping and sequencing by electrospray ionization-Fourier transform ion cyclotron resonance (ESI-FTICR) mass spectrometry (23).

The representative ESI-FTICR mass spectrum in Fig. 1A shows that footprinting reagents induce unique and readily recognizable shifts in the molecular masses of digestion products. The relatively low abundance of alkylated adducts as compared with their unmodified counterparts is the intended result of the low probe/substrate ratios necessary to minimize structure distortion and kinetic traps (17–20). Nucleases with different specificities were used to obtain overlapping maps and enable the correct assignment of alkylated nucleotides. If necessary, tandem sequencing was performed to resolve any ambiguity caused by the possible presence of multiple adducts on the same digestion product (19). A comprehensive summary of the products obtained by monofunctional probing is provided in Table S1.
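The way nucleases with different specificities yield overlapping maps can be illustrated with a minimal in silico digestion. This is only a sketch: it relies on the textbook specificities of RNase A (cleavage 3′ of C and U) and RNase T1 (cleavage 3′ of G), assumes complete digestion, and uses a made-up toy sequence rather than any part of Ψ-RNA. The function names `digest` and `label` are hypothetical helpers, and the labels follow the 5′:3′ notation used in Fig. 1.

```python
def digest(seq, cut_after, start=1):
    """Split an RNA sequence 3' of every residue listed in `cut_after`.

    Returns (first_position, last_position, bases) tuples; positions are
    1-based and offset by `start`. Complete digestion is assumed.
    """
    fragments, begin = [], 0
    for i, base in enumerate(seq):
        if base in cut_after:
            fragments.append((start + begin, start + i, seq[begin:i + 1]))
            begin = i + 1
    if begin < len(seq):  # trailing piece that does not end at a cleavage site
        fragments.append((start + begin, start + len(seq) - 1, seq[begin:]))
    return fragments


def label(fragment):
    """Name a fragment by its 5' and 3' nucleotides separated by a colon."""
    first, last, bases = fragment
    return f"{bases[0]}{first}:{bases[-1]}{last}"


toy = "GGAUCGAAGCU"               # made-up sequence, not taken from the paper
rnase_a = digest(toy, "CU")       # RNase A: cuts 3' of pyrimidines
rnase_t1 = digest(toy, "G")       # RNase T1: cuts 3' of guanines
```

Because the two enzymes cut at different residues, a modified nucleotide that sits in a long RNase A fragment usually falls in a shorter or differently bounded RNase T1 fragment, which narrows down its position.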


Fig. 1.

Representative ESI-FTICR mass spectra of RNase A digestion products obtained from full-length Ψ-RNA probed with kethoxal (KT) (A) and nitrogen mustard (NM) (B). Hydrolysis products are identified by their 5′ and 3′ nucleotides separated by a colon. All digestion products included a 2′,3′-cyclic phosphate end, except for those marked by an asterisk, which included a linear 3′-phosphate. A bar connects cross-linked hydrolysis products. (A) Comparing the intensity of A324:U337 (reduced by half to fit on scale) and A324:U337 + KT (enlarged 2.5 times in Inset) gives an idea of typical alkylation yields achieved in probing reactions, which were kept as low as possible to minimize the risk of structure distortion (17,18). (B) The Inset allows one to examine the yields achieved by bifunctional probes.

The tertiary structure of Ψ-RNA was probed using bifunctional agents capable of cross-linking juxtaposed nucleotides, which provide valuable distance relationships between the conjugated moieties. Nitrogen mustard (NM) was used to bridge N7-guanine and N3-adenine with a relatively flexible spacer of 9 ± 3 Å, whereas 1,4-phenyl-diglyoxal (PDG) was used to cross-link N1- and N2-guanine with a more rigid aromatic spacer of 7.5 ± 0.5 Å (20). In the case of bifunctional reagents, nuclease treatment provided a wide range of products (Fig. 1B) consisting of two hydrolytic fragments conjugated by the cross-linker, bifunctional adducts in which both conjugated bases were present on the same fragment or monofunctional adducts corresponding to failed cross-links (20). As described earlier for the footprinting experiments, mass mapping (Fig. S2) and tandem sequencing (Fig. S3) were used as needed to characterize the products of probing reactions and identify the sequence position of conjugated nucleotides. The masses of observed cross-links and their assignments are summarized in Table S2.
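The spacer lengths quoted above translate directly into a distance test between the two conjugated atoms. The following sketch assumes that each spacer is treated as a two-sided window (center ± tolerance); whether a lower bound was actually enforced is not stated here, so that choice, together with the names `SPACERS` and `within_span` and the optional `slack` parameter, should be read as illustrative assumptions.

```python
import math

# Spacer lengths quoted in the text, as (center, tolerance) in angstroms.
SPACERS = {"NM": (9.0, 3.0), "PDG": (7.5, 0.5)}


def within_span(probe, atom_a, atom_b, slack=0.0):
    """Return (ok, distance) for two atoms given as (x, y, z) tuples.

    `ok` is True when the distance falls inside center +/- (tolerance + slack).
    The two-sided window is an assumption made for this sketch.
    """
    center, tol = SPACERS[probe]
    d = math.dist(atom_a, atom_b)
    return abs(d - center) <= tol + slack, d
```

For example, two atoms 10 Å apart are compatible with the flexible NM spacer but not with the more rigid PDG spacer.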

Modification Maps.

The results provided by structural probes were rationalized using two-dimensional plots, in which the positions of modified nucleotides were marked on the consensus secondary structure of Ψ-RNA (4, 24). The color-coded symbols in Fig. 2 identify bases modified by monofunctional footprinting agents, whereas dashed lines connect representative cross-linked products. The filling offers a pictorial representation of the degree of modification observed at each position under a variety of probing conditions (17, 21), which covered different Mg²⁺ concentrations, probe/substrate ratios, reaction temperatures, and incubation intervals (see Materials and Methods). In this way, nucleotides that are either consistently exposed or protected can be readily differentiated from those affected by dynamic processes (e.g., transient pair dissociation, or “breathing”). This notation also allows for a direct comparison of the susceptibility of different targets to the same probe, which may be ascribed to different structural contexts. It is important to note that, unlike for pseudoknot substrates investigated in ref. 21, no significant variations in the modification patterns were observed at different Mg²⁺ concentrations. In those structures, local changes detected at 50 μM Mg²⁺ were ascribed to protection effects induced by specific metal coordination by high-affinity sites (21). No further changes were observed at 100 μM or higher, which could be attributed to the “diffuse” ions attracted near the RNA surface by its electrostatic field (25). No major changes in the modification patterns were detected for Ψ-RNA within the same range of concentrations, which suggests that the effects of Mg²⁺ on this structure are not as significant.
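One simple way to think about the filling notation is as the fraction of probing conditions under which a given position was found modified. The sketch below is a hypothetical bookkeeping scheme, not the scoring used for Fig. 2: the function names and the 0.2 and 0.8 cutoffs are assumptions chosen only to illustrate how consistently exposed, consistently protected, and dynamic positions could be told apart.

```python
def modification_degree(observations):
    """Fraction of probing conditions in which a position was modified.

    `observations` is a sequence of booleans, one per probing condition.
    """
    if not observations:
        raise ValueError("at least one probing condition is required")
    return sum(bool(o) for o in observations) / len(observations)


def classify(observations, low=0.2, high=0.8):
    """Label a position; the `low` and `high` cutoffs are arbitrary assumptions."""
    degree = modification_degree(observations)
    if degree >= high:
        return "consistently exposed"
    if degree <= low:
        return "consistently protected"
    return "dynamic"
```

A position modified under every condition would be drawn with a full symbol, whereas one modified under only some conditions would be a candidate for breathing.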


Fig. 2.

Map summarizing the results obtained by probing full-length Ψ-RNA with mono- and bifunctional reagents. Reported are the sites of modification by DMS (red triangles), CMCT (blue circles), KT (green squares), and nucleotides bridged by NM (dashed lines). Filling gives a representation of the modification degree. When multiple adducts were detected with different distributions on the same digestion product, the position of modified nucleotides was indicated by referring to the entire hydrolysis fragment, using its 5′ and 3′ ends separated by a colon. For clarity, only a handful of representative cross-links are included here as green, red, or blue lines according to whether they were used for modeling, validation, or secondary structure confirmation (see Table S2). A complete summary of bifunctional cross-linkers grouped according to usage is provided in Fig. S4.

In the SL1 domain, guanines located in the bulge were highly susceptible to DMS methylation but were modified by CMCT with only half the yield exhibited by those in the apical loop. The N7-position of guanines in canonical A and B helixes is effectively protected by base stacking (17, 22); therefore, increased DMS reactivity is consistent with disruption of regular stacking between contiguous nucleobases in the bulge region. Reduced susceptibility to CMCT can instead be ascribed to a direct participation in Watson–Crick interactions or to a partially obstructed access to its base-pairing face. Consistent with these results, high-resolution structures obtained from isolated SL1 have shown that G247 and A271 form a noncanonical base pair prone to local breathing, whereas G272 and G273 can adopt alternative conformations pointing in or out of the bulge, which may leave them partially exposed to alkylation (26, 27).

Despite their double-stranded context, G265 in the SL1 upper stem and G340 and G350 in SL4 were effectively modified by DMS (Fig. 2). A common characteristic of these nucleotides is their participation in noncanonic GU pairs (or GU wobbles), which force the local helical structure to assume a less compact conformation with reduced stacking stabilization. Therefore, it was not surprising to detect methylation of G339 and G342, which flank the tandem GU wobbles of SL4 and are thus affected by helix distortion. In the triple-base platform of SL2, A296 was effectively modified because its N1 position remains accessible to the relatively small DMS probe (17). Finally, lower alkylation levels were observed for nucleotides involved in pairs at the end of helical regions, which are left unprotected by the ending of the stacked pattern or are transiently exposed by base pair breathing.

Modifications provided by bifunctional probes consisted of intra- and interdomain conjugated products (Fig. 2). The former are usually observed between nucleotides exposed on single-stranded regions of the same stemloop or linker, which are situated within the span of the selected probe. Intradomain conjugates are necessary to corroborate the secondary structures identified by monofunctional probes and are valuable in establishing the pairing alignment between complementary regions. This characteristic is particularly important when the RNA sequence of interest is capable of folding into alternative structures stabilized by minor register shifts in base pairing. In contrast, interdomain cross-links are obtained between nucleotides located in different secondary structures, which are not in close proximity on a 2D map but are juxtaposed by the substrate fold. The information afforded by this type of cross-link is used to constrain the reciprocal positions of entire domains in three dimensions (21). Additionally, these conjugates can lead to the positive identification of long-range tertiary interactions that are not sustained by canonical base-pairing and consequently are not detectable by classic monofunctional probes.

Among intradomain cross-links, products bridging A259 and G261, A259-A263, and G247–G272 correctly define the structure of the apical and internal loops of SL1 (Fig. 2 and Fig. S2). In particular, the first two are consistent with a full-fledged stemloop and rule out the formation of any extended duplex conformation that would contradict the inability of this mutant to form stable dimers (28). More significantly, salient interdomain cross-links were observed between the loops of contiguous stemloops, defining unambiguously their reciprocal placement in 3D space (Fig. 2 and Fig. S4). For example, G292–A319 connects the loops of SL2 and SL3, whereas G265–G292 links SL1 and SL2 (Fig. S2 A and B, respectively). Other important cross-links bridge the loop and upper stem of SL4 with the bulge of SL1 (e.g., G273–G342 and G247–A345) (Fig. S2C). Taken together, the observed bifunctional products provided a network of spatial relationships, which enabled triangulating the positions of the stemloops in the global fold of Ψ-RNA.

MS3D Modeling.

The results provided by Ψ-RNA were compared with those obtained by probing its isolated domains under identical experimental conditions. The vast majority of the described monofunctional products and intradomain cross-links were obtained also from the separate RNA constructs with only small yield variations, whereas none of the interdomain conjugates was detected. The excellent agreement between domain-specific modifications is consistent with the preservation of the RNA secondary structure determined by well defined pairing within each stemloop, which was not significantly affected by the global fold of the full-length construct. This observation prompted a modeling approach that resembled very closely the hierarchic process prospected for RNA folding (29), in which elements of secondary structure are initially formed by the annealing of complementary sequences within the same domain, whereas tertiary interactions are subsequently established between discrete domains mediated by hydrogen bonding and electrostatic interactions. Reproducing this sequential progression, the high-resolution coordinates of SL1, SL2, SL3, and SL4 (PDB entries 2D1B, 1ESY, 1BN0, and 1JTW) (30 –33) were merged with linker structures created by MC-SYM (34). Initial rounds of simulated annealing and energy minimization were performed using CNS (35), restrained only by base pair hydrogen bonding, base pair planarity, and standard backbone torsion angles. In subsequent rounds, distance constraints provided by high-scoring bifunctional cross-links (M in Table S2 and shown in Fig. S4A) were applied to define the domains' spatial arrangement. Intradomain cross-links (I in Fig. S4B) were mainly used to verify that the structures of discrete domains were retained in the global fold of Ψ-RNA, but were not included in model building. 
After structural strain was relieved by further minimization, the resulting model possessed the atomic-level detail contained in the initial high-resolution structures and the tertiary topology determined by the cross-linking data (Fig. 3).


Fig. 3.

Ribbon diagram of full-length Ψ-RNA viewed from the side (A) and top (B). Red, SL1; green, SL2; blue, SL3; yellow, SL4. Linker sequences between adjoining stemloops are orange. The relatively compact cloverleaf conformation is stabilized by a long-range tertiary interaction between the apical loop of SL4 and the upper stem region of SL1.

Validation of the MS3D structure was obtained by using a subset of intra- and interdomain cross-links, which had been purposely excluded from the modeling procedure (V in Table S2 and shown in Fig. S4C). The distances between cross-linked positions in the final model matched very closely the range allowed by the respective probes. A handful of cross-links involving flexible single-stranded regions (F in Fig. S4D), which had been used for simulated annealing, fell slightly out of range after the final step of energy minimization was performed without explicit cross-linking constraints to relieve model strain. However, no steric clashes or unfavorable geometric situations were noted, which would contradict the actual experimental observation of these conjugates, thus indicating that the model offered a valid representation of the structural context encountered in solution by the probes. When the coordinates of double-stranded nucleotides were compared with the starting high-resolution coordinates used as input, an overall ≈2.90 Å rmsd was obtained from all backbone atoms excluding hydrogens, which should be ascribed to the flexible nature of the probe spacers and to the need for minimizing any strain induced by the global fold onto the initial secondary structures.
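For readers unfamiliar with the rmsd figure quoted above, the quantity can be sketched as follows. The example assumes that the two coordinate sets list the same atoms in the same order and have already been superimposed; the superposition procedure is not described here, and the function name `rmsd` is a hypothetical helper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z) tuples.

    Assumes identical atom ordering and prior superposition of the two sets.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))
```

In practice, the input would be the heavy backbone atoms of the double-stranded nucleotides in the final model and in the starting high-resolution structures.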

After an initial model was obtained, further experiments were performed to test salient features revealed by its structure. Because of the numerous bifunctional products bridging the tetraloop and wobble regions of SL4 with the bulge and upper stem of SL1 (Fig. 2 and Table S2), the modeling process placed these domains in very close contiguity of each other, raising the possibility that they may be involved in a long-range tertiary contact. In particular, these constraints could be simultaneously satisfied only by locating the tetraloop in direct contact with the region immediately above the bulge. The fact that isolated SL4 presents a classic GNRA tetraloop structure (15, 33) suggested that such contact may consist of a GNRA loop-receptor (L/R) interaction stabilized by noncanonical hydrogen bonding between its loop nucleobases and the minor groove of the SL1 upper stem (36 –38). The combination of a helical region followed by an asymmetric loop is very common in receptors observed in nature (36), or selected in vitro (37, 38). For these reasons, we investigated the L/R hypothesis by generating a mutant in which A345 was replaced by C to maintain the stability of the SL4 structure (15, 33) while varying the second nucleotide of the tetraloop, which determines receptor recognition (37, 38). Submitted to monofunctional probing, the mutant exhibited no detectable variations of footprinting pattern (data not shown), which confirmed the preservation of the tetraloop fold. In contrast, the mutation induced the complete elimination of interdomain cross-links between SL4 and SL1, providing experimental evidence for the prospected long-range interaction.

In the absence of direct experimental information on the molecular contacts between nucleobases, the L/R interaction was modeled according to structural features exhibited by known loop-receptor motifs, while ensuring that domain placement satisfied the cross-linking constraints. Although the structure of GAGA-specific receptors has not been described, in vitro selection methods demonstrated that this type of tetraloop can bind with excellent affinity to the broad-spectrum class II receptors (38). In Ψ-RNA, the residual susceptibility to footprinting agents manifested by L/R structures and the absence of detectable Mg²⁺ effects are consistent with characteristics exhibited by this class of receptors, which remain solvent accessible even in bound form and are not magnesium-dependent (38). For these reasons, their structure was used as template for homology-modeling the Ψ-RNA interaction. Matching the consensus sequence at this position, the final model displays A347 making hydrogen bonds in the minor groove of the C248-G270 base pair (Fig. 4). G346 is hydrogen-bonded with the noncanonical G247-A271 base pair that replaces the consensus A-U mismatch, with which it shares a shallower narrow groove (38). This arrangement provides two contiguous base-triplets stabilized by stacking interactions. In addition, A345 forms specific hydrogen bonds with G272 and stacks on G273 protruding out of the bulge motif. This bulge nucleotide is capable of assuming different conformations in isolated SL1 (26, 27, 39) but can be locked in the extrahelical position by stacking with A345, thus contributing to stabilize the L/R interaction. The mutation permissivity described for class II receptors and the stringent requirements driving their selection to maximize loop-binding in vitro (38) can account for local differences with the Ψ-RNA structure, which has likely evolved to retain significant dynamic properties.
In this direction, it should be noted that the nucleotides involved in this L/R interaction are nearly 100% conserved in all HIV-1 sequences published in the 2006/07 HIV Sequence Compendium (40). Although the proposed model will require confirmation by high-resolution techniques, the loss of the long-range contact introduced by mutating A345 is consistent with the central position assumed by this nucleobase in the complex network of hydrogen bonds and stacking interactions, which stabilizes the GNRA loop-receptor interaction in Ψ-RNA.


Fig. 4.

GNRA loop-receptor interaction in Ψ-RNA. (A) Diagram showing the placement of the SL1 and SL4 domains in the proposed interaction. (B) All-atom model detailing the specific molecular contacts. The model was obtained by using class II receptors as template (38).

Discussion

The MS3D model of full-length Ψ-RNA presents a compact cloverleaf morphology in which the helical stems of the different stemloops are situated nearly parallel to each other (Fig. 3). The apical loops point in the same direction, opposite to the linkers between adjoining stems. This arrangement places the majority of the construct's single-stranded nucleotides in positions that are not hindered by juxtaposed structures but are instead accessible to footprinting agents and possible ligands. Consistent with the constraints provided by bifunctional probes, SL2 and SL4 flank SL1 from nearly opposite sides, whereas SL3 is placed closer to SL4. Significantly, the putative GNRA loop-receptor interaction between SL4 and SL1 defines the overall cloverleaf fold by connecting secondary structures that are coded by nucleotides at the opposite ends of the Ψ-RNA sequence.

The compact nature of this structure provides valuable indications for understanding the modes of interaction between Ψ-RNA and the viral Gag polyprotein during genome recognition and packaging. In particular, the preservation of the individual stemloops accounts for their ability to act as discrete binding units for the NC domain of Gag, which was substantiated by the identification of six specific binding sites on the apical loops, the SL1 bulge, and the linker between SL3 and SL4 (11, 41). The distribution of such sites on the different domains and the observation that NC can bind them in vitro with comparable affinities rule out the possible formation of an alleged supersite of enhanced affinity, which might be produced only in the context of a full-length construct by extensive remodeling of its secondary structure. The very close spatial proximity of these sites could explain the lack of complete packaging suppression after deletion of individual stemloops (13), which could be partially compensated by the intrinsic ability of Gag to dimerize through protein–protein interactions (42).

The participation of SL4 in a specific GNRA loop-receptor interaction further expands the broad range of activities performed by this region of genomic RNA in vitro and in vivo. In addition to spanning the start of the gag gene, this sequence has been shown capable of annealing with different upstream sequences to stabilize alternative conformations of the 5′-untranslated region (5′-UTR) of viral RNA, which have been proposed to mediate distinct biological activities (14). Although probing of 5′-UTR in virions and infected cells has failed to establish the presence of either a full-fledged SL4 stemloop, or alternative RNA conformers (24), these conclusions were based solely on the monofunctional agent DMS, which can only reveal the presence of footprinting protection but cannot identify its source. Considering the ability of NC to bind the SL4 tetraloop (11, 41) in possible competition with its SL1 interaction, even the transient folding of a full-fledged stemloop could provide a mechanism for Gag to affect the 3D structure of 5′-UTR, either by dissociating the GNRA loop-receptor interaction, or by chaperoning the transition between alternative conformers. If confirmed in vivo, the central role suggested by these multifaceted activities would make the SL4 sequence a primary target for possible therapeutic intervention.

Finally, the MS3D elucidation of Ψ-RNA demonstrates the feasibility of a general divide-and-conquer strategy for the structural determination of progressively larger multisubunit assemblies that are not directly amenable to established techniques. According to such strategy, high-resolution coordinates could be obtained for well behaved parts of the target structure by NMR or crystallography, and structural probing would provide the constraints necessary to assemble the separate sections into a complete 3D model. It is important to emphasize that, once a probing reaction is completed without disruption of the native fold, the desired structural information is stored into a pattern of irreversible chemical modifications, which is not affected by subsequent hydrolytic procedures designed to handle substrates of virtually any size. The broad applicability of MS-based platforms to substrates of diverse biochemical nature and the development of software tools for aiding the direct utilization of these types of constraints are expected to propel MS3D to the forefront of the strategies used for elucidating previously intractable biological substrates.

Materials and Methods

RNA Samples.

RNA constructs were prepared by in vitro transcription of appropriate DNA templates, using the phage T7 polymerase reaction. A template spanning the G240–G354 sequence of HIV-1 subtype B (Lai isolate) was prepared from the pNL4-3 plasmid provided by the National Institutes of Health AIDS Reagents Repository (Germantown, MD). The described mutations were introduced using the QuikChange II kit (Stratagene). Transcribed RNAs were purified by denaturing gel electrophoresis performed on 6% (wt/vol) polyacrylamide gels and recovered by electroelution from manually excised bands. Desalting was performed by ultrafiltration using Centricon devices from Millipore (11, 21).

Structural Probing.

Dimethyl sulfate (DMS), 1-cyclohexyl-3-(2-morpholinoethyl)carbodiimide metho-p-toluenesulfonate (CMCT), and bis (2-chloroethyl)methylamine [nitrogen mustard (NM)] were purchased from Sigma–Aldrich and used without further purification. Kethoxal (KT) was purchased from ICN. 1,4-Phenyl-diglyoxal (PDG) was synthesized in house and purified according to procedures described in ref. 20. Due to their known or suspected mutagenic activity, these reagents were treated using all possible precautions to avoid inhalation and direct contact with eyes, mucosae, and skin, including the use of latex gloves, face masks, and fume hoods.

Before probing, each target RNA was reannealed by incubating for five minutes at 90°C and slow-cooling to room temperature. Typical probing buffers consisted of either 50 mM (NH₄)₂BO₃ for KT and PDG or 50 mM NH₄OAc for the remaining agents, with pH adjusted to the optimum value for each probe (17–19). In selected experiments, 50 to 100 μM MgCl₂ was added to determine the effect of the divalent metal on probing patterns. It should be noted that, although the [Mg²⁺] in viral particles is still debated, a value of only 240 μM has been reported for the intracellular compartment of human lymphocytes (43). Probe to RNA ratios were experimentally optimized for each specific reaction to avoid saturating susceptible nucleotides, which might result in structure distortion and secondary alkylation. Typical reactions involved adding an ≈10-fold excess probe to 25 μM RNA in a total volume of 20 μl. Typical reactions were carried out at room temperature for 20 min and quenched by ethanol precipitation in the presence of 1.5 M NH₄OAc. Probed RNA was reconstituted in RNase-free water to prepare ≈10 μM aliquots for digestion with RNase A and RNase T1. Both enzymes were purchased from Sigma–Aldrich and used as received. A total of 0.05 units of RNase was added per mg of RNA substrate; each digestion was carried out in 20 mM NH₄OAc (pH 7.5) at 37°C for 1–2 h to drive the hydrolysis to near completion.
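The quantities implied by a typical reaction can be worked out with simple arithmetic. The sketch below assumes that the ≈10-fold excess is a molar excess over RNA molecules and that the RNA is fully recovered after ethanol precipitation; both are assumptions made only for illustration, and the helper name `picomoles` is hypothetical.

```python
def picomoles(conc_uM, volume_ul):
    """Amount in pmol; 1 µM x 1 µl = 1e-6 mol/l x 1e-6 l = 1 pmol."""
    return conc_uM * volume_ul


reaction_volume_ul = 20
rna_pmol = picomoles(25, reaction_volume_ul)        # RNA in one typical reaction
probe_pmol = 10 * rna_pmol                          # assumed 10-fold molar excess
probe_conc_uM = probe_pmol / reaction_volume_ul     # probe concentration in the reaction
# Volume of water needed for a ~10 µM aliquot, assuming full recovery of the RNA.
reconstitution_volume_ul = rna_pmol / 10
```

Under these assumptions, each reaction contains 500 pmol of RNA and 5 nmol of probe (250 μM), and the precipitated RNA would be taken up in about 50 μl of water.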

Mass Spectrometry and Data Analysis.

Analyte solutions containing a final concentration of 5–10 μM RNA in electrospray buffer (50:50 10 mM ammonium acetate:iso-propanol) were prepared by appropriate dilution of the original digestion mixtures. All determinations were performed on a Bruker Apex III Fourier transform ion cyclotron resonance (FTICR) mass spectrometer equipped with a 7T shielded superconductive magnet and a thermally-assisted Apollo electrospray ion source (17 –19). Each experiment required loading ≈5 μl of analyte solution into a nanospray needle with a stainless steel wire inserted from the back end to provide the necessary voltage. Tandem mass spectrometry was completed as described earlier (19). Data reduction and analysis were performed with the aid of MS2Links (19) and the Mongo Oligo Mass Calculator version 2.05 [made available by J. A. McCloskey (University of Utah, Salt Lake City)].

Molecular Modeling.

Initial models were generated by combining the coordinates of each stemloop (PDB entries 2D1B, 1ESY, 1BN0, and 1JTW) (30–33), using the merge_structure.inp module of CNS (35). Linker sequences were built de novo using MC-SYM (34). In iterative fashion, the number of cross-links used for model building was progressively increased each round, following the ranking order provided in Table S2. The iterations were stopped when no significant variations were detected between successive models. A handful of randomly selected cross-links were used for validation purposes. The CNS modules anneal.inp and minimize.inp were used to submit the model to rounds of simulated annealing and Cartesian molecular dynamics restrained by standard backbone torsion angles, planarity, and hydrogen bonding of base pairs, according to the modified CHARMM force field used by CNS. Restraint set files normally used for NOE restraints in anneal.inp were used to specify distances between cross-linked nucleotides. The specific contacts between the GNRA tetraloop and its double-stranded receptor were modeled by homology with interactions supported by class II receptors (38). To establish the correct contacts in the minor groove of the SL1 helix, hydrogen bonding was enforced between nucleotide A347 and base pair C248–G270, between G346 and G247–A271, and between A345 and G272, as discussed in Results. The specific hydrogen bonding arrangement was built according to typical base-triplet interactions. The final model was energy minimized without enforcing the cross-linking constraints to relieve possible strain, which did not result in disruption of the tetraloop docking into the receptor site. Models were viewed using PyMOL (http://pymol.sourceforge.net), and figures were prepared using the NUCCYL module (www.mssm.edu/students/jovinl02/research/nuccyl.html) on a Linux computer.
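The iterative use of ranked cross-links described above can be summarized as a simple control loop. This is a schematic sketch, not the scripts used in this work: `build_model` stands in for a full round of restrained simulated annealing and minimization (performed here with CNS), `difference` stands in for whatever comparison between successive models is adopted (for example an rmsd), and `tolerance` is an arbitrary threshold for "no significant variation". All three are hypothetical placeholders, as is the choice of adding one cross-link per round.

```python
def build_iteratively(ranked_crosslinks, build_model, difference, tolerance):
    """Add ranked cross-links one per round until successive models agree.

    ranked_crosslinks: constraints sorted from highest to lowest rank.
    build_model: callable taking the list of constraints used so far
                 (placeholder for an annealing/minimization run).
    difference: callable comparing two models (placeholder metric).
    Returns the last model built and the constraints it used.
    """
    previous, used = None, []
    for link in ranked_crosslinks:
        used.append(link)
        model = build_model(list(used))
        if previous is not None and difference(previous, model) < tolerance:
            return model, used  # converged: adding this constraint changed little
        previous = model
    return previous, used       # ran out of constraints before converging
```

Cross-links that are never fed to `build_model` remain available as an independent check of the final structure.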

Department of Chemistry and Biochemistry, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250

To whom correspondence should be addressed. E-mail: fabris@umbc.edu

Edited by Roger D. Kornberg, Stanford University School of Medicine, Stanford, CA, and approved July 1, 2008

Author contributions: E.T.Y. and D.F. designed research; E.T.Y. and A.H. performed research; E.T.Y., A.H., J.E., and D.F. analyzed data; and D.F. wrote the paper.

*Present address: Sandia National Laboratories, Livermore, CA 94551.

E.T.Y. and A.H. contributed equally to this work.

Received 2008 Jan 17

Abstract

The structure of HIV-1 Ψ-RNA has been elucidated by a concerted approach combining structural probes with mass spectrometric detection (MS3D), which is not affected by the size and crystallization properties of target biomolecules. Distance constraints from bifunctional cross-linkers provided the information required for assembling an all-atom model from the high-resolution coordinates of separate domains by triangulating their reciprocal placement in 3D space. The resulting structure revealed a compact cloverleaf morphology stabilized by a long-range tertiary interaction between the GNRA tetraloop of stemloop 4 (SL4) and the upper stem of stemloop 1 (SL1). The preservation of discrete stemloop structures ruled out the possibility that major rearrangements might produce a putative supersite with enhanced affinity for the nucleocapsid (NC) domain of the viral Gag polyprotein, which would drive genome recognition and packaging. The steric situation of single-stranded regions exposed on the cloverleaf structure offered a valid explanation for the stoichiometry exhibited by full-length Ψ-RNA in the presence of NC. The participation of SL4 in a putative GNRA loop-receptor interaction provided further indications of the plasticity of this region of genomic RNA, which can also anneal with upstream sequences to stabilize alternative conformations of the 5′ untranslated region (5′-UTR). Considering the ability to sustain specific NC binding, the multifaceted activities supported by the SL4 sequence suggest a mechanism by which Gag could actively participate in regulating the vital functions mediated by 5′-UTR. Substantiated by the 3D structure of Ψ-RNA, the central role played by SL4 in specific RNA-RNA and protein-RNA interactions advances this domain as a primary target for possible therapeutic intervention.

Keywords: GNRA loop-receptor interaction, high-resolution mass spectrometry, HIV-1 Psi-RNA, modeling, structural probing

The multifaceted activities attributed to the packaging signal (Ψ-RNA) (1, 2) of HIV-1 are mediated by discrete stemloop domains that exercise individual functions, but can also operate in concert to complete complex tasks [supporting information (SI) Fig. S1] (3–5). Stemloop 1 (SL1) is primarily responsible for genome dimerization (6, 7), but its mutations induce significant reductions of RNA encapsidation (8). Stemloop 2 (SL2) contains the major splice donor (SD) (9), but its high affinity for the nucleocapsid (NC) domain of the Gag polyprotein implies its participation in genome recognition and packaging (10, 11). Stemloop 3 (SL3) is sufficient by itself to induce the packaging of heterologous RNA into virus-like particles (12), but its deletion does not completely eliminate encapsidation, which is still sustained with suboptimal efficiency by the remaining stemloops (13). Located downstream of the gag start codon, stemloop 4 (SL4) may be engaged in both coding and non-coding activities, as suggested by its possible involvement in long-range pairing interactions (14) and its ability to bind NC in vitro (11, 15). Despite a wealth of data showing extensive communication among domains, the bases for their functional relationships are not substantiated by comprehensive structural information, which is still missing because of the size and flexibility of full-length Ψ-RNA.

A promising alternative for elucidating the 3D structure of biomolecules with no intrinsic limitations of size and crystallization properties is represented by the combination of footprinting and cross-linking techniques with mass spectrometric detection, collectively known as MS3D (16). We have recently developed MS3D approaches for RNA and its functional assemblies, which involve the application of structural probes under conditions that preserve the native folding of the target substrate (17). The identity and sequence position of ensuing covalent adducts are determined by nuclease digestion, mass mapping, and tandem sequencing (17–19), which enable the utilization of a broader probe selection than allowed by traditional electrophoresis-based methods (20). The experimental constraints obtained from structural probing are combined with fundamental information pertaining to RNA structure to generate all-atom 3D models of target substrates. The validity of this approach was confirmed by the excellent agreement between the NMR and MS3D structures of the mouse mammary tumor virus pseudoknot completed in earlier work (21).
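The mass-mapping step described above can be illustrated with a minimal sketch, which is not the authors' software. It assumes that each probe adds a fixed mass increment to a digestion product; only the monoisotopic shift for one methylation (CH2, 14.01565 Da) is used here. The function name `assign_adducts`, the fragment labels, the fragment masses, and the 10-ppm tolerance are hypothetical placeholders chosen for illustration, not values from this study.

```python
# Illustrative sketch only: fragment names and masses passed to this function
# are placeholders, not data from the study.

METHYL_SHIFT = 14.01565  # monoisotopic mass added by one methylation (CH2), in Da


def assign_adducts(observed, unmodified, shift=METHYL_SHIFT,
                   max_adducts=2, tol_ppm=10.0):
    """Match observed masses to digestion products carrying 0..max_adducts adducts.

    observed   -- list of observed neutral masses (Da)
    unmodified -- dict mapping a fragment label to its unmodified mass (Da)
    Returns a list of (observed_mass, fragment_label, adduct_count) tuples.
    """
    assignments = []
    for mass in observed:
        for name, base in unmodified.items():
            for n in range(max_adducts + 1):
                expected = base + n * shift
                # Accept the match if the relative error is within tolerance.
                if abs(mass - expected) / expected * 1e6 <= tol_ppm:
                    assignments.append((mass, name, n))
    return assignments
```

In practice, overlapping maps from nucleases of different specificity and tandem sequencing would still be needed to place an adduct on a specific nucleotide; the sketch only counts adducts per digestion product.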

The MS3D approach was implemented here to complete the structural determination of full-length Ψ-RNA. Any ambiguities arising from the simultaneous presence of alternative dimeric conformations were avoided by focusing on a 115-nt construct corresponding to the 240–354 sequence of HIV-1 subtype B (Lai variant), in which G259 was replaced with A to disrupt the palindrome responsible for initiating RNA dimerization (DIS, Fig. S1) (6, 7). Additional mutants were designed ad hoc to validate salient features that emerged from initial probing experiments. Reproducing the sequential nature of the RNA folding process, full-length Ψ-RNA was modeled from the high-resolution coordinates of separate domains available in the Protein Data Bank (PDB), which were assembled according to the experimental constraints from structural probes. The new insights provided by this all-atom model are discussed in the context of the known functional communications among domains. A greater understanding of the structural basis for these relationships could provide the key for developing novel inhibitors aimed at disrupting the crucial processes of genome recognition, dimerization, and packaging in HIV-1.
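As a quick consistency check on the construct described above, the following sketch relates genomic numbering to positions within the construct. It assumes that the 240–354 range is inclusive at both ends; the constant and function names are hypothetical and introduced only for illustration.

```python
# Minimal sketch, assuming inclusive genomic numbering for the 240-354 range.

CONSTRUCT_START = 240   # first genomic position in the construct
CONSTRUCT_END = 354     # last genomic position in the construct
MUTATION_SITE = 259     # genomic position of the G-to-A replacement


def construct_length(start=CONSTRUCT_START, end=CONSTRUCT_END):
    """Number of nucleotides in an inclusive genomic range."""
    return end - start + 1


def to_construct_index(genomic_pos, start=CONSTRUCT_START, end=CONSTRUCT_END):
    """Convert a genomic position to a 1-based position within the construct."""
    if not start <= genomic_pos <= end:
        raise ValueError(f"position {genomic_pos} lies outside {start}-{end}")
    return genomic_pos - start + 1


print(construct_length())                 # 115
print(to_construct_index(MUTATION_SITE))  # 20
```

Under this assumption, the range spans 115 nucleotides, in agreement with the stated construct size, and the mutated position 259 is the 20th nucleotide of the construct.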

Acknowledgments.

We thank Dr. M. F. Summers (University of Maryland Baltimore County) for helpful discussions. This work was supported by National Institutes of Health Grant GM643208 and National Science Foundation Grant CHE-0439067.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/cgi/content/full/0800509105/DCSupplemental.

References

1. Harrison GP, Lever AML. The human immunodeficiency virus type-1 packaging signal and a major splice donor region have a conserved stable secondary structure. J Virol. 1992;66:4144–4153.
2. Baudin F, et al. Functional sites in the 5′ region of human immunodeficiency virus type 1 RNA form defined structural domains. J Mol Biol. 1993;229:382–397.
3. Luban J, Goff S. Mutational analysis of cis-acting packaging signals in human immunodeficiency virus type 1 RNA. J Virol. 1994;68:3784–3793.
4. Clever J, Sassetti C, Parslow TG. RNA secondary structure and binding sites for gag gene products in the 5′ packaging signal of human immunodeficiency virus type 1. J Virol. 1995;69:2101–2109.
5. Berkhout B. Structure and function of the human immunodeficiency virus leader RNA. Prog Nucleic Acids Res Mol Biol. 1996;54:1–34.
6. Laughrea M, Jetté L. A 19-nucleotide sequence upstream of the 5′ major splice donor is part of the dimerization domain of human immunodeficiency virus 1 genomic RNA. Biochemistry. 1994;33:13464–13474.
7. Skripkin E, Paillart JC, Marquet R, Ehresmann B, Ehresmann C. Identification of the primary site of the human immunodeficiency virus type 1 RNA dimerization in vitro. Proc Natl Acad Sci USA. 1994;91:4945–4949.
8. Berkhout B, van Wamel JL. Role of the DIS hairpin in replication of human immunodeficiency virus type 1. J Virol. 1996;70:6723–6732.
9. Purcell DF, Martin MA. Alternative splicing of human immunodeficiency virus type 1 mRNA modulates viral protein expression, replication, and infectivity. J Virol. 1993;67:6365–6378.
10. Amarasinghe GK, et al. NMR structure of the HIV-1 nucleocapsid protein bound to stem-loop SL2 of the Ψ-RNA packaging signal. Implications for genome recognition. J Mol Biol. 2000;301:491–511.
11. Hagan N, Fabris D. Direct mass spectrometric determination of the stoichiometry and binding affinity of the complexes between HIV-1 nucleocapsid protein and RNA stem-loops hairpins of the HIV-1 Ψ-recognition element. Biochemistry. 2003;42:10736–10745.
12. Hayashi T, Shioda T, Iwakura Y, Shibuta H. RNA packaging signal of human immunodeficiency virus type 1. Virology. 1992;188:590–599.
13. Clever J, Parslow TG. Mutant human immunodeficiency virus type I genomes with defects in RNA dimerization or encapsidation. J Virol. 1997;71:3407–3414.
14. Huthoff H, Berkhout B. Two alternating structures of the HIV-1 leader RNA. RNA. 2001;7:143–157.
15. Amarasinghe GK, et al. Stem-loop 4 of the HIV-1 Ψ-RNA packaging signal exhibits weak affinity for the nucleocapsid protein. Structural studies and implications for genome recognition. J Mol Biol. 2001;314:961–970.
16. Young MM, et al. High throughput protein fold identification by using experimental constraints derived from intramolecular cross-links and mass spectrometry. Proc Natl Acad Sci USA. 2000;97:5802–5806.
17. Yu E, Fabris D. Direct probing of RNA structures and RNA-protein interactions in the HIV-1 packaging signal by chemical modification and electrospray ionization Fourier transform mass spectrometry. J Mol Biol. 2003;330:211–223.
18. Yu ET, Fabris D. Toward multiplexing the application of solvent accessibility probes for the investigation of RNA three-dimensional structures by electrospray ionization-Fourier transform mass spectrometry. Anal Biochem. 2004;344:356–366.
19. Kellersberger KA, Yu E, Kruppa GH, Young MM, Fabris D. Top-down characterization of nucleic acids modified by structural probes using high-resolution tandem mass spectrometry and automated data interpretation. Anal Chem. 2004;76:2438–2445.
20. Zhang Q, Yu ET, Kellersberger KA, Crosland E, Fabris D. Toward building a database of bifunctional probes for the MS3D investigation of nucleic acids structures. J Am Soc Mass Spectrom. 2006;17:1570–1581.
21. Yu ET, Zhang Q, Fabris D. Untying the FIV frameshifting pseudoknot structure by MS3D. J Mol Biol. 2005;345:69–80.
22. Peattie DA, Gilbert W. Chemical probes for higher-order structure in RNA. Proc Natl Acad Sci USA. 1980;77:4679–4682.
23. Hendrickson CL, Emmett MR, Marshall AG. Electrospray ionization Fourier transform ion cyclotron resonance mass spectrometry. Annu Rev Phys Chem. 1999;50:517–536.
24. Paillart JC, et al. First snapshots of the HIV-1 RNA structure in infected cells and in virions. J Biol Chem. 2004;279:48397–48403.
25. Draper DE. A guide to ions and RNA structure. RNA. 2004;10:335–343.
26. Yuan Y, Kerwood DJ, Paoletti AC, Shubsda MF, Borer PN. Stem of SL1 RNA in HIV-1: Structure and nucleocapsid protein binding for a 1 × 3 internal loop. Biochemistry. 2003;42:5259–5269.
27. Lawrence DC, Stover CC, Noznitsky J, Wu Z, Summers MF. Structure of the intact stem and bulge of HIV-1 Psi-RNA stem-loop SL1. J Mol Biol. 2003;326:529–542.
28. Turner KB, Hagan NA, Fabris D. Understanding the isomerization of the HIV-1 dimerization initiation domain by the nucleocapsid protein. J Mol Biol. 2007;369:812–828.
29. Tinoco I, Bustamante C. How RNA folds. J Mol Biol. 1999;293:271–281.
30. Baba S, et al. Solution RNA Structures of the HIV-1 Dimerization Initiation Site in the Kissing-Loop and Extended-Duplex Dimers. J Biochem (Tokyo). 2005;138:583–592.
31. Amarasinghe GK, De Guzman RN, Turner RB, Summers MF. NMR structure of stem-loop SL2 of the HIV-1 psi RNA packaging signal reveals a novel A-U-A base-triple platform. J Mol Biol. 2000;299:145–156.
32. Pappalardo L, Kerwood DJ, Pelczer I, Borer PN. Three-dimensional folding of an RNA hairpin required for packaging HIV-1. J Mol Biol. 1998;282:801–818.
33. Kerwood DJ, Cavaluzzi MJ, Borer PN. Structure of SL-4 from the HIV-1 packaging signal. Biochemistry. 2001;40:14518–14529.
34. Major F, Gautheret D, Cedergren R. Reproducing the three-dimensional structure of a tRNA molecule from structural constraints. Proc Natl Acad Sci USA. 1993;90:9408–9412.
35. Brünger AT, et al. Crystallography and NMR System: A new software suite for macromolecular structure determination. Acta Crystallogr D. 1998;54:905–921.
36. Cate JH, et al. RNA tertiary structure mediation by adenosine platforms. Science. 1996;273:1696–1699.
37. Costa M, Michel F. Rules for RNA recognition of GNRA tetraloops deduced by in vitro selection: Comparison with in vivo evolution. EMBO J. 1997;16:3289–3302.
38. Geary C, Baudrey S, Jaeger L. Comprehensive features of natural and in vitro selected GNRA tetraloop-binding receptors. Nucleic Acids Res. 2008;36:1138–1152.
39. Greatorex J, Gallego J, Varani G, Lever A. Structure and stability of wild-type and mutant RNA internal loops from the SL-1 domain of the HIV-1 packaging signal. J Mol Biol. 2002;322:543–557.
40. Leitner T, et al., editors. HIV Sequence Compendium 2006/2007. Los Alamos, NM: Theoretical Biology and Biophysics Group, Los Alamos National Laboratory; 2007.
41. Fabris D, Chaudhari P, Hagan NA, Turner KB. Functional investigations of retroviral protein-ribonucleic acid complexes by nanospray Fourier transform ion cyclotron resonance mass spectrometry. Eur J Mass Spectrom. 2007;13:29–33.
42. Johnson MC, Scobie HM, Ma YM, Vogt VM. Nucleic acid-independent retrovirus assembly can be driven by dimerization. J Virol. 2002;76:11177–11185.
43. Delva P, Pastori C, Degan M, Montesi G, Lechi A. Catecholamine-induced regulation in vitro and ex vivo of intralymphocyte ionized magnesium. J Membr Biol. 2004;199:163–171.