Proc Natl Acad Sci U S A 100(23): 13280-13285

Trp-cage: Folding free energy landscape in explicit water

Computational Biology Center, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598; and Department of Chemistry, Columbia University, New York, NY 10027

E-mail: ruhongz@us.ibm.com.

Edited by Peter G. Wolynes, University of California at San Diego, La Jolla, CA

Received 2003 May 31; Accepted 2003 Sep 3.

Abstract

Trp-cage is a 20-residue miniprotein, which is believed to be the fastest folder known so far. In this study, the folding free energy landscape of Trp-cage has been explored in explicit solvent by using an OPLSAA force field with periodic boundary condition. A highly parallel replica exchange molecular dynamics method is used for the conformation space sampling, with the help of a recently developed efficient molecular dynamics algorithm P3ME/RESPA (particle–particle particle–mesh Ewald/reference system propagator algorithm). A two-step folding mechanism is proposed that involves an intermediate state where two correctly formed partial hydrophobic cores are separated by an essential salt-bridge between residues Asp-9 and Arg-16 near the center of the peptide. This metastable intermediate state provides an explanation for the superfast folding process. The free energy landscape is found to be rugged at low temperatures, and then becomes smooth and funnel-like above 340 K. The lowest free energy structure at 300 K is only 1.50 Å Cα-RMSD (Cα-rms deviation) from the NMR structures. The simulated nuclear Overhauser effect pair distances are in excellent agreement with the raw NMR data. The temperature dependence of the Trp-cage population, however, is found to be significantly different from experiment, with a much higher melting transition temperature above 400 K (experimental 315 K), indicating that the current force fields, parameterized at room temperature, need to be improved to correctly predict the temperature dependence.

Understanding protein folding is critical in molecular biology not only because it is one of the fundamental problems remaining in protein science, but also because several fatal diseases, such as Alzheimer's disease, mad cow disease, and cystic fibrosis, are directly related to protein folding/misfolding (1–4). Despite enormous efforts made by various groups, the problem is still largely unsolved. Experiments that probe proteins at different stages of the folding process have helped elucidate both the kinetics and the thermodynamics of folding, but many of the details remain unknown. Computer simulations performed at various levels of complexity, ranging from simple lattice models with no solvent to all-atom models with explicit solvent, are used to supplement experiment and fill in some of the gaps in our knowledge about protein folding (1–4). Typical molecular dynamics (MD) simulations using all-atom models are still within the nanosecond to microsecond regime, even with today's supercomputers (2–4), but most known proteins fold in the microsecond to millisecond time frame. Both experimentalists and theoreticians keep searching for ever faster folders to make the two ends meet (there is probably a limit to how fast a protein can fold due to the diffusion rate). A number of rapidly folding proteins have been characterized in recent years to fulfill this need, because these fast-folding proteins can provide the first direct comparisons between simulations and experiments.

The 20-residue miniprotein Trp-cage (NLYIQ WLKDG GPSSG RPPPS) designed recently by Neidigh et al. (5) is probably one of the best such examples. This protein folds spontaneously and cooperatively into a Trp-cage in ≈4 μs (6), which makes it by far the fastest folding protein known. The protein was derived from the C terminus of the 39-residue exendin-4 peptide. Several constructs of increasing stability were made by gradually introducing stabilizing features such as helical N-capping residues and a solvent-exposed salt-bridge (5). Trp-cage contains a short α-helix in residues 2–9, a 3₁₀-helix in residues 11–14, and a C-terminal polyproline II helix that packs against the central tryptophan (Trp-6) (5, 6). The folding seems highly cooperative, with CD, fluorescence, and chemical shift deviations (CSD) generating virtually identical sigmoidal thermal denaturation profiles (5, 6). The small size, high stability, and fast folding time make Trp-cage an ideal choice for protein folding simulations.

Several simulations of Trp-cage have already been published: one by Simmerling et al. (7), who ran a few 20- to 50-ns MD simulations with a modified AMBER99 (assisted model building with energy refinement 99) force field and found a structure very close to the native one among many low potential energy structures; one by Snow et al. (8), who estimated the folding rate by using thousands of short (nanosecond) kinetics runs with the united-atom OPLS force field; and another by Pitera and Swope (9), who ran the replica exchange method and found a structure with <1.0-Å Cα-rms deviation (Cα-RMSD) in the simulated ensemble using the AMBER94 force field. All three studies used a continuum solvent model, the generalized Born (GB) model (10), to save computational cost. Here, we intend to study the Trp-cage folding in explicit solvent using the powerful replica exchange method (REM) (11, 12) with the help of an efficient MD algorithm P3ME/RESPA (particle–particle particle–mesh Ewald/reference system propagator algorithm) (13). It has been shown recently that GB-type continuum solvent models might have deficiencies, such as predicting incorrect lowest free energy structures (14, 15), overly strong salt-bridges (14), and the absence of desolvation free energy barriers (16–18). Thus, we chose the explicit solvent model to study the Trp-cage folding in water.

REM is a powerful tool for efficient sampling of conformation space (11, 12). The free energy landscapes of protein folding in water are believed to be at least partially rugged. At room temperature (RT), protein systems are often trapped in many local potential energy minima. This trapping limits the capacity for effective sampling of the configurational space using normal MD. The high-temperature replicas in REM can traverse high energy barriers, which provides a mechanism for the low-temperature replicas to overcome the quasi-ergodicity they would otherwise encounter. A recently developed MD algorithm P3ME/RESPA (13) is also used to help efficiently explore the free energy landscape. The all-atom OPLSAA (optimized potential for liquid simulation–all atom model) force field and SPC (simple point charge) explicit water model are used with periodic boundary conditions. Based on detailed results from the simulation in explicit solvent, a folding mechanism has been proposed for this Trp-cage that involves an intermediate state where the structures show two partially prepacked hydrophobic cores separated by a salt-bridge between residues Asp-9 and Arg-16 near the center of the peptide. This metastable intermediate state might have provided a mechanism for a fast two-step folding process for this miniprotein. The lowest free energy structure at 300 K shows only a 1.50-Å Cα-RMSD from the NMR structures, and the simulated nuclear Overhauser effect (NOE) pair distances are in excellent agreement with the raw NMR data at 300 K. However, the temperature dependence of the native Trp-cage population is found to be significantly different from experiment, which indicates that the current force fields parameterized near RT need to be improved to correctly predict the temperature dependence.
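The way REM lets low-temperature replicas escape local minima can be illustrated with the exchange step alone. The sketch below is not the implementation used in this work; it assumes the standard Metropolis exchange criterion for temperature replica exchange, in which two replicas at inverse temperatures β_i and β_j with potential energies E_i and E_j swap configurations with probability min(1, exp[−(β_i − β_j)(E_j − E_i)]). The function names, the energy unit (kcal/mol), and the simple sequential sweep over neighboring temperatures are illustrative assumptions; velocity rescaling after a swap and the MD propagation between swaps are omitted.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the energy unit is an assumption of this sketch.
K_B = 0.0019872041


def swap_probability(e_i, t_i, e_j, t_j):
    """Metropolis probability of exchanging the configurations of two replicas.

    e_i, e_j: potential energies (kcal/mol) of the configurations currently
              held at temperatures t_i and t_j (kelvin).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_j - e_i)
    if delta <= 0.0:
        # Moving the lower-energy configuration to the colder replica
        # is always accepted.
        return 1.0
    return math.exp(-delta)


def attempt_neighbor_swaps(energies, temperatures, rng=random):
    """One sequential sweep of swap attempts between neighboring temperatures.

    energies[c] is the potential energy of configuration c, which starts at
    temperatures[c]. Returns a list whose k-th entry is the index of the
    configuration held at temperatures[k] after the sweep.
    """
    order = list(range(len(temperatures)))
    for k in range(len(temperatures) - 1):
        i, j = order[k], order[k + 1]
        p = swap_probability(energies[i], temperatures[k],
                             energies[j], temperatures[k + 1])
        if rng.random() < p:
            order[k], order[k + 1] = j, i
    return order


if __name__ == "__main__":
    temps = [300.0, 310.0, 320.0]      # hypothetical temperature ladder
    energies = [-100.0, -110.0, -95.0] # hypothetical potential energies
    print(attempt_neighbor_swaps(energies, temps))
```

Production REM codes usually alternate between even and odd neighbor pairs rather than sweeping sequentially, and they choose the temperature spacing so that neighboring energy distributions overlap enough for a reasonable acceptance ratio; both choices are left out here to keep the exchange criterion itself in view.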

Acknowledgments

I thank Bruce Berne, Jed Pitera, Bill Swope, Angel Garcia, and Ann McDermott for many useful discussions and helpful comments. Part of the simulations was done on the IBM Event Infrastructure (EI) nodes, and I thank Ann Mead, Melvin Calendar, Mark Smith, and Sandra Berman for excellent technical support.

Notes

This paper was submitted directly (Track II) to the PNAS office.

Abbreviations: MD, molecular dynamics; CSD, chemical shift deviation; Cα-RMSD, Cα-rms deviation; REM, replica exchange method; AMBER, assisted model building with energy refinement; P3ME/RESPA, particle–particle particle–mesh Ewald/reference system propagator algorithm; NOE, nuclear Overhauser effect; RT, room temperature; Rg, radius of gyration; SPC, simple point charge; OPLSAA, optimized potential for liquid simulation–all atom.

References

1. McCammon, J. A. & Wolynes, P. G., eds. (2002) Current Opinion in Structural Biology (Curr. Biol. Press, London).
2. Zhou, Y. & Karplus, M. (1999) Nature 401, 400-402.
3. Brooks, C. L., Gruebele, M., Onuchic, J. N. & Wolynes, P. G. (1998) Proc. Natl. Acad. Sci. USA 95, 11037-11042.
4. Snow, C. D., Nguyen, H., Pande, V. S. & Gruebele, M. (2002) Nature 420, 102-104.
5. Neidigh, J. W., Fesinmeyer, R. M. & Andersen, N. H. (2002) Nat. Struct. Biol. 9, 425-430.
6. Qiu, L., Pabit, S. A., Roitberg, A. E. & Hagen, S. J. (2002) J. Am. Chem. Soc. 124, 12952-12953.
7. Simmerling, C., Strockbine, B. & Roitberg, A. E. (2002) J. Am. Chem. Soc. 124, 11258-11259.
8. Snow, C. D., Zagrovic, B. & Pande, V. S. (2002) J. Am. Chem. Soc. 124, 14548-14549.
9. Pitera, J. W. & Swope, W. (2003) Proc. Natl. Acad. Sci. USA 100, 7587-7592.
10. Still, W. C., Tempczyk, A., Hawley, R. C. & Hendrickson, T. (1990) J. Am. Chem. Soc. 112, 6127-6129.
11. Hukushima, K. & Nemoto, K. (1996) J. Phys. Soc. Jpn. 65, 1604-1608.
12. Marinari, E., Parisi, G. & Ruiz-Lorenzo, J. J. (1998) in Spin Glass and Random Fields, ed. Young, A. P. (World Scientific, Singapore), p. 59.
13. Zhou, R., Harder, E., Xu, H. & Berne, B. J. (2001) J. Chem. Phys. 115, 2348-2358.
14. Zhou, R. & Berne, B. J. (2002) Proc. Natl. Acad. Sci. USA 99, 12777-12782.
15. Zhou, R. (2003) Proteins Struct. Funct. Genet. 52, 561-572.
16. Masunov, A. & Lazaridis, T. (2003) J. Am. Chem. Soc. 125, 1722-1727.
17. Kaya, H. & Chan, H. S. (2003) J. Mol. Biol. 326, 911-915.
18. Cheung, M. S., Garcia, A. E. & Onuchic, J. (2002) Proc. Natl. Acad. Sci. USA 99, 685-690.
19. Figueirido, F., Levy, R. M., Zhou, R. & Berne, B. J. (1997) J. Chem. Phys. 106, 9835-9845.
20. Sugita, Y. & Okamoto, Y. (1999) Chem. Phys. Lett. 314, 141-151.
21. Ferrenberg, A. M. & Swendsen, R. H. (1989) Phys. Rev. Lett. 63, 1195-1198.
22. Garcia, A. E. & Sanbonmatsu, K. Y. (2001) Proteins 42, 345-354.
23. Zhou, R., Germain, R. & Berne, B. J. (2001) Proc. Natl. Acad. Sci. USA 98, 14931-14936.
24. Garcia, A. E. (1992) Phys. Rev. Lett. 68, 2696-2699.
25. Dill, K. A. & Chan, H. S. (1997) Nat. Struct. Biol. 4, 10-19.
26. Wolynes, P. G. (1997) Proc. Natl. Acad. Sci. USA 94, 6170-6175.
27. Munoz, V., Henry, E. R., Hofrichter, J. & Eaton, W. A. (1997) Nature 390, 196-199.
28. Beachy, M., Chasman, D., Murphy, R., Halgren, T. & Friesner, R. (1997) J. Am. Chem. Soc. 119, 5908-5912.
29. Waldburger, C., Schildbach, J. & Sauer, R. (1995) Nat. Struct. Biol. 2, 122-128.
30. Horovitz, A., Serrano, L., Avron, B., Bycroft, M. & Fersht, A. (1990) J. Mol. Biol. 216, 1031-1044.
31. Klimov, D. K. & Thirumalai, D. (2000) Proc. Natl. Acad. Sci. USA 97, 2544-2549.