Paul Emsley
Publications
(23)
Publication
Journal: Acta crystallographica. Section D, Biological crystallography
February/21/2005
Abstract
CCP4mg is a project that aims to provide a general-purpose tool for structural biologists, providing tools for X-ray structure solution, structure comparison and analysis, and publication-quality graphics. The map-fitting tools are available as a stand-alone package, distributed as 'Coot'.
Publication
Journal: Acta crystallographica. Section D, Biological crystallography
June/26/2011
Abstract
The CCP4 (Collaborative Computational Project, Number 4) software suite is a collection of programs and associated data and software libraries which can be used for macromolecular structure determination by X-ray crystallography. The suite is designed to be flexible, allowing users a number of methods of achieving their aims. The programs are from a wide variety of sources but are connected by a common infrastructure provided by standard file formats, data objects and graphical interfaces. Structure solution by macromolecular crystallography is becoming increasingly automated and the CCP4 suite includes several automation pipelines. After giving a brief description of the evolution of CCP4 over the last 30 years, an overview of the current suite is given. While detailed descriptions are given in the accompanying articles, here it is shown how the individual programs contribute to a complete software package.
Publication
Journal: Acta crystallographica. Section D, Biological crystallography
February/21/2005
Abstract
Progress towards structure determination that is both high-throughput and high-value is dependent on the development of integrated and automatic tools for electron-density map interpretation and for the analysis of the resulting atomic models. Advances in map-interpretation algorithms are extending the resolution regime in which fully automatic tools can work reliably, but at present human intervention is required to interpret poor regions of macromolecular electron density, particularly where crystallographic data are only available to modest resolution [for example, I/sigma(I) < 2.0 for minimum resolution 2.5 Å]. In such cases, a set of manual and semi-manual model-building molecular-graphics tools is needed. At the same time, converting the knowledge encapsulated in a molecular structure into understanding is dependent upon visualization tools, which must be able to communicate that understanding to others by means of both static and dynamic representations. CCP4mg is a program designed to meet these needs in a way that is closely integrated with the ongoing development of CCP4 as a program suite suitable for both low- and high-intervention computational structural biology. As well as providing a carefully designed user interface to advanced algorithms of model building and analysis, CCP4mg is intended to present a graphical toolkit to developers of novel algorithms in these fields.
Publication
Journal: Science
April/9/2014
Abstract
Mitochondria have specialized ribosomes that have diverged from their bacterial and cytoplasmic counterparts. We have solved the structure of the yeast mitoribosomal large subunit using single-particle cryo-electron microscopy. The resolution of 3.2 angstroms enabled a nearly complete atomic model to be built de novo and refined, including 39 proteins, 13 of which are unique to mitochondria, as well as expansion segments of mitoribosomal RNA. The structure reveals a new exit tunnel path and architecture, unique elements of the E site, and a putative membrane docking site.
Publication
Journal: Acta crystallographica. Section D, Biological crystallography
March/29/2015
Abstract
The recent rapid development of single-particle electron cryo-microscopy (cryo-EM) now allows structures to be solved by this method at resolutions close to 3 Å. Here, a number of tools to facilitate the interpretation of EM reconstructions with stereochemically reasonable all-atom models are described. The BALBES database has been repurposed as a tool for identifying protein folds from density maps. Modifications to Coot, including new Jiggle Fit and morphing tools and improved handling of nucleic acids, enhance its functionality for interpreting EM maps. REFMAC has been modified for optimal fitting of atomic models into EM maps. As external structural information can enhance the reliability of the derived atomic models, stabilize refinement and reduce overfitting, ProSMART has been extended to generate interatomic distance restraints from nucleic acid reference structures, and a new tool, LIBG, has been developed to generate nucleic acid base-pair and parallel-plane restraints. Furthermore, restraint generation has been integrated with visualization and editing in Coot, and these restraints have been applied to both real-space refinement in Coot and reciprocal-space refinement in REFMAC.
Publication
Journal: Structure
February/8/2012
Abstract
This report presents the conclusions of the X-ray Validation Task Force of the worldwide Protein Data Bank (PDB). The PDB has expanded massively since current criteria for validation of deposited structures were adopted, allowing a much more sophisticated understanding of all the components of macromolecular crystals. The size of the PDB creates new opportunities to validate structures by comparison with the existing database, and the now-mandatory deposition of structure factors creates new opportunities to validate the underlying diffraction data. These developments highlighted the need for a new assessment of validation criteria. The Task Force recommends that a small set of validation data be presented in an easily understood format, relative to both the full PDB and the applicable resolution class, with greater detail available to interested users. Most importantly, we recommend that referees and editors judging the quality of structural experiments have access to a concise summary of well-established quality indicators.
Publication
Journal: Acta crystallographica. Section D, Biological crystallography
June/18/2012
Abstract
Coot is a molecular-graphics application primarily aimed to assist in model building and validation of biological macromolecules. Recently, tools have been added to work with small molecules. The newly incorporated tools for the manipulation and validation of ligands include interaction with PRODRG, subgraph isomorphism-based tools, representation of ligand chemistry, ligand fitting and analysis, and are described here.
Publication
Journal: Acta Crystallographica Section D: Structural Biology
October/24/2017
Abstract
The program AceDRG is designed for the derivation of stereochemical information about small molecules. It uses local chemical and topological environment-based atom typing to derive and organize bond lengths and angles from a small-molecule database: the Crystallography Open Database (COD). Information about the hybridization states of atoms, whether they belong to small rings (up to seven-membered rings), ring aromaticity and nearest-neighbour information is encoded in the atom types. All atoms from the COD have been classified according to the generated atom types. All bonds and angles have also been classified according to the atom types and, in a certain sense, bond types. Derived data are tabulated in a machine-readable form that is freely available from CCP4. AceDRG can also generate stereochemical information, provided that the basic bonding pattern of a ligand is known. The basic bonding pattern is perceived from one of the computational chemistry file formats, including SMILES, mmCIF, SDF MOL and SYBYL MOL2 files. Using the bonding chemistry, atom types, and bond and angle tables generated from the COD, AceDRG derives the 'ideal' bond lengths, angles, plane groups, aromatic rings and chirality information, and writes them to an mmCIF file that can be used by the refinement program REFMAC5 and the model-building program Coot. Other refinement and model-building programs such as PHENIX and BUSTER can also use these files. AceDRG also generates one or more coordinate sets corresponding to the most favourable conformation(s) of a given ligand. AceDRG employs RDKit for chemistry perception and for initial conformation generation, as well as for the interpretation of SMILES strings, SDF MOL and SYBYL MOL2 files.
Publication
Journal: Protein Science
November/15/2019
Abstract
Coot is a tool widely used for model building, refinement and validation of macromolecular structures. It has been extensively used for crystallography and, more recently, improvements have been introduced to aid in cryo-EM model building and refinement, as cryo-EM structures with resolutions ranging from 2.5 to 4 Å are now routinely available. Model building into these maps can be time consuming and requires experience in both biochemistry and building into low-resolution maps. To simplify and expedite the model building task, and minimize the needed expertise, new tools are being added in Coot. Some examples include morphing, Geman-McClure restraints, full chain refinement and Fourier-model based residue-type-specific Ramachandran restraints. Here we present the current state-of-the-art in Coot usage.
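The abstract names Geman-McClure restraints without defining them. As a rough illustration only, the sketch below compares an ordinary harmonic penalty with one common parameterisation of the Geman-McClure robust function, r^2 / (1 + alpha * r^2). The function names, the `alpha` parameter and its default value are assumptions made for this example; they are not taken from Coot, and the form Coot actually uses may differ.

```python
def harmonic(residual: float) -> float:
    """Ordinary least-squares penalty: grows without bound."""
    return residual * residual


def geman_mcclure(residual: float, alpha: float = 0.01) -> float:
    """One common form of the Geman-McClure penalty: r^2 / (1 + alpha * r^2).

    For small residuals it behaves like the harmonic penalty; for large
    residuals it levels off towards 1 / alpha, so a badly violated
    restraint cannot dominate the target. `alpha` is a hypothetical
    tuning parameter chosen for illustration.
    """
    r2 = residual * residual
    return r2 / (1.0 + alpha * r2)


if __name__ == "__main__":
    for r in (0.1, 1.0, 10.0, 100.0):
        print(f"r={r:7.1f}  harmonic={harmonic(r):10.2f}  "
              f"geman_mcclure={geman_mcclure(r):8.4f}")
```

The design point is the saturation: restraints that agree with the model act almost harmonically, while strongly violated ones contribute a nearly constant penalty and therefore a small gradient.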
Publication
Journal: Journal of Structural and Functional Genomics
July/8/2009
Abstract
Misfit sidechains in protein crystal structures are a stumbling block in using those structures to direct further scientific inference. Problems due to surface disorder and poor electron density are very difficult to address, but a large class of systematic errors are quite common even in well-ordered regions, resulting in sidechains fit backwards into local density in predictable ways. The MolProbity web site is effective at diagnosing such errors, and can perform reliable automated correction of a few special cases such as 180° flips of Asn or Gln sidechain amides, using all-atom contacts and H-bond networks. However, most at-risk residues involve tetrahedral geometry, and their valid correction requires rigorous evaluation of sidechain movement and sometimes backbone shift. The current work extends the benefits of robust automated correction to more sidechain types. The Autofix method identifies candidate systematic, flipped-over errors in Leu, Thr, Val, and Arg using MolProbity quality statistics, proposes a corrected position using real-space refinement with rotamer selection in Coot, and accepts or rejects the correction based on improvement in MolProbity criteria and on chi angle change. Criteria are chosen conservatively, after examining many individual results, to ensure valid correction. To test this method, Autofix was run and analyzed for 945 representative PDB files and on the 50S ribosomal subunit of file 1YHQ. Over 40% of Leu, Val, and Thr outliers and 15% of Arg outliers were successfully corrected, resulting in a total of 3,679 corrected sidechains, or 4 per structure on average.
Summary Sentences: A common class of misfit sidechains in protein crystal structures is due to systematic errors that place the sidechain backwards into the local electron density. A fully automated method called "Autofix" identifies such errors for Leu, Val, Thr, and Arg and corrects over one third of them, using MolProbity validation criteria and Coot real-space refinement of rotamers.
Publication
Journal: Structure
December/15/2016
Abstract
Crystallographic studies of ligands bound to biological macromolecules (proteins and nucleic acids) represent an important source of information concerning drug-target interactions, providing atomic level insights into the physical chemistry of complex formation between macromolecules and ligands. Of the more than 115,000 entries extant in the Protein Data Bank (PDB) archive, ∼75% include at least one non-polymeric ligand. Ligand geometrical and stereochemical quality, the suitability of ligand models for in silico drug discovery and design, and the goodness-of-fit of ligand models to electron-density maps vary widely across the archive. We describe the proceedings and conclusions from the first Worldwide PDB/Cambridge Crystallographic Data Center/Drug Design Data Resource (wwPDB/CCDC/D3R) Ligand Validation Workshop held at the Research Collaboratory for Structural Bioinformatics at Rutgers University on July 30-31, 2015. Experts in protein crystallography from academe and industry came together with non-profit and for-profit software providers for crystallography and with experts in computational chemistry and data archiving to discuss and make recommendations on best practices, as framed by a series of questions central to structural studies of macromolecule-ligand complexes. What data concerning bound ligands should be archived in the PDB? How should the ligands be best represented? How should structural models of macromolecule-ligand complexes be validated? What supplementary information should accompany publications of structural studies of biological macromolecules? Consensus recommendations on best practices developed in response to each of these questions are provided, together with some details regarding implementation. Important issues addressed but not resolved at the workshop are also enumerated.
Publication
Journal: Acta crystallographica. Section D, Biological crystallography
June/26/2011
Publication
Journal: Acta crystallographica. Section D, Biological crystallography
July/16/2008
Abstract
One of the most cumbersome and time-demanding tasks in completing a protein model is building short missing regions or 'loops'. A method is presented that uses structural and electron-density information to build the most likely conformations of such loops. Using the distribution of angles and dihedral angles in pentapeptides as the driving parameters, a set of possible conformations for the Cα backbone of loops was generated. The most likely candidate is then selected in a hierarchical manner: new and stronger restraints are added while the loop is built. The weight of the electron-density correlation relative to geometrical considerations is gradually increased until the most likely loop is selected on map correlation alone. To conclude, the loop is refined against the electron density in real space. This is started by using structural information to trace a set of models for the Cα backbone of the loop. Only in later steps of the algorithm is the electron-density correlation used as a criterion to select the loop(s). Thus, this method is more robust in low-density regions than an approach using density as a primary criterion. The algorithm is implemented in a loop-building program, Loopy, which can be used either alone or as part of an automatic building cycle. Loopy can build loops of up to 14 residues in length within a couple of minutes. The average root-mean-square deviation of the Cα atoms in the loops built during validation was less than 0.4 Å. When implemented in the context of automated model building in ARP/wARP, Loopy can increase the completeness of the built models.
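The validation figure quoted above is a root-mean-square deviation over Cα atoms. A minimal sketch of that calculation follows, assuming the built loop and the reference loop are already in the same coordinate frame (no superposition is performed) and that atoms are paired by position in the list. The function name and the coordinate values in the usage example are made up for illustration and are not from the paper or from Loopy.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(built: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """RMSD in Å between paired C-alpha positions.

    Assumes both coordinate sets share one frame and are in the same
    residue order; raises ValueError if they cannot be paired.
    """
    if len(built) != len(reference) or len(built) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (bx, by, bz), (rx, ry, rz) in zip(built, reference):
        total += (bx - rx) ** 2 + (by - ry) ** 2 + (bz - rz) ** 2
    return math.sqrt(total / len(built))


if __name__ == "__main__":
    # Hypothetical coordinates: the built loop is shifted 0.4 Å along z.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    built = [(x, y, z + 0.4) for (x, y, z) in reference]
    print(f"C-alpha RMSD: {ca_rmsd(built, reference):.2f} Å")
```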
Publication
Journal: Methods in Molecular Biology
November/23/2015
Abstract
The frequency of glycosylated protein 3D structures in the Protein Data Bank (PDB) is significantly lower than the proportion of glycoproteins in nature, and if glycan 3D structures are present, then they often exhibit a large degree of errors. There are various reasons for this, one of which is a comparably low support of carbohydrates in software tools for 3D structure determination and validation. This chapter illustrates the current features that assist crystallographers with handling glycans during 3D structure determination in Coot and CNS and with validation of the results.
Publication
Journal: Acta Crystallographica Section D: Structural Biology
November/13/2018
Abstract
Coot is a graphics application that is used to build or manipulate macromolecular models; its particular forte is manipulation of the model at the residue level. The model-building tools of Coot have been combined and extended to assist or automate the building of N-linked glycans. The model is built by the addition of monosaccharides, placed by variation of internal coordinates. The subsequent model is refined by real-space refinement, which is stabilized with modified and additional restraints. It is hoped that these enhanced building tools will help to reduce building errors of N-linked glycans and improve our knowledge of the structures of glycoproteins.
Publication
Journal: Science
November/26/2017
Abstract
Newly transcribed eukaryotic precursor messenger RNAs (pre-mRNAs) are processed at their 3' ends by the ~1-megadalton multiprotein cleavage and polyadenylation factor (CPF). CPF cleaves pre-mRNAs, adds a polyadenylate tail, and triggers transcription termination, but it is unclear how its various enzymes are coordinated and assembled. Here, we show that the nuclease, polymerase, and phosphatase activities of yeast CPF are organized into three modules. Using electron cryomicroscopy, we determined a 3.5-angstrom-resolution structure of the ~200-kilodalton polymerase module. This revealed four β propellers, in an assembly markedly similar to those of other protein complexes that bind nucleic acid. Combined with in vitro reconstitution experiments, our data show that the polymerase module brings together factors required for specific and efficient polyadenylation, to help coordinate mRNA 3'-end processing.
Publication
Journal: Acta Crystallographica Section D: Structural Biology
October/31/2017
Abstract
Coot is a molecular-graphics program primarily aimed at model building using X-ray data. Recently, tools for the manipulation and representation of ligands have been introduced. Here, these new tools for ligand validation and comparison are described. Ligands in the wwPDB have been scored by density-fit, distortion and atom-clash metrics. The distributions of these scores can be used to assess the relative merits of the particular ligand in the protein-ligand complex of interest by means of 'sliders' akin to those now available for each accession code on the wwPDB websites.
Publication
Journal: Methods in Molecular Biology
May/21/2012
Abstract
The use of 3D structures derived from X-ray crystal data in drug development has increased in recent years. Molecular graphics applications are important tools at the end of the data processing pipeline and provide means to build, refine and validate protein models and ligand structures. We describe the requirements on useful data, what such data provide and typical problems in dealing with protein-ligand complexes and how one might address them with an emphasis on the use of Coot.
Publication
Journal: Acta Crystallographica Section D: Structural Biology
September/12/2017
Abstract
In this paper, AUSPEX, a new software tool for experimental X-ray data analysis, is presented. Exploring the behaviour of diffraction intensities and the associated estimated uncertainties facilitates the discovery of underlying problems and can help users to improve their data acquisition and processing in order to obtain better structural models. The program enables users to inspect the distribution of observed intensities (or amplitudes) against resolution as well as the associated estimated uncertainties (sigmas). It is demonstrated how AUSPEX can be used to visually and automatically detect ice-ring artefacts in integrated X-ray diffraction data. Such artefacts can hamper structure determination, but may be difficult to identify from the raw diffraction images produced by modern pixel detectors. The analysis suggests that a significant portion of the data sets deposited in the PDB contain ice-ring artefacts. Furthermore, it is demonstrated how other problems in experimental X-ray data caused, for example, by scaling and data-conversion procedures can be detected by AUSPEX.
Publication
Journal: Acta Crystallographica Section D: Structural Biology
October/24/2017
Abstract
A freely available small-molecule structure database, the Crystallography Open Database (COD), is used for the extraction of molecular-geometry information on small-molecule compounds. The results are used for the generation of new ligand descriptions, which are subsequently used by macromolecular model-building and structure-refinement software. To increase the reliability of the derived data, and therefore the new ligand descriptions, the entries from this database were subjected to very strict validation. The selection criteria made sure that the crystal structures used to derive atom types, bond and angle classes are of sufficiently high quality. Any suspicious entries at a crystal or molecular level were removed from further consideration. The selection criteria included (i) the resolution of the data used for refinement (entries solved at 0.84 Å resolution or higher) and (ii) the structure-solution method (structures must be from a single-crystal experiment and all atoms of generated molecules must have full occupancies), as well as basic sanity checks such as (iii) consistency between the valences and the number of connections between atoms, (iv) acceptable bond-length deviations from the expected values and (v) detection of atomic collisions. The derived atom types and bond classes were then validated using high-order moment-based statistical techniques. The results of the statistical analyses were fed back to fine-tune the atom typing. The developed procedure was repeated four times, resulting in fine-grained atom typing, bond and angle classes. The procedure will be repeated in the future as and when new entries are deposited in the COD. The whole procedure can also be applied to any source of small-molecule structures, including the Cambridge Structural Database and the ZINC database.
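The selection criteria (i) to (v) listed in this abstract amount to a per-entry filter. The sketch below shows one way such a filter could be expressed. Only the 0.84 Å resolution cutoff comes from the abstract; the record layout, every field name and the `bond_tolerance` parameter are hypothetical, and the abstract does not state the bond-deviation threshold, so the caller must supply one.

```python
from dataclasses import dataclass

# Resolution cutoff quoted in the abstract: 0.84 Å or better (smaller d_min).
MAX_D_MIN = 0.84


@dataclass
class Entry:
    """Hypothetical summary of one small-molecule database entry."""
    d_min: float                 # resolution of the data in Å
    single_crystal: bool         # from a single-crystal experiment
    full_occupancy: bool         # all atoms have full occupancy
    valences_consistent: bool    # valences agree with connection counts
    max_bond_deviation: float    # worst deviation from expected bond length
    has_atomic_collisions: bool  # any colliding atoms detected


def passes_selection(entry: Entry, bond_tolerance: float) -> bool:
    """Apply criteria (i)-(v); `bond_tolerance` is an assumed parameter."""
    return (
        entry.d_min <= MAX_D_MIN                          # (i)
        and entry.single_crystal                          # (ii)
        and entry.full_occupancy                          # (ii)
        and entry.valences_consistent                     # (iii)
        and entry.max_bond_deviation <= bond_tolerance    # (iv)
        and not entry.has_atomic_collisions               # (v)
    )


if __name__ == "__main__":
    good = Entry(0.80, True, True, True, 0.01, False)
    poor = Entry(1.20, True, True, True, 0.01, False)
    print(passes_selection(good, bond_tolerance=0.05))
    print(passes_selection(poor, bond_tolerance=0.05))
```

An entry is kept only if every check passes, which mirrors the abstract's statement that any suspicious entry is removed from further consideration.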
Publication
Journal: Methods in Molecular Biology
December/15/2013
Abstract
X-ray crystallography is a powerful technique for studying protein-ligand interactions. Advances in techniques have meant that it is now possible to routinely determine the structures of ligand complexes in the majority of cases where crystallization conditions and protein structures are already known. Ligand soaking or cocrystallization, together with the potential use of molecular replacement, provides data for determining the structures of a protein in complex with ligands. Furthermore, advances in protein structure model building facilitate automatic ligand fitting to residual electron density in the protein-ligand complex.
Publication
Journal: Acta Crystallographica Section D: Structural Biology
April/16/2019
Abstract
N-Glycosylation is one of the most common post-translational modifications and is implicated in, for example, protein folding and interaction with ligands and receptors. N-Glycosylation trees are complex structures of linked carbohydrate residues attached to asparagine residues. While carbohydrates are typically modeled in protein structures, they are often incomplete or have the wrong chemistry. Here, new tools are presented to automatically rebuild existing glycosylation trees, to extend them where possible, and to add new glycosylation trees if they are missing from the model. The method has been incorporated in the PDB-REDO pipeline and has been applied to build or rebuild 16 452 carbohydrate residues in 11 651 glycosylation trees in 4498 structure models, and is also available from the PDB-REDO web server. With better modeling of N-glycosylation, the biological function of this important modification can be better and more easily understood.
Publication
Journal: Acta Crystallographica Section D: Structural Biology
November/12/2017