2017-02-22 Metals are essential in many biological processes, and metal ions are modeled in roughly 40% of the macromolecular structures in the Protein Data Bank (PDB). However, a significant fraction of these structures contain poorly modeled metal-binding sites. CheckMyMetal (CMM) is an easy-to-use metal-binding site validation server for macromolecules that is freely available at http://csgid.org/csgid/metal_sites. The CMM server can detect incorrect metal assignments as well as geometrical and other irregularities in the metal-binding sites. Guidelines for metal-site modeling and validation in macromolecules are illustrated by several practical examples grouped by the type of metal. These examples show CMM users (and crystallographers in general) problems they may encounter during the modeling of a specific metal ion.
2017-02-22 The Cambridge Structural Database (CSD) is the worldwide resource for the dissemination of all published three-dimensional structures of small-molecule organic and metal–organic compounds. This paper briefly describes how this collection of crystal structures can be used en masse in the context of macromolecular crystallography. Examples highlight how the CSD and associated software aid protein–ligand complex validation, and show how the CSD could be further used in the generation of geometrical restraints for protein structure refinement.
2017-02-22 Many ligand-discovery stories tell of the use of structures of protein–ligand complexes, but the contribution of structural chemistry is such a core part of finding and improving ligands that it is often overlooked. More than 800 000 crystal structures are available to the community through the Cambridge Structural Database (CSD). Individually, these structures can be of tremendous value, and the collection of crystal structures as a whole is even more helpful. This article provides examples of how small-molecule crystal structures have been used to complement those of protein–ligand complexes to address challenges ranging from affinity, selectivity and bioavailability through to solubility.
2017-02-24 XChemExplorer (XCE) is a data-management and workflow tool to support large-scale simultaneous analysis of protein–ligand complexes during structure-based ligand discovery (SBLD). The user interfaces of established crystallographic software packages such as CCP4 [Winn et al. (2011), Acta Cryst. D67, 235–242] or PHENIX [Adams et al. (2010), Acta Cryst. D66, 213–221] have entrenched the paradigm that a 'project' is concerned with solving one structure. This does not hold for SBLD, where many almost identical structures need to be solved and analysed quickly in one batch of work. Functionality to track progress and annotate structures is essential. XCE provides an intuitive graphical user interface which guides the user from data processing, initial map calculation, ligand identification and refinement through to data dissemination. It provides multiple entry points depending on the needs of each project, enables batch processing of multiple data sets, and records metadata, progress and annotations in an SQLite database. XCE is freely available and works on any Linux and Mac OS X system; its only dependency is an installation of the latest version of CCP4. The design and usage of this tool are described here, and its usefulness is demonstrated in the context of fragment-screening campaigns at the Diamond Light Source. It is routinely used to analyse projects comprising 1000 data sets or more, and therefore scales well to even very large ligand-design projects.
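As an illustration of the idea of recording metadata, progress and annotations for many data sets in an SQLite database, the following is a minimal sketch using Python's standard `sqlite3` module. It is not XCE's actual schema or code: the table name `datasets`, the column names, the stage labels and the helper functions `open_tracker`, `record` and `progress` are all hypothetical and chosen only for this example.

```python
import sqlite3


def open_tracker(path=":memory:"):
    """Open (or create) a tracking database with a hypothetical one-table schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS datasets (
               crystal_id TEXT PRIMARY KEY,  -- one row per data set
               stage      TEXT NOT NULL,     -- current workflow stage
               annotation TEXT NOT NULL      -- free-text note from the user
           )"""
    )
    return conn


def record(conn, crystal_id, stage, annotation=""):
    """Insert a data set, or overwrite its stage and annotation if it already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO datasets (crystal_id, stage, annotation) "
        "VALUES (?, ?, ?)",
        (crystal_id, stage, annotation),
    )
    conn.commit()


def progress(conn):
    """Return a dict mapping each workflow stage to the number of data sets in it."""
    rows = conn.execute("SELECT stage, COUNT(*) FROM datasets GROUP BY stage")
    return dict(rows.fetchall())


if __name__ == "__main__":
    conn = open_tracker()
    record(conn, "x0001", "data processing")
    record(conn, "x0002", "refinement", "ligand density is weak")
    record(conn, "x0001", "refinement")  # x0001 moves on to the next stage
    print(progress(conn))
```

Keeping one row per data set, keyed on an identifier, is what makes a batch-level progress summary a single `GROUP BY` query, even for projects with a thousand or more data sets.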
2017-02-24 In this work, two freely available web-based interactive computational tools that facilitate the analysis and interpretation of protein–ligand interaction data are described. The first, WONKA, assists in uncovering interesting and unusual features (for example residue motions) within ensembles of protein–ligand structures and enables the facile sharing of observations between scientists. The second, OOMMPPAA, incorporates protein–ligand activity data with protein–ligand structural data using three-dimensional matched molecular pairs. OOMMPPAA highlights nuanced structure–activity relationships (SAR) and summarizes available protein–ligand activity data in the protein context. In this paper, the background that led to the development of both tools is described. Their implementation is outlined, and their utility is demonstrated using in-house Structural Genomics Consortium (SGC) data sets and openly available data from the PDB and ChEMBL. Both tools are freely available to use and download at http://wonka.sgc.ox.ac.uk/WONKA/ and http://oommppaa.sgc.ox.ac.uk/OOMMPPAA/.