Language: English
Preview: Quantitative Structure-Activity Relationships

Molecular Informatics

Wiley Online Library : Molecular Informatics

Published: 2017-11-01T00:00:00-05:00


Identification of Bioactive Scaffolds Based on QSAR Models


In medicinal chemistry, the molecular scaffolds commonly found in compounds with preferable biological activities are called bioactive scaffolds. They are important because if present in a structure, it is more likely that the compound will be bioactive. Traditionally, medicinal chemists use their knowledge to identify bioactive scaffolds from a given data set after systematic extraction of all candidate scaffolds. However, manually sorting all the scaffolds is not practical as the number of compounds in a data set is often very large. Herein, we propose a method to systematically identify bioactive scaffolds based on a structure generator and a QSAR model. Two proof-of-concept studies showed that known bioactive scaffolds as well as scaffolds containing important substructures were extracted. The proposed method does not depend on scaffold frequencies in a data set, which is different from currently used methods for bioactive scaffold identification.

Development of Predictive QSAR Models of 4-Thiazolidinones Antitrypanosomal Activity using Modern Machine Learning Algorithms


This paper presents novel QSAR models for the prediction of antitrypanosomal activity among thiazolidines and related heterocycles. The performance of four machine learning algorithms (Random Forest regression, Stochastic gradient boosting, Multivariate adaptive regression splines and Gaussian processes regression) has been studied in order to reach better levels of predictivity. The results for Random Forest and Gaussian processes regression are comparable and outperform the other studied methods. Preliminary descriptor selection with the Boruta method improved the outcome of the machine learning methods. The two novel QSAR models developed with the Random Forest and Gaussian processes regression algorithms have good predictive ability, which was confirmed by external evaluation on the test set, with Q2ext=0.812 and Q2ext=0.830, respectively. The obtained models can be used further for in silico screening of virtual libraries in the same chemical domain in order to find new antitrypanosomal agents. Thorough analysis of descriptor influence in the QSAR models and interpretation of their chemical meaning highlights a number of structure-activity relationships. The presence of phenyl rings with electron-withdrawing atoms or groups in the para-position, an increased number of aromatic rings, high branching but short chains, high HOMO energy, and the introduction of a 1-substituted 2-indolyl fragment into the molecular structure have been recognized as trypanocidal activity prerequisites.
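
The external validation statistic quoted above can be reproduced in a few lines. This is a minimal sketch, assuming the common Q2ext variant in which test-set residuals are compared with deviations from the training-set mean; the abstract does not say which variant the authors used, and the function name `q2_ext` is hypothetical.

```python
def q2_ext(y_test, y_pred, y_train_mean):
    """External Q2 = 1 - PRESS / SS.

    Assumption: SS is measured around the training-set mean, which is one
    of several Q2ext definitions in use.
    """
    if len(y_test) != len(y_pred):
        raise ValueError("y_test and y_pred must have the same length")
    # Sum of squared prediction errors on the external test set.
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_pred))
    # Spread of the test-set observations around the training-set mean.
    ss = sum((obs - y_train_mean) ** 2 for obs in y_test)
    if ss == 0:
        raise ValueError("test observations all equal the training mean")
    return 1.0 - press / ss
```

A model that predicts every test compound exactly scores 1.0, and one that does no better than always predicting the training mean scores 0.0.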

Protocols for the Design of Kinase-Focused Compound Libraries


Protocols for the design of kinase-focused compound libraries are presented. Kinase-focused compound libraries can be differentiated based on the design goal. Depending on whether the library should be a discovery library specific for one particular kinase, a general discovery library for multiple distinct kinase projects, or even phenotypic screening, there exists today a variety of in silico methods to design candidate compound libraries. We address the following scenarios: 1) Datamining of SAR databases and kinase focused vendor catalogues; 2) Predictions and virtual screening; 3) Structure-based design of combinatorial kinase inhibitors; 4) Design of covalent kinase inhibitors; 5) Design of macrocyclic kinase inhibitors; and 6) Design of allosteric kinase inhibitors and activators.

Unbinding of Kinesin from Microtubule in the Strongly Bound States Enhances under Assisting Forces


The ability to predict the cellular dynamics of intracellular transport has enormous potential to impact human health. A key transporter is kinesin-1, an ATP-driven molecular motor that shuttles cellular cargos along microtubules (MTs). The dynamics of kinesins depends critically on their unbinding rate from the MT, which varies with the direction of the force applied to the motor, i.e. the force-unbinding rate relation is asymmetric. However, it remains unclear how changing the force direction from resisting (applied against the direction of motion) to assisting (applied in the direction of motion) alters kinesin's unbinding and stepping. Here, we propose a theoretical model for the influence of the force direction on the stepping dynamics of a single kinesin. The model shows that the asymmetry of the force-unbinding rate relation is independent of ATP concentration. It also reveals that the synthesis of ATP from backward stepping is less likely under assisting forces than under resisting forces. Finally, it finds that the unbinding of kinesin in the strongly MT-bound kinetic states is enhanced under assisting forces.

In Silico Studies of Mammalian δ-ALAD Interactions with Selenides and Selenoxides


Previous studies have shown that the mammalian δ-aminolevulinic acid dehydratase (δ-ALAD) is inhibited by selenides and selenoxides, which can involve thiol oxidation. However, the precise molecular interaction of selenides and selenoxides with the active center of the enzyme is unknown. Here, we try to explain the interaction of selenides and the respective selenoxides with human δ-ALAD by in silico molecular docking. The in silico data indicated that the Se atoms of selenoxides have a higher electrophilic character than those of their respective selenides. Further, the presence of oxygen increased the interaction of selenoxides with the δ-ALAD active site through O…Zn coordination. The interaction of the S atom of Cys124 with the Se atom indicated the importance of the nucleophilic attack of the enzyme thiolate on the organoselenium molecules. These observations help us to understand the interaction of target proteins with organoselenium compounds.

An Improved Binary Differential Evolution Algorithm for Feature Selection in Molecular Signatures


The discovery of biomarkers from high-dimensional data is a very challenging task in cancer diagnosis. On the one hand, biomarker discovery is a so-called high-dimensional, small-sample problem. On the other hand, these data are redundant and noisy. In recent years, biomarker discovery from high-throughput biological data has become an increasingly important topic in the field of bioinformatics. In this study, we propose a binary differential evolution algorithm for feature selection. Firstly, we suggest using a two-stage approach, in which three filter methods (the Fisher score, T-statistics, and Information gain) are used to generate the feature pool that is input to differential evolution (DE). Secondly, in order to improve the performance of the differential evolution algorithm for feature selection, a new variant of binary DE called BDE is proposed. Three optimization strategies are incorporated into the BDE. The first strategy is a heuristic method in the initial stage, the second is self-adaptive parameter control, and the third is a minimum change value that improves the exploration behaviour and thus enhances diversity. Finally, a support vector machine (SVM) is used as the classifier with 10-fold cross-validation. The experimental results of our proposed algorithm on some benchmark datasets demonstrate its effectiveness. In addition, the BDE developed in this study has great potential for feature selection problems.
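
The filter stage can be illustrated with the Fisher score alone. This is a minimal sketch, assuming a two-class problem and one common form of the score (squared difference of class means divided by the sum of class variances); the helper names `fisher_score` and `top_k_features` are hypothetical, and the abstract does not describe how the authors combine the three filters.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score of one feature (assumed form, labels are 0 or 1)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    denom = pvariance(pos) + pvariance(neg)
    if denom == 0:
        return 0.0  # constant within both classes: treat as uninformative here
    return (mean(pos) - mean(neg)) ** 2 / denom


def top_k_features(X, labels, k):
    """Return the column indices of the k highest-scoring features."""
    n_features = len(X[0])
    scores = [
        fisher_score([row[j] for row in X], labels) for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]
```

In a two-stage scheme of this kind, the indices returned by `top_k_features` would form the reduced pool that the evolutionary search then explores.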

Transductive Ridge Regression in Structure-Activity Modeling


In this article we consider the application of the Transductive Ridge Regression (TRR) approach to structure-activity modeling. An original procedure of the TRR parameters optimization is suggested. Calculations performed on 3 different datasets involving two types of descriptors demonstrated that TRR outperforms its non-transductive analogue (Ridge Regression) in more than 90 % of cases. The most significant transductive effect was observed for small datasets. This suggests that transduction may be particularly useful when the data are expensive or difficult to collect.

Generative Recurrent Networks for De Novo Drug Design


Generative artificial intelligence models present a fresh approach to chemogenomics and de novo drug design, as they provide researchers with the ability to narrow down their search of the chemical space and focus on regions of interest. We present a method for molecular de novo design that utilizes generative recurrent neural networks (RNN) containing long short-term memory (LSTM) cells. This computational model captured the syntax of molecular representation in terms of SMILES strings with close to perfect accuracy. The learned pattern probabilities can be used for de novo SMILES generation. This molecular design concept eliminates the need for virtual compound library enumeration. By employing transfer learning, we fine-tuned the RNN's predictions for specific molecular targets. This approach enables virtual compound design without requiring secondary or external activity prediction, which could introduce error or unwanted bias. The results obtained advocate this generative RNN-LSTM system for high-impact use cases, such as low-data drug discovery, fragment-based molecular design, and hit-to-lead optimization for diverse drug targets.

R-based Tool for a Pairwise Structure-Activity Relationship Analysis


Structure-Activity Relationship (SAR) analysis is a complex process that can be enhanced by computational techniques. This article describes a simple tool for SAR analysis that has a graphical user interface and a flexible approach to the input of molecular data. The application calculates molecular similarity, represented by the Tanimoto index and Euclidean distance, and identifies activity cliffs by means of the Structure-Activity Landscape Index. The calculation is performed in a pairwise manner, either between the reference compound and the other compounds or for all possible pairs in the data set. The results of the SAR analysis are visualized using two types of plot. The capability of the application is demonstrated by the analysis of a set of COX2 inhibitors with respect to Isoxicam. The tool is available online and includes a manual and example input files.
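
The pairwise quantities named above are simple to compute. The tool itself is R-based; the sketch below uses Python purely for illustration and is not the tool's code. It assumes fingerprints are given as sets of on-bit indices and uses the usual definition SALI = |A_i - A_j| / (1 - similarity); all function names are hypothetical.

```python
import math


def tanimoto(fp_a, fp_b):
    """Tanimoto index of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def sali(activity_i, activity_j, similarity):
    """Structure-Activity Landscape Index for one pair of compounds."""
    if similarity >= 1.0:
        return math.inf  # identical fingerprints: the index is unbounded
    return abs(activity_i - activity_j) / (1.0 - similarity)


def sali_vs_reference(reference, others):
    """Pair a reference (fingerprint, activity) with every other compound."""
    ref_fp, ref_act = reference
    return [sali(ref_act, act, tanimoto(ref_fp, fp)) for fp, act in others]


def euclidean(desc_a, desc_b):
    """Euclidean distance between two descriptor vectors."""
    return math.dist(desc_a, desc_b)
```

A large SALI value flags a pair that is structurally similar yet differs strongly in activity, which is the signature of an activity cliff.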

Importance of an Orchestrate Participation of Each Individual Residue Present at a Catalytic Site


GTP hydrolysis is indispensable for keeping a living cell healthy. Nature has evolved many enzymes to accelerate the slow GTP hydrolysis. Rab GTPases have evolved to regulate vesicle trafficking. GTPase activating proteins (GAPs) accelerate their intrinsically slow GTP hydrolysis in order to maintain the sustainability between cellular events. Any malfunction of or interference with this hydrolysis disrupts normal cellular events and causes severe diseases. In this study, the GTP hydrolysis mechanism of Rab33B catalyzed by the TBC-domain GAP protein Gyp1p has been decoded using extensive ab initio QM/MM metadynamics simulations. An organized coupled movement of the individual residues present at the catalytic site is found to be the key factor for this reaction. An unorganized coupled movement leads the hydrolysis through very high energy pathways. This also reveals that the chemical transformations occurring at a catalytic site are residue specific.

Ligand-based Modeling for the Prediction of Pharmacophore Features for Multi-targeted Inhibition of the Arachidonic Acid Cascade


The single-target drugs against the arachidonic acid inflammatory pathway are associated with serious side effects, hence, as a first step towards multi-target drugs, we have studied the pharmacophoric features common to the inhibitors of 5-lipoxygenase-activating protein (FLAP), microsomal prostaglandin E-synthase 1 (mPGES-1) and leukotriene A4 hydrolase (LTA4H). FLAP and mPGES-1 shared subfamily-specific positions (SSPs) and four mPGES-1 inhibitors binding to them mapped onto the pharmacophore derived from FLAP inhibitors (Ph-FLAP). The reactions of mPGES-1 and LTA4H had high structural similarity. The pharmacophore derived from two substrate mimic inhibitors of LTA4H (Ph-LTA4H) also mapped onto three mPGES-1 inhibitors. Screening of in-house database for Ph-FLAP and Ph-LTA4H identified one compound, C1. It inhibited the production of the mPGES-1 product, prostaglandin E2 (PGE2) by 97.8±1.6 % at 50 μM in HeLa cells and can be a starting point for designing molecules inhibiting all three targets simultaneously.

Virtual Screening Approach of Bacterial Peptide Deformylase Inhibitors Results in New Antibiotics


The increasing resistance of bacteria to antibacterial therapy poses an enormous health problem and makes the development of new antibacterial agents with novel mechanisms of action an urgent need. Peptide deformylase, a metalloenzyme that catalytically removes the N-formyl group from the N-terminal methionine of newly synthesized polypeptides, is an important target in antibacterial drug discovery. In this study, we report the structure-based virtual screening of the ZINC database in order to discover potential hits as bacterial peptide deformylase inhibitors with higher affinity than GSK1322322, a previously known inhibitor. After virtual screening, fifteen compounds from the top predicted hits were purchased and evaluated in vitro for their antibacterial activities against one Gram-positive (Staphylococcus aureus) and three Gram-negative (Escherichia coli, Pseudomonas aeruginosa and Klebsiella pneumoniae) bacteria at different concentrations by the disc diffusion method. Of these, three compounds, ZINC00039650, ZINC03872971 and ZINC00126407, exhibited significant zones of inhibition. The results obtained were confirmed using the dilution method. Thus, these proposed compounds may aid the development of more efficient antibacterial agents.

RJSplot: Interactive Graphs with R


Data visualization techniques provide new methods for the generation of interactive graphs. These graphs allow a better exploration and interpretation of data but their creation requires advanced knowledge of graphical libraries. Recent packages have enabled the integration of interactive graphs in R. However, R provides limited graphical packages that allow the generation of interactive graphs for computational biology applications. The present project has joined the analytical power of R with the interactive graphical features of JavaScript in a new R package (RJSplot). It enables the easy generation of interactive graphs in R, provides new visualization capabilities, and contributes to the advance of computational biology analytical methods. At present, 16 interactive graphics are available in RJSplot, such as the genome viewer, Manhattan plots, 3D plots, heatmaps, dendrograms, networks, and so on. The RJSplot package is freely available online at

Multi-Objective Optimization of Benzamide Derivatives as Rho Kinase Inhibitors


Despite recent advances in Computer Aided Drug Discovery and High Throughput Screening, the attrition rates of drug candidates continue to be high, underscoring the inherent complexity of the drug discovery paradigm. Indeed, a compromise between several objectives is often required to obtain successful clinical drugs. The present manuscript details a multi-objective workflow that integrates the 4D-QSAR and molecular docking methods in the simultaneous modeling of the Rho Kinase inhibitory activity and acute toxicity of benzamide derivatives. To this end, the pIC50/pLD50 ratio is considered as the response variable, permitting the concurrent modeling of both properties and representing a shift from classical step-by-step evaluations. The 4D-QSAR strategy is used to generate the Grid Cell Occupancy Descriptors (GCODs), and the Stochastic Gradient Boosting (SGB) and Partial Least Squares (PLS) methods are used as the model fitting techniques. While the statistical parameters for the PLS model do not meet established criteria for acceptability, the SGB model yields satisfactory performance, with correlation coefficients r2=0.95 and r2pred=0.65 for the training and test set, respectively. Subsequently, the structural interpretation of the most relevant GCODs according to the SGB model is performed, allowing for the proposal of 139 novel benzamide derivatives, which are then screened using the same model. Of these, 9 compounds were predicted to possess pIC50/pLD50 ratio values higher than those for the employed dataset. Finally, in order to corroborate the results obtained with the SGB model, a docking simulation was performed to evaluate the binding affinity of the proposed molecules to the ROCK2 active site. Three chemical structures (i.e., p6, p14 and p131) showed higher binding affinity than the most active compound in the training set, while the rest generally demonstrated comparable behavior.
It may therefore be concluded that the consensus models that intertwine the 4D-QSAR and molecular docking methods contribute to more reliable virtual screening and compound optimization experiments. Additionally, the use of multi-objective modeling schemes permits the simultaneous evaluation of different chemical and biological profiles, which should contribute to the control a priori of causative factors for the high attrition rates in later drug discovery phases.
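
The combined response variable can be made concrete with a short sketch. It assumes the usual convention that a "p" value is the negative base-10 logarithm of the quantity in molar units (IC50 in mol/L, LD50 in mol/kg); the abstract does not state the units used, and the function names are hypothetical.

```python
import math


def p_value(molar_quantity):
    """Negative base-10 logarithm of a positive quantity in molar units."""
    if molar_quantity <= 0:
        raise ValueError("quantity must be positive")
    return -math.log10(molar_quantity)


def response_ratio(ic50_molar, ld50_mol_per_kg):
    """pIC50 / pLD50: rises with potency and falls with acute toxicity."""
    p_ld50 = p_value(ld50_mol_per_kg)
    if p_ld50 == 0:
        raise ValueError("pLD50 is zero, so the ratio is undefined")
    return p_value(ic50_molar) / p_ld50
```

Under these assumptions, a compound with an IC50 of 10 nM and an LD50 of 0.01 mol/kg has pIC50 = 8 and pLD50 = 2, giving a ratio of 4.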

Design, Synthesis, SAR and Molecular Modeling Studies of Novel Imidazo[2,1-b][1,3,4]Thiadiazole Derivatives as Highly Potent Antimicrobial Agents


In this study, a novel series of phenyl-substituted imidazo[2,1-b][1,3,4]thiadiazole derivatives was synthesized, characterized and explored for antibacterial activity against Gram-negative Escherichia coli and Gram-positive Staphylococcus aureus and Bacillus subtilis, and for antifungal activity against Candida albicans. Most of the synthesized compounds exhibited remarkable antimicrobial activities, some being ten times more potent than the positive controls. The most promising compound showed excellent activity, with an MIC value of 0.03 μg/ml against both S. aureus and B. subtilis (the MIC values of the positive control chloramphenicol are 0.4 μg/ml and 0.85 μg/ml, respectively). Furthermore, the structure-activity relationship was investigated with the help of computational tools. Some physicochemical and ADME properties of the compounds were also calculated. The combination of electronic structure calculations performed at the PM6 level and molecular docking simulations using the Glide extra-precision mode showed that the hydrophobic nature of the keto aryl ring with no electron-withdrawing substituents at the para position enhances activity, while electron-donating substituents at the second aryl ring are detrimental to activity.

Novel Method Proposing Chemical Structures with Desirable Profile of Activities Based on Chemical and Protein Spaces


Active molecules among numerous chemical structures in a chemical database can be searched easily by statistical prediction of compound–protein interactions. However, constructing a simple prediction model against one protein does not aid drug design, because detecting chemical structures that act similarly against multiple proteins is necessary for preventing side effects of the potential drug. To tackle this problem, we propose a new method that visualizes chemical and protein spaces. For simultaneous visualization of both spaces, we employ a counterpropagation neural network (CPNN) and develop a new visualization method named multi-input CPNN (MICPNN). In a case study of the kinase protein family, the MICPNN model predicted accurately the complex relationships between compounds and proteins. The proposed method identified chemical structures with promising activity against kinases. Our proposed method is also applicable to other protein families, such as G-protein coupled receptors, ion channels and transporters.

Modeling of The hERG K+ Channel Blockage Using Online Chemical Database and Modeling Environment (OCHEM)


The human ether-a-go-go related gene (hERG) K+ channel plays an important role in the cardiac action potential. Blockage of the hERG channel may result in long QT syndrome (LQTS) and may even cause sudden cardiac death. Many drugs have been withdrawn from the market because of serious hERG-related cardiotoxicity. Therefore, it is essential to estimate the chemical blockage of hERG in the early stage of drug discovery. In this study, a diverse set of 3721 compounds with hERG inhibition data was assembled from the literature. We then made full use of the Online Chemical Modeling Environment (OCHEM), which supplies a rich set of machine learning methods and descriptor sets, to build a series of classification models for hERG blockage. We also generated two consensus models based on the top-performing individual models. The consensus models performed much better than the individual models in both 5-fold cross-validation and external validation. In particular, consensus model II yielded a prediction accuracy of 89.5 % and an MCC of 0.670 on external validation. This result indicates that the predictive power of consensus model II should be stronger than that of most previously reported models. The 17 top-performing individual models, the consensus models and the data sets used for model development are available at
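
The two external-validation figures reported above follow from the counts of a binary confusion matrix. A minimal sketch with hypothetical function names, using the standard definitions of accuracy and the Matthews correlation coefficient (MCC):

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of compounds classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient of a binary classifier."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when a row or column sum is zero
    return (tp * tn - fp * fn) / denom
```

MCC ranges from -1 to 1 and, unlike accuracy, stays near 0 for a classifier that simply predicts the majority class, which is why both figures are usually quoted for imbalanced blocker/non-blocker data.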

Theoretical Study on the Conformational Bioeffect of the Fluorination of Acetylcholine


There has been an increasing interest in the study of fluorinated derivatives of gamma-aminobutyric acid (GABA), an acetylcholine (AC) analog. This work reports a theoretical study on the effect of an α-carbonyl fluorination in AC, aiming at understanding the role of a distant fluorine relative to the positively charged nitrogen on the conformational folding of the resulting fluorinated AC. In addition, the chemical and structural changes were evaluated on the basis of ligand-enzyme (acetylcholinesterase) interactions. In an enzyme-free environment, the fluorination yields conformational changes relative to AC due to the appearance of some attractive interactions with fluorine and a weaker steric repulsion between the fluorine substituent and the carboxyl group, rather than to a possible electrostatic interaction F⋅⋅⋅N+. Moreover, the gauche orientation in the N−C−C−O fragment of AC owing to the electrostatic gauche effect is reinforced after fluorination. For instance, the conformational equilibrium in AC is described by a competition between gauche and anti conformers (accounting for the N−C−C−O dihedral angle) in DMSO, while the population for a gauche conformer in the fluorinated AC is almost 100 % in both gas phase and DMSO. However, this arrangement is disrupted in the biological environment even in the fluorinated derivative (whose bioconformation-like geometry shows a ligand-protein interaction of −84.1 kcal mol−1 against −79.5 kcal mol−1 for the most stable enzyme-free conformation), which shows an anti N−C−C−O orientation, because the enzyme induced-fit takes place. Nevertheless, the most likely bioconformation for the fluorinated AC does not match the bioactive AC backbone nor the most stable enzyme-free conformation, thus revealing the role of fluorination on the bioconformational control of AC.

Structure Modification Toward Applicability Domain of a QSAR/QSPR Model Considering Activity/Property


In drug and material design, the activity and property values of the designed chemical structures can be predicted by quantitative structure−activity and structure−property relationship (QSAR/QSPR) models. When a QSAR/QSPR model is applied to chemical structures, its applicability domain (AD) must be considered. The predicted activity/property values are only reliable for chemical structures inside the AD. Chemical structures outside the AD are usually neglected, as the predicted values are unreliable. The purpose of this study is to develop a methodology for obtaining novel chemical structures with the desired activity or property based on a QSAR/QSPR model by making use of the neglected structures. We propose a structure modification strategy for the AD that considers the activity and property simultaneously. The AD is defined by a one-class support vector machine and the structure modification is guided by a partial derivative of the AD model and matched molecular pairs analysis. Three proof-of-concept case studies generate novel chemical structures inside the AD that exhibit preferable activity/property values according to the QSAR/QSPR model.

Metabolomic Studies of Indonesian Jamu Medicines: Prediction of Jamu Efficacy and Identification of Important Metabolites


In order to obtain a better understanding of why some Jamu formulas can be used to treat a specific disease, we performed metabolomic studies of Jamu by taking into consideration the biologically active compounds existing in the plants used as Jamu ingredients. A thorough integration of information from omics is expected to provide solid evidence-based scientific rationales for the development of modern phytomedicines. This study focused on the prediction of Jamu efficacy based on its component metabolites and on the identification of important metabolites related to each efficacy group. Initially, we compared the performance of Support Vector Machines and Random Forest in predicting Jamu efficacy with three different data pre-processing approaches: no filtering, the Single Filtering algorithm, and a combination of the Single Filtering algorithm and feature selection using Regularized Random Forest. Both classifiers performed very well and, according to the 5-fold cross-validation results, the mean accuracy of the Support Vector Machine with a linear kernel was slightly better than that of Random Forest. It can be concluded that machine learning methods can successfully relate Jamu efficacy to metabolites. In addition, we extended our analysis by identifying important metabolites from the Random Forest model. The inTrees framework was used to extract the rules and to select important metabolites for each efficacy group. Overall, we identified 94 significant metabolites associated with 12 efficacy groups, and many of them were validated by the published literature and the KNApSAcK Metabolite Activity database.

DMclust, a Density-based Modularity Method for Accurate OTU Picking of 16S rRNA Sequences


Clustering 16S rRNA sequences into operational taxonomic units (OTUs) is a crucial step in analyzing metagenomic data. Although many methods have been developed, obtaining an appropriate balance between clustering accuracy and computational efficiency is still a major challenge. A novel density-based modularity clustering method, called DMclust, is proposed in this paper to bin 16S rRNA sequences into OTUs with high clustering accuracy. The DMclust algorithm consists of four main phases. First, it searches for dense groups of sequences, defined as n-sequence communities in which the distance between any two sequences is less than a threshold. Second, these dense groups are used to construct a weighted network, in which dense groups are viewed as nodes, each pair of dense groups is connected by an edge, and the distance between the two groups is the weight of the edge. Third, a modularity-based community detection method is employed to generate the preclusters. Finally, the remaining sequences are assigned to their nearest preclusters to form OTUs. Experimental results on several metagenomic datasets show that DMclust clusters more accurately than existing widely used methods, with acceptable memory usage.

Cover Picture: (Mol. Inf. 11/2017)


Molecular Informatics publishes research that will deepen our understanding about information storage and processing on the molecular level, signaling and regulation of biological and chemical systems including cellular systems and macromolecular assemblies, modeling of molecular interactions and networks, and the design of molecular modulators that exhibit desired biochemical and pharmacological effects. Various aspects of this transdisciplinary scientific area are depicted on the cover: Cells with their nuclei and membranes (image courtesy of Dr. A. Schreiner and E. Resch), models of receptor-ligand interactions, and an artistic representation of “biological information” as multiple bit-codes presented on a right-handed helix.

Salts Influence Cathechins and Flavonoids Encapsulation in Liposomes: A Molecular Dynamics Investigation


Catechins and flavonoids are responsible for numerous health benefits. Two of the most representative compounds for their antioxidant and therapeutic effects are Epigallocatechin 3-Gallate (EGCG), from green tea extracts, and morelloflavone (MF), from Garcinia dulcis. Here we explore, by atomistic Molecular Dynamics simulations, how EGCG and MF interact with lipid bilayers, and we show the influence of salts on their degree of encapsulation in neutral liposomes. We found that EGCG molecules naturally bind to the hydrophilic regions of phospholipids, positioning themselves mostly at the interface between the water and lipid phases. The presence of a salt clearly influences the absorption of the EGCG molecules, and the total effect depends strongly on the nature and concentration of the salt. For MF, by contrast, we observed a high stability of the intermolecular MF aggregates in water, which strongly penalizes the flavonoid's interaction with the lipid polar heads. However, salts can influence MF's liposomal penetration, even if they are not able to fully promote its absorption inside the bilayer. For both compounds, the increase in penetration is more marked in the presence of magnesium chloride, whilst calcium chloride showed the opposite effect.

Computational Studies of the Active and Inactive Regulatory Domains of Response Regulator PhoP Using Molecular Dynamics Simulations


The response regulator PhoP is part of the PhoP/PhoQ two-component system, which is responsible for regulating the expression of multiple genes involved in controlling virulence, biofilm formation, and resistance to antimicrobial peptides. Therefore, modulating the transcriptional function of the PhoP protein is a promising strategy for developing new antimicrobial agents. There is evidence suggesting that phosphorylation-mediated dimerization in the regulatory domain of PhoP is essential for its transcriptional function. Disruption or stabilization of protein-protein interactions at the dimerization interface may inhibit or enhance the expression of PhoP-dependent genes. In this study, we performed molecular dynamics simulations on the active and inactive dimers and monomers of the PhoP regulatory domains, followed by pocket-detecting screenings and a quantitative hot-spot analysis in order to assess the druggability of the protein. Consistent with prior hypothesis, the calculation of the binding free energy shows that phosphorylation enhances dimerization of PhoP. Furthermore, we have identified two different putative binding sites at the dimerization active site (the α4-β5-α5 face) with energetic “hot-spot” areas, which could be used to search for modulators of protein-protein interactions. This study delivers insight into the dynamics and druggability of the dimerization interface of the PhoP regulatory domain, and may serve as a basis for the rational identification of new antimicrobial drugs.

Pharmacoinformatic Study on the Selective Inhibition of the Protozoan Dihydrofolate Reductase Enzymes


Dihydrofolate reductase (DHFR) is an essential enzyme of the folate metabolic pathway in protozoa and is a validated potential drug target in many infectious diseases. Information about unique conserved residues of the DHFR enzyme is required to understand the residue-level selectivity of the protozoan DHFR enzyme. Three-dimensional crystal structures are not available for all the protozoan DHFR enzymes. Enzyme-substrate/inhibitor interaction information is required to characterize the binding mode in protozoan DHFR for selective inhibitor design. In this work, multiple sequence analysis was carried out for all the studied species. Homology models were built for the protozoan DHFR enzymes for which 3D structures are not available in the PDB. Molecular docking and Prime-MMGBSA calculations of the natural substrate (dihydrofolate, DHF) and a classical DHFR inhibitor (methotrexate, MTX) were performed in the protozoan DHFR enzymes. Comparative sequence analysis showed that the overall sequence identity between the studied species ranges from 22.94 % (CfDHFR-BgDHFR) to 94.61 % (LdDHFR-LmDHFR). Interestingly, most of the active site residues were conserved in all cases and all the enzymes exhibit similar key binding interactions with DHF and MTX in the molecular docking analysis, but a few key binding residues differ between protozoan species, which makes them suitable for target selectivity. This information can be used to design selective and potent protozoan DHFR enzyme inhibitors.

Energy-based Neural Networks as a Tool for Harmony-based Virtual Screening


In Energy-Based Neural Networks (EBNNs), relationships between variables are captured by means of a scalar function conventionally called “energy”. In this article, we introduce a procedure of “harmony search”, which looks for compounds providing the lowest energies for the EBNNs trained on active compounds. It can be considered a special kind of similarity search that takes into account regularities in the structures of active compounds. In this paper, we show that harmony search can be used for performing virtual screening. The performance of the harmony search based on two types of EBNNs, the Hopfield Networks (HNs) and the Restricted Boltzmann Machines (RBMs), was compared with the performance of the similarity search based on the Tanimoto coefficient with “data fusion”. The AUC measure for ROC curves and 1 %-enrichment rates for 20 targets were used in the benchmarking. Five different scores were computed: the energy for HNs, the free energy and the reconstruction error for RBMs, and the mean and the maximum values of Tanimoto coefficients. The performance of the harmony search was shown to be comparable to or even better than (significantly so for several targets) the performance of the similarity search. Important advantages of using the harmony search for virtual screening include the very high computational efficiency of prediction, the ability to reveal and take into account regularities in active structures, and the flexibility and interpretability of the models.
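
To make two of the five scores concrete, the following is a minimal sketch of a Hopfield energy score and the mean/maximum Tanimoto fusion scores. It assumes binary fingerprints, a textbook Hebbian learning rule with no bias terms, and the standard energy E = -1/2 Σ w_ij s_i s_j; the abstract does not specify the network details actually used, so every function name and the toy fingerprints here are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def to_spins(bits: Sequence[int]) -> List[int]:
    """Map a 0/1 fingerprint to -1/+1 spins."""
    return [1 if b else -1 for b in bits]


def train_hopfield(actives: Sequence[Sequence[int]]) -> List[List[float]]:
    """Hebbian weights averaged over the active compounds, zero diagonal."""
    n = len(actives[0])
    w = [[0.0] * n for _ in range(n)]
    for bits in actives:
        s = to_spins(bits)
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += s[i] * s[j] / len(actives)
    return w


def hopfield_energy(w: List[List[float]], bits: Sequence[int]) -> float:
    """E = -1/2 * sum_ij w_ij s_i s_j; lower means closer to the training set.

    Without bias terms this energy is unchanged when every bit is flipped.
    """
    s = to_spins(bits)
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto coefficient of two binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def fused_tanimoto(query: Sequence[int],
                   actives: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Mean and maximum Tanimoto similarity of a query to all actives."""
    sims = [tanimoto(query, a) for a in actives]
    return sum(sims) / len(sims), max(sims)


# Toy 4-bit fingerprints (hypothetical).
actives = [[1, 1, 0, 0], [1, 1, 0, 1]]
weights = train_hopfield(actives)
energy_like_active = hopfield_energy(weights, [1, 1, 0, 0])
energy_unrelated = hopfield_energy(weights, [1, 0, 1, 0])
mean_sim, max_sim = fused_tanimoto([1, 1, 0, 0], actives)
```

In a screening setting, candidate compounds would be ranked by ascending energy or by descending fused similarity, and the two rankings compared by ROC AUC or enrichment.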

Ensemble Architecture for Prediction of Enzyme-ligand Binding Residues Using Evolutionary Information


Enzyme interactions with ligands are crucial for various biochemical reactions governing life. Over many years, attempts to identify the residues involved in these interactions for biotechnological manipulation have been made using experimental and computational techniques. The computational approaches, broadly classified into template-based and de novo methods, have gathered impetus with the growing availability of sequence and structure information. One of the predominant de novo methods using sequence information involves the application of biological properties for supervised machine learning. Here, we propose a support vector machine-based ensemble for the prediction of protein-ligand interacting residues using one of the most important discriminative properties in the interacting residue neighbourhood, i.e., evolutionary information in the form of a position-specific scoring matrix (PSSM). The study has been performed on a non-redundant dataset comprising 9269 interacting and 91773 non-interacting residues for prediction model generation and further evaluation. Of the various PSSM-based models explored, the proposed method named ROBBY (pRediction Of Biologically relevant small molecule Binding residues on enzYmes) shows an accuracy of 84.0 %, a Matthews Correlation Coefficient of 0.343 and an F-measure of 39.0 % on 78 test enzymes. Further, the scope for adding domain knowledge such as pocket information has also been investigated; the results showed a significant enhancement in method precision. It is hoped that the findings will boost the reliability of small-molecule ligand interaction prediction for enzyme applications and drug design.
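
The three reported metrics are all derived from the same confusion matrix, and with roughly one interacting residue for every ten non-interacting ones, accuracy alone says little. The sketch below uses the standard definitions; the counts in the example are invented for illustration and are not the ROBBY results.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, precision, recall, F-measure and MCC from confusion counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical imbalanced test set: 50 binding and 950 non-binding residues.
scores = binary_metrics(tp=30, fp=70, tn=880, fn=20)
```

In this invented case the accuracy is high while the F-measure and MCC remain modest, which is the usual pattern on imbalanced residue-level data.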

Predicting the Enzymatic Hydrolysis Half-lives of New Chemicals Using Support Vector Regression Models Based on Stepwise Feature Elimination


The enzymatic hydrolysis of chemicals, which is important for in vitro drug metabolism assays, is an important indicator of drug stability profiles during drug discovery and development. Herein, we employed a stepwise feature elimination (SFE) method with nonlinear support vector machine regression (SVR) models to predict the in vitro half-lives of various esters in human plasma/blood. The SVR model was developed using public databases and literature-reported data on the half-lives of esters in human plasma/blood. In particular, the SFE method was developed to prevent overfitting and underfitting in the nonlinear model, and it provided a novel and efficient method of realizing feature combinations and selections to enhance the prediction accuracy. Our final model, with 24 features, effectively predicted an external validation set constructed using the time-split method, with a reasonably good R2 value (0.6), and also predicted two completely independent validation datasets with R2 values of 0.62 and 0.54; thus, this model performed much better than other prediction models.
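
As a rough illustration of the evaluation described above, the sketch below shows a time-split partition and an R2 calculation. It assumes each record carries a publication year and that R2 means the coefficient of determination (1 minus the ratio of residual to total sum of squares); the abstract does not say whether this or the squared correlation coefficient was reported. The record layout, cutoff year and numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

Record = Tuple[int, float]  # (publication year, measured value), hypothetical


def time_split(records: Sequence[Record],
               cutoff_year: int) -> Tuple[List[Record], List[Record]]:
    """Train on records up to the cutoff year, validate on later ones."""
    train = [r for r in records if r[0] <= cutoff_year]
    test = [r for r in records if r[0] > cutoff_year]
    return train, test


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical records and predictions, for illustration only.
records = [(2010, 1.2), (2015, 0.7), (2018, 2.1)]
train_set, test_set = time_split(records, cutoff_year=2015)
example_r2 = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

A time split of this kind mimics prospective use, because the model never sees compounds reported after the cutoff.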