{"title":"Predicting function from structure: examples of the serine protease inhibitor canonical loop conformation found in extracellular proteins","authors":"Richard M Jackson , Robert B Russell","doi":"10.1016/S0097-8485(01)00097-3","DOIUrl":"10.1016/S0097-8485(01)00097-3","url":null,"abstract":"<div><p>The prediction of protein function from structure is of growing importance in the age of structural genomics. We have focused on the problem of identifying sites of potential serine protease inhibitor interactions on the surface of proteins of known structure. Given that there is no sequence conservation within canonical loops from different inhibitor families, we first compare representative loops to all fragments of equal length among proteins of known structure by calculating main-chain RMS deviation. Fragments with RMS deviation below a certain threshold (hits) are removed if residues have solvent accessibilities appreciably lower than those observed in the search structure. These remaining hits are further filtered to remove those occurring largely within secondary structure elements. Likely functional significance is restricted further by considering only extracellular protein domains. A test is also performed to see if the loop can dock into the binding site of the serine protease trypsin without unacceptable steric clashes. By comparing different canonical loop structures to the protein structure database we show that the method was able to detect previously known inhibitors. In addition, we discuss potentially new canonical loop structures found in secreted hydrolases, toxins, viral proteins, cytokines and other proteins. We discuss the possible functional significance of several of the examples found.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 31-39"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00097-3","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75323839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modelling protein side-chain conformations using constraint logic programming","authors":"Martin T Swain , Graham J.L Kemp","doi":"10.1016/S0097-8485(01)00103-6","DOIUrl":"10.1016/S0097-8485(01)00103-6","url":null,"abstract":"<div><p>Side-chain placement is an important sub-task in protein modelling. Selecting conformations for side-chains is a difficult problem because of the large search space to be explored. This problem can be addressed using constraint logic programming (CLP), which is an artificial intelligence technique developed to solve large combinatorial search problems. The side-chain placement problem can be expressed as a CLP program in which rotamer conformations are used as values for <em>finite domain variables</em>, and bad steric contacts involving rotamers are represented as <em>constraints</em>. This paper introduces the concept of null rotamers, and shows how these can be used in implementing a novel iterative approach. We present results that compare the accuracy of models constructed using different rotamer libraries and different domain variable enumeration heuristics. The results obtained using this CLP-based approach compare favourably with those obtained by other methods.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 85-95"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00103-6","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77829236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI-based algorithms for protein surface comparisons","authors":"Steven J Pickering , Andrew J Bulpitt , Nick Efford , Nicola D Gold , David R Westhead","doi":"10.1016/S0097-8485(01)00102-4","DOIUrl":"10.1016/S0097-8485(01)00102-4","url":null,"abstract":"<div><p>Many current methods for protein analysis depend on the detection of similarity in either the primary sequence, or the overall tertiary structure (the C<sub>α</sub> atoms of the protein backbone). These common sequences or structures may imply similar functional characteristics or active properties. Active sites and ligand binding sites usually occur on or near the surface of the protein; so similarly shaped surface regions could imply similar functions. We investigate various methods for describing the shape properties of protein surfaces and for comparing them. Our current work uses algorithms from computer vision to describe the protein surfaces, and methods from graph theory to compare the surface regions. Early results indicate that we can successfully match a family of related ligand binding sites, and find their similarly shaped surface regions. This method of surface analysis could be extended to help identify unknown surface regions for possible ligand binding or active sites.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 79-84"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00102-4","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83359745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome","authors":"Martin G Reese","doi":"10.1016/S0097-8485(01)00099-7","DOIUrl":"10.1016/S0097-8485(01)00099-7","url":null,"abstract":"<div><p>Computational methods for automated genome annotation are critical to understanding and interpreting the bewildering mass of genomic sequence data presently being generated and released. A neural network model of the structural and compositional properties of a eukaryotic core promoter region has been developed and its application for analysis of the <em>Drosophila melanogaster</em> genome is presented. The model uses a time-delay architecture, a special case of a feed-forward neural network. The structure of this model allows for variable spacing between functional binding sites, which is known to play a key role in the transcription initiation process. Application of this model to a test set of core promoters not only gave better discrimination of potential promoter sites than previous statistical or neural network models, but also revealed indirectly subtle properties of the transcription initiation signal. When tested in the <em>Adh</em> region of 2.9 Mbases of the <em>Drosophila</em> genome, the neural network for promoter prediction (<span>nnpp</span>) program that incorporates the time-delay neural network model gives a recognition rate of 75% (69/92) with a false positive rate of 1/547 bases. The present work can be regarded as one of the first intensive studies that applies novel gene regulation technologies to the identification of the complex gene regulation sites in the genome of <em>Drosophila melanogaster</em>.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 51-56"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00099-7","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81750853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applications of neural network prediction of conformational states for small peptides from spectra and of fold classes","authors":"H.G Bohr , P Røgen , K.J Jalkanen","doi":"10.1016/S0097-8485(01)00101-2","DOIUrl":"10.1016/S0097-8485(01)00101-2","url":null,"abstract":"<div><p>Electronic structures of small peptides were calculated ‘ab initio’ with the help of Density Functional Theory (DFT) and molecular dynamics that rendered a set of conformational states of the peptides. For the structures of these states it was possible to derive atomic polar tensors that allowed us to construct vibrational spectra for each of the conformational states with low energy. From the spectra, neural networks could be trained to distinguish between the various states and thus be able to generate a larger set of relevant structures and their relation to secondary structures of the peptides. The calculations were done both with solvent atoms (up to ten water molecules) and without, and hence the neural networks could be used to monitor the influence of the solvent on hydrogen bond formation. The calculations at this stage only involved very short peptide fragments of a few alanine amino acids, but already they could be compared, with reasonable agreement, to experiments. The neural networks are shown to be good at distinguishing the different conformers of the small alanine peptides, especially when in the gas phase. Also the task of predicting protein fold-classes, defined from line-geometry, seems promising.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 65-77"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00101-2","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73285226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A computer system to perform structure comparison using TOPS representations of protein structure","authors":"David Gilbert , David Westhead , Juris Viksna , Janet Thornton","doi":"10.1016/S0097-8485(01)00096-1","DOIUrl":"10.1016/S0097-8485(01)00096-1","url":null,"abstract":"<div><p>We describe the design and implementation of a fast topology-based method for protein structure comparison. The approach uses the <span>TOPS</span> topological representation of protein structure, aligning two structures using a common discovered pattern and generating a measure of distance derived from an insert score. Heavy use is made of a constraint-based pattern-matching algorithm for <span>TOPS</span> diagrams that we have designed and described elsewhere (Bioinformatics 15(4) (1999) 317). The comparison system is maintained at the European Bioinformatics Institute and is available over the Web at <span>tops.ebi.ac.uk/tops</span>. Users submit a structure description in Protein Data Bank (PDB) format and can compare it with structures in the entire PDB or a representative subset of protein domains, receiving the results by email.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 23-30"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00096-1","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75169718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generating protein three-dimensional fold signatures using inductive logic programming","authors":"M Turcotte , S.H Muggleton , M.J.E Sternberg","doi":"10.1016/S0097-8485(01)00100-0","DOIUrl":"10.1016/S0097-8485(01)00100-0","url":null,"abstract":"<div><p>Inductive logic programming (ILP) has been applied to automatically discover protein fold signatures. This paper investigates the use of topological information to circumvent problems encountered during previous experiments, namely (1) matching of non-structurally related secondary structures and (2) scaling problems. Cross-validation tests were carried out for 20 folds. The overall estimated accuracy is 73.37±0.35%. The new representation allows us to process the complete set of examples, while previously it was necessary to sample the negative examples. Topological information is used in approximately 90% of the rules presented here. Information about the topology of a sheet is present in 63% of the rules. This set of rules presents characteristics of the overall architecture of the fold. In contrast, 26% of the rules contain topological information which is limited to the packing of a restricted number of secondary structures; as such, the latter set resembles those found in our previous studies.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 57-64"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00100-0","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79924315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Medical target prediction from genome sequence: combining different sequence analysis algorithms with expert knowledge and input from artificial intelligence approaches","authors":"Thomas Dandekar , Fuli Du , R.Heiner Schirmer , Steffen Schmidt","doi":"10.1016/S0097-8485(01)00095-X","DOIUrl":"10.1016/S0097-8485(01)00095-X","url":null,"abstract":"<div><p>By exploiting the rapid increase in available sequence data, the definition of medically relevant protein targets has been improved by a combination of: (i) differential genome analysis (target list); and (ii) analysis of individual proteins (target analysis). Fast sequence comparisons, data mining, and genetic algorithms further promote these procedures. <em>Mycobacterium</em> <em>tuberculosis</em> proteins were chosen as applied examples.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 15-21"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00095-X","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72502091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Drug design by machine learning: support vector machines for pharmaceutical data analysis","authors":"R. Burbidge, M. Trotter, B. Buxton, S. Holden","doi":"10.1016/S0097-8485(01)00094-8","DOIUrl":"10.1016/S0097-8485(01)00094-8","url":null,"abstract":"<div><p>We show that the support vector machine (SVM) classification algorithm, a recent development from the machine learning community, proves its potential for structure–activity relationship analysis. In a benchmark test, the SVM is compared to several machine learning techniques currently used in the field. The classification task involves predicting the inhibition of dihydrofolate reductase by pyrimidines, using data obtained from the UCI machine learning repository. Three artificial neural networks, a radial basis function network, and a C5.0 decision tree are all outperformed by the SVM. The SVM is significantly better than all of these, bar a manually capacity-controlled neural network, which takes considerably longer to train.</p></div>","PeriodicalId":79331,"journal":{"name":"Computers & chemistry","volume":"26 1","pages":"Pages 5-14"},"PeriodicalIF":0.0,"publicationDate":"2001-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S0097-8485(01)00094-8","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88063480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}