S. Pramanik, F. Contreras, M. Davari, U. Schwaneberg
{"title":"Protein Engineering by Efficient Sequence Space Exploration Through Combination of Directed Evolution and Computational Design Methodologies","authors":"S. Pramanik, F. Contreras, M. Davari, U. Schwaneberg","doi":"10.1002/9783527815128.ch7","DOIUrl":"https://doi.org/10.1002/9783527815128.ch7","url":null,"abstract":"","PeriodicalId":20902,"journal":{"name":"Protein engineering","volume":"121 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72579698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Engineering Antibody-Based Therapeutics: Progress and Opportunities","authors":"Annalee W. Nguyen, J. Maynard","doi":"10.1002/9783527815128.ch13","DOIUrl":"https://doi.org/10.1002/9783527815128.ch13","url":null,"abstract":"","PeriodicalId":20902,"journal":{"name":"Protein engineering","volume":"57 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84040124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data‐driven Protein Engineering","authors":"J. Greenhalgh, Apoorv Saraogee, Philip A. Romero","doi":"10.1002/9783527815128.ch6","DOIUrl":"https://doi.org/10.1002/9783527815128.ch6","url":null,"abstract":"A protein’s sequence of amino acids encodes its function. This “function” could refer to a protein’s natural biological function, or it could also be any other property including binding affinity toward a particular ligand, thermodynamic stability, or catalytic activity. A detailed understanding of how these functions are encoded would allow us to more accurately reconstruct the tree of life and possibly predict future evolutionary events, diagnose genetic diseases before they manifest symptoms, and design new proteins with useful properties. We know that a protein sequence folds into a three-dimensional structure, and this structure positions specific chemical groups to perform a function; however, we’re missing the quantitative details of this sequence-structure-function mapping. This mapping is extraordinarily complex because it involves thousands of molecular interactions that are dynamically coupled across multiple length and time scales. Computational methods can be used to model the mapping from sequence to structure to function. Tools such as molecular dynamics simulations or Rosetta use atomic representations of protein structures and physics-based energy functions to model structures and functions (1–3). While these models are based on well-founded physical principles, they often fail to capture a protein’s overall global behavior and properties. There are numerous challenges associated with physics-based models including consideration of conformational dynamics, the requirement to make energy function approximations for the sake of computational efficiency, and the fact that, for many complex properties such as enzyme catalysis, the molecular basis is simply unknown (4). In systems composed of thousands of atoms, the propagation of small errors quickly overwhelms any predictive accuracy. Despite tremendous breakthroughs and research progress over the last century, we still lack the key details to reliably predict, simulate, and design protein function. In this chapter, we present the emerging field of data-driven protein engineering. Instead of physically modeling the relationships between protein sequence, structure, and function, data-driven methods use ideas from statistics and machine learning to infer these complex relationships from data. This top-down modeling approach implicitly captures the numerous and possibly unknown factors that shape the mapping from sequence to function. Statistical models have been used to understand the molecular basis of protein function and provide exceptional predictive accuracy for protein design.","PeriodicalId":20902,"journal":{"name":"Protein engineering","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72950036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jadwiga R Bienkowska, Hyman Hartman, Temple F Smith
{"title":"A search method for homologs of small proteins. Ubiquitin-like proteins in prokaryotic cells?","authors":"Jadwiga R Bienkowska, Hyman Hartman, Temple F Smith","doi":"10.1093/protein/gzg130","DOIUrl":"https://doi.org/10.1093/protein/gzg130","url":null,"abstract":"The question of protein homology versus analogy arises when proteins share a common function or a common structural fold without any statistically significant amino acid sequence similarity. Even though two or more proteins do not have similar sequences but share a common fold and the same or closely related function, they are assumed to be homologs, descendant from a common ancestor. The problem of homolog identification is compounded in the case of proteins of 100 or less amino acids. This is due to a limited number of basic single domain folds and to a likelihood of identifying by chance sequence similarity. The latter arises from two conditions: first, any search of the currently very large protein database is likely to identify short regions of chance match; secondly, a direct sequence comparison among a small set of short proteins sharing a similar fold can detect many similar patterns of hydrophobicity even if proteins do not descend from a common ancestor. In an effort to identify distant homologs of the many ubiquitin proteins, we have developed a combined structure and sequence similarity approach that attempts to overcome the above limitations of homolog identification. This approach results in the identification of 90 probable ubiquitin-related proteins, including examples from the two prokaryotic domains of life, Archaea and Bacteria.","PeriodicalId":20902,"journal":{"name":"Protein engineering","volume":"16 12","pages":"897-904"},"PeriodicalIF":0.0,"publicationDate":"2003-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/protein/gzg130","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"24410581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Protein fold comparison by the alignment of topological strings.","authors":"Linus O Johannissen, William R Taylor","doi":"10.1093/protein/gzg128","DOIUrl":"https://doi.org/10.1093/protein/gzg128","url":null,"abstract":"Using the definitions of protein folds encoded in a text string, a dynamic programming algorithm was devised to compare these and identify their largest common substructure and calculate the distance (in terms of the number of edit operations) that this lay from each structure. This provided a metric on which the folds were clustered into a 'phylogenetic' tree. This construction differs from previous automatic structure clustering algorithms as it has explicit representation of the structures at 'ancestral' branching nodes, even when these have no corresponding known structure. The resulting tree was compared with that compiled by an 'expert' in the field and while there was broad agreement, differences were found that resulted from differing degrees of emphasis being placed on the types of operations that can be used to transform structures. Some concluding speculations on the relationship of such trees to the evolutionary history and folding of the proteins are advanced.","PeriodicalId":20902,"journal":{"name":"Protein engineering","volume":"16 12","pages":"949-55"},"PeriodicalIF":0.0,"publicationDate":"2003-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/protein/gzg128","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"24410586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sung-Hun Nam, Ki-Hoon Oh, Geun-Joong Kim, Hak-Sung Kim
{"title":"Functional tuning of a salvaged green fluorescent protein variant with a new sequence space by directed evolution.","authors":"Sung-Hun Nam, Ki-Hoon Oh, Geun-Joong Kim, Hak-Sung Kim","doi":"10.1093/protein/gzg146","DOIUrl":"https://doi.org/10.1093/protein/gzg146","url":null,"abstract":"We previously reported a method, designated functional salvage screen (FSS), to generate protein lineages with new sequence spaces through the functional or structural salvage of a defective protein by employing green fluorescent protein (GFP) as a model protein. Here, in an attempt to mimic a step in the natural evolution process of proteins, the functionally salvaged mutant GFP-I5 with new sequence space, but showing low fluorescence intensity and stability, was selected and fine-tuned by directed evolution. During a course of functional tuning, GFP-I5 was found to evolve rapidly, recovering the spectral traits to those of the parent GFPuv. The mutant 3E4 from the third round of directed evolution possessed four substitutions; three (F64L, E111V and K166Q) were at the original GFP gene and the other (K8N) at the inserted segment. The fluorescence intensity of 3E4 was approximately 28-fold stronger than GFP-I5, and other spectral properties were retained. Biochemical and biophysical investigations suggested that the fine-tuning by directed evolution led the salvaged variant GFP-I5 to a functionally favorable structure, resulting in recovery of stability and fluorescence. Site-directed mutagenesis of the mutated amino acid residues in both GFPuv and GFP-I5 revealed that each amino acid residue has a different effect on the fluorescence intensity, which implies that 3E4 adopted a new evolutionary path with respect to fluorescence characteristics compared with the parent GFPuv. Directed evolution in conjunction with FSS is expected to be used for generating protein lineages with new fitness landscapes.","PeriodicalId":20902,"journal":{"name":"Protein engineering","volume":"16 12","pages":"1099-105"},"PeriodicalIF":0.0,"publicationDate":"2003-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/protein/gzg146","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"24410920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using a residue clash map to functionally characterize protein recombination hybrids.","authors":"Manish C Saraf, Costas D Maranas","doi":"10.1093/protein/gzg129","DOIUrl":"https://doi.org/10.1093/protein/gzg129","url":null,"abstract":"In this article, we introduce a rapid, protein sequence database-driven approach to characterize all contacting residue pairs present in protein hybrids for inconsistency with protein family structural features. This approach is based on examining contacting residue pairs with different parental origins for different types of potentially unfavorable interactions (i.e. electrostatic repulsion, steric hindrance, cavity formation and hydrogen bond disruption). The identified clashing residue pairs between members of a protein family are then contrasted against functionally characterized hybrid libraries. Comparisons for five different protein recombination studies available in the literature: (i) glycinamide ribonucleotide transformylase (GART) from Escherichia coli (purN) and human (hGART), (ii) human Mu class glutathione S-transferase (GST) M1-1 and M2-2, (iii) beta-lactamase TEM-1 and PSE-4, (iv) catechol-2,3-oxygenase xylE and nahH, and (v) dioxygenases (toluene dioxygenase, tetrachlorobenzene dioxygenase and biphenyl dioxygenase) reveal that the patterns of identified clashing residue pairs are remarkably consistent with experimentally found patterns of functional crossover profiles. Specifically, we show that the proposed residue clash maps are on average 5.0 times more effective than randomly generated clashes and 1.6 times more effective than residue contact maps at explaining the observed crossover distributions among functional members of hybrid libraries. This suggests that residue clash maps can provide quantitative guidelines for the placement of crossovers in the design of protein recombination experiments.","PeriodicalId":20902,"journal":{"name":"Protein engineering","volume":"16 12","pages":"1025-34"},"PeriodicalIF":0.0,"publicationDate":"2003-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1093/protein/gzg129","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"24410518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}