Connecting chemical and protein sequence space to predict biocatalytic reactions

IF 48.5, Zone 1 (multidisciplinary journals), Q1 MULTIDISCIPLINARY SCIENCES
Nature Pub Date : 2025-10-01 DOI:10.1038/s41586-025-09519-5
Alexandra E. Paton, Daniil A. Boiko, Jonathan C. Perkins, Nicholas I. Cemalovic, Thiago Reschützegger, Gabe Gomes, Alison R. H. Narayan
Nature, volume 646, issue 8083, pages 108-116. Publication type: Journal Article.

Abstract

The application of biocatalysis in synthesis has the potential to offer streamlined routes towards target molecules¹, tunable catalyst-controlled selectivity², as well as processes with improved sustainability³. Despite these advantages, biocatalysis is often a high-risk strategy to implement, as identifying an enzyme capable of performing chemistry on a specific intermediate required for a synthesis can be a roadblock that requires extensive screening of enzymes and protein engineering to overcome⁴. Strategies for predicting which enzyme and small molecule are compatible have been hindered by the lack of well-studied biocatalytic reaction datasets⁵. The underexploration of connections between chemical and protein sequence space constrains navigation between these two landscapes. Here we report a two-phase effort relying on high-throughput experimentation to populate connections between productive substrate and enzyme pairs and the subsequent development of a tool, CATNIP, for predicting compatible α-ketoglutarate (α-KG)/Fe(II)-dependent enzymes for a given substrate or, conversely, for ranking potential substrates for a given α-KG/Fe(II)-dependent enzyme sequence. We anticipate that our approach can be readily expanded to further enzyme and transformation classes and will derisk the investigation and application of biocatalytic methods.

A two-phase machine-learning-based tool making use of high-throughput experimentation is introduced to examine the connections between chemical and protein sequence space and predict productive biocatalytic reactions among substrate and enzyme pairs.

Source journal

Nature (multidisciplinary journal)
CiteScore: 90.00
Self-citation rate: 1.20%
Articles published: 3652
Review time: 3 months
Journal description: Nature is a prestigious international journal that publishes peer-reviewed research in various scientific and technological fields. The selection of articles is based on criteria such as originality, importance, interdisciplinary relevance, timeliness, accessibility, elegance, and surprising conclusions. In addition to showcasing significant scientific advances, Nature delivers rapid, authoritative, insightful news and interpretation of current and upcoming trends impacting science, scientists, and the broader public. The journal serves a dual purpose: firstly, to promptly share noteworthy scientific advances and foster discussions among scientists, and secondly, to ensure the swift dissemination of scientific results globally, emphasizing their significance for knowledge, culture, and daily life.