Deep learning for NAD/NADP cofactor prediction and engineering using transformer attention analysis in enzymes
Jaehyung Kim, Jihoon Woo, Joon Young Park, Kyung-Jin Kim, Donghyuk Kim
Metabolic Engineering, published 2024-11-19
DOI: https://doi.org/10.1016/j.ymben.2024.11.007
Abstract
Understanding and manipulating the cofactor preferences of NAD(P)-dependent oxidoreductases, the most widely distributed enzyme group in nature, is increasingly crucial in bioengineering. However, large-scale identification of cofactor preferences and the design of mutants that switch cofactor specificity remain complex tasks. Here, we introduce DISCODE (Deep learning-based iterative pipeline to analyze Specificity of Cofactors and to Design Enzyme), a novel transformer-based deep learning model that predicts NAD(P) cofactor preferences. For model training, a total of 7,132 NAD(P)-dependent enzyme sequences were collected. Leveraging full-length sequence information, DISCODE classifies the cofactor preferences of NAD(P)-dependent oxidoreductase protein sequences without structural or taxonomic limitations. The model achieved an accuracy of 97.4% and an F1 score of 97.3%. A notable feature of DISCODE is the interpretability of its transformer layers. Analysis of the model's attention layers identified several residues with significantly higher attention weights. These residues aligned well with structurally important residues that closely interact with NAD(P), facilitating the identification of key residues that determine cofactor specificity. These key residues were highly consistent with verified cofactor-switching mutants. Integrated into an enzyme design pipeline, DISCODE coupled with attention analysis enables a fully automated approach to redesigning cofactor specificity.
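As a rough illustration of the attention analysis described above, the sketch below flags residues whose attention weight stands out from the rest of a sequence. It is not the DISCODE implementation: the abstract does not say how "significantly higher" was tested, so the z-score rule, the cutoff of 2.0, the function name `high_attention_residues`, and the toy sequence and weights are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def high_attention_residues(sequence, attention, z_cutoff=2.0):
    """Return (1-based position, residue, z-score) for outlier residues.

    Hypothetical helper: a residue is flagged when its attention weight
    lies at least `z_cutoff` population standard deviations above the
    mean weight of the sequence. The paper's actual criterion may differ.
    """
    if len(sequence) != len(attention):
        raise ValueError("sequence and attention must have the same length")

    mu = mean(attention)
    sigma = pstdev(attention)
    if sigma == 0:
        # All weights are identical, so no residue stands out.
        return []

    hits = []
    for pos, (residue, weight) in enumerate(zip(sequence, attention), start=1):
        z = (weight - mu) / sigma
        if z >= z_cutoff:
            hits.append((pos, residue, round(z, 2)))
    return hits


# Toy data (invented): one residue receives far more attention than the rest.
toy_sequence = "MKGAGDLSRT"
toy_attention = [0.02, 0.03, 0.02, 0.03, 0.02, 0.40, 0.03, 0.02, 0.03, 0.02]

hits = high_attention_residues(toy_sequence, toy_attention)
for pos, residue, z in hits:
    print(f"{residue}{pos}: z = {z}")
```

In a real workflow the per-residue weights would come from the trained model's attention layers, and the flagged positions would then be compared with residues known to contact NAD(P) before being proposed as mutation sites.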
About the journal:
Metabolic Engineering (MBE) is a journal that focuses on publishing original research papers on the directed modulation of metabolic pathways for metabolite overproduction or the enhancement of cellular properties. It welcomes papers that describe the engineering of native pathways and the synthesis of heterologous pathways to convert microorganisms into microbial cell factories. The journal covers experimental, computational, and modeling approaches for understanding metabolic pathways and manipulating them through genetic, media, or environmental means. Effective exploration of metabolic pathways necessitates the use of molecular biology and biochemistry methods, as well as engineering techniques for modeling and data analysis. MBE serves as a platform for interdisciplinary research in fields such as biochemistry, molecular biology, applied microbiology, cellular physiology, cellular nutrition in health and disease, and biochemical engineering. The journal publishes various types of papers, including original research papers and review papers. It is indexed and abstracted in databases such as Scopus, Embase, EMBiology, Current Contents - Life Sciences and Clinical Medicine, Science Citation Index, PubMed/Medline, CAS and Biotechnology Citation Index.