TCNeKP: A Novel Deep Learning Architecture for Enzyme Catalytic Activity Prediction
Yuanyuan Lei, Rui Liu, Hanxi Yu, Wentao Xu, Ting Long, Hu Mei
Journal of Chemical Information and Modeling (impact factor 5.3; JCR Q1, Chemistry, Medicinal), published 2025-10-14
DOI: 10.1021/acs.jcim.5c01830 (https://doi.org/10.1021/acs.jcim.5c01830)
Citations: 0
Abstract
Accurate prediction of enzyme kinetic parameters (Kcat and Km) is crucial for rational enzyme design and engineering research. Based on a heterogeneous data set encompassing 17,893 Kcat and 24,585 Km records across 8911 enzyme sequences from 7 EC classes and 5023 substrates, we introduce novel TCNeKP models for predicting Kcat and Km values. Enzyme sequences were autoembedded and processed by a temporal convolutional network (TCN) module to extract the key features of catalytic and binding residues, which are frequently located far apart in the primary sequence; substrates were encoded by a pretrained SMILES-Transformer language model; and catalytic conditions (pH and temperature) were encoded via radial basis functions (RBFs). The fused features were then fed into a fully connected network for single-task prediction of Kcat and Km. Results demonstrate that the TCNeKP-Kcat and TCNeKP-Km models achieve robust performance across wild-type and mutant enzymes from 7 EC classes, outperforming the state-of-the-art MPEK, UniKP, and DLKcat models (Table S3). Leveraging a cross-task dynamic parameter-sharing module with an attention mechanism, we further developed a multitask TCNeKP model that achieves the highest R² values among the benchmark models for both Kcat (0.677) and Km (0.657) prediction. These findings indicate that collaborative learning between the Kcat and Km prediction tasks enhances feature extraction for enzyme-substrate binding and catalysis, thereby significantly improving predictive performance.
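Two of the ideas in the abstract can be illustrated with a minimal sketch: expanding scalar conditions (pH, temperature) into radial basis function features, and estimating how many sequence positions a stack of dilated convolutions can span. This is not the authors' implementation. The Gaussian kernel form, the center grids, the `gamma` widths, the kernel size, and the dilation schedule are all assumptions chosen for illustration, and the helper names `rbf_encode`, `linspace`, and `tcn_receptive_field` are hypothetical.

```python
import math


def rbf_encode(value, centers, gamma):
    """Expand a scalar into Gaussian RBF features, one feature per center."""
    return [math.exp(-gamma * (value - c) ** 2) for c in centers]


def linspace(lo, hi, n):
    """Return n evenly spaced points from lo to hi, inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def tcn_receptive_field(kernel_size, dilations, convs_per_level=1):
    """Sequence positions visible to one output of stacked dilated convolutions.

    Each dilated convolution with dilation d widens the field by
    (kernel_size - 1) * d positions.
    """
    return 1 + convs_per_level * (kernel_size - 1) * sum(dilations)


# Assumed grids (not from the paper): pH centers at 0, 1, ..., 14 and
# temperature centers at 0, 10, ..., 100 degrees Celsius.
PH_CENTERS = linspace(0.0, 14.0, 15)
TEMP_CENTERS = linspace(0.0, 100.0, 11)

ph_features = rbf_encode(7.0, PH_CENTERS, gamma=1.0)
temp_features = rbf_encode(37.0, TEMP_CENTERS, gamma=0.01)

# Concatenate both encodings into one condition vector of length 26.
condition_vector = ph_features + temp_features

# Assumed schedule: kernel size 3 with dilations doubling per level.
# 1 + (3 - 1) * (1 + 2 + 4 + 8 + 16 + 32) = 127 positions.
rf = tcn_receptive_field(kernel_size=3, dilations=[1, 2, 4, 8, 16, 32])
```

The RBF expansion turns each scalar into a smooth, localized activation pattern, so nearby pH or temperature values yield similar feature vectors. The receptive-field calculation shows why exponentially growing dilations suit residues that lie far apart in the primary sequence: the span grows with the sum of the dilations while the number of layers grows only logarithmically.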
About the journal:
The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery.
Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field.
As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.