Prediction of Cytochrome P450 Substrates Using the Explainable Multitask Deep Learning Models
Jiaojiao Fang, Yan Tang, Changda Gong, Zejun Huang, Yanjun Feng, Guixia Liu, Yun Tang and Weihua Li*
DOI: 10.1021/acs.chemrestox.4c00199 (https://pubs.acs.org/doi/10.1021/acs.chemrestox.4c00199)
Published: 2024-08-28
Abstract
Cytochromes P450 (P450s or CYPs) are the most important phase I metabolic enzymes in the human body and are responsible for metabolizing ∼75% of clinically used drugs. P450-mediated metabolism is also closely associated with the formation of toxic metabolites and with drug–drug interactions. It is therefore important to predict, early in drug development, whether a compound is a substrate of a given P450. In this study, we built multitask learning models that simultaneously predict the substrates of five major drug-metabolizing P450 enzymes, namely CYP3A4, 2C9, 2C19, 2D6, and 1A2, based on collected substrate data sets. Compared with single-task models and conventional machine learning models, the multitask model combining fingerprints and graph neural networks achieved superior performance, with an average AUC of 90.8% on the test set. Notably, the multitask model also performed well on the enzymes with small substrate data sets, such as CYP1A2, 2C9, and 2C19. In addition, Shapley additive explanations and the attention mechanism were used to reveal specific substructures associated with P450 substrates, which were further confirmed and complemented by a substructure mining tool and the literature.
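The abstract reports a single average AUC over the five P450 tasks but does not say how the averaging was done. The following is a minimal sketch of one common reading, under two assumptions that are not taken from the paper: the average is an unweighted mean of per-enzyme AUCs, and a compound with no label for an enzyme (marked `None` here) is skipped for that enzyme. The function names `roc_auc` and `average_auc` and the toy labels and scores are purely illustrative.

```python
from statistics import mean

# The five enzymes named in the abstract.
TASKS = ["CYP3A4", "CYP2C9", "CYP2C19", "CYP2D6", "CYP1A2"]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_auc(y_true, y_score):
    """Per-task AUCs and their unweighted mean (assumed averaging scheme).

    y_true[task] may contain None for compounds without a label for that
    enzyme; those entries are dropped before the AUC is computed.
    """
    per_task = {}
    for task in TASKS:
        pairs = [(y, s) for y, s in zip(y_true[task], y_score[task])
                 if y is not None]
        labels, scores = zip(*pairs)
        per_task[task] = roc_auc(labels, scores)
    return per_task, mean(per_task.values())


# Hypothetical toy data: four compounds, one missing CYP1A2 label.
y_true = {task: [1, 0, 1, 0] for task in TASKS}
y_true["CYP1A2"] = [1, None, 1, 0]
y_score = {task: [0.9, 0.8, 0.3, 0.1] for task in TASKS}

per_task, avg = average_auc(y_true, y_score)
print(per_task)
print(round(avg, 3))
```

Skipping unlabeled entries per task is what lets a sparsely labeled enzyme such as CYP1A2 be scored alongside the better-populated ones. A sample-weighted mean would be a different, equally plausible choice.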