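Ensemble Machine Learning Approaches Based on Molecular Descriptors and Graph Convolutional Networks for Predicting the Efflux Activities of MDR1 and BCRP Transporters.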

Impact factor: 5.0; Zone 3 (Medicine); JCR Q1 (Pharmacology & Pharmacy)
Asahi Adachi, Tomoki Yamashita, Shigehiko Kanaya, Yohei Kosugi
Journal: AAPS Journal
Publication type: Journal Article
Published: 2023-09-12
DOI: 10.1208/s12248-023-00853-y (https://doi.org/10.1208/s12248-023-00853-y)


Ensemble Machine Learning Approaches Based on Molecular Descriptors and Graph Convolutional Networks for Predicting the Efflux Activities of MDR1 and BCRP Transporters.


Multidrug resistance protein 1 (MDR1) and breast cancer resistance protein (BCRP) play important roles in drug absorption and distribution. Computational prediction of substrates for both transporters can shorten drug discovery timelines. This study aimed to predict the efflux activity of MDR1 and BCRP using multiple machine learning approaches based on molecular descriptors and graph convolutional networks (GCNs). In vitro efflux activity was determined using MDR1- and BCRP-expressing cells. Predictive performance was assessed using an in-house dataset with a chronological split and an external dataset. Of the 25 descriptor-based machine learning methods evaluated, CatBoost and support vector regression showed the best predictive performance for MDR1 and BCRP efflux activities, respectively, as judged by the coefficient of determination (R²). The single-task GCN performed slightly worse than descriptor-based prediction on the in-house dataset. For both approaches, the percentage of compounds predicted within twofold of the observed values was lower in the external dataset than in the in-house dataset. Multi-task GCN did not improve performance, whereas multimodal GCN improved the prediction of BCRP efflux activity compared with single-task GCN. Furthermore, the ensemble of descriptor-based machine learning and GCN achieved the highest predictive performance, with R² values of 0.706 and 0.587 for MDR1 and BCRP, respectively, on the time-split test sets. This result suggests that the two ways of representing molecular structure capture complementary molecular characteristics. Our study demonstrated that predictive models using advanced machine learning approaches are beneficial for identifying potential substrate liability for both MDR1 and BCRP.
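The evaluation described in the abstract (a chronological split, R², the share of compounds predicted within twofold, and an ensemble of two model families) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the abstract does not say how the descriptor-based and GCN predictions were combined, so the unweighted mean below is an assumption, as are all function names, the `measured_on` field, and the toy numbers, which were made up for the example.

```python
from datetime import date


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fraction_within_twofold(observed, predicted):
    """Share of compounds whose prediction lies within 2x of the observed value.

    Assumes positive values on a linear scale (for example, efflux ratios).
    """
    hits = sum(1 for o, p in zip(observed, predicted) if 0.5 <= p / o <= 2.0)
    return hits / len(observed)


def average_ensemble(pred_a, pred_b):
    """Unweighted mean of two models' predictions (an assumed combination rule)."""
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


def chronological_split(records, cutoff):
    """Train on compounds measured before `cutoff`, test on those measured later.

    `records` is a list of dicts with a hypothetical `measured_on` date field.
    """
    train = [r for r in records if r["measured_on"] < cutoff]
    test = [r for r in records if r["measured_on"] >= cutoff]
    return train, test


# Toy data, invented for illustration only (not values from the study).
observed = [1.0, 2.0, 4.0, 8.0]
descriptor_pred = [1.5, 2.5, 3.0, 6.0]  # stands in for a descriptor-based model
gcn_pred = [0.9, 2.1, 4.6, 9.0]         # stands in for a GCN model
ensemble_pred = average_ensemble(descriptor_pred, gcn_pred)

for name, pred in [
    ("descriptor", descriptor_pred),
    ("gcn", gcn_pred),
    ("ensemble", ensemble_pred),
]:
    print(name, round(r_squared(observed, pred), 3),
          fraction_within_twofold(observed, pred))
```

Averaging can help when the two models make partly independent errors, which is one way to read the abstract's remark that the two molecular representations complement each other. A weighted mean or a stacked meta-model would be equally plausible choices here.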
