Predicting cardiotoxicity in drug development: A deep learning approach.

IF 8.9

Journal of pharmaceutical analysis, volume 15, issue 8, article 101263. Pub Date: 2025-08-01. Epub Date: 2025-03-12. DOI: 10.1016/j.jpha.2025.101263

Kaifeng Liu, Huizi Cui, Xiangyu Yu, Wannan Li, Weiwei Han


Citations: 0

Abstract




Cardiotoxicity is a critical issue in drug development that poses serious health risks, including potentially fatal arrhythmias. The human ether-à-go-go related gene (hERG) potassium channel, one of the primary targets of cardiotoxicity, has garnered widespread attention. Traditional cardiotoxicity testing methods are expensive and time-consuming, making computational virtual screening a suitable alternative. In this study, we employed machine learning techniques utilizing molecular fingerprints and descriptors to predict the cardiotoxicity of compounds, with the aim of improving prediction accuracy and efficiency. We combined four types of molecular fingerprints and descriptors with machine learning and deep learning algorithms, including Gaussian naive Bayes (NB), random forest (RF), support vector machine (SVM), K-nearest neighbors (KNN), eXtreme gradient boosting (XGBoost), and Transformer models, to build predictive models. Our models demonstrated strong predictive performance. The best machine learning model, XGBoost Morgan, achieved an accuracy (ACC) of 0.84, and the deep learning model, Transformer_Morgan, achieved the best ACC of 0.85, showing a high ability to distinguish between toxic and non-toxic compounds. On an external independent validation set, it achieved the best area under the curve (AUC) of 0.93, surpassing ADMETlab3.0, Cardpred, and CardioDPi. In addition, we explored the integration of molecular descriptors and fingerprints to enhance model performance and found that ensemble methods, such as voting and stacking, provided slight improvements in model stability. Furthermore, SHapley Additive exPlanations (SHAP) analysis revealed the relationship between benzene rings, fluorine-containing groups, NH groups, oxygen in ether groups, and cardiotoxicity, highlighting the importance of these features. This study not only improved the predictive accuracy of cardiotoxicity models but also promoted a more reliable and scientifically interpretable method for drug safety assessment. Using computational methods, this study facilitates a more efficient drug development process, reduces costs, and improves the safety of new drug candidates, ultimately benefiting medicine and public health.
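The abstract reports model quality as accuracy (ACC) and area under the ROC curve (AUC). As a rough illustration of what those two numbers measure for a binary toxic/non-toxic classifier, here is a minimal pure-Python sketch. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration; they are not the study's code or data.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of compounds whose thresholded score matches the label.

    The 0.5 threshold is an assumption for this sketch.
    """
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_score):
    """AUC computed as the probability that a randomly chosen positive
    (toxic) compound is scored above a randomly chosen negative one.
    Tied scores count as half a win.
    """
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = cardiotoxic, 0 = non-toxic.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

acc = accuracy(labels, scores)
auc = roc_auc(labels, scores)
print(f"ACC = {acc:.2f}, AUC = {auc:.4f}")
```

ACC depends on the chosen threshold, while AUC summarizes how well the scores rank toxic above non-toxic compounds across all thresholds, which is one reason the two values can differ for the same model.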


Source journal

Journal of pharmaceutical analysis

Self-citation rate

0.00%

Articles published