A deep learning based multi-model approach for predicting drug-like chemical compound’s toxicity

IF 4.2 · Zone 3 (Biology) · Q1 BIOCHEMICAL RESEARCH METHODS

Methods · Pub Date: 2024-05-01 · DOI: 10.1016/j.ymeth.2024.04.020

Konda Mani Saravanan, Jiang-Fan Wan, Liujiang Dai, Jiajun Zhang, John Z.H. Zhang, Haiping Zhang

Methods, Volume 226, Pages 164-175. Full text: https://www.sciencedirect.com/science/article/pii/S1046202324001105

Citations: 0

Abstract

Ensuring the safety and efficacy of chemical compounds is crucial in small-molecule drug development. Toxic compounds that surface in the later stages of development pose a significant challenge, wasting valuable resources and time. Early and accurate prediction of compound toxicity using deep learning models offers a promising way to mitigate these risks during drug discovery. In this study, we present several deep learning models for evaluating different types of compound toxicity: acute toxicity, carcinogenicity, hERG cardiotoxicity (cardiotoxicity associated with the human ether-a-go-go-related gene), hepatotoxicity, and mutagenicity. Because data size, label type, and distribution vary across these toxicity types, we employed different training strategies. First, we used a graph convolutional network (GCN) regression model to predict acute toxicity, which achieved Pearson R values of 0.76, 0.74, and 0.65 for the intraperitoneal, intravenous, and oral administration routes, respectively. We then trained multiple GCN binary classification models, each tailored to a specific type of toxicity. These models achieved area under the curve (AUC) scores of 0.69, 0.77, 0.88, and 0.79 for predicting carcinogenicity, hERG cardiotoxicity, mutagenicity, and hepatotoxicity, respectively. Additionally, we used the approved drug dataset to determine an appropriate threshold for the prediction score when applying the models. We integrated these models into a virtual screening pipeline to assess their effectiveness in identifying potential low-toxicity drug candidates. Our findings indicate that this deep learning approach could significantly reduce the cost and risk of drug development by expediting the selection of compounds with low toxicity profiles. The models developed in this study therefore hold promise as tools for early drug candidate screening and selection.
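The abstract reports Pearson R for the regression models and AUC for the binary classifiers, and mentions a score threshold derived from approved drugs. The following is a minimal sketch of how such quantities can be computed in plain Python. It is not the authors' code: the function names are hypothetical, and the percentile rule in `threshold_from_approved` is an assumption, since the abstract does not say how the threshold was chosen.

```python
from math import ceil, sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive compound receives the higher score; ties count as 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_from_approved(approved_scores, keep_fraction=0.9):
    """Assumed rule (not from the paper): return the smallest observed score
    such that at least keep_fraction of approved drugs score at or below it."""
    ordered = sorted(approved_scores)
    k = max(1, ceil(keep_fraction * len(ordered)))
    return ordered[k - 1]
```

Under this assumed rule, a screening pipeline would keep candidates whose predicted toxicity score does not exceed the returned threshold. The pairwise AUC above is quadratic in the number of compounds, which is adequate for illustration but slow for large datasets.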




Source journal: Methods (Biology: Biochemical Research Methods)

CiteScore: 9.80

Self-citation rate: 2.10%

Articles published: 222

Review time: 11.3 weeks

About the journal: Methods focuses on rapidly developing techniques in the experimental biological and medical sciences. Each topical issue, organized by a guest editor who is an expert in the area covered, consists solely of invited quality articles by specialist authors, many of them reviews. Issues are devoted to specific technical approaches with emphasis on clear detailed descriptions of protocols that allow them to be reproduced easily. The background information provided enables researchers to understand the principles underlying the methods; other helpful sections include comparisons of alternative methods giving the advantages and disadvantages of particular methods, guidance on avoiding potential pitfalls, and suggestions for troubleshooting.