Prediction of drug-induced nephrotoxicity based on deep learning algorithm and molecular fingerprints.

IF 3.8, Zone 2 (Chemistry), JCR Q2 (CHEMISTRY, APPLIED)
Shuailong Wang, Yan Li
Molecular Diversity. Journal Article. Published 2025-10-12.
DOI: 10.1007/s11030-025-11376-3 (https://doi.org/10.1007/s11030-025-11376-3)
Citations: 0

Abstract

Drug-induced nephrotoxicity (DIN) is an infrequent adverse drug reaction and a complex clinical outcome influenced by multiple factors. Predicting DIN with preclinical animal models remains challenging, and in silico approaches have emerged as promising alternatives for DIN risk assessment. In this study, a dataset of 1,018 clearly labeled compounds was constructed from five sources: SIDER, FDA, ChEMBL, DrugBank, and literature on "drug-induced nephrotoxicity" published in the past decade (screened via keyword search on PubMed). Compound screening and label annotation followed explicit criteria. Using "kidney," "nephrotoxicity," "kidney injury," and "kidney disease" as core search terms, retrieved compounds that were clearly associated with kidney injury or could induce kidney disease were assigned to the positive set (DIN = 1); compounds with no records of renal adverse reactions, or those explicitly having renal protective effects or used to treat renal diseases, were assigned to the negative set (DIN = 0). In total, 42 classification models were built by pairing six molecular fingerprints with seven algorithms: a deep neural network (DNN) and six traditional machine learning algorithms. In a comparative study, the DNN models consistently outperformed the traditional machine learning approaches across all six fingerprint types. The ECFP_6 fingerprint gave the best performance, with an area under the receiver operating characteristic curve (AUC) of 75.9%, an accuracy (ACC) of 71.4%, and an F1-score of 76.0%. The SHapley Additive exPlanations (SHAP) algorithm was then applied to interpret the predictions of the high-performing models and to identify key structural fragments associated with DIN. The ten most influential substructures, ranked by their impact on model predictions, were selected as early warning markers for future DIN screening research. Overall, these results suggest that DNN models based on molecular fingerprints can serve as dependable and efficient tools for assessing nephrotoxicity risk in drug candidates during the early phases of drug development.
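
The experimental design and the three reported metrics can be illustrated with a minimal sketch. It is not the authors' code: only ECFP_6 and DNN are named in the abstract, so the other fingerprint and algorithm names below are placeholders, and the labels and scores are made-up toy values rather than data from the study. The 0.5 decision threshold is also an assumption.

```python
from itertools import product

# Six fingerprints x seven algorithms = 42 models.
# Only "ECFP_6" and "DNN" come from the abstract; the rest are placeholders.
fingerprints = ["ECFP_6"] + [f"fingerprint_{i}" for i in range(2, 7)]
algorithms = ["DNN"] + [f"ml_algorithm_{i}" for i in range(1, 7)]
model_grid = list(product(fingerprints, algorithms))


def accuracy(y_true, y_pred):
    """Fraction of compounds whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (DIN = 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: true DIN labels and hypothetical predicted probabilities.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(len(model_grid), "models in the grid")
print("ACC:", accuracy(y_true, y_pred))
print("F1 :", round(f1_score(y_true, y_pred), 3))
print("AUC:", roc_auc(y_true, scores))
```

ACC and F1 depend on the chosen threshold, while AUC is computed from the raw scores and is threshold-independent, which is one reason all three are commonly reported together.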

Source journal
Molecular Diversity (Chemistry: Multidisciplinary)
CiteScore: 7.30
Self-citation rate: 7.90%
Articles published: 219
Review time: 2.7 months
Journal description: Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including: combinatorial chemistry and parallel synthesis; small molecule libraries; microwave synthesis; flow synthesis; fluorous synthesis; diversity oriented synthesis (DOS); nanoreactors; click chemistry; multiplex technologies; fragment- and ligand-based design; structure/function/SAR; computational chemistry and molecular design; chemoinformatics; screening techniques and screening interfaces; analytical and purification methods; robotics, automation and miniaturization; targeted libraries; display libraries; peptides and peptoids; proteins; oligonucleotides; carbohydrates; natural diversity; new methods of library formulation and deconvolution; directed evolution, origin of life and recombination; search techniques, landscapes, random chemistry and more.