M3S-GRPred: a novel ensemble learning approach for the interpretable prediction of glucocorticoid receptor antagonists using a multi-step stacking strategy.

IF 2.9 | Zone 3 (Biology) | JCR Q2 (Biochemical Research Methods)
Nalini Schaduangrat, Hathaichanok Chuntakaruk, Thanyada Rungrotmongkol, Pakpoom Mookdarsanit, Watshara Shoombuatong
BMC Bioinformatics, vol. 26(1), article 117. Published 2025-04-30. DOI: 10.1186/s12859-025-06132-1 (https://doi.org/10.1186/s12859-025-06132-1). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12044944/pdf/

Abstract

Machine learning (ML)-based approaches can accelerate drug discovery for glucocorticoid receptor (GR)-related disorders, with promise for advancing therapeutic development, optimizing treatment efficacy, and mitigating adverse effects. While experimental methods can accurately identify GR antagonists, they are often not cost-effective for large-scale drug discovery. Computational approaches that use SMILES information for in silico identification of GR antagonists are therefore crucial for efficient and scalable drug discovery. Here, we develop a new ensemble learning approach based on a multi-step stacking strategy (M3S), termed M3S-GRPred, for rapidly and accurately discovering novel GR antagonists. To the best of our knowledge, M3S-GRPred is the first SMILES-based predictor designed to identify GR antagonists without the use of 3D structural information. In M3S-GRPred, we first constructed different balanced subsets using an under-sampling approach. Using these balanced subsets, we explored and evaluated heterogeneous base-classifiers trained with a variety of SMILES-based feature descriptors coupled with popular ML algorithms. Finally, M3S-GRPred was constructed by integrating the probabilistic features from the base-classifiers selected by a two-step feature selection technique. Our comparative experiments demonstrate that M3S-GRPred can precisely identify GR antagonists and effectively handle the imbalanced dataset. Compared to traditional ML classifiers, M3S-GRPred attained superior performance on both the training and independent test datasets. Additionally, M3S-GRPred was applied to identify potential GR antagonists among FDA-approved drugs; the candidates were confirmed through molecular docking, followed by detailed molecular dynamics (MD) simulation studies, for drug repurposing in Cushing's syndrome. We anticipate that M3S-GRPred will serve as an efficient, cost-effective screening tool for discovering novel GR antagonists from vast libraries of unknown compounds.
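
The pipeline described in the abstract (under-sampling into balanced subsets, base-classifiers that each emit a probability, and a meta-classifier trained on those probabilistic features) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 2D toy vectors stand in for SMILES-based descriptors, `CentroidClassifier` is a hypothetical stand-in for the ML algorithms used as base-classifiers, the logistic-regression meta-classifier is an assumed choice, and the two-step feature selection is omitted. All function and variable names are invented for the example.

```python
import math
import random


def balanced_subsets(X, y, n_subsets, seed=0):
    """Under-sample the majority class into n_subsets balanced training sets."""
    rng = random.Random(seed)
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    subsets = []
    for _ in range(n_subsets):
        # Keep every minority sample; draw an equal number from the majority.
        idx = minority + rng.sample(majority, len(minority))
        subsets.append(([X[i] for i in idx], [y[i] for i in idx]))
    return subsets


class CentroidClassifier:
    """Toy base-classifier (hypothetical stand-in, not from the paper)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        d0 = math.dist(x, self.centroids[0])
        d1 = math.dist(x, self.centroids[1])
        # Closer to the positive centroid gives a probability nearer 1.
        return 1.0 / (1.0 + math.exp(d1 - d0))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta(F, y, epochs=200, lr=0.5):
    """Logistic-regression meta-classifier trained by plain SGD (assumed choice)."""
    w, b = [0.0] * len(F[0]), 0.0
    for _ in range(epochs):
        for f, t in zip(F, y):
            g = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - t
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


# Toy imbalanced data: 4 positives (label 1) versus 12 negatives (label 0).
rng = random.Random(1)
X = [[rng.gauss(2, 0.3), rng.gauss(2, 0.3)] for _ in range(4)]
X += [[rng.gauss(0, 0.3), rng.gauss(0, 0.3)] for _ in range(12)]
y = [1] * 4 + [0] * 12

subsets = balanced_subsets(X, y, n_subsets=3)
base_models = [CentroidClassifier().fit(Xs, ys) for Xs, ys in subsets]


def probabilistic_features(x):
    """One probability per base-classifier, concatenated into a feature vector."""
    return [m.predict_proba(x) for m in base_models]


# Simplification: the meta-classifier sees in-sample base predictions here.
w, b = fit_meta([probabilistic_features(x) for x in X], y)


def predict(x):
    f = probabilistic_features(x)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b)


p_pos, p_neg = predict([2.0, 2.0]), predict([0.0, 0.0])
print(p_pos, p_neg)
```

In a realistic stacking setup the meta-classifier would normally be trained on cross-validated (out-of-fold) base predictions rather than in-sample ones, and the base-classifiers would differ in both descriptor and algorithm; the sketch keeps a single toy learner so that the data flow between the steps stays visible.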

Source journal: BMC Bioinformatics (Biology, Biochemical Research Methods)
CiteScore: 5.70
Self-citation rate: 3.30%
Articles published: 506
Review time: 4.3 months
About the journal: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.