The multiclass ARKA framework for developing improved q-RASAR models for environmental toxicity endpoints

IF 4.3 | Zone 3, Environmental Science and Ecology | JCR Q1, CHEMISTRY, ANALYTICAL
Arkaprava Banerjee and Kunal Roy
Journal: Environmental Science: Processes & Impacts, volume 5, pages 1229-1243
Publication date: 2025-04-02
Publication type: Journal Article
DOI: 10.1039/D5EM00068H
URL: https://pubs.rsc.org/en/content/articlelanding/2025/em/d5em00068h
Open access: no
Source platform: Semanticscholar
Citations: 0

Abstract


The continuous quest for quick, accurate, and efficient methods to fill the gaps in the toxicity data of commercial chemicals is the need of the hour. It has therefore become essential to develop simple, improved modeling strategies that generate more accurate predictions. Recently, quantitative Read-Across Structure–Activity Relationship (q-RASAR) modeling has been reported to enhance the external predictivity of QSAR models. However, the cross-validation metrics of some q-RASAR models are poorer than those of the corresponding QSAR models. We report here an improved q-RASAR workflow coupled with the Arithmetic Residuals in K-groups Analysis (ARKA) framework. This improved workflow (ARKA-RASAR) considers two important aspects: the contribution of different QSAR descriptors to different experimental response ranges, and the identification of similarity among close congeners based on both the selected QSAR descriptors and those contributions. A simple, free, and user-friendly Java-based tool, Multiclass ARKA-v1.0, has been developed to compute the multiclass ARKA descriptors. In this study, five different toxicity datasets previously used for the development of QSAR and q-RASAR models were considered. We developed hybrid ARKA models that combine QSAR descriptors and ARKA descriptors. These hybrid feature spaces were used to compute RASAR descriptors and develop ARKA-RASAR models. For a fair comparison, we used the same modeling strategies that were used to develop the previously reported QSAR and q-RASAR models. Additionally, these modeling algorithms are straightforward, reproducible, and transferable. A multi-criteria decision-making statistical approach, the Sum of Ranking Differences (SRD), indicated that the ARKA-RASAR models are the best-performing models, considering training, test, and cross-validation statistics. The least significant difference procedure confirmed that the SRD values were significantly different for most models, presenting an unbiased workflow. A true external validation was also carried out by predicting the early-stage acute fish toxicity of a set of pesticide metabolites with the relevant ARKA-RASAR models, and it yielded encouraging results. The promising results and the ease of computing ARKA and RASAR descriptors with our tools suggest that the ARKA-RASAR modeling framework may be a potential choice for developing highly robust and predictive models for filling the gaps in environmental toxicity data.
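
The abstract names the Sum of Ranking Differences (SRD) as the criterion used to compare models but does not spell out the computation. The following is a minimal sketch of the general SRD idea, not the authors' implementation: each model's validation statistics are ranked, the ranks are compared with those of a reference column, and the absolute rank differences are summed (a smaller SRD means the model sits closer to the reference). The function names `ranks` and `srd`, the dictionary layout, and the default reference (the row-wise mean) are assumptions made for illustration; the abstract does not say which reference or data scaling was used.

```python
from statistics import mean


def ranks(values):
    """Return average ranks (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def srd(table, reference=None):
    """Sum of Ranking Differences for each model (hypothetical helper).

    table: dict mapping a model name to a list of statistics, one value per
           criterion, in the same criterion order for every model.
    reference: one reference value per criterion; assumed here to default
               to the row-wise mean across models.
    """
    models = list(table)
    n_criteria = len(table[models[0]])
    if reference is None:
        reference = [mean(table[m][i] for m in models) for i in range(n_criteria)]
    ref_ranks = ranks(reference)
    return {
        m: sum(abs(a - b) for a, b in zip(ranks(table[m]), ref_ranks))
        for m in models
    }
```

Calling `srd({"model_a": [...], "model_b": [...]})` with one list of statistics per model returns one SRD value per model. In practice the statistics are usually brought to a common direction and scale first, and the SRD values are validated against a random-ranking distribution; both steps are omitted from this sketch.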

Source journal
Environmental Science: Processes & Impacts
Categories: CHEMISTRY, ANALYTICAL; ENVIRONMENTAL SCIENCES
CiteScore: 9.50
Self-citation rate: 3.60%
Articles published: 202
Review time: 1 month
About the journal: Environmental Science: Processes & Impacts publishes high-quality papers in all areas of the environmental chemical sciences, including the chemistry of the air, water, soil, and sediment. The journal welcomes studies on the environmental fate and effects of anthropogenic and naturally occurring contaminants, both chemical and microbiological, as well as related natural element cycling processes.