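Hybrid intelligence for environmental pollution: biodegradability assessment of organic compounds through multimodal integration of graph attention networks and QSAR models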

IF 4.3 · Zone 3 (Environmental Science and Ecology) · JCR Q1 (Chemistry, Analytical)
Abbas Salimi and Jin Yong Lee
DOI: 10.1039/D4EM00594E
Journal: Environmental Science: Processes & Impacts, 4, pp. 981-991
Published: 2025-03-07 (journal article, not open access)
URL: https://pubs.rsc.org/en/content/articlelanding/2025/em/d4em00594e
Citations: 0

Abstract

Computational methods are crucial for assessing chemical biodegradability, given the significant impact of chemicals on both environmental and human health. Organic compounds that are not biodegradable can persist in the environment, contributing to pollution. Our novel approach leverages graph attention networks (GATs) and incorporates node and edge attributes for biodegradability prediction. Quantitative Structure–Activity Relationship (QSAR) models using two-dimensional descriptors were combined with the GATs through weighted average and stacking approaches to generate ensemble models. The GAT models showed stable performance and generally higher specificity on the validation set than a graph convolutional network, although definitive superiority is challenging to establish owing to overlapping standard deviations. Their sensitivities, however, tended to be lower, again with intervals that overlap. Ensemble learning improved several performance metrics over the individual and base models, with the combination of extreme gradient boosting and GAT achieving the highest precision and specificity. Combining GAT with random forest and gradient boosting may be preferable for accurately predicting biodegradable molecules, whereas the stacking approach may be suitable for prioritizing the correct classification of nonbiodegradable substances. Important descriptors, such as SpMax1_Bh(m) and SAscore, were identified in at least two QSAR models. Despite inherent complexities, the ease of implementation depends on factors such as data availability and domain knowledge.

Assessing the biodegradability of organic compounds is essential for reducing their environmental impact, assessing risks, ensuring regulatory compliance, promoting sustainable development, and supporting effective pollution remediation. It assists in making informed decisions about chemical use, waste management, and environmental protection.
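The weighted-average ensembling and the sensitivity, specificity, and precision metrics mentioned in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration only: the function names, the equal weighting, the 0.5 threshold, the choice of class 1 as "biodegradable", and the toy probabilities, none of which come from the paper.

```python
from typing import Dict, List, Sequence


def weighted_average(prob_a: Sequence[float], prob_b: Sequence[float],
                     weight_a: float = 0.5) -> List[float]:
    """Blend two models' class-1 probabilities; weight_a goes to model A."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(prob_a) != len(prob_b):
        raise ValueError("both models must score the same molecules")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(prob_a, prob_b)]


def confusion_metrics(y_true: Sequence[int], y_prob: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, and precision for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # biodegradable molecules recovered
        "specificity": ratio(tn, tn + fp),  # nonbiodegradable molecules recovered
        "precision": ratio(tp, tp + fp),
    }


# Toy, made-up scores (assumption): 1 = biodegradable, 0 = nonbiodegradable.
y_true = [1, 1, 1, 0, 0, 0]
gat_prob = [0.9, 0.4, 0.7, 0.2, 0.6, 0.1]   # hypothetical graph-model output
qsar_prob = [0.8, 0.8, 0.4, 0.3, 0.2, 0.1]  # hypothetical descriptor-model output

blended = weighted_average(gat_prob, qsar_prob, weight_a=0.5)
for name, probs in [("GAT", gat_prob), ("QSAR", qsar_prob), ("blend", blended)]:
    print(name, confusion_metrics(y_true, probs))
```

In this toy data the two base models err on different molecules, so averaging their probabilities corrects both mistakes. A stacking ensemble would instead train a second-level model on the base models' outputs; that variant is not shown here.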

Source journal
Environmental Science: Processes & Impacts (Chemistry, Analytical; Environmental Sciences)
CiteScore: 9.50
Self-citation rate: 3.60%
Articles published: 202
Review time: 1 month
About the journal: Environmental Science: Processes & Impacts publishes high quality papers in all areas of the environmental chemical sciences, including chemistry of the air, water, soil and sediment. We welcome studies on the environmental fate and effects of anthropogenic and naturally occurring contaminants, both chemical and microbiological, as well as related natural element cycling processes.