Single-task regression naturally adapts to multi-species (eco)toxicological modelling: a case study on animals

IF 5.8 | Zone 3, Environmental Science and Ecology | JCR: 0 | ENVIRONMENTAL SCIENCES
Suyu Mei
Journal: Environmental Science and Pollution Research, vol. 32 (8), pages 4910-4925
Published: 2025-02-01 (Journal Article)
DOI: 10.1007/s11356-025-36025-y
Link: https://link.springer.com/article/10.1007/s11356-025-36025-y
Open access: no
Citations: 0

Abstract


Single-task regression naturally adapts to multi-species (eco)toxicological modelling: a case study on animals

In silico (eco)toxicological modelling has gained increasing popularity with chemical environmentalists in accelerating toxicity assessment of hazardous chemicals on environments, animal well-being and human health. Existing local and multi-task models commonly exhibit restricted extensibility in multi-species modelling scenarios. In this work, we propose a strategy of single-task regression that naturally adapts modelling to (eco)toxicological measurements on multiple species without requiring a certain number of common pesticides among tested species, as multi-task regression does. This strategy treats all species equally in an integral model to facilitate data augmentation and inter-species transfer of common patterns of fragmental toxicities. We aggregate 37,305 measurements of 29,140 pesticides on 10 tested groups of animals to train four machine learning models: extreme gradient boosting (XGBoost), deep neural networks (DNN), random forest (RF) and support vector regression (SVR). Five-fold stratified cross-validation shows that XGBoost outperforms the other three models, with an overall R² of 0.67, RMSE of 0.44 and MAE of 0.29. Compared with local models focusing on one animal group, the proposed single-task regression model achieves an R² increase of 0.08 to 0.49. XGBoost feature importance shows that Morgan bit 389 (five-atom fraction of the aromatic ring) has the top importance in both single-task regression and single-animal regression. Lastly, taking the pesticides parathion and dimethoate as control baselines, we demonstrate the credibility of several case studies from the viewpoints of toxicity profile similarities and pesticide structural similarities.
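
The pooled data layout that the abstract describes can be illustrated with a small sketch. It is not the author's implementation: the one-hot species encoding, the choice to stratify folds by animal group, and all function names (`pool_measurements`, `stratified_folds`, `regression_metrics`) are assumptions made here for illustration. The point is that each measurement becomes one row, so no pesticide needs to be shared between species, and a single regressor (XGBoost in the paper) could then be trained on the rows.

```python
import math
import random
from collections import defaultdict


def pool_measurements(records, species_vocab):
    """Turn (fingerprint_bits, species, toxicity) records into one long table.

    Each row is the fingerprint followed by a one-hot species indicator, so a
    single regressor sees every species (assumed encoding, not from the paper).
    """
    X, y, groups = [], [], []
    for bits, species, toxicity in records:
        one_hot = [1 if species == s else 0 for s in species_vocab]
        X.append(list(bits) + one_hot)
        y.append(toxicity)
        groups.append(species)
    return X, y, groups


def stratified_folds(groups, k=5, seed=0):
    """Assign each row to one of k folds, spreading every species evenly.

    Stratifying by animal group is an assumption; the abstract only says the
    cross-validation is five-fold and stratified.
    """
    rng = random.Random(seed)
    by_species = defaultdict(list)
    for idx, group in enumerate(groups):
        by_species[group].append(idx)
    fold_of = [0] * len(groups)
    for group in sorted(by_species):
        indices = by_species[group]
        rng.shuffle(indices)
        # Round-robin assignment keeps per-species counts balanced across folds.
        for position, idx in enumerate(indices):
            fold_of[idx] = position % k
    return fold_of


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE), the three scores reported in the abstract."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae
```

In this layout a pesticide measured on only one animal group still contributes a training row, which is the property the abstract contrasts with multi-task regression.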

Source journal
CiteScore: 8.70
Self-citation rate: 17.20%
Articles published: 6549
Review time: 3.8 months
Journal description: Environmental Science and Pollution Research (ESPR) serves the international community in all areas of Environmental Science and related subjects, with emphasis on chemical compounds. This includes:
- Terrestrial Biology and Ecology
- Aquatic Biology and Ecology
- Atmospheric Chemistry
- Environmental Microbiology/Biobased Energy Sources
- Phytoremediation and Ecosystem Restoration
- Environmental Analyses and Monitoring
- Assessment of Risks and Interactions of Pollutants in the Environment
- Conservation Biology and Sustainable Agriculture
- Impact of Chemicals/Pollutants on Human and Animal Health
It reports from a broad interdisciplinary outlook.