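An effective machine learning model for rat acute oral toxicity prediction of emerging chemicals: multi-domain applications and structure-activity relationships.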

IF 2.3 | Zone 3: Environmental Science and Ecology | JCR Q3: CHEMISTRY, MULTIDISCIPLINARY
J Yan, Z Shen
SAR and QSAR in Environmental Research, 36(6), pp. 537-554. Published 2025-06-01 (Epub 2025-07-31). Journal Article.
DOI: https://doi.org/10.1080/1062936X.2025.2531172
Citations: 0

Abstract

Given the widespread presence of emerging contaminants in the environment, assessing and ensuring their biosafety is urgent. Under the Globally Harmonized System (GHS), the LD50 value for acute oral toxicity (AOT) is crucial for chemical safety classification. The limitations of animal testing have highlighted the need for alternative methods, and machine learning offers a new approach to predicting LD50 through quantitative structure-activity relationship (QSAR) models. This study developed and optimized a machine learning model for LD50 classification of emerging contaminants based on more than 6000 known AOT data entries. Using molecular descriptors and fingerprints, the model achieves an accuracy above 0.86 and a recall above 0.84, outperforming previous models. Its robustness was confirmed across various types of emerging contaminants. Shapley additive explanations (SHAP) identified key descriptors such as BCUTp_1h, ATSC1pe, and SLogP_VSA4, while the information gain (IG) method highlighted the alert substructures P-O and P-S. These findings suggest that compounds with high polarizability, mean electronegativity, and significant surface area may adversely affect rats. The model improves understanding of acute toxicity mechanisms and can serve as a tool for early screening of safer compounds, supporting the design of greener chemicals.
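The abstract mentions two ideas that a small example can make concrete: sorting LD50 values into GHS classes, and scoring a substructure alert by information gain. The sketch below is illustrative only and is not the authors' implementation. The abstract does not say which class boundaries or IG procedure were used, so the standard GHS acute oral cut-offs (5, 50, 300, 2000, 5000 mg/kg) are assumed, along with an assumed binary "highly toxic" label (categories 1 and 2). All function names and the toy data are hypothetical.

```python
import math
from collections import Counter

# Standard GHS acute oral toxicity cut-offs: (upper LD50 bound in mg/kg, category).
GHS_ORAL_UPPER_BOUNDS = [(5, 1), (50, 2), (300, 3), (2000, 4), (5000, 5)]


def ghs_category(ld50_mg_per_kg):
    """Return the GHS acute oral category (1-5), or None above 5000 mg/kg."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    for upper, category in GHS_ORAL_UPPER_BOUNDS:
        if ld50_mg_per_kg <= upper:
            return category
    return None  # not classified under GHS


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(labels, has_substructure):
    """Entropy reduction in `labels` after splitting on a boolean flag."""
    present = [y for y, flag in zip(labels, has_substructure) if flag]
    absent = [y for y, flag in zip(labels, has_substructure) if not flag]
    total = len(labels)
    # Weighted entropy of the two groups; empty groups contribute nothing.
    remainder = sum(len(part) / total * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remainder


# Hypothetical toy data, invented for illustration (not from the paper).
toy_ld50 = [3.0, 40.0, 25.0, 1500.0, 4000.0, 800.0]       # mg/kg
toy_has_p_s = [True, True, True, False, False, False]     # P-S bond present?

# Assumed binarisation: GHS categories 1 and 2 count as "highly toxic".
toy_labels = [ghs_category(v) in (1, 2) for v in toy_ld50]

print("GHS categories:", [ghs_category(v) for v in toy_ld50])
print("IG of P-S flag:", information_gain(toy_labels, toy_has_p_s))
```

In this toy set the P-S flag separates the two classes perfectly, so its information gain equals the full entropy of the labels. A flag that is unrelated to toxicity would score close to zero, which is the sense in which IG can rank candidate alert substructures.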

Source journal
CiteScore: 5.20
Self-citation rate: 20.00%
Articles published: 78
Review time: >24 weeks
Journal description: SAR and QSAR in Environmental Research is an international journal welcoming papers on the fundamental and practical aspects of the structure-activity and structure-property relationships in the fields of environmental science, agrochemistry, toxicology, pharmacology and applied chemistry. A unique aspect of the journal is the focus on emerging techniques for the building of SAR and QSAR models in these widely varying fields. The scope of the journal includes, but is not limited to, the topics of topological and physicochemical descriptors, mathematical, statistical and graphical methods for data analysis, computer methods and programs, original applications and comparative studies. In addition to primary scientific papers, the journal contains reviews of books and software and news of conferences. Special issues on topics of current and widespread interest to the SAR and QSAR community will be published from time to time.