Predicting reaction kinetics of reactive bromine species with organic compounds by machine learning: Feature combination and knowledge transfer with reactive chlorine species

IF 12.2 · CAS Zone 1 (Environmental Science and Ecology) · JCR Q1 (ENGINEERING, ENVIRONMENTAL)
Wenlei Qin, Shanshan Zheng, Kaiheng Guo, Ming Yang, Jingyun Fang
Journal of Hazardous Materials, published 2024-11-05. DOI: 10.1016/j.jhazmat.2024.136410 (https://doi.org/10.1016/j.jhazmat.2024.136410)
Citation count: 0

Abstract

Reactive bromine species (RBS), such as the bromine atom (Br•) and the dibromine radical (Br2•−), are important oxidative species responsible for the transformation of organic compounds in bromide-containing water. This study developed quantitative structure−activity relationship (QSAR) models that use machine learning (ML) to predict second-order rate constants (k) of RBS, and conducted knowledge transfer between RBS and reactive chlorine species (RCS, e.g., Cl• and Cl2•−) to improve model performance. The ML-based models (RMSE_test = 0.476−0.712) outperformed the multiple linear regression-based models (RMSE_test = 0.572−3.68) in predicting k of RBS. In addition, combining molecular fingerprints (MFs) and quantum descriptors (QDs) as input features improved the performance of ML-based models (RMSE_test = 0.476−0.712) compared with models developed from MFs alone (RMSE_test = 0.524−0.834) or QDs alone (RMSE_test = 0.572−0.806). SHAP analysis identified E_HOMO and E_gap as the most important features affecting k of RBS. A unified model integrating the datasets of four reactive halogen species (RHS: Br•, Br2•−, Cl• and Cl2•−) was further developed (R²_test = 0.802) and showed better predictive performance than the individual models (R²_test = 0.521−0.776). The effect of knowledge transfer among RHS varied by species pair: performance improved for Br•/Cl•, was mixed for Br•/Br2•− and Cl•/Cl2•−, and worsened for Br2•−/Cl2•−. This study provides useful tools for predicting k of RHS in aqueous environments.
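
To make the feature combination and the reported test metrics concrete, the sketch below concatenates a molecular fingerprint with quantum descriptors and computes RMSE and R² on a held-out set. It is only an illustration under stated assumptions: the function names, the 8-bit fingerprint, the descriptor values, and the rate-constant values are all hypothetical, and the metrics are assumed to be computed on log10(k), which the abstract does not state. It is not the authors' implementation.

```python
import math


def combine_features(fingerprint_bits, quantum_descriptors):
    """Concatenate a binary fingerprint with continuous quantum descriptors."""
    return [float(b) for b in fingerprint_bits] + [float(q) for q in quantum_descriptors]


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical compound: an 8-bit fingerprint plus [E_HOMO, E_gap] in eV (made-up values).
features = combine_features([1, 0, 0, 1, 0, 1, 0, 0], [-6.1, 5.2])

# Hypothetical held-out log10(k) values and model predictions (made-up values).
y_test = [8.0, 9.0, 10.0, 7.0]
y_hat = [8.5, 9.0, 9.5, 7.0]

print(len(features), round(rmse(y_test, y_hat), 3), round(r_squared(y_test, y_hat), 3))
```

In a real workflow the combined vector would be the input to a regression model, and a lower RMSE_test or higher R²_test on the same held-out compounds would indicate the better feature set.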

Source journal
Journal of Hazardous Materials (Engineering and Technology: Engineering, Environmental)
CiteScore: 25.40
Self-citation rate: 5.90%
Articles published: 3059
Review time: 58 days
Journal description: The Journal of Hazardous Materials serves as a global platform for promoting cutting-edge research in the field of Environmental Science and Engineering. Our publication features a wide range of articles, including full-length research papers, review articles, and perspectives, with the aim of enhancing our understanding of the dangers and risks associated with various materials concerning public health and the environment. It is important to note that the term "environmental contaminants" refers specifically to substances that pose hazardous effects through contamination, while excluding those that do not have such impacts on the environment or human health. Moreover, we emphasize the distinction between wastes and hazardous materials in order to provide further clarity on the scope of the journal. We have a keen interest in exploring specific compounds and microbial agents that have adverse effects on the environment.