Machine learning model to predict rate constants for sonochemical degradation of organic pollutants

IF 8.7 | Zone 1, Chemistry | JCR Q1, Acoustics

Ultrasonics Sonochemistry Pub Date : 2024-08-21 DOI:10.1016/j.ultsonch.2024.107032

Iseul Na, Taeho Kim, Pengpeng Qiu, Younggyu Son

Ultrasonics Sonochemistry, Volume 110, Article 107032. Full text: https://www.sciencedirect.com/science/article/pii/S1350417724002803

Citations: 0

Abstract

In this study, machine learning (ML) algorithms were employed to predict the pseudo-first-order reaction rate constants for the sonochemical degradation of aqueous organic pollutants under various conditions. A total of 618 data sets covering ultrasonic, solution, and pollutant characteristics were collected from 89 previous studies. Because electrical power (P_ele) and calorimetric power (P_cal) differ, the collected data were divided into two groups: data reported with P_ele and data reported with P_cal. Eight input variables (frequency, power density, pH, temperature, initial concentration, solubility, vapor pressure, and octanol–water partition coefficient, K_ow) and one target variable (the degradation rate constant) were selected for ML. Statistical analysis was conducted, and outliers were determined separately for the two groups. Three ML models, random forest (RF), extreme gradient boosting (XGB), and light gradient boosting machine (LGB), were used to predict the rate constants for the removal of aqueous pollutants. Their prediction performance was evaluated using the root mean squared error (RMSE), mean absolute error (MAE), and coefficient of determination (R²). Prediction performance was significantly higher with data from which outliers had been removed and with augmented data. All the applied ML models could be used to predict the sonochemical degradation of aqueous pollutants, and the XGB model showed the highest accuracy in predicting the rate constants. In addition, Shapley additive explanations (SHAP) values identified power density and frequency as the most influential of the eight input variables. Finally, the trained XGB model was used to predict the degradation rate constants of two pollutants over a wide frequency range (20–1,000 kHz), and the prediction results were analyzed.
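The target variable and the three evaluation metrics named in the abstract can be illustrated with a small standard-library sketch. It assumes the usual pseudo-first-order form ln(C_0/C_t) = k·t and the textbook definitions of RMSE, MAE, and R². The abstract does not state how the rate constants were extracted in the 89 source studies, so the fitting routine is an illustration rather than the authors' procedure. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def fit_pseudo_first_order(times, concentrations):
    """Estimate k from ln(C0/Ct) = k * t by least squares through the origin.

    Assumes concentrations[0] is C0 and every concentration is positive.
    """
    c0 = concentrations[0]
    y = [math.log(c0 / c) for c in concentrations]
    numerator = sum(t * yi for t, yi in zip(times, y))
    denominator = sum(t * t for t in times)
    return numerator / denominator


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R^2 for paired observed and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical decay curve generated with k = 0.05 per minute.
times = [0, 10, 20, 30]
concentrations = [10.0 * math.exp(-0.05 * t) for t in times]
k_estimate = fit_pseudo_first_order(times, concentrations)

# Hypothetical observed vs. predicted rate constants.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

Lower RMSE and MAE and an R² closer to 1 indicate better agreement between predicted and reported rate constants, which is the sense in which the abstract ranks XGB above RF and LGB.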


Source journal

Ultrasonics Sonochemistry (Chemistry: Chemistry, Multidisciplinary)

CiteScore: 15.80

Self-citation rate: 11.90%

Articles published: 361

Review time: 59 days

Journal description: Ultrasonics Sonochemistry stands as a premier international journal dedicated to the publication of high-quality research articles primarily focusing on chemical reactions and reactors induced by ultrasonic waves, known as sonochemistry. Beyond chemical reactions, the journal also welcomes contributions related to cavitation-induced events and processing, including sonoluminescence, and the transformation of materials on chemical, physical, and biological levels. Since its inception in 1994, Ultrasonics Sonochemistry has consistently maintained a top ranking in the "Acoustics" category, reflecting its esteemed reputation in the field. The journal publishes exceptional papers covering various areas of ultrasonics and sonochemistry. Its contributions are highly regarded by both academia and industry stakeholders, demonstrating its relevance and impact in advancing research and innovation.