A Machine Learning Framework for Enhanced Assessment of Sewer System Operation under Data Constraints and Skewed Distributions

IF 7.4 Q1 ENGINEERING, ENVIRONMENTAL
Wan-Xin Yin, Yu-Qi Wang, Jia-Qiang Lv, Jia-Ji Chen, Shuai Liu, Zheng Pang, Ye Yuan, Hong-Xu Bao, Hong-Cheng Wang* and Ai-Jie Wang*
Journal: ACS ES&T Engineering, volume 5, issue 1, pages 126–136. Published 2024-09-25. DOI: 10.1021/acsestengg.4c00477. URL: https://pubs.acs.org/doi/10.1021/acsestengg.4c00477
Citation count: 0

Abstract

In the realm of sewer management, precise machine learning simulations of physicobiochemical processes during sewage transport are essential yet are hindered by skewed distributions and data constraints. To address this issue, the present study introduces an innovative algorithm, the Automatic Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise (AutoSMOGN), designed to mitigate the adverse effects of skewed data set distributions. The findings reveal that the integration of the AutoSMOGN algorithm with ML models significantly enhances the precision of gaseous H₂S concentration predictions. Of these approaches, ensemble learning models demonstrated superior accuracy in forecasting gaseous H₂S concentrations within sewer environments, achieving the highest coefficient of determination (R²) of 0.80. Furthermore, the study validates the effectiveness of the AutoSMOGN algorithm in addressing skewed distribution, as evidenced by its acceptable predictive performance on a full-sequence data set (R² of 0.52) and when applied to multiple variables, yielding R² values of 0.88 for total nitrogen and 0.66 for total organic carbon, respectively. These results underscore the potential of the AutoSMOGN algorithm to significantly contribute to the development of new control and optimization strategies, thereby enhancing the maintenance and operational efficacy of sewer systems.
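The abstract names two ideas that a small example may make more concrete: the coefficient of determination used to score the models, and over-sampling of rare target values with Gaussian noise. The sketch below is a simplified illustration only and is not the authors' AutoSMOGN algorithm, whose internals the abstract does not describe. The helper `oversample_rare_with_noise` and its parameters (`quantile`, `copies`, `noise_scale`) are hypothetical names chosen for this example.

```python
import random
import statistics


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = statistics.mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def oversample_rare_with_noise(rows, targets, quantile=0.9, copies=3,
                               noise_scale=0.05, seed=0):
    """Hypothetical helper (not AutoSMOGN): append jittered copies of
    samples whose target is at or above a quantile threshold."""
    rng = random.Random(seed)
    # Threshold separating "rare" high targets from the common bulk.
    threshold = sorted(targets)[int(quantile * (len(targets) - 1))]
    n_features = len(rows[0])
    # Per-feature and target spread, used to scale the Gaussian noise.
    feat_sd = [statistics.pstdev([r[j] for r in rows]) for j in range(n_features)]
    target_sd = statistics.pstdev(targets)

    new_rows, new_targets = list(rows), list(targets)
    for x, y in zip(rows, targets):
        if y >= threshold:
            for _ in range(copies):
                new_rows.append(
                    [v + rng.gauss(0.0, noise_scale * sd) for v, sd in zip(x, feat_sd)]
                )
                new_targets.append(y + rng.gauss(0.0, noise_scale * target_sd))
    return new_rows, new_targets


# Toy skewed data set: most targets are 0, a few are large.
rows = [[float(i)] for i in range(10)]
targets = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 10.0]
aug_rows, aug_targets = oversample_rare_with_noise(rows, targets)

# Predicting the mean everywhere gives R² = 0 by definition.
baseline = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

In this toy run the two samples at or above the threshold each receive three noisy copies, so the augmented set is less dominated by near-zero targets. Published SMOGN-style methods also interpolate between neighbouring rare samples, which this sketch leaves out.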

Source journal: ACS ES&T Engineering (Engineering, Environmental)
CiteScore: 8.50
Self-citation rate: 0.00%
Articles published: 0
Journal description: ACS ES&T Engineering publishes impactful research and review articles across all realms of environmental technology and engineering, employing a rigorous peer-review process. As a specialized journal, it aims to provide an international platform for research and innovation, inviting contributions on materials technologies, processes, data analytics, and engineering systems that can effectively manage, protect, and remediate air, water, and soil quality, as well as treat wastes and recover resources. The journal encourages research that supports informed decision-making within complex engineered systems and is grounded in mechanistic science and analytics, describing intricate environmental engineering systems. It considers papers presenting novel advancements, spanning from laboratory discovery to field-based application. However, case or demonstration studies lacking significant scientific advancements and technological innovations are not within its scope. Contributions containing experimental and/or theoretical methods, rooted in engineering principles and integrated with knowledge from other disciplines, are welcomed.