Hybrid Machine Learning Models for Discharge Coefficient Prediction in Hydrofoil-Crested Stepped Spillways

IF 12.1 | Zone 2 (Engineering and Technology) | JCR Q1: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Ehsan Afaridegan, Nosratollah Amanian, Mohammad Reza Goodarzi
Archives of Computational Methods in Engineering, vol. 32, no. 7, pp. 4413-4445. Published 2025-03-27.
DOI: 10.1007/s11831-025-10274-z
URL: https://link.springer.com/article/10.1007/s11831-025-10274-z
Citations: 0

Abstract

Accurately estimating the discharge coefficient (C_d) of spillways remains a complex challenge that is critical to hydraulic engineering. Recent advances suggest that hybrid Machine Learning (ML) models offer significant potential for improving C_d predictions. This study explores the application of four novel hybrid ML models to estimate C_d in Hydrofoil-Crested Stepped Spillways (HCSSs): Light Gradient Boosting Machine with Pelican Optimization Algorithm (LightGBM-POA), Natural Gradient Boosting with Osprey Optimization Algorithm (NGBoost-OOA), Tabular Neural Network with Moth Flame Optimization (TabNet-MFO), and Support Vector Regression with Improved Whale Optimization Algorithm (SVR-IWOA). Outlier detection was performed using the Isolation Forest algorithm, and dimensional analysis identified the hydrofoil formation index (t) and the ratio of upstream flow depth to total spillway height (y_up/P) as the most influential parameters for C_d estimation. The parameters were validated through ANOVA, while SHapley Additive exPlanations (SHAP) and Explainable Boosting Machine (EBM) quantified their contributions to C_d modeling, highlighting the dominant influence of t. The data were normalized with the StandardScaler method, and the dataset was split into training (75%; 342 records) and testing (25%; 115 records) subsets. Model performance was assessed using R², RMSE, SI, WMAPE, and sMAPE, and further evaluated using Taylor diagrams and a performance index (PI). During the training stage, NGBoost-OOA achieved the highest accuracy, followed by LightGBM-POA, TabNet-MFO, and SVR-IWOA, with centered root mean square error (E') values of 0.0057, 0.0064, 0.0067, and 0.0068, and PI scores of 165.5, 165.17, 123.25, and 123.25, respectively. In the testing stage, TabNet-MFO and SVR-IWOA outperformed the other models and tied for first place, with equal E' values of 0.0060 and PI scores of 165.34. NGBoost-OOA and LightGBM-POA ranked third and fourth, respectively. These findings demonstrate the potential of hybrid ML models to accurately predict C_d for complex hydraulic structures such as HCSSs, offering valuable insights for future engineering applications.
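
The error metrics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not give the formulas, so the definitions below are common textbook forms and should be read as assumptions (R² as the coefficient of determination, SI as RMSE divided by the mean observed value, WMAPE as total absolute error over total absolute observation, sMAPE with the |o| + |p| denominator, and E' as the centered RMS difference plotted on a Taylor diagram). The sample values are hypothetical and are not taken from the paper.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(mean([(p - o) ** 2 for o, p in zip(obs, pred)]))


def scatter_index(obs, pred):
    """SI, assumed here to be RMSE normalized by the mean observation."""
    return rmse(obs, pred) / mean(obs)


def wmape(obs, pred):
    """Weighted MAPE: total absolute error over total absolute observation."""
    abs_err = sum(abs(o - p) for o, p in zip(obs, pred))
    return abs_err / sum(abs(o) for o in obs)


def smape(obs, pred):
    """Symmetric MAPE (one common variant), as a fraction rather than percent."""
    return mean([2 * abs(p - o) / (abs(o) + abs(p)) for o, p in zip(obs, pred)])


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    o_bar = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - o_bar) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


def centered_rmse(obs, pred):
    """Centered RMS difference E': RMSE after removing each series' mean."""
    o_bar, p_bar = mean(obs), mean(pred)
    return math.sqrt(
        mean([((p - p_bar) - (o - o_bar)) ** 2 for o, p in zip(obs, pred)])
    )


# Hypothetical observed and predicted values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

for name, fn in [("R2", r_squared), ("RMSE", rmse), ("SI", scatter_index),
                 ("WMAPE", wmape), ("sMAPE", smape), ("E'", centered_rmse)]:
    print(f"{name}: {fn(observed, predicted):.4f}")
```

Because E' subtracts the mean of each series before comparing them, it is insensitive to a constant bias in the predictions, whereas RMSE is not. Under these definitions, RMSE² equals E'² plus the squared bias, which is why two models can share the same E' and still differ in RMSE.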

Source journal
CiteScore: 19.80
Self-citation rate: 4.10%
Articles published: 153
Review time: >12 weeks
Journal description: Archives of Computational Methods in Engineering serves as an active forum for disseminating research and advanced practices in computational engineering, particularly focusing on mechanics and related fields. The journal emphasizes extended state-of-the-art reviews in selected areas, a unique feature of its publication. Reviews published in the journal offer a survey of current literature and a critical exposition of topics in their full complexity.