Optimized machine learning algorithms with SHAP analysis for predicting compressive strength in high-performance concrete

Samuel Olaoluwa Abioye, Yusuf Olawale Babatunde, Oluwafikejimi Abigail Abikoye, Aisha Nene Shaibu, Bailey Jonathan Bankole
Journal: AI in civil engineering, volume 4, issue 1. Published 2025-07-01.
DOI: 10.1007/s43503-025-00061-x
Article: https://link.springer.com/article/10.1007/s43503-025-00061-x
PDF: https://link.springer.com/content/pdf/10.1007/s43503-025-00061-x.pdf

Abstract

This research examines the application of eight machine learning (ML) algorithms for predicting the compressive strength of high-performance concrete (HPC). Precise predictions are crucial for enhancing structural reliability and optimizing resource usage in construction projects. The analysis used the “Concrete Compressive Strength” dataset from UC Irvine’s publicly available ML repository. The models evaluated are Gradient Boosting Regressor (GBR), Extreme Gradient Boosting Regression (XGBoost), Random Forest (RF), Support Vector Regression (SVR), Artificial Neural Network (ANN), Multilayer Perceptron (MLP), Lasso, and k-Nearest Neighbors (KNN). To enhance performance, the data were preprocessed through cleaning, feature scaling, and normalization. Hyperparameter tuning via Grid Search (GS) with K-fold cross-validation further optimized the models. Among the models analyzed, XGBoost and GBR achieved the highest predictive accuracy, with R² values of 93.49% and 92.09% respectively, together with lower mean squared error (MSE), mean absolute error (MAE), and root mean squared error (RMSE). SHapley Additive exPlanations (SHAP) analysis identified cement content and curing age as the most significant factors affecting compressive strength. Validation against experimental data confirmed the reliability of XGBoost and GBR through consistent prediction patterns and close alignment with empirical measurements. The results establish ML as an effective approach for HPC strength prediction, offering advantages in computational efficiency and accuracy over conventional analytical methods.
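The tuning and evaluation procedure named in the abstract (grid search over hyperparameters, scored by K-fold cross-validation, then reported as MSE, MAE, RMSE, and R²) can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the data are synthetic rather than the UCI dataset, KNN stands in as the simplest of the eight models, and all function names, the neighbour grid, and the fold count are assumptions chosen for illustration.

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R^2 for paired targets and predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse),
            "r2": 1.0 - ss_res / ss_tot}


def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k nearest training rows (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return sum(train_y[i] for i in order[:k]) / k


def kfold_indices(n, n_splits, seed=0):
    """Shuffle indices 0..n-1 once and deal them into n_splits disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_splits] for i in range(n_splits)]


def cross_val_mse(X, y, k, n_splits=5):
    """Mean held-out MSE of KNN with k neighbours across the folds."""
    scores = []
    for fold in kfold_indices(len(X), n_splits):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        preds = [knn_predict(train_X, train_y, X[i], k) for i in fold]
        scores.append(regression_metrics([y[i] for i in fold], preds)["mse"])
    return sum(scores) / len(scores)


def grid_search(X, y, k_grid, n_splits=5):
    """Score every candidate k by cross-validation and keep the lowest MSE."""
    results = {k: cross_val_mse(X, y, k, n_splits) for k in k_grid}
    best_k = min(results, key=results.get)
    return best_k, results


# Synthetic stand-in data: two features already scaled to [0, 1] and a
# noisy linear target. These are NOT values from the concrete dataset.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(120)]
y = [30.0 * a + 10.0 * b + rng.gauss(0.0, 1.0) for a, b in X]

best_k, cv_results = grid_search(X, y, k_grid=[1, 3, 5, 9])
print("best k:", best_k)
for k, mse in cv_results.items():
    print(f"k={k}: mean CV MSE={mse:.3f}")
```

In this sketch every candidate is scored only on folds it was not fitted to, which is what makes the selected hyperparameter a fair estimate rather than a fit to the training rows. Note that R² is returned as a fraction, so the abstract's 93.49% corresponds to 0.9349 on this scale.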
