Towards robust predictions: Ensemble machine learning framework for entrainment fraction with uncertainty in annular flow regime
Authors: Anadi Mondal, Subash L. Sharma
Journal: International Communications in Heat and Mass Transfer, vol. 164, Article 108813 (impact factor 6.4; JCR Q1, Mechanics)
Published: 2025-03-11
DOI: 10.1016/j.icheatmasstransfer.2025.108813
URL: https://www.sciencedirect.com/science/article/pii/S0735193325002386
Citations: 0
Abstract
Entrainment fraction is the fraction of the liquid that is entrained as droplets into the gas core in gas-liquid annular flow. It directly affects heat transfer characteristics, pressure drop, and flow stability in systems operating in the annular flow regime. Given its significance in industrial processes, many experiments have been carried out, leading to various empirical and semi-empirical correlations. However, these correlations are often limited to specific operating conditions or gas-liquid combinations, making them unsuitable for robust prediction. Additionally, some correlations require many input parameters and iterative methods to predict the entrainment fraction. This paper proposes two ensemble machine learning models, Random Forest (RF) and Gradient Boosting Regression (GBR), for robust entrainment prediction in annular flow, applicable to varying operating conditions as well as gas-liquid combinations. The proposed models can learn complex relationships among five dimensionless input parameters to predict the entrainment fraction: the liquid and gas Reynolds numbers (Re_L and Re_g), the gas Weber number (We_g), the liquid-to-gas density ratio (ρ_l/ρ_g), and the liquid-to-gas viscosity ratio (μ_l/μ_g). A dataset of 1628 data points on liquid entrainment from 11 authors was used to train (1302 data points, 80 % of the data) and test (326 data points, 20 % of the data) the models. The developed models were further validated on 140 unseen data points (not used for training or testing, but within the range of the training dataset) and 153 extrapolated data points (outside the training range or flow configuration) from varying operating conditions. The performance of the two models was compared with five correlations and three other algorithms: Linear Regression (LR), K-nearest neighbors (KNN), and Support Vector Regression (SVR). Four evaluation metrics were used: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Coefficient of Determination (R²). The results indicate that the ensemble models outperformed both the empirical correlations and the other machine learning models, achieving the lowest RMSE, MAE, and MAPE values, along with the highest R², on test and unseen data points. Only 15.7 % and 20 % of unseen data points were outside the ±30 % limit for the RF and GBR models, respectively, compared to at least 22.1 % for the empirical models and 22.8 % for the other machine-learning models. Moreover, the paper reports predictions on extrapolated data points, along with uncertainty quantification and an analysis of how experimental parameters affect prediction accuracy on unseen data points.
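The four evaluation metrics and the ±30 % criterion named in the abstract can be illustrated with a short sketch. This is not the authors' code: the function names `regression_metrics` and `fraction_outside_band` are hypothetical, the sample values are invented for illustration, and the ±30 % limit is assumed here to mean a relative deviation from the measured value. MAPE and the band check both assume non-zero measured entrainment fractions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent) and R^2 for paired values.

    Assumes every measured value is non-zero (needed for MAPE) and that
    the measured values are not all identical (needed for R^2).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


def fraction_outside_band(y_true, y_pred, band=0.30):
    """Fraction of points whose prediction deviates from the measured
    value by more than `band` (relative), e.g. 0.30 for a +/-30 % limit.
    The relative-deviation reading of the limit is an assumption."""
    outside = sum(
        1 for t, p in zip(y_true, y_pred) if abs(p - t) > band * abs(t)
    )
    return outside / len(y_true)


# Invented entrainment fractions, for illustration only (not from the paper).
measured = [0.2, 0.4, 0.5, 0.8]
predicted = [0.25, 0.4, 0.3, 0.8]

metrics = regression_metrics(measured, predicted)
share_outside = fraction_outside_band(measured, predicted)
print(metrics, share_outside)
```

Under this reading, a figure such as "15.7 % outside the ±30 % limit" would correspond to `fraction_outside_band` returning roughly 0.157 on the unseen data points.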
About the journal:
International Communications in Heat and Mass Transfer serves as a world forum for the rapid dissemination of new ideas, new measurement techniques, preliminary findings of ongoing investigations, discussions, and criticisms in the field of heat and mass transfer. Two types of manuscript will be considered for publication: communications (short reports of new work or discussions of work which has already been published) and summaries (abstracts of reports, theses or manuscripts which are too long for publication in full). Together with its companion publication, International Journal of Heat and Mass Transfer, with which it shares the same Board of Editors, this journal is read by research workers and engineers throughout the world.