Jiajia Gao, Weidong Yang, Fuzhi Chen, Long Chang, Hai Lin, Yutian Feng, Gengchen Bian, Hengyi Jiang
Geoenergy Science and Engineering, Volume 257, Article 214188. Published 2025-09-13. DOI: 10.1016/j.geoen.2025.214188
https://www.sciencedirect.com/science/article/pii/S2949891025005469
Pore pressure prediction using machine learning methods and logging data considering Gaussian mixture clustering model
This work proposes a framework that combines intelligent clustering with machine learning to address the insufficient accuracy and complex operation of traditional pore pressure prediction methods. First, the Gaussian mixture clustering model (GMCM) automatically identifies normal compaction clusters, eliminating the subjective error associated with manual division. Second, based on the clustering results and the CPO algorithm, the empirical coefficient of the Eaton method is dynamically optimized to generate high-precision pore pressure samples. Finally, the training set is constructed by integrating the preferred samples with logging data. The performance of four machine learning models (LSTM, XGBoost, SVR, and CNN-BiLSTM) is systematically evaluated using ReliefF feature analysis. The empirical study demonstrates that the GMCM-CPO-Eaton method significantly enhances prediction accuracy in formations with abnormally high pressures: compared with the AC-Eaton and RT-Eaton methods, the errors are reduced by 11.7 % and 89.8 %, respectively, and the predicted pore pressure curve closely matches the measured points. In descending order of overall performance, the models rank XGBoost, CNN-BiLSTM, LSTM, and SVR. The XGBoost model is nevertheless susceptible to overfitting, which reduces its generalization ability, as evidenced by an MSE above 0.12 on the verification wells. The CNN-BiLSTM model exhibits excellent stability, with its performance least affected by variations in data quantity and feature fluctuations. Its coefficient of determination R² remains above 0.95 with large samples and above 0.8 with small samples, and it performs well in adjacent-well prediction, with MSE values of 0.1005 and 0.0971 for the two studied wells. This further indicates that the CNN-BiLSTM model generalizes well in predicting pore pressure.
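To make the second step concrete, the following is a minimal sketch of tuning the Eaton exponent against measured pressures. It is not the authors' implementation. It assumes the standard sonic-log form of Eaton's equation, Pp = OBG - (OBG - Ph) * (dt_n / dt)^n, and substitutes a plain grid search for the CPO algorithm named in the abstract. The GMCM step is not reproduced: the normal-compaction transit times are simply given as inputs. All function names and sample values are hypothetical.

```python
def eaton_sonic(obg, p_hydro, dt_normal, dt_observed, exponent=3.0):
    """Eaton's sonic equation: Pp = OBG - (OBG - Ph) * (dt_n / dt)^n.

    All pressures are gradients in the same unit (here g/cm^3 equivalent);
    dt_normal is the normal-compaction-trend transit time at that depth.
    """
    return obg - (obg - p_hydro) * (dt_normal / dt_observed) ** exponent


def fit_exponent(samples, measured, candidates):
    """Return the candidate exponent with the lowest MSE against measurements.

    A plain grid search, used here as a stand-in (assumption) for the
    CPO optimizer mentioned in the abstract.
    """
    def mse(n):
        errors = [
            (eaton_sonic(*sample, exponent=n) - target) ** 2
            for sample, target in zip(samples, measured)
        ]
        return sum(errors) / len(errors)

    return min(candidates, key=mse)


# Hypothetical samples: (OBG, hydrostatic gradient, dt_normal, dt_observed)
samples = [
    (2.30, 1.03, 80.0, 95.0),
    (2.32, 1.03, 78.0, 100.0),
    (2.35, 1.03, 76.0, 104.0),
]
# Synthetic "measured" pressures generated with a known exponent of 2.5,
# so the search has a recoverable ground truth.
measured = [eaton_sonic(*s, exponent=2.5) for s in samples]

candidates = [k / 10 for k in range(10, 41)]  # 1.0, 1.1, ..., 4.0
best_n = fit_exponent(samples, measured, candidates)
print(best_n)  # → 2.5
```

In a workflow like the one the abstract describes, the normal-compaction transit times would come from the clusters that the clustering step labels as normally compacted, and the fitted exponent would then be used to generate pore pressure samples for model training.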