Unbalanced graph isomorphism network for fracture identification by well logs
Ning Ma, Shaoqun Dong, Lexiu Wang, Leting Wang, Xu Yang, Shuo Liu
Expert Systems with Applications, vol. 263, Article 125794. Published 2024-11-17. DOI: 10.1016/j.eswa.2024.125794
Impact factor: 7.5. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S0957417424026617
Citations: 0
Abstract
Fracture identification and prediction are important for production from tight oil and gas reservoirs. The high angles of fractures limit their traceability and reduce the chance that a well intersects them, so fracture samples are scarce and fracture identification becomes an imbalanced classification problem. Lithology and fluid properties can create similar features in fracture samples, often resulting in nonlinear relationships and a non-Euclidean structure, which makes fracture identification a nonlinear problem as well. To address these issues, the unbalanced graph isomorphism network (UGIN) algorithm is introduced. UGIN builds on the graph isomorphism network (GIN) and incorporates a binary cross-entropy loss function designed for unbalanced samples: it assigns higher penalties to misclassified fracture samples, shifting the model’s focus toward the minority class and improving detection accuracy on imbalanced datasets. The identification process has three stages. First, the logging similarity information of the samples is integrated into the graph structure using the sequence edge method. Second, node-level information is embedded via the GIN algorithm, and nodes are clustered using K-means to derive the local graph’s embedding representation. Finally, nodes are classified using the model. To validate the UGIN algorithm, a dataset from fractured carbonate reservoirs in A Oilfield, in the Zagros Mountain fold belt, is used. Cross-validation shows robust generalization on both training and test datasets, with an AUC score of 0.938, higher than that of the baseline model. The classification accuracy on test data reaches 96.7%, with particularly strong performance in identifying fracture samples.
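The abstract does not give the exact form of the loss, so the following is only a minimal sketch of one common way to penalize misclassified minority samples more heavily: a class-weighted binary cross-entropy. The function name `weighted_bce`, the parameter `pos_weight`, and the choice of the negative-to-positive ratio as the weight are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def weighted_bce(y_true, p_pred, pos_weight, eps=1e-12):
    """Mean binary cross-entropy with an extra weight on the positive class.

    Assumption: label 1 marks a fracture sample (the minority class).
    """
    y = np.asarray(y_true, dtype=float)
    # Clip probabilities so that log() never sees exactly 0 or 1.
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1.0 - eps)
    per_sample = -(pos_weight * y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return float(per_sample.mean())

# Toy data: four non-fracture samples and one poorly predicted fracture sample.
y = np.array([0, 0, 0, 0, 1])
p = np.array([0.1, 0.2, 0.1, 0.3, 0.4])

# Hypothetical weighting choice: ratio of negatives to positives.
pos_weight = (y == 0).sum() / (y == 1).sum()

plain = weighted_bce(y, p, pos_weight=1.0)
weighted = weighted_bce(y, p, pos_weight=pos_weight)
print(f"plain BCE: {plain:.4f}, weighted BCE: {weighted:.4f}")
```

With `pos_weight` above 1, the missed fracture sample contributes more to the loss than it does under plain cross-entropy, which is the effect the abstract describes.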
To evaluate the impact of different graph construction methods on UGIN’s performance, we compare the K-means clustering method, the hierarchical clustering method, the comprehensive connectivity method, the enhanced linkage strategy, and the sequence edge method. Results indicate that the sequence edge method performs best, maximizing the retention of depth-related information in logging features and enhancing sample embedding.
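The abstract does not define the sequence edge method. As a rough sketch only, the code below assumes it links each logging sample to its depth-adjacent neighbours, which would be one way to retain depth-related information, and then applies a single GIN-style update, MLP((1 + eps) * h_v + sum of neighbour features). The helper names `sequence_edges`, `adjacency`, and `gin_layer`, the `window` parameter, and the random weights are all hypothetical.

```python
import numpy as np

def sequence_edges(n_samples, window=1):
    """Edges linking each depth-ordered sample to its next `window` samples."""
    edges = []
    for i in range(n_samples):
        for k in range(1, window + 1):
            if i + k < n_samples:
                edges.append((i, i + k))
    return edges

def adjacency(n_samples, edges):
    """Symmetric 0/1 adjacency matrix for an undirected graph."""
    a = np.zeros((n_samples, n_samples))
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return a

def gin_layer(h, a, w1, w2, eps=0.0):
    """One GIN update with a two-layer MLP (ReLU in between, no biases)."""
    agg = (1.0 + eps) * h + a @ h       # sum aggregation over neighbours
    hidden = np.maximum(agg @ w1, 0.0)  # ReLU
    return hidden @ w2

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))   # 5 depth-ordered samples, 3 log curves (toy data)
edges = sequence_edges(5)
a = adjacency(5, edges)
w1 = rng.normal(size=(3, 8))  # untrained, random weights for illustration
w2 = rng.normal(size=(8, 4))
emb = gin_layer(x, a, w1, w2)
print(edges, emb.shape)
```

In this sketch each node only aggregates features from samples next to it in depth, whereas a clustering-based graph would connect samples by feature similarity regardless of where they sit in the well.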
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.