Synthetic Data Generation Using Generative Adversarial Network (GAN) for Burst Failure Risk Analysis of Oil and Gas Pipelines

IF 2.3 | JCR Q2 | ENGINEERING, MULTIDISCIPLINARY

ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems Part B-Mechanical Engineering | Pub Date: 2023-06-15 | DOI: 10.1115/1.4062741

R. K. Mazumder, Gourav Modanwal, Yue Li


Citations: 0

Abstract


Although pipelines are the safest mode of oil and gas transportation, the pipeline failure rate has increased significantly over the last decade, particularly for aging pipelines. Predicting failure risk and prioritizing the riskiest assets in a large set of pipelines is a demanding task for utilities. Machine Learning (ML) has recently shown promising results in pipeline failure risk prediction. However, because of safety and security concerns, obtaining enough operational and failure data to train ML models accurately is a significant challenge. To overcome the limited access to operational data, this study employed a Generative Adversarial Network (GAN) based framework to generate synthetic pipeline data (DSyn, N = 100) from a 70% subset of experimental burst test data (DExp, N = 92) compiled from the literature. The proposed framework was tested on (1) real data and (2) combined real and generated synthetic data. The burst failure risk of corroded oil and gas pipelines was determined using probabilistic approaches, and pipelines were classified into two classes: (1) low risk (pf: 0-0.5) and (2) high risk (pf: >0.5). Two Random Forest (RF) models were trained: MExp on the 70% subset of the experimental data (DExp, N = 64) and MComb on the combined data (DExp + DSyn, N = 164). Both models were validated on the remaining 30% of the experimental data (N = 28). The validation results indicate that adding synthetic data can further improve the performance of the ML models: the area under the ROC curve was 0.96 for the model trained on real data (MExp) and 0.99 for the model trained on combined data (MComb). The improved combined model can be used in strategic planning for oil and gas pipeline resilience, which sets long-term critical decisions regarding maintenance and potential replacement of pipes.
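The bookkeeping behind the numbers in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`label_risk`, `split_indices`, `roc_auc`), the random shuffling, and the seed are hypothetical, and the abstract does not say how the 70/30 split was drawn. The sketch shows the two-class labelling at the pf threshold of 0.5, a 70/30 split of the 92 experimental records, and the area under the ROC curve computed through its standard rank interpretation (the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half).

```python
import random


def label_risk(pf, threshold=0.5):
    """Return 1 (high risk) if pf > threshold, else 0 (low risk)."""
    return 1 if pf > threshold else 0


def split_indices(n, train_fraction=0.7, seed=0):
    """Shuffle the indices 0..n-1 and split them into train and validation lists.

    The shuffle and the seed are assumptions made for this sketch.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * train_fraction)  # 92 * 0.7 = 64.4, truncated to 64
    return indices[:n_train], indices[n_train:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    sample has the higher score; a tie contributes 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Split sizes matching the abstract: 64 training and 28 validation records.
train_idx, val_idx = split_indices(92)
n_synthetic = 100  # size of DSyn reported in the abstract
print(len(train_idx), len(val_idx), len(train_idx) + n_synthetic)  # 64 28 164
```

In this sketch, a model such as MExp or MComb would supply the `scores` (for example, the predicted probability of the high-risk class) for the 28 validation records, and `roc_auc` would be called once per model on the same validation labels so that the two values are directly comparable.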


Source journal

ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems Part B-Mechanical Engineering ENGINEERING, MULTIDISCIPLINARY-

CiteScore

5.20

Self-citation rate

13.60%

Articles published