A robust hybrid predictive model of mixed oil length with deep integration of mechanism and data

IF 4.8 Q2 ENERGY & FUELS

Journal of Pipeline Science and Engineering Pub Date : 2021-12-01 DOI:10.1016/j.jpse.2021.12.002

Ziyun Yuan, Lei Chen, Weiming Shao, Zhiheng Zuo, Wan Zhang, Gang Liu

Volume 1, Issue 4, Pages 459-467

Citations: 8

Abstract

Accurate estimation of mixed oil length is essential in multi-product pipelines because it guides the operator in handling the mixed oil segment correctly and reduces the loss of petroleum product quality. In a previous study, a hybrid model combining a machine learning algorithm with the existing mechanism was developed and achieved good predictive accuracy. However, because of incorrect measurement and improper recording, outliers are widespread in industrial datasets and may severely degrade the predictive performance of that model, yet the effect of outliers on predictive models for mixed oil length has rarely been discussed. To address these issues, this paper first proposes a way to define an outlier sample and explicitly studies its impact on the performance of the predictive model for mixed oil length. Subsequently, several new hybrid modeling methods are developed, driven in different arrangements by both operation data (exploited by the Gradient Boosting Decision Tree algorithm) and the mechanism (based on the Austin-Palfrey equation). Extensive experiments are conducted on real-life transportation pipelines, and the results show that, with the clean training set, the R² index of the proposed serial-parallel hybrid model (SPHM) is 0.96, which is higher than that of the mechanism model and the existing hybrid model. Even with all the outliers added, the SPHM retains its advantage in prediction accuracy, demonstrating the feasibility and robustness of the hybrid modeling approach for predicting mixed oil length.
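As a rough illustration of the ingredients named in the abstract, the sketch below computes a mechanism estimate with the turbulent-regime form of the Austin-Palfrey correlation, C = 11.75 (dL)^0.5 Re^-0.1, and scores predictions with R². It is not the paper's SPHM: the abstract does not describe how the serial and parallel parts are wired together, so the data-driven step here is a deliberately simple least-squares scale factor standing in for the Gradient Boosting Decision Tree. The function names, the omission of the low-Reynolds-number branch, and the choice of correction are all assumptions made for illustration.

```python
import math


def austin_palfrey_length(d, length, reynolds):
    """Mechanism estimate of mixed oil length (turbulent-regime form only).

    d and length must share one length unit (e.g. metres); the result is in
    that same unit. The branch for Reynolds numbers below the critical value
    is intentionally omitted in this sketch.
    """
    return 11.75 * math.sqrt(d * length) * reynolds ** -0.1


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_scale_correction(mechanism_pred, observed):
    """Least-squares scale k minimising sum((observed - k * mechanism)^2).

    Hypothetical stand-in for the data-driven part of a hybrid model;
    the paper uses a Gradient Boosting Decision Tree instead.
    """
    num = sum(o * m for o, m in zip(observed, mechanism_pred))
    den = sum(m * m for m in mechanism_pred)
    return num / den


def hybrid_predict(mechanism_pred, scale):
    """Apply the fitted correction to the mechanism estimates."""
    return [scale * m for m in mechanism_pred]
```

In use, one would compute `austin_palfrey_length` for each batch, fit the correction on training data, and compare `r_squared` for the mechanism-only and corrected predictions. A more faithful version would replace `fit_scale_correction` with a tree-ensemble regressor trained on operation data.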

Source journal

Journal of Pipeline Science and Engineering

CiteScore

7.50

Self-citation rate

0.00%

Articles published