A robust hybrid predictive model of mixed oil length with deep integration of mechanism and data
Ziyun Yuan, Lei Chen, Weiming Shao, Zhiheng Zuo, Wan Zhang, Gang Liu
Journal of Pipeline Science and Engineering, vol. 1, no. 4, pp. 459-467, published 2021-12-01
DOI: 10.1016/j.jpse.2021.12.002
https://www.sciencedirect.com/science/article/pii/S2667143321000779
Citations: 8
Abstract
Accurate estimation of mixed oil length is essential in multi-product pipelines because it guides the operator in handling the mixed oil segment correctly and reduces the loss of petroleum product quality. In a previous study, a hybrid model combining a machine learning algorithm with an existing mechanism model was developed and showed good predictive accuracy. However, because of incorrect measurement and improper recording, outliers are widespread in industrial datasets and can severely degrade the predictive performance of that model, yet the effect of outliers on predictive models for mixed oil length is rarely discussed. To address these issues, this paper first proposes a way to define an outlier sample and explicitly studies its impact on the performance of predictive models for mixed oil length. Subsequently, several new hybrid modeling methods are developed that combine operation data (exploited by the Gradient Boosting Decision Tree algorithm) with the mechanism (based on the Austin-Palfrey equation) in different arrangements. Extensive experiments are conducted on real-life transportation pipelines. The results show that, with the clean training set, the R² index of the proposed serial-parallel hybrid model (SPHM) is 0.96, which is higher than that of the mechanism model and the existing hybrid model. Even with all the outliers added, the SPHM retains its advantage in prediction accuracy, demonstrating the feasibility and robustness of the hybrid modeling approach for predicting mixed oil length.
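
The idea of combining a mechanism equation with gradient boosting can be illustrated with a minimal sketch. It is not the paper's SPHM, whose arrangement the abstract does not specify. It shows one simple hybrid arrangement in which boosted regression stumps learn the residual left by the mechanism estimate, and both models are scored with R². Several things here are assumptions: the mechanism function uses the commonly quoted turbulent-regime form of the Austin-Palfrey equation, C = 11.75 d^0.5 L^0.5 Re^-0.1, which should be checked against the paper; the data are synthetic, with an arbitrary 25% bias; and the small stump-based booster stands in for a full Gradient Boosting Decision Tree library. All function names are hypothetical.

```python
import math
import random


def mechanism_length(d, L, Re):
    """Mechanism estimate of mixed oil length in metres.

    ASSUMPTION: turbulent-regime Austin-Palfrey form,
    C = 11.75 * d**0.5 * L**0.5 * Re**-0.1, with d and L in metres.
    """
    return 11.75 * math.sqrt(d * L) * Re ** -0.1


def fit_stump(X, r):
    """Return (feature, threshold, left_mean, right_mean) minimising squared error on r."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for t in values[:-1]:  # split "<= t" vs "> t"; both sides are non-empty
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]


def fit_gbdt(X, y, n_rounds=40, lr=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        j, t, ml, mr = fit_stump(X, resid)
        stumps.append((j, t, ml, mr))
        pred = [p + lr * (ml if row[j] <= t else mr) for p, row in zip(pred, X)]
    return base, lr, stumps


def predict_gbdt(model, X):
    base, lr, stumps = model
    return [base + lr * sum(ml if row[j] <= t else mr for j, t, ml, mr in stumps)
            for row in X]


def r2_score(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def make_synthetic(n, rng):
    """SYNTHETIC data: the 'true' length is the mechanism value with a 25% bias plus noise."""
    X, y = [], []
    for _ in range(n):
        d = rng.uniform(0.3, 0.7)      # pipe diameter, m
        L = rng.uniform(50e3, 300e3)   # pipeline length, m
        Re = rng.uniform(1e5, 1e6)     # Reynolds number
        X.append([d, L, Re])
        y.append(1.25 * mechanism_length(d, L, Re) + rng.gauss(0.0, 20.0))
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic(60, rng)
X_test, y_test = make_synthetic(30, rng)

# Mechanism first, then the booster learns what the mechanism misses.
mech_train = [mechanism_length(*row) for row in X_train]
residuals = [yi - mi for yi, mi in zip(y_train, mech_train)]
correction_model = fit_gbdt(X_train, residuals)


def hybrid_predict(X):
    mech = [mechanism_length(*row) for row in X]
    return [m + c for m, c in zip(mech, predict_gbdt(correction_model, X))]


r2_mech_train = r2_score(y_train, mech_train)
r2_hybrid_train = r2_score(y_train, hybrid_predict(X_train))
r2_mech_test = r2_score(y_test, [mechanism_length(*row) for row in X_test])
r2_hybrid_test = r2_score(y_test, hybrid_predict(X_test))

print("train R2  mechanism:", round(r2_mech_train, 3), " hybrid:", round(r2_hybrid_train, 3))
print("test  R2  mechanism:", round(r2_mech_test, 3), " hybrid:", round(r2_hybrid_test, 3))
```

Fitting the booster to the mechanism residual rather than to the raw length keeps the physical trend in the prediction and lets the data-driven part concentrate on the systematic error. A robustness study like the one described above could reuse the same scaffold by corrupting some training targets and comparing the held-out R² values.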