A systematic data-driven modelling framework for nonlinear distillation processes incorporating data intervals clustering and new integrated learning algorithm

IF 3.7 · Zone 3, Engineering and Technology · JCR Q2, ENGINEERING, CHEMICAL
Zhe Wang, Renchu He, Jian Long
Chinese Journal of Chemical Engineering, Volume 81, Pages 182-199. Journal Article. Published 2025-05-01. DOI: 10.1016/j.cjche.2025.02.013. URL: https://www.sciencedirect.com/science/article/pii/S1004954125000953
Citation count: 0

Abstract

Distillation is an important chemical process, and a data-driven modelling approach has the potential to reduce model complexity compared with mechanistic modelling, thus improving the efficiency of process optimization or monitoring studies. However, the distillation process is highly nonlinear and contains multiple intervals of uncertain perturbation, which makes accurate data-driven modelling challenging. This paper proposes a systematic data-driven modelling framework to address these problems. First, data segment variance is introduced into the K-means algorithm to form K-means data interval (KMDI) clustering, which clusters the data into perturbed and steady-state intervals so that steady-state data can be extracted. Second, the maximal information coefficient (MIC) is used to quantify the nonlinear correlation between variables so that redundant features can be removed. Finally, extreme gradient boosting (XGBoost) is integrated as the base learner into adaptive boosting (AdaBoost), with an error threshold (ET) set to improve the weight update strategy, which yields the new integrated learning algorithm, XGBoost-AdaBoost-ET. The superiority of the proposed framework is verified by applying it to a real industrial propylene distillation process.
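
The first step, separating steady-state from perturbed intervals by clustering on data segment variance, can be illustrated with a minimal sketch. This is not the paper's KMDI algorithm, whose details the abstract does not give. It assumes fixed-length, non-overlapping windows, a single process variable, and a plain two-cluster K-means on the per-window variances. The function names, the window length, and the synthetic signal are all hypothetical.

```python
import numpy as np


def segment_variances(signal, window):
    """Variance of each non-overlapping window (assumed segmentation scheme)."""
    data = np.asarray(signal, dtype=float)
    n_segments = len(data) // window
    # Drop the incomplete tail so the signal reshapes into whole windows.
    trimmed = data[: n_segments * window]
    return trimmed.reshape(n_segments, window).var(axis=1)


def two_means_1d(values, n_iter=50):
    """Plain 1-D K-means with k=2. Returns True for the high-value cluster."""
    lo, hi = values.min(), values.max()
    high = np.zeros(len(values), dtype=bool)
    for _ in range(n_iter):
        # Assign each value to the nearer of the two centroids.
        high = np.abs(values - hi) < np.abs(values - lo)
        if high.all() or not high.any():
            break  # degenerate case: only one cluster is populated
        new_lo, new_hi = values[~high].mean(), values[high].mean()
        if new_lo == lo and new_hi == hi:
            break  # centroids stopped moving
        lo, hi = new_lo, new_hi
    return high


def steady_segments(signal, window=20):
    """Indices of windows whose variance falls in the low-variance cluster."""
    variances = segment_variances(signal, window)
    perturbed = two_means_1d(variances)
    return np.flatnonzero(~perturbed), variances


# Synthetic example: steady operation, a ramp disturbance, then a new steady state.
rng = np.random.default_rng(0)
levels = np.concatenate([np.zeros(100), np.linspace(0.0, 5.0, 40), np.full(60, 5.0)])
signal = levels + rng.normal(scale=0.01, size=levels.size)

steady_idx, variances = steady_segments(signal, window=20)
print("steady windows:", steady_idx.tolist())
```

In this sketch the windows covering the ramp have a much larger variance than the rest, so they land in the high-variance cluster and are excluded. A real application would need a principled choice of window length and would have to handle several measured variables at once.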


Source journal: Chinese Journal of Chemical Engineering (Engineering Technology, Engineering: Chemical)
CiteScore: 6.60
Self-citation rate: 5.30%
Publication volume: 4309
Review time: 31 days
Journal description: The Chinese Journal of Chemical Engineering (monthly, started in 1982) is the official journal of the Chemical Industry and Engineering Society of China and is published by the Chemical Industry Press Co. Ltd. The aim of the journal is to develop the international exchange of scientific and technical information in the field of chemical engineering. It publishes original research papers that cover the major advancements and achievements in chemical engineering in China, as well as some articles from overseas contributors. The journal's topics include chemical engineering, chemical technology, biochemical engineering, energy and environmental engineering, and other relevant fields. Papers are published on the basis of their relevance to theoretical research, practical application, or potential uses in industry, as Research Papers, Communications, Reviews, and Perspectives. Prominent domestic and overseas chemical experts and scholars have been invited to form an International Advisory Board and the Editorial Committee. The journal is recognized among Chinese academia and industry as a reliable source of information on chemical engineering research, both at home and abroad.