An integrated framework combining CenFormer and PLS regression for rapid distillate oil classification and property prediction

Impact factor 3.8 · Zone 2 (Chemistry) · JCR Q2 (Automation & Control Systems)
Yifan Wang, Xisong Chen, Lei Jiang, Yunyun Hu
Chemometrics and Intelligent Laboratory Systems, Volume 267, Article 105530. Journal article, published 2025-09-05. DOI: 10.1016/j.chemolab.2025.105530
https://www.sciencedirect.com/science/article/pii/S0169743925002151

Abstract

Rapid and accurate classification and property prediction of distillate oil are essential for intelligent quality control and process optimization in modern refineries. Traditional methods, such as spectral analysis with chemometrics, are widely applied but depend heavily on manual feature engineering and offer limited representational capacity. Recent advances in deep learning have shown promise for oil analysis, yet existing models often struggle to jointly capture fine-grained local patterns and long-range spectral dependencies, and they rarely optimize feature space geometry. To address these challenges, an integrated framework is proposed that combines spectral preprocessing, a dual-branch CenFormer model, a joint loss function, and dynamic property prediction. Spectral preprocessing sharpens spectral features through baseline correction, spectral truncation, and vector normalization. The CenFormer model uses a CNN-Transformer dual-branch architecture to capture fine-grained local patterns and long-range spectral dependencies simultaneously. A joint loss function, combining softmax loss and center loss, enforces intra-class compactness and inter-class separability, thereby improving feature discriminability. For property prediction, a similarity-based sample selection strategy is applied, followed by PLS regression, to enable adaptive modeling of physicochemical attributes. Experimental results demonstrate the effectiveness of the framework: it achieves a classification accuracy of 99.51%, low RMSEs and rRMSEs, and high R² in property prediction, highlighting its potential for rapid and reliable spectral analysis in industrial applications.
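
The joint loss mentioned in the abstract can be illustrated with a small sketch. It assumes the commonly used center-loss formulation (half the squared Euclidean distance between a feature vector and its class center) added to the softmax cross-entropy with a weight `lam`. The abstract does not state the paper's weighting or how the class centers are updated, so `lam`, the fixed `centers`, and all function names here are hypothetical.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample under a softmax over its class logits."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def joint_loss(logits_batch, features_batch, labels, centers, lam=0.01):
    """Batch mean of softmax loss plus lam times center loss (lam is an assumed weight)."""
    n = len(labels)
    ce = sum(softmax_cross_entropy(z, y) for z, y in zip(logits_batch, labels)) / n
    cl = sum(center_loss(x, centers[y]) for x, y in zip(features_batch, labels)) / n
    return ce + lam * cl

# Toy batch: two classes, two-dimensional features, fixed (not learned) centers.
centers = {0: [0.0, 0.0], 1: [1.0, 1.0]}
logits = [[2.0, 0.0], [0.0, 2.0]]
tight = joint_loss(logits, [[0.0, 0.0], [1.0, 1.0]], [0, 1], centers)
loose = joint_loss(logits, [[0.5, 0.5], [0.5, 0.5]], [0, 1], centers)
```

With identical logits, `loose` exceeds `tight` because its features sit farther from their class centers. That extra penalty is what pushes same-class features together during training, while the softmax term keeps the classes separable.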


Source journal
CiteScore: 7.50
Self-citation rate: 7.70%
Articles published: 169
Review time: 3.4 months
Journal description: Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on the development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data. The journal deals with the following topics:
1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, System Biology, -Omics, etc.).
2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered.
3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.
4) Well characterized data sets to test performance for the new methods and software.
The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.