Improved multivariate sensor delay estimation using a hierarchical clustering-based approach

Impact factor 3.7; Zone 2 (Chemistry); JCR Q2 (Automation & Control Systems)
Bente M. van Son, Tim Offermans, Carlo G. Bertinetto, Jeroen J. Jansen
Journal: Chemometrics and Intelligent Laboratory Systems, volume 257, Article 105306
Published: 2024-12-09
Publication type: Journal Article
DOI: 10.1016/j.chemolab.2024.105306
URL: https://www.sciencedirect.com/science/article/pii/S0169743924002466
Citations: 0

Abstract

An often overlooked challenge in multivariate statistical modelling of industrial data is the presence of time delays caused by residence time in the process, which leads to event misalignment. To perform accurate data analysis, time delays must be estimated and corrected in a dedicated preprocessing step. Despite the multivariate nature of process data, most existing statistical Time Delay Estimation (TDE) methods only consider bivariate correlations. This study hypothesized that multivariate TDE methods would outperform bivariate methods, particularly with a large number of sensors. To test this, we selected data subsets with varying numbers of sensors using correlation-based hierarchical clustering and applied different TDE methods. Results showed that two multivariate methods, PLS-CON-LOAD and PLS-SEQ, outperformed the bivariate methods, exhibiting lower errors in the time delay estimation and less sensitivity to the number of sensors. Additionally, we proposed an enhancement to the TDE methods by embedding a clustering step to determine the order in which time delays should be estimated. This approach reduced TDE errors for all methods when the number of sensors was high. We recommend the newly proposed clustering-based PLS-CON-LOAD method for low-error time delay estimation, which enhances the predictive value of, and the insights obtainable from, industrial data analysis.
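The bivariate, correlation-based kind of delay estimation that the abstract describes as the common baseline can be illustrated with a minimal sketch. This is not the authors' implementation and it is not one of the PLS-based methods; it only assumes that the delay between two sensors is taken as the shift that maximizes the absolute Pearson correlation. The function names `pearson` and `estimate_delay`, the `max_lag` parameter, and the synthetic random-walk data are all hypothetical and chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def estimate_delay(upstream, downstream, max_lag):
    """Return the lag (in samples) with the largest absolute correlation.

    A lag of d pairs upstream[t] with downstream[t + d], i.e. it assumes
    the downstream sensor sees each event d samples later.
    """
    best_lag, best_score = 0, -1.0
    for lag in range(max_lag + 1):
        n = len(upstream) - lag  # overlap length after shifting
        score = abs(pearson(upstream[:n], downstream[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic data (assumption): a random-walk "upstream" sensor and a noisy
# "downstream" copy that lags it by a known number of samples.
rng = random.Random(0)
signal = [0.0]
for _ in range(499):
    signal.append(signal[-1] + rng.gauss(0, 1))

true_delay = 7
delayed = [signal[max(t - true_delay, 0)] + rng.gauss(0, 0.1)
           for t in range(len(signal))]

estimated = estimate_delay(signal, delayed, max_lag=30)
print("true delay:", true_delay, "estimated delay:", estimated)
```

Because this sketch treats one sensor pair at a time, it ignores the information carried by all other sensors, which is the limitation of bivariate methods that the abstract points out.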
Source journal: Chemometrics and Intelligent Laboratory Systems
CiteScore: 7.50
Self-citation rate: 7.70%
Articles published: 169
Review time: 3.4 months
About the journal: Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on the development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines. Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data. The journal deals with the following topics:
1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, Systems Biology, -Omics, etc.).
2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration, etc.). Routine applications of established chemometrical techniques will not be considered.
3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.
4) Well characterized data sets to test performance for the new methods and software.
The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for manuscripts.