Enhanced process monitoring using machine learning-based control charts for Poisson-distributed data

IF 7.5 | Zone 2, Computer Science | JCR Q1, AUTOMATION & CONTROL SYSTEMS

Engineering Applications of Artificial Intelligence Pub Date : 2025-06-11 DOI:10.1016/j.engappai.2025.111227

Faraz Mukhtiar, Babar Zaman, Naveed Razzaq Butt

Volume 157, Article 111227

https://www.sciencedirect.com/science/article/pii/S095219762501228X

Citations: 0

Abstract

The ability to detect shifts (e.g., outliers) in process monitoring is crucial for maintaining high quality standards and operational efficiency in industrial environments. Control Charts (CCs) provide an organized framework for recognizing and managing anomalies, which are generally caused by assignable factors (e.g., identifiable issues) rather than inherent process variation. Traditional CCs, such as the classical Shewhart, cumulative sum (CUSUM), and exponentially weighted moving average (EWMA) charts, are commonly used to monitor Poisson observations in modern industries. The classical EWMA and CUSUM CCs are individually effective at detecting small-to-moderate shifts, while Shewhart CCs identify moderate-to-large shifts in the process location and/or dispersion parameters. However, the classical CCs are limited because each is sensitive only to a specific range of shifts. To detect shifts of all sizes in the process location parameter, this study proposes integrating Machine Learning (ML) techniques into CCs so that shift detection is optimized across a wider range. The study generates a dataset from the statistics of classical CCs computed on Poisson-distributed data, covering both in-control (stable) and out-of-control (unstable) processes. Before training, the dataset is pre-processed through normalization and heuristic feature engineering, and it is then used to train the ML models. The performance of the ML models is evaluated using standard regression metrics, specifically mean squared error and the coefficient of determination (R²-score), to ensure effective generalization across varying process conditions. After training, the models are implemented within the proposed ML-based CC (MLCC) schemes. Their process monitoring capabilities are then rigorously compared with those of traditional and existing CCs, using key performance indicators such as the average run length and the standard deviation of the run length. These metrics are computed with a Python-based algorithm developed using the Monte Carlo simulation method. As a practical illustration, the proposed MLCC schemes are applied to real-life data from the food processing industry, specifically the packaging of frozen orange juice concentrate. This example highlights the superiority of the proposed MLCC schemes over classical CCs in the early detection of shift(s) in the process location parameter(s) in real-life scenarios.
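The abstract states that the average run length (ARL) and the standard deviation of the run length (SDRL) are computed by Monte Carlo simulation in Python. The following is a minimal sketch of that idea for a textbook two-sided Poisson EWMA chart only; it is not the authors' algorithm and does not include the MLCC schemes. The smoothing weight, the limit multiplier `L`, the Poisson means, and all function names are illustrative assumptions rather than values from the paper.

```python
import numpy as np


def ewma_run_length(rng, lam0, lam1, weight, L, max_steps=10_000):
    """Simulate one run and return the step at which the chart signals.

    lam0 is the in-control Poisson mean used to set the limits;
    lam1 is the mean the data are actually drawn from.
    """
    # Asymptotic EWMA limits: lam0 +/- L * sqrt(weight / (2 - weight) * lam0)
    half_width = L * np.sqrt(weight / (2.0 - weight) * lam0)
    lcl = max(lam0 - half_width, 0.0)
    ucl = lam0 + half_width

    z = lam0  # EWMA statistic starts at the in-control mean
    for t in range(1, max_steps + 1):
        x = rng.poisson(lam1)
        z = weight * x + (1.0 - weight) * z
        if z > ucl or z < lcl:
            return t
    # No signal within max_steps: the run is censored at max_steps
    return max_steps


def arl_sdrl(lam0, lam1, weight=0.2, L=2.7, n_runs=2000, seed=0):
    """Monte Carlo estimates of ARL and SDRL (weight and L are assumed values)."""
    rng = np.random.default_rng(seed)
    run_lengths = np.array(
        [ewma_run_length(rng, lam0, lam1, weight, L) for _ in range(n_runs)]
    )
    return run_lengths.mean(), run_lengths.std(ddof=1)


if __name__ == "__main__":
    # Hypothetical means: in-control 4.0, shifted to 6.0
    print("in-control ARL, SDRL:", arl_sdrl(4.0, 4.0))
    print("shifted    ARL, SDRL:", arl_sdrl(4.0, 6.0))
```

Calling `arl_sdrl(lam0, lam0)` estimates the in-control ARL, while `arl_sdrl(lam0, lam1)` with `lam1` different from `lam0` estimates the out-of-control ARL for that shift. A chart that detects shifts well keeps the first large and the second small.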


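Source journal

Engineering Applications of Artificial Intelligence (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.60
Self-citation rate: 10.00%
Articles published: 505
Review time: 68 days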

About the journal: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.