Towards efficient data-driven fault diagnosis under low-budget scenarios via hybrid deep active learning
Gyeongho Kim, Jae Gyeong Choi, Sujin Jeon, Soyeon Park, Sunghoon Lim
Reliability Engineering & System Safety, volume 266, Article 111637. Published 2025-09-04. Impact factor: 11.0 (JCR Q1, Engineering, Industrial).
DOI: 10.1016/j.ress.2025.111637
https://www.sciencedirect.com/science/article/pii/S0951832025008373
Citations: 0
Abstract
Accurate fault diagnosis using deep learning (DL) has become essential for effective quality control, maintenance, and process automation in industrial processes. However, constructing the large-scale labeled datasets needed to train DL-based predictive models entails considerable cost and labor, so an efficient labeling strategy is required. Active learning (AL) has been a prominent solution for efficient data labeling in fault diagnosis, but existing AL approaches are unsuitable in low-budget scenarios, where there is too little labeled data to train the model stably. This work therefore proposes a novel method, hybrid deep active learning for low-budget scenarios (HDAL-LB), that addresses the challenges of the label-scarce regime to perform efficient fault diagnosis. First, self-supervised learning is performed with a deep stacked residual variational auto-encoder to efficiently initialize an encoder for latent feature extraction. Second, an evidential learning-based training technique is developed to generate calibrated predictive uncertainty cost-efficiently. Third, hybrid query selection is systematically formulated within a combinatorial optimization framework, using both uncertainty and data diversity for deep AL. The efficacy of HDAL-LB in fault diagnosis is validated through four case studies using three public benchmark datasets and one private real-world dataset. The experimental results demonstrate that HDAL-LB outperforms existing baseline and state-of-the-art (SOTA) AL methods under low-budget scenarios. Furthermore, extensive ablation studies demonstrate that HDAL-LB consistently diagnoses faults effectively across various experimental settings, highlighting its label efficiency and practical applicability.
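To make the idea of combining uncertainty with data diversity more concrete, the following is a minimal illustrative sketch, not the authors' HDAL-LB formulation. It assumes a classifier that outputs non-negative per-class evidence, uses the standard Dirichlet-based uncertainty from evidential deep learning (number of classes divided by total Dirichlet strength), and replaces the paper's combinatorial optimization with a simple greedy heuristic. The function names `dirichlet_uncertainty` and `hybrid_query` and the `trade_off` parameter are hypothetical.

```python
import numpy as np


def dirichlet_uncertainty(evidence: np.ndarray) -> np.ndarray:
    """Uncertainty u = K / sum(alpha) with alpha = evidence + 1 (values in (0, 1])."""
    alpha = evidence + 1.0
    num_classes = alpha.shape[1]
    return num_classes / alpha.sum(axis=1)


def hybrid_query(features, uncertainty, budget, trade_off=0.5):
    """Greedily pick up to `budget` unlabeled samples.

    Each candidate is scored by a weighted sum of its uncertainty and its
    distance to the nearest already-selected sample (a simple diversity
    proxy). This greedy rule is an assumption made for illustration only.
    """
    features = np.asarray(features, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    n = features.shape[0]
    if budget <= 0 or n == 0:
        return []
    # Start from the single most uncertain sample.
    selected = [int(np.argmax(uncertainty))]
    # Distance from every sample to its nearest selected sample.
    min_dist = np.linalg.norm(features - features[selected[0]], axis=1)
    while len(selected) < min(budget, n):
        scale = min_dist.max()
        diversity = min_dist / scale if scale > 0 else min_dist
        score = trade_off * uncertainty + (1.0 - trade_off) * diversity
        score[selected] = -np.inf  # never re-select a chosen sample
        nxt = int(np.argmax(score))
        selected.append(nxt)
        new_dist = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return selected


if __name__ == "__main__":
    # Toy data: two tight clusters in a 2-D latent space.
    feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
    evid = np.array([[0.0, 0.0], [0.5, 0.0], [9.0, 9.0], [4.0, 4.0]])
    unc = dirichlet_uncertainty(evid)
    print(hybrid_query(feats, unc, budget=2))
```

With `trade_off=1.0` the rule reduces to pure uncertainty sampling, which tends to pick near-duplicate samples from one region; lowering it pushes the second pick toward a different cluster, which is the intuition behind hybrid selection when the labeling budget is small.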
About the journal:
Elsevier publishes Reliability Engineering & System Safety in association with the European Safety and Reliability Association and the Safety Engineering and Risk Analysis Division. The international journal is devoted to developing and applying methods to enhance the safety and reliability of complex technological systems, like nuclear power plants, chemical plants, hazardous waste facilities, space systems, offshore and maritime systems, transportation systems, constructed infrastructure, and manufacturing plants. The journal normally publishes only articles that involve the analysis of substantive problems related to the reliability of complex systems or present techniques and/or theoretical results that have a discernable relationship to the solution of such problems. An important aim is to balance academic material and practical applications.