Xiang Cheng, Jun Kit Chaw, Shafrida Sahrani, Mei Choo Ang, Saraswathy Shamini Gunasekaran, Moamin A. Mahmoud, Halimah Badioze Zaman, Yanfeng Zhao, Fuchen Ren
Title: An adaptive dual distillation framework for efficient remaining useful life prediction
Journal: Complex & Intelligent Systems (JCR Q1, Computer Science, Artificial Intelligence)
DOI: 10.1007/s40747-025-01886-w
Published: 2025-04-22
Citations: 0
Abstract
Predicting the Remaining Useful Life (RUL) of industrial equipment is essential for proactive maintenance and health assessment, particularly under the computational constraints of edge devices. While deep learning methods, such as Long Short-Term Memory (LSTM) networks, excel at modeling complex time series, their high computational cost often restricts real-time deployment. To address this challenge, we present an Adaptive Dual Distillation Framework (A-DDF) that transfers knowledge from a large LSTM teacher model to a lightweight bidirectional Gated Recurrent Unit (GRU) student model. Soft-target distillation refines predictive distributions to provide robust supervision, while our correlation-based feature alignment preserves inter-feature relationships and prevents information loss. An adaptive weighting mechanism balances these two distillation strategies, enabling the student model to maintain high predictive accuracy while reducing model complexity. We validate our approach on NASA's C-MAPSS dataset, which includes diverse operating conditions. A-DDF outperforms previous methods, achieving a 12% decrease in relative error (MAPE) and improving prediction accuracy and stability. Ablation experiments show that the dual distillation strategy improves predictive accuracy, surpassing single-distillation approaches. Notably, the student model achieves a 5.34-fold compression rate, reducing parameters by 83%, while maintaining or exceeding the performance of the LSTM teacher model. These results highlight A-DDF's potential for efficient, high-accuracy predictive maintenance on edge devices. Comparisons with mainstream benchmarks confirm A-DDF's superior performance across datasets. Finally, generality and quantization experiments validate its broad applicability and deployability. The proposed method emphasizes reducing model size without sacrificing performance, making it ideal for real-world predictive maintenance scenarios and intelligence-driven manufacturing applications.
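The dual objective described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the MSE form of the soft-target loss, the Frobenius-style correlation-alignment loss, and the fixed weight `alpha` are all assumptions, since the abstract does not give the exact loss definitions or the adaptive weighting rule (which the paper adapts during training rather than fixing).

```python
import numpy as np

def soft_target_loss(student_pred, teacher_pred):
    # Soft-target distillation: match the student's RUL predictions
    # to the teacher's (here a simple MSE between the two).
    return np.mean((student_pred - teacher_pred) ** 2)

def correlation_alignment_loss(student_feat, teacher_feat):
    # Correlation-based feature alignment: compare the feature
    # correlation matrices of student and teacher representations,
    # so inter-feature relationships are preserved.
    cs = np.corrcoef(student_feat, rowvar=False)
    ct = np.corrcoef(teacher_feat, rowvar=False)
    return np.mean((cs - ct) ** 2)

def adaptive_dual_loss(student_pred, teacher_pred,
                       student_feat, teacher_feat, alpha):
    # alpha in [0, 1] balances the two distillation strategies;
    # in A-DDF this weight is adapted during training, here it is
    # passed in as a fixed hyperparameter for illustration.
    l_soft = soft_target_loss(student_pred, teacher_pred)
    l_corr = correlation_alignment_loss(student_feat, teacher_feat)
    return alpha * l_soft + (1 - alpha) * l_corr
```

With identical student and teacher outputs both terms vanish, and increasing `alpha` shifts emphasis from feature alignment toward prediction matching.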
Journal description:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.