Deep Q-learning recommender algorithm with update policy for a real steam turbine system

IF 2.5 Q2 ENGINEERING, INDUSTRIAL
Mohammad Hossein Modirrousta, Mahdi Aliyari Shoorehdeli, Mostafa Yari, Arash Ghahremani
{"title":"Deep Q-learning recommender algorithm with update policy for a real steam turbine system","authors":"Mohammad Hossein Modirrousta,&nbsp;Mahdi Aliyari Shoorehdeli,&nbsp;Mostafa Yari,&nbsp;Arash Ghahremani","doi":"10.1049/cim2.12081","DOIUrl":null,"url":null,"abstract":"<p>In modern industrial systems, diagnosing faults in time and using the best methods becomes increasingly crucial. It is possible to fail a system or to waste resources if faults are not detected or are detected late. Machine learning and deep learning (DL) have proposed various methods for data-based fault diagnosis, and the authors are looking for the most reliable and practical ones. A framework based on DL and reinforcement learning (RL) is developed for fault detection. The authors have utilised two algorithms in their work: Q-Learning and Soft Q-Learning. Reinforcement learning frameworks frequently include efficient algorithms for policy updates, including Q-learning. These algorithms optimise the policy based on the predictions and rewards, resulting in more efficient updates and quicker convergence. The authors can increase accuracy, overcome data imbalance, and better predict future defects by updating the RL policy when new data is received. By applying their method, an increase of 3%–4% in all evaluation metrics by updating policy, an improvement in prediction speed, and an increase of 3%–6% in all evaluation metrics compared to a typical backpropagation multi-layer neural network prediction with comparable parameters is observed. In addition, the Soft Q-learning algorithm yields better outcomes compared to Q-learning.</p>","PeriodicalId":33286,"journal":{"name":"IET Collaborative Intelligent Manufacturing","volume":"5 3","pages":""},"PeriodicalIF":2.5000,"publicationDate":"2023-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/cim2.12081","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IET Collaborative Intelligent Manufacturing","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1049/cim2.12081","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, INDUSTRIAL","Score":null,"Total":0}
Citations: 0

Abstract

In modern industrial systems, diagnosing faults promptly and with the most suitable methods is increasingly crucial. Faults that go undetected, or are detected late, can lead to system failure or wasted resources. Machine learning and deep learning (DL) research has produced various methods for data-based fault diagnosis, and the authors seek the most reliable and practical of these. A framework based on DL and reinforcement learning (RL) is developed for fault detection. The authors utilise two algorithms in their work: Q-learning and Soft Q-learning. Reinforcement learning frameworks frequently include efficient policy-update algorithms, Q-learning among them; these algorithms optimise the policy from predictions and rewards, yielding more efficient updates and quicker convergence. By updating the RL policy whenever new data is received, the authors increase accuracy, overcome data imbalance, and better predict future defects. Applying their method yields an increase of 3%–4% in all evaluation metrics from the policy update, an improvement in prediction speed, and an increase of 3%–6% in all evaluation metrics compared with a typical backpropagation multi-layer neural network with comparable parameters. In addition, the Soft Q-learning algorithm yields better outcomes than Q-learning.
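The abstract describes framing fault detection as an RL problem with Q-learning and Soft Q-learning targets and with policy updates triggered by newly received data. The sketch below is a minimal illustration of that idea, not the authors' implementation: the state is a sensor feature vector, each action is a predicted fault class, and the reward is +1 for a correct prediction and -1 otherwise. The network sizes, the reward scheme, the `QNetwork`/`update_policy` names, the temperature `tau`, and the treatment of successive samples as state transitions are all assumptions for illustration.

```python
"""Minimal sketch (assumed, not the paper's code) of deep Q-learning for fault
classification with a Soft Q-learning option and online policy updates."""
import torch
import torch.nn as nn
import torch.nn.functional as F


class QNetwork(nn.Module):
    """Maps a sensor feature vector to one Q-value per fault class."""

    def __init__(self, state_dim: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


def td_target(q_next: torch.Tensor, reward: torch.Tensor,
              gamma: float = 0.9, soft: bool = False,
              tau: float = 0.1) -> torch.Tensor:
    """Bootstrap target: hard max for Q-learning, temperature-scaled
    log-sum-exp for Soft Q-learning."""
    if soft:
        value_next = tau * torch.logsumexp(q_next / tau, dim=1)
    else:
        value_next = q_next.max(dim=1).values
    return reward + gamma * value_next


def update_policy(q_net: QNetwork, optimizer: torch.optim.Optimizer,
                  states, labels, next_states, soft: bool = False) -> float:
    """One policy-update step on a (possibly newly received) batch of data."""
    q_values = q_net(states)                        # (batch, n_classes)
    actions = q_values.argmax(dim=1)                # greedy predicted classes
    reward = (actions == labels).float() * 2.0 - 1.0  # +1 correct, -1 wrong

    with torch.no_grad():
        q_next = q_net(next_states)                 # Q-values of successor samples
        target = td_target(q_next, reward, soft=soft)

    q_taken = q_values.gather(1, actions.unsqueeze(1)).squeeze(1)
    loss = F.mse_loss(q_taken, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    # Synthetic stand-in for turbine sensor windows and fault labels.
    state_dim, n_classes, batch = 16, 4, 32
    q_net = QNetwork(state_dim, n_classes)
    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

    states = torch.randn(batch, state_dim)
    next_states = torch.randn(batch, state_dim)
    labels = torch.randint(0, n_classes, (batch,))

    # Calling update_policy() whenever new labelled data arrives mirrors the
    # "update policy" idea; pass soft=True for the Soft Q-learning variant.
    print(update_policy(q_net, optimizer, states, labels, next_states, soft=True))
```

In this reading, the only difference between the two variants is the backup operator: Q-learning bootstraps with a hard max over next-state Q-values, while Soft Q-learning uses a softened (log-sum-exp) value, which is what the abstract credits with the better outcomes.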


Source Journal

IET Collaborative Intelligent Manufacturing (Engineering - Industrial and Manufacturing Engineering)

CiteScore: 9.10
Self-citation rate: 2.40%
Articles per year: 25
Review time: 20 weeks

Journal description: IET Collaborative Intelligent Manufacturing is a Gold Open Access journal that focuses on the development of efficient and adaptive production and distribution systems. It aims to meet ever-changing market demands by publishing original research on methodologies and techniques for the application of intelligence, data science, and emerging information and communication technologies in various aspects of manufacturing, such as design, modeling, simulation, planning, and optimization of products, processes, production, and assembly. The journal is indexed in COMPENDEX (Elsevier), Directory of Open Access Journals (DOAJ), Emerging Sources Citation Index (Clarivate Analytics), INSPEC (IET), SCOPUS (Elsevier) and Web of Science (Clarivate Analytics).