Discovery of Customized Dispatching Rule for Single-Machine Production Scheduling Using Deep Reinforcement Learning

P. C. Chua, S. K. Moon, Y. Ng, H. Ng, Manel Lopez
{"title":"Discovery of Customized Dispatching Rule for Single-Machine Production Scheduling Using Deep Reinforcement Learning","authors":"P. C. Chua, S. K. Moon, Y. Ng, H. Ng, Manel Lopez","doi":"10.1115/detc2022-89829","DOIUrl":null,"url":null,"abstract":"\n A dispatching rule has become one of the most widely used approaches in producing scheduling due to its low time complexities and the ability to respond to dynamic changes in production. However, there is no one dispatching rule that dominates the others for the performance measure of interest. By modelling the selection of a dispatching rule to transit from one production state to another using a Markov decision process, current methods involving reinforcement learning make use of a predefined list of dispatching rules, which may limit the optimization of a specified performance measure. Greater flexibility can be achieved by creating customized dispatching rules through the important selection of production parameters for the performance measure in question. Using parameters obtained readily within the digital twin setting, this paper investigates the application of deep reinforcement learning to select customized dispatching rules formed by weighted combinations of production parameters on a single machine production scheduling problem. Due to the curse of dimensionality of storing Q values for all possible production states in a Q-table, a deep Q network is trained for the dynamic selection of the customized dispatching rules. Preliminary results show its effectiveness in minimizing total tardiness and outperform well-known existing dispatching rules.","PeriodicalId":382970,"journal":{"name":"Volume 2: 42nd Computers and Information in Engineering Conference (CIE)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Volume 2: 42nd Computers and Information in Engineering Conference (CIE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1115/detc2022-89829","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Dispatching rules have become one of the most widely used approaches in production scheduling due to their low time complexity and their ability to respond to dynamic changes in production. However, no single dispatching rule dominates the others for the performance measure of interest. By modelling the selection of a dispatching rule for the transition from one production state to another as a Markov decision process, current reinforcement learning methods rely on a predefined list of dispatching rules, which may limit the optimization of a specified performance measure. Greater flexibility can be achieved by creating customized dispatching rules through the selection of production parameters that are important for the performance measure in question. Using parameters readily obtained within a digital twin setting, this paper investigates the application of deep reinforcement learning to select customized dispatching rules, formed as weighted combinations of production parameters, for a single-machine production scheduling problem. Because storing Q-values for all possible production states in a Q-table suffers from the curse of dimensionality, a deep Q-network is trained for the dynamic selection of the customized dispatching rules. Preliminary results show its effectiveness in minimizing total tardiness and that it outperforms well-known existing dispatching rules.
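To make the idea concrete, the following is a minimal, hypothetical sketch of the approach described in the abstract: a deep Q-network chooses, at each scheduling decision point, one "customized dispatching rule" expressed as a weight vector over job-level production parameters, and the selected rule picks the next job on the single machine; the reward is the negative tardiness incurred. The state features, candidate weight vectors, network size, and reward shaping below are illustrative assumptions, since the abstract does not specify them, and this is not the authors' implementation.

```python
# Illustrative sketch only; all feature and weight choices are assumptions.
import random
import numpy as np
import torch
import torch.nn as nn

# A "customized dispatching rule" is a weight vector over three job parameters:
# (processing time, due-date slack, waiting time). The DQN picks which rule to apply.
CANDIDATE_WEIGHTS = [
    np.array(w, dtype=np.float32)
    for w in [(1, 0, 0), (0, 1, 0), (0, 0, 1),
              (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5), (1/3, 1/3, 1/3)]
]

def job_features(job, now):
    """Production parameters of one job: processing time, due-date slack, waiting time."""
    p, due, release = job
    return np.array([p, due - now - p, now - release], dtype=np.float32)

def queue_state(queue, now):
    """Aggregate the queue into a fixed-length state vector (assumed mean/max features)."""
    if not queue:
        return np.zeros(6, dtype=np.float32)
    feats = np.stack([job_features(j, now) for j in queue])
    return np.concatenate([feats.mean(axis=0), feats.max(axis=0)])

def select_job(queue, now, weights):
    """Apply the weighted-combination rule: the job with the lowest score is scheduled next."""
    scores = [float(weights @ job_features(j, now)) for j in queue]
    return int(np.argmin(scores))

# Minimal DQN: maps the 6-dim queue state to one Q-value per candidate rule.
q_net = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, len(CANDIDATE_WEIGHTS)))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def run_episode(jobs, epsilon=0.1, gamma=0.95):
    """One scheduling episode; reward is the negative tardiness of each finished job."""
    queue, now, transitions = list(jobs), 0.0, []
    while queue:
        state = queue_state(queue, now)
        if random.random() < epsilon:                      # epsilon-greedy exploration
            action = random.randrange(len(CANDIDATE_WEIGHTS))
        else:
            with torch.no_grad():
                action = int(q_net(torch.from_numpy(state)).argmax())
        idx = select_job(queue, now, CANDIDATE_WEIGHTS[action])
        p, due, _ = queue.pop(idx)
        now += p
        reward = -max(0.0, now - due)                      # negative tardiness of finished job
        transitions.append((state, action, reward, queue_state(queue, now), not queue))
    # One-step Q-learning update over the episode's transitions (no replay buffer here).
    for s, a, r, s2, done in transitions:
        q = q_net(torch.from_numpy(s))[a]
        with torch.no_grad():
            nxt = 0.0 if done else float(q_net(torch.from_numpy(s2)).max())
        target = torch.tensor(r + gamma * nxt, dtype=torch.float32)
        loss = nn.functional.mse_loss(q, target)
        optimizer.zero_grad(); loss.backward(); optimizer.step()

# Example usage: train on random job sets (processing time, due date, release time = 0).
for _ in range(50):
    jobs = [(random.uniform(1, 5), random.uniform(5, 30), 0.0) for _ in range(10)]
    run_episode(jobs)
```

A full implementation would typically add an experience replay buffer and a target network for stable training, and would draw the state and job parameters from the digital twin rather than a random generator; those elements are omitted here for brevity.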