Discovery of Customized Dispatching Rule for Single-Machine Production Scheduling Using Deep Reinforcement Learning

P. C. Chua, S. K. Moon, Y. Ng, H. Ng, Manel Lopez
{"title":"Discovery of Customized Dispatching Rule for Single-Machine Production Scheduling Using Deep Reinforcement Learning","authors":"P. C. Chua, S. K. Moon, Y. Ng, H. Ng, Manel Lopez","doi":"10.1115/detc2022-89829","DOIUrl":null,"url":null,"abstract":"\n A dispatching rule has become one of the most widely used approaches in producing scheduling due to its low time complexities and the ability to respond to dynamic changes in production. However, there is no one dispatching rule that dominates the others for the performance measure of interest. By modelling the selection of a dispatching rule to transit from one production state to another using a Markov decision process, current methods involving reinforcement learning make use of a predefined list of dispatching rules, which may limit the optimization of a specified performance measure. Greater flexibility can be achieved by creating customized dispatching rules through the important selection of production parameters for the performance measure in question. Using parameters obtained readily within the digital twin setting, this paper investigates the application of deep reinforcement learning to select customized dispatching rules formed by weighted combinations of production parameters on a single machine production scheduling problem. Due to the curse of dimensionality of storing Q values for all possible production states in a Q-table, a deep Q network is trained for the dynamic selection of the customized dispatching rules. Preliminary results show its effectiveness in minimizing total tardiness and outperform well-known existing dispatching rules.","PeriodicalId":382970,"journal":{"name":"Volume 2: 42nd Computers and Information in Engineering Conference (CIE)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Volume 2: 42nd Computers and Information in Engineering Conference (CIE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1115/detc2022-89829","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Dispatching rules have become one of the most widely used approaches in production scheduling due to their low time complexity and their ability to respond to dynamic changes in production. However, no single dispatching rule dominates the others for the performance measure of interest. By modelling the selection of a dispatching rule for the transition from one production state to another as a Markov decision process, current reinforcement learning methods rely on a predefined list of dispatching rules, which may limit the optimization of a specified performance measure. Greater flexibility can be achieved by creating customized dispatching rules through the selection of production parameters that are important for the performance measure in question. Using parameters readily obtained within a digital twin setting, this paper investigates the application of deep reinforcement learning to select customized dispatching rules, formed as weighted combinations of production parameters, for a single-machine production scheduling problem. Because storing Q-values for all possible production states in a Q-table suffers from the curse of dimensionality, a deep Q-network is trained for the dynamic selection of the customized dispatching rules. Preliminary results show its effectiveness in minimizing total tardiness and that it outperforms well-known existing dispatching rules.
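To make the idea concrete, the following is a minimal, hypothetical sketch of the approach described in the abstract: a deep Q-network chooses, at each scheduling decision point, one "customized dispatching rule" expressed as a weight vector over job-level production parameters, and the selected rule picks the next job on the single machine; the reward is the negative tardiness incurred. The state features, candidate weight vectors, network size, and reward shaping below are illustrative assumptions, since the abstract does not specify them, and this is not the authors' implementation.

```python
# Illustrative sketch only; all feature and weight choices are assumptions.
import random
import numpy as np
import torch
import torch.nn as nn

# A "customized dispatching rule" is a weight vector over three job parameters:
# (processing time, due-date slack, waiting time). The DQN picks which rule to apply.
CANDIDATE_WEIGHTS = [
    np.array(w, dtype=np.float32)
    for w in [(1, 0, 0), (0, 1, 0), (0, 0, 1),
              (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5), (1/3, 1/3, 1/3)]
]

def job_features(job, now):
    """Production parameters of one job: processing time, due-date slack, waiting time."""
    p, due, release = job
    return np.array([p, due - now - p, now - release], dtype=np.float32)

def queue_state(queue, now):
    """Aggregate the queue into a fixed-length state vector (assumed mean/max features)."""
    if not queue:
        return np.zeros(6, dtype=np.float32)
    feats = np.stack([job_features(j, now) for j in queue])
    return np.concatenate([feats.mean(axis=0), feats.max(axis=0)])

def select_job(queue, now, weights):
    """Apply the weighted-combination rule: the job with the lowest score is scheduled next."""
    scores = [float(weights @ job_features(j, now)) for j in queue]
    return int(np.argmin(scores))

# Minimal DQN: maps the 6-dim queue state to one Q-value per candidate rule.
q_net = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Linear(64, len(CANDIDATE_WEIGHTS)))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def run_episode(jobs, epsilon=0.1, gamma=0.95):
    """One scheduling episode; reward is the negative tardiness of each finished job."""
    queue, now, transitions = list(jobs), 0.0, []
    while queue:
        state = queue_state(queue, now)
        if random.random() < epsilon:                      # epsilon-greedy exploration
            action = random.randrange(len(CANDIDATE_WEIGHTS))
        else:
            with torch.no_grad():
                action = int(q_net(torch.from_numpy(state)).argmax())
        idx = select_job(queue, now, CANDIDATE_WEIGHTS[action])
        p, due, _ = queue.pop(idx)
        now += p
        reward = -max(0.0, now - due)                      # negative tardiness of finished job
        transitions.append((state, action, reward, queue_state(queue, now), not queue))
    # One-step Q-learning update over the episode's transitions (no replay buffer here).
    for s, a, r, s2, done in transitions:
        q = q_net(torch.from_numpy(s))[a]
        with torch.no_grad():
            nxt = 0.0 if done else float(q_net(torch.from_numpy(s2)).max())
        target = torch.tensor(r + gamma * nxt, dtype=torch.float32)
        loss = nn.functional.mse_loss(q, target)
        optimizer.zero_grad(); loss.backward(); optimizer.step()

# Example usage: train on random job sets (processing time, due date, release time = 0).
for _ in range(50):
    jobs = [(random.uniform(1, 5), random.uniform(5, 30), 0.0) for _ in range(10)]
    run_episode(jobs)
```

A full implementation would typically add an experience replay buffer and a target network for stable training, and would draw the state and job parameters from the digital twin rather than a random generator; those elements are omitted here for brevity.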