Dynamically Changing Sequencing Rules with Reinforcement Learning in a Job Shop System With Stochastic Influences

Jens Heger, T. Voss
{"title":"基于强化学习的随机作业车间系统动态变化排序规则","authors":"Jens Heger, T. Voss","doi":"10.1109/WSC48552.2020.9383903","DOIUrl":null,"url":null,"abstract":"Sequencing operations can be difficult, especially under uncertain conditions. Applying decentral sequencing rules has been a viable option; however, no rule exists that can outperform all other rules under varying system performance. For this reason, reinforcement learning (RL) is used as a hyper heuristic to select a sequencing rule based on the system status. Based on multiple training scenarios considering stochastic influences, such as varying inter arrival time or customers changing the product mix, the advantages of RL are presented. For evaluation, the trained agents are exploited in a generic manufacturing system. The best agent trained is able to dynamically adjust sequencing rules based on system performance, thereby matching and outperforming the presumed best static sequencing rules by ≈ 3%. Using the trained policy in an unknown scenario, the RL heuristic is still able to change the sequencing rule according to the system status, thereby providing robust performance.","PeriodicalId":6692,"journal":{"name":"2020 Winter Simulation Conference (WSC)","volume":"102 1","pages":"1608-1618"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Dynamically Changing Sequencing Rules with Reinforcement Learning in a Job Shop System With Stochastic Influences\",\"authors\":\"Jens Heger, T. Voss\",\"doi\":\"10.1109/WSC48552.2020.9383903\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Sequencing operations can be difficult, especially under uncertain conditions. Applying decentral sequencing rules has been a viable option; however, no rule exists that can outperform all other rules under varying system performance. For this reason, reinforcement learning (RL) is used as a hyper heuristic to select a sequencing rule based on the system status. Based on multiple training scenarios considering stochastic influences, such as varying inter arrival time or customers changing the product mix, the advantages of RL are presented. For evaluation, the trained agents are exploited in a generic manufacturing system. The best agent trained is able to dynamically adjust sequencing rules based on system performance, thereby matching and outperforming the presumed best static sequencing rules by ≈ 3%. 
Using the trained policy in an unknown scenario, the RL heuristic is still able to change the sequencing rule according to the system status, thereby providing robust performance.\",\"PeriodicalId\":6692,\"journal\":{\"name\":\"2020 Winter Simulation Conference (WSC)\",\"volume\":\"102 1\",\"pages\":\"1608-1618\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 Winter Simulation Conference (WSC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WSC48552.2020.9383903\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 Winter Simulation Conference (WSC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WSC48552.2020.9383903","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

Sequencing operations can be difficult, especially under uncertain conditions. Applying decentralized sequencing rules has been a viable option; however, no single rule outperforms all others under varying system conditions. For this reason, reinforcement learning (RL) is used as a hyper-heuristic to select a sequencing rule based on the current system status. The advantages of RL are demonstrated across multiple training scenarios with stochastic influences, such as varying inter-arrival times or customers changing the product mix. For evaluation, the trained agents are deployed in a generic manufacturing system. The best trained agent dynamically adjusts the sequencing rule based on system performance, thereby matching and outperforming the presumed best static sequencing rule by ≈ 3%. When the trained policy is applied in an unknown scenario, the RL heuristic is still able to change the sequencing rule according to the system status, providing robust performance.
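The paper does not include source code. To make the hyper-heuristic idea concrete, below is a minimal Python sketch in which a tabular Q-learning agent picks one of several classic dispatching rules at each decision point, rather than picking individual jobs. Everything in it is an illustrative assumption, not the authors' setup: the rule set (SPT, FIFO, EDD), the queue-length-bucket state, the negative-flow-time reward, and the single-machine toy environment (the paper uses a generic job-shop simulation with richer stochastic influences and its own state features).

```python
import random
from collections import defaultdict

# Candidate decentralized sequencing rules. These three are common stand-ins;
# the paper's actual rule set is not reproduced here.
RULES = {
    "SPT":  lambda q: min(q, key=lambda j: j["p"]),        # shortest processing time
    "FIFO": lambda q: min(q, key=lambda j: j["arrival"]),  # first in, first out
    "EDD":  lambda q: min(q, key=lambda j: j["due"]),      # earliest due date
}
ACTIONS = list(RULES)

def state(queue):
    """Discretized system status: a queue-length bucket (placeholder feature)."""
    return min(len(queue) // 3, 3)

def episode(Q, eps=0.1, alpha=0.1, gamma=0.95, n_jobs=300, rate=1.1):
    """One run of a toy single-machine shop; returns the mean flow time."""
    # Generate jobs with stochastic (exponential) inter-arrival times.
    arrivals, t = [], 0.0
    for _ in range(n_jobs):
        t += random.expovariate(rate)
        p = random.uniform(0.5, 1.5)
        arrivals.append({"arrival": t, "p": p, "due": t + 3.0 * p})

    now, queue, flows = 0.0, [], []
    while arrivals or queue:
        if not queue:                      # machine idles until the next arrival
            now = max(now, arrivals[0]["arrival"])
        while arrivals and arrivals[0]["arrival"] <= now:
            queue.append(arrivals.pop(0))

        s = state(queue)
        # Epsilon-greedy choice of a *rule*, not of a job.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])

        job = RULES[a](queue)              # the chosen rule selects the next job
        queue.remove(job)
        now += job["p"]
        flows.append(now - job["arrival"])

        # Q-learning update; reward = negative flow time of the finished job.
        s2 = state(queue)
        best_next = max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (-flows[-1] + gamma * best_next - Q[(s, a)])
    return sum(flows) / len(flows)

Q = defaultdict(float)
for ep in range(200):
    mean_flow = episode(Q)
print(f"mean flow time after training: {mean_flow:.2f}")
```

The design point the sketch preserves is the action space: the agent learns a mapping from system status to sequencing rule, so the same policy can react when, for example, the inter-arrival rate drifts and a different rule becomes preferable.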