Dynamically Changing Sequencing Rules with Reinforcement Learning in a Job Shop System With Stochastic Influences

2020 Winter Simulation Conference (WSC) | Pub Date: 2020-12-14 | DOI: 10.1109/WSC48552.2020.9383903

Jens Heger, T. Voss

Pages: 1608-1618

Citations: 1

Abstract

Sequencing operations can be difficult, especially under uncertain conditions. Decentralized sequencing rules are a viable option; however, no single rule outperforms all others under varying system conditions. For this reason, reinforcement learning (RL) is used as a hyper-heuristic that selects a sequencing rule based on the system status. The advantages of RL are demonstrated across multiple training scenarios that include stochastic influences, such as varying inter-arrival times or customers changing the product mix. For evaluation, the trained agents are applied in a generic manufacturing system. The best trained agent dynamically adjusts the sequencing rule based on system performance, matching or outperforming the presumed best static sequencing rules by ≈ 3%. When the trained policy is used in an unknown scenario, the RL heuristic is still able to change the sequencing rule according to the system status, thereby providing robust performance.
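The idea of using RL as a hyper-heuristic, where the agent's actions are sequencing rules rather than individual jobs, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the rule set (FIFO, SPT, EDD), the `observe` state features, the reward value, and the use of tabular Q-learning as a stand-in for whatever RL algorithm the authors actually trained.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    arrival: float     # time the job entered the queue
    processing: float  # processing time on this machine
    due: float         # due date


# Static sequencing rules (assumed set): each picks the next job from a non-empty queue.
RULES = {
    "FIFO": lambda queue: min(queue, key=lambda j: j.arrival),
    "SPT": lambda queue: min(queue, key=lambda j: j.processing),
    "EDD": lambda queue: min(queue, key=lambda j: j.due),
}
RULE_NAMES = list(RULES)


def observe(queue, now):
    """Hypothetical system status: (capped queue length, is any job already late?)."""
    return (min(len(queue), 3), any(j.due < now for j in queue))


class RuleSelector:
    """Tabular Q-learning agent whose actions are indices into RULE_NAMES."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = {}  # state -> list of action values, one per rule
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def values(self, state):
        return self.q.setdefault(state, [0.0] * len(RULE_NAMES))

    def select(self, state, explore=True):
        # Epsilon-greedy during training; purely greedy when explore=False.
        if explore and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(RULE_NAMES))
        vals = self.values(state)
        return vals.index(max(vals))

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.values(next_state))
        vals = self.values(state)
        vals[action] += self.alpha * (target - vals[action])


# Usage: the agent picks a rule, and the rule picks the job.
queue = [Job(0.0, 5.0, 20.0), Job(1.0, 2.0, 30.0), Job(2.0, 4.0, 8.0)]
agent = RuleSelector()
state = observe(queue, now=3.0)
# Pretend that applying SPT (index 1) in this state earned a reward of 1.0.
agent.update(state, RULE_NAMES.index("SPT"), reward=1.0, next_state=state)
chosen_rule = RULE_NAMES[agent.select(state, explore=False)]
next_job = RULES[chosen_rule](queue)
```

In a full setup the reward would come from a simulation of the job shop (for example a tardiness-based measure), and the state would carry richer system information; both are left out here to keep the selection mechanism visible.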
