Reinforcement learning for Order Acceptance on a shared resource
M. M. Hing, A. van Harten, P. Schuur
Proceedings of the 9th International Conference on Neural Information Processing, 2002. ICONIP '02.
Published: 2002-11-18
DOI: 10.1109/ICONIP.2002.1202861 (https://doi.org/10.1109/ICONIP.2002.1202861)
Citations: 2
Abstract
Order acceptance (OA) is one of the main functions in business control. In essence, OA requires an accept/reject decision for each order. Always accepting an order whenever capacity is available can prevent the system from accepting more favorable orders later, resulting in opportunity losses. Another important aspect is how much information is available to the decision-maker. We use stochastic modeling, Markov decision theory, and learning methods from artificial intelligence to find decision policies, even under uncertain information. Reinforcement learning (RL) is a fairly new approach to OA. It can learn both the decision policy and the incomplete information simultaneously. We show that RL performs well compared with heuristics. Finding good heuristics in a complex situation is a delicate art, and we demonstrate that an RL-trained agent can support the detection of good heuristics.
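To make the accept/reject setting concrete, the following is a minimal sketch of tabular Q-learning on a toy order acceptance problem. It is not the authors' model: the order types, capacity, horizon, learning parameters, and all function names are hypothetical choices made for illustration. An agent sees one order per step, may accept it only if it fits in the remaining capacity of the shared resource, and is compared against the "always accept when capacity is available" heuristic mentioned in the abstract.

```python
import random

# Hypothetical toy instance (not from the paper):
# each order type is (capacity needed, reward earned if accepted).
ORDER_TYPES = [(1, 1.0), (2, 5.0)]
CAPACITY = 4   # units of the shared resource per episode
HORIZON = 6    # number of order arrivals per episode


def feasible_actions(remaining, order):
    """Action 0 = reject, 1 = accept (offered only if the order fits)."""
    size, _ = ORDER_TYPES[order]
    return [0, 1] if size <= remaining else [0]


def greedy_action(q, remaining, steps_left, order):
    """Pick the feasible action with the highest estimated value."""
    return max(feasible_actions(remaining, order),
               key=lambda a: q.get((remaining, steps_left, order, a), 0.0))


def train(episodes=20000, alpha=0.1, gamma=1.0, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning; unseen entries default to 0."""
    rng = random.Random(seed)
    q = {}  # (remaining, steps_left, order, action) -> estimated value
    for _ in range(episodes):
        remaining = CAPACITY
        order = rng.randrange(len(ORDER_TYPES))
        for steps_left in range(HORIZON, 0, -1):
            if rng.random() < epsilon:
                action = rng.choice(feasible_actions(remaining, order))
            else:
                action = greedy_action(q, remaining, steps_left, order)
            size, reward = ORDER_TYPES[order]
            gain = reward if action == 1 else 0.0
            next_remaining = remaining - size if action == 1 else remaining
            next_order = rng.randrange(len(ORDER_TYPES))
            if steps_left > 1:
                best = greedy_action(q, next_remaining, steps_left - 1, next_order)
                future = q.get((next_remaining, steps_left - 1, next_order, best), 0.0)
            else:
                future = 0.0  # episode ends after the last arrival
            key = (remaining, steps_left, order, action)
            old = q.get(key, 0.0)
            q[key] = old + alpha * (gain + gamma * future - old)
            remaining, order = next_remaining, next_order
    return q


def average_reward(policy, episodes=5000, seed=1):
    """Simulate a policy(remaining, steps_left, order) -> 0 or 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(episodes):
        remaining = CAPACITY
        for steps_left in range(HORIZON, 0, -1):
            order = rng.randrange(len(ORDER_TYPES))
            size, reward = ORDER_TYPES[order]
            if size <= remaining and policy(remaining, steps_left, order) == 1:
                remaining -= size
                total += reward
    return total / episodes


if __name__ == "__main__":
    q = train()
    print("learned policy:", average_reward(lambda r, s, o: greedy_action(q, r, s, o)))
    print("always accept: ", average_reward(lambda r, s, o: 1))
```

In this sketch the state is the remaining capacity, the number of arrivals left, and the type of the current order. The arrival probabilities are never given to the agent; it only observes sampled orders, which loosely mirrors the idea of learning a policy under incomplete information. Inspecting the learned table, for example which order types are rejected while capacity is still plentiful, is one way such an agent could suggest candidate heuristics.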