Marwan Mousa, Damien van de Berg, Niki Kotecha, Ehecatl Antonio del Rio Chanona, Max Mowbray
DOI: 10.1016/j.compchemeng.2024.108783
Journal: Computers & Chemical Engineering (JCR Q2, Computer Science, Interdisciplinary Applications; Impact Factor 3.9)
Publication date: 2024-06-29 (Journal Article)
PDF: https://www.sciencedirect.com/science/article/pii/S0098135424002011/pdfft?md5=305f3729d482de640ac0d1dacf771be9&pid=1-s2.0-S0098135424002011-main.pdf
An analysis of multi-agent reinforcement learning for decentralized inventory control systems
Most solutions to the inventory management problem assume a centralization of information that is incompatible with organizational constraints in supply chain networks. The problem can be naturally decomposed into sub-problems, each associated with an independent entity, turning it into a multi-agent system. A decentralized solution to inventory management using multi-agent reinforcement learning (MARL) is proposed, in which each entity is controlled by an agent. Three multi-agent variations of the proximal policy optimization algorithm are investigated through simulations of different supply chain networks and levels of uncertainty. A framework is deployed that relies on offline centralization during simulation-based policy identification but enables decentralization when the policies are deployed online to the real system. Results show that reducing information-sharing constraints during training enables MARL to perform comparably to a centralized learning-based solution when deployed, and to outperform a distributed model-based solution in most cases, whilst respecting the information constraints of the system.
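To make the decentralized setting concrete, the sketch below (not the authors' code; environment dynamics, cost coefficients, and the base-stock heuristic standing in for a learned policy are all illustrative assumptions) models a two-echelon serial supply chain as a multi-agent system. Each entity is an agent that observes only its own inventory position, reflecting the information constraints the abstract describes; a learned PPO policy would replace the hand-coded order-up-to rule at deployment.

```python
import random

class SerialSupplyChainEnv:
    """Two-echelon serial supply chain (retailer -> warehouse).
    Each entity holds inventory, incurs holding/backlog costs, and
    places replenishment orders with a one-period lead time."""

    def __init__(self, demand_mean=5, seed=0):
        self.rng = random.Random(seed)
        self.demand_mean = demand_mean
        self.reset()

    def reset(self):
        self.inventory = [10, 10]   # [retailer, warehouse]
        self.pipeline = [0, 0]      # orders arriving next period
        return self.local_obs()

    def local_obs(self):
        # Each agent sees only its own inventory position
        # (the decentralized information constraint).
        return [[self.inventory[i]] for i in range(2)]

    def step(self, orders):
        # orders[i]: replenishment quantity chosen by agent i
        demand = self.rng.randint(self.demand_mean - 2, self.demand_mean + 2)
        # in-transit orders arrive after the one-period lead time
        for i in range(2):
            self.inventory[i] += self.pipeline[i]
        # warehouse ships to retailer; external supplier ships to warehouse
        ship = min(orders[0], max(self.inventory[1], 0))
        self.pipeline = [ship, orders[1]]
        self.inventory[1] -= ship
        self.inventory[0] -= demand
        # per-entity cost: holding (1.0/unit) plus backlog penalty (5.0/unit)
        costs = [max(x, 0) * 1.0 + max(-x, 0) * 5.0 for x in self.inventory]
        rewards = [-c for c in costs]
        return self.local_obs(), rewards

def base_stock_policy(obs, target):
    """Order-up-to policy acting on the local observation only."""
    return max(target - obs[0], 0)

env = SerialSupplyChainEnv()
obs = env.reset()
total = 0.0
for _ in range(20):
    orders = [base_stock_policy(obs[0], 12), base_stock_policy(obs[1], 15)]
    obs, rewards = env.step(orders)
    total += sum(rewards)
```

Under the centralized-training, decentralized-execution framework described above, a centralized critic could observe the full state `env.inventory` during simulation-based training, while each deployed policy still consumes only its agent's local observation, as the loop shows.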
Journal introduction:
Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.