Train small, deploy large: Scaling multi-agent reinforcement learning for multi-stage manufacturing lines
Authors: Kshitij Bhatta, Qing Chang
Journal of Manufacturing Systems, volume 81, pages 155-168. Published 2025-05-26.
DOI: 10.1016/j.jmsy.2025.04.017
URL: https://www.sciencedirect.com/science/article/pii/S0278612525001062
Impact factor: 14.2 (JCR Q1, Engineering, Industrial)
Citations: 0
Abstract
We present a novel control framework based on multi-agent reinforcement learning (MARL) that scales with the number of workstations in a multi-stage manufacturing line. We show that the dynamics of any production line, regardless of size, can be decoupled into three fundamental expressions. These expressions capture the dynamics of (1) the first workstation, (2) all intermediate workstations, and (3) the last workstation. This decoupling, combined with observation engineering, enables training a characteristic 3-workstation, 2-buffer model using MARL methods, which can then generalize to production lines with w workstations with arbitrary cycle times, buffer capacities, and reliability models. A numerical study is then conducted to validate the framework.
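The three-way decoupling described in the abstract can be pictured with a small Python sketch. It is not the authors' implementation: the function names, the role labels, and the observation layout (buffer fill levels normalized by capacity, with the first workstation treated as never starved and the last as never blocked) are all assumptions made here for illustration. The idea it shows is only that each workstation in a line of any length can be mapped to one of three roles and given a local, size-independent observation.

```python
from typing import List, Sequence, Tuple


def role_for(index: int, num_workstations: int) -> str:
    """Map a workstation position to one of three roles (hypothetical labels)."""
    if num_workstations < 2:
        raise ValueError("a line needs at least two workstations")
    if not 0 <= index < num_workstations:
        raise IndexError("workstation index out of range")
    if index == 0:
        return "first"
    if index == num_workstations - 1:
        return "last"
    return "intermediate"


def assign_roles(num_workstations: int) -> List[str]:
    """Return the role of every workstation in a line of the given length."""
    return [role_for(i, num_workstations) for i in range(num_workstations)]


def local_observation(
    index: int,
    buffer_levels: Sequence[float],
    buffer_capacities: Sequence[float],
) -> Tuple[float, float]:
    """Build a size-independent observation for one workstation.

    Buffer i is assumed to sit between workstation i and workstation i + 1.
    Assumption: the first workstation is never starved (upstream fill 1.0)
    and the last workstation is never blocked (downstream fill 0.0).
    """
    num_workstations = len(buffer_levels) + 1
    if index == 0:
        upstream = 1.0
    else:
        upstream = buffer_levels[index - 1] / buffer_capacities[index - 1]
    if index == num_workstations - 1:
        downstream = 0.0
    else:
        downstream = buffer_levels[index] / buffer_capacities[index]
    return (upstream, downstream)
```

Under these assumptions, a policy trained for each role on a 3-workstation, 2-buffer line could be reused on a longer line by looking up `assign_roles(w)` and feeding each workstation its `local_observation`, since neither depends on the total line length beyond identifying the two end positions.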
About the journal:
The Journal of Manufacturing Systems is dedicated to showcasing cutting-edge fundamental and applied research in manufacturing at the systems level. Encompassing products, equipment, people, information, control, and support functions, manufacturing systems play a pivotal role in the economical and competitive development, production, delivery, and total lifecycle of products, meeting market and societal needs.
With a commitment to publishing archival scholarly literature, the journal strives to advance the state of the art in manufacturing systems and foster innovation in crafting efficient, robust, and sustainable manufacturing systems. The focus extends from equipment-level considerations to the broader scope of the extended enterprise. The Journal welcomes research addressing challenges across various scales, including nano, micro, and macro-scale manufacturing, and spanning diverse sectors such as aerospace, automotive, energy, and medical device manufacturing.