Digital-twin-based AGV cluster dynamic scheduling for solar cell production workshop using deep reinforcement learning

IF 5.5 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neurocomputing | Pub Date: 2025-06-15 | DOI: 10.1016/j.neucom.2025.130772

Zhuo Zhou, Liyun Xu, Yiyang Chen, Liqiang Liao, Zhun Xu

Neurocomputing, Volume 648, Article 130772. Journal Article. https://www.sciencedirect.com/science/article/pii/S0925231225014444

Citations: 0

Abstract

In recent years, demand for renewable energy, notably solar energy, has increased rapidly. Solar cells, the most essential photovoltaic component, are fragile and require high cleanliness, so they rely on automated guided vehicles (AGVs) for transportation between processes. However, a solar cell production workshop with a large number of AGVs is highly dynamic, complex, and uncertain, and traditional AGV scheduling methods cannot meet its dynamic scheduling requirements. This paper therefore proposes a digital-twin-based (DT-based) AGV cluster dynamic scheduling method using deep reinforcement learning (DRL). First, a DT-based AGV cluster dynamic scheduling framework is constructed to ensure operational synergy among the DT, decision-making model formulation, and real-world application. Second, a mathematical model of AGV cluster dynamic scheduling that minimizes the average waiting time is established. Third, the AGV cluster dynamic scheduling problem is formulated as a Markov decision process (MDP) and described in detail. An improved soft actor-critic (ISAC) DRL algorithm, which adds a Softmax function to the actor network and introduces a multi-stage sample selection strategy, is then implemented to solve the established MDP model. Finally, six cases derived from real-world solar cell production workshops are studied, and the results demonstrate the effectiveness of the proposed method.
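The abstract names two technical ingredients without giving their details: a Softmax applied to the actor network's output, and an objective that minimizes average waiting time. The following is a minimal sketch of both ideas, not the paper's ISAC implementation. It assumes that an action means choosing one AGV from a discrete set, and that waiting time is the delay between a transport request and the AGV pickup; the function names, logits, and timestamps are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax: maps raw actor outputs to probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_entropy(probs):
    """Shannon entropy of the action distribution, the term SAC adds to the reward."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def average_waiting_time(request_times, pickup_times):
    """Mean delay between each transport request and its pickup (assumed definition)."""
    waits = [pickup - request for request, pickup in zip(request_times, pickup_times)]
    return sum(waits) / len(waits)


# Hypothetical actor logits for three candidate AGVs.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)

# Greedy choice for illustration; a trained SAC agent would sample from probs.
chosen_agv = max(range(len(probs)), key=probs.__getitem__)
entropy = policy_entropy(probs)

# Hypothetical request and pickup timestamps (same time unit) for three tasks.
mean_wait = average_waiting_time([0.0, 5.0, 9.0], [4.0, 11.0, 12.0])
```

With a Softmax output the policy is an explicit categorical distribution, so its entropy can be computed in closed form rather than estimated from samples, which is one common reason to use it for discrete dispatching decisions.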

Source journal

Neurocomputing (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days

About the journal: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Its essential topics are neurocomputing theory, practice, and applications.