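Dynamic scheduling for cloud manufacturing with uncertain events by hierarchical reinforcement learning and attention mechanism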

IF 7.6 | Zone 1, Computer Science | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems | Pub Date: 2025-03-21 | DOI: 10.1016/j.knosys.2025.113335

Jianxiong Zhang, Yuming Jiang, Bing Guo, Tingting Liu, Dasha Hu, Jinbo Zhang, Yifei Deng, Hao Wang, Jv Yang, Xuefeng Ding

Volume 316, Article 113335. Publication type: Journal Article. Open access: no. Publisher page: https://www.sciencedirect.com/science/article/pii/S095070512500382X

Citations: 0


Dynamic scheduling for cloud manufacturing with uncertain events by hierarchical reinforcement learning and attention mechanism

Cloud manufacturing provides a platform for many-to-many scheduling of consumer tasks assigned to service providers. The dynamics and uncertainties of the cloud environment pose stringent requirements on the real-time performance and generalizability of scheduling algorithms. Moreover, the continuous variations in environmental states, task scale, and service statuses further complicate decision-making. However, existing dynamic scheduling methods, primarily developed to address static environments and constant scales, fall short of addressing the escalating volatility and complexity of real-world scheduling. To achieve near-real-time decision-making, a deep hierarchical reinforcement learning framework incorporating attention mechanisms and pointer networks is proposed for multi-objective dynamic scheduling in cloud manufacturing. This framework divides the scheduling problem into three subproblems (optimization objectives, manufacturing tasks, and service selection) and leverages a hierarchical structure to realize a three-step scheduling decision-making process. The proposed framework comprises three encoder–decoder-based agents, each corresponding to a subproblem and collaborating to achieve the overall decision. The agents utilize the multi-head attention mechanism to extract inter-task and inter-service relationships, enhancing decision precision and environmental adaptability. Additionally, the pointer network is incorporated into each agent, endowing the proposed framework with generalizability when inserting new tasks (or services) or removing existing ones. Experimental results across nine dynamic scenarios demonstrate that our framework outperforms five deep reinforcement learning algorithms and three meta-heuristics in terms of scheduling performance and runtime. Results from six out-of-training-scale instances further indicate that our framework exhibits superior generalization and scalability.
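The abstract's two central ideas, a three-step decision (objective, then task, then service) and pointer-style selection that tolerates a changing number of tasks or services, can be illustrated with a small sketch. This is not the authors' implementation: the names `pointer_probs` and `hierarchical_step`, the way each step's query is formed by adding the previous choice's embedding, the greedy argmax selection, and the feasibility mask are all assumptions made for illustration, and the multi-head attention encoders and all training are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores (-inf allowed)."""
    m = max(scores)
    if m == float("-inf"):
        raise ValueError("at least one candidate must be feasible")
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointer_probs(query, candidates, mask=None):
    """Pointer-style attention: score the query against every candidate
    embedding (scaled dot product) and normalise over the candidates.
    The output has one probability per candidate, so the same function
    works unchanged when candidates are inserted or removed."""
    scale = math.sqrt(len(query))
    scores = []
    for i, cand in enumerate(candidates):
        if mask is not None and not mask[i]:
            scores.append(float("-inf"))  # infeasible: probability 0
        else:
            scores.append(sum(q * c for q, c in zip(query, cand)) / scale)
    return softmax(scores)


def argmax(values):
    """Index of the largest value (greedy choice, an assumption here)."""
    return max(range(len(values)), key=values.__getitem__)


def hierarchical_step(context, objectives, tasks, services, feasible):
    """One three-step decision: objective -> task -> service.
    `feasible[t][s]` says whether service s can execute task t
    (a hypothetical mask, not taken from the paper)."""
    # Step 1: choose which optimisation objective to pursue.
    obj = argmax(pointer_probs(context, objectives))
    # Step 2: choose a task, conditioned on the chosen objective.
    q_task = [a + b for a, b in zip(context, objectives[obj])]
    task = argmax(pointer_probs(q_task, tasks))
    # Step 3: choose a service for that task among feasible ones.
    q_srv = [a + b for a, b in zip(q_task, tasks[task])]
    srv = argmax(pointer_probs(q_srv, services, mask=feasible[task]))
    return obj, task, srv


if __name__ == "__main__":
    context = [1.0, 0.0]
    objectives = [[1.0, 0.0], [0.0, 1.0]]
    tasks = [[0.0, 1.0], [1.0, 0.0]]
    services = [[1.0, 0.0], [0.5, 0.0], [-1.0, 0.0]]
    feasible = [[True, True, True], [False, True, True]]
    print(hierarchical_step(context, objectives, tasks, services, feasible))
```

Because `pointer_probs` returns a distribution over whatever candidates it is given, appending a new task or dropping a service changes only the length of the output, not the function or its parameters. That is the property the abstract attributes to the pointer network.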


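Source journal

Knowledge-Based Systems (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months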

Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.