H2-MARL: Multi-agent reinforcement learning for Pareto optimality in hospital capacity strain and human mobility during epidemic

IF 7.5; Zone 1: Computer Science; JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Xueting Luo, Hao Deng, Jihong Yang, Yao Shen, Huanhuan Guo, Zhiyuan Sun, Mingqing Liu, Jiming Wei, Shengjie Zhao
Expert Systems with Applications, vol. 291, Article 128432. Published 2025-06-12. DOI: 10.1016/j.eswa.2025.128432
https://www.sciencedirect.com/science/article/pii/S0957417425020512
Citations: 0

Abstract

Balancing the losses from mobility restrictions against hospital capacity strain has drawn significant attention in the aftermath of COVID-19. Reinforcement learning (RL)-based strategies for human mobility management have recently advanced in handling the dynamic evolution of cities and epidemics; however, they still struggle to achieve coordinated control at the township level and to adapt to cities of varying scales. To address these issues, we propose H2-MARL, a multi-agent RL approach that achieves Pareto optimality in managing hospital capacity and human mobility and is applicable across cities of different population scales. We first develop a township-level infection model with online-updatable parameters to simulate disease transmission and construct a city-wide dynamic spatiotemporal epidemic simulator. On this basis, H2-MARL treats each division as an agent, formulates a dual-objective trade-off reward function, and builds an experience replay buffer enriched with expert knowledge. To evaluate the model, we construct a township-level human mobility dataset containing over one billion records from four representative cities of varying scales. Extensive experiments demonstrate that H2-MARL achieves the optimal dual-objective trade-off, simultaneously minimizing hospital capacity strain and human mobility restriction loss. The model's applicability to epidemic control in cities of varying scales is also verified, showing its feasibility and versatility in practical applications.
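To make the dual-objective trade-off concrete, the following is a minimal sketch of a weighted-sum reward and a Pareto-dominance filter over two losses. It is an illustration only: the abstract does not give the reward's functional form, so the weighted sum, the `weight` parameter, and the names `Outcome`, `hospital_strain`, and `mobility_loss` are all assumptions, not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    """Two losses produced by one policy; lower is better for both.

    Field names are hypothetical, chosen for this illustration.
    """
    hospital_strain: float  # e.g. patient demand beyond hospital capacity
    mobility_loss: float    # e.g. share of trips suppressed by restrictions


def scalarized_reward(o: Outcome, weight: float = 0.5) -> float:
    """Weighted-sum reward (an assumed form, not the paper's reward)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return -(weight * o.hospital_strain + (1.0 - weight) * o.mobility_loss)


def dominates(a: Outcome, b: Outcome) -> bool:
    """True if a is no worse than b on both losses and strictly better on one."""
    no_worse = (a.hospital_strain <= b.hospital_strain
                and a.mobility_loss <= b.mobility_loss)
    better = (a.hospital_strain < b.hospital_strain
              or a.mobility_loss < b.mobility_loss)
    return no_worse and better


def pareto_front(outcomes: list[Outcome]) -> list[Outcome]:
    """Keep only the outcomes that no other outcome dominates."""
    return [o for o in outcomes if not any(dominates(p, o) for p in outcomes)]
```

Under this reading, a policy is Pareto optimal when no alternative policy lowers one loss without raising the other; sweeping `weight` over [0, 1] is one common way to trace out candidate points on that front.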
Source journal
Expert Systems with Applications
Category: Engineering Technology - Engineering: Electrical & Electronic
CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.