Deep reinforcement learning for solving the stochastic e-waste collection problem

IF 6 2区 管理学 Q1 OPERATIONS RESEARCH & MANAGEMENT SCIENCE
Dang Viet Anh Nguyen, Aldy Gunawan, Mustafa Misir, Lim Kwan Hui, Pieter Vansteenwegen
{"title":"Deep reinforcement learning for solving the stochastic e-waste collection problem","authors":"Dang Viet Anh Nguyen, Aldy Gunawan, Mustafa Misir, Lim Kwan Hui, Pieter Vansteenwegen","doi":"10.1016/j.ejor.2025.04.033","DOIUrl":null,"url":null,"abstract":"With the growing influence of the internet and information technology, Electrical and Electronic Equipment (EEE) has become a gateway to technological innovations. However, discarded devices, also called e-waste, pose a significant threat to the environment and human health if not properly treated, disposed of, or recycled. In this study, we extend a novel model for the e-waste collection in an urban context: the Heterogeneous VRP with Multiple Time Windows and Stochastic Travel Times (HVRP-MTWSTT). We propose a solution method that employs deep reinforcement learning to guide local search heuristics (DRL-LSH). The contributions of this paper are as follows: (1) HVRP-MTWSTT represents the first stochastic VRP in the context of the e-waste collection problem, incorporating complex constraints such as multiple time windows across a multi-period horizon with a heterogeneous vehicle fleet, (2) The DRL-LSH model uses deep reinforcement learning to provide an online adaptive operator selection layer, selecting the appropriate heuristic based on the search state. The computational experiments demonstrate that DRL-LSH outperforms the state-of-the-art hyperheuristic method by 24.26% on large-scale benchmark instances, with the performance gap increasing as the problem size grows. Additionally, to demonstrate the capability of DRL-LSH in addressing real-world problems, we tested and compared it with reference metaheuristic and hyperheuristic algorithms using a real-world e-waste collection case study in Singapore. The results showed that DRL-LSH significantly outperformed the reference algorithms on a real-world instance in terms of operating profit.","PeriodicalId":55161,"journal":{"name":"European Journal of Operational Research","volume":"12 1","pages":""},"PeriodicalIF":6.0000,"publicationDate":"2025-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Journal of Operational Research","FirstCategoryId":"91","ListUrlMain":"https://doi.org/10.1016/j.ejor.2025.04.033","RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"OPERATIONS RESEARCH & MANAGEMENT SCIENCE","Score":null,"Total":0}
引用次数: 0

Abstract

With the growing influence of the internet and information technology, Electrical and Electronic Equipment (EEE) has become a gateway to technological innovations. However, discarded devices, also called e-waste, pose a significant threat to the environment and human health if not properly treated, disposed of, or recycled. In this study, we extend a novel model for the e-waste collection in an urban context: the Heterogeneous VRP with Multiple Time Windows and Stochastic Travel Times (HVRP-MTWSTT). We propose a solution method that employs deep reinforcement learning to guide local search heuristics (DRL-LSH). The contributions of this paper are as follows: (1) HVRP-MTWSTT represents the first stochastic VRP in the context of the e-waste collection problem, incorporating complex constraints such as multiple time windows across a multi-period horizon with a heterogeneous vehicle fleet, (2) The DRL-LSH model uses deep reinforcement learning to provide an online adaptive operator selection layer, selecting the appropriate heuristic based on the search state. The computational experiments demonstrate that DRL-LSH outperforms the state-of-the-art hyperheuristic method by 24.26% on large-scale benchmark instances, with the performance gap increasing as the problem size grows. Additionally, to demonstrate the capability of DRL-LSH in addressing real-world problems, we tested and compared it with reference metaheuristic and hyperheuristic algorithms using a real-world e-waste collection case study in Singapore. The results showed that DRL-LSH significantly outperformed the reference algorithms on a real-world instance in terms of operating profit.
求解随机电子垃圾收集问题的深度强化学习
随着互联网和信息技术的影响越来越大,电气电子设备(EEE)已成为技术创新的门户。然而,被丢弃的设备,也称为电子废物,如果不加以适当处理、处置或回收,将对环境和人类健康构成重大威胁。在这项研究中,我们扩展了一个新的城市背景下的电子垃圾收集模型:具有多时间窗和随机旅行时间的异构VRP (HVRP-MTWSTT)。我们提出了一种采用深度强化学习来指导局部搜索启发式(DRL-LSH)的解决方法。本文的贡献如下:(1)HVRP-MTWSTT代表了电子垃圾收集问题背景下的第一个随机VRP,包含了复杂的约束条件,如跨多周期水平的多时间窗口和异构车队;(2)DRL-LSH模型使用深度强化学习提供了一个在线自适应算子选择层,根据搜索状态选择合适的启发式。计算实验表明,在大规模基准实例上,DRL-LSH的性能优于最先进的超启发式方法24.26%,并且随着问题规模的增加,性能差距越来越大。此外,为了证明DRL-LSH在解决现实世界问题方面的能力,我们使用新加坡的现实世界电子垃圾收集案例研究,对其与参考元启发式和超启发式算法进行了测试和比较。结果表明,在实际实例中,DRL-LSH在运营利润方面明显优于参考算法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
European Journal of Operational Research
European Journal of Operational Research 管理科学-运筹学与管理科学
CiteScore
11.90
自引率
9.40%
发文量
786
审稿时长
8.2 months
期刊介绍: The European Journal of Operational Research (EJOR) publishes high quality, original papers that contribute to the methodology of operational research (OR) and to the practice of decision making.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信