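Multi-echelon inventory optimization of waste electrical and electronic equipment closed-loop supply chain based on reinforcement learning under carbon tax policy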

IF 8.0 | Zone 2 (Computer Science) | JCR Q1 (AUTOMATION & CONTROL SYSTEMS)

Engineering Applications of Artificial Intelligence Pub Date : 2025-05-09 DOI:10.1016/j.engappai.2025.110987

Jinlong Wang , Shangzhuo Zhou , Min Li , Guanyu Ren , Xianquan Ren , Xiaoyun Xiong , Yuanyuan Zhang

Volume 154, Article 110987 | Journal Article | https://www.sciencedirect.com/science/article/pii/S095219762500987X

Citations: 0

Abstract


Multi-echelon inventory optimization of waste electrical and electronic equipment closed-loop supply chain based on reinforcement learning under carbon tax policy

In response to environmental challenges posed by waste electrical and electronic equipment (WEEE), the WEEE closed-loop supply chain (CLSC) has emerged as a crucial means to promote circular economy through the recycling and reuse of WEEE, which not only reduces waste emissions but also improves the efficiency of resource utilization. In practice, both economic and environmental benefits must be considered in sustainable manufacturing within the CLSC to ensure the sustainable development of enterprises. Therefore, a multi-echelon, multi-period inventory model for the CLSC under carbon tax policy is developed in this paper, which innovatively introduces the carbon footprint to assess environmental impact and integrates the impacts of collection planning and the uncertainty of recycling quantity and product demand on the system, aiming to minimize total enterprise costs. To address the uncertainties in the model, the Proximal Policy Optimization (PPO) algorithm is employed to train a reinforcement learning (RL) agent. This agent enables enterprises to dynamically adjust internal strategies such as collection and production, as well as external procurement strategies, based on inventory levels and market conditions. By internalizing carbon emission costs through the carbon tax rate, the RL agent optimizes total costs while achieving a balance between economic and environmental benefits. Numerical experiments demonstrate that the PPO algorithm outperforms the traditional inventory management policy in terms of both cost control and carbon footprint reduction. Moreover, the moderate carbon tax policy on the CLSC appropriately increases the cost of enterprises while significantly reducing their carbon footprint and promoting sustainable development.
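The abstract does not state the cost function itself, so the following is only a minimal sketch of what "internalizing carbon emission costs through the carbon tax rate" could look like as a per-period RL reward. The names `PeriodOutcome`, `taxed_cost`, and `reward`, the units, and all numbers are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodOutcome:
    """Hypothetical per-period result of one action (not the paper's notation)."""
    operating_cost: float  # e.g. holding + production + collection + procurement
    emissions_kg: float    # carbon footprint of the period, in kg CO2e


def taxed_cost(outcome: PeriodOutcome, tax_rate_per_kg: float) -> float:
    """Operating cost plus emissions priced at the carbon tax rate."""
    return outcome.operating_cost + tax_rate_per_kg * outcome.emissions_kg


def reward(outcome: PeriodOutcome, tax_rate_per_kg: float) -> float:
    """A cost-minimizing agent can be trained on the negative taxed cost."""
    return -taxed_cost(outcome, tax_rate_per_kg)


# Two illustrative plans with made-up numbers:
# relying on collected returns is slightly dearer but emits less.
remanufacture_heavy = PeriodOutcome(operating_cost=1050.0, emissions_kg=200.0)
procurement_heavy = PeriodOutcome(operating_cost=1000.0, emissions_kg=400.0)

for rate in (0.0, 0.5):
    best = max(
        (remanufacture_heavy, procurement_heavy),
        key=lambda outcome: reward(outcome, rate),
    )
    label = "remanufacture" if best is remanufacture_heavy else "procurement"
    print(f"tax rate {rate}/kg: preferred plan = {label}")
```

With these made-up numbers the untaxed agent prefers external procurement, while a tax of 0.5 per kg makes the lower-emission plan cheaper overall. This mirrors, in toy form, the trade-off the abstract describes; the paper's actual model is multi-echelon and multi-period, and its agent is trained with PPO.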


Source journal

Engineering Applications of Artificial Intelligence (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

9.60

Self-citation rate

10.00%

Articles published

505

Review time

68 days

About the journal: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.