{"title":"An adaptive K-means and reinforcement learning (RL) algorithm to effective vaccine distribution","authors":"Elson Cibaku , İ. Esra Büyüktahtakın","doi":"10.1016/j.cor.2025.107275","DOIUrl":"10.1016/j.cor.2025.107275","url":null,"abstract":"<div><div>We present a new adaptive reinforcement learning (RL) approach, integrated with a K-means clustering algorithm and guided by simulated annealing, to address the capacitated vehicle routing for vaccine distribution (CVRVD) problem. This integrated method provides an efficient and scalable solution for optimizing vaccine distribution logistics. By incorporating cost factors related to travel distance, inventory levels, and penalty terms – while adhering to delivery time windows – our approach improves both operational efficiency and vaccine allocation effectiveness. Experimental results demonstrate that our K-means supported RL algorithm significantly outperforms traditional solvers in tackling this NP-hard problem, particularly in large-scale scenarios. Specifically, our approach can efficiently solve CVRVD instances with up to 1,000 facilities—scenarios that are computationally intractable for exact methods. We demonstrate the effectiveness of the adaptive K-means supported RL algorithm using data from New Jersey, USA, where facility-level vaccination data were available through the state’s Immunization Information System. 
Beyond vaccine distribution, our method has broad applicability in logistics and transportation, enabling more efficient and cost-effective allocation of critical resources such as vaccines and medical supplies.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107275"},"PeriodicalIF":4.3,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145096133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Q-learning-based hyper-heuristic algorithm for open dimension irregular packing problems","authors":"Yongchun Wang , Qingjin Peng , Zhen Wang , Shuiquan Huang , Zhengkai Xu , Chuanzhen Huang , Baosu Guo","doi":"10.1016/j.cor.2025.107279","DOIUrl":"10.1016/j.cor.2025.107279","url":null,"abstract":"<div><div>Heuristic methods provide a computationally efficient framework for addressing two-dimensional irregular packing problems, particularly in resource-constrained industrial settings. As a typical combinatorial optimization problem, irregular packing exhibits exponential growth in computational complexity with increasing workpiece counts, while the solution space dynamically reconfigures due to geometric variability among workpieces. Although heuristic algorithms can generate feasible layouts within acceptable timeframes, their reliance on fixed search rules limits adaptability to diverse scenarios, necessitating more flexible approaches. In this paper, a hyper-heuristic algorithm based on Q-Learning is proposed to solve open dimension packing problems, including one-open and two-open dimension problems. Q-Learning is adopted as the high-level strategy for its ability to optimize low-level heuristic selection through reward-driven experience accumulation. The method incorporates a mixed encoding method for solution representation, four specialized low-level heuristic operators, a linear population decline mechanism, and an elite preservation strategy to balance exploration and exploitation. The Q-Learning controller dynamically selects operators by updating the Q-table based on Bellman’s equation. The proposed algorithm is compared with several advanced algorithms on general datasets. 
The results show that our method has better performance and applicability.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107279"},"PeriodicalIF":4.3,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145096233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid modelling using simulation and machine learning in healthcare","authors":"Ali Ahmadi , Masoud Fakhimi , Carin Magnusson","doi":"10.1016/j.cor.2025.107278","DOIUrl":"10.1016/j.cor.2025.107278","url":null,"abstract":"<div><div>Modelling & Simulation (M&S) and Machine Learning (ML) methodologies have undergone significant advancements, enabling transformative applications across various industries. The integration of M&S and ML into a Hybrid M&S-ML approach leverages the unique strengths of both fields, offering enhanced model precision, improved efficiency, and more effective decision support. This review explores the increasing convergence of ML algorithms with traditional M&S methods (namely Agent-Based Modelling & Simulation, Discrete Event Simulation, and System Dynamics) in healthcare applications. Through a systematic review of 90 relevant studies, this article provides a comprehensive synthesis of the current state of the art of Hybrid M&S-ML in healthcare. Specifically, it examines the M&S and ML methodologies employed, associated software tools and programming languages, analyses integration patterns and data exchange mechanisms, and explores application domains, as well as the types and motivations for hybridisation. Key findings highlight prominent methodological and technical trends, as well as opportunities for combining M&S with ML to address healthcare challenges. These insights provide direction for modellers and data scientists in developing hybrid M&S-ML approaches that more effectively combine simulation capabilities with data-driven learning. 
The review also demonstrates the potential of such approaches to advance methodological innovation and support evidence-based decision-making in diverse healthcare contexts.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107278"},"PeriodicalIF":4.3,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145155372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A note on battery swapping policies in the electric vehicle routing problem with time windows and battery swapping vehicles","authors":"Bülent Çatay , İhsan Sadati","doi":"10.1016/j.cor.2025.107277","DOIUrl":"10.1016/j.cor.2025.107277","url":null,"abstract":"<div><div>Çatay and Sadati [An improved matheuristic for solving the electric vehicle routing problem with time windows and synchronized mobile charging/battery swapping. <em>Computers & Operations Research</em> 159, 106310, 2023] explore a variant of the Electric Vehicle Routing Problem with Time Windows that incorporates mobile chargers for recharging electric vehicles (EVs) at selected locations while serving customers. The authors propose a matheuristic method to address this problem and its special case, where EV batteries are swapped in constant time instead of being recharged over variable durations. While comparing their results with those in the literature, the authors overlook a critical assumption regarding the swapping policy, potentially causing confusion in interpreting the findings. This note addresses the issue, clarifies the overlooked assumption, and updates the results that do not align with the actual scenario in the literature. 
Furthermore, it introduces two new battery swapping policies and presents an extensive computational study to offer new insights on synchronized mobile battery swapping.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107277"},"PeriodicalIF":4.3,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145046602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grouping strategies on two-phase methods for bi-objective combinatorial optimization","authors":"Felipe O. Mota , Luís Paquete , Daniel Vanderpooten","doi":"10.1016/j.cor.2025.107254","DOIUrl":"10.1016/j.cor.2025.107254","url":null,"abstract":"<div><div>Two-phase methods are commonly used to solve bi-objective combinatorial optimization problems. In the first phase, all extreme supported nondominated points are generated through a dichotomic search. This phase also allows the identification of search zones that may contain other nondominated points. The second phase focuses on exploring these search zones to locate the remaining points, which typically accounts for most of the computational cost. Ranking algorithms are frequently employed to explore each zone individually, but this approach leads to redundancies, causing multiple visits to the same solutions. To mitigate these redundancies, we propose several strategies that group adjacent zones, allowing a single run of the ranking algorithm for the entire group. Additionally, we explore an implicit grouping approach based on a new concept of coverage. Our experiments on the Bi-Objective Spanning Tree Problem demonstrate the beneficial impact of these grouping strategies when combined with coverage.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107254"},"PeriodicalIF":4.3,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145046604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning-guided iterated local search for the minmax multiple traveling salesman problem","authors":"Pengfei He , Jin-Kao Hao , Jinhui Xia","doi":"10.1016/j.cor.2025.107255","DOIUrl":"10.1016/j.cor.2025.107255","url":null,"abstract":"<div><div>The minmax multiple traveling salesman problem involves minimizing the costs of a longest tour among a set of tours. The problem is of great practical interest because it can be used to formulate several real-life applications. To solve this computationally challenging problem, we propose a learning-driven iterated local search approach that combines an effective local search procedure to find high-quality local optimal solutions and a multi-armed bandit algorithm to select removal and insertion operators to escape local optimal traps. Extensive experiments on 77 commonly used benchmark instances show that the algorithm achieves excellent results in terms of solution quality and running time. In particular, it achieves 32 new best results (improved upper bounds) and matches the best-known results for 35 other instances. Additional experiments shed light on the understanding of the algorithm’s constituent elements. Multi-armed bandit selection can be used advantageously in other multi-operator local search algorithms.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107255"},"PeriodicalIF":4.3,"publicationDate":"2025-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145096132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Q-learning-based multi-objective hyper-heuristic algorithm for energy-efficient integrated distributed hybrid flow-shop scheduling with preventive maintenance","authors":"Zi-Qi Zhang , Xin-Yun Wu , Bin Qian , Rong Hu , Jian-Bo Yang","doi":"10.1016/j.cor.2025.107267","DOIUrl":"10.1016/j.cor.2025.107267","url":null,"abstract":"<div><div>Driven by the dual engines of supply chain integration and low-carbon transformation in industrial Internet of Things (IIoT) manufacturing systems, energy-efficient integrated distributed scheduling has emerged as a pivotal component of industrial intelligence-driven smart manufacturing. This article investigates the energy-efficient integrated distributed hybrid flow shop scheduling problem with preventive maintenance (EE-IDHFSP-PM), which aims to minimize the dual objectives of makespan and total carbon emissions. In this study, a mixed-integer linear programming (MILP) model is established for the EE-IDHFSP-PM, making the first attempt to solve such an NP-hard problem by using a <em>Q</em>-learning-based multi-objective hyper-heuristic algorithm (QLMHHA). First, a modified NEH-based initialization method is introduced to produce high-quality solutions that balance multiple optimization objectives, ensuring both the quality and diversity of initial populations. Second, a novel multi-stage collaborative energy-efficient strategy (MSC_EES) is developed to dynamically adjust the processing speeds of machines on non-critical paths, which reduces energy consumption across stages. Third, a new <em>Q</em>-learning-based high-level strategy (HLS) is devised to dynamically coordinate twelve low-level heuristics (LLHs) according to specific states, improving adaptive search efficiency through superior exploration–exploitation trade-offs. 
Fourth, a dual-criterion reward mechanism is proposed to evaluate population quality in terms of both convergence and diversity, which can deliver immediate feedback and effectively guide evolutionary processes. Fifth, comprehensive convergence and computational complexity analyses of critical components are conducted to confirm the stability, reliability, and efficiency of QLMHHA. Extensive experiments are carried out on 54 small-scale and 24 large-scale instances, which demonstrate that QLMHHA achieves promising performance in both effectiveness and efficacy against state-of-the-art multi-objective algorithms for addressing the EE-IDHFSP-PM. These findings validate the efficacy and superiority of QLMHHA in tackling complex scheduling challenges, providing valuable theoretical implications and practical insights for energy-efficient distributed manufacturing systems.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107267"},"PeriodicalIF":4.3,"publicationDate":"2025-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145217198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Q-learning-based evolutionary algorithm for solving the low-carbon multi-objective flexible job shop scheduling problem","authors":"Zhixue Wang , Maowei He , Hanning Chen , Yabao Hu , Yelin Xia","doi":"10.1016/j.cor.2025.107266","DOIUrl":"10.1016/j.cor.2025.107266","url":null,"abstract":"<div><div>In recent years, how to reduce energy consumption at the manufacturing system level in the low-carbon multi-objective flexible job shop scheduling problem (LCM-FJSP) has received significant attention. In this research, a model with the maximum completion time, total machine workload and total carbon emissions is built. Moreover, a Q-learning-based adaptive weight-adjusted decomposition evolutionary algorithm (QMOEA/D-AWA) is proposed. In the QMOEA/D-AWA, an initialization strategy with four heuristic initial rules for obtaining a high-quality population, a variable neighborhood search strategy with four problem-specific local search methods for enhancing exploration and a Q-learning-based parameter adaptive strategy for automatically determining the number of neighborhood solutions are designed. To validate the effectiveness of the proposed QMOEA/D-AWA, it is compared with five state-of-the-art algorithms on 15 instances. In the statistical analysis, the QMOEA/D-AWA obtains the overwhelming metric results in 10 instances. 
In the visual analysis, the completion time is reduced by 3.74%, the total workload is reduced by 3.94%, and the carbon emissions are reduced by 5.94%.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107266"},"PeriodicalIF":4.3,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145020002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A benders-branch-and-cut methodology for global cargo vessel traffic prediction given declining arctic sea ice and changing risks","authors":"Wenjie Li, Elise Miller-Hooks","doi":"10.1016/j.cor.2025.107265","DOIUrl":"10.1016/j.cor.2025.107265","url":null,"abstract":"<div><div>Global warming has led to declining sea-ice in the Arctic Ocean, making it easier for ice-class vessels to navigate Arctic waters for greater portions of the year. As sailing conditions in these waters improve over coming decades, these passageways are expected to open for larger portions of the year and to become increasingly viable options for unsupported transit and even open-water vessels. This paper proposes a Benders-branch-and-cut methodology for estimating changes in global maritime cargo flow patterns under future climate scenarios with declining Arctic sea ice. The model accounts for changing incident risk along Arctic passageways and corresponding ice-class vessel and icebreaker escort requirements, lower speeds, increased insurance premiums, higher accident probabilities, and constraints on path-based maximum risk exposure. The resulting mixed-integer program involves path-based, continuous decision variables. The solution technique is applied on a model of the global maritime container network including 80 ports, 76 routes, 426 links and 4,303 legs associated with the world’s largest carrier alliance. Embedded acceleration techniques and a label-correcting algorithm that employs specialized fathoming rules for a non-additive, constrained path subproblem enable solution at this global scale. The outcome is an estimate of seasonal future global maritime trade flows along key global routes and through ports predicted under six climate-related scenarios. 
Results illustrate that the developed model can provide support to companies, nations and regions as they prepare for a changing global landscape and climate.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107265"},"PeriodicalIF":4.3,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145020001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A data-driven optimization approach for the integrated train scheduling and maintenance planning in high-speed railways","authors":"Hangyu Ji , Chuntian Zhang , Jiateng Yin , Lixing Yang","doi":"10.1016/j.cor.2025.107261","DOIUrl":"10.1016/j.cor.2025.107261","url":null,"abstract":"<div><div>In railway systems, preventive maintenance plans are essential for ensuring the safety of train operations. However, these tasks are often subject to various disturbances (e.g., bad weather), leading to unpredictable deviations between planned and actual maintenance durations, which can further disrupt train schedules. Unlike most studies that assume constant maintenance durations, this paper introduces a data-driven, two-stage distributionally robust optimization (DRO) model for jointly optimizing train scheduling and maintenance planning. In the first stage, we determine the initial train schedule and maintenance plan. In the second stage, we allow for slight adjustments to train departure and arrival times at each station to accommodate disturbances affecting maintenance tasks. Our objective is to minimize both the expected travel time of trains and the deviation from the planned schedule under worst-case scenarios for maintenance disturbances. To capture the uncertainty of maintenance disturbances, we construct an ambiguity set using historical data and the Wasserstein metric. We show that the proposed two-stage DRO model, formulated over the Wasserstein ambiguity set, can be reformulated into an efficiently solvable equivalent form. Finally, we apply our model to a real-world case study of the Beijing–Guangzhou high-speed railway and compare it with traditional stochastic programming methods, including sample average approximation and robust optimization. 
The results highlight the efficiency of our approach and provide valuable insights for railway management.</div></div>","PeriodicalId":10542,"journal":{"name":"Computers & Operations Research","volume":"185 ","pages":"Article 107261"},"PeriodicalIF":4.3,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145046601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}