Going faster to see further: graphics processing unit-accelerated value iteration and simulation for perishable inventory control using JAX

IF 4.5 · Zone 3 (Management) · Q1 Operations Research & Management Science

Annals of Operations Research Pub Date : 2025-03-24 DOI:10.1007/s10479-025-06551-6

Joseph Farrington, Wai Keong Wong, Kezhi Li, Martin Utley

Annals of Operations Research, volume 349 (3), pages 1609-1638. Journal article.
Publisher page: https://link.springer.com/article/10.1007/s10479-025-06551-6
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12350524/pdf/

Citations: 0

Abstract

Value iteration can find the optimal replenishment policy for a perishable inventory problem, but is computationally demanding due to the large state spaces that are required to represent the age profile of stock. The parallel processing capabilities of modern graphics processing units (GPUs) can reduce the wall time required to run value iteration by updating many states simultaneously. The adoption of GPU-accelerated approaches has been limited in operational research relative to other fields like machine learning, in which new software frameworks have made GPU programming widely accessible. We used the Python library JAX to implement value iteration and simulators of the underlying Markov decision processes in a high-level interface, and relied on this library’s function transformations and compiler to efficiently utilize GPU hardware. Our method can extend use of value iteration to settings that were previously considered infeasible or impractical. We demonstrate this on example scenarios from three recent studies which include problems with over 16 million states and additional problem features, such as substitution between products, that increase computational complexity. We compare the performance of the optimal replenishment policies to heuristic policies, fitted using simulation optimization in JAX which allowed the parallel evaluation of multiple candidate policy parameters on thousands of simulated years. The heuristic policies gave a maximum optimality gap of 2.49%. Our general approach may be applicable to a wide range of problems in operational research that would benefit from large-scale parallel computation on consumer-grade GPU hardware.
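The idea of updating many states at once with JAX's function transformations can be illustrated with a minimal sketch. This is not the authors' implementation: it is a toy single-product problem with a two-period shelf life and zero lead time, so the state is just the number of one-period-old units carried over. All names (`outcome`, `q_value`, `sweep`, `value_iteration`), the cost parameters, and the truncated Poisson demand are assumptions made for illustration only.

```python
import jax
import jax.numpy as jnp
from jax.scipy.stats import poisson

# Hypothetical toy parameters (not taken from the paper).
MAX_ORDER = 10      # largest order; also bounds the carried-over stock
GAMMA = 0.95        # discount factor
C_ORDER, C_SHORT, C_WASTE, C_HOLD = 1.0, 5.0, 2.0, 0.1

demands = jnp.arange(16)
pmf = poisson.pmf(demands, 4.0)
pmf = pmf / pmf.sum()                 # renormalise the truncated Poisson(4)

states = jnp.arange(MAX_ORDER + 1)    # units that expire at the end of this period
actions = jnp.arange(MAX_ORDER + 1)   # fresh units ordered (arrive immediately)


def outcome(old, order, demand):
    """One-period transition for a single demand realisation (FIFO issuing)."""
    used_old = jnp.minimum(old, demand)
    used_new = jnp.minimum(order, demand - used_old)
    shortage = demand - used_old - used_new
    wasted = old - used_old           # unused old stock expires
    next_old = order - used_new       # unused fresh stock is carried over
    reward = -(C_ORDER * order + C_SHORT * shortage
               + C_WASTE * wasted + C_HOLD * next_old)
    return next_old, reward


def q_value(old, order, values):
    """Expected discounted return of ordering `order` when holding `old` units."""
    next_old, reward = jax.vmap(outcome, in_axes=(None, None, 0))(old, order, demands)
    return jnp.sum(pmf * (reward + GAMMA * values[next_old]))


# vmap over actions (inner) and states (outer) builds the full Q-table in one call.
q_table = jax.vmap(jax.vmap(q_value, in_axes=(None, 0, None)), in_axes=(0, None, None))


@jax.jit
def sweep(values):
    """One synchronous Bellman update of every state."""
    q = q_table(states, actions, values)
    return q.max(axis=1), q.argmax(axis=1)


def value_iteration(tol=1e-3, max_iter=10_000):
    values = jnp.zeros(states.shape[0])
    policy = jnp.zeros(states.shape[0], dtype=jnp.int32)
    for _ in range(max_iter):
        new_values, policy = sweep(values)
        if jnp.max(jnp.abs(new_values - values)) < tol:
            return new_values, policy
        values = new_values
    return values, policy


values, policy = value_iteration()
```

Here `jax.vmap` expresses the update for one state-action pair and lifts it to all pairs, while `jax.jit` compiles the whole sweep so that it runs as a single kernel launch sequence on whatever accelerator JAX finds. The same pattern extends to multi-dimensional age profiles by enumerating the states as rows of an array, at the cost of a much larger state space.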


Supplementary information: The online version contains supplementary material available at 10.1007/s10479-025-06551-6.


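Source journal

Annals of Operations Research (Management Science: Operations Research & Management Science)

CiteScore: 7.90
Self-citation rate: 16.70%
Articles published: 596
Review time: 8.4 months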

About the journal: The Annals of Operations Research publishes peer-reviewed original articles dealing with key aspects of operations research, including theory, practice, and computation. The journal publishes full-length research articles, short notes, expositions and surveys, reports on computational studies, and case studies that present new and innovative practical applications. In addition to regular issues, the journal publishes periodic special volumes that focus on defined fields of operations research, ranging from the highly theoretical to the algorithmic and the applied. These volumes have one or more Guest Editors who are responsible for collecting the papers and overseeing the refereeing process.