Title: Mean-variance optimization in finite horizon Markov decision processes and its application to revenue management
Authors: Rainer Schlosser, Jochen Gönsch
Journal: European Journal of Operational Research (JCR Q1, Operations Research & Management Science; impact factor 6.0)
Published: 2025-04-11
DOI: https://doi.org/10.1016/j.ejor.2025.03.030
Citations: 0
Abstract
In many applications, risk-averse decision-making is crucial. In this context, the mean–variance (MV) criterion is widely accepted and often used to balance maximizing expected rewards against avoiding poor performance. In dynamic settings, however, it is challenging to compute policies efficiently under the MV objective, and hence surrogates such as the exponential utility model are often used. In this paper, we consider MV optimization for discrete-time Markov decision processes (MDPs) with a finite horizon. Our approach is based on a system of tractable subproblems with distorted variance that allows us to identify mean–variance combinations that cannot be attained. The number of subproblems to solve can be chosen such that a predetermined ex-ante optimality gap is obtained. We illustrate the effectiveness and applicability of our approach on several revenue management examples. We find that competitive ex-ante and ex-post optimality gaps lower than 0.0001% can be reliably obtained with acceptable computational effort.
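To make the idea of solving a family of tractable subproblems concrete, the following minimal sketch uses a generic, well-known reformulation and is not necessarily the authors' distorted-variance construction. It relies on the identity Var(R) = min over c of E[(R - c)^2]: for a fixed target c, maximizing E[R - LAM*(R - c)^2] is an ordinary expected-utility problem that backward induction can solve on a state augmented with the revenue collected so far. The pricing model, the numbers, and all names (PRICES, SALE_PROB, LAM, solve_subproblem) are hypothetical and chosen only for illustration.

```python
from functools import lru_cache

# Hypothetical toy pricing model (not taken from the paper).
PRICES = (1, 2, 3)                      # price menu
SALE_PROB = {1: 0.9, 2: 0.5, 3: 0.2}    # chance of selling one unit at each price
T, N, LAM = 4, 2, 0.5                   # horizon, initial stock, risk weight


def solve_subproblem(c):
    """Maximize E[R - LAM*(R - c)^2] for a fixed target c by backward induction.

    Returns (subproblem value, mean of R, variance of R) under the maximizing policy.
    """
    @lru_cache(maxsize=None)
    def best(t, n, r):
        # State: period t, remaining stock n, revenue r collected so far.
        # Returns (expected utility, E[R], E[R^2]) from this state onward.
        if t == T or n == 0:
            return (r - LAM * (r - c) ** 2, float(r), float(r * r))
        candidates = []
        for p in PRICES:
            q = SALE_PROB[p]
            sold = best(t + 1, n - 1, r + p)
            kept = best(t + 1, n, r)
            candidates.append(tuple(q * s + (1 - q) * k for s, k in zip(sold, kept)))
        return max(candidates)  # tuples compare by expected utility first

    u, m1, m2 = best(0, N, 0)
    return u, m1, m2 - m1 ** 2


# Solve one subproblem per target c on a grid covering the attainable revenues 0..6.
results = [(c, *solve_subproblem(c)) for c in (i / 4 for i in range(25))]

# Each policy's true mean-variance value is at least its subproblem value,
# so the best policy on the grid gives a lower bound on the optimal MV value.
best_c, best_u, best_mean, best_var = max(results, key=lambda row: row[2] - LAM * row[3])
print(f"c={best_c:.2f}  mean={best_mean:.4f}  var={best_var:.4f}  "
      f"MV value={best_mean - LAM * best_var:.4f}")
```

A finer grid over c tightens the approximation at the cost of more subproblems, which loosely mirrors the trade-off between the number of subproblems and the optimality gap described above.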
About the journal:
The European Journal of Operational Research (EJOR) publishes high-quality, original papers that contribute to the methodology of operational research (OR) and to the practice of decision making.