Heuristic mean-variance optimization in Markov decision processes using state-dependent risk aversion
Author: Rainer Schlosser
Journal: IMA Journal of Management Mathematics, vol. 33 (2), pp. 181-199
Published: 2021-03-01
DOI: 10.1093/imaman/dpab009
In dynamic decision problems, it is challenging to find the right balance between maximizing expected rewards and minimizing risks. In this paper, we consider NP-hard mean-variance (MV) optimization problems in Markov decision processes with a finite time horizon. We present a heuristic approach for solving MV problems that is based on state-dependent risk aversion and efficient dynamic programming techniques. Our approach can also be applied to mean-semivariance (MSV) problems, which focus specifically on downside risk. We demonstrate the applicability and effectiveness of our heuristic for dynamic pricing applications. Using reproducible examples, we show that our approach outperforms existing state-of-the-art benchmark models for MV and MSV problems while also providing competitive runtimes. Further, compared to models based on constant risk levels, we find that state-dependent risk aversion allows more effective intervention when sales processes deviate from their planned paths. Our concepts are domain-independent, easy to implement and of low computational complexity.
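To make the MV setting concrete, the following is a minimal sketch and not the paper's heuristic. It assumes a toy dynamic pricing MDP (at most one sale per period, a hypothetical demand model `sale_prob`, and an illustrative price grid and risk weight `LAMBDA`). It computes a risk-neutral pricing policy by backward induction and then evaluates the mean and variance of that policy's total revenue by propagating first and second moments, which gives the MV score E[R] - LAMBDA * Var[R].

```python
from typing import Callable, List, Tuple

Policy = List[List[float]]  # policy[t][n] = price charged in period t with n units left


def sale_prob(price: float) -> float:
    """Hypothetical demand model (an assumption): chance of one sale per period."""
    return max(0.0, min(1.0, 1.0 - price / 20.0))


def risk_neutral_policy(T: int, N: int, prices: List[float],
                        prob: Callable[[float], float] = sale_prob) -> Policy:
    """Backward induction that maximizes expected revenue only (no risk term)."""
    V = [[0.0] * (N + 1) for _ in range(T + 1)]
    policy = [[prices[0]] * (N + 1) for _ in range(T)]
    for t in range(T - 1, -1, -1):
        for n in range(1, N + 1):  # with n = 0 nothing can be sold, value stays 0
            best_val, best_price = max(
                (prob(a) * (a + V[t + 1][n - 1]) + (1 - prob(a)) * V[t + 1][n], a)
                for a in prices
            )
            V[t][n] = best_val
            policy[t][n] = best_price
    return policy


def evaluate_policy(T: int, N: int, policy: Policy,
                    prob: Callable[[float], float] = sale_prob) -> Tuple[float, float]:
    """Return (mean, variance) of total revenue under a fixed policy.

    m1 and m2 hold the first and second moments of the revenue-to-go.
    """
    m1 = [[0.0] * (N + 1) for _ in range(T + 1)]
    m2 = [[0.0] * (N + 1) for _ in range(T + 1)]
    for t in range(T - 1, -1, -1):
        for n in range(1, N + 1):
            a = policy[t][n]
            q = prob(a)
            m1[t][n] = q * (a + m1[t + 1][n - 1]) + (1 - q) * m1[t + 1][n]
            # E[(a + X)^2] = a^2 + 2 a E[X] + E[X^2] on the sale branch
            m2[t][n] = (q * (a * a + 2 * a * m1[t + 1][n - 1] + m2[t + 1][n - 1])
                        + (1 - q) * m2[t + 1][n])
    mean = m1[0][N]
    return mean, m2[0][N] - mean ** 2


# Illustrative parameters (assumptions, not taken from the paper).
T, N, PRICES, LAMBDA = 10, 5, [4.0, 8.0, 12.0, 16.0], 0.05
pi = risk_neutral_policy(T, N, PRICES)
mean, var = evaluate_policy(T, N, pi)
print(f"mean={mean:.3f} variance={var:.3f} MV score={mean - LAMBDA * var:.3f}")
```

The same `evaluate_policy` routine can score any other policy table, so candidate policies can be compared on the MV criterion. Because the variance term does not decompose over time in the usual way, plain backward induction does not in general yield an MV-optimal policy, which is the difficulty the abstract refers to.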
About the journal:
The mission of this quarterly journal is to publish mathematical research of the highest quality, impact and relevance that can be directly utilised, or has demonstrable potential to be employed, by managers in profit, not-for-profit, third-party and governmental/public organisations to improve their practices. Thus the research must be quantitative and of the highest quality if it is to be published in the journal. Furthermore, the outcome of the research must ultimately be useful for managers. The journal also publishes novel meta-analyses of the literature, reviews of the state of the art that provide new insight, and genuine applications of mathematics to real-world problems in the form of case studies. The journal welcomes papers dealing with topics in Operational Research and Management Science, Operations Management, Decision Sciences, Transportation Science, Marketing Science, Analytics, and Financial and Risk Modelling.