Risk-sensitive Markov decision processes with long-run CVaR criterion

IF 4.8 | Zone 3 (Management) | JCR Q1 (ENGINEERING, MANUFACTURING)
Li Xia, Luyao Zhang, Peter W. Glynn
Production and Operations Management, published 2023-09-27. DOI: 10.1111/poms.14077 (https://doi.org/10.1111/poms.14077)
Citations: 1

Abstract

CVaR (conditional value at risk) is a risk metric widely used in finance. However, dynamically optimizing CVaR is difficult, because the problem is not a standard Markov decision process (MDP) and the principle of dynamic programming fails. In this paper, we study the infinite-horizon discrete-time MDP with a long-run CVaR criterion, from the viewpoint of sensitivity-based optimization. By introducing a pseudo-CVaR metric, we reformulate the problem as a bilevel MDP model and derive a CVaR difference formula that quantifies the difference in long-run CVaR between any two policies. The optimality of deterministic policies is derived. We obtain a so-called Bellman local optimality equation for CVaR, which is a necessary and sufficient condition for locally optimal policies and only a necessary condition for globally optimal policies. A CVaR derivative formula is also derived to provide more sensitivity information. We then develop a policy-iteration-type algorithm to efficiently optimize CVaR, which is shown to converge to a local optimum in the mixed policy space. Furthermore, based on the sensitivity analysis of our bilevel MDP formulation and its critical points, we develop a globally optimal algorithm. The piecewise linearity and segment convexity of the optimal pseudo-CVaR function are also established. Our main results and algorithms are further extended to optimize the mean and CVaR simultaneously. Finally, we conduct numerical experiments on portfolio management to demonstrate the main results. Our work sheds light on dynamically optimizing CVaR from a sensitivity viewpoint.
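The abstract does not define the pseudo-CVaR metric, so the following is only a minimal sketch under stated assumptions rather than the authors' formulation. It assumes the standard Rockafellar-Uryasev representation, in which CVaR at level alpha is the minimum over y of y + E[(c - y)^+] / (1 - alpha), treats c as a per-step cost (larger is worse), and evaluates it under the stationary distribution of a chain induced by one fixed policy. The transition matrix, costs, and function names are hypothetical illustrations.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of an ergodic chain: solve pi P = pi, sum(pi) = 1."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])  # balance equations plus normalisation
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def pseudo_cvar(y, costs, pi, alpha):
    """Assumed pseudo-CVaR: y + E[(c - y)^+] / (1 - alpha) under pi."""
    return y + np.dot(pi, np.maximum(costs - y, 0.0)) / (1.0 - alpha)

def long_run_cvar(costs, pi, alpha):
    """Minimise the pseudo-CVaR over y.

    The function is piecewise linear and convex in y with breakpoints at the
    cost values, so checking the cost values themselves is enough.
    """
    return min(pseudo_cvar(y, costs, pi, alpha) for y in costs)

# Hypothetical two-state chain under one fixed policy.
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
costs = np.array([1.0, 10.0])
pi = stationary_distribution(P)

print("stationary distribution:", pi)
print("long-run CVaR at alpha = 0.5:", long_run_cvar(costs, pi, 0.5))
print("long-run CVaR at alpha = 0.9:", long_run_cvar(costs, pi, 0.9))
```

Under these assumptions, the inner step (evaluating a policy for a fixed y) is an ordinary long-run average problem with cost (c - y)^+, while the outer step searches over y, which is one way to read the bilevel structure the abstract mentions.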
Source journal
Production and Operations Management (Management Science - Engineering: Manufacturing)
CiteScore: 7.50
Self-citation rate: 16.00%
Articles published: 278
Review time: 24 months
About the journal: The mission of Production and Operations Management is to serve as the flagship research journal in operations management in manufacturing and services. The journal publishes scientific research into the problems, interests, and concerns of managers who manage product and process design, operations, and supply chains. It covers all topics in product and process design, operations, and supply chain management and welcomes papers using any research paradigm.