On maximizing probabilities for over-performing a target for Markov decision processes

IF 1.7 Zone 3 (Engineering & Technology) Q2 ENGINEERING, MULTIDISCIPLINARY

Optimization and Engineering Pub Date : 2023-12-08 DOI:10.1007/s11081-023-09870-4

Tanhao Huang, Yanan Dai, Jinwen Chen

Citations: 0

Abstract

This paper studies the duality between risk-sensitive control and large deviation control for Markov decision processes, where the large deviation problem maximizes the probability of outperforming a target. To derive the duality, we apply a nonlinear extension of the Krein-Rutman theorem to characterize the optimal risk-sensitive value and prove that a stationary, deterministic optimal policy exists. The right-hand derivative of this value function characterizes the targets for which the duality holds. We prove that the optimal policy for the outperformance probability can be approximated by the optimal policy for the risk-sensitive control problem. The range of the one-sided (right-hand and left-hand) derivatives of the optimal risk-sensitive value function plays an important role in these results. Some essential differences between the two types of optimal control problems are also presented.
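To make the objects in the abstract concrete, here is a minimal Python sketch for a finite MDP. It is an illustration under stated assumptions, not the authors' construction. It assumes the risk-sensitive value Λ(θ) is the logarithm of the principal eigenvalue of the multiplicative Bellman operator (T h)(x) = max_a exp(θ r(x,a)) Σ_y P(y|x,a) h(y), approximates it by a normalized (nonlinear) power iteration, and then evaluates the textbook Legendre-transform rate I(c) = sup_{θ≥0} [θc − Λ(θ)] on a grid. The names `risk_sensitive_value`, `rate`, `P_toy`, and `r_toy` are hypothetical, and the toy transition and reward data are invented for the example.

```python
import math


def risk_sensitive_value(P, r, theta, iters=2000, tol=1e-12):
    """Approximate Lambda(theta) and a greedy stationary deterministic policy.

    P[x][a] is a list of transition probabilities from state x under action a;
    r[x][a] is the one-step reward. Assumes positive transition probabilities
    so that the normalized iteration converges (an assumption of this sketch).
    """
    n = len(P)
    h = [1.0] * n          # current positive "eigenfunction" guess
    lam = 0.0              # current estimate of Lambda(theta)
    policy = [0] * n
    for _ in range(iters):
        Th = []
        for x in range(n):
            # Value of each action under the multiplicative Bellman operator.
            vals = [
                math.exp(theta * r[x][a])
                * sum(p * hy for p, hy in zip(P[x][a], h))
                for a in range(len(P[x]))
            ]
            best = max(range(len(vals)), key=vals.__getitem__)
            policy[x] = best
            Th.append(vals[best])
        norm = max(Th)                 # eigenvalue estimate, since max(h) == 1
        new_lam = math.log(norm)
        h_new = [v / norm for v in Th]
        done = (abs(new_lam - lam) < tol
                and max(abs(u - v) for u, v in zip(h_new, h)) < tol)
        h, lam = h_new, new_lam
        if done:
            break
    return lam, policy


def rate(P, r, c, thetas):
    """Grid approximation of sup over theta >= 0 of theta * c - Lambda(theta)."""
    return max(t * c - risk_sensitive_value(P, r, t)[0] for t in thetas)


# Hypothetical two-state, two-action MDP used only for illustration.
P_toy = [[[0.9, 0.1], [0.5, 0.5]],
         [[0.2, 0.8], [0.6, 0.4]]]
r_toy = [[1.0, 0.0],
         [0.0, 2.0]]

if __name__ == "__main__":
    thetas = [0.1 * k for k in range(31)]
    lam, policy = risk_sensitive_value(P_toy, r_toy, 1.0)
    eps = 1e-4
    slope = (risk_sensitive_value(P_toy, r_toy, 1.0 + eps)[0] - lam) / eps
    print("Lambda(1.0) ~", lam, "policy:", policy)
    print("right-hand difference quotient at theta=1.0 ~", slope)
    print("rate at target c=1.5 ~", rate(P_toy, r_toy, 1.5, thetas))
```

The finite-difference slope stands in for the right-hand derivative mentioned in the abstract: under the usual large-deviation heuristic, targets c that lie in the range of these slopes are the ones for which exp(−n I(c)) is expected to describe the decay of the outperformance probability. Whether this matches the paper's exact conditions is an assumption of this sketch.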


Source journal

Optimization and Engineering (Engineering & Technology; Engineering: Multidisciplinary)

CiteScore: 4.80

Self-citation rate: 14.30%

Review time: >12 weeks

Journal description: Optimization and Engineering is a multidisciplinary journal; its primary goal is to promote the application of optimization methods in the general area of engineering sciences. We expect submissions to OPTE not only to make a significant optimization contribution but also to impact a specific engineering application.

Topics of Interest:

- Optimization: All methods and algorithms of mathematical optimization, including blackbox and derivative-free optimization, continuous optimization, discrete optimization, global optimization, linear and conic optimization, multiobjective optimization, PDE-constrained optimization & control, and stochastic optimization. Numerical and implementation issues, optimization software, benchmarking, and case studies.
- Engineering Sciences: Aerospace engineering, biomedical engineering, chemical & process engineering, civil, environmental, & architectural engineering, electrical engineering, financial engineering, geosciences, healthcare engineering, industrial & systems engineering, mechanical engineering & MDO, and robotics.