Robustness of Stochastic Optimal Control to Approximate Diffusion Models Under Several Cost Evaluation Criteria
Somnath Pradhan, Serdar Yüksel
Mathematics of Operations Research, vol. 48, no. 1. Published 2023-10-12.
DOI: https://doi.org/10.1287/moor.2022.0134
Abstract
In control theory, one typically assumes a nominal model, designs an optimal control for it, and then applies that control to the actual (true) system. The mismatch between the true and assumed models leads to a loss in performance. A robustness problem in this context is to show that the error caused by this mismatch decreases to zero as the assumed model approaches the true model. We study this problem when the state dynamics of the system are governed by controlled diffusion processes. In particular, we discuss continuity and robustness properties of finite-horizon and infinite-horizon (α-discounted and ergodic) optimal control problems for a general class of nondegenerate controlled diffusion processes, as well as for optimal control up to an exit time. Under a general set of assumptions and a convergence criterion on the models, we first establish that the optimal value of the approximate model converges to the optimal value of the true model. We then establish that the error incurred by applying a control policy designed for an incorrectly estimated model to the true model decreases to zero as the incorrect model approaches the true model. Compared with related results in the discrete-time setup, the continuous-time theory lets us use the strong regularity properties of solutions to the optimality (Hamilton–Jacobi–Bellman) equations, via the theory of uniformly elliptic partial differential equations, to arrive at strong continuity and robustness properties.

Funding: The research of S. Yüksel was partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
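The two convergence claims in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a scalar linear-quadratic diffusion dX = (aX + U) dt + σ dW with α-discounted cost E∫ e^{-αt}(qX² + rU²) dt, chosen only because it has a closed-form solution, and its unbounded coefficients may fall outside the paper's exact assumptions. The function names and all parameter values are hypothetical. A gain is designed for an assumed drift coefficient a + ε and then applied to the true model with coefficient a.

```python
import math


def riccati_gain(a, q=1.0, r=1.0, alpha=0.5):
    """Optimal gain k (policy U = -k X) for the toy discounted LQ problem.

    The HJB equation with V(x) = p x^2 + c reduces to
    p^2 / r - (2a - alpha) p - q = 0, and the optimal gain is k = p / r.
    """
    s = 2.0 * a - alpha
    p = 0.5 * r * (s + math.sqrt(s * s + 4.0 * q / r))  # positive root
    return p / r


def cost_of_gain(k, a, x0=1.0, sigma=1.0, q=1.0, r=1.0, alpha=0.5):
    """Discounted cost of the policy U = -k X on the model with drift a."""
    rate = alpha - 2.0 * (a - k)
    if rate <= 0.0:
        raise ValueError("this gain gives an infinite discounted cost")
    big_p = (q + r * k * k) / rate
    return big_p * x0 * x0 + sigma * sigma * big_p / alpha


def mismatch_report(a_true, errors):
    """For each model error eps, return (eps, value gap, mismatch gap).

    value gap    = |optimal value of assumed model - optimal value of true model|
    mismatch gap = cost of the assumed-model policy on the true model
                   minus the true optimal value
    """
    j_true = cost_of_gain(riccati_gain(a_true), a_true)
    rows = []
    for eps in errors:
        a_assumed = a_true + eps
        k_assumed = riccati_gain(a_assumed)
        value_gap = abs(cost_of_gain(k_assumed, a_assumed) - j_true)
        mismatch_gap = cost_of_gain(k_assumed, a_true) - j_true
        rows.append((eps, value_gap, mismatch_gap))
    return rows


if __name__ == "__main__":
    for eps, value_gap, mismatch_gap in mismatch_report(1.0, [1.0, 0.5, 0.1, 0.01]):
        print(f"eps={eps:<5} value gap={value_gap:.6f} mismatch gap={mismatch_gap:.6f}")
```

With these illustrative defaults and a = 1, the true optimal gain is 2 and the true optimal cost is 6. Both gaps shrink as ε decreases, which mirrors, in this special case only, the continuity of the optimal value and the vanishing of the mismatch error described in the abstract.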
About the journal:
Mathematics of Operations Research is an international journal of the Institute for Operations Research and the Management Sciences (INFORMS). The journal invites articles concerned with the mathematical and computational foundations in the areas of continuous, discrete, and stochastic optimization; mathematical programming; dynamic programming; stochastic processes; stochastic models; simulation methodology; control and adaptation; networks; game theory; and decision theory. Also sought are contributions to learning theory and machine learning that have special relevance to decision making, operations research, and management science. The emphasis is on originality, quality, and importance; correctness alone is not sufficient. Significant developments in operations research and management science not having substantial mathematical interest should be directed to other journals such as Management Science or Operations Research.