{"title":"Another Look at Partially Observed Optimal Stochastic Control: Existence, Ergodicity, and Approximations Without Belief-Reduction","authors":"Serdar Yüksel","doi":"10.1007/s00245-024-10211-9","DOIUrl":"10.1007/s00245-024-10211-9","url":null,"abstract":"<div><p>We present an alternative view for the study of optimal control of partially observed Markov Decision Processes (POMDPs). We first revisit the traditional (and by now standard) separated-design method of reducing the problem to fully observed MDPs (belief-MDPs), and present conditions for the existence of optimal policies. Then, rather than working with this standard method, we define a Markov chain taking values in an infinite dimensional product space with the history process serving as the controlled state process and a further refinement in which the control actions and the state process are causally conditionally independent given the measurement/information process. We provide new sufficient conditions for the existence of optimal control policies under the discounted cost and average cost infinite horizon criteria. In particular, while in the belief-MDP reduction of POMDPs, weak Feller condition requirement imposes total variation continuity on either the system kernel or the measurement kernel, with the approach of this paper only weak continuity of both the transition kernel and the measurement kernel is needed (and total variation continuity is not) together with regularity conditions related to filter stability. For the discounted cost setup, we establish near optimality of finite window policies via a direct argument involving near optimality of quantized approximations for MDPs under weak Feller continuity, where finite truncations of memory can be viewed as quantizations of infinite memory with a uniform diameter in each finite window restriction under the product metric. 
For the average cost setup, we provide new existence conditions and also a general approach for initializing the randomness, which we show establishes convergence to the optimal cost. In the control-free case, our analysis leads to new and weak conditions for the existence and uniqueness of invariant probability measures for nonlinear filter processes, where we show that unique ergodicity of the measurement process, together with a measurability condition related to filter stability, leads to unique ergodicity.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2025-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142925674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Necessary Optimality Conditions for Minimax Multiprocesses","authors":"Abdallah Abdel Wahab, Piernicola Bettiol","doi":"10.1007/s00245-024-10209-3","DOIUrl":"10.1007/s00245-024-10209-3","url":null,"abstract":"<div><p>In this paper we establish necessary optimality conditions for minimax multiprocess problems: these are optimal control problems in which we have a family of control systems coupled by endpoint constraints and a minimax cost functional to minimize.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142890062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"State-Constrained Optimal Control of a Coupled Quasilinear Parabolic System Modeling Economic Growth in the Presence of Technological Progress","authors":"Mohamed Mehdaoui, Deborah Lacitignola, Mouhcine Tilioua","doi":"10.1007/s00245-024-10214-6","DOIUrl":"10.1007/s00245-024-10214-6","url":null,"abstract":"<div><p>We develop an optimal control framework that makes it possible to determine the most beneficial ways of investing in technology and directing capital within an economy. Our framework features three main novelties: the optimization of a cross–diffusion term that incorporates the allocation of capital towards specific regions with a higher level of technology; the coupling of technological progress with the capital in the state system; and the inclusion of an inequality constraint imposing that the squared norm of technological progress does not surpass a capacity <span>\\(M_A>0\\)</span>, which is more practical in economic applications. This leads to a new state-constrained optimal control problem which we analyze as follows. First, by examining the weak well-posedness of the dynamics, we identify a threshold parameter <span>\\(M^*>0\\)</span> such that when <span>\\(M_A\\ge M^*\\)</span>, the state-constraint can be omitted. In this case, we deal with a reduced state-unconstrained optimal control problem. On the other hand, when <span>\\(M_A<M^*\\)</span>, the state-constraint is not implicitly incorporated. Consequently, we proceed by a penalization approach to formulate a sequence of state-unconstrained optimal control problems and provide necessary optimality conditions for its associated sequence of locally optimal solutions. Subsequently, we prove that the sequence of locally optimal solutions converges strongly to a locally optimal solution for the original state-constrained optimal control problem and retrieve its necessary optimality conditions. 
Finally, we perform various numerical simulations to illustrate the effects of optimal investment in technology and optimal capital direction on the economy. This study could offer interesting insights in the perspective of circular economy transition.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142889664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimality of Vaccination for Prevalence-Constrained SIRS Epidemics","authors":"Jiacheng Chen, Kexin Feng, Lorenzo Freddi, Dan Goreac, Juan Li","doi":"10.1007/s00245-024-10212-8","DOIUrl":"10.1007/s00245-024-10212-8","url":null,"abstract":"<div><p>The aim of the present paper is to investigate the optimal vaccination policies with prevalence restrictions in an SIRS demographic model. We provide a well-posedness result for the system and give a thorough description of safety zones (immunity and feasible) when intensive care units (ICU) restrictions are enforced on the prevalence. Using Pontryagin’s principle for state-constrained dynamics we show that the optimal vaccination policy is of bang–bang type and give further specifics on the precise structure. The paper is intended as a counter-part to Avram et al. (Appl Math Comput 418:126816, 2022) where non-pharmaceutical interventions have been considered.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142889931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Global Dynamics for a Class of Chemotaxis Systems with Density-Suppressed Motility and Nonlinear Indirect Signal Consumption","authors":"Quanyong Zhao, Jinrong Wang","doi":"10.1007/s00245-024-10215-5","DOIUrl":"10.1007/s00245-024-10215-5","url":null,"abstract":"<div><p>The paper is concerned with a chemotaxis model with nonlinear indirect signal consumption and density-suppressed motility </p><div><div><span>$$\\begin{aligned} \\left\\{ \\begin{aligned}&u_t=\\Delta (\\varphi (v)u)+f(u),&x\\in \\Omega ,t>0,\\\\&v_t=\\Delta v-vw^\\beta ,&x\\in \\Omega ,t>0,\\\\&w_t=-\\delta w+u,&x\\in \\Omega ,t>0, \\end{aligned} \\right. \\end{aligned}$$</span></div></div><p>under homogeneous Neumann boundary conditions in a smooth bounded domain <span>\\(\\Omega \\subset \\mathbb {R}^n\\)</span> <span>\\((n\\ge 1)\\)</span>, where the parameters <span>\\(\\delta \\)</span>, <span>\\(\\beta >0\\)</span>, and <span>\\(\\varphi (v)\\)</span> is a motility function. The purpose of this paper is to determine the size of the absorption exponent to ensure the existence of global bounded classical solutions to the problem. Specifically, we first showed that when <span>\\(f(u)=0\\)</span>, the system has a global bounded classical solution if <span>\\(\\beta \\le 2\\)</span>, <span>\\(n=1\\)</span>, or <span>\\(\\beta <\\frac{4}{n}\\)</span>, <span>\\(n\\ge 2\\)</span>, or suitably small initial data. Subsequently, when <span>\\(f(u)=ru-\\mu u^\\alpha \\)</span> with <span>\\(r\\in \\mathbb {R}\\)</span>, <span>\\(\\mu >0\\)</span>, <span>\\(\\alpha >1\\)</span>, it was shown that the system admits a global bounded classical solution if <span>\\(\\beta \\le \\alpha \\)</span>, <span>\\(n=1\\)</span> or <span>\\(\\beta <\\max \\bigl \\{\\alpha -1, \\frac{2\\alpha }{n}\\bigr \\}\\)</span>, <span>\\(n\\ge 2\\)</span>, and that in the critical case <span>\\(\\beta =\\alpha -1\\)</span>, <span>\\(n\\ge 2\\)</span>, we proved the existence of global bounded classical solutions provided that <span>\\(\\mu \\)</span> is properly large. 
Moreover, we obtained the uniform convergence of bounded solutions to the system by constructing some suitable functionals.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142890529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Time Consistent Solution to Optimal Stopping Problems with Expectation Constraint","authors":"S. Christensen, M. Klein, B. Schultz","doi":"10.1007/s00245-024-10202-w","DOIUrl":"10.1007/s00245-024-10202-w","url":null,"abstract":"<div><p>We study the (weak) equilibrium problem arising from the problem of optimally stopping a one-dimensional diffusion subject to an expectation constraint on the time until stopping. The weak equilibrium problem is realized with a set of randomized but purely state dependent stopping times as admissible strategies. We derive a verification theorem and necessary conditions for equilibria, which together basically characterize all equilibria. Furthermore, additional structural properties of equilibria are obtained to feed a possible guess-and-verify approach, which is then illustrated by an example.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s00245-024-10202-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142858602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Continuous Time q-Learning for Mean-Field Control Problems","authors":"Xiaoli Wei, Xiang Yu","doi":"10.1007/s00245-024-10205-7","DOIUrl":"10.1007/s00245-024-10205-7","url":null,"abstract":"<div><p>This paper studies q-learning, recently coined as the continuous time counterpart of Q-learning by Jia and Zhou (J Mach Learn Res 24:1–61, 2023), for continuous time mean-field control problems in the setting of entropy-regularized reinforcement learning. In contrast to the single agent’s control problem in Jia and Zhou (J Mach Learn Res 24:1–61, 2023), we reveal that two different q-functions naturally arise in mean-field control problems: (i) the integrated q-function (denoted by <i>q</i>) as the first-order approximation of the integrated Q-function introduced in Gu et al. (Oper Res 71(4):1040–1054, 2023), which can be learnt by a weak martingale condition using all test policies; and (ii) the essential q-function (denoted by <span>\\(q_e\\)</span>) that is employed in the policy improvement iterations. We show that the two q-functions are related via an integral representation. Based on the weak martingale condition and our proposed searching method of test policies, some model-free learning algorithms are devised. 
In two examples, one within the LQ control framework and one beyond it, we obtain the exact parameterization of the optimal value function and q-functions and illustrate our algorithms with simulation experiments.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142826395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Control Randomisation Approach for Policy Gradient and Application to Reinforcement Learning in Optimal Switching","authors":"Robert Denkert, Huyên Pham, Xavier Warin","doi":"10.1007/s00245-024-10207-5","DOIUrl":"10.1007/s00245-024-10207-5","url":null,"abstract":"<div><p>We propose a comprehensive framework for policy gradient methods tailored to continuous time reinforcement learning. This is based on the connection between stochastic control problems and randomised problems, enabling applications across various classes of Markovian continuous time control problems, beyond diffusion models, including e.g. regular, impulse and optimal stopping/switching problems. By utilizing change of measure in the control randomisation technique, we derive a new policy gradient representation for these randomised problems, featuring parametrised intensity policies. We further develop actor-critic algorithms specifically designed to address general Markovian stochastic control issues. Our framework is demonstrated through its application to optimal switching problems, with two numerical case studies in the energy sector focusing on real options.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142826391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coarse Correlated Equilibria in Linear Quadratic Mean Field Games and Application to an Emission Abatement Game","authors":"Luciano Campi, Federico Cannerozzi, Fanny Cartellier","doi":"10.1007/s00245-024-10198-3","DOIUrl":"10.1007/s00245-024-10198-3","url":null,"abstract":"<div><p>Coarse correlated equilibria (CCE) are a good alternative to Nash equilibria (NE), as they arise more naturally as outcomes of learning algorithms and as they may exhibit higher payoffs than NE. CCEs include a device which allows players’ strategies to be correlated without any cooperation, only through information sent by a mediator. We develop a methodology to concretely compute mean field CCEs in a linear-quadratic mean field game (MFG) framework. We compare their performance to mean field control solutions and mean field NE (usually named MFG solutions). Our approach is implemented in the mean field version of an emission abatement game between greenhouse gas emitters. In particular, we exhibit a simple and tractable class of mean field CCEs which allows one to outperform very significantly the mean field NE payoff and abatement levels, bridging the gap between the mean field NE and the social optimum obtained by mean field control.</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s00245-024-10198-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142821384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Well-Posed Uniform Solvability of Convex Optimization Problems on a Uniform Differentiable Closed Convex Set","authors":"Shaoqiang Shang","doi":"10.1007/s00245-024-10206-6","DOIUrl":"10.1007/s00245-024-10206-6","url":null,"abstract":"<div><p>In this paper, we first define uniformly differentiable sets and the sets <span>\\(P(A,\\eta , r)\\)</span> and <span>\\(P_{A,\\delta }(f)\\)</span>. Secondly, we prove that if <i>A</i> is a bounded closed convex set, then <i>A</i> is uniformly differentiable if and only if for any <span>\\(\\varepsilon , \\eta , r>0\\)</span>, there exists <span>\\(\\delta =\\delta (\\varepsilon ,\\eta ,r )>0\\)</span> such that <span>\\(\\Vert x-y\\Vert <\\varepsilon \\)</span> whenever <span>\\(f\\in P(A,\\eta , r)\\)</span>, <span>\\(y\\in P_{A,\\delta }(f)\\)</span> and <span>\\(x\\in P_{A}(f)\\)</span>. Moreover, we prove that if <i>A</i> is a bounded closed convex set in a finite-dimensional space <i>X</i>, then <i>A</i> is differentiable if and only if <i>A</i> is uniformly differentiable. Finally, we give some examples of uniformly differentiable sets. We thereby extend some conclusions of (SIAM J. Optim. Vol. 30, No. 1, pp. 490–512).</p></div>","PeriodicalId":55566,"journal":{"name":"Applied Mathematics and Optimization","volume":"91 1","pages":""},"PeriodicalIF":1.6,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142811195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}