{"title":"Low-complexity algorithm for restless bandits with imperfect observations","authors":"Keqin Liu, Richard Weber, Chengzhong Zhang","doi":"10.1007/s00186-024-00868-x","DOIUrl":"https://doi.org/10.1007/s00186-024-00868-x","url":null,"abstract":"<p>We consider a class of restless bandit problems with broad applications in reinforcement learning and stochastic optimization. Specifically, we consider <i>N</i> independent discrete-time Markov processes, each of which has two possible states: 1 and 0 (‘good’ and ‘bad’). Only if a process is both in state 1 and observed to be so does reward accrue. The aim is to maximize the expected discounted sum of returns over the infinite horizon subject to a constraint that only <i>M</i> <span>(&lt; <i>N</i>)</span> processes may be observed at each step. Observation is error-prone: there are known probabilities that state 1 (0) will be observed as 0 (1). From this one knows, at any time <i>t</i>, a probability that process <i>i</i> is in state 1. The resulting system may be modeled as a restless multi-armed bandit problem with an information state space of uncountable cardinality. Restless bandit problems with even finite state spaces are PSPACE-hard in general. We propose a novel approach for simplifying the dynamic programming equations of this class of restless bandits and develop a low-complexity algorithm that achieves strong performance and is readily extensible to the general restless bandit model with observation errors. Under certain conditions, we establish the existence (indexability) of the Whittle index and its equivalence to our algorithm. When those conditions do not hold, we show by numerical experiments the near-optimal performance of our algorithm in the general parameter space. 
Furthermore, we theoretically prove the optimality of our algorithm for homogeneous systems.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142201491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-stage distributionally robust convex stochastic optimization with Bayesian-type ambiguity sets","authors":"Wentao Ma, Zhiping Chen","doi":"10.1007/s00186-024-00872-1","DOIUrl":"https://doi.org/10.1007/s00186-024-00872-1","url":null,"abstract":"<p>Existing methods for constructing ambiguity sets in distributionally robust optimization often suffer from over-conservativeness and inefficient utilization of available data. To address these limitations and to practically solve multi-stage distributionally robust optimization (MDRO), we propose a data-driven Bayesian-type approach that constructs the ambiguity set of possible distributions from a Bayesian perspective. We demonstrate that our Bayesian-type MDRO problem can be reformulated as a risk-averse multi-stage stochastic programming problem and subsequently investigate its theoretical properties such as consistency, finite sample guarantee, and statistical robustness. Moreover, the reformulation enables us to employ cutting-plane algorithms in dynamic settings to solve the Bayesian-type MDRO problem. 
To illustrate the practicality and advantages of the proposed model and algorithm, we apply it to a distributionally robust inventory control problem and a distributionally robust hydrothermal scheduling problem, and compare it with usual formulations and solution methods to highlight the superior performance of our approach.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142201492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new value for communication situations","authors":"Daniel Li Li, Erfang Shan","doi":"10.1007/s00186-024-00873-0","DOIUrl":"https://doi.org/10.1007/s00186-024-00873-0","url":null,"abstract":"<p>A communication situation (<i>N</i>, <i>v</i>, <i>H</i>) consists of a cooperative game (<i>N</i>, <i>v</i>) and a communication hypergraph (<i>N</i>, <i>H</i>), for which the Myerson value and the position value are well-known allocation rules. The value defined in this paper treats links in <i>H</i> as imaginary players: we define a bipartite graph between <i>N</i> and <i>H</i> according to the structure given by <i>H</i>, and propose an allocation rule called the bipartite value. This value assigns each player a payoff in two parts: as a player and as a member of links. A characterization of the bipartite value is given.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141945415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the relationship between the value function and the efficient frontier of a mixed integer linear optimization problem","authors":"Samira Fallah, Ted K. Ralphs, Natashia L. Boland","doi":"10.1007/s00186-024-00871-2","DOIUrl":"https://doi.org/10.1007/s00186-024-00871-2","url":null,"abstract":"<p>In this study, we investigate the connection between the efficient frontier (EF) of a general multiobjective mixed integer linear optimization problem (MILP) and the so-called <i>restricted value function</i> (RVF) of a closely related single-objective MILP. In the first part of the paper, we detail the mathematical structure of the RVF, including characterizing the set of points at which it is differentiable, the gradients at such points, and the subdifferential at all nondifferentiable points. We then show that the EF of the multiobjective MILP is comprised of points on the boundary of the epigraph of the RVF and that any description of the EF suffices to describe the RVF and vice versa. Because of the close relationship of the RVF to the EF, we observe that methods for constructing the so-called value function (VF) of an MILP and methods for constructing the EF of a multiobjective optimization problem are effectively interchangeable. Exploiting this observation, we propose a generalized cutting-plane algorithm for constructing the EF of a multiobjective MILP that arises from an existing algorithm for constructing the classical MILP VF. The algorithm identifies the set of all integer parts of solutions on the EF. 
We prove that the algorithm converges finitely under a standard boundedness assumption and comes with a performance guarantee if terminated early.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141880569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An approximation algorithm for multiobjective mixed-integer convex optimization","authors":"Ina Lammel, Karl-Heinz Küfer, Philipp Süss","doi":"10.1007/s00186-024-00870-3","DOIUrl":"https://doi.org/10.1007/s00186-024-00870-3","url":null,"abstract":"<p>In this article we introduce an algorithm that approximates the nondominated sets of multiobjective mixed-integer convex optimization problems. The algorithm constructs an inner and outer approximation of the front exploiting the convexity of the patches for problems with an arbitrary number of criteria. In the algorithm, the problem is decomposed into patches, which are multiobjective convex problems, by fixing the integer assignments. The patch problems are solved using (simplicial) Sandwiching. We identify parts of patches that are dominated by other patches and ensure that these patch parts are not refined further. We prove that the algorithm converges and show a bound on the reduction of the approximation error in the course of the algorithm. We illustrate the behaviour of our algorithm using some numerical examples and compare its performance to an algorithm from literature.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141864727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tropical convexity in location problems","authors":"Andrei Comăneci","doi":"10.1007/s00186-024-00869-w","DOIUrl":"https://doi.org/10.1007/s00186-024-00869-w","url":null,"abstract":"<p>We investigate location problems where the optimal solution is found within the tropical convex hull of the given input points. Our initial focus is on geodesically star-convex sets, using the asymmetric tropical distance. We introduce the concept of tropically quasiconvex functions, which have sub-level sets with this shape, and are closely related to monotonic functions. Our findings demonstrate that location problems using tropically quasiconvex functions as distance measures will result in an optimal solution within the tropical convex hull of the input points. We also extend this result to cases where the input points are replaced with tropically convex sets. Finally, we explore the applications of our research in phylogenetics, highlighting the properties of consensus methods that arise from our class of location problems.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141513094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discrete-time stopping games with risk-sensitive discounted cost criterion","authors":"Wenzhao Zhang, Congying Liu","doi":"10.1007/s00186-024-00864-1","DOIUrl":"https://doi.org/10.1007/s00186-024-00864-1","url":null,"abstract":"<p>In this paper, we focus on discrete-time stopping games under the risk-sensitive discounted cost criterion. The state space and the action spaces of all the players are assumed to be Borel spaces. The cost functions are allowed to be unbounded from above and from below. At each decision epoch, each player chooses an action to influence the transition laws, and player 1 incurs a running cost. If player 1 or player 2 decides to stop the game, player 1 incurs a corresponding terminal cost. Under suitable hypotheses, we show by an approximation technique that the game model has a value, which is the unique solution of the risk-sensitive stopping optimality equation. Furthermore, we derive the existence of equilibria.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141513129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Convex optimization via inertial algorithms with vanishing Tikhonov regularization: fast convergence to the minimum norm solution","authors":"Hedy Attouch, Szilárd Csaba László","doi":"10.1007/s00186-024-00867-y","DOIUrl":"https://doi.org/10.1007/s00186-024-00867-y","url":null,"abstract":"<p>In a Hilbertian framework, for the minimization of a general convex differentiable function <i>f</i>, we introduce new inertial dynamics and algorithms that generate trajectories and iterates that converge rapidly towards the minimizer of <i>f</i> with minimum norm. Our study is based on the non-autonomous version of the Polyak heavy ball method, which, at time <i>t</i>, is associated with the strongly convex function obtained by adding to <i>f</i> a Tikhonov regularization term with vanishing coefficient <span>\\(\\varepsilon(t)\\)</span>. In this dynamic, the damping coefficient is proportional to the square root of the Tikhonov regularization parameter <span>\\(\\varepsilon(t)\\)</span>. By adjusting the speed of convergence of <span>\\(\\varepsilon(t)\\)</span> towards zero, we obtain both rapid convergence towards the infimal value of <i>f</i> and strong convergence of the trajectories towards the element of minimum norm of the set of minimizers of <i>f</i>. In particular, we obtain an improved version of the dynamic of Su-Boyd-Candès for the accelerated gradient method of Nesterov. This study naturally leads to corresponding first-order algorithms obtained by temporal discretization. 
In the case of a proper lower semicontinuous and convex function <i>f</i>, we study the proximal algorithms in detail, and show that they benefit from similar properties.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Asymptotic upper bounds for an M/M/C/K retrial queue with a guard channel and guard buffer","authors":"Nesrine Zidani, Natalia Djellab","doi":"10.1007/s00186-024-00865-0","DOIUrl":"https://doi.org/10.1007/s00186-024-00865-0","url":null,"abstract":"<p>The paper deals with a Markovian multiserver retrial queueing system with exponential abandonments, two types of arrivals (fresh calls and handover calls), and waiting places in the service area. This model can be used for analysing a cellular mobile network, where the service area is divided into cells. In this paper, the number of customers in the system and in the orbit form a level-dependent quasi-birth-and-death process, whose stationary distribution is expressed in terms of a sequence of rate matrices. First, we derive the Taylor series expansion for nonzero elements of the rate matrices. Then, using the expansion results, we obtain an asymptotic upper bound for the stationary distribution of both the number of busy channels and the number of customers in the orbit. Furthermore, we present some numerical results to examine the performance of the system.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Convergence rate of LQG mean field games with common noise","authors":"Jiamin Jian, Qingshuo Song, Jiaxuan Ye","doi":"10.1007/s00186-024-00863-2","DOIUrl":"https://doi.org/10.1007/s00186-024-00863-2","url":null,"abstract":"<p>This paper focuses on exploring the convergence properties of a generic player’s trajectory and empirical measures in an <i>N</i>-player Linear-Quadratic-Gaussian Nash game, where Brownian motion serves as the common noise. The study establishes three distinct convergence rates concerning the representative player and empirical measure. To investigate the convergence, the methodology relies on a specific decomposition of the equilibrium path in the <i>N</i>-player game and utilizes the associated mean field games framework.</p>","PeriodicalId":49862,"journal":{"name":"Mathematical Methods of Operations Research","volume":null,"pages":null},"PeriodicalIF":1.2,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}