Anna Winnicki, Joseph Lubars, Michael Livesay, R. Srikant
{"title":"The Role of Lookahead and Approximate Policy Evaluation in Reinforcement Learning with Linear Value Function Approximation","authors":"Anna Winnicki, Joseph Lubars, Michael Livesay, R. Srikant","doi":"10.1287/opre.2022.0357","DOIUrl":"https://doi.org/10.1287/opre.2022.0357","url":null,"abstract":"Operations Research, Ahead of Print.","PeriodicalId":54680,"journal":{"name":"Operations Research","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141192335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jerry Anunrojwong, Santiago R. Balseiro, Omar Besbes
{"title":"On the Robustness of Second-Price Auctions in Prior-Independent Mechanism Design","authors":"Jerry Anunrojwong, Santiago R. Balseiro, Omar Besbes","doi":"10.1287/opre.2022.0428","DOIUrl":"https://doi.org/10.1287/opre.2022.0428","url":null,"abstract":"<p>Classical Bayesian mechanism design relies on the common prior assumption, but the common prior is often not available in practice. We study the design of prior-independent mechanisms that relax this assumption: The seller is selling an indivisible item to <i>n</i> buyers such that the buyers’ valuations are drawn from a joint distribution that is unknown to both the buyers and the seller, buyers do not need to form beliefs about competitors, and the seller assumes the distribution is adversarially chosen from a specified class. We measure performance through the worst-case <i>regret</i>, or the difference between the expected revenue achievable with perfect knowledge of buyers’ valuations and the actual mechanism revenue. We study a broad set of classes of valuation distributions that capture a wide spectrum of possible dependencies: independent and identically distributed (i.i.d.) distributions, mixtures of i.i.d. distributions, affiliated and exchangeable distributions, exchangeable distributions, and all joint distributions. We derive in quasi closed form the minimax values and the associated optimal mechanism. In particular, we show that the first three classes admit the same minimax regret value, which is decreasing with the number of competitors, whereas the last two have the same minimax regret equal to that of the case <i>n</i> = 1. Furthermore, we show that the minimax optimal mechanisms have a simple form across all settings: a <i>second-price auction with random reserve prices</i>, which shows its robustness in prior-independent mechanism design. 
En route to our results, we also develop a principled methodology to determine the form of the optimal mechanism and worst-case distribution via first-order conditions that should be of independent interest in other minimax problems.</p><p><b>Supplemental Material:</b> The online appendices are available at https://doi.org/10.1287/opre.2022.0428.</p>","PeriodicalId":54680,"journal":{"name":"Operations Research","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141063531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Learning for Constrained Assortment Optimization Under Markov Chain Choice Model","authors":"Shukai Li, Qi Luo, Zhiyuan Huang, Cong Shi","doi":"10.1287/opre.2022.0693","DOIUrl":"https://doi.org/10.1287/opre.2022.0693","url":null,"abstract":"Assortment optimization finds many important applications in both brick-and-mortar and online retailing. Decision makers select a subset of products to offer to customers from a universe of substitutable products, based on the assumption that customers purchase according to a Markov chain choice model, which is a very general choice model encompassing many popular models. The existing literature predominantly assumes that the customer arrival process and the Markov chain choice model parameters are given as input to the stochastic optimization model. However, in practice, decision makers may not have this information and must learn it while maximizing the total expected revenue on the fly. In “Online Learning for Constrained Assortment Optimization under the Markov Chain Choice Model,” S. Li, Q. Luo, Z. Huang, and C. Shi develop a series of efficient online learning algorithms for Markov chain choice-based assortment optimization problems, with provable performance guarantees.","PeriodicalId":54680,"journal":{"name":"Operations Research","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140974007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Angelos Aveklouris, Levi DeValve, Maximiliano Stock, Amy Ward
{"title":"Matching Impatient and Heterogeneous Demand and Supply","authors":"Angelos Aveklouris, Levi DeValve, Maximiliano Stock, Amy Ward","doi":"10.1287/opre.2022.0005","DOIUrl":"https://doi.org/10.1287/opre.2022.0005","url":null,"abstract":"Balancing Speed and Value in On-Demand Matching Platforms. In “Matching Impatient and Heterogeneous Demand and Supply,” Aveklouris, DeValve, Stock, and Ward consider a fundamental trade-off faced by many platforms (e.g., Uber/Lyft) that match supply (e.g., drivers) and demand (e.g., riders) dynamically over time: making matches quickly capitalizes on the value of current supply and demand in the system, whereas waiting may enable better matches at the risk of losing impatient customers. They show that this trade-off can be balanced by waiting a short amount of time before making matches: long enough to gather enough agents to make valuable matches but not so long that impatient agents are likely to leave. Intuitively, this balance depends on how long agents are willing to wait, on average, but the authors show that it also depends on the full distribution of the willingness to wait (i.e., not only the mean but also the variance and higher moments play a role). Thus, approaches that only take into account the mean willingness to wait may perform quite poorly. 
Further, the authors develop an algorithm to rank matching priorities in order to achieve an optimized trade-off between speed and value of matches.","PeriodicalId":54680,"journal":{"name":"Operations Research","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140975091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Drone-Delivery Network for Opioid Overdose: Nonlinear Integer Queueing-Optimization Models and Methods","authors":"Miguel A. Lejeune, Wenbo Ma","doi":"10.1287/opre.2022.0489","DOIUrl":"https://doi.org/10.1287/opre.2022.0489","url":null,"abstract":"<p>We propose a new stochastic emergency network design model that uses a fleet of drones to quickly deliver naloxone in response to opioid overdoses. The network is represented as a collection of <i>M</i>/<i>G</i>/<i>K</i> queueing systems in which the capacity <i>K</i> of each system is a decision variable, and the service time is modeled as a decision-dependent random variable. The model is a queueing-based optimization problem that locates fixed (drone bases) and mobile (drones) servers, determines the drone dispatching decisions, and takes the form of a nonlinear integer problem that is intractable in its original form. We develop an efficient reformulation and algorithmic framework. Our approach reformulates the multiple nonlinearities (fractional, polynomial, exponential, factorial terms) to give a mixed-integer linear programming (MILP) formulation. We demonstrate its generalizability and show that the problem of minimizing the average response time of a collection of <i>M</i>/<i>G</i>/<i>K</i> queueing systems with unknown capacity <i>K</i> is always MILP-representable. We design an outer approximation branch-and-cut algorithmic framework that is computationally efficient and scales well. 
The analysis based on real-life data reveals that, in Virginia Beach, drones can (1) decrease the response time by 82%, (2) increase the survival chance by more than 273%, (3) save up to 33 additional lives per year, and (4) provide up to 279 additional quality-adjusted life years annually.</p><p><b>Funding:</b> M. A. Lejeune acknowledges the support of the National Science Foundation [Grant ECCS-2114100] and the Office of Naval Research [Grant N00014-22-1-2649].</p><p><b>Supplemental Material:</b> The online appendices are available at https://doi.org/10.1287/opre.2022.0489.</p>","PeriodicalId":54680,"journal":{"name":"Operations Research","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140884105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pricing and Positioning of Horizontally Differentiated Products with Incomplete Demand Information","authors":"Arnoud V. den Boer, Boxiao Chen, Yining Wang","doi":"10.1287/opre.2021.0093","DOIUrl":"https://doi.org/10.1287/opre.2021.0093","url":null,"abstract":"<p>We consider the problem of determining the optimal prices and product configurations of horizontally differentiated products when customers purchase according to a locational (Hotelling) choice model and where the problem parameters are initially unknown to the decision maker. For both the single-product and multiple-product settings, we propose a data-driven algorithm that learns the optimal prices and product configurations from accumulating sales data, and we show that the regret of these algorithms (the expected cumulative loss caused by not using optimal decisions) after <i>T</i> time periods is <i>O</i>(<i>T</i><sup>1/2+<i>o</i>(1)</sup>). We accompany this result by showing that, even in the single-product setting, the regret of any algorithm is bounded from below by a constant times <i>T</i><sup>1/2</sup>, implying that our algorithms are asymptotically near optimal. In an extension, we show how our algorithm can be adapted for the case of fixed locations. 
A numerical study that compares our algorithms with three benchmarks shows that our algorithm is also competitive on a finite time horizon.</p><p><b>Supplemental Material:</b> The online appendix is available at https://doi.org/10.1287/opre.2021.0093.</p>","PeriodicalId":54680,"journal":{"name":"Operations Research","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140833313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Market Entry and Competition Under Network Effects","authors":"Yinbo Feng, Ming Hu","doi":"10.1287/opre.2022.0275","DOIUrl":"https://doi.org/10.1287/opre.2022.0275","url":null,"abstract":"<p>We consider a three-stage game in which, first, a large number of potential firms make entry decisions, then those who choose to stay in the market decide on the investment (quality) level in each product, and last, customers with heterogeneous preferences arrive sequentially to make (random) purchase decisions based on product quality and historical sales under the network effect according to a discrete choice model. We characterize such a random purchase process and show that a growing network effect always contributes to more sales concentration ex post on a small number of products. Perhaps surprisingly, we further show several phase-changing phenomena regarding equilibrium outcomes with respect to the network effect’s strength. In particular, the equilibrium product variety (respectively, quality investment) first decreases (respectively, increases) and then increases (respectively, decreases) as the network effect grows. Specifically, when the strength of the network effect is below a threshold, an increasing network effect would shift more sales toward those products with higher quality, preventing more products from entering the market ex ante and inducing firms to adopt the high-budget equilibrium strategy by making a small number of high-quality products, which is consistent with the blockbuster phenomenon. 
When the strength of the network effect is above the threshold, the network effect would easily cause the market to be concentrated on a few products ex post; even some low-quality products may have a chance to become a “hit.” Interestingly, in this case, when the network effect is growing, the ex ante equilibrium product variety will be wider, and firms adopt the low-budget equilibrium strategy by making a (relatively) large number of low-quality products, a finding consistent with the long tail theory. We then establish the robustness of the previous main insights by accounting for endogenized pricing and multiproducts carried by each firm.</p><p><b>Funding:</b> Y. Feng was financially supported by the Major Program of National Natural Science Foundation of China [Grants 72192830 and 7219283X], Fundamental Research Funds for the Central Universities, and Program for Innovative Research of Shanghai University of Finance and Economics. M. Hu was supported by the Natural Sciences and Engineering Research Council of Canada [Grants RGPIN-2015-06757 and RGPIN-2021-04295].</p><p><b>Supplemental Material:</b> The online appendix is available at https://doi.org/10.1287/opre.2022.0275.</p>","PeriodicalId":54680,"journal":{"name":"Operations Research","volume":null,"pages":null},"PeriodicalIF":2.7,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140833161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}