{"title":"在线选择随时间变化的最佳保证","authors":"Sebastian Perez-Salazar, Victor Verdugo","doi":"arxiv-2408.11224","DOIUrl":null,"url":null,"abstract":"Prophet inequalities are a cornerstone in optimal stopping and online\ndecision-making. Traditionally, they involve the sequential observation of $n$\nnon-negative independent random variables and face irrevocable accept-or-reject\nchoices. The goal is to provide policies that provide a good approximation\nratio against the optimal offline solution that can access all the values\nupfront -- the so-called prophet value. In the prophet inequality over time\nproblem (POT), the decision-maker can commit to an accepted value for $\\tau$\nunits of time, during which no new values can be accepted. This creates a\ntrade-off between the duration of commitment and the opportunity to capture\npotentially higher future values. In this work, we provide best possible worst-case approximation ratios in the\nIID setting of POT for single-threshold algorithms and the optimal dynamic\nprogramming policy. We show a single-threshold algorithm that achieves an\napproximation ratio of $(1+e^{-2})/2\\approx 0.567$, and we prove that no\nsingle-threshold algorithm can surpass this guarantee. With our techniques, we\ncan analyze simple algorithms using $k$ thresholds and show that with $k=3$ it\nis possible to get an approximation ratio larger than $\\approx 0.602$. Then,\nfor each $n$, we prove it is possible to compute the tight worst-case\napproximation ratio of the optimal dynamic programming policy for instances\nwith $n$ values by solving a convex optimization program. A limit analysis of\nthe first-order optimality conditions yields a nonlinear differential equation\nshowing that the optimal dynamic programming policy's asymptotic worst-case\napproximation ratio is $\\approx 0.618$. Finally, we extend the discussion to\nadversarial settings and show an optimal worst-case approximation ratio of\n$\\approx 0.162$ when the values are streamed in random order.","PeriodicalId":501188,"journal":{"name":"arXiv - ECON - Theoretical Economics","volume":"26 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimal Guarantees for Online Selection Over Time\",\"authors\":\"Sebastian Perez-Salazar, Victor Verdugo\",\"doi\":\"arxiv-2408.11224\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Prophet inequalities are a cornerstone in optimal stopping and online\\ndecision-making. Traditionally, they involve the sequential observation of $n$\\nnon-negative independent random variables and face irrevocable accept-or-reject\\nchoices. The goal is to provide policies that provide a good approximation\\nratio against the optimal offline solution that can access all the values\\nupfront -- the so-called prophet value. In the prophet inequality over time\\nproblem (POT), the decision-maker can commit to an accepted value for $\\\\tau$\\nunits of time, during which no new values can be accepted. This creates a\\ntrade-off between the duration of commitment and the opportunity to capture\\npotentially higher future values. In this work, we provide best possible worst-case approximation ratios in the\\nIID setting of POT for single-threshold algorithms and the optimal dynamic\\nprogramming policy. 
We show a single-threshold algorithm that achieves an\\napproximation ratio of $(1+e^{-2})/2\\\\approx 0.567$, and we prove that no\\nsingle-threshold algorithm can surpass this guarantee. With our techniques, we\\ncan analyze simple algorithms using $k$ thresholds and show that with $k=3$ it\\nis possible to get an approximation ratio larger than $\\\\approx 0.602$. Then,\\nfor each $n$, we prove it is possible to compute the tight worst-case\\napproximation ratio of the optimal dynamic programming policy for instances\\nwith $n$ values by solving a convex optimization program. A limit analysis of\\nthe first-order optimality conditions yields a nonlinear differential equation\\nshowing that the optimal dynamic programming policy's asymptotic worst-case\\napproximation ratio is $\\\\approx 0.618$. Finally, we extend the discussion to\\nadversarial settings and show an optimal worst-case approximation ratio of\\n$\\\\approx 0.162$ when the values are streamed in random order.\",\"PeriodicalId\":501188,\"journal\":{\"name\":\"arXiv - ECON - Theoretical Economics\",\"volume\":\"26 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - ECON - Theoretical Economics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.11224\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - ECON - Theoretical Economics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.11224","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Prophet inequalities are a cornerstone of optimal stopping and online decision-making. Traditionally, they involve the sequential observation of $n$ non-negative independent random variables under irrevocable accept-or-reject choices. The goal is to design policies with a good approximation ratio against the optimal offline solution that can access all the values upfront -- the so-called prophet value. In the prophet inequality over time problem (POT), the decision-maker commits to an accepted value for $\tau$ units of time, during which no new values can be accepted. This creates a trade-off between the duration of the commitment and the opportunity to capture potentially higher future values.
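To make the commitment mechanic concrete, here is a minimal simulation sketch. The accrual model is an assumption on our part: while committed, the policy collects the accepted value once per time step, and the prophet benchmark holds the maximum value throughout the horizon. The abstract does not pin down the exact objective, so treat this as illustrative only.

```python
import random

def run_policy(values, accept_rule, tau):
    """Simulate one POT instance. Assumed accrual model: while committed
    to a value v, the policy earns v at each time step."""
    reward, held, remaining = 0.0, 0.0, 0
    for t, v in enumerate(values):
        if remaining == 0 and accept_rule(t, v):
            held, remaining = v, tau  # commit to v for tau time steps
        if remaining > 0:
            reward += held
            remaining -= 1
    return reward

n, tau = 50, 5
values = [random.expovariate(1.0) for _ in range(n)]  # IID sample
prophet = n * max(values)  # assumed benchmark: hold the max throughout
greedy = run_policy(values, lambda t, v: True, tau)  # commit to anything
print(f"prophet={prophet:.2f}  greedy={greedy:.2f}")
```

A small $\tau$ lets the greedy policy recover quickly from a bad commitment, while a large $\tau$ locks in early, typically mediocre, values; this is exactly the trade-off described above.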
In this work, we establish the best possible worst-case approximation ratios in the IID setting of POT, both for single-threshold algorithms and for the optimal dynamic programming policy. We give a single-threshold algorithm that achieves an approximation ratio of $(1+e^{-2})/2 \approx 0.567$, and we prove that no single-threshold algorithm can surpass this guarantee.
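A single-threshold policy accepts, whenever it is free, the first value exceeding a fixed threshold. The sketch below estimates its performance under the same assumed accrual model; the threshold and the exponential distribution are illustrative choices, not the paper's optimal construction, so the empirical ratio need not match the worst-case constant.

```python
import math
import random

def single_threshold_reward(values, theta, tau):
    """Assumed accrual model: earn the committed value each step while held."""
    reward, held, remaining = 0.0, 0.0, 0
    for v in values:
        if remaining == 0 and v >= theta:
            held, remaining = v, tau
        if remaining > 0:
            reward += held
            remaining -= 1
    return reward

n, tau, trials = 100, 10, 2000
theta = 1.0  # illustrative threshold; the paper derives the optimal choice
ratio_sum = 0.0
for _ in range(trials):
    vals = [random.expovariate(1.0) for _ in range(n)]
    ratio_sum += single_threshold_reward(vals, theta, tau) / (n * max(vals))
print(f"empirical ratio on this toy model ~ {ratio_sum / trials:.3f}")
print(f"worst-case guarantee (1+e^-2)/2 = {(1 + math.exp(-2)) / 2:.3f}")
```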
With our techniques, we can also analyze simple algorithms that use $k$ thresholds, and we show that $k=3$ thresholds already suffice for an approximation ratio larger than $0.602$.
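One natural way to use $k$ thresholds is to split the horizon into $k$ phases and apply one (typically decreasing) threshold per phase; whether this matches the paper's construction is an assumption, and the values below are placeholders.

```python
def k_threshold_rule(thresholds, n):
    """Split a horizon of n steps into len(thresholds) equal phases and
    use one threshold per phase (an assumed, illustrative construction)."""
    k = len(thresholds)
    def rule(t, v):
        phase = min(k - 1, t * k // n)
        return v >= thresholds[phase]
    return rule

# Example: three decreasing thresholds over a horizon of 100 steps.
rule = k_threshold_rule([2.0, 1.0, 0.5], n=100)
print(rule(10, 1.5), rule(90, 0.7))  # strict early on, permissive late
```

Lowering the threshold over time reflects the intuition that, near the end of the horizon, committing to a mediocre value beats finishing uncommitted.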
Then, for each $n$, we show how to compute the tight worst-case approximation ratio of the optimal dynamic programming policy on instances with $n$ values by solving a convex optimization program.
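For a fixed finite-support distribution, the optimal dynamic programming policy itself is straightforward to compute by backward induction, as in the hedged sketch below (same assumed accrual model; the distribution is illustrative). The hard part, which the paper's convex program addresses, is characterizing this policy's worst-case ratio over all distributions.

```python
from functools import lru_cache

# Assumed model: accepting value v when free at step t earns v per step
# for min(tau, n - t) steps and blocks new acceptances during that time.
support = [(1.0, 0.5), (3.0, 0.5)]  # (value, probability), illustrative
n, tau = 20, 4

@lru_cache(maxsize=None)
def V(t):
    """Expected optimal reward-to-go when free at step t (0-indexed)."""
    if t >= n:
        return 0.0
    total = 0.0
    for v, p in support:
        steps = min(tau, n - t)
        accept = v * steps + V(t + steps)  # commit now, free again later
        reject = V(t + 1)                  # pass and see the next value
        total += p * max(accept, reject)
    return total

print(f"optimal DP value = {V(0):.3f}")
```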
A limit analysis of the first-order optimality conditions yields a nonlinear differential equation showing that the asymptotic worst-case approximation ratio of the optimal dynamic programming policy is $\approx 0.618$. Finally, we extend the discussion to adversarial settings and show an optimal worst-case approximation ratio of $\approx 0.162$ when the values are streamed in random order.