{"title":"在线选择随时间变化的最佳保证","authors":"Sebastian Perez-Salazar, Victor Verdugo","doi":"arxiv-2408.11224","DOIUrl":null,"url":null,"abstract":"Prophet inequalities are a cornerstone in optimal stopping and online\ndecision-making. Traditionally, they involve the sequential observation of $n$\nnon-negative independent random variables and face irrevocable accept-or-reject\nchoices. The goal is to provide policies that provide a good approximation\nratio against the optimal offline solution that can access all the values\nupfront -- the so-called prophet value. In the prophet inequality over time\nproblem (POT), the decision-maker can commit to an accepted value for $\\tau$\nunits of time, during which no new values can be accepted. This creates a\ntrade-off between the duration of commitment and the opportunity to capture\npotentially higher future values. In this work, we provide best possible worst-case approximation ratios in the\nIID setting of POT for single-threshold algorithms and the optimal dynamic\nprogramming policy. We show a single-threshold algorithm that achieves an\napproximation ratio of $(1+e^{-2})/2\\approx 0.567$, and we prove that no\nsingle-threshold algorithm can surpass this guarantee. With our techniques, we\ncan analyze simple algorithms using $k$ thresholds and show that with $k=3$ it\nis possible to get an approximation ratio larger than $\\approx 0.602$. Then,\nfor each $n$, we prove it is possible to compute the tight worst-case\napproximation ratio of the optimal dynamic programming policy for instances\nwith $n$ values by solving a convex optimization program. A limit analysis of\nthe first-order optimality conditions yields a nonlinear differential equation\nshowing that the optimal dynamic programming policy's asymptotic worst-case\napproximation ratio is $\\approx 0.618$. Finally, we extend the discussion to\nadversarial settings and show an optimal worst-case approximation ratio of\n$\\approx 0.162$ when the values are streamed in random order.","PeriodicalId":501188,"journal":{"name":"arXiv - ECON - Theoretical Economics","volume":"26 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Optimal Guarantees for Online Selection Over Time\",\"authors\":\"Sebastian Perez-Salazar, Victor Verdugo\",\"doi\":\"arxiv-2408.11224\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Prophet inequalities are a cornerstone in optimal stopping and online\\ndecision-making. Traditionally, they involve the sequential observation of $n$\\nnon-negative independent random variables and face irrevocable accept-or-reject\\nchoices. The goal is to provide policies that provide a good approximation\\nratio against the optimal offline solution that can access all the values\\nupfront -- the so-called prophet value. In the prophet inequality over time\\nproblem (POT), the decision-maker can commit to an accepted value for $\\\\tau$\\nunits of time, during which no new values can be accepted. This creates a\\ntrade-off between the duration of commitment and the opportunity to capture\\npotentially higher future values. In this work, we provide best possible worst-case approximation ratios in the\\nIID setting of POT for single-threshold algorithms and the optimal dynamic\\nprogramming policy. 
We show a single-threshold algorithm that achieves an\\napproximation ratio of $(1+e^{-2})/2\\\\approx 0.567$, and we prove that no\\nsingle-threshold algorithm can surpass this guarantee. With our techniques, we\\ncan analyze simple algorithms using $k$ thresholds and show that with $k=3$ it\\nis possible to get an approximation ratio larger than $\\\\approx 0.602$. Then,\\nfor each $n$, we prove it is possible to compute the tight worst-case\\napproximation ratio of the optimal dynamic programming policy for instances\\nwith $n$ values by solving a convex optimization program. A limit analysis of\\nthe first-order optimality conditions yields a nonlinear differential equation\\nshowing that the optimal dynamic programming policy's asymptotic worst-case\\napproximation ratio is $\\\\approx 0.618$. Finally, we extend the discussion to\\nadversarial settings and show an optimal worst-case approximation ratio of\\n$\\\\approx 0.162$ when the values are streamed in random order.\",\"PeriodicalId\":501188,\"journal\":{\"name\":\"arXiv - ECON - Theoretical Economics\",\"volume\":\"26 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - ECON - Theoretical Economics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.11224\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - ECON - Theoretical Economics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.11224","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Prophet inequalities are a cornerstone of optimal stopping and online decision-making. Traditionally, they involve the sequential observation of $n$ non-negative independent random variables under irrevocable accept-or-reject choices. The goal is to design policies with a good approximation ratio against the optimal offline solution that can access all the values upfront -- the so-called prophet value. In the prophet inequality over time problem (POT), the decision-maker commits to an accepted value for $\tau$ units of time, during which no new values can be accepted. This creates a trade-off between the duration of the commitment and the opportunity to capture potentially higher future values.
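To make the commitment mechanic concrete, here is a minimal simulation sketch. The accrual model is an assumption on our part: while committed, the policy collects the accepted value once per time step, and the prophet benchmark holds the maximum value throughout the horizon. The abstract does not pin down the exact objective, so treat this as illustrative only.

```python
import random

def run_policy(values, accept_rule, tau):
    """Simulate one POT instance. Assumed accrual model: while committed
    to a value v, the policy earns v at each time step."""
    reward, held, remaining = 0.0, 0.0, 0
    for t, v in enumerate(values):
        if remaining == 0 and accept_rule(t, v):
            held, remaining = v, tau  # commit to v for tau time steps
        if remaining > 0:
            reward += held
            remaining -= 1
    return reward

n, tau = 50, 5
values = [random.expovariate(1.0) for _ in range(n)]  # IID sample
prophet = n * max(values)  # assumed benchmark: hold the max throughout
greedy = run_policy(values, lambda t, v: True, tau)  # commit to anything
print(f"prophet={prophet:.2f}  greedy={greedy:.2f}")
```

A small $\tau$ lets the greedy policy recover quickly from a bad commitment, while a large $\tau$ locks in early, typically mediocre, values; this is exactly the trade-off described above.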
In this work, we establish the best possible worst-case approximation ratios in the IID setting of POT, both for single-threshold algorithms and for the optimal dynamic programming policy. We give a single-threshold algorithm that achieves an approximation ratio of $(1+e^{-2})/2 \approx 0.567$, and we prove that no single-threshold algorithm can surpass this guarantee.
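A single-threshold policy accepts, whenever it is free, the first value exceeding a fixed threshold. The sketch below estimates its performance under the same assumed accrual model; the threshold and the exponential distribution are illustrative choices, not the paper's optimal construction, so the empirical ratio need not match the worst-case constant.

```python
import math
import random

def single_threshold_reward(values, theta, tau):
    """Assumed accrual model: earn the committed value each step while held."""
    reward, held, remaining = 0.0, 0.0, 0
    for v in values:
        if remaining == 0 and v >= theta:
            held, remaining = v, tau
        if remaining > 0:
            reward += held
            remaining -= 1
    return reward

n, tau, trials = 100, 10, 2000
theta = 1.0  # illustrative threshold; the paper derives the optimal choice
ratio_sum = 0.0
for _ in range(trials):
    vals = [random.expovariate(1.0) for _ in range(n)]
    ratio_sum += single_threshold_reward(vals, theta, tau) / (n * max(vals))
print(f"empirical ratio on this toy model ~ {ratio_sum / trials:.3f}")
print(f"worst-case guarantee (1+e^-2)/2 = {(1 + math.exp(-2)) / 2:.3f}")
```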
With our techniques, we can also analyze simple algorithms that use $k$ thresholds, and we show that $k=3$ thresholds already suffice for an approximation ratio larger than $0.602$.
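One natural way to use $k$ thresholds is to split the horizon into $k$ phases and apply one (typically decreasing) threshold per phase; whether this matches the paper's construction is an assumption, and the values below are placeholders.

```python
def k_threshold_rule(thresholds, n):
    """Split a horizon of n steps into len(thresholds) equal phases and
    use one threshold per phase (an assumed, illustrative construction)."""
    k = len(thresholds)
    def rule(t, v):
        phase = min(k - 1, t * k // n)
        return v >= thresholds[phase]
    return rule

# Example: three decreasing thresholds over a horizon of 100 steps.
rule = k_threshold_rule([2.0, 1.0, 0.5], n=100)
print(rule(10, 1.5), rule(90, 0.7))  # strict early on, permissive late
```

Lowering the threshold over time reflects the intuition that, near the end of the horizon, committing to a mediocre value beats finishing uncommitted.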
Then, for each $n$, we show how to compute the tight worst-case approximation ratio of the optimal dynamic programming policy on instances with $n$ values by solving a convex optimization program.
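For a fixed finite-support distribution, the optimal dynamic programming policy itself is straightforward to compute by backward induction, as in the hedged sketch below (same assumed accrual model; the distribution is illustrative). The hard part, which the paper's convex program addresses, is characterizing this policy's worst-case ratio over all distributions.

```python
from functools import lru_cache

# Assumed model: accepting value v when free at step t earns v per step
# for min(tau, n - t) steps and blocks new acceptances during that time.
support = [(1.0, 0.5), (3.0, 0.5)]  # (value, probability), illustrative
n, tau = 20, 4

@lru_cache(maxsize=None)
def V(t):
    """Expected optimal reward-to-go when free at step t (0-indexed)."""
    if t >= n:
        return 0.0
    total = 0.0
    for v, p in support:
        steps = min(tau, n - t)
        accept = v * steps + V(t + steps)  # commit now, free again later
        reject = V(t + 1)                  # pass and see the next value
        total += p * max(accept, reject)
    return total

print(f"optimal DP value = {V(0):.3f}")
```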
A limit analysis of the first-order optimality conditions yields a nonlinear differential equation showing that the asymptotic worst-case approximation ratio of the optimal dynamic programming policy is $\approx 0.618$. Finally, we extend the discussion to adversarial settings and show an optimal worst-case approximation ratio of $\approx 0.162$ when the values are streamed in random order.