{"title":"On Computable Online Learning","authors":"Niki Hasrati, S. Ben-David","doi":"10.48550/arXiv.2302.04357","DOIUrl":"https://doi.org/10.48550/arXiv.2302.04357","url":null,"abstract":"We initiate a study of computable online (c-online) learning, which we analyze under varying requirements for\"optimality\"in terms of the mistake bound. Our main contribution is to give a necessary and sufficient condition for optimal c-online learning and show that the Littlestone dimension no longer characterizes the optimal mistake bound of c-online learning. Furthermore, we introduce anytime optimal (a-optimal) online learning, a more natural conceptualization of\"optimality\"and a generalization of Littlestone's Standard Optimal Algorithm. We show the existence of a computational separation between a-optimal and optimal online learning, proving that a-optimal online learning is computationally more difficult. Finally, we consider online learning with no requirements for optimality, and show, under a weaker notion of computability, that the finiteness of the Littlestone dimension no longer characterizes whether a class is c-online learnable with finite mistake bound. A potential avenue for strengthening this result is suggested by exploring the relationship between c-online and CPAC learning, where we show that c-online learning is as difficult as improper CPAC learning.","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125877141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SQ Lower Bounds for Random Sparse Planted Vector Problem","authors":"Jingqiu Ding, Yiding Hua","doi":"10.48550/arXiv.2301.11124","DOIUrl":"https://doi.org/10.48550/arXiv.2301.11124","url":null,"abstract":"Consider the setting where a $rho$-sparse Rademacher vector is planted in a random $d$-dimensional subspace of $R^n$. A classical question is how to recover this planted vector given a random basis in this subspace. A recent result by [ZSWB21] showed that the Lattice basis reduction algorithm can recover the planted vector when $ngeq d+1$. Although the algorithm is not expected to tolerate inverse polynomial amount of noise, it is surprising because it was previously shown that recovery cannot be achieved by low degree polynomials when $nll rho^2 d^{2}$ [MW21]. A natural question is whether we can derive an Statistical Query (SQ) lower bound matching the previous low degree lower bound in [MW21]. This will - imply that the SQ lower bound can be surpassed by lattice based algorithms; - predict the computational hardness when the planted vector is perturbed by inverse polynomial amount of noise. In this paper, we prove such an SQ lower bound. In particular, we show that super-polynomial number of VSTAT queries is needed to solve the easier statistical testing problem when $nll rho^2 d^{2}$ and $rhogg frac{1}{sqrt{d}}$. The most notable technique we used to derive the SQ lower bound is the almost equivalence relationship between SQ lower bound and low degree lower bound [BBH+20, MW21].","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116735093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Complexity Analysis of a Countable-armed Bandit Problem","authors":"Anand Kalvit, A. Zeevi","doi":"10.48550/arXiv.2301.07243","DOIUrl":"https://doi.org/10.48550/arXiv.2301.07243","url":null,"abstract":"We consider a stochastic multi-armed bandit (MAB) problem motivated by ``large'' action spaces, and endowed with a population of arms containing exactly $K$ arm-types, each characterized by a distinct mean reward. The decision maker is oblivious to the statistical properties of reward distributions as well as the population-level distribution of different arm-types, and is precluded also from observing the type of an arm after play. We study the classical problem of minimizing the expected cumulative regret over a horizon of play $n$, and propose algorithms that achieve a rate-optimal finite-time instance-dependent regret of $mathcal{O}left( log n right)$. We also show that the instance-independent (minimax) regret is $tilde{mathcal{O}}left( sqrt{n} right)$ when $K=2$. While the order of regret and complexity of the problem suggests a great degree of similarity to the classical MAB problem, properties of the performance bounds and salient aspects of algorithm design are quite distinct from the latter, as are the key primitives that determine complexity along with the analysis tools needed to study them.","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128870221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adversarial Online Multi-Task Reinforcement Learning","authors":"Quan Nguyen, Nishant A. Mehta","doi":"10.48550/arXiv.2301.04268","DOIUrl":"https://doi.org/10.48550/arXiv.2301.04268","url":null,"abstract":"We consider the adversarial online multi-task reinforcement learning setting, where in each of $K$ episodes the learner is given an unknown task taken from a finite set of $M$ unknown finite-horizon MDP models. The learner's objective is to minimize its regret with respect to the optimal policy for each task. We assume the MDPs in $mathcal{M}$ are well-separated under a notion of $lambda$-separability, and show that this notion generalizes many task-separability notions from previous works. We prove a minimax lower bound of $Omega(Ksqrt{DSAH})$ on the regret of any learning algorithm and an instance-specific lower bound of $Omega(frac{K}{lambda^2})$ in sample complexity for a class of uniformly-good cluster-then-learn algorithms. We use a novel construction called 2-JAO MDP for proving the instance-specific lower bound. The lower bounds are complemented with a polynomial time algorithm that obtains $tilde{O}(frac{K}{lambda^2})$ sample complexity guarantee for the clustering phase and $tilde{O}(sqrt{MK})$ regret guarantee for the learning phase, indicating that the dependency on $K$ and $frac{1}{lambda^2}$ is tight.","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116291437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization","authors":"Mahdi Haghifam, Borja Rodr'iguez-G'alvez, R. Thobaben, M. Skoglund, Daniel M. Roy, G. Dziugaite","doi":"10.48550/arXiv.2212.13556","DOIUrl":"https://doi.org/10.48550/arXiv.2212.13556","url":null,"abstract":"To date, no\"information-theoretic\"frameworks for reasoning about generalization error have been shown to establish minimax rates for gradient descent in the setting of stochastic convex optimization. In this work, we consider the prospect of establishing such rates via several existing information-theoretic frameworks: input-output mutual information bounds, conditional mutual information bounds and variants, PAC-Bayes bounds, and recent conditional variants thereof. We prove that none of these bounds are able to establish minimax rates. We then consider a common tactic employed in studying gradient methods, whereby the final iterate is corrupted by Gaussian noise, producing a noisy\"surrogate\"algorithm. We prove that minimax rates cannot be established via the analysis of such surrogates. Our results suggest that new ideas are required to analyze gradient descent using information-theoretic techniques.","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133903912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variance-Reduced Conservative Policy Iteration","authors":"Naman Agarwal, Brian Bullins, Karan Singh","doi":"10.48550/arXiv.2212.06283","DOIUrl":"https://doi.org/10.48550/arXiv.2212.06283","url":null,"abstract":"We study the sample complexity of reducing reinforcement learning to a sequence of empirical risk minimization problems over the policy space. Such reductions-based algorithms exhibit local convergence in the function space, as opposed to the parameter space for policy gradient algorithms, and thus are unaffected by the possibly non-linear or discontinuous parameterization of the policy class. We propose a variance-reduced variant of Conservative Policy Iteration that improves the sample complexity of producing a $varepsilon$-functional local optimum from $O(varepsilon^{-4})$ to $O(varepsilon^{-3})$. Under state-coverage and policy-completeness assumptions, the algorithm enjoys $varepsilon$-global optimality after sampling $O(varepsilon^{-2})$ times, improving upon the previously established $O(varepsilon^{-3})$ sample requirement.","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"354 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122798224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear Reinforcement Learning with Ball Structure Action Space","authors":"Zeyu Jia, Randy Jia, Dhruv Madeka, Dean Phillips Foster","doi":"10.48550/arXiv.2211.07419","DOIUrl":"https://doi.org/10.48550/arXiv.2211.07419","url":null,"abstract":"We study the problem of Reinforcement Learning (RL) with linear function approximation, i.e. assuming the optimal action-value function is linear in a known $d$-dimensional feature mapping. Unfortunately, however, based on only this assumption, the worst case sample complexity has been shown to be exponential, even under a generative model. Instead of making further assumptions on the MDP or value functions, we assume that our action space is such that there always exist playable actions to explore any direction of the feature space. We formalize this assumption as a ``ball structure'' action space, and show that being able to freely explore the feature space allows for efficient RL. In particular, we propose a sample-efficient RL algorithm (BallRL) that learns an $epsilon$-optimal policy using only $tilde{O}left(frac{H^5d^3}{epsilon^3}right)$ number of trajectories.","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125941211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Global Planning in Large MDPs via Stochastic Primal-Dual Optimization","authors":"Gergely Neu, Nneka Okolo","doi":"10.48550/arXiv.2210.12057","DOIUrl":"https://doi.org/10.48550/arXiv.2210.12057","url":null,"abstract":"We propose a new stochastic primal-dual optimization algorithm for planning in a large discounted Markov decision process with a generative model and linear function approximation. Assuming that the feature map approximately satisfies standard realizability and Bellman-closedness conditions and also that the feature vectors of all state-action pairs are representable as convex combinations of a small core set of state-action pairs, we show that our method outputs a near-optimal policy after a polynomial number of queries to the generative model. Our method is computationally efficient and comes with the major advantage that it outputs a single softmax policy that is compactly represented by a low-dimensional parameter vector, and does not need to execute computationally expensive local planning subroutines in runtime.","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116721604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path","authors":"Liyu Chen, Andrea Tirinzoni, Matteo Pirotta, A. Lazaric","doi":"10.48550/arXiv.2210.04946","DOIUrl":"https://doi.org/10.48550/arXiv.2210.04946","url":null,"abstract":"We study the sample complexity of learning an $epsilon$-optimal policy in the Stochastic Shortest Path (SSP) problem. We first derive sample complexity bounds when the learner has access to a generative model. We show that there exists a worst-case SSP instance with $S$ states, $A$ actions, minimum cost $c_{min}$, and maximum expected cost of the optimal policy over all states $B_{star}$, where any algorithm requires at least $Omega(SAB_{star}^3/(c_{min}epsilon^2))$ samples to return an $epsilon$-optimal policy with high probability. Surprisingly, this implies that whenever $c_{min}=0$ an SSP problem may not be learnable, thus revealing that learning in SSPs is strictly harder than in the finite-horizon and discounted settings. We complement this result with lower bounds when prior knowledge of the hitting time of the optimal policy is available and when we restrict optimality by competing against policies with bounded hitting time. Finally, we design an algorithm with matching upper bounds in these cases. This settles the sample complexity of learning $epsilon$-optimal polices in SSP with generative models. We also initiate the study of learning $epsilon$-optimal policies without access to a generative model (i.e., the so-called best-policy identification problem), and show that sample-efficient learning is impossible in general. On the other hand, efficient learning can be made possible if we assume the agent can directly reach the goal state from any state by paying a fixed cost. We then establish the first upper and lower bounds under this assumption. Finally, using similar analytic tools, we prove that horizon-free regret is impossible in SSPs under general costs, resolving an open problem in (Tarbouriech et al., 2021c).","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130666906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fisher information lower bounds for sampling","authors":"Sinho Chewi, P. Gerber, Holden Lee, Chen Lu","doi":"10.48550/arXiv.2210.02482","DOIUrl":"https://doi.org/10.48550/arXiv.2210.02482","url":null,"abstract":"We prove two lower bounds for the complexity of non-log-concave sampling within the framework of Balasubramanian et al. (2022), who introduced the use of Fisher information (FI) bounds as a notion of approximate first-order stationarity in sampling. Our first lower bound shows that averaged LMC is optimal for the regime of large FI by reducing the problem of finding stationary points in non-convex optimization to sampling. Our second lower bound shows that in the regime of small FI, obtaining a FI of at most $varepsilon^2$ from the target distribution requires $text{poly}(1/varepsilon)$ queries, which is surprising as it rules out the existence of high-accuracy algorithms (e.g., algorithms using Metropolis-Hastings filters) in this context.","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133028227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}