Bandit problems with arbitrary side observations
Authors: Chih-Chun Wang, Sanjeev R. Kulkarni, H. Vincent Poor
Venue: 42nd IEEE International Conference on Decision and Control (IEEE Cat. No.03CH37475)
Publication date: 2003-12-09
DOI: 10.1109/CDC.2003.1273074 (https://doi.org/10.1109/CDC.2003.1273074)
Citation count: 4
Abstract
A bandit problem with side observations extends the traditional two-armed bandit problem: the decision maker has access to side information before deciding which arm to pull. This paper extracts and formulates the essential properties of the side observations that allow achievability results with respect to the minimal inferior sampling time. The sufficient conditions for good side information obtained here cover various kinds of random processes as special cases, including i.i.d. sequences, Markov chains, and periodic sequences. A necessary condition is also provided, giving more insight into the nature of bandit problems with side observations. A game-theoretic approach simplifies the analysis and justifies the viewpoint that the side observation serves as an index of different sub-bandit machines.
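The viewpoint that the side observation indexes separate sub-bandit machines can be illustrated with a small simulation. The sketch below is not the paper's scheme: it swaps in the standard UCB1 rule, run independently for each side-observation value, and counts how often the inferior arm is pulled. The Bernoulli rewards, the arm means, the periodic side-observation sequence, and the names `run_contextual_ucb`, `arm_means`, and `side_obs` are all assumptions made for illustration.

```python
import math
import random


def run_contextual_ucb(arm_means, side_obs, seed=0):
    """Run one UCB1 learner per side-observation value (illustrative only).

    arm_means: dict mapping a side-observation value to the pair
               (mean of arm 0, mean of arm 1) of Bernoulli rewards.
               This reward model is an assumption, not from the paper.
    side_obs:  sequence of side observations, one revealed before each pull.
    Returns the number of rounds in which the inferior arm was pulled.
    """
    rng = random.Random(seed)
    counts = {}  # counts[x][a]: pulls of arm a when the side observation was x
    sums = {}    # sums[x][a]: total reward of arm a when the side observation was x
    inferior_pulls = 0

    for x in side_obs:
        c = counts.setdefault(x, [0, 0])
        s = sums.setdefault(x, [0.0, 0.0])
        n = c[0] + c[1]  # rounds already played on sub-bandit x

        if c[0] == 0:
            arm = 0  # pull each arm once before using the index
        elif c[1] == 0:
            arm = 1
        else:
            # UCB1 index: empirical mean plus an exploration bonus
            ucb = [s[a] / c[a] + math.sqrt(2.0 * math.log(n) / c[a])
                   for a in (0, 1)]
            arm = 0 if ucb[0] >= ucb[1] else 1

        reward = 1.0 if rng.random() < arm_means[x][arm] else 0.0
        c[arm] += 1
        s[arm] += reward

        # The better arm depends on the side observation
        best = 0 if arm_means[x][0] >= arm_means[x][1] else 1
        if arm != best:
            inferior_pulls += 1

    return inferior_pulls


# Hypothetical example: the better arm flips with the side observation.
arm_means = {0: (0.8, 0.2), 1: (0.3, 0.7)}
side_obs = [t % 2 for t in range(2000)]  # a periodic side-observation sequence
inferior = run_contextual_ucb(arm_means, side_obs)
print(inferior, "inferior pulls out of", len(side_obs))
```

Because each side-observation value keeps its own counts and reward sums, the learner behaves like two independent two-armed bandits interleaved in time. Replacing `side_obs` with an i.i.d. or Markov sequence changes only how often each sub-bandit is visited.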