Title: On Learning and Co-learning Effective Strategies in Iterated Travelers' Dilemma
Author: Predrag T. Tosic
Venue: 2016 IEEE/WIC/ACM International Conference on Web Intelligence (WI), pp. 674-677
Published: 2016-10-01
DOI: https://doi.org/10.1109/WI.2016.0120
Citation count: 0
Abstract
In this short paper, we summarize our previous results and share new insights, ideas, and challenges concerning which types of adaptable strategies tend to be rewarded in the Iterated Travelers' Dilemma (ITD). Our primary motivation for studying ITD is that this strategic two-person game provides implicit incentives for cooperation, but only if both players cooperate. ITD is a non-zero-sum two-player game that generalizes the better-known Iterated Prisoner's Dilemma (IPD). Both IPD and ITD can be viewed as repeated exchanges of proposals or bids, where the payoff to each agent at the end of a round depends on (i) how close the two agents' bids were to each other and (ii) who bid lower in that round. Our broader goal is to understand how a resource-bounded rational agent can learn about the behavior of other self-interested agents in order to adjust his or her own bidding strategy in the manner most likely to be rewarding in the long run. In addition to exploring traditional reinforcement learning mechanisms in this setting, we also begin to study the potential promise of co-learning between pairs of adaptive, self-interested but non-malicious agents with bounded computational resources.
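To make the round structure concrete, the sketch below plays a few rounds of an iterated Travelers' Dilemma in Python. The abstract does not state the payoff parameters, so the code assumes the classic formulation of the one-shot game: integer bids between 2 and 100, both players receive the lower bid, and the lower bidder gains a bonus of 2 while the higher bidder loses 2. The function names (`td_payoffs`, `play_iterated`) and the two toy strategies (`always_high`, `undercut`) are hypothetical illustrations, not the strategies studied in the paper.

```python
# Illustrative sketch only: the parameters (bid range 2..100, bonus 2) are
# assumed from the classic Travelers' Dilemma, not taken from the abstract.

def td_payoffs(bid_a, bid_b, bonus=2, low=2, high=100):
    """Return (payoff_a, payoff_b) for one round."""
    if not (low <= bid_a <= high and low <= bid_b <= high):
        raise ValueError("bids must lie in [low, high]")
    base = min(bid_a, bid_b)          # both payoffs are anchored on the lower bid
    if bid_a == bid_b:
        return base, base             # equal bids: no bonus, no penalty
    if bid_a < bid_b:
        return base + bonus, base - bonus   # A bid lower and is rewarded
    return base - bonus, base + bonus       # B bid lower and is rewarded


def always_high(own_history, opp_history):
    """Hypothetical cooperative strategy: always bid the maximum."""
    return 100


def undercut(own_history, opp_history):
    """Hypothetical adaptive strategy: bid one below the opponent's last bid."""
    if not opp_history:
        return 100                    # nothing observed yet, so start high
    return max(2, opp_history[-1] - 1)


def play_iterated(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the cumulative payoffs (total_a, total_b)."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        bid_a = strategy_a(history_a, history_b)
        bid_b = strategy_b(history_b, history_a)
        pay_a, pay_b = td_payoffs(bid_a, bid_b)
        total_a += pay_a
        total_b += pay_b
        history_a.append(bid_a)
        history_b.append(bid_b)
    return total_a, total_b


if __name__ == "__main__":
    print(play_iterated(always_high, always_high, rounds=10))
    print(play_iterated(undercut, always_high, rounds=3))
```

Under these assumed parameters, two players who always bid high each earn the maximum per round, while a player who undercuts a fixed high bidder by one gains slightly at the other's expense. This is one way to see why the incentive to cooperate holds only if both players cooperate.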