Multi-agent learning via implicit opponent modeling
Ronald V. Bjarnason, T. Peterson
Proceedings of the 2002 Congress on Evolutionary Computation. CEC'02 (Cat. No.02TH8600), 2002-05-12
DOI: 10.1109/CEC.2002.1004470 (https://doi.org/10.1109/CEC.2002.1004470)
We present a learning algorithm for two-player stochastic games. The algorithm generates optimal deterministic finite automaton (DFA) strategies against opponents that can be modeled by probabilistic action automata. To eliminate state aliasing, it builds dynamic history trees guided by statistical tests. Experiments are conducted in an iterated prisoner's dilemma environment.
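To make the setting concrete, the following is a minimal Python sketch of the environment the abstract describes: a DFA strategy playing the iterated prisoner's dilemma against an opponent modeled as a probabilistic action automaton. It is not the authors' learning algorithm and does not build history trees; it only shows how a fixed DFA strategy can be scored against such an opponent. The payoff values (5, 3, 1, 0), the class and function names, and the tit-for-tat opponent are all assumptions chosen for illustration; the abstract does not specify them.

```python
import random

# Assumed payoffs for the learner (conventional T=5, R=3, P=1, S=0);
# the abstract does not state the values used in the experiments.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


class ProbabilisticActionAutomaton:
    """Hypothetical opponent model: each state has a probability of
    cooperating, and the next state depends on the learner's last action."""

    def __init__(self, coop_prob, transitions, start):
        self.coop_prob = coop_prob      # state -> P(opponent plays "C")
        self.transitions = transitions  # (state, learner_action) -> next state
        self.state = start

    def act(self, rng):
        return "C" if rng.random() < self.coop_prob[self.state] else "D"

    def observe(self, learner_action):
        self.state = self.transitions[(self.state, learner_action)]


class DFAStrategy:
    """Hypothetical learner strategy: one fixed action per state, with
    transitions driven by the opponent's observed action."""

    def __init__(self, output, transitions, start):
        self.output = output            # state -> action
        self.transitions = transitions  # (state, opponent_action) -> next state
        self.state = start

    def act(self):
        return self.output[self.state]

    def observe(self, opponent_action):
        self.state = self.transitions[(self.state, opponent_action)]


def make_tit_for_tat():
    # Tit-for-tat as a degenerate (probability 0 or 1) action automaton.
    return ProbabilisticActionAutomaton(
        coop_prob={"nice": 1.0, "angry": 0.0},
        transitions={("nice", "C"): "nice", ("nice", "D"): "angry",
                     ("angry", "C"): "nice", ("angry", "D"): "angry"},
        start="nice",
    )


def make_always(action):
    # One-state DFA that ignores the opponent and always plays `action`.
    return DFAStrategy(
        output={"s": action},
        transitions={("s", "C"): "s", ("s", "D"): "s"},
        start="s",
    )


def average_payoff(strategy, opponent, rounds, seed=0):
    """Play `rounds` simultaneous moves and return the learner's mean payoff."""
    rng = random.Random(seed)
    total = 0
    for _ in range(rounds):
        mine, theirs = strategy.act(), opponent.act(rng)
        total += PAYOFF[(mine, theirs)]
        strategy.observe(theirs)
        opponent.observe(mine)
    return total / rounds


if __name__ == "__main__":
    print(average_payoff(make_always("C"), make_tit_for_tat(), 100))
    print(average_payoff(make_always("D"), make_tit_for_tat(), 100))
```

Under these assumed payoffs, always cooperating outscores always defecting against tit-for-tat over 100 rounds, which illustrates why the best DFA strategy depends on the opponent's automaton. Replacing the 0 and 1 probabilities with intermediate values gives a genuinely stochastic opponent.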