HORSE-CFR: Hierarchical opponent reasoning for safe exploitation counterfactual regret minimization
Authors: Shijia Wang, Jiao Wang, Bangyan Song
Journal: Expert Systems with Applications, volume 263, Article 125697 (published 2024-11-09; impact factor 7.5; JCR Q1, Computer Science, Artificial Intelligence)
DOI: 10.1016/j.eswa.2024.125697
URL: https://www.sciencedirect.com/science/article/pii/S0957417424025648
Game decision-making algorithms based on opponent modeling relax the assumption of rationality and so have the potential to achieve higher payoffs than Nash equilibrium strategies. Existing opponent modeling methods, however, struggle to reconcile computational complexity with robustness, which makes it difficult to reach high-payoff decisions from limited historical interactions in imperfect information games. This paper introduces the HORSE-CFR algorithm, which combines Hierarchical Opponent Reasoning (HOR) with Safe Exploitation Counterfactual Regret Minimization (SE-CFR) to make decisions in imperfect information games more robust. HOR combines neural networks with Bayesian theory to accelerate reasoning, improve interpretability, and reduce modeling errors. SE-CFR balances profitability against conservatism by integrating opponent modeling-based strategy adaptation into a constrained linear binary optimization framework. In experiments against various opponents, HORSE-CFR outperformed Nash equilibrium strategies, increasing payoffs by 16.4% in Leduc Hold’em and 36.8% in the Transit game. It also improved payoffs by more than 9.0% over the best-known opponent modeling-based safe adaptive algorithm in both games.
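The general idea of exploiting a modeled opponent while bounding the worst-case loss can be sketched on a toy game. The following is an illustration only and is not the authors' HOR or SE-CFR: it assumes rock-paper-scissors instead of Leduc Hold’em, two hypothetical opponent types with made-up action likelihoods, a plain Bayesian update in place of the neural-network-assisted reasoning, and a grid search in place of the constrained optimization. All function and variable names are invented for this sketch.

```python
# Toy sketch (assumptions throughout): Bayesian opponent-type inference
# followed by a safety-bounded mix of Nash and best-response play.

# Row player's payoffs in rock-paper-scissors (zero-sum); index 0=R, 1=P, 2=S.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
NASH = [1 / 3, 1 / 3, 1 / 3]  # equilibrium strategy of this game
GAME_VALUE = 0.0              # payoff the equilibrium strategy guarantees


def bayes_update(prior, likelihoods, action):
    """Posterior over opponent types after observing one opponent action."""
    unnorm = [p * lik[action] for p, lik in zip(prior, likelihoods)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def predicted_strategy(posterior, likelihoods):
    """Opponent action distribution implied by the posterior over types."""
    n_actions = len(likelihoods[0])
    return [sum(w * lik[a] for w, lik in zip(posterior, likelihoods))
            for a in range(n_actions)]


def expected_payoff(row_strategy, col_strategy):
    """Row player's expected payoff for a pair of mixed strategies."""
    return sum(row_strategy[i] * PAYOFF[i][j] * col_strategy[j]
               for i in range(3) for j in range(3))


def best_response(col_strategy):
    """Pure row strategy maximizing payoff against the modeled opponent."""
    values = [sum(PAYOFF[i][j] * col_strategy[j] for j in range(3))
              for i in range(3)]
    best = max(range(3), key=values.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(3)]


def worst_case_loss(row_strategy):
    """How far below the game value an adversarial opponent can push us."""
    worst = min(sum(row_strategy[i] * PAYOFF[i][j] for i in range(3))
                for j in range(3))
    return GAME_VALUE - worst


def safe_exploit(col_model, epsilon, steps=100):
    """Largest grid weight on the best response with loss bound <= epsilon."""
    br = best_response(col_model)
    chosen = list(NASH)
    for k in range(steps + 1):
        p = k / steps
        mix = [p * b + (1 - p) * n for b, n in zip(br, NASH)]
        if worst_case_loss(mix) <= epsilon + 1e-12:
            chosen = mix  # p increases, so the last feasible mix is kept
    return chosen


# Hypothetical opponent types: "rock-heavy" and "uniform".
likelihoods = [[0.6, 0.2, 0.2], [1 / 3, 1 / 3, 1 / 3]]
posterior = [0.5, 0.5]
for observed in [0, 0, 0]:  # the opponent played rock three times
    posterior = bayes_update(posterior, likelihoods, observed)

model = predicted_strategy(posterior, likelihoods)
strategy = safe_exploit(model, epsilon=0.2)
print("posterior over types:", posterior)
print("safe exploiting strategy:", strategy)
print("payoff vs. model:", expected_payoff(strategy, model))
print("worst-case loss:", worst_case_loss(strategy))
```

The parameter `epsilon` plays the role of the conservatism knob: at 0 the sketch falls back to the equilibrium strategy, and larger values allow more weight on the best response to the modeled opponent at the cost of greater exposure if the model is wrong.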
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.