Bridge Bidding via Deep Reinforcement Learning and Belief Monte Carlo Search

IF 15.3 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS

IEEE/CAA Journal of Automatica Sinica Pub Date: 2024-09-04 DOI: 10.1109/JAS.2024.124488

Zizhang Qiu;Shouguang Wang;Dan You;MengChu Zhou

Citations: 0

Abstract

Contract Bridge, a four-player imperfect-information game, comprises two phases: bidding and playing. While computer programs excel at playing, bidding remains challenging because players must exchange information with their partner while interfering with the opponents' communication. In this work, we introduce a Bridge bidding agent that combines supervised learning, deep reinforcement learning via self-play, and a test-time search approach. Our experiments demonstrate that our agent outperforms WBridge5, a highly regarded computer Bridge program that has won multiple world championships, by 0.98 IMPs (international match points) per deal over 10 000 deals, with a much more cost-effective approach. This performance significantly surpasses the previous state of the art (0.85 IMPs per deal). Note that 0.1 IMPs per deal is a significant improvement in Bridge bidding.
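The "IMPs per deal" figure quoted above can be illustrated with a minimal Python sketch. It assumes the standard IMP conversion scale used in duplicate Bridge and assumes that each deal yields one raw score for the agent's side and one for the baseline's side in the same seats. The function names `to_imps` and `mean_imps_per_deal` and the sample scores are hypothetical; this is not the authors' evaluation code.

```python
from bisect import bisect_right

# Lower bounds of the score-difference bands worth 1..24 IMPs
# (standard IMP scale; differences below 20 points are worth 0 IMPs).
IMP_THRESHOLDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
                  750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
                  3500, 4000]


def to_imps(score_diff: int) -> int:
    """Convert a raw score difference into signed IMPs."""
    magnitude = bisect_right(IMP_THRESHOLDS, abs(score_diff))
    return magnitude if score_diff >= 0 else -magnitude


def mean_imps_per_deal(agent_scores, baseline_scores):
    """Average IMPs won by the agent over paired deals (hypothetical helper)."""
    if len(agent_scores) != len(baseline_scores) or not agent_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    total = sum(to_imps(a - b) for a, b in zip(agent_scores, baseline_scores))
    return total / len(agent_scores)


# Illustrative (made-up) scores for three deals.
agent = [620, -100, 400]
baseline = [170, 110, 400]
print(mean_imps_per_deal(agent, baseline))
```

Because the scale compresses large swings, a single big result cannot dominate the average, which is one reason a margin of 0.1 IMPs per deal over thousands of deals is regarded as meaningful.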

Source journal

IEEE/CAA Journal of Automatica Sinica (Engineering: Control and Systems Engineering)

CiteScore

23.50

Self-citation rate

11.00%

Articles published

880

About the journal: The IEEE/CAA Journal of Automatica Sinica publishes high-quality papers in English on original theoretical and experimental research and development in the field of automation. The journal covers a wide range of topics, including automatic control, artificial intelligence and intelligent control, systems theory and engineering, pattern recognition and intelligent systems, automation engineering and applications, information processing and information systems, network-based automation, robotics, sensing and measurement, and navigation, guidance, and control. The journal is abstracted and indexed in several prominent databases, including SCIE (Science Citation Index Expanded), EI (Engineering Index), Inspec, Scopus, SCImago, DBLP, CNKI (China National Knowledge Infrastructure), CSCD (Chinese Science Citation Database), and IEEE Xplore.