Towards multi-agent reinforcement learning-driven over-the-counter market simulations

IF 2.4 · Zone 3 (Economics) · JCR Q3 (Business, Finance)

Mathematical Finance · Pub Date: 2023-09-20 · DOI: 10.1111/mafi.12416

Nelson Vadori, Leo Ardon, Sumitra Ganesh, Thomas Spooner, Selim Amrouni, Jared Vann, Mengda Xu, Zeyu Zheng, Tucker Balch, Manuela Veloso

Volume 34, Issue 2, pp. 262-347 · Journal Article · Not open access · Source: Semantic Scholar · https://onlinelibrary.wiley.com/doi/10.1111/mafi.12416

Citations: 0

Abstract

We study a game between liquidity provider (LP) and liquidity taker agents interacting in an over-the-counter market, for which the typical example is foreign exchange. We show how a suitable design of parameterized families of reward functions coupled with shared policy learning constitutes an efficient solution to this problem. By playing against each other, our deep-reinforcement-learning-driven agents learn emergent behaviors relative to a wide spectrum of objectives encompassing profit-and-loss, optimal execution, and market share. In particular, we find that LPs naturally learn to balance hedging and skewing, where skewing refers to setting their buy and sell prices asymmetrically as a function of their inventory. We further introduce a novel RL-based calibration algorithm, which we found performed well at imposing constraints on the game equilibrium. On the theoretical side, we are able to show convergence rates for our multi-agent policy gradient algorithm under a transitivity assumption, closely related to generalized ordinal potential games.
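To make the notion of skewing concrete, the following is a minimal Python sketch of a fixed linear skew rule. It is an illustration only, not the paper's method: in the paper the LP agents learn their pricing through reinforcement learning, whereas the function `skewed_quote`, the `Quote` type, and the parameters `half_spread` and `skew_per_unit` are hypothetical names introduced here, and the linear dependence on inventory is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    bid: float  # price at which the LP buys from a liquidity taker
    ask: float  # price at which the LP sells to a liquidity taker


def skewed_quote(mid: float, half_spread: float, inventory: float,
                 skew_per_unit: float) -> Quote:
    """Quote around `mid`, shifted against the LP's current inventory.

    Assumption (hypothetical rule): the shift is linear in inventory.
    A long LP (inventory > 0) lowers both prices, so its ask becomes more
    attractive to buyers and its bid less attractive to sellers, which
    tends to reduce the position. A short LP does the opposite.
    """
    shift = -skew_per_unit * inventory
    return Quote(bid=mid - half_spread + shift,
                 ask=mid + half_spread + shift)


flat = skewed_quote(mid=100.0, half_spread=0.5, inventory=0.0, skew_per_unit=0.125)
long_ = skewed_quote(mid=100.0, half_spread=0.5, inventory=2.0, skew_per_unit=0.125)
print(flat)   # symmetric around the mid price
print(long_)  # both prices shifted down; spread unchanged
```

With zero inventory the two prices sit symmetrically around the mid price. With a positive inventory the ask moves closer to the mid and the bid moves further away, which is the asymmetry the abstract describes. Hedging, the alternative the LPs balance against skewing, is not modeled in this sketch.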


Source journal

Mathematical Finance (Mathematics: Interdisciplinary Applications)

CiteScore: 4.10

Self-citation rate: 6.20%

Articles published: (not listed)

Review time: >12 weeks

About the journal: Mathematical Finance seeks to publish original research articles focused on the development and application of novel mathematical and statistical methods for the analysis of financial problems. The journal welcomes contributions on new statistical methods for the analysis of financial problems. Empirical results will be appropriate to the extent that they illustrate a statistical technique, validate a model or provide insight into a financial problem. Papers whose main contribution rests on empirical results derived with standard approaches will not be considered.