Policy weighting via discounted Thompson sampling for non-stationary market-making

Óscar Fernández Vicente, Javier García, Fernando Fernández

Artificial Intelligence Review, vol. 58, no. 10, published 2025-07-19. DOI: 10.1007/s10462-025-11312-9
https://link.springer.com/article/10.1007/s10462-025-11312-9
Citations: 0
Abstract
Market makers are essential participants in every financial market. They provide liquidity to the system by placing buy and sell orders at multiple price levels. While performing this task, they aim to earn profit and manage inventory levels simultaneously. However, financial markets are not stationary environments; they constantly evolve, influenced by changes in participants, the occurrence of economic events, or shifts in market trading hours, among other factors. This study introduces a novel approach to the challenge of market-making in non-stationary financial markets using multi-objective Reinforcement Learning (RL). Traditional RL methods often struggle in non-stationary environments, as the learned optimal policy may not adapt to the new dynamics. We present Policy Weighting through Discounted Thompson Sampling (POW-dTS), a novel dynamic algorithm that adapts to changing market conditions by effectively weighting pre-trained policies across various contexts. Unlike some conventional methods, POW-dTS does not require additional artifacts such as change-point detection or transition models, making it robust against the unpredictability inherent in financial markets. Our approach focuses on optimizing trade profitability and managing inventory risk, the dual objectives of market makers. Through a detailed comparative analysis, we highlight the strengths and adaptability of POW-dTS against traditional techniques in non-stationary environments, demonstrating its potential to enhance market liquidity and efficiency.
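The abstract describes weighting pre-trained policies with discounted Thompson sampling, where older evidence is decayed so the sampler can track non-stationary rewards. The paper's exact POW-dTS formulation is not given here; the following is only an illustrative sketch of the underlying discounted-Thompson-sampling idea, treating each pre-trained policy as a Bernoulli bandit arm with a Beta posterior (all class and parameter names are hypothetical):

```python
import random


class DiscountedThompsonSampler:
    """Illustrative discounted Thompson sampling over a set of
    pre-trained policies, one Bernoulli arm per policy.

    Each arm keeps a Beta(alpha, beta) posterior. Before every
    update, all posteriors are multiplied by a discount factor so
    stale evidence fades, letting the sampler shift weight toward
    whichever policy performs best under the current regime.
    """

    def __init__(self, n_policies, discount=0.95, prior=1.0):
        self.discount = discount
        self.alpha = [prior] * n_policies  # pseudo-counts of successes
        self.beta = [prior] * n_policies   # pseudo-counts of failures

    def select(self):
        # Sample a plausible success rate for each policy from its
        # posterior, then act with the policy whose sample is highest.
        samples = [random.betavariate(a, b)
                   for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, chosen, reward):
        # Decay every arm's evidence, then credit the chosen arm.
        # `reward` is assumed to be normalized into [0, 1], e.g. a
        # binarized profit-and-loss signal for the trading step.
        for i in range(len(self.alpha)):
            self.alpha[i] *= self.discount
            self.beta[i] *= self.discount
        self.alpha[chosen] += reward
        self.beta[chosen] += 1.0 - reward
```

With `discount=1.0` this reduces to standard Thompson sampling; values below 1 bound the effective sample size, which is what allows adaptation after a regime change without any explicit change-point detector.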
About the journal
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.