Reinforcement learning with temporal and variable dependency-aware transformer for stock trading optimization

IF 6.3 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Networks | Pub Date: 2025-07-23 | DOI: 10.1016/j.neunet.2025.107905

Yifan Li, Xu Dong, Zhuang Wu, Jing Gao, Tianqi Zhang, Lina Yu

Neural Networks, Volume 192, Article 107905. https://www.sciencedirect.com/science/article/pii/S0893608025007865

Citations: 0

Abstract

Stock trading optimization seeks to optimize portfolios in dynamic market environments and plays a crucial role in practical financial decision-making. With the rise of the Transformer in recent years, some researchers have combined Transformers with Reinforcement Learning (RL) to better represent latent patterns in market data. However, existing methods focus mainly on capturing temporal dependencies and fail to model the interactions among multiple variables effectively, which limits the decision-making information available for policy learning in RL. To this end, this paper proposes an RL model that integrates a Temporal and Variable Dependency-aware Transformer to learn diverse dependency relationships in market data. First, a short-term prediction module and a long-term prediction module are designed to explore potential dependencies in the market data over a short-term and a long-term horizon, respectively. The core of both prediction modules is the Temporal and Variable Dependency-aware Transformer, which is implemented in two stages: the first stage captures temporal relationships along the temporal dimension, and the second stage captures multivariate correlations across the variable dimension. Meanwhile, a relation representation module is proposed to further capture correlations among different stock assets within a market. Finally, a policy decision module is introduced to fuse the representations from the preceding modules into a unified space, enabling RL to learn flexible policies with comprehensive decision-making information. The experimental results show that the proposed method achieves the highest Sharpe ratio of 1.48 and portfolio return of 2.65, outperforming state-of-the-art methods on three challenging datasets: CSI-300, S&P-100, and NASDAQ-100.
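The two-stage idea described in the abstract (attention along the temporal dimension first, then across the variable dimension) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the parameter-free attention (no learned query/key/value projections), and the toy input are all assumptions made for illustration, using only the Python standard library.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention without learned projections.

    tokens: list of n vectors, each a list of d floats.
    Queries, keys and values are the tokens themselves, which is a
    simplification for illustration (an assumption, not the paper's design).
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def transpose(matrix):
    """Swap the two axes of a list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def two_stage_attention(window):
    """Apply temporal attention, then variable attention.

    window: T time steps x V variables (list of lists of floats).
    Stage 1 treats each time step as a token and mixes across time.
    Stage 2 treats each variable as a token and mixes across variables.
    Returns a T x V matrix.
    """
    temporal = self_attention(window)        # T tokens of dimension V
    by_variable = transpose(temporal)        # V tokens of dimension T
    variable = self_attention(by_variable)   # mix information across variables
    return transpose(variable)               # back to T x V


# Hypothetical toy window: 4 time steps, 3 variables (e.g. price, volume, volatility).
window = [
    [1.0, 0.5, 0.2],
    [1.1, 0.4, 0.3],
    [0.9, 0.6, 0.1],
    [1.2, 0.5, 0.2],
]
features = two_stage_attention(window)
print(len(features), len(features[0]))  # 4 3
```

Because each attention output is a weighted average of its inputs, the result keeps the input shape and stays within the input's value range. A real model would add learned projections, multiple heads, and the prediction, relation, and policy modules the abstract mentions.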

Source journal

Neural Networks (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore

13.90

Self-citation rate

7.70%

Articles published

425

Review time

67 days

Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.