Deep Q network with action retention for going long and short selling

IF 7.2 · Zone 1 (Computer Science) · JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Applied Soft Computing · Pub Date: 2025-05-21 · DOI: 10.1016/j.asoc.2025.113252

Qizhou Sun, Yain-Whar Si

Applied Soft Computing, Volume 178, Article 113252. Full text: https://www.sciencedirect.com/science/article/pii/S1568494625005630

Citations: 0

Abstract

In computer-simulated games, reinforcement learning is typically adopted to achieve victory by attaining the highest hand-crafted reward, guided by the optimal state-value functions along promising trajectories. In algorithmic trading, however, there is no clear goal that justifies hand-crafting an extremely high reward for the state-value function. In addition, the exploration and exploitation steps of reinforcement learning can generate a large number of unexpected buy and sell actions. These actions can lead to overlapping transactions, which do not yield a fair reward function. To alleviate these problems, we propose a novel trading algorithm named Deep Q Network with Action Retention (DQN-AR). First, an action retention mechanism is proposed to avoid overlapping transactions. Second, a divide-and-conquer approach breaks the profit maximization goal down into several sub-goals, with the aim of optimizing the annualized returns of all transactions over the entire trading period. Third, we evaluate the effectiveness of the proposed approach by implementing the DQN-AR model for both going long and short selling in algorithmic trading. In the experiments, we compare DQN-AR with DQN, Gated-DQN (GDQN), Simple Moving Average (SMA) and Dual Moving Average Crossover (DMAC). The experimental results show that DQN-AR outperforms DQN, GDQN, SMA and DMAC and achieves state-of-the-art trading performance for both long and short positions. In summary, DQN-AR achieves 15.4% higher profit on average than the second-best approach for the long position and 101.03% higher profit on average for the short position.
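The abstract does not spell out how action retention works, so the following is only a rough illustration and not the authors' implementation. It assumes daily price bars, a long-only position, and a simple rule in which a buy signal is ignored while a position is already open and a sell signal is ignored while no position is open, so that transactions never overlap. The names `retain_actions`, `Trade` and `TRADING_DAYS_PER_YEAR` are hypothetical, as is the compounding formula used for the per-transaction annualized return.

```python
from dataclasses import dataclass

BUY, HOLD, SELL = "buy", "hold", "sell"
TRADING_DAYS_PER_YEAR = 252  # assumption: one price bar per trading day


@dataclass
class Trade:
    """One completed long transaction (hypothetical container)."""
    entry_index: int
    exit_index: int
    entry_price: float
    exit_price: float

    def annualized_return(self) -> float:
        # Compound the holding-period return up to a one-year horizon.
        holding_days = self.exit_index - self.entry_index
        gross = self.exit_price / self.entry_price
        return gross ** (TRADING_DAYS_PER_YEAR / holding_days) - 1.0


def retain_actions(actions, prices):
    """Turn raw agent actions into non-overlapping long trades.

    Assumed rule: the first BUY opens a position; further BUYs are ignored
    until a SELL closes it; SELLs with no open position are ignored.
    """
    trades = []
    entry_index = None  # None means no position is open
    for i, (action, price) in enumerate(zip(actions, prices)):
        if action == BUY and entry_index is None:
            entry_index = i
        elif action == SELL and entry_index is not None:
            trades.append(Trade(entry_index, i, prices[entry_index], price))
            entry_index = None
        # Any other action leaves the current position unchanged.
    return trades


if __name__ == "__main__":
    raw_actions = [BUY, BUY, HOLD, SELL, SELL, BUY, SELL]
    bar_prices = [100.0, 101.0, 102.0, 110.0, 111.0, 105.0, 100.0]
    for trade in retain_actions(raw_actions, bar_prices):
        print(trade, round(trade.annualized_return(), 4))
```

Under these assumptions, the repeated buy at index 1 and the repeated sell at index 4 are dropped, leaving two separate transactions whose annualized returns can each be scored as a sub-goal. A short-selling variant would mirror the rule, opening on a sell and closing on a buy.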


Source journal

Applied Soft Computing (Engineering & Technology, Computer Science: Interdisciplinary Applications)

CiteScore: 15.80

Self-citation rate: 6.90%

Articles published: 874

Review time: 10.9 months

About the journal: Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. Its focus is to publish the highest-quality research on the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities. Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The website is therefore continuously updated with new articles, and the publication time is short.