PortfolioZero: A stock portfolio model based on deep reinforcement learning
Authors: Haifeng Li, Mo Hai
Journal: Applied Soft Computing, volume 183, Article 113578 (Impact Factor 7.2; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-07-23
DOI: 10.1016/j.asoc.2025.113578
URL: https://www.sciencedirect.com/science/article/pii/S1568494625008890
Citations: 0
Abstract
Current portfolio management studies mainly use reinforcement learning to build models that pursue high investment returns while minimizing the risk arising from market uncertainty. Two main issues remain. First, the complexity of financial markets makes it hard to capture patterns in asset price changes. Second, existing research assumes that stock prices accurately reflect all asset information and that historical prices alone can predict future trends, although numerous external factors can also influence judgments about the future. We introduce PortfolioZero, a novel model that addresses these problems. PortfolioZero combines three connected deep neural networks with Monte Carlo Tree Search to discover patterns in financial assets. In the representation network, a Transformer-based model embeds financial price data to capture temporal dynamics and potential correlations, providing richer feature representations; the prediction network and Monte Carlo Tree Search are redesigned to handle the continuous action space. Furthermore, we use the StructBERT model to process financial text data, converting market information into sentiment scores, which are used to reconstruct two reward functions that capture dynamic changes in the financial market. In experiments on the China A-share market, we compared our model with traditional portfolio methods and state-of-the-art deep reinforcement learning algorithms. PortfolioZero achieved an average annualized return rate of 21.21% across three portfolio types, outperforming SARL by 20.64% and DDPG by 41.97%, while the sentiment-enhanced reward functions improved the average annualized return rate by 35% compared with the basic reward.
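The abstract reports results as annualized return rates and mentions reward functions built from sentiment scores, but it gives neither formula. The following is a minimal sketch under stated assumptions: `annualized_return` uses the standard geometric annualization with 252 trading days per year (a common convention, not stated in the abstract), and `sentiment_adjusted_reward` is a purely hypothetical illustration of how a sentiment score in [-1, 1] could scale a log-return reward. Neither function is taken from the paper, and the `weight` parameter is invented for illustration.

```python
import math

# Assumption: 252 trading days per year; the abstract does not state the convention used.
TRADING_DAYS_PER_YEAR = 252


def annualized_return(values, periods_per_year=TRADING_DAYS_PER_YEAR):
    """Geometric annualized return from a chronological series of portfolio values."""
    if len(values) < 2 or values[0] <= 0:
        raise ValueError("need at least two values and a positive starting value")
    n_periods = len(values) - 1              # number of holding periods observed
    total_growth = values[-1] / values[0]    # cumulative growth factor
    return total_growth ** (periods_per_year / n_periods) - 1.0


def sentiment_adjusted_reward(prev_value, new_value, sentiment, weight=0.1):
    """Hypothetical reward: one-step log return scaled by a sentiment score.

    This is NOT the paper's reward function; it only illustrates the general
    idea of mixing a text-derived sentiment signal into a return-based reward.
    """
    if not -1.0 <= sentiment <= 1.0:
        raise ValueError("sentiment is assumed to lie in [-1, 1]")
    log_return = math.log(new_value / prev_value)
    return log_return * (1.0 + weight * sentiment)


if __name__ == "__main__":
    # One year of flat daily values followed by a jump to 121.21.
    series = [100.0] * 252 + [121.21]
    print(f"annualized return: {annualized_return(series):.4f}")
    print(f"reward: {sentiment_adjusted_reward(100.0, 101.0, 0.5):.6f}")
```

With a zero sentiment score the hypothetical reward reduces to the plain log return, so in this sketch the basic reward is the special case `sentiment = 0`.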
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.