Online probabilistic knowledge distillation on cryptocurrency trading using Deep Reinforcement Learning

IF 3.9 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Pattern Recognition Letters | Pub Date: 2024-10-01 | DOI: 10.1016/j.patrec.2024.10.005

Vasileios Moustakidis, Nikolaos Passalis, Anastasios Tefas

Pattern Recognition Letters, Volume 186, Pages 243-249. Journal Article. https://www.sciencedirect.com/science/article/pii/S0167865524002939

Citations: 0

Abstract


Leveraging Deep Reinforcement Learning (DRL) to train agents for financial trading has gained significant attention in recent years. However, training these agents in noisy financial environments remains challenging and unstable, which significantly degrades their performance as trading agents, as recent literature has also shown. This paper introduces a novel distillation method for DRL agents that aims to improve their training stability. The proposed method transfers knowledge from a teacher ensemble to a student model, incorporating both the action probability distribution from the output layer and the knowledge from the intermediate layers of the teacher's network. Furthermore, the proposed method works in an online fashion, eliminating the separate teacher training process typically involved in many DRL distillation pipelines and thereby simplifying distillation. The proposed method is extensively evaluated on a large-scale cryptocurrency trading setup, demonstrating that it both significantly improves trading accuracy and profit and increases the stability of the training process.
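The two knowledge sources named in the abstract (output-layer action probabilities and intermediate-layer representations) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss, so everything below is an assumption made for illustration. In particular, the function names, the `alpha` and `beta` weights, the simple averaging of teacher distributions, and the cosine-similarity matching of intermediate features are all hypothetical choices. The online aspect (teachers and student trained at the same time) is not modelled here; the sketch only computes a loss for one batch of states.

```python
import math

def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def ensemble_probs(teacher_logits):
    """Average the action distributions of all teachers for one state (assumed aggregation)."""
    dists = [softmax(logits) for logits in teacher_logits]
    return [sum(d[a] for d in dists) / len(dists) for a in range(len(dists[0]))]

def similarity_rows(features, eps=1e-12):
    """Row-normalised cosine similarities between the samples of a batch."""
    norms = [math.sqrt(sum(x * x for x in f)) + eps for f in features]
    rows = []
    for i, fi in enumerate(features):
        sims = []
        for j, fj in enumerate(features):
            if i == j:
                sims.append(0.0)  # ignore self-similarity
                continue
            cos = sum(a * b for a, b in zip(fi, fj)) / (norms[i] * norms[j])
            sims.append((cos + 1.0) / 2.0)  # map [-1, 1] to [0, 1]
        total = sum(sims) + eps
        rows.append([s / total for s in sims])
    return rows

def distillation_loss(student_logits, teacher_logits, student_feats, teacher_feats,
                      alpha=1.0, beta=1.0):
    """Hypothetical combined loss for one batch of states.

    student_logits[s]    : student action logits for state s
    teacher_logits[t][s] : action logits of teacher t for state s
    *_feats[s]           : intermediate-layer feature vector for state s
    """
    n_states = len(student_logits)
    # Output-layer term: match the averaged teacher action distribution.
    output_term = sum(
        kl_divergence(ensemble_probs([t[s] for t in teacher_logits]),
                      softmax(student_logits[s]))
        for s in range(n_states)
    ) / n_states
    # Intermediate-layer term: match pairwise sample similarities.
    t_rows = similarity_rows(teacher_feats)
    s_rows = similarity_rows(student_feats)
    feature_term = sum(kl_divergence(tr, sr) for tr, sr in zip(t_rows, s_rows)) / len(t_rows)
    return alpha * output_term + beta * feature_term

# Two teachers, two states, three actions (illustrative numbers only).
teachers = [
    [[2.0, 0.0, -1.0], [0.5, 0.1, 0.3]],
    [[1.5, 0.2, -0.5], [0.4, 0.0, 0.6]],
]
student = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(distillation_loss(student, teachers, [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.2], [0.1, 1.0]]))
```

Because the intermediate-layer term compares batch-by-batch similarity matrices rather than raw activations, the student and teacher feature vectors in this sketch do not need to have the same dimensionality.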


Source journal

Pattern Recognition Letters (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 12.40
Self-citation rate: 5.90%
Articles published: 287
Review time: 9.1 months

About the journal: Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association for Pattern Recognition, and other developing themes involving learning and recognition.