What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs

Raeid Saqur
arXiv - QuantFin - Computational Finance. Published 2024-06-20. Identifier: arxiv-2406.15508

Abstract

Machine learning techniques applied to financial market forecasting struggle with dynamic regime switching, that is, underlying correlation and covariance shifts in the true (hidden) market variables. Drawing inspiration from the success of reinforcement learning in robotics, particularly the agile locomotion adaptation of quadruped robots to unseen terrains, we introduce an approach that leverages the world knowledge of pretrained LLMs (a.k.a. "privileged information" in robotics) and dynamically adapts them using intrinsic, natural market rewards, through an LLM alignment technique we dub "Reinforcement Learning from Market Feedback" (RLMF). Strong empirical results demonstrate the efficacy of our method in adapting to regime shifts in financial markets, a challenge that has long plagued predictive models in this domain. The proposed algorithmic framework outperforms the best-performing SOTA LLMs on the existing FLARE benchmark stock-movement (SM) tasks by more than 15% in accuracy. On the recently proposed NIFTY SM task, our adaptive policy outperforms the best-performing SOTA trillion-parameter models such as GPT-4. The paper details the dual-phase, teacher-student architecture and implementation of our model, the empirical results obtained, and an analysis of the role of language embeddings in terms of Information Gain.
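The abstract speaks of "intrinsic, natural market rewards" for stock-movement prediction but does not spell out a reward function. As a rough illustration only, the sketch below shows one way a realized price move could be turned into a movement label and a scalar reward for a predicted label. The function names, the three-way rise/neutral/fall labeling, the 0.5% neutral band, and the +1/-1 reward values are all assumptions made for this example, not details taken from the paper.

```python
from typing import Literal

# Hypothetical three-way movement label; the paper's actual label set may differ.
Movement = Literal["rise", "neutral", "fall"]


def movement_label(prev_close: float, close: float, threshold: float = 0.005) -> Movement:
    """Classify the realized price move between two closes.

    `threshold` is an assumed neutral band (0.5% here), chosen for illustration.
    """
    if prev_close <= 0:
        raise ValueError("prev_close must be positive")
    ret = (close - prev_close) / prev_close  # simple return over the period
    if ret > threshold:
        return "rise"
    if ret < -threshold:
        return "fall"
    return "neutral"


def market_reward(
    predicted: Movement,
    prev_close: float,
    close: float,
    threshold: float = 0.005,
) -> float:
    """Return +1.0 if the predicted label matches the realized move, else -1.0.

    This is an assumed reward shape: the market itself supplies the feedback,
    so no human preference labels are needed.
    """
    realized = movement_label(prev_close, close, threshold)
    return 1.0 if predicted == realized else -1.0
```

For instance, calling `market_reward("rise", 100.0, 101.0)` scores a correct "rise" call against a 1% realized gain. A scalar of this kind could in principle feed any policy-gradient style update, though which update rule the paper uses is not stated in the abstract.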