Denoised Labels for Financial Time Series Data via Self-Supervised Learning

Yanqing Ma, Carmine Ventre, M. Polukarov
Published in: Proceedings of the Third ACM International Conference on AI in Finance
DOI: 10.1145/3533271.3561687 (https://doi.org/10.1145/3533271.3561687)
Publication date: 2021-12-19
Citations: 4

Abstract

The introduction of electronic trading platforms effectively changed the organisation of traditional systemic trading from quote-driven markets into order-driven markets. Its convenience led to an exponentially increasing amount of financial data, which is, however, hard to use for predicting future prices because of the low signal-to-noise ratio and the non-stationarity of financial time series. Simpler classification tasks, where the goal is to predict the direction of future price movement via supervised learning algorithms, need sufficiently reliable labels to generalise well. Labelling financial data is, however, less well defined than in other domains: did the price go up because of noise or because of a signal? Existing labelling methods offer limited countermeasures against noise and bring limited improvement to learning algorithms. This work takes inspiration from image classification in trading [6] and the success of self-supervised learning in computer vision (e.g., [16]). We investigate the idea of applying these techniques to financial time series to reduce the noise exposure and hence generate correct labels. We treat label generation as the pretext task of a self-supervised learning approach and compare the naive (and noisy) labels commonly used in the literature with the labels generated by a denoising autoencoder for the same downstream classification task. Our results demonstrate that these denoised labels improve the performance of the downstream learning algorithm, for both small and large datasets, while preserving the market trends. These findings suggest that, with our proposed techniques, self-supervised learning constitutes a powerful framework for generating "better" financial labels that are useful for studying the underlying patterns of the market.
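The contrast between naive labels and denoised labels can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the autoencoder architecture, so the sketch substitutes a linear stand-in (projection of price windows onto their top principal components) for a trained denoising autoencoder. All function names and the parameters `width`, `n_components` and `horizon` are assumptions made for illustration.

```python
import numpy as np


def naive_labels(prices, horizon=1):
    """Label 1 if the price is higher `horizon` steps ahead, else 0."""
    prices = np.asarray(prices, dtype=float)
    return (prices[horizon:] > prices[:-horizon]).astype(int)


def linear_reconstruct(windows, n_components):
    """Hypothetical stand-in for a denoising autoencoder.

    Projects each window onto the top principal components of the
    window matrix and maps it back, discarding low-variance detail.
    """
    mean = windows.mean(axis=0)
    centered = windows - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]              # the "encoder" directions
    return centered @ basis.T @ basis + mean


def denoise_series(prices, width=8, n_components=2):
    """Reconstruct overlapping windows and average them into one series."""
    prices = np.asarray(prices, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(prices, width)
    recon = linear_reconstruct(windows, n_components)
    total = np.zeros_like(prices)
    counts = np.zeros_like(prices)
    for i, row in enumerate(recon):
        total[i:i + width] += row
        counts[i:i + width] += 1
    return total / counts


def denoised_labels(prices, width=8, n_components=2, horizon=1):
    """Direction labels computed on the reconstructed series (assumed scheme)."""
    return naive_labels(denoise_series(prices, width, n_components), horizon)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    trend = np.linspace(100.0, 110.0, 200)           # synthetic upward trend
    noisy = trend + rng.normal(scale=0.5, size=200)  # synthetic noise
    print("share of 'up' labels, naive:   ", naive_labels(noisy).mean())
    print("share of 'up' labels, denoised:", denoised_labels(noisy).mean())
```

Both label sets feed the same downstream classifier in the setting the abstract describes; only the series from which the price direction is read differs. A neural denoising autoencoder trained on corrupted inputs would replace `linear_reconstruct` in a more faithful version.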