Token-Based Adaptive Time-Series Prediction by Ensembling Linear and Non-linear Estimators: A Machine Learning Approach for Predictive Analytics on big Stock Data

2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA) Pub Date : 2018-12-01 DOI:10.1109/ICMLA.2018.00242

K. J. Morris, S. Egan, Jorell L. Linsangan, C. Leung, A. Cuzzocrea, Calvin S. H. Hoi

{"title":"Token-Based Adaptive Time-Series Prediction by Ensembling Linear and Non-linear Estimators: A Machine Learning Approach for Predictive Analytics on big Stock Data","authors":"K. J. Morris, S. Egan, Jorell L. Linsangan, C. Leung, A. Cuzzocrea, Calvin S. H. Hoi","doi":"10.1109/ICMLA.2018.00242","DOIUrl":null,"url":null,"abstract":"With technological advancements, big data can be easily generated and collected in many applications. Embedded in these big data are useful information and knowledge that can be discovered by machine learning and data mining models, techniques or algorithms. A rich source of big data is stock exchange. The ability to effectively predict future stock prices improves the economic growth and development of a country. Traditional linear approaches for prediction (e.g., Kalman filters) may not be practical in handling big data like stock prices due to highly nonlinear and chaotic nature. This lead to the exploitation of various nonlinear estimators such as the extended Kalman filters, expert systems, and various neural network architectures. Moreover, to lessen the potential shortcomings of individual algorithms, ensemble approaches have been created by averaging values across different algorithms. Existing ensemble techniques mostly basket-together a collection of sample-based algorithms that are catered to nonlinear functions. To the best of our knowledge, traditional linear estimators have not yet been incorporated into such an ensemble. Hence, in this paper, we propose a machine learning (specifically, token-based ensemble) algorithm that utilizes both linear and nonlinear estimators to predict big financial time-series data. Our ensemble consists of a traditional Kalman filter, long short-term memory (LSTM) network, and the traditional linear regression model. We also explore the adaptive properties in short-term high-risk trading in the presence of noisy data like stock prices and demonstrate the performance of our ensemble.","PeriodicalId":6533,"journal":{"name":"2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"2 1","pages":"1486-1491"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"50","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2018.00242","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 50

Abstract

With technological advancements, big data can be easily generated and collected in many applications. Embedded in these big data are useful information and knowledge that can be discovered by machine learning and data mining models, techniques or algorithms. A rich source of big data is stock exchange. The ability to effectively predict future stock prices improves the economic growth and development of a country. Traditional linear approaches for prediction (e.g., Kalman filters) may not be practical in handling big data like stock prices due to highly nonlinear and chaotic nature. This lead to the exploitation of various nonlinear estimators such as the extended Kalman filters, expert systems, and various neural network architectures. Moreover, to lessen the potential shortcomings of individual algorithms, ensemble approaches have been created by averaging values across different algorithms. Existing ensemble techniques mostly basket-together a collection of sample-based algorithms that are catered to nonlinear functions. To the best of our knowledge, traditional linear estimators have not yet been incorporated into such an ensemble. Hence, in this paper, we propose a machine learning (specifically, token-based ensemble) algorithm that utilizes both linear and nonlinear estimators to predict big financial time-series data. Our ensemble consists of a traditional Kalman filter, long short-term memory (LSTM) network, and the traditional linear regression model. We also explore the adaptive properties in short-term high-risk trading in the presence of noisy data like stock prices and demonstrate the performance of our ensemble.

查看原文本刊更多论文

集成线性和非线性估计量的基于标记的自适应时间序列预测:一种用于大股票数据预测分析的机器学习方法

随着技术的进步，大数据可以很容易地在许多应用中产生和收集。这些大数据中包含有用的信息和知识，可以通过机器学习和数据挖掘模型、技术或算法来发现。大数据的一个丰富来源是证券交易所。有效预测未来股票价格的能力提高了一个国家的经济增长和发展。传统的线性预测方法(如卡尔曼滤波)由于高度非线性和混沌的性质，在处理像股票价格这样的大数据时可能不实用。这导致了各种非线性估计器的开发，如扩展卡尔曼滤波器、专家系统和各种神经网络架构。此外，为了减少单个算法的潜在缺点，集成方法通过在不同算法之间取平均值来创建。现有的集成技术主要是将基于样本的算法集合在一起，以满足非线性函数的需求。据我们所知，传统的线性估计器还没有被整合到这样一个集合中。因此，在本文中，我们提出了一种机器学习(特别是基于令牌的集成)算法，该算法利用线性和非线性估计器来预测大型金融时间序列数据。我们的集成由传统的卡尔曼滤波、长短期记忆(LSTM)网络和传统的线性回归模型组成。我们还探讨了在股票价格等有噪声数据存在的短期高风险交易中的自适应特性，并展示了我们的集合的性能。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA)

自引率

0.00%

发文量