Extracting predictive information from heterogeneous data streams using Gaussian Processes
Authors: Sid Ghoshal, Steve Roberts
Journal: Algorithmic Finance, vol. 5, no. 1, pp. 21-30 (JCR Q4, Business, Finance; impact factor 0.3)
Published: 2016-03-20
DOI: 10.3233/AF-160055 (https://doi.org/10.3233/AF-160055)
Citations: 10
Abstract
Financial markets are notoriously complex environments, presenting vast amounts of noisy yet potentially informative data. We consider the problem of forecasting financial time series from a wide range of information sources using online Gaussian Processes with Automatic Relevance Determination (ARD) kernels. We measure the performance gain, quantified in terms of Normalised Root Mean Square Error (NRMSE), Median Absolute Deviation (MAD) and Pearson correlation, from fusing the heterogeneous data streams of four separate data domains into a single probabilistic model: time series technicals, sentiment analysis, options market data and broker recommendations. We show evidence that ARD kernels produce meaningful feature rankings that help retain salient inputs and reduce input dimensionality, providing a framework for sifting through financial complexity. In particular, our findings highlight the critical value of options data in mapping out the curvature of price space and inspire an intuitive, novel direction for research in financial prediction.
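To make the role of the ARD kernel concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It fits a batch (not online) Gaussian Process on synthetic data in which only the first of two inputs drives the target, picks one length-scale per input by a coarse grid search over the log marginal likelihood (a stand-in for gradient-based optimisation), ranks the inputs by length-scale, and scores the forecast. All function and variable names are hypothetical. Normalising the RMSE by the standard deviation of the targets, and taking MAD as the median of the absolute errors, are assumptions, since the abstract does not define either.

```python
import itertools
import numpy as np


def ard_kernel(A, B, lengthscales, signal_var=1.0):
    """Squared-exponential kernel with one length-scale per input dimension (ARD)."""
    ls = np.asarray(lengthscales, dtype=float)
    A_s, B_s = A / ls, B / ls
    sq_dist = (
        (A_s ** 2).sum(axis=1)[:, None]
        + (B_s ** 2).sum(axis=1)[None, :]
        - 2.0 * A_s @ B_s.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0))


def fit_gp(X, y, lengthscales, noise_var):
    """Return (alpha, log marginal likelihood) for fixed hyperparameters."""
    n = len(X)
    K = ard_kernel(X, X, lengthscales) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    lml = -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)
    return alpha, lml


def predict_mean(X_train, alpha, X_test, lengthscales):
    """Posterior mean of the GP at the test inputs."""
    return ard_kernel(X_test, X_train, lengthscales) @ alpha


# Synthetic data (an assumption for illustration): only input 0 is informative.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(120, 2))
y = np.sin(2.0 * X[:, 0]) + 0.1 * rng.standard_normal(120)
X_train, y_train, X_test, y_test = X[:80], y[:80], X[80:], y[80:]

# Coarse grid search over per-dimension length-scales.
noise_var = 0.01
grid = [0.3, 1.0, 3.0, 10.0, 30.0]
best_ls = max(
    itertools.product(grid, repeat=2),
    key=lambda ls: fit_gp(X_train, y_train, ls, noise_var)[1],
)

# Shorter length-scale means the output varies faster along that input,
# so inputs are ranked from most to least relevant by ascending length-scale.
relevance_rank = np.argsort(best_ls)

alpha, _ = fit_gp(X_train, y_train, best_ls, noise_var)
y_pred = predict_mean(X_train, alpha, X_test, best_ls)

# Evaluation metrics (normalisation and MAD definition are assumptions).
nrmse = np.sqrt(np.mean((y_test - y_pred) ** 2)) / np.std(y_test)
mad = np.median(np.abs(y_test - y_pred))
pearson = np.corrcoef(y_test, y_pred)[0, 1]

print("length-scales:", best_ls, "relevance rank:", relevance_rank)
print(f"NRMSE={nrmse:.3f}  MAD={mad:.3f}  Pearson={pearson:.3f}")
```

In this sketch an input whose fitted length-scale is large relative to its range barely changes the kernel value, which is the sense in which ARD can flag inputs to drop. A real pipeline would more likely optimise the hyperparameters by gradient ascent on the marginal likelihood and update the model as new observations arrive.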
About the journal:
Algorithmic Finance is both a nascent field of study and a new high-quality academic research journal that seeks to bridge computer science and finance. It covers such applications as: high-frequency and algorithmic trading; statistical arbitrage strategies; momentum and other algorithmic portfolio management; machine learning and computational financial intelligence; agent-based finance; complexity and market efficiency; algorithmic analysis of derivatives valuation; behavioral finance and investor heuristics and algorithms; applications of quantum computation to finance; and news analytics and automated textual analysis.