Lei Li, Farzad Noorian, Duncan J. M. Moss, P. Leong
Proceedings of the 2014 IEEE 15th International Conference on Information Reuse and Integration (IEEE IRI 2014). Published 2014-08-30. DOI: 10.1109/IRI.2014.7051965 (https://doi.org/10.1109/IRI.2014.7051965)
Rolling window time series prediction using MapReduce
Prediction of time series data is an important application in many domains. Despite their advantages, traditional databases and the MapReduce methodology are not ideally suited to this type of processing because of the dependencies introduced by the sequential nature of time series. We present a novel framework to facilitate retrieval and rolling-window prediction of irregularly sampled, large-scale time series data. By introducing a new index pool data structure, processing of time series can be efficiently parallelised. The proposed framework is implemented in the R programming environment and utilises Hadoop to support parallelisation and fault tolerance. Experimental results indicate that the proposed framework scales linearly up to 32 nodes.
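
The abstract does not describe how the index pool is laid out, so the following is only a rough sketch of the general idea, not the authors' implementation (which is in R on Hadoop). It assumes that each rolling window over an irregularly sampled series can be described in advance by a pair of array positions, so that every window becomes an independent map task. The names `build_window_index`, `map_predict` and `rolling_predict` are hypothetical, and the mean is a placeholder for a real forecasting model.

```python
from bisect import bisect_left
from statistics import mean


def build_window_index(timestamps, window, step):
    """Precompute (start, end) positions for each rolling window.

    `timestamps` must be sorted; samples may be irregularly spaced.
    Each window covers the half-open time interval [t, t + window).
    This index is an illustrative stand-in, not the paper's index pool.
    """
    index = []
    if not timestamps:
        return index
    t = timestamps[0]
    last = timestamps[-1]
    while t + window <= last:
        lo = bisect_left(timestamps, t)           # first sample at or after t
        hi = bisect_left(timestamps, t + window)  # first sample at or after t + window
        index.append((lo, hi))
        t += step
    return index


def map_predict(values, span):
    """Map task for one window: placeholder predictor (mean of the window)."""
    lo, hi = span
    chunk = values[lo:hi]
    return mean(chunk) if chunk else None


def rolling_predict(timestamps, values, window, step):
    """Apply the predictor to every window; each span is independent of the others."""
    index = build_window_index(timestamps, window, step)
    return [map_predict(values, span) for span in index]


if __name__ == "__main__":
    ts = [0, 1, 3, 4, 7, 8, 10]
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    print(build_window_index(ts, window=4, step=2))
    print(rolling_predict(ts, xs, window=4, step=2))
```

Because each span refers only to positions in the stored series, overlapping windows need no communication with one another, which is the property a MapReduce job relies on. In this sketch the loop in `rolling_predict` could be handed to any parallel map without changing the other functions.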