Sample-based forecasting exploiting hierarchical time series

Ulrike Fischer, Frank Rosenthal, Wolfgang Lehner
Proceedings. International Database Engineering and Applications Symposium, pp. 120-129. Published 2012-08-08. DOI: 10.1145/2351476.2351490 (https://doi.org/10.1145/2351476.2351490)
Citations: 5

Abstract

Time series forecasting is challenging because sophisticated forecast models are computationally expensive to build. Recent research has addressed the integration of forecasting inside a DBMS. One main benefit is that models can be created once and then used repeatedly to answer forecast queries. Forecast queries are often submitted at higher aggregation levels, e.g., forecasts of sales over all locations. To answer such a forecast query, we have two possibilities. First, we can aggregate all base time series (sales in Austria, sales in Belgium, ...) and create only one model for the aggregate time series. Second, we can create models for all base time series and aggregate the base forecast values. The second possibility might lead to higher accuracy, but it is usually too expensive due to the high number of base time series. However, we do not actually need all base models to achieve high accuracy; a sample of base models is enough. With this approach, we still achieve better accuracy than an aggregate model, very similar to that of using all models, but we need to create and maintain fewer models in the database. We further improve this approach for the case where new actual values of the base time series arrive at different points in time. With each new actual value, we can refine the aggregate forecast and eventually converge towards the real actual value. Our experimental evaluation using several real-world data sets shows a high accuracy of our approaches and a fast convergence towards the optimal value with increasing sample sizes and an increasing number of actual values, respectively.
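
The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the choice of simple exponential smoothing as the base model, the scale-up of the sampled forecasts by N/n, the mean-based refinement rule, and all function names (`ses_forecast`, `sample_based_forecast`, `refine_with_actuals`) are assumptions made only for illustration.

```python
import random


def ses_forecast(series, alpha=0.5):
    """One-step-ahead forecast with simple exponential smoothing.

    Assumption: stands in for whatever (more expensive) base model is used.
    """
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def sample_based_forecast(base_series, sample_size, seed=0):
    """Estimate the aggregate forecast from a random sample of base models.

    base_series maps a series name (e.g. a country) to its history.
    Returns the scaled-up aggregate forecast and the sampled base forecasts.
    Assumption: the sampled sum is scaled by N / n (total / sample size).
    """
    names = sorted(base_series)
    sample = random.Random(seed).sample(names, sample_size)
    base_forecasts = {name: ses_forecast(base_series[name]) for name in sample}
    scale = len(names) / sample_size
    return scale * sum(base_forecasts.values()), base_forecasts


def refine_with_actuals(n_total, base_forecasts, actuals):
    """Refine the aggregate forecast once some actual values have arrived.

    Known actuals are summed exactly; the series still pending are estimated
    by the mean of the sampled forecasts that have no actual value yet.
    When every actual has arrived, the result is the true aggregate.
    """
    known = sum(actuals.values())
    n_pending = n_total - len(actuals)
    pending = {k: v for k, v in base_forecasts.items() if k not in actuals}
    # Fall back to all sampled forecasts if no sampled series is still pending.
    pool = pending or base_forecasts
    mean_forecast = sum(pool.values()) / len(pool)
    return known + n_pending * mean_forecast
```

In this sketch only `sample_size` base models have to be built and maintained, and each newly arrived actual value replaces an estimated share of the aggregate with an exact one, so the refined value moves towards the real aggregate as more actuals arrive.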