Combination of Time Series, Decision Tree and Clustering: A Case Study in Aerology Event Prediction

S.B. Lajevardi, B. Minaei-Bidgoli
2008 International Conference on Computer and Electrical Engineering. Publication date: 2008-12-20. DOI: 10.1109/ICCEE.2008.110 (https://doi.org/10.1109/ICCEE.2008.110)
Citations: 5

Abstract

Combination of Time Series, Decision Tree and Clustering: A Case Study in Aerology Event Prediction
Predictive systems use historical and other available data to predict events. In this paper we propose a general framework for predicting aerology events from time series streams and an event stream, using a combination of the K-means clustering algorithm and the C5 decision tree algorithm. First, we find the closest time series record for each event, which gives us the parameter values observed when the event occurred. Using K-means, we add a field to the data set that records the cluster of each record; we then predict events with the C5 algorithm, one of the well-known decision tree algorithms. Combined with a time series model, the framework can predict future events efficiently: the time series model predicts the next parameter values, and the framework predicts the closest event from them. We gathered data from the aerology organization for Tehran Mehrabad Station covering 1961 to 2005. The data contain fields such as wet bulb, relative humidity, amount of cloud, and wind speed, and include 17 types of events. The C5 method alone predicts events correctly in 74.11 percent of cases and wrongly in 25.89 percent. With the aid of K-means clustering, the correct rate rises to 85 percent and the error rate falls to 15 percent. 90 percent of the data was used for training and 10 percent for testing. We use 10-fold cross-validation to evaluate the prediction rate. This framework is the first estimate in the area of event prediction for a large aerology data set, and it can be extended to many other data sets in other environments.
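The first two steps described in the abstract (matching each event to its closest time series record, then adding a K-means cluster field) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `closest_record` and `kmeans`, the sample readings, and the event names are all hypothetical, and the K-means here is a plain Lloyd's iteration written with the standard library only. The C5 decision tree step is left out, since the abstract gives no details of how it was configured.

```python
from bisect import bisect_left
import random


def closest_record(timestamps, event_time):
    """Return the index of the record nearest in time to event_time.

    timestamps must be sorted in ascending order.
    """
    i = bisect_left(timestamps, event_time)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer; ties go to the earlier record.
    before, after = timestamps[i - 1], timestamps[i]
    return i - 1 if event_time - before <= after - event_time else i


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: label each point with its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


# Hypothetical readings: (hour, relative_humidity_percent, wind_speed)
records = [(0, 80.0, 2.0), (3, 82.0, 3.0), (6, 40.0, 12.0), (9, 38.0, 14.0)]
# Hypothetical event stream: (hour, event_type)
events = [(4, "fog"), (8, "dust")]

times = [r[0] for r in records]
features = [r[1:] for r in records]
clusters = kmeans(features, k=2)

# One training row per event: parameter values, cluster field, event label.
training_rows = []
for event_time, event_type in events:
    idx = closest_record(times, event_time)
    training_rows.append(features[idx] + (clusters[idx], event_type))
```

Rows shaped like `training_rows` (parameter values, cluster field, event label) are what a decision tree learner would then be trained on. The cluster labels themselves are arbitrary integers, so only their grouping is meaningful.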