Comparative study of streaming data mining techniques

S. Khan, Mushtaq Ahmed Peer, S. Quadri
Published in: 2014 International Conference on Computing for Sustainable Global Development (INDIACom)
Publication date: 2014-03-05
DOI: 10.1109/INDIACOM.2014.6828129 (https://doi.org/10.1109/INDIACOM.2014.6828129)
Citations: 5

Abstract

A wide range of knowledge discovery techniques extract new knowledge from data held in a data warehouse by processing it in multiple passes. Today, however, the challenge is to handle massive data properly and in a timely manner so that useful information (knowledge) can be extracted from streaming data. Such massive streaming data cannot be kept in limited storage, and because it flows continuously it must be processed in a single pass. Various algorithms have been proposed for single-pass knowledge extraction from streaming data; however, no single data mining algorithm is applicable to all problems, because real and synthetic data sets differ in kind. This paper discusses various streaming data mining techniques and compares the algorithms on several evaluation measures in an attempt to find the optimal solution for the generated synthetic data set.
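The single-pass constraint described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it uses Welford's online algorithm, chosen here only as a familiar example of summarizing a stream in constant memory, and the names `RunningStats`, `synthetic_stream`, and `summarize` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class RunningStats:
    """Welford's online algorithm: mean and variance in one pass, O(1) memory."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        # Fold one new item into the summary; the item itself is not stored.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 before any item has been seen.
        return self.m2 / self.count if self.count else 0.0


def synthetic_stream(n: int) -> Iterator[float]:
    """Hypothetical stand-in for a data stream: yields items one at a time."""
    for i in range(n):
        yield float(i % 10)


def summarize(stream: Iterable[float]) -> RunningStats:
    stats = RunningStats()
    for x in stream:  # each item is seen exactly once and then discarded
        stats.update(x)
    return stats
```

Calling `summarize(synthetic_stream(1000))` consumes the generator item by item, so memory use stays constant regardless of stream length. A multi-pass method, by contrast, would need the whole data set to be stored and re-read.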