An Evaluation Model for Cloud-Based Data Mining Systems with Hadoop

Anil Limbu, S. Heiyanthuduwage
Published in: 2020 5th International Conference on Innovative Technologies in Intelligent Systems and Industrial Applications (CITISIA)
Publication date: 2020-11-25
DOI: 10.1109/CITISIA50690.2020.9371799
Citations: 0

Abstract

Traditional data mining approaches are expensive, slow, and inefficient when applied to big data. This motivates the use of cloud technology, which can discover knowledge from very large databases at high speed. Implementing the system on Hadoop makes processing more efficient because of Hadoop's underlying parallelism and data locality. The aim of this work is to derive, from a review of the literature, a system that reflects the most efficient cloud data mining technology. Such a system should be able to mine big data and support a wide range of applications while addressing the problems of existing mining technologies. To that end, existing technologies described in relevant prior works were combined to form the overall framework, and related works were reviewed to better understand the current state of cloud data mining. According to the reviewed works, some algorithms perform better than others under a given set of circumstances. Scalability, parallelism, and cost-effectiveness play a significant role in making the system more efficient, and Hadoop's data locality feature offers the greatest optimization in the mining process. Data mining is not a single task, and no one algorithm fits every mining task. The assumptions made and the circumstances of a mining task determine its accuracy and overall performance. Because data types and tasks are always changing, data mining algorithms and techniques need to be dynamic.
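As a rough illustration of the parallelism the abstract attributes to Hadoop, the sketch below mimics the map, shuffle, and reduce phases in plain Python by counting item occurrences across input splits. It is not the authors' system and does not call Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `count_items`) and the sample transactions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(split: List[str]) -> List[Tuple[str, int]]:
    """Emit an (item, 1) pair for every item in one input split.

    On a real cluster each split would be handled by a separate mapper,
    ideally on the node that already stores the split (data locality).
    """
    pairs = []
    for transaction in split:
        for item in transaction.split():
            pairs.append((item, 1))
    return pairs


def shuffle(mapped: Iterable[List[Tuple[str, int]]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the shuffle step would."""
    grouped = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_items(splits: List[List[str]]) -> Dict[str, int]:
    """Run the three phases sequentially over hypothetical input splits."""
    mapped = [map_phase(split) for split in splits]  # independent per split
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    # Two hypothetical splits of a transaction log.
    sample_splits = [["milk bread", "bread eggs"], ["milk eggs bread"]]
    print(count_items(sample_splits))
```

Each call to `map_phase` depends only on its own split, which is what lets a framework such as Hadoop run the mappers in parallel; here they run one after another purely to keep the sketch self-contained.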