An evaluation model for Cloud-based Data mining Systems with Hadoop

2020 5th International Conference on Innovative Technologies in Intelligent Systems and Industrial Applications (CITISIA). Pub Date: 2020-11-25. DOI: 10.1109/CITISIA50690.2020.9371799

Anil Limbu, S. Heiyanthuduwage

Citations: 0

Abstract

An evaluation model for Cloud-based Data mining Systems with Hadoop

Traditional data mining approaches are expensive, slow, and inefficient when applied to big data. This motivates the use of cloud technology, which can discover knowledge from very large databases at high speed. Implementing Hadoop makes processing more efficient because of its underlying parallelism and data locality. The aim is to design, on the basis of a literature review, a system that reflects the most efficient cloud data mining technology. The system should be capable of mining big data and be widely applicable while addressing the problems of existing mining technologies. To this end, existing technologies described in relevant works were combined to form the overall framework, and related works were reviewed to better understand the current state of cloud data mining. According to the reviewed references, in any given circumstance some algorithms perform better than others. Scalability, parallelism, and cost-effectiveness play a significant role in making the system more efficient, and Hadoop's data locality feature offers the greatest optimization of the mining process. Data mining is not a single task, and no one algorithm fits all mining tasks. The assumptions and circumstances of a mining task determine its accuracy and overall performance. Data types and tasks are always changing, which indicates the need for dynamic data mining algorithms and techniques.
