An evaluation model for Cloud-based Data mining Systems with Hadoop
Authors: Anil Limbu, S. Heiyanthuduwage
Venue: 2020 5th International Conference on Innovative Technologies in Intelligent Systems and Industrial Applications (CITISIA)
Published: 2020-11-25
DOI: https://doi.org/10.1109/CITISIA50690.2020.9371799
Traditional data mining approaches are expensive, slow, and inefficient when applied to big data. This motivates the use of cloud technology, which can discover knowledge from very large databases at high speed. Hadoop makes the processing more efficient through its underlying parallelism and data locality. The aim of this work is to derive, from a review of the literature, a system that reflects the most efficient cloud data mining technology. The system should be capable of mining big data and be broadly applicable while addressing the problems of existing mining technologies. To this end, the existing technologies described in relevant works were used to build the overall framework, and related works were reviewed to better understand the current state of cloud data mining. The reviewed works indicate that some algorithms perform better than others in a given circumstance. Scalability, parallelism, and cost-effectiveness play a significant role in making the system more efficient, and Hadoop's data locality provides the greatest optimization of the mining process. Data mining is not a single task, and no one algorithm fits all mining tasks. The assumptions and circumstances of a given mining task determine its accuracy and overall performance. Data types and tasks change constantly, which underlines the need for dynamic data mining algorithms and techniques.
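The role of parallelism and data locality mentioned above can be illustrated with a minimal sketch. It is not the system described in the paper, and it does not use Hadoop itself: it assumes a toy item-counting task, with in-memory lists standing in for data blocks and threads standing in for cluster nodes. The names `count_items`, `merge_counts`, and `item_support` are hypothetical. The point it shows is that each partition is processed where it sits, and only the small partial counts are moved and merged.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_items(partition):
    """Map step: count item occurrences within one partition of transactions."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))  # count each item once per transaction
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right


def item_support(partitions, workers=2):
    """Count items per partition in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_items, partitions))
    return reduce(merge_counts, partials, Counter())


# Hypothetical toy data: two partitions standing in for two data blocks.
partitions = [
    [["milk", "bread"], ["milk", "eggs"]],
    [["bread", "eggs"], ["milk", "bread", "eggs"]],
]
support = item_support(partitions)
print(dict(support))
```

In an actual Hadoop deployment the scheduler tries to place each task on a node that already holds the relevant data block, so the raw transactions would not cross the network; the sketch only mimics that by passing each partition whole to a single worker.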