A mapreduce fuzzy techniques of big data classification

2016 SAI Computing Conference (SAI) Pub Date : 2016-07-13 DOI:10.1109/SAI.2016.7555971

O. Hegazy, Soha Safwat, M. El Bakry


Citations: 15

Abstract

The rapid growth in data size makes efficient analysis with traditional techniques difficult. Big data poses many challenges because of characteristics such as volume, velocity, variety, variability, value, and complexity. There is a need not only for efficient data mining techniques that can process large volumes of data, but also for a means of meeting the computational requirements of doing so. The objective of this research is to implement the MapReduce paradigm using fuzzy and crisp techniques, and to compare the results of the proposed systems with the methods reviewed in the literature. In this paper, four proposed systems are implemented using the MapReduce paradigm to process big data. In the mapper, two techniques are used: the fuzzy k-nearest neighbor (KNN) method as a fuzzy technique and the support vector machine (SVM) as a non-fuzzy technique. In the reducer, three techniques are used: the mode, fuzzy soft labels, and the Gaussian fuzzy membership function. The first proposed system uses fuzzy KNN in the mapper and the mode in the reducer; the second uses the SVM in the mapper and the mode in the reducer; the third uses the SVM in the mapper and soft labels in the reducer; and the fourth uses the SVM in the mapper and the Gaussian fuzzy membership function in the reducer. Results on different data sets show that the proposed fuzzy methods perform better than the proposed crisp method and the method reviewed in the literature.
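To make the first of the four systems concrete (fuzzy KNN in the mapper, the mode in the reducer), the following is a minimal in-process sketch rather than the authors' implementation. It assumes the commonly used Keller-style fuzzy KNN with crisp training labels and inverse-distance weighting, and it simulates the map, shuffle, and reduce steps with plain Python instead of a real MapReduce framework. All names (`fuzzy_knn_memberships`, `mapper`, `reducer_mode`, `run_job`) and the parameters `k` and `m` are hypothetical and chosen for illustration only; the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


def fuzzy_knn_memberships(train, x, k=3, m=2.0):
    """Return class memberships of query x (assumed Keller-style fuzzy KNN).

    train: list of (features, label) pairs with crisp labels.
    m:     fuzzifier (> 1); m = 2 gives inverse squared-distance weights.
    """
    # Keep the k training points closest to x.
    neighbours = sorted(train, key=lambda p: math.dist(p[0], x))[:k]
    weights = defaultdict(float)
    for feats, label in neighbours:
        d = max(math.dist(feats, x), 1e-12)  # guard against zero distance
        weights[label] += 1.0 / d ** (2.0 / (m - 1.0))
    total = sum(weights.values())
    # Normalise so the memberships sum to 1.
    return {label: w / total for label, w in weights.items()}


def mapper(partition, test_points, k=3):
    """One mapper sees one partition of the training data.

    Emits (test_id, label), where label has the highest fuzzy membership.
    """
    for test_id, x in enumerate(test_points):
        u = fuzzy_knn_memberships(partition, x, k)
        yield test_id, max(u, key=u.get)


def reducer_mode(labels):
    """Combine the labels emitted by all mappers by taking their mode."""
    return Counter(labels).most_common(1)[0][0]


def run_job(partitions, test_points, k=3):
    """Simulate map -> shuffle (group by test id) -> reduce in one process."""
    grouped = defaultdict(list)
    for partition in partitions:
        for test_id, label in mapper(partition, test_points, k):
            grouped[test_id].append(label)
    return [reducer_mode(grouped[i]) for i in range(len(test_points))]


if __name__ == "__main__":
    # Toy data, invented for this sketch: two training partitions, two classes.
    partitions = [
        [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"), ((0.9, 1.1), "b")],
        [((0.2, 0.1), "a"), ((0.0, 0.3), "a"), ((1.2, 0.9), "b"), ((1.1, 1.0), "b")],
    ]
    print(run_job(partitions, [(0.1, 0.1), (1.0, 0.9)]))
```

The other three systems would swap the mapper for an SVM classifier and, for the third and fourth, replace `reducer_mode` with an aggregation based on soft labels or a Gaussian membership function; the abstract does not give enough detail to sketch those faithfully.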

