Authors: Tong Cui, Haifeng Zhao
Venue: 2015 6th IEEE International Conference on Software Engineering and Service Science (ICSESS)
Publication date: 2015-11-30
DOI: 10.1109/ICSESS.2015.7339163 (https://doi.org/10.1109/ICSESS.2015.7339163)
A novel gender classification method based on MapReduce
A novel parallelized gender recognition method based on MapReduce is presented, which combines several machine learning algorithms commonly employed for gender recognition. A large set of face sample images is gathered and split into a training dataset and a test dataset, and Local Binary Pattern (LBP) features are extracted once the sample sets have been pre-processed. Principal Component Analysis (PCA) is then applied to the training dataset to extract the most discriminative features. Three classification algorithms, Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), and AdaBoost, are implemented and compared to determine the most suitable algorithm for gender parallelized machine learning (GPML). To achieve the shortest execution time, we propose applying GPML with MapReduce, which avoids parallelizing each of the three algorithms individually while also improving their scalability to big datasets. The results show that this method significantly reduces the computational cost of training as the number of computing nodes increases, while achieving better speedup and scalability than parallelized AdaBoost.
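The abstract does not say how work is divided between map and reduce tasks, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It computes a basic 3x3 LBP histogram per image and then runs k-NN over training data split into shards: each map task returns its local k nearest neighbours, and the reduce step merges them into a global top k. The PCA step is omitted for brevity, and all function names, the shard layout, and the generic class labels are hypothetical.

```python
from collections import Counter
from functools import reduce
import heapq

# Clockwise 8-neighbourhood, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(img):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    `img` is a 2-D list of grayscale values, at least 3x3; border pixels
    are skipped because they lack a full neighbourhood.
    """
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # A neighbour at least as bright as the centre sets its bit.
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_shard(shard, query, k):
    """Map step (assumed): k nearest (distance, label) pairs within one shard."""
    return heapq.nsmallest(k, ((sq_dist(feat, query), label) for feat, label in shard))


def reduce_candidates(left, right, k):
    """Reduce step (assumed): merge two candidate lists, keep the k nearest."""
    return heapq.nsmallest(k, left + right)


def knn_predict(shards, query, k=3):
    """Majority vote over the global k nearest neighbours across all shards."""
    # In a real cluster each map_shard call would run on a separate node.
    partial = [map_shard(shard, query, k) for shard in shards]
    nearest = reduce(lambda a, b: reduce_candidates(a, b, k), partial)
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def checkerboard(n):
    """Toy n x n texture used only to exercise the pipeline."""
    return [[(r + c) % 2 for c in range(n)] for r in range(n)]


def flat(n):
    """Toy n x n constant image used only to exercise the pipeline."""
    return [[5] * n for _ in range(n)]


# Two shards of labelled training features; labels are generic placeholders.
shards = [
    [(lbp_histogram(flat(5)), "class_a"), (lbp_histogram(checkerboard(5)), "class_b")],
    [(lbp_histogram(flat(7)), "class_a"), (lbp_histogram(checkerboard(7)), "class_b")],
]
prediction = knn_predict(shards, lbp_histogram(checkerboard(6)), k=3)
```

Because each map task only ever returns k candidates, the data sent to the reduce step stays small regardless of shard size, which is one reason k-NN fits this pattern without changes to the classifier itself. SVM and AdaBoost would need a different split of work, which the abstract leaves unspecified.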