Parallel Implementation of K-Means Algorithm Using MapReduce Approach
I. Borlea, R. Precup, Florin Dragan, Alexandra-Bianca Borlea
2018 IEEE 12th International Symposium on Applied Computational Intelligence and Informatics (SACI), published 2018-05-01
DOI: 10.1109/SACI.2018.8441018 (https://doi.org/10.1109/SACI.2018.8441018)
Citations: 4
Abstract
The information stored in a database can be processed to find patterns, to group records into classes according to a criterion, or to extract valuable information hidden among the records. Artificial intelligence techniques are used to analyze large volumes of data with algorithms designed to handle large amounts of information. The time a dataset analysis algorithm needs to process a dataset usually increases with the size of that dataset. Given that hardware components have evolved in recent years, dataset analysis algorithms can now be parallelized. This paper presents a parallel implementation of the K-means clustering algorithm on a Windows-based operating system using the MapReduce approach.
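
The abstract does not describe the implementation itself, so the following is only a minimal sketch of how K-means is commonly expressed in MapReduce terms, not the authors' code. It assumes the usual decomposition: a map step assigns each point in a data chunk to its nearest centroid and emits per-cluster partial sums and counts, and a reduce step merges those partials to compute the new centroids. All names (`map_chunk`, `merge`, `kmeans_mapreduce`, `n_chunks`) are hypothetical, and a thread pool stands in for whatever parallel runtime the paper uses; in CPython a process pool or a MapReduce framework would be needed for real CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_chunk(chunk, centroids):
    """Map step: per-cluster partial (coordinate sums, count) for one chunk."""
    partial = {}
    for point in chunk:
        idx = nearest(point, centroids)
        sums, count = partial.get(idx, ([0.0] * len(point), 0))
        partial[idx] = ([s + p for s, p in zip(sums, point)], count + 1)
    return partial


def merge(a, b):
    """Reduce step: combine two partial results cluster by cluster."""
    merged = dict(a)
    for idx, (sums, count) in b.items():
        if idx in merged:
            old_sums, old_count = merged[idx]
            merged[idx] = ([x + y for x, y in zip(old_sums, sums)], old_count + count)
        else:
            merged[idx] = (sums, count)
    return merged


def kmeans_mapreduce(points, centroids, n_chunks=4, max_iter=100):
    """Iterate map and reduce until the centroids stop changing."""
    centroids = [list(c) for c in centroids]
    # Assumption: the data is split into fixed chunks once, up front.
    chunks = [points[i::n_chunks] for i in range(n_chunks)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for _ in range(max_iter):
            partials = pool.map(map_chunk, chunks, [centroids] * n_chunks)
            totals = reduce(merge, partials, {})
            # A cluster that received no points keeps its previous centroid.
            new_centroids = [
                [s / totals[i][1] for s in totals[i][0]] if i in totals else list(c)
                for i, c in enumerate(centroids)
            ]
            if new_centroids == centroids:
                break
            centroids = new_centroids
    return centroids
```

Calling `kmeans_mapreduce(points, initial_centroids)` with a list of coordinate tuples returns the final centroids as lists of floats. Emitting sums and counts rather than raw assignments keeps the data passed to the reduce step proportional to the number of clusters, not the number of points, which is the main reason this decomposition parallelizes well.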