Title: Feature Reduction for Anomaly Detection in Manufacturing with MapReduce GA/kNN
Authors: Sikana Tanupabrungsun, T. Achalakul
Venue: 2013 International Conference on Parallel and Distributed Systems
DOI: 10.1109/ICPADS.2013.114 (https://doi.org/10.1109/ICPADS.2013.114)
Published: 2013-12-15
Citations: 6
Abstract
Manufacturing data is an important source of knowledge for improving production capability, and identifying the causes of defects may lead to improvements in production. However, production records generally contain an enormous number of features, and monitoring all of them at once is almost impossible in practice. This research proposes a feature reduction technique that identifies a subset of informative features representative of the whole dataset. In our methodology, manufacturing data are pre-processed and used as inputs. Feature selection is then performed by wrapping a Genetic Algorithm (GA) around the k-Nearest Neighbor (kNN) classifier. To improve performance, the proposed technique was parallelized with MapReduce. The results show that the number of features can be reduced by 50% with 83.12% accuracy. In addition, with MapReduce on the cloud, performance improves by a factor of 17.5.
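The GA/kNN wrapper described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the fitness function, GA operators, or parameter values, so everything below is an assumption made for illustration, including the names `knn_accuracy`, `fitness`, `select_features`, and `make_demo_data`, the leave-one-out evaluation, the per-feature `penalty`, and the synthetic data. Each chromosome is a bit mask over the features. Scoring every mask independently plays the role of a map step, and ranking the scores plays the role of a reduce step; the built-in `map` here is only a single-process stand-in for a real MapReduce job.

```python
import random


def knn_accuracy(X, y, mask, k=3):
    """Leave-one-out kNN accuracy using only the features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample, selected columns only.
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        correct += predicted == y[i]
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Assumed fitness: accuracy minus a small cost per selected feature."""
    return knn_accuracy(X, y, mask) - penalty * sum(mask)


def select_features(X, y, pop_size=20, generations=15, mutation_rate=0.1, seed=0):
    """Evolve bit masks over the features; return the best mask found."""
    rng = random.Random(seed)
    n = len(X[0])  # assumes at least two features
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        # "Map" step: each chromosome is scored independently of the others.
        scores = list(map(lambda m: fitness(X, y, m), population))
        # "Reduce" step: gather the scores and rank the chromosomes.
        ranked = [m for _, m in sorted(zip(scores, population),
                                       key=lambda pair: pair[0], reverse=True)]
        parents = ranked[: pop_size // 2]
        children = [ranked[0][:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    return max(population, key=lambda m: fitness(X, y, m))


def make_demo_data(n_samples=40, n_features=6, seed=1):
    """Synthetic data: features 0 and 1 separate the classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(3.0 * label, 0.5), rng.gauss(-3.0 * label, 0.5)]
        row += [rng.gauss(0.0, 3.0) for _ in range(n_features - 2)]
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_demo_data()
    mask = select_features(X, y)
    print("selected mask:", mask)
    print("leave-one-out accuracy:", knn_accuracy(X, y, mask))
```

Because the fitness of one mask does not depend on any other mask, the scoring step is the natural part to distribute, which is consistent with the abstract's choice to parallelize the technique with MapReduce.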