Feature Reduction for Anomaly Detection in Manufacturing with MapReduce GA/kNN
Sikana Tanupabrungsun, T. Achalakul
2013 International Conference on Parallel and Distributed Systems, published 2013-12-15
DOI: https://doi.org/10.1109/ICPADS.2013.114
Manufacturing data is an important source of knowledge that can be used to enhance production capability. Detecting the causes of defects can lead to improvements in production. However, production records generally contain an enormous number of features, and monitoring all of them at once is almost impossible in practice. This research proposes a feature reduction technique designed to identify a subset of informative features that is representative of the whole dataset. In our methodology, manufacturing data are pre-processed and used as inputs. Feature selection is then performed by wrapping a Genetic Algorithm (GA) around the k-Nearest Neighbor (kNN) classifier. To improve performance, the proposed technique was parallelized with MapReduce. The results show that the number of features can be reduced by 50% with 83.12% accuracy. In addition, with MapReduce on the cloud, performance improves by a factor of 17.5.
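The wrapper idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the leave-one-out evaluation, the fitness penalty on the number of selected features, the GA parameters, and the toy data are all assumptions made for illustration, and every function name is hypothetical. Each chromosome is a bit mask over the features, and its fitness is the kNN accuracy obtained with only the selected features. The scoring step is written as a separate "map" function because each chromosome is scored independently, which is the property that would let a MapReduce job distribute the work; here it simply runs in a local loop.

```python
import random


def knn_accuracy(X, y, mask, k=3):
    """Leave-one-out kNN accuracy using only features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # no features selected, nothing to classify with
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance to every other row, restricted to cols.
        dists = sorted(
            (sum((X[i][j] - X[n][j]) ** 2 for j in cols), y[n])
            for n in range(len(X)) if n != i
        )
        votes = [label for _, label in dists[:k]]
        if max(set(votes), key=votes.count) == y[i]:
            correct += 1
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Assumed fitness: accuracy minus a small cost per selected feature."""
    return knn_accuracy(X, y, mask) - penalty * sum(mask) / len(mask)


def map_fitness(X, y, population):
    """'Map' step: chromosomes are scored independently of each other."""
    return [(fitness(X, y, mask), mask) for mask in population]


def select_features(X, y, pop_size=12, generations=10,
                    mutation_rate=0.1, seed=0):
    """Run a small GA and return (best_fitness, best_mask)."""
    rng = random.Random(seed)
    n = len(X[0])
    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]
    best = (-1.0, None)
    for _ in range(generations):
        # 'Reduce' step: gather the scores and rank the chromosomes.
        scored = sorted(map_fitness(X, y, population),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best[0]:
            best = scored[0]
        parents = [mask for _, mask in scored[: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return best


def make_toy_data(n_rows=40, n_features=6, seed=1):
    """Synthetic data (not from the paper): only feature 0 carries the label."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_rows):
        label = i % 2
        row = [rng.random() for _ in range(n_features)]
        row[0] = label + rng.uniform(-0.2, 0.2)
        X.append(row)
        y.append(label)
    return X, y
```

Calling `select_features(*make_toy_data())` returns the best fitness found and the corresponding feature mask. In a distributed setting, `map_fitness` is the part that would be spread across workers, since the kNN evaluation dominates the cost and each chromosome needs only the shared dataset.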