A Hybrid Memetic Algorithm for Simultaneously Selecting Features and Instances in Big Industrial IoT Data for Predictive Maintenance
Yu-Lin Liang, Chih-Chi Kuo, Chun-Cheng Lin
2019 IEEE 17th International Conference on Industrial Informatics (INDIN), 2019-07-01
https://doi.org/10.1109/INDIN41052.2019.8972199
In Industry 4.0, various types of IoT sensors are installed on machines to collect data for predictive maintenance. As the volume of collected data grows, so does the number of missing values and noisy records. Related studies have proposed various methods to address these big-data problems. Most of them focus on either feature selection or instance selection as a preprocessing step before training forecasting models, and metaheuristic algorithms are among the mainstream methods for this preprocessing. However, these studies rarely consider feature and instance selection simultaneously, and they seldom address noisy data. This work therefore injects noisy data into UCI datasets to simulate real conditions. Memetic algorithms (MA) perform well on data selection for machine learning, and variable neighborhood search (VNS) is widely used to change the neighborhood systematically during local search. This work proposes a hybrid of MA and VNS that selects features and instances simultaneously, searching for a subset that maximizes classifier accuracy while retaining as little data as possible. Experimental results show that the proposed method efficiently reduces both the amount of data and the ratio of noisy data. Compared with other metaheuristic algorithms, the proposed method performs well because it balances exploration and exploitation.
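To make the objective concrete, the following is a minimal sketch, not the authors' implementation. It assumes a binary mask whose first bits switch features on or off and whose remaining bits switch training instances on or off, a 1-nearest-neighbour classifier as the accuracy estimator, and a weighted sum of accuracy and data reduction as the fitness. The weight `alpha`, the function names, and the bit-flip neighborhoods in `vns_refine` are all hypothetical choices for illustration; the population-based (crossover and mutation) part of a memetic algorithm is omitted.

```python
import random


def one_nn_accuracy(train_X, train_y, test_X, test_y, feat_idx):
    """Accuracy of a 1-nearest-neighbour classifier restricted to feat_idx."""
    if not train_X or not feat_idx:
        return 0.0  # nothing selected: treat as a useless subset
    correct = 0
    for x, y in zip(test_X, test_y):
        nearest = min(
            range(len(train_X)),
            key=lambda i: sum((train_X[i][f] - x[f]) ** 2 for f in feat_idx),
        )
        correct += train_y[nearest] == y
    return correct / len(test_X)


def fitness(mask, X, y, X_val, y_val, alpha=0.9):
    """Weighted sum of validation accuracy and fraction of data removed.

    Assumed encoding: mask[:n_feat] selects features,
    mask[n_feat:] selects training instances.
    """
    n_feat = len(X[0])
    feat_idx = [j for j in range(n_feat) if mask[j]]
    inst_idx = [i for i in range(len(X)) if mask[n_feat + i]]
    acc = one_nn_accuracy(
        [X[i] for i in inst_idx], [y[i] for i in inst_idx], X_val, y_val, feat_idx
    )
    reduction = 1.0 - sum(mask) / len(mask)
    return alpha * acc + (1.0 - alpha) * reduction


def vns_refine(mask, score_fn, k_max=3, iters=50, rng=None):
    """VNS-style local search: flip k random bits, widen k when stuck."""
    rng = rng or random.Random(0)
    best, best_score = list(mask), score_fn(mask)
    for _ in range(iters):
        k = 1
        while k <= k_max:
            cand = list(best)
            for pos in rng.sample(range(len(cand)), min(k, len(cand))):
                cand[pos] = 1 - cand[pos]
            score = score_fn(cand)
            if score > best_score:
                best, best_score = cand, score
                k = 1  # improvement: return to the smallest neighborhood
            else:
                k += 1  # no improvement: try a larger neighborhood
    return best, best_score
```

In a memetic setting, a routine like `vns_refine` would typically be applied to individuals produced by the evolutionary operators, with `score_fn` bound to `fitness` on a fixed training and validation split.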