A Hybrid Memetic Algorithm for Simultaneously Selecting Features and Instances in Big Industrial IoT Data for Predictive Maintenance

Yu-Lin Liang, Chih-Chi Kuo, Chun-Cheng Lin
Published in: 2019 IEEE 17th International Conference on Industrial Informatics (INDIN), 2019-07-01
DOI: 10.1109/INDIN41052.2019.8972199 (https://doi.org/10.1109/INDIN41052.2019.8972199)
Citations: 2

Abstract

In Industry 4.0, various types of IoT sensors are installed on machines to collect data for predictive maintenance. As the volume of collected data grows, so does the number of missing values and noisy records. Related studies have proposed various methods to address these problems in big data. Most of them focus on either feature selection or instance selection as a preprocessing step before training forecasting models, and metaheuristic algorithms are among the mainstream methods for this preprocessing. However, these studies rarely consider feature selection and instance selection simultaneously, and they seldom address noisy data. This work therefore combines UCI datasets with noisy data to simulate real conditions. Memetic algorithms (MA) perform well at data selection for machine learning, and variable neighborhood search (VNS) has been widely applied as a local search that systematically changes neighborhoods. This work proposes a hybrid of MA and VNS that searches for a subset maximizing classifier accuracy while retaining as little data as possible, selecting features and instances simultaneously. Experimental results show that the proposed method efficiently reduces both the amount of data and the ratio of noisy data. Compared with other metaheuristic algorithms, the proposed method performs well, striking a good balance between exploration and exploitation.
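The abstract does not give the encoding, fitness weights, classifier, or search operators, so the following is only a minimal sketch of how a hybrid MA and VNS for simultaneous feature and instance selection might be structured, not the authors' implementation. It assumes a binary chromosome (feature bits followed by instance bits), a 1-nearest-neighbour classifier, and a fitness that weights accuracy against the fraction of data removed. All function names, the weight `alpha`, and the operator settings are hypothetical.

```python
import random

def one_nn_accuracy(train, valid, feat_mask):
    """Accuracy of a 1-nearest-neighbour classifier using only selected features."""
    feats = [j for j, bit in enumerate(feat_mask) if bit]
    if not train or not feats:
        return 0.0  # nothing selected: treat as worst case
    correct = 0
    for x, y in valid:
        nearest = min(train, key=lambda t: sum((t[0][j] - x[j]) ** 2 for j in feats))
        correct += nearest[1] == y
    return correct / len(valid)

def fitness(chrom, data, valid, n_feat, alpha=0.9):
    """Assumed objective: alpha * accuracy + (1 - alpha) * reduction rate."""
    feat_mask, inst_mask = chrom[:n_feat], chrom[n_feat:]
    train = [row for row, bit in zip(data, inst_mask) if bit]
    acc = one_nn_accuracy(train, valid, feat_mask)
    reduction = 1 - sum(chrom) / len(chrom)
    return alpha * acc + (1 - alpha) * reduction

def vns(chrom, score_fn, rng, k_max=3, tries=5):
    """Basic VNS: flip k random bits; widen k on failure, reset to 1 on improvement."""
    best, best_score = chrom[:], score_fn(chrom)
    k = 1
    while k <= k_max:
        improved = False
        for _ in range(tries):
            cand = best[:]
            for i in rng.sample(range(len(cand)), k):
                cand[i] ^= 1
            s = score_fn(cand)
            if s > best_score:
                best, best_score, improved = cand, s, True
                break
        k = 1 if improved else k + 1
    return best, best_score

def memetic_select(data, valid, n_feat, pop_size=10, generations=15, seed=0):
    """Genetic search whose offspring are refined by VNS (the memetic step)."""
    rng = random.Random(seed)
    length = n_feat + len(data)
    score_fn = lambda c: fitness(c, data, valid, n_feat)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score_fn, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection (an assumption)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:  # occasional bit-flip mutation
                child[rng.randrange(length)] ^= 1
            child, _ = vns(child, score_fn, rng)
            children.append(child)
        pop = parents + children
    best = max(pop, key=score_fn)
    return best, score_fn(best)
```

Here `data` and `valid` are lists of `(feature_tuple, label)` pairs, with `valid` non-empty. The first `n_feat` bits of the returned chromosome mark the features kept and the remaining bits mark the training instances kept. The sketch does not model the noise injection described in the abstract.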