Missing IoT Data Prediction with Machine Learning Techniques

F. Azizoğlu, Emre Ünsal
Journal: El-Cezeri Fen ve Mühendislik Dergisi
Publication type: Journal Article
Published: 2022-09-07
DOI: https://doi.org/10.31202/ecjse.1135485
Citations: 1

Abstract

The amount of data generated by industrial applications based on the Internet of Things (IoT) grows every day. However, because of failures and communication disconnections in IoT devices, the acquired data may be noisy, inaccurate, and incomplete. These issues have become crucial for data production, quality, processing, and analysis. The datasets used in this study were collected in real time from the water neutralizer system of Sivas Numune Hospital, which converts medical waste into household waste. Before being discharged to the sewer, medical liquid waste in hospitals undergoes a chemical neutralization process in which neutralization devices change its pH. Monitoring pH levels in the medical waste neutralization system is therefore crucial for environmental protection. To this end, two datasets with different amounts of missing data were used to evaluate pH prediction with five machine learning algorithms: linear regression (LR), support vector machines (SVM), k-nearest neighbor (KNN), random forest (RF), and decision tree (DT). The algorithms were evaluated with the mean absolute error (MAE), mean squared error (MSE), and root mean square error (RMSE) performance metrics. The analysis found that the SVM algorithm performed better than the other algorithms on both datasets. The evaluation indicates that machine learning algorithms are remarkably effective at predicting missing pH data.
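The three error metrics named in the abstract can be illustrated with a short sketch. This is not the authors' pipeline: the abstract does not state which input features were used, so the example assumes, purely for illustration, that a missing pH reading is predicted from its position in time with a tiny k-nearest-neighbor regressor. The readings are synthetic values invented for this example, and the helper names (mae, mse, rmse, knn_predict) are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def knn_predict(train_x, train_y, query_x, k=2):
    """Predict a value as the mean of the k targets whose x is nearest to query_x."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query_x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Synthetic, illustrative readings (NOT data from the paper): index = time step.
times = [0, 1, 2, 3, 4, 5, 6, 7]
ph = [5.8, 6.4, 6.9, 7.3, 7.5, 7.6, 7.8, 7.9]

# Assumption for the demo: the readings at t=3 and t=5 were lost in transmission.
missing = [3, 5]
train_x = [t for t in times if t not in missing]
train_y = [ph[t] for t in train_x]

# Hold out the "missing" values as ground truth and predict them from the rest.
y_true = [ph[t] for t in missing]
y_pred = [knn_predict(train_x, train_y, t, k=2) for t in missing]

print("MAE :", round(mae(y_true, y_pred), 4))
print("MSE :", round(mse(y_true, y_pred), 4))
print("RMSE:", round(rmse(y_true, y_pred), 4))
```

MSE and RMSE penalize large individual errors more heavily than MAE does, which is one reason studies such as this one commonly report all three together.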