Towards IoT device privacy & data integrity through decentralized storage with blockchain and predicting malicious entities by stacked machine learning

IF 6 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Internet of Things | Pub Date: 2025-05-19 | DOI: 10.1016/j.iot.2025.101642

Zahoor Ali Khan, Nadeem Javaid, Arooba Saeed, Imran Ahmed, Farrukh Aslam Khan

Internet of Things, Volume 32, Article 101642. Journal Article. https://www.sciencedirect.com/science/article/pii/S2542660525001568

Citations: 0

Abstract

Blockchain technology offers significant advantages in securing Internet of Things (IoT) networks. However, IoT devices remain highly vulnerable to security and privacy threats, making them prime targets for malicious activities. This study addresses key challenges in IoT security, including ensuring device authenticity, preserving data integrity through decentralized storage, and enhancing the explainability of predictive models. To tackle these challenges, a novel approach integrating blockchain and machine learning (ML) is proposed. A stacking-based classification model is introduced to differentiate between legitimate and malicious IoT entities. At the base layer, the model uses the extra trees, multinomial Naive Bayes, and Bernoulli Naive Bayes classifiers, while the logistic regression with cross-validation classifier functions as the meta-model. The preprocessing pipeline includes data normalization and handling of missing values to improve model robustness. To further strengthen security, a local blockchain is implemented on an IoT device manager to register IoT requestors with unique addresses. The Keccak256 hashing algorithm converts these addresses into hashes, which are securely stored on the local blockchain. The actual data is managed using the InterPlanetary File System (IPFS), while block validation is performed using a proof-of-stake consensus mechanism. The proposed model classifies IoT devices with superior performance compared to baseline classifiers. Experimental results demonstrate the effectiveness of the stacking model, achieving notable improvements: a 6.90% increase in macro-recall, a 4.49% improvement in the Matthews correlation coefficient and Cohen’s kappa, a 3.33% enhancement in the macro-F1-score, and approximately a 1.02% gain in accuracy, micro-precision, micro-recall, and area under the receiver operating characteristic curve. Additionally, log loss and Hamming loss are reduced by 50%, indicating enhanced reliability and lower error rates. Results of the proposed stacking model are further assessed using the Friedman statistical test and 10-fold cross-validation. To ensure interpretability, Shapley additive explanations (SHAP) and local interpretable model-agnostic explanations (LIME) are employed, providing insights into model decisions. These findings underscore the effectiveness of the proposed approach in improving IoT security by combining blockchain for decentralized authentication and explainable ML for transparent decision-making.
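The stacking architecture described in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation: the abstract names only the base classifiers, the meta-model, and the two preprocessing steps, so the synthetic dataset, median imputation, min-max scaling, the `binarize` threshold, and every hyperparameter below are assumptions. The function names `build_stacking_model` and `demo` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import (
    cohen_kappa_score,
    hamming_loss,
    matthews_corrcoef,
    recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler


def build_stacking_model(random_state=0):
    """Preprocessing followed by a stacked classifier (all settings assumed)."""
    base_learners = [
        ("extra_trees", ExtraTreesClassifier(n_estimators=100, random_state=random_state)),
        # MultinomialNB requires non-negative features, hence the min-max scaling.
        ("multinomial_nb", MultinomialNB()),
        # Features lie in [0, 1] after scaling, so binarize at the midpoint.
        ("bernoulli_nb", BernoulliNB(binarize=0.5)),
    ]
    stack = StackingClassifier(
        estimators=base_learners,
        # Meta-model: logistic regression with built-in cross-validation.
        final_estimator=LogisticRegressionCV(cv=5, max_iter=1000),
        cv=5,  # out-of-fold base predictions are used to train the meta-model
    )
    return make_pipeline(
        SimpleImputer(strategy="median"),  # assumed missing-value strategy
        MinMaxScaler(clip=True),           # assumed normalization; clip keeps test data in [0, 1]
        stack,
    )


def demo(random_state=0):
    """Train on synthetic data (a stand-in for real IoT records) and score it."""
    X, y = make_classification(
        n_samples=600, n_features=12, n_informative=6, random_state=random_state
    )
    # Blank out about 2% of the entries so the imputer has something to do.
    rng = np.random.default_rng(random_state)
    X[rng.random(X.shape) < 0.02] = np.nan

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )
    model = build_stacking_model(random_state).fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "macro_recall": recall_score(y_test, y_pred, average="macro"),
        "mcc": matthews_corrcoef(y_test, y_pred),
        "cohen_kappa": cohen_kappa_score(y_test, y_pred),
        "hamming_loss": hamming_loss(y_test, y_pred),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

The scores printed here come from synthetic data and say nothing about the paper's reported results. To approximate the 10-fold protocol mentioned in the abstract, the pipeline returned by `build_stacking_model` could be passed to scikit-learn's `cross_val_score` with `cv=10`.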




Source journal: Internet of Things Multiple-

CiteScore: 3.60

Self-citation rate: 5.10%

Articles published: 115

Review time: 37 days

Journal description: Internet of Things; Engineering Cyber Physical Human Systems is a comprehensive journal encouraging cross collaboration between researchers, engineers and practitioners in the field of IoT & Cyber Physical Human Systems. The journal offers a unique platform to exchange scientific information on the entire breadth of technology, science, and societal applications of the IoT. The journal will place a high priority on timely publication, and provide a home for high quality. Furthermore, IOT is interested in publishing topical Special Issues on any aspect of IOT.