Title: Convergence of Blockchain, k-medoids and homomorphic encryption for privacy preserving biomedical data classification
Authors: Shamima Akter, Farhana Reza, Manik Ahmed
Journal: Internet of Things and Cyber-Physical Systems, volume 2, pages 99-110
Publication date: 2022-01-01
Publication type: Journal Article
DOI: 10.1016/j.iotcps.2022.05.006
URL: https://www.sciencedirect.com/science/article/pii/S2667345222000165
Citation count: 6
Abstract
Data privacy on the Internet of Medical Things (IoMT) remains a critical concern when handling biomedical data. While extant studies focus on cryptography and differential privacy, few of them address the utility and authenticity of data. As a result, data privacy remains the primary concern when training a machine learning (ML) model such as k-medoids with IoMT data from multiple data sources or owners. To overcome these issues, this study proposes a secure k-medoids scheme implemented together with Blockchain and a partially homomorphic cryptosystem (Paillier) to ensure authenticity and protect the data privacy of all entities (i.e., the data owner and the data analyst). The homomorphic property of Paillier is used to develop secure building blocks (i.e., secure polynomial operations, secure comparison, and secure biasing operations) that ensure data privacy and eliminate dependency on any third parties. We used three biomedical datasets: (I) Heart Disease Data (HDD), (II) Diabetes Data (DD), and (III) Breast Cancer Wisconsin Data (BCWD). Rigorous security analysis demonstrates that secure k-medoids protects against sensitive data breaches. It also showed superior performance on both the BCWD (Accuracy 97.80%, Precision 96.83%, and Recall 99.80%) and HDD (Accuracy 82.50%, Precision 81.28%, and Recall 80.50%) datasets. However, similar performance was not observed on the DD dataset, and the study explains why. In addition, the proposed system is shown to take less execution time than the approaches in extant studies.
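The abstract relies on the homomorphic property of Paillier without spelling it out. The sketch below is a minimal illustration of that property, not the authors' implementation: multiplying two ciphertexts yields an encryption of the sum of the plaintexts, and raising a ciphertext to a plaintext constant yields an encryption of the product. These two operations are what make encrypted polynomial evaluation possible. The function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `mul_plain`) and the tiny primes are assumptions chosen for readability; a real deployment would use a vetted library and primes of at least 1024 bits.

```python
import math
import secrets

def keygen(p=1789, q=1861):
    # Toy primes for illustration only (assumption); they offer no security.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # common simplified generator
    # With g = n + 1, L(g^lam mod n^2) = lam mod n, so mu is lam^{-1} mod n.
    mu = pow(lam, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m):
    n, g = pub
    n2 = n * n
    # Pick a random r in [1, n-1] that is coprime to n.
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    # L(u) = (u - 1) / n, then multiply by mu modulo n.
    return ((u - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    # E(m1) * E(m2) mod n^2 decrypts to (m1 + m2) mod n.
    n, _ = pub
    return (c1 * c2) % (n * n)

def mul_plain(pub, c, k):
    # E(m)^k mod n^2 decrypts to (k * m) mod n.
    n, _ = pub
    return pow(c, k, n * n)

if __name__ == "__main__":
    pub, priv = keygen()
    c1, c2 = encrypt(pub, 120), encrypt(pub, 85)
    print(decrypt(pub, priv, add_encrypted(pub, c1, c2)))  # sum of the plaintexts
    print(decrypt(pub, priv, mul_plain(pub, c1, 3)))       # plaintext scaled by 3
```

In a setting like the one the abstract describes, a data analyst could combine encrypted feature values this way without ever seeing them, while only the key holder can decrypt the result. Secure comparison needs an additional protocol on top of these primitives and is not shown here.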