MapReduce based big data framework using associative Kruskal poly Kernel classifier for diabetic disease prediction

Impact factor 1.6, JCR Q2 (Multidisciplinary Sciences)

MethodsX. Pub Date: 2025-02-05. DOI: 10.1016/j.mex.2025.103210

R. Ramani, S. Edwin Raja, D. Dhinakaran, S. Jagan, G. Prabaharan

MethodsX, Volume 14, Article 103210. https://www.sciencedirect.com/science/article/pii/S2215016125000573

Citation count: 0

Abstract

Machine learning (ML) algorithms are among the most prominent recent applications of artificial intelligence and have been used extensively for tasks such as pattern recognition, object classification, and disease prediction. ML techniques are also reasonable solutions for computation and modeling, especially when the data size is enormous, which is one reason the big data field has received considerable attention from both industry experts and academics. To realize the potential of ML in big data applications, the computation must be accelerated so that disease can be predicted early. This paper presents a method named "Associative Kruskal Wallis and MapReduce Poly Kernel (AKW-MRPK)" for early disease prediction. First, significant attributes are selected with the Associative Kruskal Wallis Feature Selection model. The polynomial kernel vector is then parallelized with MapReduce over the selected attributes, yielding a computing model that facilitates the early prognosis of disease. The proposed AKW-MRPK framework achieves up to 92% accuracy, reduces computational time to as low as 0.875 ms for 25 patients, and reports a speedup efficiency of 1.9 ms using two computational nodes, consistently outperforming supervised machine learning algorithms and Hadoop-based clusters on these metrics.
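The feature-selection step can be illustrated with a plain Kruskal-Wallis ranking. The sketch below is not the authors' Associative Kruskal Wallis model, whose details the abstract does not give; it only assumes that each attribute is scored by the standard Kruskal-Wallis H statistic (without tie correction) against the class label and that the top-scoring attributes are kept. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(values, labels):
    """Kruskal-Wallis H statistic (no tie correction) for one attribute."""
    n = len(values)
    ranks = average_ranks(values)
    rank_sums = defaultdict(float)
    counts = defaultdict(int)
    for rank, label in zip(ranks, labels):
        rank_sums[label] += rank
        counts[label] += 1
    total = sum(rank_sums[c] ** 2 / counts[c] for c in counts)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def select_features(rows, labels, k):
    """Return the indices of the k attributes with the largest H statistic."""
    n_features = len(rows[0])
    scores = [
        kruskal_wallis_h([row[f] for row in rows], labels)
        for f in range(n_features)
    ]
    return sorted(range(n_features), key=lambda f: scores[f], reverse=True)[:k]


# Hypothetical toy data: 6 patients, 3 attributes, binary outcome.
rows = [[1, 5, 2], [2, 1, 1], [3, 4, 4], [4, 2, 3], [5, 6, 6], [6, 3, 5]]
labels = [0, 0, 0, 1, 1, 1]
selected = select_features(rows, labels, k=2)
```

A larger H means the attribute's rank distribution differs more between the outcome classes, so attributes that separate the classes well are ranked first.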

• The AKW-MRPK method selects attributes and accelerates computations for predictions.
• Parallelizing polynomial kernels improves accuracy and speed in healthcare data analysis.
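To make the idea of parallelizing a polynomial kernel concrete, the following sketch splits a weighted kernel sum into a map step (one partial sum per data partition) and a reduce step (adding the partial sums). It is an illustration under assumptions, not the paper's implementation: the kernel form (x . z + coef0) ** degree, the per-point weights, and all function names are hypothetical, and an in-process `map` stands in for a real MapReduce cluster.

```python
from functools import reduce


def poly_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel K(x, z) = (x . z + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree


def split(items, n_parts):
    """Split a list into at most n_parts contiguous chunks."""
    size = max(1, -(-len(items) // n_parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_partial_score(partition, query):
    """Map step: weighted kernel sum over one partition of (point, weight)."""
    return sum(w * poly_kernel(x, query) for x, w in partition)


def decision_value(weighted_points, query, n_nodes=2):
    """Reduce step: add the partial sums produced by each partition."""
    partitions = split(weighted_points, n_nodes)
    partials = map(lambda part: map_partial_score(part, query), partitions)
    return reduce(lambda a, b: a + b, partials, 0.0)


# Hypothetical (point, weight) pairs and query record.
weighted_points = [
    ([1.0, 0.0], 0.5),
    ([0.0, 1.0], -0.5),
    ([1.0, 1.0], 1.0),
    ([2.0, 0.0], -1.0),
]
score = decision_value(weighted_points, [1.0, 2.0], n_nodes=2)
```

Because the weighted sum is associative, the result does not depend on how many partitions are used, which is what allows the kernel evaluations to be distributed across nodes.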


