An intelligent recommender system using machine learning association rules and rough set for disease prediction from incomplete symptom set

Kamakhya Narain Singh, Jibendu Kumar Mantri
Decision Analytics Journal, volume 11, Article 100468. Published 2024-04-20. DOI: 10.1016/j.dajour.2024.100468
https://www.sciencedirect.com/science/article/pii/S2772662224000729

Abstract

Digital devices are an integral component of the healthcare sector. With advances in Artificial Intelligence (AI) and Machine Learning (ML), building an automated diagnosis system with promising results is not a difficult task. This study aims to develop a recommender system (RS) for better diagnosis and improved patient care by hybridizing machine learning association rules (AR) and rough set theory (RST) to classify acute and life-threatening diseases. The data is first preprocessed using binary encoding, one-hot vectors, and min–max scaling to remove noise. RST is used for feature selection to deal with incompleteness, inconsistency, and vagueness. We have designed an Associated Symptom Selection (ASS) algorithm to extract mutually associated symptoms, which are then matched against the existing database for prediction. ASS is especially helpful in detecting neurodevelopmental diseases, because their symptoms are usually not detectable by standard tests and testing generally relies on observing behavioral expressions. The experiment is carried out using six popular ML classifiers, namely AR, Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (KNN), Linear Support Vector Machine (LSVM), and Naive Bayes (NB), on publicly available datasets. The classifiers are compared on accuracy, precision, recall, F1-score, and J-score. The experimental results show that AR performs better: on clinical data it reaches an accuracy of 94.40%, precision of 90.73%, recall of 94.45%, F1-score of 92.55%, and J-score of 95.14%, and on autism data it reaches 98.7% accuracy, 98% precision, 97.8% recall, 97.9% F1-score, and 97.12% J-score.
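To make the association-rule idea concrete, the following is a minimal sketch of how rules of the form "symptom set implies disease" could be mined by support and confidence and then matched against an incomplete symptom set. It is not the authors' ASS algorithm or their RST feature selection; the toy records, the function names `mine_rules` and `predict`, and the thresholds are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy records: (observed symptoms, diagnosed disease).
RECORDS = [
    ({"fever", "cough", "fatigue"}, "flu"),
    ({"fever", "cough"}, "flu"),
    ({"fever", "rash"}, "measles"),
    ({"rash", "fatigue"}, "measles"),
    ({"cough", "fatigue"}, "flu"),
]


def mine_rules(records, min_support=0.3, min_confidence=0.6, max_len=2):
    """Return rules as (antecedent, disease, support, confidence) tuples."""
    n = len(records)
    antecedent_count = {}  # how often each symptom subset occurs
    joint_count = {}       # how often a subset occurs together with a disease
    for symptoms, disease in records:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(symptoms), k):
                key = frozenset(combo)
                antecedent_count[key] = antecedent_count.get(key, 0) + 1
                pair = (key, disease)
                joint_count[pair] = joint_count.get(pair, 0) + 1

    rules = []
    for (antecedent, disease), joint in joint_count.items():
        support = joint / n
        confidence = joint / antecedent_count[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, disease, support, confidence))
    return rules


def predict(rules, observed):
    """Return the disease of the strongest rule whose antecedent is fully observed."""
    matching = [rule for rule in rules if rule[0] <= observed]
    if not matching:
        return None  # no rule applies to this (possibly incomplete) symptom set
    # Prefer higher confidence, then higher support, then longer antecedents.
    best = max(matching, key=lambda r: (r[3], r[2], len(r[0])))
    return best[1]


rules = mine_rules(RECORDS)
print(predict(rules, {"rash", "fever"}))
```

On this toy data the partial symptom set {rash, fever} resolves to measles, because the rule from rash alone has confidence 1.0 while the rule from fever alone has confidence 2/3. A symptom set that matches no rule returns `None` rather than a guess.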
