Fuzzy evaluation and explainable machine learning for diagnosis of rheumatic and autoimmune diseases
Mohammed Fadhil Mahdi, Arezoo Jahani, Dhafar Hamed Abd
PeerJ Computer Science, vol. 11, e3096. Published 2025-08-11. DOI: 10.7717/peerj-cs.3096 (https://doi.org/10.7717/peerj-cs.3096)
Abstract
This article proposes a new combination of explainable machine learning and a fuzzy evaluation framework to improve both the performance and the interpretability of diagnosis for rheumatic and autoimmune diseases. The work addresses three major challenges: (i) overlapping symptoms and complex clinical presentations, (ii) the lack of interpretability in traditional machine learning models, and (iii) the difficulty of selecting the best diagnostic model. To address these challenges, a new dataset was collected from hospitals and health centers in Iraq between 2019 and 2024. The dataset covers 12,085 patients and includes 14 features across seven classes (rheumatoid arthritis, reactive arthritis, ankylosing spondylitis, Sjogren syndrome, systemic lupus erythematosus, psoriatic arthritis, and normal). The dataset undergoes extensive preprocessing, including attribute imputation (mean and mode), encoding of categorical features, and class balancing, before it is passed to 12 different machine learning models. Performance is evaluated using precision, recall, F-score, kappa, Hamming loss, Matthews correlation coefficient, and accuracy. To select the optimal model, we apply the fuzzy decision by opinion score method (FDOSM). The FDOSM process incorporates assessments from three domain experts to ensure a robust and well-rounded evaluation. Furthermore, explainable artificial intelligence (XAI) techniques provide global and local explanations of model predictions. Local interpretable model-agnostic explanations (LIME) were used to generate the explanations and significantly increased the transparency and reliability of the clinical decision-making process. The results show that FDOSM ranks gradient boosting first, with a score of 0.1333, making it the best model, with an accuracy of 86.89%, precision of 87.35%, and kappa of 84.51%. Applying XAI to the best model increases confidence and trustworthiness in clinical decision-making and healthcare applications.
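Several of the evaluation metrics named in the abstract can be computed directly from the true and predicted labels. The following is a minimal pure-Python sketch, not the authors' code: the function name `evaluate`, the macro averaging of precision and recall, and the toy labels are all assumptions made for illustration (the abstract does not state the averaging scheme). The Matthews correlation coefficient and F-score are omitted for brevity.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Compute accuracy, Hamming loss, Cohen's kappa, and macro precision/recall.

    Hypothetical helper for single-label multiclass predictions.
    Macro averaging is an assumption, not something the abstract specifies.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    true_counts = Counter(y_true)   # how often each class really occurs
    pred_counts = Counter(y_pred)   # how often each class is predicted
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(hits.values()) / n

    # Agreement expected by chance, from the label marginals.
    expected = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    # Kappa is undefined when expected == 1; this sketch returns 0.0 in that case.
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    # Per-class scores, counting a class with no predictions (or no samples) as 0.
    precisions = [hits[c] / pred_counts[c] if pred_counts[c] else 0.0 for c in labels]
    recalls = [hits[c] / true_counts[c] if true_counts[c] else 0.0 for c in labels]

    return {
        "accuracy": accuracy,
        # For single-label classification, Hamming loss is the error rate.
        "hamming_loss": 1 - accuracy,
        "kappa": kappa,
        "macro_precision": sum(precisions) / len(labels),
        "macro_recall": sum(recalls) / len(labels),
    }


if __name__ == "__main__":
    # Toy labels for illustration only; they are not drawn from the paper's dataset.
    y_true = ["RA", "RA", "SLE", "SLE", "normal", "normal"]
    y_pred = ["RA", "SLE", "SLE", "SLE", "normal", "RA"]
    for name, value in evaluate(y_true, y_pred).items():
        print(f"{name}: {value:.4f}")
```

Kappa corrects accuracy for chance agreement, which is why it can sit below accuracy for the same model, as it does in the figures reported above (84.51% versus 86.89%).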
About the journal:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.