Title: Novel efficient feature selection: Classification of medical and immunotherapy treatments utilising Random Forest and Decision Trees
Author: Ahsanullah Yunas Mahmoud
Journal: Intelligence-based medicine, vol. 10, Article 100151
Publication date: 2024-01-01
DOI: 10.1016/j.ibmed.2024.100151
URL: https://www.sciencedirect.com/science/article/pii/S2666521224000188
Citation count: 0
Abstract
Immunotherapy is an important topic in healthcare because it affects how patients are treated for conditions such as breast cancer and diabetes. However, immunotherapy for warts is less well represented because of a lack of data. Machine learning is frequently utilised for treatment diagnosis by converting raw immunotherapy data into useful insights. Efficient classification of immunotherapy treatments is crucial for a productive diagnosis. This study considers immunotherapy from a data-driven, “less is more” perspective. Despite using only a portion of the available imbalanced and complex data, the diagnosis of immunotherapy treatment is made reasonably precise, as measured by accuracy, sensitivity, and specificity. The contribution of this study is focused on “less is more” feature selection, motivated by the principle that approximately 80 % of the effects or results of a system are caused by 20 % of the inputs. The features that contribute most to the classification of immunotherapy treatments are prioritised. This study proposes the implementation of Random Forest and Decision Trees for the classification of immunotherapy treatments. The relevant experimental medical data are explored as a case study. The experiments are conducted using the Weka and Python data analysis tools and comprise data preprocessing, class balancing, and feature selection. Random Forest performed better than Decision Trees. By applying Random Forest and utilising only one feature (time) as an input variable, a classification accuracy of 88.88 %, a sensitivity of 95.45 %, and a specificity of 60 % are attained. When Random Forest is implemented together with ordinary feature selection, using only 12.5 % of the dataset, the diagnosis of immunotherapy treatments becomes more efficient, and reasonable results are obtained despite using only a portion of the data features.
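The three evaluation measures named in the abstract (accuracy, sensitivity, and specificity) all derive from the counts of a binary confusion matrix. The Python sketch below shows how they are computed. The counts are hypothetical: the abstract does not report a confusion matrix, and the values here were chosen only because they are consistent with the reported percentages. The names `ConfusionCounts` and `example` are illustrative and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix."""
    tp: int  # actual positives predicted as positive
    fn: int  # actual positives predicted as negative
    tn: int  # actual negatives predicted as negative
    fp: int  # actual negatives predicted as positive


def accuracy(c: ConfusionCounts) -> float:
    """Share of all cases that were classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def sensitivity(c: ConfusionCounts) -> float:
    """True positive rate: share of actual positives that were detected."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """True negative rate: share of actual negatives that were detected."""
    return c.tn / (c.tn + c.fp)


# ASSUMPTION: hypothetical counts for a 27-case test set (22 positive,
# 5 negative). They are NOT reported in the abstract; they are merely one
# set of counts that matches the percentages it quotes.
example = ConfusionCounts(tp=21, fn=1, tn=3, fp=2)

if __name__ == "__main__":
    print(f"accuracy    = {accuracy(example):.2%}")
    print(f"sensitivity = {sensitivity(example):.2%}")
    print(f"specificity = {specificity(example):.2%}")
```

With these assumed counts, sensitivity is 21/22 (95.45 %), specificity is 3/5 (60 %), and accuracy is 24/27, which rounds to 88.89 % and truncates to the 88.88 % quoted in the abstract. The gap between sensitivity and specificity illustrates why accuracy alone can mislead on imbalanced data: a small negative class contributes little to accuracy even when many of its cases are misclassified.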