Exploratory study on Evolutionary Random Forests for Classification in Medical Datasets
Authors: Susanne Blotwijk, Camille Raets, Kurt Barbé
Published in: 2023 IEEE International Symposium on Medical Measurements and Applications (MeMeA), 2023-06-14
DOI: 10.1109/MeMeA57477.2023.10171908 (https://doi.org/10.1109/MeMeA57477.2023.10171908)
Citations: 0
Abstract
This paper presents an exploratory study on the efficacy of different machine learning algorithms for classification in medical datasets, with a particular focus on a recently published evolutionary random forest algorithm. The study is motivated by two trends: medical measurements are increasingly available from new sources such as wearables, and existing measurement techniques continue to improve. Together these have increased the number of variables that can be measured per patient. Meanwhile, recruiting patients and collecting data often remain costly and time-consuming, resulting in datasets with high dimensionality and low instance-to-feature ratios. The study aims to evaluate the performance of these machine learning algorithms and to investigate their sensitivity to varying sample sizes. Additionally, it examines whether an evolutionary random forest algorithm can improve performance and robustness in these datasets. The study was conducted on nine datasets to assess the extent to which the findings can be generalized. The results indicate that the evolutionary random forest generally outperforms the other classification algorithms, and that the performance gap often widens at lower instance-to-feature ratios. Future work may build on these findings to develop more sophisticated machine learning algorithms that are tailored to specific medical classification applications.
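To make the notion of an instance-to-feature ratio concrete, the following minimal Python sketch computes the ratio and shrinks a dataset by stratified subsampling. It is an illustration only: the abstract does not state how the authors varied sample size, so the subsampling scheme, the function names `instance_to_feature_ratio` and `stratified_subsample`, and the toy dataset (200 patients, 50 features, two classes) are all assumptions.

```python
import random
from collections import defaultdict


def instance_to_feature_ratio(n_instances, n_features):
    """Number of instances (e.g. patients) per measured feature."""
    return n_instances / n_features


def stratified_subsample(labels, n_keep, seed=0):
    """Return sorted indices of a subsample that roughly preserves class proportions.

    Hypothetical helper, not taken from the paper.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    total = len(labels)
    chosen = []
    for indices in by_class.values():
        # Keep at least one instance per class so every class stays represented.
        k = max(1, round(n_keep * len(indices) / total))
        chosen.extend(rng.sample(indices, min(k, len(indices))))
    return sorted(chosen)


# Toy dataset (assumed): 200 patients, 50 features, classes split 120 / 80.
labels = [0] * 120 + [1] * 80
n_features = 50
for n_keep in (200, 100, 50, 25):
    subset = stratified_subsample(labels, n_keep)
    ratio = instance_to_feature_ratio(len(subset), n_features)
    print(f"{len(subset)} instances -> instance-to-feature ratio {ratio:.2f}")
```

In a study of this kind, each subsample would then be passed to every classifier under comparison, so that performance can be plotted against the ratio. Because of per-class rounding, the subsample size can differ slightly from `n_keep` for class proportions that do not divide evenly.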