Analyzing the Impact of Resampling Approaches on Chest X-Ray Images for COVID-19 Identification in a Local Hierarchical Classification Scenario
F. K. H. D. Barros, André L. Jeller Selleti, Vinicius Queiroz, R. M. Pereira, C. Silla
2021 IEEE 21st International Conference on Bioinformatics and Bioengineering (BIBE), 2021-10-25
DOI: 10.1109/BIBE52308.2021.9635433
Abstract
Researchers working with real-world data, such as data from the healthcare domain, often face class imbalance. In particular, publicly available datasets of Chest X-Ray (CXR) images of Pneumonia diseases (including COVID-19) usually have an imbalanced class distribution. This imbalance causes automatic diagnosis systems to classify majority classes far more accurately than minority ones. Several resampling algorithms have been proposed to deal with class imbalance. Hierarchical classifiers have also been proposed to increase predictive performance, but little research has verified whether combining existing resampling algorithms with hierarchical classifiers is a good way to improve classification performance. This work proposes an experimental classification schema to investigate the effectiveness of resampling algorithms in identifying COVID-19 and other types of Pneumonia from CXR images. The proposed schema uses resampling algorithms to rebalance the class distribution in a Local Hierarchical Classification scenario. The experimental evaluation, supported by inferential statistical analysis, showed that using specific resampling algorithms with Local Hierarchical Classifiers yields a statistically significant increase in the macro-averaged F1-Score and improves predictive performance for the minority classes.
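To make the idea of rebalancing inside a local hierarchical setup concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical two-level label hierarchy (root splits into Normal and Pneumonia; Pneumonia splits into COVID-19 and Other pneumonia), one local training set per parent node, and plain random oversampling as a stand-in for whichever resampling algorithms the paper evaluates. The function names, the hierarchy, and the toy class counts are all illustrative assumptions. It also includes a small macro-averaged F1 helper, since that metric weights every class equally regardless of its size.

```python
import random
from collections import Counter, defaultdict

# Hypothetical label hierarchy (an assumption for illustration, not the paper's).
HIERARCHY = {
    "root": ["Normal", "Pneumonia"],
    "Pneumonia": ["COVID-19", "Other pneumonia"],
}
PARENT = {child: parent for parent, kids in HIERARCHY.items() for child in kids}


def local_training_sets(samples, leaf_labels):
    """Build one training set per parent node, labeled by that node's children."""
    sets = defaultdict(lambda: ([], []))
    for x, leaf in zip(samples, leaf_labels):
        child = leaf
        while child in PARENT:  # walk from the leaf up to the root
            xs, ys = sets[PARENT[child]]
            xs.append(x)
            ys.append(child)
            child = PARENT[child]
    return dict(sets)


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        out_x.extend(xs + extra)
        out_y.extend([y] * target)
    return out_x, out_y


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (F1 is 0 for a class with no hits)."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up class counts; strings stand in for CXR images.
labels = ["Normal"] * 10 + ["Other pneumonia"] * 6 + ["COVID-19"] * 2
images = [f"img_{i}" for i in range(len(labels))]

for node, (xs, ys) in local_training_sets(images, labels).items():
    _, balanced = random_oversample(xs, ys)
    print(node, dict(Counter(ys)), "->", dict(Counter(balanced)))
```

In this sketch each parent node is rebalanced independently, so the COVID-19 class is oversampled only against its sibling at the Pneumonia node rather than against the whole dataset. A classifier that always predicts the majority class can still reach high accuracy on imbalanced data, while `macro_f1` penalizes it heavily because the ignored minority class contributes an F1 of zero to the average.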