Khaled SH. Raslan, Almohammady S. Alsharkawy, K. R. Raslan
Published in International Journal of Advanced Computer Science and Applications, 2023. DOI: https://doi.org/10.14569/ijacsa.2023.0141047
HHO-SMOTe: Efficient Sampling Rate for Synthetic Minority Oversampling Technique Based on Harris Hawk Optimization
Classifying imbalanced datasets presents a significant challenge in the field of machine learning, especially with big data, where instances are unevenly distributed among classes, leading to class imbalance issues that affect classifier performance. Synthetic Minority Over-sampling Technique (SMOTE) is an effective oversampling method that addresses this by generating new instances for the under-represented minority class. However, SMOTE's efficiency relies on the sampling rate for minority class instances, making optimal sampling rates crucial for solving class imbalance. In this paper, we introduce HHO-SMOTe, a novel hybrid approach that combines the Harris Hawk optimization (HHO) search algorithm with SMOTE to enhance classification accuracy by determining optimal sample rates for each dataset. We conducted extensive experiments across diverse datasets to comprehensively evaluate our binary classification model. The results demonstrated our model's exceptional performance, with an AUC score exceeding 0.96, a high G-means score of 0.95 highlighting its robustness, and an outstanding F1-score consistently exceeding 0.99. These findings collectively establish our proposed approach as a formidable contender in the domain of binary classification models.
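To make the role of the sampling rate concrete, here is a minimal sketch of SMOTE-style interpolation together with the G-mean metric mentioned above. It is not the authors' implementation: the function names `smote` and `g_mean`, the integer `rate` parameter (synthetic samples per minority instance), and the neighbour count `k` are illustrative assumptions, and the Harris Hawk search itself is not shown.

```python
import math
import random


def smote(minority, rate, k=5, seed=0):
    """Generate synthetic minority points by neighbour interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    rate:     synthetic samples created per original minority sample
              (hypothetical stand-in for the sampling rate being tuned).
    k:        number of nearest minority neighbours to interpolate towards.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority neighbours of x (Euclidean), excluding x itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(rate):
            nb = rng.choice(neighbours)
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    new_points = smote(minority, rate=3, k=2)
    print(len(new_points), "synthetic samples")
    print("G-mean:", g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

Under these assumptions, an optimizer such as HHO would treat `rate` as the search variable: each candidate rate is used to oversample the training data, a classifier is trained on the result, and a score such as `g_mean` on held-out data serves as the fitness value.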
About the journal:
IJACSA is a scholarly computer science journal representing the best in research. Its mission is to provide an outlet for quality research to be publicised and published to a global audience. The journal aims to publish papers selected through rigorous double-blind peer review to ensure originality, timeliness, relevance, and readability. In sync with the Journal's vision "to be a respected publication that publishes peer reviewed research articles, as well as review and survey papers contributed by International community of Authors", we have drawn reviewers and editors from institutions and universities across the globe. A double-blind peer review process is conducted to ensure that we retain high standards. At IJACSA, we stand strong because we know that global challenges make way for new innovations, new ways and new talent. International Journal of Advanced Computer Science and Applications publishes carefully refereed research, review and survey papers which offer a significant contribution to the computer science literature, and which are of interest to a wide audience. Coverage extends to all mainstream branches of computer science and related applications.