Authors: H. Shamsudin, U. K. Yusof, Yan Haijie, I. Isa
Journal: Jurnal Teknologi-Sciences & Engineering
Published: 2023-06-25 (Journal Article)
DOI: https://doi.org/10.11113/jurnalteknologi.v85.19695
AN OPTIMIZED SUPPORT VECTOR MACHINE WITH GENETIC ALGORITHM FOR IMBALANCED DATA CLASSIFICATION
In supervised machine learning, class imbalance occurs when the number of examples representing one class is much lower than that of the other classes. Because imbalanced data may yield suboptimal classification models, minority examples are frequently misclassified and the best performance is hard to achieve. This study proposes an improved support vector machine (SVM) method for imbalanced data, named SVM-GA, which optimizes the SVM algorithm with a Genetic Algorithm (GA) over a synthetic minority oversampling technique. Besides considering the best sampling method for the optimized SVM, the experimental results show that the proposed method improves by 97% compared to the baseline model and selected optimized models. The proposed model outperformed the baseline model and other SVM-based models tuned with Grid search and Randomized search in most cases, especially on datasets with extremely rare cases.
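The abstract does not describe the GA encoding, operators, or settings, so the following is only a minimal sketch of how a genetic algorithm might search SVM hyperparameters. It assumes a real-valued chromosome (log10 C, log10 gamma), tournament selection, blend crossover, Gaussian mutation, and elitism; the bounds, rates, and the names `genetic_search` and `surrogate_fitness` are hypothetical. The surrogate objective stands in for what would, in a setup like the one described, presumably be a cross-validated score of an SVM trained on oversampled data.

```python
import random

# Hypothetical search bounds for log10(C) and log10(gamma); not taken from the paper.
BOUNDS = [(-2.0, 3.0), (-4.0, 1.0)]


def clip(value, low, high):
    return max(low, min(high, value))


def random_individual(rng):
    return [rng.uniform(low, high) for low, high in BOUNDS]


def tournament(population, scores, rng, k=3):
    """Return the fittest of k individuals drawn at random."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Blend crossover: each gene is a random convex mix of the parents' genes."""
    child = []
    for x, y in zip(a, b):
        w = rng.random()
        child.append(w * x + (1 - w) * y)
    return child


def mutate(individual, rng, rate=0.2, sigma=0.3):
    """Add Gaussian noise to each gene with probability `rate`, then clip to bounds."""
    out = []
    for gene, (low, high) in zip(individual, BOUNDS):
        if rng.random() < rate:
            gene += rng.gauss(0.0, sigma)
        out.append(clip(gene, low, high))
    return out


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    """Maximise fitness(log_c, log_gamma); returns (best_individual, best_score)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(*ind) for ind in population]
        elite = population[max(range(pop_size), key=lambda i: scores[i])]
        children = [elite]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a = tournament(population, scores, rng)
            b = tournament(population, scores, rng)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    scores = [fitness(*ind) for ind in population]
    best = max(range(pop_size), key=lambda i: scores[i])
    return population[best], scores[best]


def surrogate_fitness(log_c, log_gamma):
    """Stand-in objective (an assumption) peaking at C = 10, gamma = 0.01."""
    return -((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2)


if __name__ == "__main__":
    best, score = genetic_search(surrogate_fitness)
    print("C =", 10 ** best[0], "gamma =", 10 ** best[1], "score =", score)
```

Searching in log space is a common choice for C and gamma because useful values span several orders of magnitude. To apply the sketch to real data, `surrogate_fitness` would be replaced by a function that trains and scores a classifier for the given hyperparameters.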