LOF-enhanced SMOTE algorithm for imbalanced dataset
Authors: Zhuangzhuang Zhang, Jing Hu, Tiecheng Song
Published in: International Conference on Electronic Information Technology, vol. 12719, 2023-08-15
DOI: https://doi.org/10.1117/12.2685807
This paper proposes LOF-Enhanced SMOTE, a new algorithm for handling imbalanced datasets in machine learning tasks. Because some classes in an imbalanced dataset have far fewer samples than others, classifier performance can degrade. To address this, we build on the SMOTE algorithm by introducing the Local Outlier Factor (LOF) algorithm to remove boundary noise, and we use a Gaussian kernel function to account for the similarity of generated samples. We conduct experiments on UNSW-NB15, a real intrusion detection dataset. The results show that LOF-Enhanced SMOTE outperforms the SMOTE and Borderline-SMOTE algorithms overall, and significantly outperforms them in detecting certain minority classes. This indicates that LOF-Enhanced SMOTE can effectively address the classification problem on imbalanced datasets.
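The abstract does not give the algorithm's details, so the following is only a minimal, dependency-free sketch of one way the two ingredients could fit together, not the authors' implementation. It assumes that LOF is computed on the minority class alone, that samples whose LOF exceeds a threshold are dropped before oversampling, and that the Gaussian kernel weights which neighbour is chosen for SMOTE interpolation. The names `lof_scores`, `lof_smote`, `lof_threshold`, and `sigma`, along with their default values, are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(points, i, k):
    """k nearest neighbours of points[i] as sorted (distance, index) pairs."""
    others = [(dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    return sorted(others)[:k]


def lof_scores(points, k):
    """Standard Local Outlier Factor of every point (about 1 for inliers)."""
    neigh = [knn(points, i, k) for i in range(len(points))]
    k_dist = [n[-1][0] for n in neigh]  # distance to the k-th neighbour
    lrd = []  # local reachability density of each point
    for n in neigh:
        # reach-dist(p, o) = max(k-distance(o), d(p, o))
        reach = [max(k_dist[j], d) for d, j in n]
        lrd.append(len(reach) / max(sum(reach), 1e-12))
    # LOF(p) = mean of lrd(o) / lrd(p) over the neighbours o of p
    return [sum(lrd[j] for _, j in n) / (len(n) * lrd[i])
            for i, n in enumerate(neigh)]


def lof_smote(minority, n_new, k=3, lof_threshold=1.5, sigma=1.0, seed=0):
    """Assumed pipeline: LOF filtering, then kernel-weighted SMOTE."""
    rng = random.Random(seed)
    scores = lof_scores(minority, k)
    # Assumption: treat minority samples with a high LOF as noise.
    kept = [p for p, s in zip(minority, scores) if s <= lof_threshold]
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(kept))
        neigh = knn(kept, i, k)
        # Assumption: closer neighbours are more similar, so they get a
        # larger Gaussian weight; the tiny floor avoids an all-zero total.
        weights = [math.exp(-d * d / (2 * sigma ** 2)) + 1e-12
                   for d, _ in neigh]
        _, j = rng.choices(neigh, weights=weights, k=1)[0]
        gap = rng.random()  # SMOTE interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a)
                               for a, b in zip(kept[i], kept[j])))
    return kept, synthetic


# Toy minority class: a tight cluster plus one far-away point.
minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
            (0.5, 0.5), (10.0, 10.0)]
kept, synthetic = lof_smote(minority, n_new=4)
print(len(kept), "samples kept;", len(synthetic), "synthetic samples")
```

In this toy data the point (10, 10) receives a LOF far above the threshold and is filtered out, so every synthetic sample is interpolated between points of the tight cluster. For real data such as UNSW-NB15, the threshold, `k`, and `sigma` would need tuning, and a library implementation of LOF would likely be preferable to this quadratic-time version.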