Ganglong Duan, Jianjun Liu, Weiwei Kong, B. Cui, Jiahao Li
Symposium on Advances in Electrical, Electronics and Computer Engineering, published 2023-05-31
DOI: 10.1117/12.2680157 (https://doi.org/10.1117/12.2680157)
Research on click fraud prediction based on multi-algorithm fusion
Detecting click fraud, in which clicks on online advertisements are faked to extract advertising fees, is an important application of machine learning. In this paper, using data on 400,000 ad click fraud cases, we apply recursive feature elimination to select the predictors and train five single classifiers: gradient boosted decision tree (GBDT), random forest (RF), AdaBoost, KNN and LightGBM (LGBMClassifier). We compare the prediction performance of the classifiers and fuse the three best-performing ones into a multi-algorithm prediction model. The experimental results show that the random forest, LGBMClassifier and AdaBoost algorithms have the highest prediction accuracy, 87%, 83% and 79% respectively, with AUC values of 0.90, 0.87 and 0.81. The multi-algorithm fusion model reaches a prediction accuracy of 90%, 3 percentage points higher than the best single algorithm.
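The abstract does not say how the three classifiers are fused, so the following is only a minimal sketch under one assumed scheme: soft voting, where the predicted fraud probabilities of the models are averaged per sample. The function names (`soft_vote`, `accuracy`, `auc`) and the toy probabilities are hypothetical and chosen for illustration; they are not the paper's data or results.

```python
from typing import List, Optional, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]],
              weights: Optional[Sequence[float]] = None) -> List[float]:
    """Fuse models by a (weighted) average of their fraud probabilities.

    Soft voting is an assumption here; the paper does not name its fusion rule.
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]


def accuracy(y_true: Sequence[int], probs: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of samples whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(t == p) for t, p in zip(y_true, preds)) / len(y_true)


def auc(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the sample count, so this is
    only suitable for small illustrative inputs.
    """
    pos = [p for y, p in zip(y_true, probs) if y == 1]
    neg = [p for y, p in zip(y_true, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = fraudulent click) and made-up per-model probabilities.
y = [1, 0, 1, 0, 1, 0]
rf_probs = [0.9, 0.2, 0.4, 0.6, 0.8, 0.1]      # stands in for random forest
lgbm_probs = [0.7, 0.3, 0.6, 0.4, 0.3, 0.2]    # stands in for LGBMClassifier
ada_probs = [0.6, 0.55, 0.7, 0.3, 0.6, 0.45]   # stands in for AdaBoost

fused_probs = soft_vote([rf_probs, lgbm_probs, ada_probs])

for name, probs in [("RF", rf_probs), ("LGBM", lgbm_probs),
                    ("AdaBoost", ada_probs), ("fused", fused_probs)]:
    print(name, round(accuracy(y, probs), 3), round(auc(y, probs), 3))
```

In this contrived example each model errs on different samples, so the averaged probabilities classify more samples correctly than any single model. That is the usual intuition for why fusion can help, but it says nothing about the size of the gain on real data.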