Active learning approach for inappropriate information classification in social networks
Authors: D. Levshun, O. Tushkanova, A. Chechulin
Published in: 2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP), 2022-03-01
DOI: 10.1109/pdp55904.2022.00050 (https://doi.org/10.1109/pdp55904.2022.00050)
This paper describes an original approach to classification with active learning for detecting inappropriate information, and its application to text posts from the VKontakte social network. The novelty of the approach lies in its constantly growing dataset: the classifiers are trained while the operator works. The approach handles texts of any size and content and is applicable to Russian social networks. The research contribution is the original approach to inappropriate information detection; its practical significance is the automation of routine tasks, which reduces the burden on specialists in the area of protection from information. The experimental evaluation focuses on the iterative retraining part of the approach. For the experiment, text posts on different topics were collected from the VKontakte social network and labeled. We then evaluated the F-measure and ROC-AUC metrics for classifiers trained on random subsamples of different sizes and different topics. The advantages and disadvantages of the approach, as well as directions for future work, are also discussed.
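The evaluation described above (training classifiers on random subsamples of different sizes and scoring them with the F-measure) can be pictured with a minimal sketch. The abstract does not name the classifiers, features, or sampling details, so everything here is an assumption for illustration: the toy `NaiveBayes` class stands in for whatever classifiers the authors used, and `f_measure` and `retraining_curve` are hypothetical helper names.

```python
import math
import random
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; the paper's preprocessing is not stated.
    return text.lower().split()


class NaiveBayes:
    """Toy multinomial Naive Bayes for binary labels (1 = inappropriate, 0 = acceptable)."""

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def predict(self, text):
        total_docs = sum(self.docs.values())
        scores = {}
        for label in (0, 1):
            total_tokens = sum(self.counts[label].values())
            # Laplace smoothing on the class prior and on the word counts,
            # so a class missing from a small subsample never divides by zero.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for tok in tokenize(text):
                score += math.log(
                    (self.counts[label][tok] + 1)
                    / (total_tokens + len(self.vocab) + 1)
                )
            scores[label] = score
        return 1 if scores[1] > scores[0] else 0


def f_measure(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def retraining_curve(pool, test, sizes, seed=0):
    """Train on a random subsample of each size and report the F-measure on `test`.

    `pool` and `test` are lists of (text, label) pairs.
    """
    rng = random.Random(seed)
    results = {}
    for size in sizes:
        sample = rng.sample(pool, size)
        model = NaiveBayes().fit([t for t, _ in sample], [l for _, l in sample])
        preds = [model.predict(t) for t, _ in test]
        results[size] = f_measure([l for _, l in test], preds)
    return results
```

Calling `retraining_curve(pool, test, sizes=[100, 500, 1000])` on a labeled pool would give one F-measure per subsample size, which is one way to see how much labeled data the retraining step needs before quality levels off. ROC-AUC, the second metric mentioned in the abstract, would need predicted scores rather than hard labels and is left out of this sketch.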