Coarse-to-fine label propagation with hybrid representation for deep semi-supervised bot detection

IF 2.1 | Region 4, Computer Science | JCR Q3, COMPUTER SCIENCE, INFORMATION SYSTEMS

Wireless Networks | Pub Date: 2024-08-14 | DOI: 10.1007/s11276-024-03821-2

Huailiang Peng, Yujun Zhang, Xu Bai, Qiong Dai


Citations: 0

Abstract

Social bot detection is crucial for ensuring the active participation of digital twins and edge intelligence in future social media platforms. Nevertheless, the performance of existing detection methods is impeded by the limited availability of labeled accounts. Despite the notable progress made in some fields by deep semi-supervised learning with label propagation, which utilizes unlabeled data to enhance method performance, its effectiveness is significantly hindered in social bot detection due to the misdistribution of individuation users (MIU). To address these challenges, we propose a novel deep semi-supervised bot detection method, which adopts a coarse-to-fine label propagation (LP-CF) with the hybridized representation models over multi-relational graphs (HR-MRG) to enhance the accuracy of label propagation, thereby improving the effectiveness of unlabeled data in supporting the detection task. Specifically, considering the potential confusion among accounts in the MIU phenomenon, we utilize HR-MRG to obtain high-quality user representations. Subsequently, we introduce a sample selection strategy to partition unlabeled samples into two subsets and apply LP-CF to generate pseudo labels for each subset. Finally, the predicted pseudo labels of unlabeled samples, combined with labeled samples, are used to fine-tune the detection models. Comprehensive experiments on two widely used real datasets demonstrate that our method outperforms other semi-supervised approaches and achieves comparable performance to the fully supervised social bot detection method.
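The abstract gives no algorithmic detail for LP-CF, so the following is only a minimal sketch of the general idea it describes: propagate labels over a graph of user representations, split the unlabeled samples into two subsets, and produce pseudo labels for both. Everything specific here is an assumption rather than the authors' method: plain feature vectors stand in for HR-MRG embeddings, the graph is a cosine kNN graph, propagation uses the classic closed form F = (I - αS)^-1 Y, the split is a confidence threshold, and the names `knn_affinity`, `propagate`, `coarse_to_fine_pseudo_labels`, `k`, `alpha`, and `tau` are hypothetical.

```python
import numpy as np

def knn_affinity(X, k=10):
    """Symmetric kNN graph weighted by non-negative cosine similarity (assumed graph)."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)            # exclude self-matches from the kNN search
    nbrs = np.argsort(-S, axis=1)[:, :k]    # k most similar nodes per row
    rows = np.repeat(np.arange(len(X)), k)
    cols = nbrs.ravel()
    W = np.zeros_like(S)
    W[rows, cols] = np.maximum(S[rows, cols], 0.0)
    return np.maximum(W, W.T)               # symmetrize

def propagate(W, Y0, alpha=0.9):
    """Closed-form propagation F = (I - alpha*S)^-1 Y0, with S = D^-1/2 W D^-1/2."""
    d = W.sum(axis=1)
    d[d == 0] = 1.0                         # guard isolated nodes
    inv_sqrt = 1.0 / np.sqrt(d)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]
    return np.linalg.solve(np.eye(len(W)) - alpha * S, Y0)

def coarse_to_fine_pseudo_labels(X, y, labeled, n_classes=2, k=10, tau=0.8):
    """Two-pass pseudo labeling; the threshold split is an assumption, not the paper's rule."""
    labeled_idx = np.flatnonzero(labeled)
    W = knn_affinity(X, k)
    Y0 = np.zeros((len(X), n_classes))
    Y0[labeled_idx, y[labeled_idx]] = 1.0   # one-hot seeds from the labeled accounts

    # Coarse pass: propagate from labeled seeds only and score each node's confidence.
    F = propagate(W, Y0)
    total = F.sum(axis=1)
    conf = np.divide(F.max(axis=1), total, out=np.zeros_like(total), where=total > 0)
    coarse = F.argmax(axis=1)
    confident = (~labeled) & (conf >= tau)  # subset 1: confident unlabeled samples

    # Fine pass: add confident pseudo labels as extra seeds, then label the rest.
    Y1 = Y0.copy()
    conf_idx = np.flatnonzero(confident)
    Y1[conf_idx, coarse[conf_idx]] = 1.0
    pseudo = propagate(W, Y1).argmax(axis=1)
    pseudo[confident] = coarse[confident]   # keep coarse labels for the confident subset
    pseudo[labeled_idx] = y[labeled_idx]    # never overwrite ground-truth labels
    return pseudo, confident
```

In a pipeline like the one the abstract outlines, the returned `pseudo` labels would be combined with the labeled samples to fine-tune a detector; that training step is omitted here. The sketch assumes `k` is smaller than the number of samples and that no feature vector is all zeros.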




Source journal

Wireless Networks (Engineering & Technology: Telecommunications)

CiteScore: 7.70

Self-citation rate: 3.30%

Articles published: 314

Review time: 5.5 months

About the journal: The wireless communication revolution is bringing fundamental changes to data networking and telecommunication, and is making integrated networks a reality. By freeing the user from the cord, personal communications networks, wireless LANs, mobile radio networks, and cellular systems harbor the promise of fully distributed mobile computing and communications, any time, anywhere. Focusing on the networking and user aspects of the field, Wireless Networks provides a global forum for archival-value contributions documenting these fast-growing areas of interest. The journal publishes refereed articles dealing with research, experience, and management issues of wireless networks. Its aim is to allow the reader to benefit from the experience, problems, and solutions described.