Wacml: based on graph neural network for imbalanced node classification algorithm

IF 3.1 3区计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Multimedia Systems Pub Date : 2024-08-30 DOI:10.1007/s00530-024-01454-1

Junfeng Wang, Jiayue Yang, Lidun

{"title":"Wacml: based on graph neural network for imbalanced node classification algorithm","authors":"Junfeng Wang, Jiayue Yang, Lidun","doi":"10.1007/s00530-024-01454-1","DOIUrl":null,"url":null,"abstract":"<p>The presence of a large number of robot accounts on social media has led to negative social impacts. In most cases, the distribution of robot accounts and real human accounts is imbalanced, resulting in insufficient representativeness and poor performance of a few types of samples. Graph neural networks can effectively utilize user interaction and are widely used to process graph structure data, achieving good performance in robot detection. However, previous robot detection methods based on GNN mostly considered the impact of class imbalance. However, in graph-structured data, the imbalance caused by differences in the position and structure of labeled nodes makes the processing results of GNN prone to bias toward larger categories. Due to the lack of consideration for the unique connectivity issues of the graph structure, the classification performance of nodes is not ideal. Therefore, in response to the shortcomings of existing schemes, this paper proposes a class imbalanced node classification algorithm based on minority weighting and abnormal connectivity margin loss, which extends the traditional imbalanced classification idea in the field of machine learning to graph-structured data and jointly handles the problem of quantity imbalance and graph-structured abnormal connectivity to improve GNN’s perception of connection anomalies. In the node feature aggregation stage, weighted aggregation is applied to minority classes. In the oversampling stage, the SMOTE algorithm is used to process imbalanced data, while considering node representation and topology structure. Simultaneously training an edge generator to model relationship information, combined with abnormal connectivity margin loss, to enhance the model’s learning of connectivity information, greatly improving the quality of the edge generator. Finally, we evaluated a publicly available dataset, and the experimental results showed that it achieved good results in classifying imbalanced nodes.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":"128 1","pages":""},"PeriodicalIF":3.1000,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Multimedia Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00530-024-01454-1","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

The presence of a large number of robot accounts on social media has led to negative social impacts. In most cases, the distribution of robot accounts and real human accounts is imbalanced, resulting in insufficient representativeness and poor performance of a few types of samples. Graph neural networks can effectively utilize user interaction and are widely used to process graph structure data, achieving good performance in robot detection. However, previous robot detection methods based on GNN mostly considered the impact of class imbalance. However, in graph-structured data, the imbalance caused by differences in the position and structure of labeled nodes makes the processing results of GNN prone to bias toward larger categories. Due to the lack of consideration for the unique connectivity issues of the graph structure, the classification performance of nodes is not ideal. Therefore, in response to the shortcomings of existing schemes, this paper proposes a class imbalanced node classification algorithm based on minority weighting and abnormal connectivity margin loss, which extends the traditional imbalanced classification idea in the field of machine learning to graph-structured data and jointly handles the problem of quantity imbalance and graph-structured abnormal connectivity to improve GNN’s perception of connection anomalies. In the node feature aggregation stage, weighted aggregation is applied to minority classes. In the oversampling stage, the SMOTE algorithm is used to process imbalanced data, while considering node representation and topology structure. Simultaneously training an edge generator to model relationship information, combined with abnormal connectivity margin loss, to enhance the model’s learning of connectivity information, greatly improving the quality of the edge generator. Finally, we evaluated a publicly available dataset, and the experimental results showed that it achieved good results in classifying imbalanced nodes.

Abstract Image

查看原文本刊更多论文

Wacml：基于图神经网络的不平衡节点分类算法

社交媒体上大量机器人账户的出现造成了负面的社会影响。在大多数情况下，机器人账号和真人账号的分布不平衡，导致代表性不足，少数类型样本的性能较差。图神经网络能有效利用用户互动，被广泛用于处理图结构数据，在机器人检测中取得了良好的性能。然而，以往基于图神经网络的机器人检测方法大多考虑了类不平衡的影响。然而，在图结构数据中，由于标记节点的位置和结构不同而导致的不平衡，使得 GNN 的处理结果容易偏向较大的类别。由于没有考虑图结构特有的连接性问题，节点的分类性能并不理想。因此，针对现有方案的不足，本文提出了一种基于少数加权和异常连通性边际损失的类不平衡节点分类算法，将机器学习领域传统的不平衡分类思想扩展到图结构数据，共同处理数量不平衡和图结构异常连通性问题，提高 GNN 对连接异常的感知能力。在节点特征聚合阶段，对少数类进行加权聚合。在超采样阶段，使用 SMOTE 算法处理不平衡数据，同时考虑节点表示和拓扑结构。同时训练边缘生成器对关系信息进行建模，结合异常连通性边际损失，加强模型对连通性信息的学习，大大提高了边缘生成器的质量。最后，我们对一个公开的数据集进行了评估，实验结果表明它在分类不平衡节点方面取得了良好的效果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Multimedia Systems 工程技术-计算机：理论方法

CiteScore

5.40

自引率

7.70%

发文量

148

审稿时长

4.5 months

期刊介绍： This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.