Wacml: based on graph neural network for imbalanced node classification algorithm

IF 4.3 3区 材料科学 Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Junfeng Wang, Jiayue Yang, Lidun
{"title":"Wacml: based on graph neural network for imbalanced node classification algorithm","authors":"Junfeng Wang, Jiayue Yang, Lidun","doi":"10.1007/s00530-024-01454-1","DOIUrl":null,"url":null,"abstract":"<p>The presence of a large number of robot accounts on social media has led to negative social impacts. In most cases, the distribution of robot accounts and real human accounts is imbalanced, resulting in insufficient representativeness and poor performance of a few types of samples. Graph neural networks can effectively utilize user interaction and are widely used to process graph structure data, achieving good performance in robot detection. However, previous robot detection methods based on GNN mostly considered the impact of class imbalance. However, in graph-structured data, the imbalance caused by differences in the position and structure of labeled nodes makes the processing results of GNN prone to bias toward larger categories. Due to the lack of consideration for the unique connectivity issues of the graph structure, the classification performance of nodes is not ideal. Therefore, in response to the shortcomings of existing schemes, this paper proposes a class imbalanced node classification algorithm based on minority weighting and abnormal connectivity margin loss, which extends the traditional imbalanced classification idea in the field of machine learning to graph-structured data and jointly handles the problem of quantity imbalance and graph-structured abnormal connectivity to improve GNN’s perception of connection anomalies. In the node feature aggregation stage, weighted aggregation is applied to minority classes. In the oversampling stage, the SMOTE algorithm is used to process imbalanced data, while considering node representation and topology structure. Simultaneously training an edge generator to model relationship information, combined with abnormal connectivity margin loss, to enhance the model’s learning of connectivity information, greatly improving the quality of the edge generator. Finally, we evaluated a publicly available dataset, and the experimental results showed that it achieved good results in classifying imbalanced nodes.</p>","PeriodicalId":3,"journal":{"name":"ACS Applied Electronic Materials","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Electronic Materials","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00530-024-01454-1","RegionNum":3,"RegionCategory":"材料科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

The presence of a large number of robot accounts on social media has led to negative social impacts. In most cases, the distribution of robot accounts and real human accounts is imbalanced, resulting in insufficient representativeness and poor performance of a few types of samples. Graph neural networks can effectively utilize user interaction and are widely used to process graph structure data, achieving good performance in robot detection. However, previous robot detection methods based on GNN mostly considered the impact of class imbalance. However, in graph-structured data, the imbalance caused by differences in the position and structure of labeled nodes makes the processing results of GNN prone to bias toward larger categories. Due to the lack of consideration for the unique connectivity issues of the graph structure, the classification performance of nodes is not ideal. Therefore, in response to the shortcomings of existing schemes, this paper proposes a class imbalanced node classification algorithm based on minority weighting and abnormal connectivity margin loss, which extends the traditional imbalanced classification idea in the field of machine learning to graph-structured data and jointly handles the problem of quantity imbalance and graph-structured abnormal connectivity to improve GNN’s perception of connection anomalies. In the node feature aggregation stage, weighted aggregation is applied to minority classes. In the oversampling stage, the SMOTE algorithm is used to process imbalanced data, while considering node representation and topology structure. Simultaneously training an edge generator to model relationship information, combined with abnormal connectivity margin loss, to enhance the model’s learning of connectivity information, greatly improving the quality of the edge generator. Finally, we evaluated a publicly available dataset, and the experimental results showed that it achieved good results in classifying imbalanced nodes.

Abstract Image

Wacml:基于图神经网络的不平衡节点分类算法
社交媒体上大量机器人账户的出现造成了负面的社会影响。在大多数情况下,机器人账号和真人账号的分布不平衡,导致代表性不足,少数类型样本的性能较差。图神经网络能有效利用用户互动,被广泛用于处理图结构数据,在机器人检测中取得了良好的性能。然而,以往基于图神经网络的机器人检测方法大多考虑了类不平衡的影响。然而,在图结构数据中,由于标记节点的位置和结构不同而导致的不平衡,使得 GNN 的处理结果容易偏向较大的类别。由于没有考虑图结构特有的连接性问题,节点的分类性能并不理想。因此,针对现有方案的不足,本文提出了一种基于少数加权和异常连通性边际损失的类不平衡节点分类算法,将机器学习领域传统的不平衡分类思想扩展到图结构数据,共同处理数量不平衡和图结构异常连通性问题,提高 GNN 对连接异常的感知能力。在节点特征聚合阶段,对少数类进行加权聚合。在超采样阶段,使用 SMOTE 算法处理不平衡数据,同时考虑节点表示和拓扑结构。同时训练边缘生成器对关系信息进行建模,结合异常连通性边际损失,加强模型对连通性信息的学习,大大提高了边缘生成器的质量。最后,我们对一个公开的数据集进行了评估,实验结果表明它在分类不平衡节点方面取得了良好的效果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
CiteScore
7.20
自引率
4.30%
发文量
567
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信