Wacml: based on graph neural network for imbalanced node classification algorithm
Authors: Junfeng Wang, Jiayue Yang, Lidun
DOI: 10.1007/s00530-024-01454-1 (https://doi.org/10.1007/s00530-024-01454-1)
Published: 2024-08-30
Abstract
The large number of robot accounts on social media has had negative social impacts. In most cases, the distribution of robot accounts and real human accounts is imbalanced, so the minority class is under-represented and classified poorly. Graph neural networks (GNNs) can make effective use of user interactions and are widely used to process graph-structured data, achieving good performance in robot detection. However, previous GNN-based robot detection methods have mostly considered only the imbalance in the number of samples per class. In graph-structured data, imbalance also arises from differences in the position and structure of labeled nodes, which biases GNN predictions toward the larger classes. Because the connectivity issues specific to graph structure are not taken into account, node classification performance is not ideal. To address these shortcomings, this paper proposes a class-imbalanced node classification algorithm based on minority weighting and an abnormal connectivity margin loss. The algorithm extends the traditional imbalanced classification idea from machine learning to graph-structured data and jointly handles quantity imbalance and abnormal graph connectivity, improving the GNN's perception of connection anomalies. In the node feature aggregation stage, weighted aggregation is applied to minority classes. In the oversampling stage, the SMOTE algorithm is used to process the imbalanced data while taking both node representation and topology into account. An edge generator is trained simultaneously to model relational information; combined with the abnormal connectivity margin loss, this strengthens the model's learning of connectivity information and greatly improves the quality of the edge generator. Finally, we evaluate the method on a publicly available dataset, and the experimental results show that it performs well at classifying imbalanced nodes.
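The oversampling step mentioned in the abstract can be illustrated with a minimal sketch of standard SMOTE interpolation applied to node embeddings. The abstract does not give the paper's exact procedure, so this shows only the generic technique: the function name `smote_nodes`, its parameters, and the use of a single nearest neighbour are assumptions made for illustration. Connecting the synthetic nodes to the graph (the role of the edge generator), the minority-weighted aggregation, and the margin loss are not shown.

```python
import numpy as np


def smote_nodes(emb, labels, minority, n_new, seed=0):
    """Synthesize n_new minority-class embeddings by SMOTE interpolation.

    Hypothetical helper for illustration; not the paper's implementation.
    emb:    (num_nodes, dim) array of node embeddings
    labels: (num_nodes,) array of class labels
    Returns the synthetic embeddings and the index of each source node.
    """
    rng = np.random.default_rng(seed)
    idx = np.flatnonzero(labels == minority)
    if idx.size < 2:
        raise ValueError("need at least two minority nodes to interpolate")
    pts = emb[idx]

    # Pairwise Euclidean distances among minority nodes only.
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # a node is not its own neighbour
    nearest = dist.argmin(axis=1)   # nearest same-class neighbour per node

    # Pick source nodes at random, then move a random fraction of the way
    # toward each source node's nearest minority neighbour.
    src = rng.integers(0, idx.size, size=n_new)
    delta = rng.random((n_new, 1))
    synthetic = pts[src] + delta * (pts[nearest[src]] - pts[src])
    return synthetic, idx[src]
```

Each synthetic embedding is a convex combination of two existing minority embeddings, so it stays inside the region the minority class already occupies. In a graph setting the new nodes would still need edges, which is where a learned edge generator such as the one described in the abstract would come in.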