A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and Future Directions

IF 10.4 | Zone 2, Computer Science | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering | Pub Date: 2025-03-12 | DOI: 10.1109/TKDE.2025.3549299

Zemin Liu;Yuan Li;Nan Chen;Qian Wang;Bryan Hooi;Bingsheng He

Volume 37, Issue 6, Pages 3132-3152 | Full text: https://ieeexplore.ieee.org/document/10924418/

Citations: 0

Abstract

Graphs represent interconnected structures prevalent in a myriad of real-world scenarios. Effective graph analytics, such as graph learning methods, enables users to gain profound insights from graph data, underpinning various tasks including node classification and link prediction. However, these methods often suffer from data imbalance, a common issue in graph data where certain segments possess abundant data while others are scarce, thereby leading to biased learning outcomes. This necessitates the emerging field of imbalanced learning on graphs, which aims to correct these data distribution skews for more accurate and representative learning outcomes. In this survey, we embark on a comprehensive review of the literature on imbalanced learning on graphs. We begin by providing a definitive understanding of the concept and related terminologies, establishing a strong foundational understanding for readers. Following this, we propose two comprehensive taxonomies: (1) the problem taxonomy, which describes the forms of imbalance we consider, the associated tasks, and potential solutions and (2) the technique taxonomy, which details key strategies for addressing these imbalances, and aids readers in their method selection process. Finally, we suggest prospective future directions for both problems and techniques within the sphere of imbalanced learning on graphs, fostering further innovation in this critical area.




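Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 11.70

Self-citation rate: 3.40%

Articles published: 515

Review time: 6 months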

About the journal: The IEEE Transactions on Knowledge and Data Engineering covers knowledge and data engineering within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.