Title: A novel heterogeneous data classification approach combining gradient boosting decision trees and hybrid structure model
Authors: Feng Xu, Yuting Huang, Hui Wang, Zizhu Fan
Journal: Pattern Recognition, volume 165, Article 111614
Publication date: 2025-03-28
DOI: 10.1016/j.patcog.2025.111614
URL: https://www.sciencedirect.com/science/article/pii/S0031320325002742
Impact factor: 7.5; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Graph neural networks (GNNs) are central to graph representation learning. However, they encounter difficulties when node features are complex, for example when they originate from heterogeneous or multi-view data. Gradient boosting decision trees (GBDT) are well known to excel at heterogeneous tabular data, while GNNs and hypergraph neural networks (HGNNs) perform well with low-order and high-order sparse matrices, respectively. We therefore propose a method that combines these strengths: GBDT handles the heterogeneous features, while a hybrid structured model (HSM) built on a GNN and an HGNN captures both low-order and high-order information and backpropagates gradients to the GBDT. The proposed GBDT-HSM algorithm is evaluated on four structured tabular datasets and two multi-view datasets, where it achieves state-of-the-art performance, showing its potential for heterogeneous data classification. The code is available at https://github.com/zzfan3/GBDT-HSM.
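To make the distinction between low-order and high-order structure concrete, the following is a minimal sketch and not the authors' HSM, whose architecture the abstract does not specify. It assumes the widely used GCN operator D^{-1/2}(A + I)D^{-1/2} for pairwise edges and the widely used HGNN operator D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2} (unit hyperedge weights) for hyperedges. The function names, the mixing weight `alpha`, the toy graph, and the feature matrix `X` (standing in for GBDT outputs) are all hypothetical.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x d)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_operator(adj):
    """Low-order (pairwise) propagation: D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def hgnn_operator(inc):
    """High-order propagation: Dv^{-1/2} H De^{-1} H^T Dv^{-1/2}.

    inc is the node-by-hyperedge incidence matrix H. Assumes every node
    belongs to at least one hyperedge and no hyperedge is empty.
    """
    n, m = len(inc), len(inc[0])
    dv = [sum(inc[i]) for i in range(n)]                        # node degrees
    de = [sum(inc[i][e] for i in range(n)) for e in range(m)]   # hyperedge sizes
    return [[sum(inc[i][e] * inc[j][e] / de[e] for e in range(m))
             / math.sqrt(dv[i] * dv[j]) for j in range(n)]
            for i in range(n)]


def hybrid_step(adj, inc, x, alpha=0.5):
    """Hypothetical mix of one low-order and one high-order propagation step."""
    low = matmul(gcn_operator(adj), x)
    high = matmul(hgnn_operator(inc), x)
    return [[alpha * l + (1.0 - alpha) * h for l, h in zip(lr, hr)]
            for lr, hr in zip(low, high)]


# Toy graph: path 0-1-2-3 (pairwise edges).
ADJ = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
# Toy hypergraph: hyperedge 0 = {0, 1, 2}, hyperedge 1 = {2, 3}.
INC = [[1, 0],
       [1, 0],
       [1, 1],
       [0, 1]]
# One feature per node, standing in for GBDT predictions (made-up values).
X = [[1.0], [0.0], [0.0], [0.0]]

out = hybrid_step(ADJ, INC, X)
print(out)
```

In this toy setting, node 2 receives signal from node 0 through the shared hyperedge even though the two nodes are not directly connected by a pairwise edge, which is the kind of high-order information a hypergraph branch is meant to capture.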
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.