Euclidean and Poincaré space ensemble Xgboost
Ponnuthurai Nagaratnam Suganthan, Lingping Kong, Václav Snášel, Varun Ojha, Hussein Ahmed Hussein Zaky Aly
Information Fusion, volume 115, Article 102746. Published 2024-10-23. Journal Article.
DOI: 10.1016/j.inffus.2024.102746
https://www.sciencedirect.com/science/article/pii/S1566253524005244
Journal impact factor: 14.7 (JCR Q1, Computer Science, Artificial Intelligence)
Abstract
Hyperbolic space has garnered attention for its unique properties and its efficient representation of hierarchical structures. Recent studies have explored hyperbolic alternatives to hyperplane-based classifiers such as logistic regression and support vector machines. Hyperbolic methods have even been fused into random forests by constructing data splits with horospheres, which proved effective for hyperbolic datasets. However, the existing horosphere-based approach incurs substantial computation time, which discourages its application to most datasets. Against this backdrop, we introduce PXgboost, an extension of Xgboost, a renowned machine learning (ML) algorithm, to hyperbolic space. This extension redefines the node split using the Riemannian gradient and Riemannian Hessian. Comprehensive experiments on 64 datasets from the UCI ML repository and 8 datasets from WordNet, fusing both their Euclidean and hyperbolic-transformed (hyperbolic UCI) representations, show that PXgboost performs promisingly compared with algorithms in the literature. Furthermore, our findings suggest that the Euclidean metric-based classifier performs well even on hyperbolic data. Building on this finding, we propose a space fusion classifier called EPboost. It harmonizes data processing across various spaces and integrates probability outcomes for predictive analysis. In our comparative analysis involving 19 algorithms on the UCI datasets, EPboost outperforms the others in most cases, underscoring its efficacy and potential significance in diverse ML applications. This research marks a step forward in harnessing hyperbolic geometry for ML tasks and showcases its potential to enhance algorithmic efficacy.
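The abstract does not spell out its formulas, so the following is only a minimal sketch of three standard building blocks that the description suggests, not the authors' implementation. It assumes the Poincaré ball model with curvature -c, the exponential map at the origin as one common way to obtain a hyperbolic-transformed representation of Euclidean features, the usual conformal rescaling that turns a Euclidean gradient into a Riemannian one on the ball, and a simple weighted average as the way to integrate probability outcomes. The function names `exp_map_origin`, `riemannian_grad`, and `fuse_probabilities`, and the `weight` parameter, are hypothetical.

```python
import math


def exp_map_origin(v, c=1.0):
    """Map a Euclidean vector onto the Poincaré ball (curvature -c).

    Uses the exponential map at the origin:
    exp_0(v) = tanh(sqrt(c) * ||v||) * v / (sqrt(c) * ||v||).
    Assumption: the paper's exact transformation may differ.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return list(v)  # the origin maps to itself
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * x for x in v]


def riemannian_grad(x, euclid_grad, c=1.0):
    """Rescale a Euclidean gradient at point x on the Poincaré ball.

    The ball's metric is conformal with factor lambda_x = 2 / (1 - c * ||x||^2),
    so the Riemannian gradient is the Euclidean one divided by lambda_x^2.
    """
    sq_norm = sum(xi * xi for xi in x)
    factor = ((1.0 - c * sq_norm) ** 2) / 4.0
    return [factor * g for g in euclid_grad]


def fuse_probabilities(p_euclid, p_hyper, weight=0.5):
    """Combine class probabilities from a Euclidean and a hyperbolic model.

    Assumption: a convex combination followed by renormalisation; the
    abstract only says that probability outcomes are integrated.
    """
    fused = [weight * a + (1.0 - weight) * b for a, b in zip(p_euclid, p_hyper)]
    total = sum(fused)
    return [f / total for f in fused]


if __name__ == "__main__":
    point = exp_map_origin([3.0, 4.0])
    print("point on the ball:", point)
    print("rescaled gradient:", riemannian_grad(point, [1.0, -2.0]))
    print("fused probabilities:", fuse_probabilities([0.8, 0.2], [0.6, 0.4]))
```

Every mapped point has norm strictly below 1/sqrt(c), so it stays inside the ball, and the gradient rescaling shrinks toward zero as a point approaches the boundary. In a full pipeline the two probability vectors would come from boosted-tree models trained on the Euclidean and the hyperbolic-transformed features respectively.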
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.