Gravitational least squares twin support vector machine based on optimal angle for class imbalance learning

Impact factor 3.4 · Zone 2 (Mathematics) · JCR Q1 (MATHEMATICS, APPLIED)
Abdullah Mohammadi , Jalal A. Nasiri , Sohrab Effati
Applied Mathematics and Computation, Volume 510, Article 129705. Published 2025-09-04. DOI: 10.1016/j.amc.2025.129705
https://www.sciencedirect.com/science/article/pii/S009630032500431X
Citation count: 0

Abstract

This paper introduces the Gravitational Least Squares Twin Support Vector Machine for Class Imbalance Learning (GLSTSVM-CIL), a novel binary classification method designed to address critical limitations of existing approaches on imbalanced, large-scale datasets. Traditional methods such as Fuzzy TSVM and KNN-based weighting fail to capture both the global positional relationships and the local density characteristics of data points at the same time. Our proposed gravitational weighting function models data samples as masses influenced by their distance from class centroids and by neighborhood density, prioritizing representative points while suppressing outliers. The optimization framework incorporates angular constraints between the hyperplanes to enhance structural risk control and generalization capability. For scalability, we reformulate the solution as a linear system solvable via conjugate gradient methods, avoiding computationally expensive matrix inversions. Comprehensive evaluations on 92 datasets (including synthetic, noisy, medical, text, and large-scale NDC benchmarks) demonstrate GLSTSVM-CIL's superior performance, particularly in minority-class recognition, where it achieves average F1-score improvements over baseline methods. The model maintains robust accuracy under high noise (20%) and extreme class imbalance (ratio 20:1) while being able to process datasets of up to 50,000 samples.
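The two mechanisms named in the abstract, a density- and centroid-aware sample weight and a matrix-inversion-free solve, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the weight formula in `gravity_weights`, the parameters `k`, `c`, and `reg`, and all function names are hypothetical, and the plane fit is the plain weighted least squares twin SVM system without the paper's angular term. It is not the authors' GLSTSVM-CIL formulation.

```python
import numpy as np


def gravity_weights(X, k=3, eps=1e-12):
    """Illustrative weight (an assumption, not the paper's formula):
    local density divided by (1 + squared distance to the class centroid),
    scaled so the largest weight is 1. Needs at least two samples."""
    d2_centroid = ((X - X.mean(axis=0)) ** 2).sum(axis=1)
    # Pairwise distances inside the class (O(n^2) memory; fine for a sketch).
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    k = min(k, len(X) - 1)
    # Column 0 of each sorted row is the point itself (distance 0), so skip it.
    knn_mean = np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)
    w = (1.0 / (knn_mean + eps)) / (1.0 + d2_centroid)
    return w / w.max()


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=200):
    """Solve M z = b for symmetric positive definite M, given only z -> M z."""
    z = np.zeros_like(b)
    r = b.copy()  # residual b - M z for the starting point z = 0
    p = r.copy()
    rs = r @ r
    stop = tol * np.linalg.norm(b)
    for _ in range(max_iter):
        if np.sqrt(rs) <= stop:
            break
        Mp = matvec(p)
        alpha = rs / (p @ Mp)
        z += alpha * p
        r -= alpha * Mp
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return z


def fit_plane(near, far, w_far, c=1.0, reg=1e-6):
    """Hyperplane close to `near` and at value -1 on `far`, with per-sample
    weights on the `far` errors. Solves
    (H^T H + c G^T W G + reg I) z = -c G^T W e without forming an inverse."""
    H = np.hstack([near, np.ones((len(near), 1))])
    G = np.hstack([far, np.ones((len(far), 1))])

    def matvec(z):
        return H.T @ (H @ z) + c * (G.T @ (w_far * (G @ z))) + reg * z

    z = conjugate_gradient(matvec, -c * (G.T @ w_far))
    return z[:-1], z[-1]


def predict(X, plane_pos, plane_neg):
    """Label +1 if a point is at least as close to plane_pos, else -1."""
    def distance(plane):
        w, b = plane
        return np.abs(X @ w + b) / np.linalg.norm(w)
    return np.where(distance(plane_pos) <= distance(plane_neg), 1, -1)


# Toy imbalanced data (5:1), purely synthetic.
rng = np.random.default_rng(0)
majority = rng.normal(0.0, 0.5, size=(40, 2))
minority = rng.normal(4.0, 0.5, size=(8, 2))
w_major, w_minor = gravity_weights(majority), gravity_weights(minority)
plane_major = fit_plane(majority, minority, w_minor)
plane_minor = fit_plane(minority, majority, w_major)
pred = predict(np.array([[0.0, 0.0], [4.0, 4.0]]), plane_major, plane_minor)
```

Passing the system to the solver as a matrix-vector product means the normal-equation matrix is never stored or inverted, which is the scalability argument the abstract makes. The pairwise-distance step in `gravity_weights` is quadratic in the class size and would need a neighbor index for datasets of the size the paper reports.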
Source journal
CiteScore: 7.90
Self-citation rate: 10.00%
Articles published: 755
Review time: 36 days
About the journal: Applied Mathematics and Computation addresses work at the interface between applied mathematics, numerical computation, and applications of systems-oriented ideas to the physical, biological, social, and behavioral sciences, and emphasizes papers of a computational nature focusing on new algorithms, their analysis, and numerical results. In addition to research papers, Applied Mathematics and Computation publishes review articles and single-topic issues.