Equiangular Basis Vectors: A Novel Paradigm for Classification Tasks

IF 11.6 · Zone 2 (Computer Science) · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yang Shen, Xuhao Sun, Xiu-Shen Wei, Anqi Xu, Lingyan Gao
Published: 2024-07-30
DOI: 10.1007/s11263-024-02189-2 (https://doi.org/10.1007/s11263-024-02189-2)
Citations: 0

Abstract

In this paper, we propose Equiangular Basis Vectors (EBVs) as a novel training paradigm of deep learning for image classification tasks. Unlike prominent training paradigms, e.g., k-way classification layers (mapping the learned representations to the label space) and deep metric learning (quantifying sample similarity), our method generates normalized vector embeddings as "predefined classifiers", which act as fixed learning targets corresponding to different categories. During training, we minimize the spherical distance between the embedding of an input and its categorical EBV; during inference, the prediction is obtained by identifying the categorical EBV with the smallest distance. More importantly, EBVs for newly added categories can be added directly alongside the existing EBVs with equal status, so our method scales well to the large increase of training categories in open-environment machine learning. In experiments, we evaluate EBVs on diverse computer vision tasks with large-scale real-world datasets, including classification on ImageNet-1K, object detection on COCO, and semantic segmentation on ADE20K, among others. We further collected a dataset consisting of 100,000 categories to validate the superior performance of EBVs when handling a large number of categories. Comprehensive experiments validate both the effectiveness and scalability of EBVs. Our method won first place in the 2022 DIGIX Global AI Challenge. The code, along with all associated logs, is open-source and available at https://github.com/aassxun/Equiangular-Basis-Vectors.
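
The idea of fixed targets and nearest-target inference can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how EBVs are constructed, so the hypothetical helper `make_fixed_targets` simply draws random unit vectors, which are only approximately equiangular in high dimensions. All function names, the dimension, and the noise used to fake an embedding are assumptions made for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def make_fixed_targets(num_classes, dim, seed=0):
    """Hypothetical stand-in for EBV generation: random unit vectors.

    Assumption: the real construction is not given in the abstract; random
    directions are only approximately equiangular.
    """
    rng = random.Random(seed)
    return [normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
            for _ in range(num_classes)]


def spherical_distance(u, v):
    """Angle in radians between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error


def predict(embedding, targets):
    """Index of the fixed target with the smallest spherical distance."""
    e = normalize(embedding)
    return min(range(len(targets)),
               key=lambda k: spherical_distance(e, targets[k]))


# Five categories, each with a fixed 64-dimensional target.
targets = make_fixed_targets(num_classes=5, dim=64)

# Fake "network output": the class-3 target plus a small perturbation.
noisy = [t + 0.01 * ((i % 3) - 1) for i, t in enumerate(targets[3])]
before = predict(noisy, targets)

# Adding categories appends new targets; the existing ones are untouched.
targets.extend(make_fixed_targets(num_classes=2, dim=64, seed=1))
after = predict(noisy, targets)

print(before, after)
```

In this sketch a training loop would pull each embedding toward the target of its label by minimizing `spherical_distance`; only the inference side and the append-only growth of the target set are shown here.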


Source journal
International Journal of Computer Vision
Category: Engineering & Technology, Computer Science: Artificial Intelligence
CiteScore: 29.80
Self-citation rate: 2.10%
Articles published: 163
Review time: 6 months
About the journal: The International Journal of Computer Vision (IJCV) serves as a platform for sharing new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. The journal encompasses various types of articles to cater to different research outputs. Regular articles, which span up to 25 journal pages, focus on significant technical advancements that are of broad interest to the field. Short articles, limited to 10 pages, offer a swift publication path for novel research outcomes. Survey articles, comprising up to 30 pages, offer critical evaluations of the current state of the art in computer vision or tutorial presentations of relevant topics. In addition to technical articles, the journal also includes book reviews, position papers, and editorials by prominent scientific figures, which complement the technical content. The journal encourages authors to include supplementary material online, such as images, video sequences, data sets, and software, to enhance the understanding and reproducibility of the published research.