{"title":"基于类中心损失的细粒度视觉分类特征增强模块。","authors":"Daohui Wang,He Xinyu,Shujing Lyu,Wei Tian,Yue Lu","doi":"10.1109/tnnls.2025.3613791","DOIUrl":null,"url":null,"abstract":"We propose a novel feature enhancement module designed for fine-grained visual classification tasks, which can be seamlessly integrated into various backbone architectures, including both convolutional neural network (CNN)-based and Transformer-based networks. The plug-and-play module outputs pixel-level feature maps and performs a weighted fusion of filtered features to enhance fine-grained feature representation. We introduce a class-centric loss function that optimizes the alignment of samples with their target class centers by pulling them toward the center of the target class while simultaneously pushing them away from the center of the most visually similar nontarget classes. Soft labels are employed to mitigate overfitting, ensuring the model generalizes well to unseen examples. Our approach consistently delivers significant improvements in accuracy across various mainstream backbone architectures, underscoring its versatility and robustness. Furthermore, we achieved the highest accuracy on the NABirds (NAB) and our proprietary lock cylinder datasets. We have released our source code and pretrained model on GitHub: https://github.com/Richard5413/FEM-CC.git.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"57 1","pages":""},"PeriodicalIF":8.9000,"publicationDate":"2025-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Feature Enhancement Module Based on Class-Centric Loss for Fine-Grained Visual Classification.\",\"authors\":\"Daohui Wang,He Xinyu,Shujing Lyu,Wei Tian,Yue Lu\",\"doi\":\"10.1109/tnnls.2025.3613791\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose a novel feature enhancement module designed for fine-grained visual classification tasks, which can be seamlessly integrated into various backbone architectures, including both convolutional neural network (CNN)-based and Transformer-based networks. The plug-and-play module outputs pixel-level feature maps and performs a weighted fusion of filtered features to enhance fine-grained feature representation. We introduce a class-centric loss function that optimizes the alignment of samples with their target class centers by pulling them toward the center of the target class while simultaneously pushing them away from the center of the most visually similar nontarget classes. Soft labels are employed to mitigate overfitting, ensuring the model generalizes well to unseen examples. Our approach consistently delivers significant improvements in accuracy across various mainstream backbone architectures, underscoring its versatility and robustness. Furthermore, we achieved the highest accuracy on the NABirds (NAB) and our proprietary lock cylinder datasets. 
We have released our source code and pretrained model on GitHub: https://github.com/Richard5413/FEM-CC.git.\",\"PeriodicalId\":13303,\"journal\":{\"name\":\"IEEE transactions on neural networks and learning systems\",\"volume\":\"57 1\",\"pages\":\"\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-10-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on neural networks and learning systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1109/tnnls.2025.3613791\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/tnnls.2025.3613791","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Feature Enhancement Module Based on Class-Centric Loss for Fine-Grained Visual Classification.
We propose a novel feature enhancement module designed for fine-grained visual classification tasks, which can be seamlessly integrated into various backbone architectures, including both convolutional neural network (CNN)-based and Transformer-based networks. The plug-and-play module outputs pixel-level feature maps and performs a weighted fusion of filtered features to enhance fine-grained feature representation. We introduce a class-centric loss function that optimizes the alignment of samples with their target class centers by pulling them toward the center of the target class while simultaneously pushing them away from the center of the most visually similar nontarget classes. Soft labels are employed to mitigate overfitting, ensuring the model generalizes well to unseen examples. Our approach consistently delivers significant improvements in accuracy across various mainstream backbone architectures, underscoring its versatility and robustness. Furthermore, we achieved the highest accuracy on the NABirds (NAB) and our proprietary lock cylinder datasets. We have released our source code and pretrained model on GitHub: https://github.com/Richard5413/FEM-CC.git.
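To make the class-centric loss described above more concrete, the sketch below shows one plausible way such a term could be implemented: sample embeddings are pulled toward a learnable center for their ground-truth class and pushed away from the closest non-target center, used here as a proxy for the "most visually similar" non-target class. This is a minimal, hypothetical PyTorch illustration under assumed design choices (learnable centers, a hinge margin), not the authors' released implementation; their official code is available at the GitHub link above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ClassCentricLoss(nn.Module):
    """Illustrative class-centric loss (hypothetical sketch, not the paper's code)."""

    def __init__(self, num_classes: int, feat_dim: int, margin: float = 1.0):
        super().__init__()
        # One learnable center per class.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.margin = margin

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # features: (B, D) embeddings; labels: (B,) integer class ids.
        # Squared Euclidean distance from every sample to every class center.
        dists = torch.cdist(features, self.centers, p=2) ** 2           # (B, C)

        # "Pull" term: distance to the target-class center.
        pull = dists.gather(1, labels.unsqueeze(1)).squeeze(1)          # (B,)

        # "Push" term: distance to the closest non-target center
        # (assumed proxy for the most visually similar non-target class).
        masked = dists.scatter(1, labels.unsqueeze(1), float("inf"))
        push = masked.min(dim=1).values                                 # (B,)

        # Hinge-style objective: shrink pull, keep push at least `margin` larger.
        return (pull + F.relu(self.margin + pull - push)).mean()
```

In training, such a term would typically be weighted and summed with a standard classification loss; the soft labels mentioned in the abstract could, for instance, be realized with label smoothing, e.g. nn.CrossEntropyLoss(label_smoothing=0.1). These are assumptions for illustration, not details confirmed by the abstract.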
Journal Introduction:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.