IEEE Transactions on Artificial Intelligence: Latest Articles

Embodied Navigation in Unknown Environments With Implicit Scene Memory and Target-Aware Memory Retrieval
IEEE transactions on artificial intelligence Pub Date : 2026-04-01 Epub Date: 2025-10-10 DOI: 10.1109/TAI.2025.3618793
Qiming Liu;Yunhe Li;Yiduo Xu;Lijun Han;Zhe Liu;Hesheng Wang
Abstract: Neural radiance fields (NeRF) have demonstrated significant potential in providing fine and dense scene representations. This article intends to leverage NeRF as a memory structure for storing scene cues and to explore its potential in robotic navigation tasks, thereby enabling robots to achieve reliable and efficient navigation in unknown environments. During navigation, scene features from historical observations are stored online in the NeRF memory. Concurrently, exploiting the implicit characteristics of NeRF, a differentiable memory-retrieval mechanism called dual space attention is designed to extract target-relevant scene cues from the NeRF structure, supporting long-term optimized behavior patterns. Additionally, to address the significant noise caused by the lack of environmental priors when using implicit NeRF memory in unknown scenes, an uncertainty-masking operation is introduced during memory retrieval to eliminate low signal-to-noise-ratio information and potentially enhance exploration behavior. Results from photorealistic simulations and real-world demonstrations show that the proposed navigation system exhibits clear target-search and orientation behavior patterns, outperforming typical baselines in performance and efficiency. Overall, the proposed end-to-end navigation pipeline extends the application of NeRF technology to control-domain tasks, offering new possibilities for the advancement of intelligent robots and embodied AI.
Volume 7, Issue 4, pp. 2387-2400.
Citations: 0
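The uncertainty-masking idea described in the abstract above can be illustrated with a minimal sketch: features retrieved from the implicit memory are suppressed when their predicted uncertainty is high (low signal-to-noise ratio). The function name, array shapes, and threshold below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

# Hedged sketch of uncertainty masking during memory retrieval: retrieved
# feature vectors with high predicted uncertainty are zeroed out before they
# can influence downstream decisions. Names and threshold are assumptions.

def masked_retrieval(features, uncertainty, threshold=0.5):
    """features: (N, D) retrieved scene features; uncertainty: (N,) in [0, 1]."""
    keep = (uncertainty < threshold).astype(features.dtype)  # 1 = keep row
    return features * keep[:, None]

feats = np.ones((4, 3))
unc = np.array([0.1, 0.9, 0.3, 0.7])
out = masked_retrieval(feats, unc)  # rows 1 and 3 are zeroed
```

In practice the mask would feed a differentiable retrieval path; hard thresholding here only illustrates the filtering effect.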
DDD-GenDT: Dynamic Data-Driven Generative Digital Twin Framework
IEEE transactions on artificial intelligence Pub Date : 2026-04-01 Epub Date: 2025-10-09 DOI: 10.1109/TAI.2025.3612920
Yu-Zheng Lin;Qinxuan Shi;Zhanglong Yang;Banafsheh Saber Latibari;Shalaka Satam;Sicong Shao;Soheil Salehi;Pratik Satam
Abstract: Digital twin (DT) technology enables real-time simulation, prediction, and optimization of physical systems (PSs), but its practical deployment often faces challenges related to high data requirements, proprietary data constraints, and limited adaptability to evolving system conditions. This work introduces the dynamic data-driven generative digital twin (DDD-GenDT), a framework grounded in the dynamic data-driven application systems (DDDAS) paradigm. The proposed architecture comprises the physical twin (PT) observation graph (PTOG) for representing operational states of the PT, an observation-window extraction process for capturing relevant temporal state sequences, a data preprocessing pipeline within a large language model (LLM)-based behavior prediction engine for sensor data structuring and filtering, and an LLM ensemble that performs zero-shot predictive inference. By leveraging generative artificial intelligence (AI), DDD-GenDT reduces the need for extensive historical datasets, enabling DT construction in data-scarce environments while maintaining privacy for proprietary industrial processes. The DDDAS-driven feedback mechanism enables the DT to autonomically adapt its predictive behavior to align with PT-specific wear and degradation patterns, thereby supporting DT-aging, the progressive synchronization of the DT with the evolving PS. The proposed framework is validated using the NASA CNC milling dataset, with spindle motor current as the monitored variable. In a zero-shot prediction setting, the GPT-4-based DT achieves an average RMSE of 0.479 A (4.79% of the maximum 10 A spindle current), accurately modeling both nonlinear process dynamics and changes arising from PT aging without retraining. These results demonstrate that DDD-GenDT provides a generalizable, data-efficient, and adaptive DT modeling approach, bridging generative AI (GenAI) capabilities with the performance and reliability requirements of industrial DT applications.
Volume 7, Issue 4, pp. 2171-2185.
Citations: 0
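The error metric reported above, RMSE expressed as a percentage of the 10 A full-scale spindle current, is straightforward to reproduce. A small sketch with illustrative names:

```python
import numpy as np

# Sketch of the reported metric: RMSE and its percentage of a full-scale
# value (10 A spindle current in the paper). Function name is illustrative.

def rmse_percent(pred, actual, full_scale=10.0):
    """Return (rmse, rmse as % of full_scale) for two equal-length sequences."""
    rmse = float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(actual)) ** 2)))
    return rmse, 100.0 * rmse / full_scale

# e.g. an RMSE of 0.479 A against a 10 A full scale corresponds to 4.79%
r, p = rmse_percent([1.0], [0.0])  # rmse = 1.0 A -> 10.0% of full scale
```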
Interclass and Intraclass Relationships Incorporated Knowledge Distillation for Continual Learning
IEEE transactions on artificial intelligence Pub Date : 2026-04-01 Epub Date: 2025-09-22 DOI: 10.1109/TAI.2025.3611366
Qingya Sui;Lin Zhong;Lianbo Ma;Ziqian Wang;Zhenyu Lei;Shangce Gao
Abstract: Continual learning (CL) enables models to learn sequentially from a stream of tasks while retaining previously acquired knowledge. However, current methods pay insufficient attention to tasks with differing categories: as a new task is introduced, models often adjust their internal representations to accommodate the new knowledge, which leads to decision-boundary shifts and catastrophic forgetting and limits the performance of CL methods. To address these limitations, this article proposes CL with interclass and intraclass relationships incorporated knowledge distillation (2ICL). The interclass relationships ensure stable decision boundaries by capturing the relative positioning between task categories. The intraclass relationships preserve internal coherence within each class to enhance generalization. Furthermore, 2ICL incorporates a dynamically expandable representation, enabling it to expand its feature space as new tasks are added while retaining both old and new knowledge. Experiments conducted on the CIFAR-10, CIFAR-100, and PathMNIST datasets demonstrate that 2ICL not only significantly alleviates catastrophic forgetting but also maintains high accuracy across tasks.
Volume 7, Issue 4, pp. 2102-2111.
Citations: 0
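One way to make the interclass idea above concrete is to distill the pairwise similarity structure among class prototypes from the old (teacher) model into the new (student) model, so the relative positioning of classes, and hence the decision boundaries, stays stable. This is a hedged sketch; the prototype construction and loss form are assumptions, not the 2ICL formulation.

```python
import numpy as np

# Sketch: penalize the student when the cosine-similarity structure among its
# class prototypes drifts away from the teacher's. Names are illustrative.

def relation_matrix(prototypes):
    """Cosine similarity between every pair of class prototypes, (C, C)."""
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return p @ p.T

def interclass_distill_loss(student_protos, teacher_protos):
    d = relation_matrix(student_protos) - relation_matrix(teacher_protos)
    return float(np.mean(d ** 2))

protos = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# a global rotation preserves all pairwise angles, so the loss stays ~0
rot = protos @ np.array([[0.0, 1.0], [-1.0, 0.0]])
```

Because only relative positions are matched, representations may move freely as long as interclass geometry is preserved.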
WeatherRemover: All-in-One Adverse Weather Removal With Multiscale Feature Map Compression
IEEE transactions on artificial intelligence Pub Date : 2026-03-01 Epub Date: 2025-11-14 DOI: 10.1109/TAI.2025.3633206
Weikai Qu;Sijun Liang;Cheng Pan;Zikuan Yang;Guanchi Zhou;Xianjun Fu;Bo Liu;Changmiao Wang;Ahmed Elazab
Abstract: Photographs taken in adverse weather conditions often suffer from blurriness, occlusion, and low brightness due to interference from rain, snow, and fog. These weather effects substantially impair downstream vision tasks, making their removal a critical step in image enhancement. Existing methods primarily target specific weather conditions, with only a few capable of handling multiple weather scenarios. However, mainstream approaches often overlook performance considerations, resulting in large parameter sizes, long inference times, and high memory costs. In this study, we introduce the WeatherRemover model, designed to enhance the restoration of images affected by various weather conditions while balancing performance. Our model adopts a UNet-like structure with a gating mechanism and a multiscale pyramid vision transformer. It employs channel-wise attention derived from convolutional neural networks to optimize feature extraction, while linear spatial reduction helps curtail the computational demands of attention. The gating mechanisms, strategically placed within the feed-forward and downsampling phases, refine the processing of information by selectively addressing redundancy and mitigating its influence on learning. This approach facilitates the adaptive selection of essential data, ensuring superior restoration and maximizing efficiency. Additionally, our lightweight model achieves an optimal balance between restoration quality, parameter efficiency, computational overhead, and memory usage, distinguishing it from other multiweather models and thereby meeting practical application demands effectively.
Volume 7, Issue 5, pp. 2980-2994.
Citations: 0
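The channel-wise attention mentioned in the abstract above is commonly realized squeeze-and-excitation style: global-average-pool each channel, pass the pooled vector through a small gating MLP, and rescale the channels. A minimal NumPy sketch under that assumption (weight shapes and the reduction ratio are illustrative, not the paper's module):

```python
import numpy as np

# Sketch of squeeze-and-excitation style channel attention. The exact module
# used in WeatherRemover may differ; shapes here are assumptions.

def channel_attention(x, w1, w2):
    """x: (C, H, W) feature map; w1: (C//r, C); w2: (C, C//r)."""
    z = x.mean(axis=(1, 2))              # squeeze: global average pool -> (C,)
    h = np.maximum(0.0, w1 @ z)          # excitation MLP with ReLU
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))  # sigmoid gate per channel -> (C,)
    return x * s[:, None, None]          # rescale each channel

x = np.ones((4, 5, 6))
w1 = np.ones((2, 4)) * 0.1  # reduction ratio r = 2
w2 = np.ones((4, 2)) * 0.1
y = channel_attention(x, w1, w2)
```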
An Ensemble Link Prediction Model for Sparse Knowledge Graphs With Drifting Entity
IEEE transactions on artificial intelligence Pub Date : 2026-03-01 Epub Date: 2025-11-11 DOI: 10.1109/TAI.2025.3628292
Xiulin Zheng;Pei-pei Li;Peng Zhou;Xindong Wu
Abstract: Incompleteness is a common issue in knowledge graphs (KGs), limiting their ability to aid downstream applications. Knowledge graph completion (KGC) has emerged to predict the missing links between entities and thereby enhance the quality of KGs. However, several issues remain unsolved in existing methods. First, most of them utilize only either the triple structure information or the text semantic information, which usually leads to suboptimal results in sparse KGC. Second, redundant text replacement and embedding calculation for the same entities in a KG incur a huge computational cost during embedding learning. Third, entity drift may occur at any time in dynamic KGs, yet existing methods rarely account for it. To address these challenges, we propose a time-friendly ensemble model for link prediction in dynamic KGs with drifting entities (TEDD). In the TEDD model, a double scoring mechanism is employed to fully exploit the text semantic information as well as the structural information of triples in KGs. A creative dual bidirectional encoder representations from transformers (BERT) structure, along with an auxiliary candidate-entity set, is adopted to avoid redundant embedding calculation for the same entities and further alleviate the heavy computing cost. A language representation model serves as a knowledge base to learn representations of drifted entities, avoiding the large computational cost of frequently retraining the model. Extensive experiments conducted on benchmark datasets demonstrate the superiority of the proposed TEDD model over state-of-the-art (SOTA) methods.
Volume 7, Issue 5, pp. 2870-2881.
Citations: 0
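The double scoring mechanism above can be sketched as a weighted combination of a structural score over triple embeddings and a text-semantic score. The TransE-style distance below stands in for the structural scorer, and the mixing weight is an assumption; TEDD's actual scorers are not specified in the abstract.

```python
import numpy as np

# Hedged sketch of a "double scoring" ensemble for link prediction. The
# TransE-style scorer and alpha are stand-ins, not TEDD's components.

def transe_score(h, r, t):
    """Structural score: negative distance of h + r from t (0 is best)."""
    return -float(np.linalg.norm(h + r - t))

def ensemble_score(text_score, struct_score, alpha=0.5):
    """Combine a text-semantic score with a structural score."""
    return alpha * text_score + (1.0 - alpha) * struct_score

h = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])
t = np.array([1.0, 1.0])  # t = h + r, so the structural score is 0
s = ensemble_score(text_score=0.9, struct_score=transe_score(h, r, t))
```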
FA-MADT: Enhancing Offline Multiagent Reinforcement Learning With Factorized Attention and Decision Transformers
IEEE transactions on artificial intelligence Pub Date : 2026-03-01 Epub Date: 2025-10-20 DOI: 10.1109/TAI.2025.3623619
Youness Boutyour;Abdellah Idrissi
Abstract: Multiagent reinforcement learning (MARL) is challenging with respect to scalability, coordination, and stability, particularly in the offline setting where exploration is restricted. Decision transformers (DTs) are an emerging technology in offline reinforcement learning (RL) for single agents, recasting RL as a sequence modeling problem, but their use in multiagent environments is not fully explored. In this work, we introduce factorized attention for multiagent decision transformers (FA-MADT), a new architecture that enhances coordination and sample efficiency, with design considerations aimed at improving scalability. FA-MADT uses factorized attention (FA) to model interagent dependencies, avoiding the quadratic complexity of standard self-attention while preserving relevant coordination information. With the integration of return-to-go (RTG) conditioning, FA-MADT is capable of making trajectory-based decisions and thus performs well in long-term planning without the need for online exploration. Furthermore, behavior cloning (BC) regularization improves policy learning by preventing out-of-distribution (OOD) actions and enhancing the generality of the policy over different offline datasets. We evaluate FA-MADT on three benchmark suites (multiagent MuJoCo, the StarCraft Multiagent Challenge (SMAC), and multiagent traffic signal control), demonstrating consistent improvements over state-of-the-art baselines including MADT, TransMix, CQL-MA, and OMIGA. Our method improves coordination efficiency by up to 15%, reduces OOD action rates by 20%, and lowers memory usage by 12%. FA-MADT also reduces attention complexity from $\mathcal{O}(N^2)$ to $\mathcal{O}(N \cdot d \cdot k)$ with $k \ll N$, supporting scalable policy learning. Additionally, BC regularization improves OOD action-selection accuracy by up to 9.4% on the most challenging SMAC scenarios, contributing to more stable offline policy optimization. These results highlight FA-MADT as a promising step toward scalable and generalizable offline multiagent decision-making, with future work needed to validate its robustness in real-world systems involving noisy sensors and physical dynamics.
Volume 7, Issue 5, pp. 2751-2760.
Citations: 0
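A complexity reduction from $\mathcal{O}(N^2)$ to $\mathcal{O}(N \cdot d \cdot k)$ is typical of attention routed through a small set of k summary (inducing) tokens instead of all N-by-N agent pairs. A sketch under that assumption; the paper's exact factorization may differ, and all names here are illustrative:

```python
import numpy as np

# Sketch of attention through k learned "inducing" tokens with k << N, which
# avoids materializing the (N, N) attention matrix.

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def factorized_attention(X, inducing, Wq, Wk, Wv):
    """X: (N, d) agent tokens; inducing: (k, d) learned summary tokens."""
    Q, K, V = X @ Wq, inducing @ Wk, inducing @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[1]))  # (N, k) weights, O(N*d*k)
    return A @ V                                # (N, d_v) attended output

X = np.arange(15.0).reshape(5, 3)   # N = 5 agents, d = 3
inducing = np.ones((2, 3))          # k = 2 summary tokens
I3 = np.eye(3)
out = factorized_attention(X, inducing, I3, I3, I3)
```

Here both inducing rows are identical, so every agent's output is their average.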
Inter and Intra-snippet Multi-head Attention With Position Offset for Action Localization and Recognition
IEEE transactions on artificial intelligence Pub Date : 2026-03-01 Epub Date: 2025-11-10 DOI: 10.1109/TAI.2025.3630621
Himanshu Singh;Khanjan Choudhury;Badri Narayan Subudhi;Vinit Jakhetiya;T. Veerakumar
Abstract: Numerous studies have focused on action localization and recognition; however, their performance suffers when applied to weakly supervised scenarios, leading to poor or rapidly declining results. This article introduces an efficient deep learning architecture based on multi-head attention to enhance action localization in untrimmed videos. Our proposed algorithm comprises three stages. Initially, a short-snippet enhancement (SSE) sampling module captures intrinsic details in video frames, adeptly balancing short-term and long-term action contributions for improved localization. The second stage employs inter-snippet and intra-snippet multi-head attention, incorporating positional offset, to capture spatio-temporal dependencies among videos and within individual video snippets, precisely identifying action boundaries. The third stage integrates an action localization network with uncertainty-guided pseudoinstance-level and video-level losses to enhance performance, mitigating the impact of noisy labels. A multistep updating process progressively refines action proposals, augmenting localization precision. To demonstrate the effectiveness of the proposed scheme, we evaluate its performance using mean average precision (mAP) over different thresholds of intersection over union (IoU) on the THUMOS14 and ActivityNet-v1.3 datasets. Our algorithm achieves an mAP of 45.20% on THUMOS14 and 25.24% on ActivityNet-v1.3. Furthermore, we compare our technique with 24 state-of-the-art (SOTA) techniques on THUMOS14 and eleven SOTA techniques on ActivityNet-v1.3, confirming the superiority of the proposed scheme.
Volume 7, Issue 5, pp. 3018-3030.
Citations: 0
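The evaluation protocol above scores a predicted temporal segment against ground truth by intersection over union before averaging precision at each threshold. The 1-D temporal IoU itself has a standard definition; a minimal implementation (not code from the paper):

```python
# Standard 1-D temporal IoU between two (start, end) segments, as used when
# thresholding detections for mAP computation.

def temporal_iou(seg_a, seg_b):
    """IoU of two temporal segments given as (start, end) pairs."""
    inter = max(0.0, min(seg_a[1], seg_b[1]) - max(seg_a[0], seg_b[0]))
    union = (seg_a[1] - seg_a[0]) + (seg_b[1] - seg_b[0]) - inter
    return inter / union if union > 0 else 0.0
```

A prediction counts as a true positive at threshold t only if its IoU with a matching ground-truth segment is at least t.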
GraphTeacher: Transductive Fine-Tuning of Encoders Through Graph Neural Networks
IEEE transactions on artificial intelligence Pub Date : 2026-03-01 Epub Date: 2025-10-31 DOI: 10.1109/TAI.2025.3627514
Emirhan Koç;Arda Can Aras;Tuna Alikaşifoğlu;Aykut Koç
Abstract: We present GraphTeacher for fine-tuning transformer encoders by leveraging graph neural networks (GNNs) to effectively train models when fully labeled training data is unavailable. We study popular transformer models (DistilBERT, RoBERTa, and BERT) under varying percentages of labeled training data. The proposed approach uses the underlying graph structure of a corpus by allowing the transformer encoders to incorporate GNNs into the fine-tuning process. Using latent patterns and correlations identified in unlabeled data, our method aims to enhance the model's adaptability to scarcely labeled data scenarios. Moreover, our approach excels in conducting single-instance inference, a capability not inherently possessed by models with a transductive (semisupervised) training stage. GraphTeacher not only processes the unlabeled data effectively, as in transductive methods, but also offers an inductive inference setup for test samples. Experiments on diverse datasets and various GNN architectures show that integrating GNNs significantly enhances transformer encoders' robustness and generalization capabilities, in particular under sparsely labeled training conditions. GraphTeacher demonstrates a noteworthy improvement, achieving up to a 10% performance increase on the GLUE benchmark compared with the baselines. The code base and data pipeline of GraphTeacher are available at https://github.com/koc-lab/graph-teacher
Volume 7, Issue 5, pp. 2811-2825.
Citations: 0
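A GNN layer of the kind GraphTeacher builds on propagates encoder features over the corpus graph via a normalized adjacency. A minimal sketch of one standard GCN propagation step (a common baseline, not the paper's exact architecture):

```python
import numpy as np

# Sketch of a single Kipf-and-Welling-style GCN layer: add self-loops,
# symmetrically normalize the adjacency, propagate features, apply ReLU.

def gcn_layer(A, H, W):
    """A: (n, n) adjacency; H: (n, f_in) features; W: (f_in, f_out) weights."""
    A_hat = A + np.eye(A.shape[0])                  # self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # D^{-1/2}
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(0.0, A_norm @ H @ W)          # ReLU activation

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # two connected nodes
H = np.eye(2)
out = gcn_layer(A, H, np.eye(2))
```

With both nodes connected and identity features, each output row is the uniform average over the node's neighborhood including itself.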
ELSA: Equivariant Self-Supervised Adaptation for Small Data Learning
IEEE transactions on artificial intelligence Pub Date : 2026-03-01 Epub Date: 2025-11-05 DOI: 10.1109/TAI.2025.3628282
Lili Kong;Dongdong Chen;Chenwei Tang;Deng Xiong;Jiancheng Lv
Abstract: Learning with small datasets presents a significant challenge due to the low volume of data, and currently there is no effective and unified framework for small data learning. Although recent progress in transfer learning and generative models has made great strides in machine learning tasks for small data scenarios, these approaches rely on prior information from large-scale datasets gathered from similar tasks or distributions, making small data learning a false proposition. In contrast, our study aims to answer the challenging question: is it possible to learn with small data alone, where no data-driven priors are available? Inspired by the emerging field of equivariant imaging, we show that the invariance ubiquitous in data can derive the equivariance of self-supervised reconstruction functions, which enables neural networks to learn with small data alone. Based on this observation, we propose a conceptually simple equivariant self-supervised adaptation (ELSA) for small data learning. ELSA can be used as a plug-and-play prior for various small data learning tasks. In this work, we conduct experiments on three representative tasks: small data classification, image inpainting, and style transfer. Experimental results on multiple benchmark datasets demonstrate the effectiveness and efficiency of our method for small-data learning.
Volume 7, Issue 5, pp. 2858-2869.
Citations: 0
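The equivariant-imaging observation above can be phrased as a self-supervised consistency loss: a reconstruction function f should commute with transforms T under which the data distribution is invariant, i.e. f(T(x)) should match T(f(x)). A hedged sketch; ELSA's actual loss and transform set are not given in the abstract, and all names here are placeholders:

```python
import numpy as np

# Sketch of an equivariance consistency loss: penalize the mismatch between
# transforming-then-reconstructing and reconstructing-then-transforming.

def equivariance_loss(f, x, transform):
    """Mean squared deviation between f(T(x)) and T(f(x))."""
    return float(np.mean((f(transform(x)) - transform(f(x))) ** 2))

x = np.arange(6.0).reshape(3, 2)
# any f that commutes with the transform (here: the identity map and a
# vertical flip) incurs zero loss
identity_loss = equivariance_loss(lambda z: z, x, np.flipud)
```

Minimizing this over a family of invariance-preserving transforms supplies a training signal without labels or external data.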
Improving Transferability in Image Classification through Refinement of Discriminative Features
IEEE transactions on artificial intelligence Pub Date : 2026-03-01 Epub Date: 2025-10-23 DOI: 10.1109/TAI.2025.3624711
HyunGi Kim;Seungryong Yoo;Bong Gyun Kang;Saehyung Lee;Jaekoo Lee;Sungroh Yoon
Abstract: Transfer learning utilizes pretrained models with rich feature representations to enhance performance on downstream tasks. However, this raises a critical question: are all learned features equally advantageous for transfer? Our observations reveal that numerous features acquired during pretraining can be nondiscriminative for downstream classification tasks and can even obstruct transferability by dominating truly discriminative cues during prediction. To address this issue, we propose a novel approach that prioritizes discriminative features using discriminative anchors: class-representative embeddings refined to enhance discriminative power in the downstream domain. We initialize these anchors using classifier weight vectors obtained after linear probing, which naturally encode class boundaries within the pretrained feature space. Subsequently, we iteratively refine the discriminative anchors by explicitly aggregating features toward their corresponding anchors, progressively amplifying task-relevant, discriminative patterns within the data. We demonstrate the effectiveness of our approach, class-aligned refinement of discriminative anchors (CARA), across challenging scenarios prone to overfitting to nondiscriminative features: fine-grained image classification, limited data regimes, and out-of-distribution (OOD) generalization. In fine-grained tasks, CARA achieves an average improvement of 8.87%, while in data-scarce settings it yields over a 40% performance gain. Furthermore, CARA exhibits strong robustness in OOD settings, with less than half the accuracy drop compared to existing methods.
Volume 7, Issue 5, pp. 2774-2787.
Citations: 0
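The iterative anchor refinement above can be sketched as pulling each class anchor toward the aggregate of the features assigned to that class. The step size and update rule below are assumptions for illustration, not CARA's exact procedure.

```python
import numpy as np

# Sketch of one anchor-refinement step: each class anchor moves toward the
# mean of its class features; `lr` controls the interpolation and is assumed.

def refine_anchors(features, labels, anchors, lr=0.5):
    """features: (N, D); labels: (N,) class ids; anchors: (C, D)."""
    new = anchors.copy()
    for c in range(anchors.shape[0]):
        mask = labels == c
        if mask.any():  # anchors of absent classes stay untouched
            new[c] = (1.0 - lr) * anchors[c] + lr * features[mask].mean(axis=0)
    return new

feats = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]])
labels = np.array([0, 0, 1])
anchors = np.zeros((2, 2))
refined = refine_anchors(feats, labels, anchors, lr=1.0)
```

With lr=1.0 the anchors jump directly to their class means; smaller steps, iterated, give the progressive refinement the abstract describes.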