Superpixel-based Visual Feature Enhancement for Compositional Zero-Shot Learning

Impact factor 6.9 · Zone 1 (Management Science) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS
Wenlong Du, Xianglin Bao, Xiaofeng Xu, Xingyu Lu, Ruiheng Zhang
DOI: 10.1016/j.ipm.2025.104414
Journal: Information Processing & Management, volume 63 (2), Article 104414
Published: 2025-10-03
Publication type: Journal Article
Full text: https://www.sciencedirect.com/science/article/pii/S0306457325003553
Citations: 0

Abstract

Compositional Zero-Shot Learning (CZSL) is a challenging machine learning task that recognizes new compositional concepts by leveraging learned concepts such as attribute-object combinations. Previous research depended on visual attributes derived from networks pre-trained in object categorization. These approaches are limited in capturing the subtleties of attribute distinctions and fail to account for the critical contextual interactions between attributes and visual objects. To address this problem, in this work, we draw inspiration from superpixels and introduce the Superpixel-based Visual Feature Enhancement (SVFE) model for the compositional zero-shot learning task. In the proposed approach, an innovative superpixel integration strategy is designed to meticulously disentangle and represent the visual concepts of states and objects with finer granularity. Then, we introduce a novel Fourier spectral layer that harnesses the frequency domain to capture global image features and dynamically adjusts component contributions to enhance the local detail representation. Furthermore, we propose a long-range fusion module to optimize the synergy between the local and global features, thereby fortifying the model’s acuity in discerning intricate compositional relationships. Through rigorous experiments on standard CZSL benchmark datasets, the proposed SVFE model demonstrates significant improvement over other state-of-the-art methods in both open-world and closed-world CZSL scenarios.
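
The abstract describes a Fourier spectral layer that works in the frequency domain and adjusts how much each component contributes. The listing gives no implementation details, so the following is only a minimal sketch of the general idea and not the authors' SVFE layer. It assumes a plain NumPy feature map of shape `(H, W, C)` and a per-frequency weight tensor called `gate`; both names and the function `fourier_spectral_filter` are hypothetical. In a trained model the weights would be learned, whereas here they are supplied by hand.

```python
import numpy as np


def fourier_spectral_filter(features, gate):
    """Reweight the 2-D frequency components of a feature map.

    features: real array of shape (H, W, C).
    gate:     real or complex array of shape (H, W // 2 + 1, C); a
              hypothetical stand-in for learned per-frequency weights.
    Returns a real array of shape (H, W, C).
    """
    height, width = features.shape[:2]
    # Real 2-D FFT over the spatial axes. Every output coefficient depends
    # on all spatial positions, which is what makes the operation global.
    spectrum = np.fft.rfft2(features, axes=(0, 1))
    # Scale each frequency component by its weight.
    filtered = spectrum * gate
    # Transform back to the spatial domain at the original resolution.
    return np.fft.irfft2(filtered, s=(height, width), axes=(0, 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 6, 3))

    # A gate of all ones leaves the feature map unchanged.
    identity_gate = np.ones((8, 6 // 2 + 1, 3))
    print(np.allclose(fourier_spectral_filter(x, identity_gate), x))

    # Keeping only the zero-frequency component yields the per-channel mean.
    dc_gate = np.zeros((8, 6 // 2 + 1, 3))
    dc_gate[0, 0, :] = 1.0
    smoothed = fourier_spectral_filter(x, dc_gate)
    print(np.allclose(smoothed[0, 0], x.mean(axis=(0, 1))))
```

The two checks at the bottom illustrate the extremes: passing every frequency reproduces the input, while passing only the zero frequency collapses each channel to its spatial mean. Weights between these extremes trade global structure against local detail.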
Source journal
Information Processing & Management (Engineering & Technology; Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.