Wenlong Du, Xianglin Bao, Xiaofeng Xu, Xingyu Lu, Ruiheng Zhang
Information Processing & Management, 63(2), Article 104414. Published 2025-10-03. DOI: 10.1016/j.ipm.2025.104414
https://www.sciencedirect.com/science/article/pii/S0306457325003553
Superpixel-based Visual Feature Enhancement for Compositional Zero-Shot Learning
Compositional Zero-Shot Learning (CZSL) is a challenging machine learning task that recognizes new compositional concepts by leveraging previously learned ones, such as attribute-object combinations. Previous research depended on visual attributes derived from networks pre-trained for object categorization. These approaches capture the subtleties of attribute distinctions poorly and fail to account for the contextual interactions between attributes and visual objects. To address this problem, we draw inspiration from superpixels and introduce the Superpixel-based Visual Feature Enhancement (SVFE) model for CZSL. First, a superpixel integration strategy disentangles and represents the visual concepts of states and objects at finer granularity. Second, a Fourier spectral layer uses the frequency domain to capture global image features and dynamically adjusts component contributions to enhance the local detail representation. Third, a long-range fusion module optimizes the synergy between local and global features, strengthening the model's ability to discern intricate compositional relationships. In experiments on standard CZSL benchmark datasets, the proposed SVFE model demonstrates significant improvement over other state-of-the-art methods in both open-world and closed-world CZSL scenarios.
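The idea of a spectral layer that captures global image features in the frequency domain can be illustrated with a minimal NumPy sketch. This is not the authors' SVFE implementation: the function name `spectral_filter`, the feature-map layout `(H, W, C)`, and the per-frequency gain array `weights` are all assumptions made for illustration. The sketch only shows the general technique of transforming a feature map with a 2-D FFT, reweighting each frequency component, and transforming back, so that every output position depends on every input position.

```python
import numpy as np


def spectral_filter(features, weights):
    """Reweight the spatial frequencies of a (H, W, C) feature map.

    features: real array of shape (H, W, C).
    weights:  complex array of shape (H, W // 2 + 1, C); a hypothetical
              stand-in for gains that a model would learn.
    """
    h, w, _ = features.shape
    # Real 2-D FFT over the two spatial axes; each frequency bin
    # aggregates information from all spatial positions.
    spectrum = np.fft.rfft2(features, axes=(0, 1))
    # Scale each frequency component independently per channel.
    spectrum = spectrum * weights
    # Inverse transform back to a real (H, W, C) feature map.
    return np.fft.irfft2(spectrum, s=(h, w), axes=(0, 1))


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))

# All-ones gains leave the feature map unchanged (up to float error).
identity = np.ones((8, 5, 4), dtype=complex)
y = spectral_filter(x, identity)

# Keeping only the zero-frequency bin yields the per-channel spatial mean
# at every position, i.e. a purely global summary of the input.
dc_only = np.zeros((8, 5, 4), dtype=complex)
dc_only[0, 0, :] = 1.0
y_global = spectral_filter(x, dc_only)
```

With all-ones gains the layer acts as an identity, and with only the zero-frequency gain kept it reduces to global average pooling. Gains between these extremes trade off global context against local detail, which is the kind of adjustment of component contributions the abstract describes in general terms.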
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.