GeoDCL: Weak Geometrical Distortion-Based Contrastive Learning for Fine-Grained Fashion Image Retrieval

Ling Xiao; Toshihiko Yamasaki
IEEE Transactions on Artificial Intelligence, vol. 6, no. 9, pp. 2409-2421. Published 2025-02-28. DOI: 10.1109/TAI.2025.3545791. https://ieeexplore.ieee.org/document/10908573/

Abstract

This article addresses fine-grained fashion image retrieval (FIR), which aims at the detailed and precise retrieval of fashion items from extensive databases. Conventional fine-grained FIR methods design complex attention modules to enhance attribute-aware feature discrimination. However, they often ignore the multiview characteristics of real-world fashion data, leading to diminished model accuracy. Furthermore, our empirical analysis revealed that the straightforward application of standard contrastive learning methods to fine-grained FIR often yields suboptimal results. To alleviate this issue, we propose a novel weak geometrical distortion-based contrastive learning (GeoDCL) strategy. Specifically, GeoDCL incorporates both a novel positive pair design and a novel contrastive loss. GeoDCL can be seamlessly integrated into state-of-the-art (SOTA) fine-grained FIR methods during the training stage to enhance performance during inference. When GeoDCL is applied, the model structures of SOTA methods require no modifications. Additionally, GeoDCL is not utilized during inference, ensuring no increase in inference time. Experiments on the FashionAI, DeepFashion, and Zappos50K datasets verified GeoDCL's effectiveness in consistently improving SOTA models. In particular, GeoDCL drastically improved ASENet_V2 from 60.76% to 66.48% in mAP on the FashionAI dataset.
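
The abstract does not spell out the distortion or the loss, so the following is only a minimal sketch of the general idea, under stated assumptions: the "weak geometrical distortion" is taken to be a small random rotation and rescaling, and a standard InfoNCE loss stands in for the paper's own contrastive loss. The function names (weak_affine, info_nce), the parameters (max_deg, max_scale, temperature), and the random linear "encoder" are all hypothetical and are not taken from the paper.

```python
import numpy as np


def weak_affine(image, rng, max_deg=5.0, max_scale=0.05):
    """Return a weakly distorted copy of `image` (H x W x C).

    Assumption: "weak" means a rotation of a few degrees plus a few percent
    of rescaling about the image centre, sampled with nearest-neighbour lookup.
    """
    h, w = image.shape[:2]
    theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    cos_t, sin_t = np.cos(theta) / scale, np.sin(theta) / scale
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for every output pixel, locate its source pixel.
    src_x = cos_t * (xs - cx) + sin_t * (ys - cy) + cx
    src_y = -sin_t * (xs - cx) + cos_t * (ys - cy) + cy
    # Out-of-range coordinates are clamped, which repeats border pixels.
    src_x = np.clip(np.rint(src_x), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(src_y), 0, h - 1).astype(int)
    return image[src_y, src_x]


def info_nce(z_a, z_b, temperature=0.1):
    """Standard InfoNCE: row i of z_a should match row i of z_b."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


rng = np.random.default_rng(0)
batch = rng.random((4, 32, 32, 3))                 # toy stand-in for fashion images
views = np.stack([weak_affine(img, rng) for img in batch])

# Hypothetical encoder: a fixed random linear projection to 16 dimensions.
proj = rng.standard_normal((32 * 32 * 3, 16))
z_a = batch.reshape(4, -1) @ proj
z_b = views.reshape(4, -1) @ proj

contrastive = info_nce(z_a, z_b)                   # training-time auxiliary term
print(f"contrastive loss on weak-distortion pairs: {contrastive:.4f}")
```

In a setup like the one the abstract describes, a term of this kind would be added to the host method's existing training loss, with the encoder left unchanged. Because the distorted views and the extra loss exist only in the training loop, inference runs exactly as before, which is consistent with the abstract's claim of no added inference time.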