Cache-aided cross-modal correlation correction for unsupervised cross-domain text-based person search

IF 7.6 | Zone 1, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Kai Niu, Qinzi Zhao, Jiahui Chen, Yanning Zhang
Journal: Pattern Recognition, vol. 172, Article 112521
Published: 2025-10-02
DOI: 10.1016/j.patcog.2025.112521
URL: https://www.sciencedirect.com/science/article/pii/S0031320325011847
Citations: 0

Abstract

Unsupervised Cross-domain Text-Based Person Search (UC-TBPS) must cope not only with modality heterogeneity but also with the cross-domain gap that arises in practical surveillance settings. However, little research has focused on this cross-domain difficulty, which may severely hinder real-world applications of TBPS. In this paper, we propose the Test-time Cache-aided Cross-modal Correlation Correction (TC⁴) method, a pioneering approach that specifically addresses the UC-TBPS task through test-time re-ranking. First, we cluster the pedestrian image gallery and construct reward and penalty caches based on the cluster centers; these caches store additional sentences that act as relays for alleviating the cross-domain problem. Second, guided by the two caches respectively, we calculate reward and penalty values to refine the image-sentence correlations at appropriately selected positions. Finally, the refined image-sentence correlations are used to re-rank the original retrieval results. As a test-time re-ranking approach, TC⁴ requires no fine-tuning in the target domain and improves retrieval performance with negligible additional overhead. Extensive experiments and analyses on UC-TBPS and on unsupervised cross-domain image-text matching validate the effectiveness and generalization capacity of the proposed TC⁴ solution.
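The three steps described in the abstract (cluster the gallery, build reward and penalty caches, correct the correlations and re-rank) can be illustrated with a toy sketch. The abstract gives no formulas, so everything below is an assumption made for illustration, not the authors' method: image and sentence features are assumed to share one embedding space, the caches are filled with the sentences nearest to and farthest from each cluster center, and the correction is a simple additive rule. The names `kmeans`, `build_caches`, `rerank`, `m`, `alpha`, and `beta` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v)) or 1.0
    return [x / n for x in v]

def kmeans(points, k, iters=10):
    """Tiny k-means on unit vectors; deterministic init from evenly spaced points."""
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to the center with the highest cosine similarity.
        labels = [max(range(k), key=lambda c: dot(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                centers[c] = normalize(mean)
    return centers, labels

def build_caches(centers, text_feats, m):
    """ASSUMED rule: per cluster, the reward cache holds the m sentences closest
    to the center and the penalty cache holds the m sentences farthest from it."""
    reward, penalty = [], []
    for c in centers:
        order = sorted(range(len(text_feats)),
                       key=lambda t: dot(text_feats[t], c), reverse=True)
        reward.append(order[:m])
        penalty.append(order[-m:])
    return reward, penalty

def rerank(img_feats, text_feats, k=2, m=1, alpha=0.5, beta=0.5):
    """Return, for each query sentence, gallery indices sorted by corrected score."""
    imgs = [normalize(v) for v in img_feats]
    txts = [normalize(v) for v in text_feats]
    centers, labels = kmeans(imgs, k)
    reward, penalty = build_caches(centers, txts, m)
    rankings = []
    for q in txts:
        scores = []
        for g, lab in zip(imgs, labels):
            base = dot(q, g)  # original image-sentence correlation
            # ASSUMED correction: mean similarity of the query to cached sentences.
            r = sum(dot(q, txts[t]) for t in reward[lab]) / len(reward[lab])
            p = sum(dot(q, txts[t]) for t in penalty[lab]) / len(penalty[lab])
            scores.append(base + alpha * r - beta * p)
        rankings.append(sorted(range(len(imgs)), key=lambda i: scores[i], reverse=True))
    return rankings

# Toy data: two groups of gallery images and one query sentence per group.
gallery = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
queries = [[1.0, 0.2], [0.2, 1.0]]
rankings = rerank(gallery, queries)
print(rankings)
```

The sketch only reorders precomputed similarities, so no model weights are touched, which is the sense in which a test-time re-ranking step needs no fine-tuning in the target domain.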
Source journal
Pattern Recognition (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months
About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.