Cache-aided cross-modal correlation correction for unsupervised cross-domain text-based person search
Authors: Kai Niu, Qinzi Zhao, Jiahui Chen, Yanning Zhang
Journal: Pattern Recognition, Volume 172, Article 112521 (Impact Factor 7.6; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2025-10-02
DOI: 10.1016/j.patcog.2025.112521
URL: https://www.sciencedirect.com/science/article/pii/S0031320325011847
Unsupervised Cross-domain Text-Based Person Search (UC-TBPS) must cope not only with modality heterogeneity but also with the cross-domain gap that arises in practical surveillance settings. However, little research has addressed this cross-domain difficulty, which may severely hinder real-world applications of TBPS. In this paper, we propose the Test-time Cache-aided Cross-modal Correlation Correction (TC⁴) method, a pioneering approach that specifically addresses the UC-TBPS task through a novel test-time re-ranking scheme. First, we cluster the pedestrian image gallery and construct reward and penalty caches from the cluster centers; these caches store additional sentences that act as relays to alleviate the cross-domain problem. Second, guided by the two caches, we compute reward and penalty values and use them to refine the image-sentence correlations at appropriately selected positions. Finally, the refined image-sentence correlations are used to re-rank the original retrieval results. As a test-time re-ranking approach, TC⁴ requires no fine-tuning in the target domain and improves retrieval performance with negligible additional overhead. Extensive experiments and analyses on UC-TBPS and on unsupervised cross-domain image-text matching validate the effectiveness and generalization capacity of the proposed TC⁴ solution.
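The three steps in the abstract (cluster the gallery, build reward and penalty caches, re-rank with corrected correlations) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the formulas, so everything below is an assumption made for illustration, including the tiny cosine k-means, the choice to fill the caches with the query sentences most and least similar to each cluster center, the additive correction `alpha * reward - beta * penalty`, and all function and parameter names. Unlike the described method, which refines only selected positions, the sketch adjusts every image-sentence pair.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v)) or 1.0
    return [x / n for x in v]

def kmeans(points, k, iters=10):
    """Tiny cosine k-means; centers start at evenly spaced points (assumption)."""
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        labels = [max(range(k), key=lambda c: dot(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = normalize([sum(col) for col in zip(*members)])
    # Final assignment against the last set of centers.
    labels = [max(range(k), key=lambda c: dot(p, centers[c])) for p in points]
    return centers, labels

def build_caches(centers, texts, m):
    """Hypothetical caches: per center, the m most similar sentences (reward)
    and the m least similar sentences (penalty)."""
    caches = []
    for center in centers:
        order = sorted(range(len(texts)),
                       key=lambda t: dot(center, texts[t]), reverse=True)
        caches.append((order[:m], order[-m:]))
    return caches

def rerank(images, texts, k=2, m=1, alpha=0.5, beta=0.5):
    """Return, for each query sentence, gallery indices sorted by corrected score."""
    images = [normalize(v) for v in images]
    texts = [normalize(v) for v in texts]
    centers, labels = kmeans(images, k)
    caches = build_caches(centers, texts, m)
    rankings = []
    for q in texts:
        scores = []
        for img, lab in zip(images, labels):
            reward_ids, penalty_ids = caches[lab]
            # Text-to-text similarity to cached sentences serves as the relay signal.
            reward = max(dot(q, texts[t]) for t in reward_ids)
            penalty = max(dot(q, texts[t]) for t in penalty_ids)
            scores.append(dot(q, img) + alpha * reward - beta * penalty)
        rankings.append(sorted(range(len(images)),
                               key=lambda i: scores[i], reverse=True))
    return rankings

# Toy embeddings: two image clusters and one query sentence aligned with each.
toy_images = [[1.0, 0.1], [1.0, 0.2], [0.1, 1.0], [0.2, 1.0]]
toy_texts = [[1.0, 0.0], [0.0, 1.0]]
print(rerank(toy_images, toy_texts, k=2, m=1))
```

The sketch relies on the intuition that sentence-to-sentence similarity is an intra-modal comparison and may therefore suffer less from the domain gap than direct image-sentence similarity; whether TC⁴ uses the caches in exactly this way is not stated in the abstract.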
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.