Tianle Hu, Yu Chen, Chuwei Cheng, Junhong Xiao, Weijun Sun, Xiaozhao Fang
Information Sciences, volume 720, Article 122532. Published 2025-07-25. DOI: 10.1016/j.ins.2025.122532
https://www.sciencedirect.com/science/article/pii/S0020025525006644
Coarse-to-fine label refinement for domain adaptive retrieval
Domain adaptive retrieval (DAR) is a promising research field. However, existing methods still suffer from two limitations: 1) they rely heavily on pseudo-labeling strategies that oversimplify the complex relationships between samples; 2) they treat labels as algorithmic outputs rather than optimizable variables, which can break natural connections between features and categories. To address these issues, we propose an approach called Coarse-to-Fine Label Refinement (CFLR). First, joint orthogonal matrix factorization is employed for two purposes: to learn an optimizable latent feature representation, and to decompose predefined coarse pseudo-labels into continuous values that can be refined. Second, a classifier is introduced to connect these two components, establishing a mutually reinforcing relationship between features and labels. This mutual enhancement captures implicit cross-category semantics by mining the iteratively updated feature information. Based on the refined labels, we develop an improved graph embedding that yields more natural cross-domain relationships. Finally, high-quality hash codes are generated by directly quantizing the refined semantics. Experiments on multiple popular cross-domain benchmark datasets demonstrate that the proposed CFLR achieves state-of-the-art performance.
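The last step of the pipeline described above, turning continuous semantics into hash codes and retrieving with them, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the quantizer, so sign thresholding (a common choice in hashing methods) is assumed here, and the function names `quantize_to_hash_codes`, `hamming_distances`, and `retrieve`, as well as the toy values, are hypothetical.

```python
import numpy as np

def quantize_to_hash_codes(values):
    """Binarize continuous values into {-1, +1} codes (assumed sign quantizer; 0 maps to +1)."""
    return np.where(np.asarray(values) >= 0, 1, -1).astype(np.int8)

def hamming_distances(query_code, database_codes):
    """Hamming distance between one code and every row of database_codes."""
    n_bits = database_codes.shape[1]
    # For codes in {-1, +1}: distance = (n_bits - inner product) / 2
    inner = database_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (n_bits - inner) // 2

def retrieve(query_code, database_codes, top_k=2):
    """Indices of the top_k database items closest to the query in Hamming space."""
    distances = hamming_distances(query_code, database_codes)
    return np.argsort(distances, kind="stable")[:top_k]

# Toy "refined semantics" for three database items (illustrative values only).
refined = np.array([[0.9, -0.4, 0.2, 0.7],
                    [-0.8, -0.1, 0.3, 0.5],
                    [-0.6, 0.2, -0.9, -0.3]])
codes = quantize_to_hash_codes(refined)
query = quantize_to_hash_codes(np.array([0.8, -0.5, 0.1, 0.6]))
ranked = retrieve(query, codes, top_k=2)
print(ranked)
```

Binary codes make retrieval cheap because ranking reduces to counting differing bits; the quality of the codes then depends entirely on how well the continuous values being quantized separate the categories, which is the part CFLR's label refinement targets.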
About the journal:
Information Sciences (Informatics and Computer Science, Intelligent Systems, Applications) is an esteemed international journal that focuses on publishing original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.