Discrete cross-modal hashing with relaxation and label semantic guidance
Shaohua Teng, Wenbiao Huang, Naiqi Wu, Guanglong Du, Tongbao Chen, Wei Zhang, Luyao Teng
World Wide Web (published 2024-01-20). DOI: 10.1007/s11280-024-01239-6
Abstract
Supervised cross-modal hashing has attracted many researchers. These studies either seek a common semantic space or directly regress the zero-one label information into the Hamming space. Although they have achieved much, they neglect some issues: 1) some methods borrowed from classification are not suitable for retrieval tasks, because they do not learn the personalized features of each sample; 2) the outcome of hash retrieval depends on both the length and the encoding method of the hash codes. Because a sample possesses more personalized features than label semantics, in this paper we propose a novel supervised cross-modal hashing collaborative learning method called discrete Cross-modal Hashing with Relaxation and Label Semantic Guidance (CHRLSG). First, we introduce two relaxation variables as latent spaces: one is used to extract text features and label semantic information collaboratively, and the other is used to extract image features and label semantics collaboratively. Second, more accurate hash codes are generated from the latent spaces, since CHRLSG learns feature semantics and label semantics collaboratively, treating labels as the dominant signal and features as the auxiliary one. Third, we utilize labels to strengthen the similarity relationship between inter-modal samples by preserving pairwise closeness; label semantics are fully exploited to avoid classification errors. Fourth, we introduce class weights to further increase the discrimination between samples of different classes within each modality while keeping the similarity of samples unchanged. Therefore, the CHRLSG model not only preserves the relationships between samples but also maintains the consistency of label semantics during collaborative optimization. Experimental results on three common benchmark datasets demonstrate that the proposed model is superior to existing state-of-the-art methods.
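The abstract does not give the exact objective function, but the general idea of relaxation-guided, label-dominated code learning can be illustrated with a minimal sketch. The sketch below is an assumption-laden simplification, not the CHRLSG algorithm itself: for one modality it learns a real-valued relaxation matrix V that jointly regresses modality features and label semantics, then binarizes V into hash codes. The names (X_img, X_txt, Y, lam, code_len), the alternating least-squares updates, and the weighting scheme are illustrative choices, not taken from the paper.

```python
import numpy as np

def learn_relaxed_codes(X, Y, code_len=32, lam=1.0, reg=1e-3, iters=20, seed=0):
    """Toy relaxation-based code learning for ONE modality.

    X : (n, d) modality features (e.g., image or text descriptors)
    Y : (n, c) zero-one label matrix
    Returns binary codes B in {-1, +1}^(n, code_len).

    Hedged illustration only: features and labels are jointly regressed
    onto a relaxed latent space V, which is then binarized. This is NOT
    the objective or the update rules of CHRLSG.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    c = Y.shape[1]
    V = rng.standard_normal((n, code_len))          # relaxed latent space
    for _ in range(iters):
        # regress features and labels onto the current latent space
        Wx = np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ V)
        Wy = np.linalg.solve(Y.T @ Y + reg * np.eye(c), Y.T @ V)
        # update the relaxation: labels dominate (weight lam), features assist
        V = (X @ Wx + lam * Y @ Wy) / (1.0 + lam)
    return np.sign(V)                               # discrete hash codes

# Toy usage: two modalities sharing the same label matrix.
rng = np.random.default_rng(1)
n, c = 200, 10
Y = np.eye(c)[rng.integers(0, c, n)]                # one-hot labels
X_img = Y @ rng.standard_normal((c, 512)) + 0.1 * rng.standard_normal((n, 512))
X_txt = Y @ rng.standard_normal((c, 300)) + 0.1 * rng.standard_normal((n, 300))
B_img = learn_relaxed_codes(X_img, Y, lam=2.0)      # label-dominated image codes
B_txt = learn_relaxed_codes(X_txt, Y, lam=2.0)      # label-dominated text codes

# Cross-modal retrieval: rank text codes against one image query.
query = 0
scores = B_txt @ B_img[query]                       # inner product ~ Hamming similarity
top = np.argsort(-scores)[:5]
print("query label:", Y[query].argmax(),
      "top-5 retrieved labels:", Y[top].argmax(axis=1))
```

In this toy setup, samples that share a label tend to receive similar codes across modalities because the label term dominates the latent-space update; the actual method additionally enforces pairwise closeness and class weights, which this sketch omits.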