Unsupervised Hashing with Contrastive Learning by Exploiting Similarity Knowledge and Hidden Structure of Data
Zhenpeng Song, Qinliang Su, Jiayang Chen
DOI: 10.1145/3581783.3612596
Published: 2023-10-26

Motivated by the strong performance of contrastive learning in representation learning, several recent works have proposed using it to learn semantically rich hash codes. However, in the absence of label information, existing contrastive hashing methods simply follow standard contrastive learning: only the augmentation of the anchor is treated as a positive, while all other samples in the batch are treated as negatives, so a large number of potential positives are ignored. Consequently, the learned hash codes tend to be dispersed across the code space, and their distances fail to accurately reflect semantic similarities. To address this issue, we propose to exploit the similarity knowledge and hidden structure of the dataset. Specifically, we first develop an intuitive self-training approach comprising two main components, a pseudo-label predictor and a hash-code improving module, which benefit from each other by consuming one another's output, in conjunction with similarity knowledge obtained from pre-trained models. We then recast this intuitive approach in a more rigorous probabilistic framework and propose CGHash, a probabilistic hashing model based on conditional generative models, which is theoretically better grounded and models the similarity knowledge and the hidden group structure more accurately. Extensive experiments on three image datasets demonstrate that CGHash significantly outperforms both the intuitive approach and existing baselines. Our code is available at https://github.com/KARLSZP/CGHash.
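The positive/negative scheme the abstract criticizes is the standard NT-Xent-style contrastive loss applied to relaxed (continuous) hash codes. The sketch below is illustrative only, not the paper's implementation; the function name, shapes, and temperature are assumptions. Note how, for each anchor, only its own augmentation indexes into the numerator, while every other code in the batch sits in the denominator as a negative, which is exactly how potential positives get ignored.

```python
import numpy as np

def contrastive_hash_loss(z_a, z_b, tau=0.5):
    """NT-Xent-style contrastive loss on relaxed hash codes (illustrative).

    z_a, z_b: (N, d) arrays holding two augmented views of the same N images.
    For each anchor, ONLY its own augmentation is the positive; the remaining
    2N - 2 codes in the batch are all treated as negatives -- the assumption
    the paper identifies as problematic when no labels are available.
    """
    z = np.concatenate([z_a, z_b], axis=0)             # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / tau                                # (2N, 2N) logits
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    n = z_a.shape[0]
    # index of each anchor's single positive (its own augmented view)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    loss = -(sim[np.arange(2 * n), pos] - logsumexp)
    return loss.mean()
```

As a sanity check, two nearly identical views of a batch should incur a lower loss than two unrelated batches, since the designated positive pair is then highly similar.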
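The intuitive self-training loop can be sketched as alternating two steps, though the sketch below is only a toy stand-in: a k-means-style assignment plays the role of the learned pseudo-label predictor, and pulling codes toward their pseudo-class mean stands in for the hash-code improving module (in the paper both components are learned, and the similarity knowledge comes from pre-trained models). All names and hyperparameters here are assumptions.

```python
import numpy as np

def self_training_step(codes, k, rng, lr=0.5):
    """One round of the mutual-improvement loop (toy illustration only).

    1) Pseudo-label step: a k-means-style assignment standing in for the
       paper's learned pseudo-label predictor.
    2) Code-improving step: pull each code toward the mean code of its
       pseudo-class, so samples sharing a pseudo-label end up close --
       a crude stand-in for the paper's hash-code improving module.
    """
    # -- pseudo-label step: assign each code to the nearest of k centroids
    centroids = codes[rng.choice(len(codes), size=k, replace=False)]
    for _ in range(10):
        d = ((codes[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = codes[labels == j].mean(axis=0)
    # -- code-improving step: move each code toward its pseudo-class mean
    codes = codes + lr * (centroids[labels] - codes)
    return codes, labels
```

Iterating this step contracts each pseudo-class toward its mean, so codes with the same pseudo-label move closer together; the two components "benefit from each other" in the sense that better codes yield cleaner pseudo-labels, and cleaner pseudo-labels yield tighter codes.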