Supervised Recurrent Hashing for Large Scale Video Retrieval
Yun Gu, Chao Ma, Jie Yang
Proceedings of the 24th ACM international conference on Multimedia, published 2016-10-01
DOI: 10.1145/2964284.2967225 (https://doi.org/10.1145/2964284.2967225)
Hashing for large-scale multimedia is a popular research topic that has attracted much attention in computer vision and visual information retrieval. Previous work mostly focuses on hashing images and text, while approaches designed for videos are limited. In this paper, we propose Supervised Recurrent Hashing (SRH), which uses the discriminative representations obtained by deep neural networks to design hashing approaches. A long short-term memory (LSTM) network is deployed to model the structure of video samples. A max-pooling mechanism is introduced to embed the frames into fixed-length representations that are fed into a supervised hashing loss. Experiments on the UCF-101 dataset demonstrate that the proposed method significantly outperforms several state-of-the-art methods.
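The pooling-and-hashing step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the LSTM is omitted and its per-frame hidden states are replaced by random vectors, the projection `weights` are random stand-ins for parameters that would be learned through the supervised hashing loss, and all function names, `state_dim`, and `num_bits` are hypothetical. The sketch only shows how max-pooling over time maps videos of different lengths to codes of the same length, which can then be compared by Hamming distance.

```python
import random


def max_pool_over_time(frame_states):
    """Element-wise max over per-frame vectors, giving one fixed-length vector."""
    return [max(column) for column in zip(*frame_states)]


def project(vector, weights):
    """Linear projection: one output value per row of `weights`."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def binarize(values):
    """Sign thresholding: 1 for non-negative values, 0 otherwise."""
    return [1 if v >= 0 else 0 for v in values]


def hamming(code_a, code_b):
    """Number of bit positions in which two codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def video_to_code(frame_states, weights):
    """Pool the frame states, project them, and threshold to a binary code."""
    return binarize(project(max_pool_over_time(frame_states), weights))


rng = random.Random(0)
state_dim, num_bits = 8, 16  # assumed sizes, chosen only for illustration

# Untrained stand-in for the learned hashing layer.
weights = [[rng.gauss(0, 1) for _ in range(state_dim)] for _ in range(num_bits)]

# Stand-ins for LSTM hidden states of two videos with different frame counts.
short_video = [[rng.gauss(0, 1) for _ in range(state_dim)] for _ in range(5)]
long_video = [[rng.gauss(0, 1) for _ in range(state_dim)] for _ in range(12)]

code_short = video_to_code(short_video, weights)
code_long = video_to_code(long_video, weights)

# Both codes have num_bits entries regardless of the number of frames.
print(len(code_short), len(code_long), hamming(code_short, code_long))
```

Because the pooled vector has `state_dim` entries no matter how many frames go in, the hashing layer always sees a fixed-length input; in a trained model, the supervised loss would shape the projection so that videos of the same class receive nearby codes.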