{"title":"基于视觉内容语义稀疏重编码的图像标注","authors":"Zhiwu Lu, Yuxin Peng","doi":"10.1145/2393347.2393418","DOIUrl":null,"url":null,"abstract":"This paper presents a new semantic sparse recoding method to generate more descriptive and robust representation of visual content for image annotation. Although the visual bag-of-words (BOW) representation has been reported to achieve promising results in image annotation, its visual codebook is completely learnt from low-level visual features using quantization techniques and thus the so-called semantic gap remains unbridgeable. To handle such challenging issue, we utilize both the annotations of training images and the predicted annotations of test images to improve the original visual BOW representation. This is further formulated as a sparse coding problem so that the noise issue induced by the inaccurate quantization of visual features can also be handled to some extent. By developing an efficient sparse coding algorithm, we successfully generate a new visual BOW representation for image annotation. Since such sparse coding has actually incorporated the high-level semantic information into the original visual codebook, we thus consider it as semantic sparse recoding of the visual content. Although the predicted annotations of test images are also used as inputs by the traditional image annotation refinement, we focus on the visual BOW representation refinement for image annotation in this paper. The experimental results on two benchmark datasets show the superior performance of our semantic sparse recoding method in image annotation.","PeriodicalId":212654,"journal":{"name":"Proceedings of the 20th ACM international conference on Multimedia","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":"{\"title\":\"Image annotation by semantic sparse recoding of visual content\",\"authors\":\"Zhiwu Lu, Yuxin Peng\",\"doi\":\"10.1145/2393347.2393418\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents a new semantic sparse recoding method to generate more descriptive and robust representation of visual content for image annotation. Although the visual bag-of-words (BOW) representation has been reported to achieve promising results in image annotation, its visual codebook is completely learnt from low-level visual features using quantization techniques and thus the so-called semantic gap remains unbridgeable. To handle such challenging issue, we utilize both the annotations of training images and the predicted annotations of test images to improve the original visual BOW representation. This is further formulated as a sparse coding problem so that the noise issue induced by the inaccurate quantization of visual features can also be handled to some extent. By developing an efficient sparse coding algorithm, we successfully generate a new visual BOW representation for image annotation. Since such sparse coding has actually incorporated the high-level semantic information into the original visual codebook, we thus consider it as semantic sparse recoding of the visual content. Although the predicted annotations of test images are also used as inputs by the traditional image annotation refinement, we focus on the visual BOW representation refinement for image annotation in this paper. The experimental results on two benchmark datasets show the superior performance of our semantic sparse recoding method in image annotation.\",\"PeriodicalId\":212654,\"journal\":{\"name\":\"Proceedings of the 20th ACM international conference on Multimedia\",\"volume\":\"67 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-10-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"20\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 20th ACM international conference on Multimedia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2393347.2393418\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 20th ACM international conference on Multimedia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2393347.2393418","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Image annotation by semantic sparse recoding of visual content
This paper presents a new semantic sparse recoding method to generate more descriptive and robust representation of visual content for image annotation. Although the visual bag-of-words (BOW) representation has been reported to achieve promising results in image annotation, its visual codebook is completely learnt from low-level visual features using quantization techniques and thus the so-called semantic gap remains unbridgeable. To handle such challenging issue, we utilize both the annotations of training images and the predicted annotations of test images to improve the original visual BOW representation. This is further formulated as a sparse coding problem so that the noise issue induced by the inaccurate quantization of visual features can also be handled to some extent. By developing an efficient sparse coding algorithm, we successfully generate a new visual BOW representation for image annotation. Since such sparse coding has actually incorporated the high-level semantic information into the original visual codebook, we thus consider it as semantic sparse recoding of the visual content. Although the predicted annotations of test images are also used as inputs by the traditional image annotation refinement, we focus on the visual BOW representation refinement for image annotation in this paper. The experimental results on two benchmark datasets show the superior performance of our semantic sparse recoding method in image annotation.