{"title":"超越概念检测:图像检索中用户意图的潜力","authors":"Bo Wang, M. Larson","doi":"10.1145/3132515.3132521","DOIUrl":null,"url":null,"abstract":"Behind each photographic act is a rationale that impacts the visual appearance of the resulting photo. Better understanding of this rationale has great potential to support image retrieval systems in serving user needs. However, at present, surprisingly little is known about the connection between what a picture shows (the literally depicted conceptual content) and why that picture was taken (the photographer intent). In this paper, we investigate photographer intent in a large Flickr data set. First, an expert annotator carries out a large number of iterative intent judgments to create a taxonomy of intent classes. Next, analysis of the distribution of concepts and intent classes reveals patterns of independence both at a global and user level. Finally, we report the results of experiments showing that a deep neural network classifier is capable of learning to differentiate between these intent classes, and that these classes support the diversification of image search results.","PeriodicalId":395519,"journal":{"name":"Proceedings of the Workshop on Multimodal Understanding of Social, Affective and Subjective Attributes","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Beyond Concept Detection: The Potential of User Intent for Image Retrieval\",\"authors\":\"Bo Wang, M. Larson\",\"doi\":\"10.1145/3132515.3132521\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Behind each photographic act is a rationale that impacts the visual appearance of the resulting photo. Better understanding of this rationale has great potential to support image retrieval systems in serving user needs. However, at present, surprisingly little is known about the connection between what a picture shows (the literally depicted conceptual content) and why that picture was taken (the photographer intent). In this paper, we investigate photographer intent in a large Flickr data set. First, an expert annotator carries out a large number of iterative intent judgments to create a taxonomy of intent classes. Next, analysis of the distribution of concepts and intent classes reveals patterns of independence both at a global and user level. 
Finally, we report the results of experiments showing that a deep neural network classifier is capable of learning to differentiate between these intent classes, and that these classes support the diversification of image search results.\",\"PeriodicalId\":395519,\"journal\":{\"name\":\"Proceedings of the Workshop on Multimodal Understanding of Social, Affective and Subjective Attributes\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-10-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Workshop on Multimodal Understanding of Social, Affective and Subjective Attributes\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3132515.3132521\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Workshop on Multimodal Understanding of Social, Affective and Subjective Attributes","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3132515.3132521","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Behind each photographic act is a rationale that shapes the visual appearance of the resulting photo. A better understanding of this rationale has great potential to help image retrieval systems serve user needs. At present, however, surprisingly little is known about the connection between what a picture shows (the conceptual content it literally depicts) and why that picture was taken (the photographer's intent). In this paper, we investigate photographer intent in a large Flickr data set. First, an expert annotator carries out a large number of iterative intent judgments to create a taxonomy of intent classes. Next, analysis of the distribution of concepts and intent classes reveals patterns of independence at both the global and the user level. Finally, we report experimental results showing that a deep neural network classifier can learn to differentiate between these intent classes, and that these classes support the diversification of image search results.
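To illustrate the kind of classifier the abstract refers to, the sketch below fine-tunes an ImageNet-pretrained CNN to predict intent classes for images. It is a minimal sketch, not the authors' implementation: the backbone choice (ResNet-50), the number of intent classes, and the hyperparameters are assumptions, since the abstract does not specify them.

```python
# Minimal sketch (not the authors' code): fine-tune a pretrained CNN to
# predict photographer-intent classes. NUM_INTENT_CLASSES, the ResNet-50
# backbone, and the hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

NUM_INTENT_CLASSES = 8  # hypothetical size of the intent taxonomy

# Replace the ImageNet classification head with one output per intent class.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_INTENT_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images: torch.Tensor, intent_labels: torch.Tensor) -> float:
    """Run one optimization step on a batch of images and intent labels."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)                  # shape: (batch, NUM_INTENT_CLASSES)
    loss = criterion(logits, intent_labels)
    loss.backward()
    optimizer.step()
    return loss.item()

def predict_intent(images: torch.Tensor) -> torch.Tensor:
    """Return the predicted intent class index for each image in the batch."""
    model.eval()
    with torch.no_grad():
        return model(images).argmax(dim=1)
```

Predicted intent labels of this kind could then be used to diversify a ranked result list, for example by interleaving the top-ranked images across intent classes; this is one plausible reading of the diversification claim, not a description of the paper's exact method.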