{"title":"基于多模态特征嵌入的社交字体搜索","authors":"Saemi Choi, Shun Matsumura, K. Aizawa","doi":"10.1145/3338533.3366595","DOIUrl":null,"url":null,"abstract":"A typical tag/keyword-based search system retrieves documents where, given a query term q, the query term q occurs in the dataset. However, when applying these systems to a real-world font web community setting, practical challenges arise --- font tags are more subjective than other benchmark datasets, which magnify the tag mismatch problem. To address these challenges, we propose a tag dictionary space leveraged by word embedding, which relates undefined words that have a similar meaning. Even if a query is not defined in the tag dictionary, we can represent it as a vector on the tag dictionary space. The proposed system facilitates multi-modal inputs that can use both textual and image queries. By integrating a visual sentiment concept model that classifies affective concepts as adjective--noun pairs for a given image and uses it as a query, users can interact with the search system in a multi-modal way. We used crowd sourcing to collect user ratings for the retrieved fonts and observed that the retrieved font with the proposed methods obtained a higher score compared to other methods.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Social Font Search by Multimodal Feature Embedding\",\"authors\":\"Saemi Choi, Shun Matsumura, K. Aizawa\",\"doi\":\"10.1145/3338533.3366595\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A typical tag/keyword-based search system retrieves documents where, given a query term q, the query term q occurs in the dataset. However, when applying these systems to a real-world font web community setting, practical challenges arise --- font tags are more subjective than other benchmark datasets, which magnify the tag mismatch problem. To address these challenges, we propose a tag dictionary space leveraged by word embedding, which relates undefined words that have a similar meaning. Even if a query is not defined in the tag dictionary, we can represent it as a vector on the tag dictionary space. The proposed system facilitates multi-modal inputs that can use both textual and image queries. By integrating a visual sentiment concept model that classifies affective concepts as adjective--noun pairs for a given image and uses it as a query, users can interact with the search system in a multi-modal way. 
We used crowd sourcing to collect user ratings for the retrieved fonts and observed that the retrieved font with the proposed methods obtained a higher score compared to other methods.\",\"PeriodicalId\":273086,\"journal\":{\"name\":\"Proceedings of the ACM Multimedia Asia\",\"volume\":\"33 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ACM Multimedia Asia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3338533.3366595\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Multimedia Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3338533.3366595","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Social Font Search by Multimodal Feature Embedding
A typical tag- or keyword-based search system retrieves the documents in which a given query term q occurs. However, applying such systems to a real-world font web community raises practical challenges: font tags are more subjective than those in other benchmark datasets, which magnifies the tag-mismatch problem. To address these challenges, we propose a tag dictionary space built on word embeddings, which relates undefined words of similar meaning to the defined tags. Even if a query term is not defined in the tag dictionary, it can still be represented as a vector in the tag dictionary space. The proposed system accepts multimodal input, supporting both textual and image queries. By integrating a visual sentiment concept model that classifies the affective content of a given image into adjective-noun pairs and uses them as a query, users can interact with the search system in a multimodal way. We used crowdsourcing to collect user ratings for the retrieved fonts and observed that fonts retrieved by the proposed method received higher ratings than those retrieved by competing methods.
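The core mechanism, projecting an out-of-dictionary query term into the tag dictionary space via word-embedding similarity, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy EMBEDDINGS table, the TAG_DICTIONARY list, and the project_query helper are hypothetical stand-ins for a pretrained embedding model (e.g., word2vec or GloVe, which the abstract does not specify) and the community's real tag vocabulary.

```python
import numpy as np

# Toy word-embedding table standing in for a pretrained embedding model.
# The vectors are hypothetical 3-d values chosen only for illustration.
EMBEDDINGS = {
    "elegant":  np.array([0.9, 0.1, 0.0]),
    "graceful": np.array([0.8, 0.2, 0.1]),
    "bold":     np.array([0.0, 0.9, 0.3]),
    "strong":   np.array([0.1, 0.8, 0.4]),
    "playful":  np.array([0.2, 0.1, 0.9]),
}

# Tags actually defined in the font community's tag dictionary.
TAG_DICTIONARY = ["elegant", "bold", "playful"]

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def project_query(query, k=2):
    """Represent a (possibly undefined) query term by its similarity
    to every tag in the dictionary, and return the top-k tags."""
    q = EMBEDDINGS[query]
    scored = [(tag, cosine(q, EMBEDDINGS[tag])) for tag in TAG_DICTIONARY]
    return sorted(scored, key=lambda t: -t[1])[:k]

# "graceful" is not a defined tag, but it lands near "elegant" in the
# embedding space, so it can still drive tag-based font retrieval.
print(project_query("graceful"))
```

In this sketch the undefined query "graceful" is matched to its nearest defined tag, "elegant", which is the kind of tag-mismatch bridging the tag dictionary space is meant to provide.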