{"title":"Social Font Search by Multimodal Feature Embedding","authors":"Saemi Choi, Shun Matsumura, K. Aizawa","doi":"10.1145/3338533.3366595","DOIUrl":null,"url":null,"abstract":"A typical tag/keyword-based search system retrieves documents where, given a query term q, the query term q occurs in the dataset. However, when applying these systems to a real-world font web community setting, practical challenges arise --- font tags are more subjective than other benchmark datasets, which magnify the tag mismatch problem. To address these challenges, we propose a tag dictionary space leveraged by word embedding, which relates undefined words that have a similar meaning. Even if a query is not defined in the tag dictionary, we can represent it as a vector on the tag dictionary space. The proposed system facilitates multi-modal inputs that can use both textual and image queries. By integrating a visual sentiment concept model that classifies affective concepts as adjective--noun pairs for a given image and uses it as a query, users can interact with the search system in a multi-modal way. We used crowd sourcing to collect user ratings for the retrieved fonts and observed that the retrieved font with the proposed methods obtained a higher score compared to other methods.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Multimedia Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3338533.3366595","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
A typical tag/keyword-based search system retrieves the documents in which a given query term q occurs. However, when applying such systems to a real-world font web community, practical challenges arise: font tags are more subjective than those in other benchmark datasets, which magnifies the tag mismatch problem. To address these challenges, we propose a tag dictionary space built on word embeddings, which relates words outside the dictionary to tags with similar meanings. Even if a query is not defined in the tag dictionary, we can represent it as a vector in the tag dictionary space. The proposed system supports multimodal input, accepting both textual and image queries. By integrating a visual sentiment concept model that classifies the affective concepts of a given image as adjective-noun pairs and uses them as a query, users can interact with the search system in a multimodal way. We used crowdsourcing to collect user ratings for the retrieved fonts and observed that fonts retrieved with the proposed method obtained higher scores than those retrieved by other methods.
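To illustrate the core idea of the tag dictionary space, here is a minimal sketch assuming pretrained word2vec embeddings loaded via gensim; the tag list, file path, and function name are hypothetical illustrations, not the authors' implementation. It shows how a query word absent from the tag dictionary can still be represented as a vector of similarities to the dictionary tags.

```python
# Minimal sketch of mapping an arbitrary query word into a "tag dictionary
# space" via pretrained word embeddings. Assumptions: gensim-format word2vec
# vectors on disk, and a small hypothetical tag dictionary.
import numpy as np
from gensim.models import KeyedVectors

# Hypothetical tag dictionary; real systems would use the community's tags.
TAGS = ["elegant", "bold", "playful", "retro", "handwritten"]

# Assumed embedding file (e.g., the public GoogleNews word2vec vectors).
wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

def query_to_tag_space(query: str) -> np.ndarray:
    """Represent a query word as cosine similarities to each dictionary tag.

    Even if `query` is not itself a tag, its word embedding relates it to
    tags with similar meanings, yielding coordinates in the tag space.
    """
    q = wv[query]  # raises KeyError if the word is outside the embedding vocab
    tag_vecs = np.stack([wv[t] for t in TAGS])
    # Cosine similarity between the query vector and every tag vector.
    sims = tag_vecs @ q / (np.linalg.norm(tag_vecs, axis=1) * np.linalg.norm(q))
    return sims

# Example: "luxurious" is not in TAGS, but should land near "elegant".
print(dict(zip(TAGS, query_to_tag_space("luxurious").round(3))))
```

In this sketch, an image query would enter the same space by first predicting adjective-noun pairs with a visual sentiment concept model and then embedding those words as above.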