{"title":"Social Font Search by Multimodal Feature Embedding","authors":"Saemi Choi, Shun Matsumura, K. Aizawa","doi":"10.1145/3338533.3366595","DOIUrl":null,"url":null,"abstract":"A typical tag/keyword-based search system retrieves documents where, given a query term q, the query term q occurs in the dataset. However, when applying these systems to a real-world font web community setting, practical challenges arise --- font tags are more subjective than other benchmark datasets, which magnify the tag mismatch problem. To address these challenges, we propose a tag dictionary space leveraged by word embedding, which relates undefined words that have a similar meaning. Even if a query is not defined in the tag dictionary, we can represent it as a vector on the tag dictionary space. The proposed system facilitates multi-modal inputs that can use both textual and image queries. By integrating a visual sentiment concept model that classifies affective concepts as adjective--noun pairs for a given image and uses it as a query, users can interact with the search system in a multi-modal way. We used crowd sourcing to collect user ratings for the retrieved fonts and observed that the retrieved font with the proposed methods obtained a higher score compared to other methods.","PeriodicalId":273086,"journal":{"name":"Proceedings of the ACM Multimedia Asia","volume":"33 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Multimedia Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3338533.3366595","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
A typical tag/keyword-based search system retrieves the documents in which a given query term q occurs. However, when applying such systems to a real-world font web community, practical challenges arise: font tags are more subjective than those in other benchmark datasets, which magnifies the tag mismatch problem. To address these challenges, we propose a tag dictionary space built on word embeddings, which relates words outside the dictionary to tags with similar meanings. Even if a query is not defined in the tag dictionary, we can represent it as a vector in the tag dictionary space. The proposed system supports multimodal input, accepting both textual and image queries. By integrating a visual sentiment concept model that classifies the affective concepts of a given image as adjective-noun pairs and uses them as a query, users can interact with the search system in a multimodal way. We used crowdsourcing to collect user ratings for the retrieved fonts and observed that fonts retrieved with the proposed method obtained higher scores than those retrieved by other methods.
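To illustrate the core idea of the tag dictionary space, here is a minimal sketch assuming pretrained word2vec embeddings loaded via gensim; the tag list, file path, and function name are hypothetical illustrations, not the authors' implementation. It shows how a query word absent from the tag dictionary can still be represented as a vector of similarities to the dictionary tags.

```python
# Minimal sketch of mapping an arbitrary query word into a "tag dictionary
# space" via pretrained word embeddings. Assumptions: gensim-format word2vec
# vectors on disk, and a small hypothetical tag dictionary.
import numpy as np
from gensim.models import KeyedVectors

# Hypothetical tag dictionary; real systems would use the community's tags.
TAGS = ["elegant", "bold", "playful", "retro", "handwritten"]

# Assumed embedding file (e.g., the public GoogleNews word2vec vectors).
wv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True
)

def query_to_tag_space(query: str) -> np.ndarray:
    """Represent a query word as cosine similarities to each dictionary tag.

    Even if `query` is not itself a tag, its word embedding relates it to
    tags with similar meanings, yielding coordinates in the tag space.
    """
    q = wv[query]  # raises KeyError if the word is outside the embedding vocab
    tag_vecs = np.stack([wv[t] for t in TAGS])
    # Cosine similarity between the query vector and every tag vector.
    sims = tag_vecs @ q / (np.linalg.norm(tag_vecs, axis=1) * np.linalg.norm(q))
    return sims

# Example: "luxurious" is not in TAGS, but should land near "elegant".
print(dict(zip(TAGS, query_to_tag_space("luxurious").round(3))))
```

In this sketch, an image query would enter the same space by first predicting adjective-noun pairs with a visual sentiment concept model and then embedding those words as above.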