Learning Semantic Correlation of Web Images and Text with Mixture of Local Linear Mappings

Proceedings of the 23rd ACM international conference on Multimedia. Pub Date: 2015-10-13. DOI: 10.1145/2733373.2806331

Youtian Du, Kai Yang

Citations: 4

Abstract

This paper proposes a new approach, mixture of local linear mappings (MLLM), for modeling the semantic correlation between web images and text. We assume that nearby examples generally represent a single concept and can therefore be mapped locally, by a linear transformation, into the feature space of the other modality. Accordingly, instead of a single, more complex nonlinear transformation, we use a mixture of local linear transformations, each component confined to a finite local region by a neighborhood model. To handle the sparseness of the data representation, we impose sparseness and non-negativity constraints on the approach. MLLM offers good interpretability because of its explicit closed form and concept-related local components, and it avoids having to determine the model capacity, which nonlinear transformations often require. Experimental results demonstrate the effectiveness of the proposed approach.
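The idea of a mixture of local linear maps can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the abstract does not give the objective or the optimizer, so the Gaussian neighborhood weights, the fixed component `centers`, the `bandwidth` and `l1` parameters, and the proximal-gradient solver are all assumptions made for illustration, as are the function names.

```python
import numpy as np

def responsibilities(X, centers, bandwidth=1.0):
    """Soft neighborhood weights (assumed Gaussian): how strongly each
    example belongs to each local component. Rows sum to 1."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Subtract the row minimum before exponentiating to avoid underflow.
    R = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * bandwidth ** 2))
    return R / R.sum(axis=1, keepdims=True)

def fit_local_linear_mixture(X, Y, centers, bandwidth=1.0, l1=0.01, steps=500):
    """Fit one sparse, non-negative linear map per local component.

    X: (n, d_x) features of one modality; Y: (n, d_y) features of the other.
    centers: (k, d_x) component centers (hypothetical; e.g. from clustering).
    Returns an array of shape (k, d_x, d_y).
    """
    R = responsibilities(X, centers, bandwidth)
    maps = []
    for k in range(centers.shape[0]):
        Xw = X * R[:, k:k + 1]          # weight each example by its responsibility
        G = Xw.T @ X                    # weighted Gram matrix, (d_x, d_x)
        B = Xw.T @ Y                    # weighted cross term, (d_x, d_y)
        # Step size 1/L, where L bounds the curvature of the weighted squared loss.
        L = max(np.linalg.eigvalsh(G)[-1], 1e-8)
        W = np.zeros((X.shape[1], Y.shape[1]))
        for _ in range(steps):
            grad = G @ W - B
            # Proximal step for an L1 penalty combined with W >= 0.
            W = np.maximum(W - (grad + l1) / L, 0.0)
        maps.append(W)
    return np.stack(maps)

def predict(X, centers, maps, bandwidth=1.0):
    """Blend the per-component linear predictions by the neighborhood weights."""
    R = responsibilities(X, centers, bandwidth)
    return np.einsum('nk,nd,kde->ne', R, X, maps)
```

In this sketch each component only sees the examples near its center, so every learned matrix can be inspected on its own, which loosely mirrors the interpretability argument in the abstract. Choosing the number of components and the bandwidth is left open here.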
