Fast Multi-Modal Unified Sparse Representation Learning

Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval
Pub Date: 2017-06-06
DOI: 10.1145/3078971.3079040

Mridula Verma, K. K. Shukla

Abstract

Exploiting feature sets from different modalities can significantly improve recognition accuracy. Learning a unified representation of an object from its representations in different modalities (e.g., image, text, audio) is a well-studied problem in the multimedia retrieval literature. In this paper, we introduce a new iterative algorithm that learns a sparse unified representation with better accuracy and in fewer iterations than previously reported results. Our algorithm employs a new fixed-point iterative scheme together with an inertial step. To obtain a more discriminative representation, we also impose a regularization term that uses the label information from the datasets. Experimental results on two real benchmark datasets demonstrate the efficacy of our method in terms of both the number of iterations and accuracy.
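The abstract does not spell out the fixed-point scheme, so the following is only an illustrative sketch of the general idea of combining a proximal (soft-thresholding) iteration with an inertial step. It uses a standard inertial proximal-gradient update on a single-modality lasso problem, min 0.5*||Ax - b||^2 + lam*||x||_1, and is not the authors' algorithm. The function names (`soft_threshold`, `inertial_ista`) and the parameters `step` and `beta` are assumptions made for this example; it also omits the multi-modal coupling and the label-based regularizer mentioned above.

```python
# Illustrative only: a generic inertial proximal-gradient iteration for the lasso,
# NOT the method proposed in the paper. All names here are hypothetical.

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    out = []
    for x in v:
        if x > t:
            out.append(x - t)
        elif x < -t:
            out.append(x + t)
        else:
            out.append(0.0)
    return out


def matvec(A, x):
    """Compute A @ x for a list-of-rows matrix A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A.T @ r for a list-of-rows matrix A."""
    n = len(A[0])
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(n)]


def inertial_ista(A, b, lam, step, beta=0.3, iters=100):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with an inertial step.

    step is assumed to be at most 1/L, where L is the largest eigenvalue
    of A.T @ A; beta is an assumed inertial weight in [0, 1).
    """
    n = len(A[0])
    x_prev = [0.0] * n
    x = [0.0] * n
    for _ in range(iters):
        # Inertial step: extrapolate along the previous direction of travel.
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]
        # Gradient of the smooth term at the extrapolated point.
        residual = [ri - bi for ri, bi in zip(matvec(A, y), b)]
        grad = rmatvec(A, residual)
        # Forward (gradient) step followed by backward (proximal) step.
        forward = [yi - step * gi for yi, gi in zip(y, grad)]
        x_prev, x = x, soft_threshold(forward, step * lam)
    return x


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(inertial_ista(identity, [3.0, 0.5, -2.0], lam=1.0, step=1.0))
```

With an identity matrix the lasso solution is simply the soft-thresholded target, which makes the sketch easy to check by hand: the small middle entry is driven to exactly zero, which is the sparsity effect the abstract refers to.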
