Structural fusion of heterogeneous visual-auditory features for multimedia analysis

2013 10th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD). Pub Date: 2013-07-23. DOI: 10.1109/FSKD.2013.6816307

Hong Zhang, Ji-kang Nie, Li Chen

Citations: 0

Abstract

It is interesting and challenging to learn the underlying semantics from multimodal data, in which each modality makes its own contribution to high-level semantics. However, multimodal data are usually represented with heterogeneous features, so it is difficult to learn a semantic subspace in which multimodal correlation is captured and preserved. In this paper, we analyze sparse canonical correlation for multimodal data in heterogeneous feature dimension reduction. Moreover, we propose a subspace optimization strategy with structural multi-feature fusion, which fuses the structural content correlation learning result and the graph-based semantic correlation learning result into an objective function. Our algorithm has been applied to content-based multimedia applications, including image classification and multimedia retrieval. Comprehensive experiments demonstrate the superiority of our method over several existing algorithms.
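For readers unfamiliar with canonical correlation, the following is a minimal sketch of plain, ridge-regularized canonical correlation analysis (CCA) on two heterogeneous feature sets. It is not the sparse variant analyzed in the paper and not the proposed structural fusion method; the function name `cca`, the `reg` parameter, and the synthetic "visual" and "auditory" features are all assumptions made for illustration.

```python
import numpy as np

def cca(X, Y, n_components=2, reg=1e-3):
    """Plain (non-sparse) regularized CCA; illustrative only, not the paper's method."""
    # Center each modality.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = Xc.shape[0]
    # Covariance blocks; `reg` (hypothetical ridge term) keeps them invertible.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(Xc.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Yc.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}.
    A = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, A.T).T
    # Singular values of M are the (regularized) canonical correlations.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return Wx, Wy, s[:n_components]

# Synthetic stand-ins for two modalities sharing a 2-D latent "semantic" signal.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 2))
X = z @ rng.normal(size=(2, 20)) + 0.1 * rng.normal(size=(500, 20))  # e.g. visual features
Y = z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(500, 8))    # e.g. auditory features

Wx, Wy, corr = cca(X, Y)
# Project both modalities into the shared 2-D subspace.
Xs = (X - X.mean(axis=0)) @ Wx
Ys = (Y - Y.mean(axis=0)) @ Wy
print(corr)
```

The projections `Xs` and `Ys` live in a common low-dimensional space where the two feature sets are maximally correlated, which is the kind of subspace the abstract refers to. A sparse variant would additionally penalize the projection vectors so that only a few input features contribute to each component.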


