Structural fusion of heterogeneous visual-auditory features for multimedia analysis

Hong Zhang, Ji-kang Nie, Li Chen
2013 10th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), published 2013-07-23
DOI: 10.1109/FSKD.2013.6816307 (https://doi.org/10.1109/FSKD.2013.6816307)

Abstract

Learning the underlying semantics of multimodal data is both interesting and challenging, because each modality makes its own contribution to the high-level semantics. Multimodal data, however, are usually represented with heterogeneous features, which makes it difficult to learn a semantic subspace in which multimodal correlation is captured and preserved. In this paper, we analyze sparse canonical correlation for multimodal data as a means of heterogeneous feature dimension reduction. We further propose a subspace optimization strategy based on structural multi-feature fusion, which fuses the result of structural content correlation learning and the result of graph-based semantic correlation learning into a single objective function. Our algorithm has been applied to content-based multimedia applications, including image classification and multimedia retrieval. Comprehensive experiments demonstrate the superiority of our method over several existing algorithms.
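
As a rough illustration of the canonical-correlation idea the abstract builds on, the sketch below projects two heterogeneous feature sets (say, visual and auditory) into a shared low-dimensional subspace. It is classical, non-sparse CCA with a small ridge term, not the sparse variant or the structural fusion objective the paper proposes. The function name `cca`, the parameter `reg`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def cca(X, Y, k, reg=1e-3):
    """Classical (non-sparse) CCA via whitening and SVD.

    X: (n, dx) features of one modality; Y: (n, dy) features of the other.
    Returns projections Wx (dx, k), Wy (dy, k) and the top-k correlations.
    `reg` is a small ridge term (an assumption here) that keeps the
    covariance matrices invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the
    # (regularized) canonical correlations, in descending order.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    return Kx @ U[:, :k], Ky @ Vt.T[:, :k], s[:k]

# Synthetic stand-in for paired multimodal data: both "modalities"
# are noisy linear views of the same 2-D latent signal.
rng = np.random.default_rng(0)
z = rng.standard_normal((200, 2))
X = z @ rng.standard_normal((2, 5)) + 0.1 * rng.standard_normal((200, 5))
Y = z @ rng.standard_normal((2, 3)) + 0.1 * rng.standard_normal((200, 3))

Wx, Wy, rho = cca(X, Y, k=2)
Zx = (X - X.mean(axis=0)) @ Wx  # modality 1 in the shared subspace
Zy = (Y - Y.mean(axis=0)) @ Wy  # modality 2 in the shared subspace
```

After projection, `Zx` and `Zy` live in the same 2-dimensional space, so samples from different modalities can be compared directly. A sparse variant would additionally penalize the projection vectors (for example with an L1 term) so that each direction uses only a few original features.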