Multimodal Representation Learning: Advances, Trends and Challenges

Sufang Zhang, Jun-Hai Zhai, Bo-Jun Xie, Yan Zhan, Xin Wang
{"title":"Multimodal Representation Learning: Advances, Trends and Challenges","authors":"Sufang Zhang, Jun-Hai Zhai, Bo-Jun Xie, Yan Zhan, Xin Wang","doi":"10.1109/ICMLC48188.2019.8949228","DOIUrl":null,"url":null,"abstract":"Representation learning is the base and crucial for consequential tasks, such as classification, regression, and recognition. The goal of representation learning is to automatically learning good features with deep models. Multimodal representation learning is a special representation learning, which automatically learns good features from multiple modalities, and these modalities are not independent, there are correlations and associations among modalities. Furthermore, multimodal data are usually heterogeneous. Due to the characteristics, multimodal representation learning poses many difficulties: how to combine multimodal data from heterogeneous sources; how to jointly learning features from multimodal data; how to effectively describe the correlations and associations, etc. These difficulties triggered great interest of researchers along with the upsurge of deep learning, many deep multimodal learning methods have been proposed by different researchers. In this paper, we present an overview of deep multimodal learning, especially the approaches proposed within the last decades. We provide potential readers with advances, trends and challenges, which can be very helpful to researchers in the field of machine, especially for the ones engaging in the study of multimodal deep machine learning.","PeriodicalId":221349,"journal":{"name":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Machine Learning and Cybernetics (ICMLC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLC48188.2019.8949228","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

Representation learning is the base and crucial for consequential tasks, such as classification, regression, and recognition. The goal of representation learning is to automatically learning good features with deep models. Multimodal representation learning is a special representation learning, which automatically learns good features from multiple modalities, and these modalities are not independent, there are correlations and associations among modalities. Furthermore, multimodal data are usually heterogeneous. Due to the characteristics, multimodal representation learning poses many difficulties: how to combine multimodal data from heterogeneous sources; how to jointly learning features from multimodal data; how to effectively describe the correlations and associations, etc. These difficulties triggered great interest of researchers along with the upsurge of deep learning, many deep multimodal learning methods have been proposed by different researchers. In this paper, we present an overview of deep multimodal learning, especially the approaches proposed within the last decades. We provide potential readers with advances, trends and challenges, which can be very helpful to researchers in the field of machine, especially for the ones engaging in the study of multimodal deep machine learning.
多模态表示学习:进展、趋势和挑战
表征学习是分类、回归和识别等后续任务的基础和关键。表示学习的目标是利用深度模型自动学习好的特征。多模态表征学习是一种特殊的表征学习,它自动地从多个模态中学习到好的特征,并且这些模态之间不是独立的,而是相互关联的。此外,多模态数据通常是异构的。由于这些特点,多模态表示学习带来了许多困难:如何组合来自异构源的多模态数据;如何从多模态数据中联合学习特征;如何有效地描述相关性和关联性等。随着深度学习的兴起,这些困难引起了研究者的极大兴趣,不同的研究者提出了许多深度多模态学习方法。在本文中,我们介绍了深度多模态学习的概述,特别是在过去的几十年里提出的方法。我们为潜在的读者提供了一些进展、趋势和挑战,这对机器领域的研究人员,特别是从事多模态深度机器学习研究的研究人员非常有帮助。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信