SMURF: Statistical Modality Uniqueness and Redundancy Factorization.

Torsten Wörtwein, Nicholas B Allen, Jeffrey F Cohn, Louis-Philippe Morency
Proceedings of the ... ACM International Conference on Multimodal Interaction. ICMI (Conference), vol. 2024, pp. 339-349. Published 2024-01-01 (Epub 2024/11/4).
DOI: 10.1145/3678957.3685716 (https://doi.org/10.1145/3678957.3685716)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11637459/pdf/

Abstract

Multimodal late fusion is a well-performing fusion method that sums the outputs of separately processed modalities, so-called modality contributions, to create a prediction; for example, summing contributions from vision, acoustic, and language to predict affective states. In this paper, our primary goal is to improve the interpretability of what modalities contribute to the prediction in late fusion models. More specifically, we want to factorize modality contributions into what is consistently shared by at least two modalities (pairwise redundant contributions) and what the remaining modality-specific contributions are (unique contributions). Our secondary goal is to improve robustness to missing modalities by encouraging the model to learn redundant contributions. To achieve our two goals, we propose SMURF (Statistical Modality Uniqueness and Redundancy Factorization), a late fusion method that factorizes its outputs into a) unique contributions that are uncorrelated with all other modalities and b) pairwise redundant contributions that are maximally correlated between two modalities. For our primary goal, we 1) verify SMURF's factorization on a synthetic dataset, 2) ensure that its factorization does not degrade predictive performance on eight affective datasets, and 3) observe significant relationships between its factorization and human judgments on three datasets. For our secondary goal, we demonstrate that SMURF leads to more robustness to missing modalities at test time compared to three late fusion baselines.
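
The two statistical properties named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the contribution names, the synthetic Gaussian data, and the helper functions late_fusion and pearson are all assumptions made for illustration. It only shows that a late fusion prediction is the sum of modality contributions, and how the correlation between contributions separates a redundant pair (highly correlated) from unique contributions (approximately uncorrelated).

```python
import numpy as np


def late_fusion(contributions):
    """Late fusion: the prediction is the sum of all modality contributions."""
    return np.sum(np.stack(list(contributions.values()), axis=0), axis=0)


def pearson(a, b):
    """Sample Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))


rng = np.random.default_rng(0)
n = 10_000

# Hypothetical signal that both vision and language can observe.
shared = rng.normal(size=n)

# Toy contributions (assumed, not learned by a model):
# - unique_* are drawn independently, so they are roughly uncorrelated;
# - redundant_* are two noisy views of the same shared signal.
contributions = {
    "unique_vision": rng.normal(size=n),
    "unique_acoustic": rng.normal(size=n),
    "unique_language": rng.normal(size=n),
    "redundant_vision": shared + 0.1 * rng.normal(size=n),
    "redundant_language": shared + 0.1 * rng.normal(size=n),
}

prediction = late_fusion(contributions)

r_redundant = pearson(
    contributions["redundant_vision"], contributions["redundant_language"]
)
r_unique = pearson(
    contributions["unique_vision"], contributions["unique_language"]
)

print(f"redundant pair correlation: {r_redundant:.2f}")
print(f"unique pair correlation:    {r_unique:.2f}")
```

In this sketch the correlations are properties of how the toy data was generated. In the paper, by contrast, the factorization is something the model is encouraged to learn; how SMURF enforces it is not described in the abstract and is not reproduced here.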
