Multimodal recommender system based on multi-channel counterfactual learning networks

IF 4.3 · Zone 3 (Materials Science) · Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Hong Fang, Leiyuxin Sha, Jindong Liang
DOI: 10.1007/s00530-024-01448-z (https://doi.org/10.1007/s00530-024-01448-z)
Journal: ACS Applied Electronic Materials. Published: 2024-08-13. Type: Journal Article.
Citations: 0

Abstract

Most multimodal recommender systems use the multimodal content of items a user has interacted with as supplementary information, capturing preferences from historical interactions without considering the items the user has not interacted with. In contrast, multimodal recommender systems based on counterfactual learning from causal inference use the causal difference between the multimodal content of user-interacted and user-uninteracted items to purify the content related to user preferences. However, existing methods adopt a single unified multimodal channel that treats every modality equally, so they cannot distinguish a user's tastes across modalities, and differences in how users attend to and perceive the content of each modality go unreflected. To address this issue, this paper proposes a novel recommender system based on multi-channel counterfactual learning (MCCL) networks that captures fine-grained user preferences for different modalities. First, two independent channels, one for image content and one for text content, are built on the corresponding features to perform modality-specific feature extraction. Then, drawing on the counterfactual theory of causal inference, features in each channel that are unrelated to user preferences are eliminated using the features of user-uninteracted items, while features related to user preferences are enhanced; multimodal user preferences are thus modeled at the content level, portraying the user's taste for each modality of an item. Finally, semantic entities are extracted to model semantic-level multimodal user preferences, which are fused with historical user interaction information and content-level user preferences to produce recommendations. Extensive experiments on three datasets show that the proposed method improves NDCG by up to 4.17% over the best-performing competing model.
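The abstract reports its gain in NDCG without stating the cutoff or the gain formulation. As a rough illustration only, the sketch below computes NDCG@k for one user under assumptions that are not taken from the paper: binary relevance (an item is either in the user's held-out set or not) and the common log2 position discount. The function names `dcg_at_k` and `ndcg_at_k` are hypothetical.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k positions.

    Position 0 gets discount log2(2) = 1, position 1 gets log2(3), and so on.
    """
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_items, relevant_items, k):
    """NDCG@k with binary relevance (an assumption, not the paper's stated setup).

    ranked_items: item ids ordered by predicted score, best first.
    relevant_items: set of item ids the user actually interacted with in the test split.
    """
    gains = [1.0 if item in relevant_items else 0.0 for item in ranked_items]
    # The ideal ranking places every relevant item at the top of the list.
    ideal_gains = [1.0] * min(len(relevant_items), k)
    ideal = dcg_at_k(ideal_gains, k)
    if ideal == 0.0:
        return 0.0  # no relevant items: define NDCG as 0 for this user
    return dcg_at_k(gains, k) / ideal


if __name__ == "__main__":
    # One relevant item ranked second out of three candidates.
    print(ndcg_at_k(["b", "a", "c"], {"a"}, k=3))
```

A dataset-level score would typically average this value over all test users; whether the reported 4.17% is a relative or an absolute improvement is not specified in the abstract.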

Source journal: CiteScore 7.20; self-citation rate 4.30%; articles published 567