Multimodal metaphor recognition based on chain-of-cognition prompting

IF 2.1 | Zone 3, Psychology | JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Cognitive Systems Research. Pub Date: 2025-04-04. DOI: 10.1016/j.cogsys.2025.101356

Dongyu Zhang, Xingyuan Lu, Mulin Zhuang, Senqi Yang, Hongjun Chen

Volume 91, Article 101356. Journal Article. https://www.sciencedirect.com/science/article/pii/S1389041725000361

Citations: 0

Abstract

Metaphor is a mode of thought and cognition that pervades human language. With the growth of social media and multimodal data, metaphor recognition research has expanded from single modalities (such as text or images) to multimodal settings. However, current multimodal metaphor processing methods focus mainly on techniques for fusing modalities such as text and image; they neglect the cognitive mechanism of metaphor as a way of thinking and make little use of the knowledge that large language models acquire during pre-training. This paper therefore proposes a chain-of-cognition (CoC) prompting method for the multimodal metaphor recognition task, which draws on the pre-trained knowledge of a large language model to better recognize metaphors. The method uses prompts to construct inputs that guide the large language model to reason about the entities in a sample that may belong to the metaphorical source and target domains, and about the associations between those entities. In parallel, visual information is obtained through image caption extraction and a visual encoder, so that the model can reason over both modalities and produce a metaphor recognition result. Experimental results show that the method outperforms the existing baseline model on the metaphor recognition task, which supports its effectiveness.
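The prompting idea described in the abstract can be illustrated with a minimal sketch. It covers only the text side: it assumes the image caption has already been produced by some external captioning model, and it leaves out the visual encoder entirely. The names `Sample`, `COC_STEPS`, `build_coc_prompt`, and `parse_label`, as well as the wording of every prompt step, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One multimodal sample: its text plus a caption describing the image."""
    text: str
    caption: str  # assumed to come from an external captioning model


# Hypothetical reasoning stages; the wording is illustrative, not the paper's.
COC_STEPS = [
    "List the entities mentioned in the text and the image caption.",
    "Identify a possible target domain (what is being described) and "
    "a possible source domain (what it is described in terms of).",
    "Explain any association that maps the source domain onto the target domain.",
    "Answer on the last line with exactly one label: 'metaphorical' or 'literal'.",
]


def build_coc_prompt(sample: Sample) -> str:
    """Assemble a staged prompt that leads a language model from entities to a label."""
    lines = [
        "Task: decide whether the following text-image pair expresses a metaphor.",
        f"Text: {sample.text}",
        f"Image caption: {sample.caption}",
        "Reason step by step:",
    ]
    # Number the reasoning stages so the model answers them in order.
    lines += [f"{i}. {step}" for i, step in enumerate(COC_STEPS, start=1)]
    return "\n".join(lines)


def parse_label(model_output: str) -> bool:
    """Return True only if the last line of the model output says 'metaphorical'."""
    lines = model_output.strip().splitlines()
    if not lines:
        return False  # empty output is treated as 'no metaphor detected'
    final_line = lines[-1].lower()
    return "metaphorical" in final_line and "literal" not in final_line


if __name__ == "__main__":
    sample = Sample(text="This car is a rocket.", caption="a red sports car on a road")
    print(build_coc_prompt(sample))
```

The prompt string would be sent to whichever large language model is in use, and `parse_label` would be applied to its reply. Ordering the steps from entities to domains to associations mirrors the source-domain and target-domain reasoning the abstract describes, but the actual prompt design in the paper may differ.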


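Source journal

Cognitive Systems Research (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 9.40

Self-citation rate: 5.10%

Articles published: not listed

Review time: >12 weeks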

About the journal: Cognitive Systems Research is dedicated to the study of human-level cognition. As such, it welcomes papers which advance the understanding, design and applications of cognitive and intelligent systems, both natural and artificial. The journal brings together a broad community studying cognition in its many facets in vivo and in silico, across the developmental spectrum, focusing on individual capacities or on entire architectures. It aims to foster debate and integrate ideas, concepts, constructs, theories, models and techniques from across different disciplines and different perspectives on human-level cognition. The scope of interest includes the study of cognitive capacities and architectures - both brain-inspired and non-brain-inspired - and the application of cognitive systems to real-world problems as far as it offers insights relevant for the understanding of cognition. Cognitive Systems Research therefore welcomes mature and cutting-edge research approaching cognition from a systems-oriented perspective, both theoretical and empirically-informed, in the form of original manuscripts, short communications, opinion articles, systematic reviews, and topical survey articles from the fields of Cognitive Science (including Philosophy of Cognitive Science), Artificial Intelligence/Computer Science, Cognitive Robotics, Developmental Science, Psychology, and Neuroscience and Neuromorphic Engineering. Empirical studies will be considered if they are supplemented by theoretical analyses and contributions to theory development and/or computational modelling studies.