ChatGPT and CLT: Investigating differences in multimodal processing

Journal of Economy and Technology, Pub Date: 2024-12-07, DOI: 10.1016/j.ject.2024.11.008

Michael Cahalane, Samuel N. Kirshner

Volume 3, Pages 10-21. https://www.sciencedirect.com/science/article/pii/S2949948824000611

Citations: 0

Abstract

Drawing on construal level theory, recent studies have demonstrated that ChatGPT interprets text inputs from an abstract perspective. However, as ChatGPT has evolved into a multimodal tool, this research examines whether ChatGPT's abstraction bias extends to image-based prompts. In a pre-registered study utilising hierarchical letters, ChatGPT predominantly associated these images with local rather than global letters, suggesting a concrete bias when analysing images. This contrasts starkly with human participants, who predominantly identified the same images with the global letters, indicating that humans and ChatGPT significantly diverge in image interpretations. Furthermore, while humans generally perceive ChatGPT to be more concrete in image processing, there is a notable discrepancy between this perception and the actual level of concreteness exhibited by ChatGPT in handling image-based tasks. These findings provide insights into the distinct cognitive behaviours of LLMs compared to humans, contributing to an emerging understanding of LLM cognition in the context of multimodal inputs.


Source journal

Journal of Economy and Technology

Self-citation rate

0.00%

Publication count