Multimodal computation or interpretation? Automatic vs. critical understanding of text-image relations in racist memes in English

IF 2.3 · Zone 2 (Literature) · Q1 Communication

Discourse Context & Media Pub Date : 2024-02-01 DOI:10.1016/j.dcm.2024.100755

Chiara Polli, Maria Grazia Sindoni

Discourse Context & Media, Volume 57, Article 100755. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S2211695824000011

Citations: 0

Abstract

This paper discusses the epistemological differences in how the label ‘multimodal’ is understood in computational and sociosemiotic terms by addressing the challenges of automatic detection of hate speech in racist memes, considered as germane families of multimodal artifacts. Assuming that text-image interplay, as in the case of memes, may be extremely complex for AI-driven models to disentangle, the paper adopts a sociosemiotic multimodal critical approach to discuss the challenges of automatically detecting hateful memes on the Internet. As a case study, we select two English-language datasets: 1) the Hateful Memes Challenge (HMC) Dataset, built by the Facebook AI Research group in 2020, and 2) the Text-Image Cluster (TIC) Dataset, which comprises manually collected user-generated (UG) hateful memes. By discussing different combinations of non-hateful/hateful texts and non-hateful/hateful images, we show how humour, intertextuality, and anomalous juxtapositions of texts and images, as well as contextual cultural knowledge, may make AI-based automatic interpretation incorrect, biased or misleading. In our conclusions, we argue for the development of computational models that incorporate insights from sociosemiotics and multimodal critical discourse analysis.


Source journal

Discourse Context & Media (Communication)

CiteScore

5.00

Self-citation rate

10.00%

Articles published

Review time

55 days