Identifying offensive memes in low-resource languages: A multi-modal multi-task approach using valence and arousal

Impact factor 3.1 · Zone 3 (Computer Science) · JCR Q2 (Computer Science, Artificial Intelligence)
Gitanjali Kumari, Dibyanayan Bandyopadhyay, Asif Ekbal, Arindam Chatterjee, Vinutha B.N.
Computer Speech and Language, volume 92, Article 101781. Published 2025-02-20.
DOI: 10.1016/j.csl.2025.101781
URL: https://www.sciencedirect.com/science/article/pii/S0885230825000063
Citations: 0

Abstract

Identifying offensive memes in low-resource languages: A multi-modal multi-task approach using valence and arousal
Social media platforms, including Facebook, Twitter, and Instagram, have provided a revolutionary communication platform with unrestricted expression. However, this has also led to the propagation of offensive and abusive content, cyberbullying, and harassment. The use of memes, a popular form of multimodal media, has grown exponentially and is often used to spread objectionable content through the use of dark humor. In this paper, we propose a multi-task multi-modal framework for identifying offensive Hindi memes by leveraging the auxiliary tasks of valence and arousal to improve model performance. This approach leads to a more nuanced understanding of offensive memes and outperforms unimodal models that consider only one modality. To facilitate future research, we present a new Hindi corpus, named OffVA, containing 7,646 Hindi memes annotated with offensive, valence, and arousal labels. This is the first dataset of its kind for Hindi and can serve as a benchmark for future research on detecting offensive content in Hindi memes. Additionally, we emphasize the importance of incorporating high-resource language datasets, such as English, in identifying offensive memes in low-resource languages to improve model performance. Our experimental results on this dataset demonstrate that the proposed framework outperforms unimodal models in identifying offensive memes, and the incorporation of valence and arousal as auxiliary tasks leads to better results, highlighting the importance of considering multiple modalities and tasks for effective offensiveness detection in memes.
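The abstract describes a main task (offensiveness) trained jointly with two auxiliary tasks (valence and arousal). One common way to combine such tasks is a weighted sum of per-task losses; the minimal sketch below illustrates that idea only. The abstract does not specify the loss, the task weights, or the label format, so everything here is an assumption: the labels are treated as categorical, the class counts are arbitrary, and the names `multitask_loss` and `TASK_WEIGHTS` are hypothetical.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])


# Hypothetical weights: the main task counts fully, each auxiliary task half.
TASK_WEIGHTS: Dict[str, float] = {"offensive": 1.0, "valence": 0.5, "arousal": 0.5}


def multitask_loss(
    logits: Dict[str, List[float]],
    targets: Dict[str, int],
    weights: Dict[str, float] = TASK_WEIGHTS,
) -> float:
    """Weighted sum of per-task cross-entropy losses for one example."""
    return sum(
        weights[task] * cross_entropy(logits[task], targets[task])
        for task in weights
    )


# One meme: per-task scores that a shared multimodal encoder might produce.
example_logits = {
    "offensive": [0.2, 1.5],      # assumed 2 classes
    "valence": [0.1, 0.3, 2.0],   # assumed 3 classes
    "arousal": [1.2, 0.4, 0.1],   # assumed 3 classes
}
example_targets = {"offensive": 1, "valence": 2, "arousal": 0}
example_loss = multitask_loss(example_logits, example_targets)
```

In a setup like this, gradients from the valence and arousal terms flow into the shared representation, which is the usual mechanism by which auxiliary tasks are expected to help the main task. Whether the paper uses this exact formulation is not stated in the abstract.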
Source journal
Computer Speech and Language (Engineering & Technology: Computer Science, Artificial Intelligence)
CiteScore: 11.30
Self-citation rate: 4.70%
Articles published: 80
Review time: 22.9 weeks
About the journal: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.