Identifying offensive memes in low-resource languages: A multi-modal multi-task approach using valence and arousal
Authors: Gitanjali Kumari, Dibyanayan Bandyopadhyay, Asif Ekbal, Arindam Chatterjee, Vinutha B.N.
Journal: Computer Speech and Language, volume 92, Article 101781
Publication date: 2025-02-20
DOI: 10.1016/j.csl.2025.101781
URL: https://www.sciencedirect.com/science/article/pii/S0885230825000063
Citation count: 0
Abstract
Social media platforms, including Facebook, Twitter, and Instagram, have provided a revolutionary communication platform with unrestricted expression. However, this has also led to the propagation of offensive and abusive content, cyberbullying, and harassment. Meme usage has grown exponentially, and memes, a popular form of multimodal media, are often used to spread objectionable content through dark humor. In this paper, we propose a multi-task multi-modal framework for identifying offensive Hindi memes by leveraging the auxiliary tasks of valence and arousal to improve model performance. This approach leads to a more nuanced understanding of offensive memes and outperforms unimodal models that consider only one modality. To facilitate future research, we present a new Hindi corpus, named OffVA, containing 7,646 Hindi memes annotated with offensive, valence, and arousal labels. This is the first dataset of its kind for Hindi and can serve as a benchmark for future research on detecting offensive content in Hindi memes. Additionally, we emphasize the importance of incorporating high-resource language datasets, such as English, in identifying offensive memes in low-resource languages to improve model performance. Our experimental results on this dataset demonstrate that the proposed framework outperforms unimodal models in identifying offensive memes, and the incorporation of valence and arousal as auxiliary tasks leads to better results, highlighting the importance of considering multiple modalities and tasks for effective offensiveness detection in memes.
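The abstract does not describe the architecture or the training objective, so the following is only a generic sketch of how a multi-task, multi-modal objective of this kind is commonly set up: text and image features are fused, one classification head per task reads the fused vector, and the auxiliary valence and arousal losses are added to the main offensiveness loss with a weight. Everything here is an assumption made for illustration and not the authors' method: the function names, fusion by concatenation, the class counts (2 for offensive, 3 each for valence and arousal), and the `aux_weight` value.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-likelihood of the gold label under softmax(logits)."""
    return -math.log(softmax(logits)[label])


def linear(weights, features):
    """Compute one logit per weight row as dot(row, features)."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def multitask_loss(text_feat, image_feat, heads, labels, aux_weight=0.5):
    """Main-task loss plus weighted auxiliary losses on a shared fused vector.

    Assumption: modalities are fused by simple concatenation; the paper
    may use a different fusion mechanism.
    """
    fused = text_feat + image_feat  # list concatenation, not addition
    losses = {
        task: cross_entropy(linear(heads[task], fused), labels[task])
        for task in heads
    }
    # Hypothetical weighting: offensive is the main task, the other two are auxiliary.
    total = losses["offensive"] + aux_weight * (losses["valence"] + losses["arousal"])
    return total, losses


# Toy inputs: 2 text features + 2 image features, all-zero head weights.
text_feat = [0.2, -0.1]
image_feat = [0.4, 0.3]
heads = {
    "offensive": [[0.0] * 4 for _ in range(2)],  # assumed 2 classes
    "valence": [[0.0] * 4 for _ in range(3)],    # assumed 3 classes
    "arousal": [[0.0] * 4 for _ in range(3)],    # assumed 3 classes
}
labels = {"offensive": 1, "valence": 0, "arousal": 2}

total, per_task = multitask_loss(text_feat, image_feat, heads, labels)
print(round(total, 4), {k: round(v, 4) for k, v in per_task.items()})
```

With all-zero head weights every head predicts a uniform distribution, so the per-task losses are ln 2, ln 3, and ln 3, and the total is ln 2 + 0.5 * (ln 3 + ln 3) = ln 6. In a real system the features would come from trained text and image encoders, and the heads would be learned jointly by minimizing the combined loss.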
About the journal:
Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language.
The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.