A context-aware attention and graph neural network-based multimodal framework for misogyny detection

IF 7.4 | Zone 1 (Management) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management Pub Date : 2024-09-24 DOI:10.1016/j.ipm.2024.103895

Mohammad Zia Ur Rehman , Sufyaan Zahoor , Areeb Manzoor , Musharaf Maqbool , Nagendra Kumar


Citations: 0

Abstract



A substantial portion of offensive content on social media is directed towards women. Since approaches for general offensive content detection struggle to detect misogynistic content, solutions tailored to offensive content against women are required. To this end, we propose a novel multimodal framework for the detection of misogynistic and sexist content. The framework comprises three modules: the Multimodal Attention Module (MANM), the Graph-based Feature Reconstruction Module (GFRM), and the Content-specific Features Learning Module (CFLM). The MANM employs adaptive gating-based multimodal context-aware attention, enabling the model to focus on relevant visual and textual information and to generate contextually relevant features. The GFRM utilizes graphs to refine features within individual modalities, while the CFLM focuses on learning text- and image-specific features such as toxicity features and caption features. Additionally, we curate a set of misogynous lexicons to compute a misogyny-specific lexicon score from the text. We apply test-time augmentation in feature space to better generalize the predictions on diverse inputs. The performance of the proposed approach has been evaluated on two multimodal datasets, MAMI and MMHS150K, with 11,000 and 13,494 samples, respectively. The proposed method demonstrates an average improvement of 11.87% and 10.82% in macro-F1 over existing multimodal methods on the MAMI and MMHS150K datasets, respectively.
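The results above are reported in macro-F1, which averages the per-class F1 scores without weighting by class size, so the minority class counts as much as the majority class. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code; the function name `macro_f1` and the toy labels are assumptions made purely for illustration.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Return the unweighted mean of the per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = set(y_true) | set(y_pred)
    if not labels:
        raise ValueError("at least one sample is required")

    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never
        # appears in either the labels or the predictions.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy, made-up labels (1 = misogynous, 0 = not); not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(round(macro_f1(y_true, y_pred), 4))
```

In this toy case the positive class scores F1 = 2/3 and the negative class scores 0.8, so the macro average is about 0.7333, whereas plain accuracy would be 0.75.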


Source journal

Information Processing & Management (Engineering & Technology - Computer Science: Information Systems)

CiteScore

17.00

Self-citation rate

11.60%

Articles published

276

Review time

39 days

About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.