Enhancing misogyny detection in bilingual texts using explainable AI and multilingual fine-tuned transformers

Ehtesham Hashmi, Sule Yildirim Yayilgan, Muhammad Mudassar Yamin, Mohib Ullah

Complex & Intelligent Systems · Journal Article · published 2024-11-15 · JCR Q1 (Computer Science, Artificial Intelligence)
DOI: 10.1007/s40747-024-01655-1 (https://doi.org/10.1007/s40747-024-01655-1)
Citations: 0
Abstract
Gendered disinformation undermines women’s rights, democratic principles, and national security by worsening societal divisions through authoritarian regimes’ intentional weaponization of social media. Online misogyny is a harmful societal issue that threatens to turn digital platforms into environments hostile and inhospitable to women. Despite the severity of the problem, efforts to persuade digital platforms to strengthen their protections against gendered disinformation are frequently ignored, highlighting the difficulty of countering online misogyny in the face of commercial interests. This growing concern underscores the need for effective measures to create safer online spaces, where respect and equality prevail, so that women can participate fully and freely without fear of harassment or discrimination. This study addresses the challenge of detecting misogynistic content in bilingual (English and Italian) online communications. Using FastText word embeddings and explainable artificial intelligence techniques, we introduce a model that improves both interpretability and accuracy in detecting misogynistic language. For an in-depth analysis, we ran a range of experiments spanning classic machine learning methods and conventional deep learning approaches through to recent transformer-based models with both language-specific and multilingual capabilities. This paper strengthens misogyny-detection methodology by applying incremental learning to recent datasets of tweets and posts from sources such as Facebook, Twitter, and Reddit, with our proposed approach outperforming prior results on these datasets in accuracy, F1-score, precision, and recall. This process involved refining hyperparameters, employing optimization techniques, and utilizing generative configurations.
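The FastText embeddings mentioned above represent a word as the average of vectors for its character n-grams, which is helpful for morphologically rich languages such as Italian, where inflected forms may never appear in training data. A minimal sketch of that subword idea in plain Python (the hashing-by-seed scheme, dimensionality, and example words are illustrative assumptions, not the paper's configuration; a real FastText model learns the n-gram vectors):

```python
import math
import random

DIM = 64  # embedding dimension (illustrative choice, not the paper's setting)

def char_ngrams(word, n=3):
    """Character n-grams with FastText-style word-boundary markers."""
    padded = f"<{word}>"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def gram_vector(gram):
    """Deterministic pseudo-random vector per n-gram (stand-in for learned weights)."""
    rng = random.Random(gram)  # seed by the n-gram string itself
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]

def word_vector(word):
    """FastText-style word vector: the average of its n-gram vectors."""
    vecs = [gram_vector(g) for g in char_ngrams(word)]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Words sharing subwords end up close, even for unseen inflections.
sim_related = cosine(word_vector("misogyny"), word_vector("misogynistic"))
sim_unrelated = cosine(word_vector("misogyny"), word_vector("sunshine"))
```

Because the vector is assembled from subwords, an out-of-vocabulary inflection still receives a meaningful embedding, which is one reason subword models transfer well across the bilingual setting studied here.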
By implementing Local Interpretable Model-agnostic Explanations (LIME), we further elucidate the rationale behind the model’s predictions, enhancing understanding of its decision-making process.
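LIME explains a single prediction by perturbing the input (for text, masking words), querying the model on each perturbation, and fitting a simple local surrogate whose weights indicate how much each word pushed the prediction. A stripped-down sketch of that perturb-query-attribute loop, with a toy keyword scorer standing in for the paper's fine-tuned transformer (the classifier, lexicon, and sample count are illustrative assumptions):

```python
import random

# Toy stand-in classifier: "misogynistic" probability from a tiny abusive lexicon.
ABUSIVE = {"stupid", "worthless"}

def toy_classifier(text):
    hits = sum(w in ABUSIVE for w in text.lower().split())
    return min(1.0, 0.1 + 0.4 * hits)

def lime_style_weights(text, classifier, n_samples=500, seed=0):
    """Per-word importance via random word masking: compare the model's mean
    score when a word is kept vs. dropped. (A simplified stand-in for LIME's
    weighted linear surrogate; the perturbation loop is the same.)"""
    rng = random.Random(seed)
    words = text.split()
    with_word = [[] for _ in words]
    without_word = [[] for _ in words]
    for _ in range(n_samples):
        mask = [rng.random() < 0.5 for _ in words]  # keep each word with p=0.5
        perturbed = " ".join(w for w, keep in zip(words, mask) if keep)
        score = classifier(perturbed)
        for i, keep in enumerate(mask):
            (with_word if keep else without_word)[i].append(score)
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    # Importance: how much the score rises when word i is present.
    return {f"{i}:{w}": mean(with_word[i]) - mean(without_word[i])
            for i, w in enumerate(words)}

weights = lime_style_weights("you are stupid and worthless", toy_classifier)
```

The real LIME library fits a locally weighted ridge regression over the masks rather than this mean-difference shortcut, but the mechanism is the same one the paper uses to surface which tokens drove a misogyny prediction, making individual decisions auditable.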
Journal description:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.