Is a single model enough? The systematic comparison of computational approaches for detecting populist radical right content.

Q1 Mathematics

Quality & Quantity. Pub Date: 2025-01-01. Epub Date: 2025-01-29. DOI: 10.1007/s11135-024-02034-1

Mykola Makhortykh, Ernesto de León, Clara Christner, Maryna Sydorova, Aleksandra Urman, Silke Adam, Michaela Maier, Teresa Gil-Lopez

Quality & Quantity, vol. 59, Suppl 2, pp. 1163-1207. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12055619/pdf/

Citations: 0

Abstract

The rise of populist radical right (PRR) ideas underscores the importance of understanding how individuals engage with PRR content online. However, this task is complicated by the variety of channels through which such engagement can take place. In this article, we systematically compare computational approaches for detecting PRR content in textual data. Using 66 dictionary, classic supervised machine learning, and deep learning (DL) models, we compare how these distinct approaches perform on the PRR detection task for three Germanophone test datasets and how their performance is affected by different modes of text preprocessing. In addition to individual models, we examine the performance of 330 ensemble models combining the above-mentioned approaches on the dataset with a particularly high volume of noise. Our findings demonstrate that DL models, combined with more computationally intensive forms of preprocessing, perform best among the individual models, but their performance remains suboptimal on noisier datasets. While ensemble models show some improvement for specific modes of preprocessing, overall they mostly remain on par with individual DL models, which underscores how challenging computational detection of PRR content is.
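
The abstract does not say how the ensemble models combine their member models. As a minimal sketch only, assuming simple majority voting over binary labels (an assumption, not the authors' confirmed method), the idea of combining a dictionary, a classic supervised, and a DL model could look like this. The model names, predictions, and labels below are hypothetical toy values, not data from the study.

```python
from itertools import combinations


def majority_vote(predictions):
    """Combine per-model binary label lists by strict majority (ties give 0)."""
    n_models = len(predictions)
    return [1 if sum(votes) * 2 > n_models else 0 for votes in zip(*predictions)]


def f1_score(y_true, y_pred):
    """F1 for the positive class (1), computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def evaluate_ensembles(model_outputs, y_true, size=3):
    """Score every ensemble of `size` models drawn from `model_outputs`."""
    results = {}
    for names in combinations(sorted(model_outputs), size):
        combined = majority_vote([model_outputs[n] for n in names])
        results[names] = f1_score(y_true, combined)
    return results


# Hypothetical toy labels: 1 = PRR content, 0 = other content.
y_true = [1, 0, 1, 1, 0, 0]
model_outputs = {
    "dictionary": [1, 1, 0, 1, 0, 0],
    "supervised_ml": [1, 0, 1, 0, 0, 1],
    "deep_learning": [1, 0, 1, 1, 1, 0],
}

if __name__ == "__main__":
    for name, preds in model_outputs.items():
        print(name, round(f1_score(y_true, preds), 3))
    for names, score in evaluate_ensembles(model_outputs, y_true).items():
        print(" + ".join(names), round(score, 3))
```

In this toy case each model errs on different items, so the vote corrects the individual mistakes. On real, noisy data the members' errors may overlap, which is in line with the abstract's finding that ensembles mostly remain on par with individual DL models.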




Source journal

Quality & Quantity (Management Science: Statistics & Probability)

CiteScore: 4.60

Self-citation rate: 0.00%

Articles published: 276

Review time: 4-8 weeks

Journal description: Quality and Quantity constitutes a point of reference for European and non-European scholars to discuss instruments of methodology for more rigorous scientific results in the social sciences. In the era of biggish data, the journal also provides a publication venue for data scientists who are interested in proposing new indicators to measure the latent aspects of social, cultural, and political events. Rather than leaning towards one specific methodological school, the journal publishes papers using mixed methods on quantitative and qualitative data. Furthermore, the journal's key aim is to address methodological pluralism across research cultures. In this context, the journal is open to papers addressing the general logic of empirical research and the analysis of the validity and verification of social laws. The journal thus accepts papers on science metrics, publication ethics, and related issues affecting methodological practices among researchers. Quality and Quantity is an interdisciplinary journal which systematically connects disciplines such as data and information sciences with the humanities and social sciences. The journal extends discussion of interesting contributions in methodology to scholars worldwide, to promote the scientific development of social research.