{"title":"Lower Cynicism, Not Higher Literacy, Promotes Protective Behavior: Exploring the “privacy exception” in the Digital Inequality Framework","authors":"Chiara Respi, Marco Gui, Gaetano Scaduto, Miriam Serini, Dario Pizzul, Tiziano Gerosa, Christoph Lutz","doi":"10.1177/08944393251341201","DOIUrl":"https://doi.org/10.1177/08944393251341201","url":null,"abstract":"Prior research on digital inequality has highlighted the role of sociocultural resources in shaping beneficial Internet use patterns by positively affecting online literacy. Research on privacy protection online has, at the same time, shown the emergence of a “privacy cynicism,” where concerns about privacy fail to translate into protective actions. This study investigates how education level impacts privacy protection behavior through these two different mediation paths. Using unique data from a sample of 3,156 Italian Internet users, structural equation modeling (SEM) is employed to analyze the linkages between education level, privacy literacy, privacy cynicism, and protective behaviors. Contrary to expectations, the results reveal a moderate negative impact of education level on privacy protection behaviors. This total effect is the result of two different paths exerting opposite effects on protection behaviors. While higher education correlates with increased privacy literacy, this competence does not translate into proactive protective actions. Surprisingly, individuals with higher privacy literacy exhibit even lower levels of protection behavior, contributing to a negative indirect effect of education on privacy protection. On the other hand, the indirect effect of education on behaviors through privacy cynicism operates consistently with the digital inequality framework, partially compensating for the negative effect through literacy. 
An interpretation of privacy protection as an exception within the digital inequality framework is proposed.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"44 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144290121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Practical Guide and Case Study on How to Instruct LLMs for Automated Coding During Content Analysis","authors":"Mike Farjam, Hendrik Meyer, Meike Lohkamp","doi":"10.1177/08944393251349541","DOIUrl":"https://doi.org/10.1177/08944393251349541","url":null,"abstract":"This paper provides a practical example and guide on how to augment or replace human coders with Large Language Models (LLMs) during content analysis. We demonstrate this by replicating and extending an influential study on environmental communication. Our setup, running locally on consumer-grade hardware, makes it feasible for university researchers operating within typical computational and legal constraints. We validate the LLM’s performance by replicating the original study’s codings, scaling the analysis to cover a tenfold increase in articles, and extending the LLM’s application to a comparable German-language corpus, comparing these results to human expert coders. We offer guidelines for instructing LLMs, validating output, and handling multilingual coding, presenting a replicable framework for future research. This paper is intended to systematically guide other researchers when integrating LLMs into their workflows, ensuring reliable and scalable coding practices. 
We demonstrate several advantages of LLMs as coders, including cost-effective multilingual coding, overcoming the limitations of small-sample content analysis, and improving both the replicability and transparency of the coding process.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"218 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144260638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting Civil Unrest in South Africa Using Social Media Data: A Hybrid Machine Learning Approach","authors":"Rejoice Chitengu, Silas Formunyuy Verkijika, Kelibone Eva Mamabolo","doi":"10.1177/08944393251349542","DOIUrl":"https://doi.org/10.1177/08944393251349542","url":null,"abstract":"Civil unrest, encompassing protests and riots, is an increasing global concern, with incidents rising at an alarming rate, a trend that has been observed in South Africa over the years. This issue is particularly pronounced in today’s social media era, where platforms like ‘X’ (formerly Twitter) serve as powerful tools for mobilization. This raises the question: What factors drive civil unrest, and how can machine learning, using social media data, be employed to forecast such events? In response, this study aimed to develop a hybrid machine learning model to forecast protest and riot events in South Africa using Twitter data. Employing the CRISP-DM methodology, data was collected from Twitter for the period between 2019 and 2024, resulting in 18,487 curated tweets, with associated ground truth data extracted from the ACLED database. Using this data, a hybrid model combining Bidirectional LSTM (Bi-LSTM) networks with eXtreme Gradient Boosting (XGBoost) for classification and regression tasks was developed to forecast civil unrest in South Africa. Additionally, SHapley Additive exPlanations (SHAP) were used for model explainability. The proposed model outperformed the base model, achieving an R-squared value of 33% for protests and 23% for riots in regression, along with classification accuracies of 92% for protests and 86.2% for riots. SHAP results indicated that the key predictors of unrest included sentiment-related features, tweet engagement features, regional factors, the day of the week, public holidays, and the topics being discussed. 
This study demonstrates the value of a hybrid model in forecasting civil unrest events and identifies key features that stakeholders can use to target their efforts more precisely in addressing civil unrest, ensuring resources are allocated where they are needed most. The study concludes with a discussion of valuable insights for stakeholders on how to leverage social media data to predict and mitigate civil unrest.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"60 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144252137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prompting the Machine: Introducing an LLM Data Extraction Method for Social Scientists","authors":"Laurence-Olivier M. Foisy, Étienne Proulx, Hubert Cadieux, Jérémy Gilbert, Jozef Rivest, Alexandre Bouillon, Yannick Dufresne","doi":"10.1177/08944393251344865","DOIUrl":"https://doi.org/10.1177/08944393251344865","url":null,"abstract":"This research note addresses a methodological gap in the study of large language models (LLMs) in social sciences: the absence of standardized data extraction procedures. While existing research has examined biases and the reliability of LLM-generated content, the establishment of transparent extraction protocols necessarily precedes substantive analysis. The paper introduces a replicable procedural framework for extracting structured political data from LLMs via API, designed to enhance transparency, accessibility, and reproducibility. Canadian federal and Quebec provincial politicians serve as an illustrative case to demonstrate the extraction methodology, encompassing prompt engineering, output processing, and error handling mechanisms. The procedure facilitates systematic data collection across multiple LLM versions, enabling inter-model comparisons while addressing extraction challenges such as response variability and malformed outputs. The contribution is primarily methodological—providing researchers with a foundational extraction protocol adaptable to diverse research contexts. 
This standardized approach constitutes an essential preliminary step for subsequent evaluation of LLM-generated content, establishing procedural clarity in this methodologically developing research domain.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"151 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144153928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding Frames With BERT: A Transformer-Based Approach to Generic News Frame Detection","authors":"Vihang Jumle, Mykola Makhortykh, Maryna Sydorova, Victoria Vziatysheva","doi":"10.1177/08944393251338396","DOIUrl":"https://doi.org/10.1177/08944393251338396","url":null,"abstract":"Framing is among the most extensively used concepts in the field of communication science. The availability of digital data offers new possibilities for studying how specific aspects of social reality are made more salient in online communication, but also raises challenges related to the scaling of framing analysis and its adoption to new research areas (e.g. studying the impact of artificial intelligence-powered systems on the representation of societally relevant issues). To address these challenges, we introduce a transformer-based approach for generic news frame detection in Anglophone online content. While doing so, we discuss the composition of the training and test datasets, the model architecture, and the validation of the approach and reflect on the possibilities and limitations of the automated detection of generic news frames.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"79 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144104618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Efficacy of Large Language Models and Crowd Annotation for Accurate Content Analysis of Political Social Media Messages","authors":"Jennifer Stromer-Galley, Brian McKernan, Saklain Zaman, Chinmay Maganur, Sampada Regmi","doi":"10.1177/08944393251334977","DOIUrl":"https://doi.org/10.1177/08944393251334977","url":null,"abstract":"Systematic content analysis of messaging has been a staple method in the study of communication. While computer-assisted content analysis has been used in the field for three decades, advances in machine learning and crowd-based annotation combined with the ease of collecting volumes of text-based communication via social media have made the opportunities for classification of messages easier and faster. The greatest advancement yet might be in the form of general intelligence large language models (LLMs), which are ostensibly able to accurately and reliably classify messages by leveraging context to disambiguate meaning. It is unclear, however, how effective LLMs are in deploying the method of content analysis. In this study, we compare the classification of political candidate social media messages between trained annotators, crowd annotators, and large language models from OpenAI accessed through the free Web (ChatGPT) and the paid API (GPT API) on five different categories of political communication commonly used in the literature. We find that crowd annotation generally had higher F1 scores than ChatGPT and an earlier version of the GPT API, although the newest version, GPT-4 API, demonstrated good performance as compared with the crowd and with ground truth data derived from trained student annotators. 
This study suggests the application of any LLM to an annotation task requires validation, and that freely available and older LLM models may not be effective for studying human communication.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"43 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143901247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Theory-Driven Approach to Fake News/Information Disorder Analysis and Explanation via Target-Based Emotion–Stance Analysis (TESA) and Interpretive Graph Generation (IGG)","authors":"Xingyu Ken Chen, Jin-Cheon Na","doi":"10.1177/08944393251338403","DOIUrl":"https://doi.org/10.1177/08944393251338403","url":null,"abstract":"Information disorder (IDO) presents a persistent challenge to society, necessitating innovative approaches to understanding its dynamics beyond merely detecting it. This study introduces a theory-driven framework that integrates advanced natural language processing (NLP) with deep learning, utilizing the target-based emotion–stance analysis (TESA) approach to analyze emotion and stance dynamics within IDO content. Complementing TESA, interactive graph generation (IGG) is applied for scalable and interpretable qualitative analyses. Employing a mixed-methods approach, the study leverages TESA for target-centric emotion and stance analysis, evaluating target-based classifiers on both human-annotated and synthetic datasets. Additionally, the study explores synthetic data generation using generative AI to enrich the analysis, applying IGG to map complex data interactions. The study also found that integrating synthetic data developed from human annotations enhanced model performance, particularly for emotion classification tasks. Results demonstrate that IDO narratives significantly differ from non-IDO narratives, frequently leveraging negative emotions such as anger and disgust to manipulate public perception. TESA proved effective in capturing these nuanced variations, while IGG facilitated the triangulation of such findings via the scalable interpretation of emotional narratives, revealing that IDO content often amplifies polarizing and antagonistic perspectives. 
By combining TESA and IGG, this research emphasizes the importance of using NLP to extract and examine the emotional and stance nuances toward targets of interest within IDO context. This approach not only deepens theoretical insights into IDO’s persuasive mechanisms but also supports the development of practical tools for analyzing and managing the influence of IDO on public discourse.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"5 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143901304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Gender Disparities in Experiences of Being Hacked Using Twitter Data: A Focus on the Third-Level Digital Divide","authors":"Ern Chern Khor, Moon Choi","doi":"10.1177/08944393251334974","DOIUrl":"https://doi.org/10.1177/08944393251334974","url":null,"abstract":"Despite millions of hacked accounts fueling cybercrime, research on the hacking experience, particularly sociodemographic aspects, remains sparse. This study examines the experience of being hacked with a focus on gender disparities from the perspective of the third-level digital divide—socially constructed gaps of digital use outcomes even among users with similar digital access and skills. Analyzing 13,731 Twitter mentions of accounts being hacked, using topic modeling and classifying the gender of 12,586 users, we showed that women reported more experiences of being hacked across all types of online services except gaming. Women were more likely to experience negative consequences of being hacked, including reputational harm, money loss, and having personalized content modified. Gender differences were also found in coping strategies. Men were more likely to use active strategies like warning others, rebuilding accounts, and deducing hackers’ origins, while women were more likely to seek help from others to recover or report experiencing hacked accounts. 
The findings of this study imply the need for further research into the gendered experiences of being hacked from the third-level digital divide perspective, alongside the development of interventions to mitigate harm and empower users with diverse needs to cope with being hacked.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"95 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143889529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vox Populi, Vox AI? Using Large Language Models to Estimate German Vote Choice","authors":"Leah von der Heyde, Anna-Carolina Haensch, Alexander Wenz","doi":"10.1177/08944393251337014","DOIUrl":"https://doi.org/10.1177/08944393251337014","url":null,"abstract":"“Synthetic samples” generated by large language models (LLMs) have been argued to complement or replace traditional surveys, assuming their training data is grounded in human-generated data that potentially reflects attitudes and behaviors prevalent in the population. Initial US-based studies that have prompted LLMs to mimic survey respondents found that the responses match survey data. However, the relationship between the respective target population and LLM training data might affect the generalizability of such findings. In this paper, we critically evaluate the use of LLMs for public opinion research in a different context, by investigating whether LLMs can estimate vote choice in Germany. We generate a synthetic sample matching the 2017 German Longitudinal Election Study respondents and ask the LLM GPT-3.5 to predict each respondent’s vote choice. Comparing these predictions to the survey-based estimates on the aggregate and subgroup levels, we find that GPT-3.5 exhibits a bias towards the Green and Left parties. While the LLM predictions capture the tendencies of “typical” voters, they miss more complex factors of vote choice. By examining the LLM-based prediction of voting behavior in a non-English speaking context, our study contributes to research on the extent to which LLMs can be leveraged for studying public opinion. 
The findings point to disparities in opinion representation in LLMs and underscore the limitations in applying them for public opinion estimation.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"26 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143878112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Problematic use of short-video apps among elderly adults: An extension of the TAM","authors":"Lingnuo Wang, Guicheng Shi, Jon D. Elhai, Song Zhou, Yiqing Zeng, Lei Zheng","doi":"10.1177/08944393251338400","DOIUrl":"https://doi.org/10.1177/08944393251338400","url":null,"abstract":"Short-form videos have become a dominant form of social media globally. While short-video apps are popular among adolescents, their ease-of-use has also attracted a growing number of elderly users. However, this accessibility can lead to problematic use, resulting in physical and mental health issues for this demographic. Therefore, our research employed the technology acceptance model (TAM) to understand the problematic use of short-video apps (PUSVA) among elderly adults. A total of 281 elderly adults completed a three-wave survey with a 1-month interval between waves. Results showed that both perceived utilitarian-usefulness and perceived hedonic-usefulness mediated the relationship between perceived ease-of-use and PUSVA, suggesting a double-edged sword effect of easy-to-use short-video apps. Moreover, perceived susceptibility moderated the relationship between perceived ease-of-use and perceived utilitarian-usefulness, but not between perceived ease-of-use and perceived hedonic-usefulness, suggesting a moderated mediation effect of perceived susceptibility on PUSVA. Specifically, elderly adults with low perceived susceptibility tended to report higher perceived utilitarian-usefulness for easy-to-use applications, while no relationship between perceived ease-of-use and perceived utilitarian-usefulness was observed among those with high perceived susceptibility. 
Our findings highlight the double-edged sword effect of user-friendly short-video apps and offer valuable insights for developing interventions to mitigate problematic use among elderly adults.","PeriodicalId":49509,"journal":{"name":"Social Science Computer Review","volume":"9 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143875878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}