{"title":"Social Mobility as Causal Intervention","authors":"Lai Wei, Yu Xie","doi":"10.1177/00491241251320963","DOIUrl":"https://doi.org/10.1177/00491241251320963","url":null,"abstract":"The study of mobility effects is an important subject of study in sociology. Empirical investigations of individual mobility effects, however, have been hindered by one fundamental limitation, the unidentifiability of mobility effects when origin and destination are held constant. Given this fundamental limitation, we propose to reconceptualize mobility effects from the micro- to macro-level. Instead of micro-level mobility effects, the primary focus of the past literature, we ask alternative research questions about macro-level mobility effects: What happens to the population distribution of an outcome if we manipulate the mobility regime, that is, if we alter the observed association between social origin and social destination? We relate individual-level mobility experience to macro-level mobility effects under special interventions. The proposed method bridges the macro and micro agendas in social stratification research, and has wider applications in social stratification beyond the study of mobility effects. We illustrate the method with two analyses that evaluate the impact of social mobility on average fertility and income inequality in the United States. 
We provide open-source software, the R package socmob, that implements the method.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"1 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143857717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correcting the Measurement Errors of AI-Assisted Labeling in Image Analysis Using Design-Based Supervised Learning","authors":"Alessandra Rister Portinari Maranca, Jihoon Chung, Musashi Hinck, Adam D. Wolsky, Naoki Egami, Brandon M. Stewart","doi":"10.1177/00491241251333372","DOIUrl":"https://doi.org/10.1177/00491241251333372","url":null,"abstract":"Generative artificial intelligence (AI) has shown incredible leaps in performance across data of a variety of modalities including texts, images, audio, and videos. This affords social scientists the ability to annotate variables of interest from unstructured media. While rapidly improving, these methods are far from perfect and, as we show, even ignoring the small amounts of error in high-accuracy systems can lead to substantial bias and invalid confidence intervals in downstream analysis. We review how using design-based supervised learning (DSL) guarantees asymptotic unbiasedness and proper confidence interval coverage by making use of a small number of expert annotations. While originally developed for use with large language models in text, we present a series of applications in the context of image analysis, including an investigation of visual predictors of the perceived level of violence in protest images, an analysis of the images shared in the Black Lives Matter movement on Twitter, and a study of U.S. outlets' reporting of immigrant caravans. 
These applications are representative of the type of analysis performed in the visual social science landscape today, and our analyses will exemplify how DSL helps us attain statistical guarantees while using automated methods to reduce human labor.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"3 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143857722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine Bias. How Do Generative Language Models Answer Opinion Polls?","authors":"Julien Boelaert, Samuel Coavoux, Étienne Ollion, Ivaylo Petev, Patrick Präg","doi":"10.1177/00491241251330582","DOIUrl":"https://doi.org/10.1177/00491241251330582","url":null,"abstract":"Generative artificial intelligence (AI) is increasingly presented as a potential substitute for humans, including as research subjects. However, there is no scientific consensus on how closely these in silico clones can emulate survey respondents. While some defend the use of these “synthetic users,” others point toward social biases in the responses provided by large language models (LLMs). In this article, we demonstrate that these critics are right to be wary of using generative AI to emulate respondents, but probably not for the right reasons. Our results show (i) that to date, models cannot replace research subjects for opinion or attitudinal research; (ii) that they display a strong bias and a low variance on each topic; and (iii) that this bias randomly varies from one topic to the next. We label this pattern “machine bias,” a concept we define, and whose consequences for LLM-based research we further explore.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"37 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143853640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Insight-Inference Loop: Efficient Text Classification via Natural Language Inference and Threshold-Tuning","authors":"Sandrine Chausson, Marion Fourcade, David J. Harding, Björn Ross, Grégory Renard","doi":"10.1177/00491241251326819","DOIUrl":"https://doi.org/10.1177/00491241251326819","url":null,"abstract":"Modern computational text classification methods have brought social scientists tantalizingly close to the goal of unlocking vast insights buried in text data—from centuries of historical documents to streams of social media posts. Yet three barriers still stand in the way: the tedious labor of manual text annotation, the technical complexity that keeps these tools out of reach for many researchers, and, perhaps most critically, the challenge of bridging the gap between sophisticated algorithms and the deep theoretical understanding social scientists have already developed about human interactions, social structures, and institutions. To counter these limitations, we propose an approach to large-scale text analysis that requires substantially less human-labeled data, and no machine learning expertise, and efficiently integrates the social scientist into critical steps in the workflow. This approach, which allows the detection of statements in text, relies on large language models pre-trained for natural language inference, and a “few-shot” threshold-tuning algorithm rooted in active learning principles. We describe and showcase our approach by analyzing tweets collected during the 2020 U.S. 
presidential election campaign, and benchmark it against various computational approaches across three datasets.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"1 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143851025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Locating Cultural Holes Brokers in Diffusion Dynamics Across Bright Symbolic Boundaries","authors":"Diego F. Leal","doi":"10.1177/00491241251322517","DOIUrl":"https://doi.org/10.1177/00491241251322517","url":null,"abstract":"Although the literature on cultural holes has expanded considerably in recent years, there is no concrete measure in that literature to locate cultural holes brokers. This article develops a conceptual framework grounded in social network theory and cultural sociology to propose a specific solution to fill this measurement gap. Agent-based computational experiments are leveraged to develop a theoretical test of the analytic purchase and distinctiveness of the proposed measure, termed potential for intercultural brokerage (PIB). Results demonstrate the effectiveness of PIB in locating early adopters that can achieve widespread levels of diffusion in societies segregated along bright symbolic boundaries. Findings also show the superiority of PIB when compared to classic alternative measures in the network literature that focus on locating early adopters based on structural holes (e.g., network constraint, effective size), geodesics (e.g., betweenness centrality), and degree (e.g., degree centrality), among other classic network measures. Broader implications of these findings for brokerage theory are discussed herein.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"19 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143661168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning With DAGs","authors":"Sourabh Balgi, Adel Daoud, Jose M. Peña, Geoffrey T. Wodtke, Jesse Zhou","doi":"10.1177/00491241251319291","DOIUrl":"https://doi.org/10.1177/00491241251319291","url":null,"abstract":"Social science theories often postulate systems of causal relationships among variables, which are commonly represented using directed acyclic graphs (DAGs). As non-parametric causal models, DAGs require no assumptions about the functional form of the hypothesized relationships. Nevertheless, to simplify empirical evaluation, researchers typically invoke such assumptions anyway, even though they are often arbitrary and do not reflect any theoretical content or prior knowledge. Moreover, functional form assumptions can engender bias, whenever they fail to accurately capture the true complexity of the system. In this article, we introduce causal-graphical normalizing flows (cGNFs), a novel approach to causal inference that leverages deep neural networks to empirically evaluate theories represented as DAGs. Unlike conventional methods, cGNFs model the full joint distribution of the data using a DAG specified by the analyst, without relying on stringent assumptions about functional form. This enables flexible, non-parametric estimation of any causal estimand identified from the DAG, including total effects, direct and indirect effects, and path-specific effects. We illustrate the method with a reanalysis of Blau and Duncan’s (1967) model of status attainment and Zhou’s (2019) model of controlled mobility. 
The article concludes with a discussion of current limitations and directions for future development.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"11 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143608045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"When to Use Counterfactuals in Causal Historiography: Methods for Semantics and Inference","authors":"Tay Jeong","doi":"10.1177/00491241251314039","DOIUrl":"https://doi.org/10.1177/00491241251314039","url":null,"abstract":"According to the interventionist framework of actual causality, causal claims in history are ultimately claims about special types of functional dependencies between variables, which consist not only of actual events but also of corresponding counterfactual states of affairs. Instead of advocating the methodological use of counterfactuals tout court, we propose specific circumstances in historical writing where counterfactual reasoning comes in most handy. At the level of semantics, that is, the specification of the variables and their possible values, an explicit specification of the latent contrast classes becomes particularly useful in situations where one may be prompted to take an event that is pre-empted by the antecedent of interest as its proper causal contrast. At the level of inference, we argue that cases in which two or more antecedents appear to be playing a similar role tend to fumble our pretheoretical intuition about cause and propose a sequence of counterfactual tests based on actual examples from causal historiography.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"10 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143084171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Integration of Bayesian Regression Analysis and Bayesian Process Tracing in Mixed-Methods Research","authors":"Lion Behrens, Ingo Rohlfing","doi":"10.1177/00491241241295336","DOIUrl":"https://doi.org/10.1177/00491241241295336","url":null,"abstract":"In this article, we develop a mixed-methods design that combines Bayesian regression with Bayesian process tracing. A fully Bayesian multimethod design allows one to include empirical knowledge at each stage of the analysis and to coherently transfer information from the quantitative to the qualitative analysis, and vice versa. We present a complete mixed-methods workflow explaining how this is accomplished and how to integrate both methods. It is demonstrated how to use the posterior highest density interval and the Bayes factor from the regression analysis to update the prior level of confidence about what mechanisms possibly connect the cause to the outcome. It is further shown how to choose cases for the qualitative analysis through posterior predictive sampling. We illustrate this approach with an empirical analysis of colonial development and compare it with alternative designs, including nested analysis and the Bayesian integration of qualitative and quantitative methods.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"12 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143026628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Cross-Cultural Comparability of Measures on Gender and Age Stereotypes by Means of Piloting Methods","authors":"Natalja Menold, Patricia Hadler, Cornelia Neuert","doi":"10.1177/00491241241307600","DOIUrl":"https://doi.org/10.1177/00491241241307600","url":null,"abstract":"The study addresses the effects of piloting methods on the cross-cultural comparability and reliability of the measurement of gender and age stereotypes. We conducted a summative evaluation of expert reviews, cognitive pretests and web probing. We first piloted a gender role, an ageism, and a children stereotypes instrument in German and American English. We then randomly assigned the original and piloted versions to respondents in Germany and the United States using an online survey experiment and quota samples. No configural invariance was shown by the original instruments and the reliability of the gender role instrument was insufficient. The results show that piloting methods increased reliability and improved measurement invariance, although the effects varied by topic. Cross-cultural expert reviews and web probing provided more consistent results than other methods. A combination of web probing and cross-cultural expert reviews can maximize both reliability and measurement invariance.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"105 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142991933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Rise in Occupational Coding Mismatches and Occupational Mobility, 1991–2020","authors":"Andrew Taeho Kim, ChangHwan Kim","doi":"10.1177/00491241241303517","DOIUrl":"https://doi.org/10.1177/00491241241303517","url":null,"abstract":"Occupation is a construct prone to classification mismatches by coders and description inconsistency by respondents. We explore whether mismatches in occupational coding have recently increased, what factors are associated with the rise in mismatches, and how the rise affects estimates of intragenerational occupational mobility. Utilizing the 1991–2020 Annual Social and Economic Supplement of the Current Population Survey, which collects information on respondents’ current occupation and the previous year’s main occupation, we identify coding mismatches and compare the probabilities of occupational mobility based on four combinations of two variables. Our results show that not only do the estimates of occupational mobility between two adjacent years vary substantially across measures, but also that the magnitudes of intragenerational occupational mobility across measures become increasingly decoupled over time. We demonstrate that the likely cause of this divergence is the rise in coding mismatches between coders. We discuss the implications of our findings.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"30 1","pages":""},"PeriodicalIF":6.3,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142986720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}