Adam S. Miner, Sheridan A Stewart, M. Halley, Laura K. Nelson, Eleni Linos
Formally comparing topic models and human-generated qualitative coding of physician mothers’ experiences of workplace discrimination
Big Data & Society, published 2023-01-01
DOI: 10.1177/20539517221149106 (https://doi.org/10.1177/20539517221149106)
Citation count: 0
Abstract
Differences between computationally generated and human-generated themes in unstructured text are important to understand yet difficult to assess formally. In this study, we bridge these approaches through two contributions. First, we formally compare a primarily computational approach, topic modeling, to a primarily human-driven approach, qualitative thematic coding, in an impactful context: physician mothers’ experience of workplace discrimination. Second, we compare our chosen topic model to a principled alternative topic model to make explicit the study design decisions that merit consideration in future research. By formally contrasting computationally generated (i.e., topic modeling) and human-generated (i.e., thematic coding) knowledge, we shed light on issues of interest to several audiences, notably computational social scientists who wish to understand study design tradeoffs, and qualitative researchers who may wish to leverage computational methods to improve the speed and reproducibility of labor-intensive coding. Although fast, reproducible methods are useful in other domains, we highlight their value for better understanding experiences of workplace discrimination.
About the journal:
Big Data & Society (BD&S) is an open access, peer-reviewed scholarly journal that publishes interdisciplinary work principally in the social sciences, humanities, and computing and their intersections with the arts and natural sciences. The journal focuses on the implications of Big Data for societies and aims to connect debates about Big Data practices and their effects on various sectors such as academia, social life, industry, business, and government.
BD&S considers Big Data as an emerging field of practices, not solely defined by but generative of unique data qualities such as high volume, granularity, data linking, and mining. The journal pays attention to digital content generated both online and offline, encompassing social media, search engines, closed networks (e.g., commercial or government transactions), and open networks like digital archives, open government, and crowdsourced data. Rather than providing a fixed definition of Big Data, BD&S encourages interdisciplinary inquiries, debates, and studies on various topics and themes related to Big Data practices.
BD&S seeks contributions that analyze Big Data practices, involve empirical engagements and experiments with innovative methods, and reflect on the consequences of these practices for the representation, realization, and governance of societies. As a digital-only journal, BD&S's platform can accommodate multimedia formats such as complex images, dynamic visualizations, videos, and audio content. The contents of the journal encompass peer-reviewed research articles, colloquia, bookcasts, think pieces, state-of-the-art methods, and work by early career researchers.