Seeded Topic Models in Digital Archives: Analyzing Interpretations of Immigration in Swedish Newspapers, 1945–2019
Miriam Hurtado Bodell, Måns Magnusson, Marc Keuschnigg
Sociological Methods & Research, published 2024-08-22
DOI: 10.1177/00491241241268453 (https://doi.org/10.1177/00491241241268453)
Impact factor: 6.5. JCR: Q1 (Social Sciences, Mathematical Methods)
Citations: 0
Abstract
Sociologists are discussing the need for more formal ways to extract meaning from digital text archives. We focus attention on the seeded topic model, a semi-supervised extension to the standard topic model that allows sociological knowledge to be infused into the computational learning of meaning structures. Seed words help crystallize topics around known concepts, while utilizing topic models’ functionality to identify associations in text based on word co-occurrences. The method estimates a concept’s shared interpretation (or framing) via its associations with other frequently co-occurring topics. In a case study, we extract longitudinal measures of media frames regarding immigration from a vast corpus of millions of Swedish newspaper articles from the period 1945–2019. We infer turning points that partition the immigration discourse into meaningful eras and locate Sweden’s era of multicultural ideals that coined its tolerant reputation.
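To make the seeding idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hard-constraint variant of seeding, in which each seed word can only be assigned to its seeded topic, and it assumes a simplified frame measure: the average share of every other topic in documents where the seeded topic is prominent. The function names `seeded_lda` and `frame_shares`, the toy corpus, and all hyperparameter values are hypothetical.

```python
import random
from collections import Counter


def seeded_lda(docs, n_topics, seeds, alpha=0.5, beta=0.1, n_iter=50, rng_seed=0):
    """Collapsed Gibbs sampler for LDA with seed words pinned to one topic.

    Assumption: seeding is a hard constraint (zero prior mass for a seed
    word outside its seeded topic). Other seeding schemes exist.
    """
    rng = random.Random(rng_seed)
    vocab = sorted({w for doc in docs for w in doc})
    pinned = {w: k for k, words in seeds.items() for w in words}
    # Topics each word may take: one topic for seed words, all topics otherwise.
    allowed = {w: [pinned[w]] if w in pinned else list(range(n_topics))
               for w in vocab}
    # Number of words with nonzero prior in each topic (normalizing constant).
    n_allowed = [sum(k in allowed[w] for w in vocab) for k in range(n_topics)]

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics
    z = []
    # Random initialization that respects the seed constraint.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.choice(allowed[w])
            z[d].append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample among the topics this word is allowed to take.
                topics = allowed[w]
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * n_allowed[t])
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed document-topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + alpha * n_topics) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word


def frame_shares(theta, focal, threshold=0.3):
    """Average share of each non-focal topic in documents about the focal topic."""
    focal_docs = [row for row in theta if row[focal] > threshold]
    if not focal_docs:
        return {}
    return {
        k: sum(row[k] for row in focal_docs) / len(focal_docs)
        for k in range(len(theta[0]))
        if k != focal
    }


# Toy corpus (hypothetical); topic 0 is seeded with immigration-related words.
DOCS = [
    ["immigrant", "labor", "work", "factory"],
    ["refugee", "asylum", "border"],
    ["immigrant", "refugee"],
    ["football", "match", "goal", "team"],
]
SEEDS = {0: ["immigrant", "refugee"]}

theta, topic_word = seeded_lda(DOCS, n_topics=3, seeds=SEEDS)
frames = frame_shares(theta, focal=0)
print(frames)
```

Tracking `frame_shares` separately for each year of a dated corpus would give a longitudinal series of the kind the abstract describes, although the paper's own measure may be defined differently.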
About the journal:
Sociological Methods & Research is a quarterly journal devoted to sociology as a cumulative empirical science. The objectives of SMR are multiple, but emphasis is placed on articles that advance the understanding of the field through systematic presentations that clarify methodological problems and assist in ordering the known facts in an area. Review articles will be published, particularly those that emphasize a critical analysis of the status of the arts, but original presentations that are broadly based and provide new research will also be published. Intrinsically, SMR is viewed as a substantive journal, but one that is highly focused on the assessment of the scientific status of sociology. The scope is broad and flexible, and authors are invited to correspond with the editors about the appropriateness of their articles.