Updating “The Future of Coding”: Qualitative Coding with Generative Large Language Models

IF 6.5 | Zone 2, Sociology | JCR Q1: Social Sciences, Mathematical Methods

Sociological Methods & Research | Pub Date: 2025-05-21 | DOI: 10.1177/00491241251339188

Nga Than, Leanne Fan, Tina Law, Laura K. Nelson, Leslie McCall

Citations: 0

Abstract

Over the past decade, social scientists have adapted computational methods for qualitative text analysis, with the hope that they can match the accuracy and reliability of hand coding. The emergence of GPT and open-source generative large language models (LLMs) has transformed this process by shifting from programming to engaging with models using natural language, potentially mimicking the in-depth, inductive, and/or iterative process of qualitative analysis. We test the ability of generative LLMs to replicate and augment traditional qualitative coding, experimenting with multiple prompt structures across four closed- and open-source generative LLMs and proposing a workflow for conducting qualitative coding with generative LLMs. We find that LLMs can perform nearly as well as prior supervised machine learning models in accurately matching hand-coding output. Moreover, using generative LLMs as a natural language interlocutor closely replicates traditional qualitative methods, indicating their potential to transform the qualitative research process, despite ongoing challenges.
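The claim that LLM output can be compared against hand coding can be made concrete with a small agreement check. The sketch below is only illustrative: the abstract does not say which metrics the authors report, so simple accuracy and Cohen's kappa are assumed here as two common choices, and the label names and example data are hypothetical.

```python
from collections import Counter


def accuracy(hand_codes, model_codes):
    """Share of items where the model's code equals the hand code."""
    if len(hand_codes) != len(model_codes) or not hand_codes:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(h == m for h, m in zip(hand_codes, model_codes))
    return matches / len(hand_codes)


def cohen_kappa(hand_codes, model_codes):
    """Chance-corrected agreement between two sets of codes."""
    n = len(hand_codes)
    observed = accuracy(hand_codes, model_codes)
    hand_counts = Counter(hand_codes)
    model_counts = Counter(model_codes)
    labels = hand_counts.keys() | model_counts.keys()
    # Agreement expected if both coders assigned labels independently
    # at their own marginal rates.
    expected = sum(hand_counts[l] * model_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: hand codes versus codes returned by an LLM.
hand = ["inequality", "other", "inequality", "other"]
llm = ["inequality", "other", "other", "other"]

print(f"accuracy = {accuracy(hand, llm):.2f}")
print(f"kappa    = {cohen_kappa(hand, llm):.2f}")
```

Kappa is included alongside accuracy because raw agreement can look high when one code dominates the corpus; whether that matters depends on how the codes are distributed in a given project.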

Source journal

Sociological Methods & Research Multiple-

CiteScore

16.30

Self-citation rate

3.20%

Articles published

About the journal: Sociological Methods & Research is a quarterly journal devoted to sociology as a cumulative empirical science. The objectives of SMR are multiple, but emphasis is placed on articles that advance the understanding of the field through systematic presentations that clarify methodological problems and assist in ordering the known facts in an area. Review articles will be published, particularly those that emphasize a critical analysis of the status of the arts, but original presentations that are broadly based and provide new research will also be published. Intrinsically, SMR is viewed as a substantive journal, but one that is highly focused on the assessment of the scientific status of sociology. The scope is broad and flexible, and authors are invited to correspond with the editors about the appropriateness of their articles.