A Practical Guide and Case Study on How to Instruct LLMs for Automated Coding During Content Analysis

IF 2.7 | Zone 2, Sociology | JCR Q2: Computer Science, Interdisciplinary Applications

Social Science Computer Review | Pub Date: 2025-06-10 | DOI: 10.1177/08944393251349541

Mike Farjam, Hendrik Meyer, Meike Lohkamp


Citations: 0

Abstract


A Practical Guide and Case Study on How to Instruct LLMs for Automated Coding During Content Analysis

This paper provides a practical example and guide on how to augment or replace human coders with Large Language Models (LLMs) during content analysis. We demonstrate this by replicating and extending an influential study on environmental communication. Our setup, running locally on consumer-grade hardware, makes it feasible for university researchers operating within typical computational and legal constraints. We validate the LLM’s performance by replicating the original study’s codings, scaling the analysis to cover a tenfold increase in articles, and extending the LLM’s application to a comparable German-language corpus, comparing these results to human expert coders. We offer guidelines for instructing LLMs, validating output, and handling multilingual coding, presenting a replicable framework for future research. This paper is intended to systematically guide other researchers when integrating LLMs into their workflows, ensuring reliable and scalable coding practices. We demonstrate several advantages of LLMs as coders, including cost-effective multilingual coding, overcoming the limitations of small-sample content analysis, and improving both the replicability and transparency of the coding process.
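The validation step described in the abstract, comparing LLM codings with those of human expert coders, can be illustrated with a chance-corrected agreement measure. The sketch below uses Cohen's kappa, which is an assumption here: the abstract does not state which agreement statistic the authors report. The function name `cohens_kappa`, the labels `frame_a` and `frame_b`, and the ten codings are hypothetical and are not data from the paper.

```python
from collections import Counter


def cohens_kappa(human, llm):
    """Chance-corrected agreement between two coders over the same items."""
    if len(human) != len(llm) or not human:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(human)
    # Observed agreement: share of items where both coders chose the same code.
    observed = sum(h == m for h, m in zip(human, llm)) / n
    # Agreement expected by chance, from each coder's marginal code frequencies.
    human_freq, llm_freq = Counter(human), Counter(llm)
    expected = sum(human_freq[c] * llm_freq[c] for c in human_freq) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten articles (illustrative only, not from the paper).
human_codes = ["frame_a", "frame_a", "frame_b", "frame_b", "frame_a",
               "frame_b", "frame_a", "frame_a", "frame_b", "frame_b"]
llm_codes = ["frame_a", "frame_a", "frame_b", "frame_a", "frame_a",
             "frame_b", "frame_a", "frame_b", "frame_b", "frame_b"]

kappa = cohens_kappa(human_codes, llm_codes)
print(f"Cohen's kappa: {kappa:.2f}")  # prints "Cohen's kappa: 0.60"
```

In this toy sample the coders agree on 8 of 10 items, and chance agreement is 0.5 because each coder uses both codes equally often, so kappa is 0.6. A real validation would run the same comparison on a held-out set of human-coded articles, separately for each coding category and each language.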


Source journal

Social Science Computer Review (Social Sciences; Computer Science: Interdisciplinary Applications)

CiteScore: 9.00

Self-citation rate: 4.90%

Articles published:

Review time: >12 weeks

Journal description:

Unique Scope. Social Science Computer Review is an interdisciplinary journal covering social science instructional and research applications of computing, as well as societal impacts of informational technology. Topics include: artificial intelligence, business, computational social science theory, computer-assisted survey research, computer-based qualitative analysis, computer simulation, economic modeling, electronic modeling, electronic publishing, geographic information systems, instrumentation and research tools, public administration, social impacts of computing and telecommunications, software evaluation, world-wide web resources for social scientists.

Interdisciplinary Nature. Because the uses and impacts of computing are interdisciplinary, so is Social Science Computer Review. The journal is of direct relevance to scholars and scientists in a wide variety of disciplines. In its pages you'll find work in the following areas: sociology, anthropology, political science, economics, psychology, computer literacy, computer applications, and methodology.