The Efficacy of Large Language Models and Crowd Annotation for Accurate Content Analysis of Political Social Media Messages

Impact Factor: 3.0 · CAS Tier 2 (Sociology) · JCR Q2 (Computer Science, Interdisciplinary Applications)
Jennifer Stromer-Galley, Brian McKernan, Saklain Zaman, Chinmay Maganur, Sampada Regmi
Journal: Social Science Computer Review, Volume 43(1)
DOI: 10.1177/08944393251334977
Published: 2025-05-02 (Journal Article)
Citations: 0

Abstract

Systematic content analysis of messaging has been a staple method in the study of communication. While computer-assisted content analysis has been used in the field for three decades, advances in machine learning and crowd-based annotation combined with the ease of collecting volumes of text-based communication via social media have made the opportunities for classification of messages easier and faster. The greatest advancement yet might be in the form of general intelligence large language models (LLMs), which are ostensibly able to accurately and reliably classify messages by leveraging context to disambiguate meaning. It is unclear, however, how effective LLMs are in deploying the method of content analysis. In this study, we compare the classification of political candidate social media messages between trained annotators, crowd annotators, and large language models from OpenAI accessed through the free Web (ChatGPT) and the paid API (GPT API) on five different categories of political communication commonly used in the literature. We find that crowd annotation generally had higher F1 scores than ChatGPT and an earlier version of the GPT API, although the newest version, GPT-4 API, demonstrated good performance as compared with the crowd and with ground truth data derived from trained student annotators. This study suggests that the application of any LLM to an annotation task requires validation, and that freely available and older LLM models may not be effective for studying human communication.
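The F1-based comparison described in the abstract can be sketched as follows. The labels below are hypothetical, invented for illustration; they are not the study's data, and the study's actual per-category scoring procedure may differ.

```python
def f1_score(truth, pred):
    """Binary F1 for one content category, given ground-truth and predicted labels."""
    # Count true positives, false positives, and false negatives for the positive class
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Hypothetical example: ground truth from trained annotators vs. one
# annotation source's labels for a single category (1 = present, 0 = absent)
truth = [1, 0, 1, 1, 0, 1, 0, 0]
crowd = [1, 0, 1, 0, 0, 1, 1, 0]

print(round(f1_score(truth, crowd), 3))  # prints 0.75
```

Comparing annotation sources (crowd, ChatGPT, GPT API versions) then amounts to computing this score per category against the trained-annotator ground truth.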
Source journal: Social Science Computer Review (Social Sciences; Computer Science, Interdisciplinary Applications)
CiteScore: 9.00
Self-citation rate: 4.90%
Annual articles: 95
Review time: >12 weeks
Journal description: Unique Scope: Social Science Computer Review is an interdisciplinary journal covering social science instructional and research applications of computing, as well as societal impacts of information technology. Topics include: artificial intelligence, business, computational social science theory, computer-assisted survey research, computer-based qualitative analysis, computer simulation, economic modeling, electronic modeling, electronic publishing, geographic information systems, instrumentation and research tools, public administration, social impacts of computing and telecommunications, software evaluation, and world-wide web resources for social scientists. Interdisciplinary Nature: Because the uses and impacts of computing are interdisciplinary, so is Social Science Computer Review. The journal is of direct relevance to scholars and scientists in a wide variety of disciplines. In its pages you'll find work in the following areas: sociology, anthropology, political science, economics, psychology, computer literacy, computer applications, and methodology.