Large Language Models for Text Classification: From Zero-Shot Learning to Instruction-Tuning

IF 6.5 · CAS Tier 2 (Sociology) · Q1 SOCIAL SCIENCES, MATHEMATICAL METHODS
Youngjin Chae, Thomas Davidson
{"title":"用于文本分类的大型语言模型:从零学习到指令调整","authors":"Youngjin Chae, Thomas Davidson","doi":"10.1177/00491241251325243","DOIUrl":null,"url":null,"abstract":"Large language models (LLMs) have tremendous potential for social science research as they are trained on vast amounts of text and can generalize to many tasks. We explore the use of LLMs for supervised text classification, specifically the application to stance detection, which involves detecting attitudes and opinions in texts. We examine the performance of these models across different architectures, training regimes, and task specifications. We compare 10 models ranging in size from tens of millions to hundreds of billions of parameters and test four distinct training regimes: Prompt-based zero-shot learning and few-shot learning, fine-tuning, and instruction-tuning, which combines prompting and fine-tuning. The largest, most powerful models generally offer the best predictive performance even with little or no training examples, but fine-tuning smaller models is a competitive solution due to their relatively high accuracy and low cost. Instruction-tuning the latest generative LLMs expands the scope of text classification, enabling applications to more complex tasks than previously feasible. We offer practical recommendations on the use of LLMs for text classification in sociological research and discuss their limitations and challenges. Ultimately, LLMs can make text classification and other text analysis methods more accurate, accessible, and adaptable, opening new possibilities for computational social science.","PeriodicalId":21849,"journal":{"name":"Sociological Methods & Research","volume":"72 1","pages":""},"PeriodicalIF":6.5000,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Large Language Models for Text Classification: From Zero-Shot Learning to Instruction-Tuning\",\"authors\":\"Youngjin Chae, Thomas Davidson\",\"doi\":\"10.1177/00491241251325243\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Large language models (LLMs) have tremendous potential for social science research as they are trained on vast amounts of text and can generalize to many tasks. We explore the use of LLMs for supervised text classification, specifically the application to stance detection, which involves detecting attitudes and opinions in texts. We examine the performance of these models across different architectures, training regimes, and task specifications. We compare 10 models ranging in size from tens of millions to hundreds of billions of parameters and test four distinct training regimes: Prompt-based zero-shot learning and few-shot learning, fine-tuning, and instruction-tuning, which combines prompting and fine-tuning. The largest, most powerful models generally offer the best predictive performance even with little or no training examples, but fine-tuning smaller models is a competitive solution due to their relatively high accuracy and low cost. Instruction-tuning the latest generative LLMs expands the scope of text classification, enabling applications to more complex tasks than previously feasible. We offer practical recommendations on the use of LLMs for text classification in sociological research and discuss their limitations and challenges. 
Ultimately, LLMs can make text classification and other text analysis methods more accurate, accessible, and adaptable, opening new possibilities for computational social science.\",\"PeriodicalId\":21849,\"journal\":{\"name\":\"Sociological Methods & Research\",\"volume\":\"72 1\",\"pages\":\"\"},\"PeriodicalIF\":6.5000,\"publicationDate\":\"2025-04-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Sociological Methods & Research\",\"FirstCategoryId\":\"90\",\"ListUrlMain\":\"https://doi.org/10.1177/00491241251325243\",\"RegionNum\":2,\"RegionCategory\":\"社会学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"SOCIAL SCIENCES, MATHEMATICAL METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Sociological Methods & Research","FirstCategoryId":"90","ListUrlMain":"https://doi.org/10.1177/00491241251325243","RegionNum":2,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"SOCIAL SCIENCES, MATHEMATICAL METHODS","Score":null,"Total":0}
Citations: 0

Abstract

Large language models (LLMs) have tremendous potential for social science research as they are trained on vast amounts of text and can generalize to many tasks. We explore the use of LLMs for supervised text classification, specifically the application to stance detection, which involves detecting attitudes and opinions in texts. We examine the performance of these models across different architectures, training regimes, and task specifications. We compare 10 models ranging in size from tens of millions to hundreds of billions of parameters and test four distinct training regimes: Prompt-based zero-shot learning and few-shot learning, fine-tuning, and instruction-tuning, which combines prompting and fine-tuning. The largest, most powerful models generally offer the best predictive performance even with little or no training examples, but fine-tuning smaller models is a competitive solution due to their relatively high accuracy and low cost. Instruction-tuning the latest generative LLMs expands the scope of text classification, enabling applications to more complex tasks than previously feasible. We offer practical recommendations on the use of LLMs for text classification in sociological research and discuss their limitations and challenges. Ultimately, LLMs can make text classification and other text analysis methods more accurate, accessible, and adaptable, opening new possibilities for computational social science.
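To make the training regimes described above concrete, the sketch below illustrates prompt-based zero-shot stance detection with a hosted generative LLM. It is not the authors' pipeline: the prompt wording, label set, and model name are illustrative assumptions, and any instruction-following chat model could be substituted.

```python
# Minimal sketch of zero-shot stance detection via prompting (assumptions noted inline).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

LABELS = ["favor", "against", "neutral"]  # hypothetical stance label set


def classify_stance(text: str, target: str) -> str:
    """Ask the model to label the stance of `text` toward `target` with a single word."""
    prompt = (
        f"Classify the stance of the following text toward '{target}'. "
        f"Answer with exactly one word from: {', '.join(LABELS)}.\n\n"
        f"Text: {text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # model name is an assumption, not the paper's choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output aids reproducibility for classification
    )
    answer = response.choices[0].message.content.strip().lower()
    # Fall back to "neutral" if the model replies with something outside the label set.
    return answer if answer in LABELS else "neutral"


if __name__ == "__main__":
    print(classify_stance("Wind farms are ruining our coastline.", "renewable energy"))
```

Few-shot prompting would extend the same prompt with a handful of labeled examples, whereas fine-tuning and instruction-tuning instead update model weights on labeled training data, as the abstract notes for smaller models and the latest generative LLMs respectively.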
Source journal: Sociological Methods & Research
CiteScore: 16.30
Self-citation rate: 3.20%
Articles per year: 40
Journal description: Sociological Methods & Research is a quarterly journal devoted to sociology as a cumulative empirical science. The objectives of SMR are multiple, but emphasis is placed on articles that advance the understanding of the field through systematic presentations that clarify methodological problems and assist in ordering the known facts in an area. Review articles will be published, particularly those that emphasize a critical analysis of the status of the arts, but original presentations that are broadly based and provide new research will also be published. Intrinsically, SMR is viewed as a substantive journal but one that is highly focused on the assessment of the scientific status of sociology. The scope is broad and flexible, and authors are invited to correspond with the editors about the appropriateness of their articles.