Fundamental Capabilities and Applications of Large Language Models: A Survey

IF 23.8 | Tier 1, Computer Science | JCR Q1, COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys | Pub Date: 2025-05-15 | DOI: 10.1145/3735632

Jiawei Li, Yang Gao, Yizhe Yang, Yu Bai, Xiaofeng Zhou, Yinghao Li, Huashan Sun, Yuhang Liu, Xingpeng Si, Yuhao Ye, Yixiao Wu, Yiguan Lin, Bin Xu, Bowen Ren, Chong Feng, Heyan Huang


Citations: 0

Abstract

Large Language Models (LLMs) have demonstrated remarkable effectiveness across various domain-specific applications. However, which fundamental capabilities most contribute to their success in different domains remains unclear. This uncertainty complicates LLM evaluation, as existing benchmark-based assessments often fail to capture their real-world performance, where the required capabilities may differ from those measured in the benchmarks. In this survey, we provide a systematic introduction to LLMs’ fundamental capabilities, encompassing their definitions, formation mechanisms, and practical applications. We further explore the relationships among these capabilities and discuss how they collectively support complex problem-solving in domain-specific applications. Building on this foundation, we review recent advances in LLM-driven applications across nine specific domains: medicine, law, computational biology, finance, social sciences and psychology, computer programming and software engineering, robots and agents, AI for disciplines, and creative work. We analyze how specific capabilities are leveraged for each domain to address unique requirements. This perspective enables us to establish connections between these capabilities and domain requirements, and to evaluate the varying importance of different capabilities across different domains. Based on these insights, we propose evaluation strategies tailored to the essential capabilities required in each domain, offering practical guidance for selecting suitable backbone LLMs in real-world applications.

Source journal

ACM Computing Surveys (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore: 33.20

Self-citation rate: 0.60%

Articles published: 372

Review time: 12 months

About the journal: ACM Computing Surveys is an academic journal that focuses on publishing surveys and tutorials on various areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. In terms of impact, CSUR has a high reputation with a 2022 Impact Factor of 16.6. It is ranked 3rd out of 111 journals in the field of Computer Science Theory & Methods. ACM Computing Surveys is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.