大型语言模型和大脑中的上下文特征提取层次趋同

IF 18.8 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Nature Machine Intelligence Pub Date : 2024-11-26 DOI:10.1038/s42256-024-00925-4

Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani

{"title":"大型语言模型和大脑中的上下文特征提取层次趋同","authors":"Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani","doi":"10.1038/s42256-024-00925-4","DOIUrl":null,"url":null,"abstract":"Recent advancements in artificial intelligence have sparked interest in the parallels between large language models (LLMs) and human neural processing, particularly in language comprehension. Although previous research has demonstrated similarities between LLM representations and neural responses, the computational principles driving this convergence—especially as LLMs evolve—remain elusive. Here we used intracranial electroencephalography recordings from neurosurgical patients listening to speech to investigate the alignment between high-performance LLMs and the language-processing mechanisms of the brain. We examined a diverse selection of LLMs with similar parameter sizes and found that as their performance on benchmark tasks improves, they not only become more brain-like, reflected in better neural response predictions from model embeddings, but they also align more closely with the hierarchical feature extraction pathways of the brain, using fewer layers for the same encoding. Additionally, we identified commonalities in the hierarchical processing mechanisms of high-performing LLMs, revealing their convergence towards similar language-processing strategies. Finally, we demonstrate the critical role of contextual information in both LLM performance and brain alignment. These findings reveal converging aspects of language processing in the brain and LLMs, offering new directions for developing models that better align with human cognitive processing. Why brain-like feature extraction emerges in large language models (LLMs) remains elusive. Mischler, Li and colleagues demonstrate that high-performing LLMs not only predict neural responses more accurately than other LLMs but also align more closely with the hierarchical language processing pathway in the brain, revealing parallels between these models and human cognitive mechanisms.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"6 12","pages":"1467-1477"},"PeriodicalIF":18.8000,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Contextual feature extraction hierarchies converge in large language models and the brain\",\"authors\":\"Gavin Mischler, Yinghao Aaron Li, Stephan Bickel, Ashesh D. Mehta, Nima Mesgarani\",\"doi\":\"10.1038/s42256-024-00925-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent advancements in artificial intelligence have sparked interest in the parallels between large language models (LLMs) and human neural processing, particularly in language comprehension. Although previous research has demonstrated similarities between LLM representations and neural responses, the computational principles driving this convergence—especially as LLMs evolve—remain elusive. Here we used intracranial electroencephalography recordings from neurosurgical patients listening to speech to investigate the alignment between high-performance LLMs and the language-processing mechanisms of the brain. We examined a diverse selection of LLMs with similar parameter sizes and found that as their performance on benchmark tasks improves, they not only become more brain-like, reflected in better neural response predictions from model embeddings, but they also align more closely with the hierarchical feature extraction pathways of the brain, using fewer layers for the same encoding. Additionally, we identified commonalities in the hierarchical processing mechanisms of high-performing LLMs, revealing their convergence towards similar language-processing strategies. Finally, we demonstrate the critical role of contextual information in both LLM performance and brain alignment. These findings reveal converging aspects of language processing in the brain and LLMs, offering new directions for developing models that better align with human cognitive processing. Why brain-like feature extraction emerges in large language models (LLMs) remains elusive. Mischler, Li and colleagues demonstrate that high-performing LLMs not only predict neural responses more accurately than other LLMs but also align more closely with the hierarchical language processing pathway in the brain, revealing parallels between these models and human cognitive mechanisms.\",\"PeriodicalId\":48533,\"journal\":{\"name\":\"Nature Machine Intelligence\",\"volume\":\"6 12\",\"pages\":\"1467-1477\"},\"PeriodicalIF\":18.8000,\"publicationDate\":\"2024-11-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Nature Machine Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.nature.com/articles/s42256-024-00925-4\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.nature.com/articles/s42256-024-00925-4","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

人工智能的最新进展引发了人们对大型语言模型（LLM）与人类神经处理之间相似性的兴趣，尤其是在语言理解方面。尽管之前的研究已经证明了大型语言模型表征与神经反应之间的相似性，但驱动这种趋同的计算原理--尤其是在大型语言模型不断进化的过程中--仍然难以捉摸。在这里，我们利用神经外科患者聆听语音时的颅内脑电图记录来研究高性能 LLM 与大脑语言处理机制之间的一致性。我们研究了具有相似参数大小的多种 LLM，发现随着它们在基准任务上的表现不断提高，它们不仅变得更像大脑，反映在模型嵌入的神经响应预测上，而且它们与大脑的分层特征提取途径更加一致，使用更少的层数进行相同的编码。此外，我们还发现了高绩效 LLM 的分层处理机制的共性，揭示了它们向类似语言处理策略的趋同。最后，我们证明了语境信息在 LLM 性能和大脑排列中的关键作用。这些发现揭示了大脑和 LLMs 语言处理的趋同性，为开发更符合人类认知处理的模型提供了新的方向。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Contextual feature extraction hierarchies converge in large language models and the brain

查看原文本刊更多论文

Contextual feature extraction hierarchies converge in large language models and the brain

Recent advancements in artificial intelligence have sparked interest in the parallels between large language models (LLMs) and human neural processing, particularly in language comprehension. Although previous research has demonstrated similarities between LLM representations and neural responses, the computational principles driving this convergence—especially as LLMs evolve—remain elusive. Here we used intracranial electroencephalography recordings from neurosurgical patients listening to speech to investigate the alignment between high-performance LLMs and the language-processing mechanisms of the brain. We examined a diverse selection of LLMs with similar parameter sizes and found that as their performance on benchmark tasks improves, they not only become more brain-like, reflected in better neural response predictions from model embeddings, but they also align more closely with the hierarchical feature extraction pathways of the brain, using fewer layers for the same encoding. Additionally, we identified commonalities in the hierarchical processing mechanisms of high-performing LLMs, revealing their convergence towards similar language-processing strategies. Finally, we demonstrate the critical role of contextual information in both LLM performance and brain alignment. These findings reveal converging aspects of language processing in the brain and LLMs, offering new directions for developing models that better align with human cognitive processing. Why brain-like feature extraction emerges in large language models (LLMs) remains elusive. Mischler, Li and colleagues demonstrate that high-performing LLMs not only predict neural responses more accurately than other LLMs but also align more closely with the hierarchical language processing pathway in the brain, revealing parallels between these models and human cognitive mechanisms.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Nature Machine Intelligence Multiple-

CiteScore

36.90

自引率

2.10%

发文量

127

期刊介绍： Nature Machine Intelligence is a distinguished publication that presents original research and reviews on various topics in machine learning, robotics, and AI. Our focus extends beyond these fields, exploring their profound impact on other scientific disciplines, as well as societal and industrial aspects. We recognize limitless possibilities wherein machine intelligence can augment human capabilities and knowledge in domains like scientific exploration, healthcare, medical diagnostics, and the creation of safe and sustainable cities, transportation, and agriculture. Simultaneously, we acknowledge the emergence of ethical, social, and legal concerns due to the rapid pace of advancements. To foster interdisciplinary discussions on these far-reaching implications, Nature Machine Intelligence serves as a platform for dialogue facilitated through Comments, News Features, News & Views articles, and Correspondence. Our goal is to encourage a comprehensive examination of these subjects. Similar to all Nature-branded journals, Nature Machine Intelligence operates under the guidance of a team of skilled editors. We adhere to a fair and rigorous peer-review process, ensuring high standards of copy-editing and production, swift publication, and editorial independence.