On opportunities and challenges of large multimodal foundation models in education
Stefan Küchemann, Karina E Avila, Yavuz Dinc, Chiara Hortmann, Natalia Revenga, Verena Ruf, Niklas Stausberg, Steffen Steinert, Frank Fischer, Martin Fischer, Enkelejda Kasneci, Gjergji Kasneci, Thomas Kuhr, Gitta Kutyniok, Sarah Malone, Michael Sailer, Albrecht Schmidt, Matthias Stadler, Jochen Weller, Jochen Kuhn
npj Science of Learning, 10(1), 11. Published 2025-02-26.
DOI: https://doi.org/10.1038/s41539-025-00301-w
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11861286/pdf/
Abstract
Recently, the option to use large language models as middleware connecting various AI tools and other large language models has led to the development of so-called large multimodal foundation models, which can process spoken text, music, images and videos. In this overview, we explain a new set of opportunities and challenges that arise from the integration of large multimodal foundation models into education.