Multimodality of AI for Education: Toward Artificial General Intelligence
Authors: Gyeonggeon Lee; Lehong Shi; Ehsan Latif; Yizhu Gao; Arne Bewersdorff; Matthew Nyaaba; Shuchen Guo; Zhengliang Liu; Gengchen Mai; Tianming Liu; Xiaoming Zhai
Journal: IEEE Transactions on Learning Technologies, vol. 18, pp. 666-683
Publication date: 2025-03-28
Publication type: Journal Article
DOI: 10.1109/TLT.2025.3574466
URL: https://ieeexplore.ieee.org/document/11016818/
Journal impact factor: 4.9; JCR: Q2 (Computer Science, Interdisciplinary Applications)
Citation count: 0
Abstract
This article addresses the growing importance of understanding how multimodal artificial general intelligence (AGI) can be integrated into educational practices. We first review the theoretical foundations of multimodality in human learning, encompassing its concept and history, dual coding theory and multimedia theory, VARK multimodality, and multimodal assessment (see Section II-A). We then revisit the essential components of AGI, focusing on the multimodal nature that distinguishes it from artificial narrow intelligence. Because of its conversational functionality, multimodal AGI is considered an educational agent, one already tested in various educational situations (see Section II-B). For each of the text, image, audio, and video modalities, we review its significance for education, the technological background of AGI for analyzing and generating it, and the educational applications of artificial intelligence (AI) (see Sections III–VI). Finally, we investigate the ethics of AGI in education, which originate from the ethics of AI and are specified in three strands: first, data privacy and ethical integrity; second, explainability, transparency, and fairness; and third, responsibility and decision-making. We also review the practical implementation of ethical AGI frameworks in education (see Section VII). The article further discusses implications for learning theories, derived operational design principles, current research gaps, practical constraints and institutional readiness, and future directions (see Section VIII). This exploration aims to provide an advanced understanding of the intersection between AI, multimodality, and education, setting a foundation for future research and development.
About the journal:
The IEEE Transactions on Learning Technologies covers all advances in learning technologies and their applications, including but not limited to the following topics: innovative online learning systems; intelligent tutors; educational games; simulation systems for education and training; collaborative learning tools; learning with mobile devices; wearable devices and interfaces for learning; personalized and adaptive learning systems; tools for formative and summative assessment; tools for learning analytics and educational data mining; ontologies for learning systems; standards and web services that support learning; authoring tools for learning materials; computer support for peer tutoring; learning via computer-mediated inquiry, field, and lab work; social learning techniques; social networks and infrastructures for learning and knowledge sharing; and creation and management of learning objects.