Mixture of experts (MoE): A big data perspective

IF 15.5 · CAS Tier 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Wensheng Gan, Zhenyao Ning, Zhenlian Qi, Philip S. Yu
Journal: Information Fusion, Volume 127, Article 103664
DOI: 10.1016/j.inffus.2025.103664
Published: 2025-09-16
Publisher page: https://www.sciencedirect.com/science/article/pii/S1566253525007365
Citations: 0
Abstract

As the era of big data arrives, traditional artificial intelligence algorithms struggle to meet the demands of massive and diverse data. Mixture of experts (MoE) has shown excellent performance and broad application prospects. This paper provides an in-depth review and analysis of the latest progress in this field from multiple perspectives, including the basic principles, algorithmic models, key technical challenges, and application practices of MoE. First, we introduce the basic concept and core idea of MoE and elaborate on its advantages over traditional single models. Then, we discuss the basic architecture of MoE and its main components, including the gating network, the expert networks, and the learning algorithms. Next, we review the applications of MoE in addressing key technical issues in big data, including high-dimensional sparse data modeling, heterogeneous multi-source data fusion, real-time online learning, and model interpretability. For each challenge, we present specific MoE solutions and their innovations. Furthermore, we summarize typical use cases of MoE in various application domains, including natural language processing, computer vision, and recommendation systems, and analyze their outstanding achievements, which demonstrates the powerful capability of MoE in big data processing. We also analyze the advantages of MoE in big data environments, including high scalability, efficient resource utilization, and better generalization ability, as well as the challenges it faces, such as load imbalance and expert utilization, gating-network stability, and training difficulty. Finally, we explore the future development trends of MoE, including improving model generalization, enhancing algorithmic interpretability, and increasing system automation. We believe that MoE will become an important paradigm of artificial intelligence in the era of big data. In summary, this paper systematically elaborates on the principles, techniques, and applications of MoE in big data processing, providing theoretical and practical references to further promote the application of MoE in real-world scenarios.
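The gating-network/expert-network decomposition described in the abstract can be illustrated with a minimal sketch: a softmax gating network scores the experts for each input, only the top-k experts are evaluated (the sparse routing that gives MoE its scalability), and their outputs are combined by renormalized gate weights. This is a generic illustration, not the paper's implementation; all names (`MoELayer`, `W_gate`, `W_exp`) and the random initialization are assumptions made for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class MoELayer:
    """Toy mixture-of-experts layer: a gating network routes each input
    to its top-k experts and mixes their outputs by gate weight."""

    def __init__(self, d_in, d_out, n_experts=4, k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k
        # Gating network: one linear scorer per expert.
        self.W_gate = rng.normal(0.0, 0.1, (d_in, n_experts))
        # Expert networks: here each expert is a single linear map.
        self.W_exp = rng.normal(0.0, 0.1, (n_experts, d_in, d_out))

    def forward(self, x):
        # x: (batch, d_in)
        gates = softmax(x @ self.W_gate)              # (batch, n_experts)
        topk = np.argsort(gates, axis=-1)[:, -self.k:]  # top-k expert indices
        out = np.zeros((x.shape[0], self.W_exp.shape[2]))
        for b in range(x.shape[0]):
            sel = topk[b]
            w = gates[b, sel] / gates[b, sel].sum()   # renormalize over selected
            for i, e in enumerate(sel):
                # Only the selected experts are evaluated (sparse computation).
                out[b] += w[i] * (x[b] @ self.W_exp[e])
        return out

moe = MoELayer(d_in=8, d_out=3, n_experts=4, k=2)
y = moe.forward(np.ones((5, 8)))
print(y.shape)  # (5, 3)
```

In production MoE systems the experts are full feed-forward networks and the top-k routing is batched rather than looped, but the control flow is the same: score, select, compute selected experts, and mix.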
Source journal: Information Fusion (Engineering & Technology – Computer Science: Theory & Methods)
CiteScore: 33.20
Self-citation rate: 4.30%
Articles per year: 161
Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.