A Primer on Pretrained Multilingual Language Models

IF 23.8 | Tier 1 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS

ACM Computing Surveys Pub Date : 2025-04-01 DOI:10.1145/3727339

Sumanth Doddapaneni, Gowtham Ramesh, Mitesh Khapra, Anoop Kunchukuttan, Pratyush Kumar

Citations: 0

Abstract

Multilingual Language Models (MLLMs) such as mBERT, XLM, and XLM-R have emerged as a viable option for bringing the power of pretraining to a large number of languages. Given their success in zero-shot transfer learning, a large body of work has emerged on (i) building bigger MLLMs covering a large number of languages; (ii) creating exhaustive benchmarks covering a wider variety of tasks and languages for evaluating MLLMs; (iii) analysing the performance of MLLMs on monolingual, zero-shot cross-lingual, and bilingual tasks; (iv) understanding the universal language patterns (if any) learnt by MLLMs; and (v) augmenting the (often) limited capacity of MLLMs to improve their performance on seen or even unseen languages. In this survey, we review the existing literature covering these broad areas of research pertaining to MLLMs. Based on our survey, we recommend some promising directions for future research.

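Source journal: ACM Computing Surveys (Engineering and Technology; Computer Science: Theory & Methods)

CiteScore: 33.20

Self-citation rate: 0.60%

Articles published: 372

Review time: 12 months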

About the journal: ACM Computing Surveys (CSUR) is an academic journal that publishes surveys and tutorials on various areas of computing research and practice. The journal aims to provide comprehensive and easily understandable articles that guide readers through the literature and help them understand topics outside their specialties. CSUR had a 2022 Impact Factor of 16.6 and ranked 3rd out of 111 journals in the field of Computer Science, Theory & Methods. It is indexed and abstracted in various services, including AI2 Semantic Scholar, Baidu, Clarivate/ISI: JCR, CNKI, DeepDyve, DTU, EBSCO: EDS/HOST, and IET Inspec, among others.