Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain via Transfer Learning

Lifeng Han, G. Erofeev, I. Sorokina, S. Gladkoff, G. Nenadic
{"title":"Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain via Transfer Learning","authors":"Lifeng Han, G. Erofeev, I. Sorokina, S. Gladkoff, G. Nenadic","doi":"10.18653/v1/2023.clinicalnlp-1.5","DOIUrl":null,"url":null,"abstract":"Massively multilingual pre-trained language models (MMPLMs) are developed in recent years demonstrating superpowers and the pre-knowledge they acquire for downstream tasks.This work investigates whether MMPLMs can be applied to clinical domain machine translation (MT) towards entirely unseen languages via transfer learning.We carry out an experimental investigation using Meta-AI’s MMPLMs “wmt21-dense-24-wide-en-X and X-en (WMT21fb)” which were pre-trained on 7 language pairs and 14 translation directions including English to Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese, and the opposite direction.We fine-tune these MMPLMs towards English-Spanish language pair which did not exist at all in their original pre-trained corpora both implicitly and explicitly.We prepare carefully aligned clinical domain data for this fine-tuning, which is different from their original mixed domain knowledge.Our experimental result shows that the fine-tuning is very successful using just 250k well-aligned in-domain EN-ES segments for three sub-task translation testings: clinical cases, clinical terms, and ontology concepts. It achieves very close evaluation scores to another MMPLM NLLB from Meta-AI, which included Spanish as a high-resource setting in the pre-training.To the best of our knowledge, this is the first work on using MMPLMs towards clinical domain transfer-learning NMT successfully for totally unseen languages during pre-training.","PeriodicalId":216954,"journal":{"name":"Clinical Natural Language Processing Workshop","volume":"146 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Clinical Natural Language Processing Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2023.clinicalnlp-1.5","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Massively multilingual pre-trained language models (MMPLMs) have been developed in recent years, demonstrating strong capabilities and the prior knowledge they acquire for downstream tasks. This work investigates whether MMPLMs can be applied to clinical-domain machine translation (MT) towards entirely unseen languages via transfer learning. We carry out an experimental investigation using Meta-AI’s MMPLMs “wmt21-dense-24-wide-en-X and X-en (WMT21fb)”, which were pre-trained on 7 language pairs and 14 translation directions, including English to Czech, German, Hausa, Icelandic, Japanese, Russian, and Chinese, and the opposite directions. We fine-tune these MMPLMs towards the English-Spanish language pair, which did not exist, either explicitly or implicitly, in their original pre-training corpora. We prepare carefully aligned clinical-domain data for this fine-tuning, which differs from their original mixed-domain knowledge. Our experimental results show that the fine-tuning is very successful using just 250k well-aligned in-domain EN-ES segments, evaluated on three translation sub-tasks: clinical cases, clinical terms, and ontology concepts. The fine-tuned model achieves evaluation scores very close to those of NLLB, another MMPLM from Meta-AI, which included Spanish as a high-resource language in its pre-training. To the best of our knowledge, this is the first work to successfully apply MMPLMs to clinical-domain transfer-learning NMT for a language entirely unseen during pre-training.
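
As a rough illustration of the transfer-learning setup described in the abstract, the sketch below fine-tunes a WMT21fb-style checkpoint on an aligned clinical EN-ES corpus using Hugging Face Transformers. It is not the authors' implementation: the checkpoint name, data file, hyper-parameters, and the handling of the unseen Spanish target language are assumptions for illustration only.

```python
# A minimal sketch (not the authors' released code) of fine-tuning a WMT21fb-style
# multilingual MT checkpoint on clinical EN-ES parallel data with Hugging Face
# Transformers. File names and hyper-parameters are assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "facebook/wmt21-dense-24-wide-en-x"  # EN->X direction of WMT21fb
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# Hypothetical 250k-segment clinical corpus with "en" and "es" text columns.
raw = load_dataset("csv", data_files={"train": "clinical_en_es_250k.csv"})

tokenizer.src_lang = "en"
# Assumption: Spanish is routed through the tokenizer's language-code mechanism,
# even though it was unseen in pre-training; the paper does not spell out this step.
tokenizer.tgt_lang = "es"

def preprocess(batch):
    # Tokenize source and target sides of each aligned segment.
    model_inputs = tokenizer(batch["en"], truncation=True, max_length=256)
    labels = tokenizer(text_target=batch["es"], truncation=True, max_length=256)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_data = raw["train"].map(
    preprocess, batched=True, remove_columns=raw["train"].column_names
)

args = Seq2SeqTrainingArguments(
    output_dir="wmt21fb-clinical-en-es",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=1e-5,
    num_train_epochs=1,
    fp16=True,
    logging_steps=100,
    save_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

In practice the dense-24-wide checkpoint is several billion parameters, so memory-saving measures (gradient accumulation, mixed precision, or sharded training) would be needed; the values above are placeholders rather than the paper's settings.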