Improving cross-lingual low-resource speech recognition by Task-based Meta PolyLoss

IF 3.1 · CAS Tier 3 (Computer Science) · JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yaqi Chen , Hao Zhang , Xukui Yang , Wenlin Zhang , Dan Qu
Computer Speech and Language · DOI: 10.1016/j.csl.2024.101648 · Published: 2024-04-09 · https://www.sciencedirect.com/science/article/pii/S0885230824000317
Citations: 0

Abstract


Multilingual meta learning has emerged as a promising paradigm for transferring knowledge from source languages to facilitate the learning of low-resource target languages. Loss functions are a type of meta-knowledge that is crucial to the effective training of neural networks. However, the misalignment between the loss functions and the learning paradigms of meta learning degrades the network’s performance. To address this challenge, we propose a new method called Task-based Meta PolyLoss (TMPL) for meta learning. By regarding speech recognition tasks as normal samples and applying PolyLoss to the meta loss function, TMPL can be denoted as a linear combination of polynomial functions based on task query loss. Theoretical analysis shows that TMPL improves meta learning by enabling attention adjustment across different tasks, which can be tailored for different datasets. Experiments on three datasets demonstrated that gradient-based meta learning methods achieve superior performance with TMPL. Furthermore, our experiments validate that the task-based loss function effectively mitigates the misalignment issue.
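The abstract does not give TMPL's exact formula, only that tasks are treated like samples and PolyLoss is applied to the meta (outer-loop) loss. As a rough illustration, a Poly-1-style task-based loss can be sketched by treating each task's query loss L as an implied task-level confidence P_t = exp(-L) (the standard cross-entropy relation) and adding the first polynomial term. The function name `task_based_poly1`, the `eps` weight, and the exp(-L) mapping are assumptions for illustration, not the paper's implementation.

```python
import math

def task_based_poly1(task_query_losses, eps=1.0):
    """Hypothetical Poly-1-style meta loss over per-task query losses.

    Assumption: each task's query loss L stands in for a sample's
    cross-entropy, so the implied task "probability" is P_t = exp(-L).
    Poly-1 then adds eps * (1 - P_t) on top of the base loss.
    """
    total = 0.0
    for loss in task_query_losses:
        p_t = math.exp(-loss)              # implied task-level confidence
        total += loss + eps * (1.0 - p_t)  # base loss + first polynomial term
    return total / len(task_query_losses)  # average over the task batch
```

With `eps = 0` this reduces to the plain averaged meta objective (as in standard MAML); a larger `eps` up-weights low-confidence (hard) tasks, which matches the "attention adjustment across different tasks" the abstract describes.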

Source journal
Computer Speech and Language (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 11.30
Self-citation rate: 4.70%
Articles per year: 80
Review time: 22.9 weeks
Journal description: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.