Improving cross-lingual low-resource speech recognition by Task-based Meta PolyLoss

Impact factor 3.1 | Zone 3 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Yaqi Chen, Hao Zhang, Xukui Yang, Wenlin Zhang, Dan Qu
Computer Speech and Language, Volume 87, Article 101648. Published 2024-04-09. DOI: 10.1016/j.csl.2024.101648
Source: https://www.sciencedirect.com/science/article/pii/S0885230824000317
Citations: 0

Abstract

Multilingual meta learning has emerged as a promising paradigm for transferring knowledge from source languages to facilitate the learning of low-resource target languages. Loss functions are a type of meta-knowledge that is crucial to the effective training of neural networks. However, the misalignment between the loss functions and the learning paradigms of meta learning degrades the network’s performance. To address this challenge, we propose a new method called Task-based Meta PolyLoss (TMPL) for meta learning. By regarding speech recognition tasks as normal samples and applying PolyLoss to the meta loss function, TMPL can be denoted as a linear combination of polynomial functions based on task query loss. Theoretical analysis shows that TMPL improves meta learning by enabling attention adjustment across different tasks, which can be tailored for different datasets. Experiments on three datasets demonstrated that gradient-based meta learning methods achieve superior performance with TMPL. Furthermore, our experiments validate that the task-based loss function effectively mitigates the misalignment issue.
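
The abstract does not give the TMPL formula, so the following is only a rough sketch of what "applying PolyLoss to the meta loss function" could look like. It borrows the Poly-1 form from the original PolyLoss work (cross-entropy plus `epsilon * (1 - P_t)`) and assumes that each task's query loss `L` is mapped to a pseudo-probability `exp(-L)`, by analogy with cross-entropy being `-log P_t`. The function names, the `epsilon` parameter, and this mapping are illustrative assumptions, not the authors' definition.

```python
import math


def poly1_task_loss(query_loss: float, epsilon: float = 1.0) -> float:
    """Poly-1 style reweighting of one task's query loss (assumed form)."""
    # Assumption: treat exp(-loss) as a task-level pseudo-probability,
    # mirroring P_t = exp(-CE) for sample-level cross-entropy.
    task_prob = math.exp(-query_loss)
    # Leading polynomial term (1 - P) is added with weight epsilon.
    return query_loss + epsilon * (1.0 - task_prob)


def meta_loss(query_losses, epsilon: float = 1.0) -> float:
    """Average the reweighted query losses over the tasks in a meta-batch."""
    losses = [poly1_task_loss(loss, epsilon) for loss in query_losses]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical query losses for three source-language tasks.
    print(meta_loss([0.5, 1.2, 2.0], epsilon=1.0))
```

Under this assumed form, `epsilon` changes how much weight tasks with different query losses receive: `epsilon = 0` recovers the plain average query loss, while a nonzero `epsilon` adds a bounded term that saturates for tasks whose loss is already large.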

Source journal
Computer Speech and Language (Engineering and Technology: Computer Science, Artificial Intelligence)
CiteScore: 11.30
Self-citation rate: 4.70%
Articles published: 80
Review time: 22.9 weeks
About the journal: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.