Ant Multilingual Recognition System for OLR 2021 Challenge

Anqi Lyu, Zhiming Wang, Huijia Zhu
DOI: 10.21437/interspeech.2022-355
Published in: Interspeech, 2022-09-18
Citations: 3

Abstract

This paper presents a comprehensive description of the Ant multilingual recognition system for the 6th Oriental Language Recognition (OLR 2021) Challenge. Inspired by transfer learning, the encoder component of the language identification (LID) model is initialized from a pretrained automatic speech recognition (ASR) network so that lexical phonetic information is integrated into language identification. The ASR model is an encoder-decoder network based on the U2++ architecture [1]. The LID model inherits the shared Conformer encoder [2] from the pretrained ASR model, which is effective at capturing global information and modeling local invariance; an attentive statistical pooling layer and a following linear projection layer are added on top of the encoder, and the model is then fine-tuned to its optimum. Furthermore, data augmentation, score normalization, and model ensembling are effective strategies for improving performance, and they are investigated and analysed in detail in this paper. In the OLR 2021 Challenge, our submitted systems ranked first in both Task 1 and Task 2, with primary metrics of 0.0025 and 0.0039 respectively, less than 1/3 of the second-place scores, which fully illustrates that our methodologies for multilingual identification are effective and competitive in real-life scenarios.
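The abstract does not give the exact formulation of the attentive statistical pooling head, so the following is only a minimal NumPy sketch of the common single-head attentive-statistics form: attention weights are computed over the Conformer encoder's frame-level outputs, the weighted mean and standard deviation are concatenated, and a linear projection maps the pooled vector to per-language logits. All names, dimensions, and the single attention vector `w_att` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def attentive_stats_pooling(h, w):
    """Pool frame-level encoder outputs h (T x D) into one utterance vector.
    w (D,) is a hypothetical single-head attention parameter.
    Returns a (2*D,) vector: attention-weighted mean and std, concatenated."""
    scores = h @ w                                   # (T,) unnormalized attention scores
    scores = scores - scores.max()                   # shift for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()    # softmax over time
    mu = alpha @ h                                   # weighted mean, (D,)
    var = alpha @ (h * h) - mu * mu                  # weighted variance, (D,)
    sigma = np.sqrt(np.clip(var, 1e-12, None))       # weighted std, clipped for safety
    return np.concatenate([mu, sigma])               # (2*D,)

def lid_head(h, w_att, W_proj):
    """Linear projection of the pooled statistics to language logits (sketch)."""
    return attentive_stats_pooling(h, w_att) @ W_proj

# Toy usage with random "encoder outputs": 50 frames, 8-dim features, 13 languages.
rng = np.random.default_rng(0)
T, D, L = 50, 8, 13
h = rng.standard_normal((T, D))
logits = lid_head(h, rng.standard_normal(D), rng.standard_normal((2 * D, L)))
print(logits.shape)  # (13,)
```

In the paper's setup this head would sit on top of the pretrained Conformer encoder and be fine-tuned end-to-end; the std branch is what distinguishes attentive *statistics* pooling from plain attentive mean pooling.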