基于课程的非自回归解码器的高效手语翻译

Pei-Ju Yu, Liang Zhang, Biao Fu, Yidong Chen
{"title":"基于课程的非自回归解码器的高效手语翻译","authors":"Pei-Ju Yu, Liang Zhang, Biao Fu, Yidong Chen","doi":"10.24963/ijcai.2023/584","DOIUrl":null,"url":null,"abstract":"Most existing studies on Sign Language Translation (SLT) employ AutoRegressive Decoding Mechanism (AR-DM) to generate target sentences. However, the main disadvantage of the AR-DM is high inference latency. To address this problem, we introduce Non-AutoRegressive Decoding Mechanism (NAR-DM) into SLT, which generates the whole sentence at once. Meanwhile, to improve its decoding ability, we integrate the advantages of curriculum learning and NAR-DM and propose a Curriculum-based NAR Decoder (CND). Specifically, the lower layers of the CND are expected to predict simple tokens that could be predicted correctly using source-side information solely. Meanwhile, the upper layers could predict complex tokens based on the lower layers' predictions. Therefore, our CND significantly reduces the model's inference latency while maintaining its competitive performance. Moreover, to further boost the performance of our CND, we propose a mutual learning framework, containing two decoders, i.e., an AR decoder and our CND. We jointly train the two decoders and minimize the KL divergence between their outputs, which enables our CND to learn the forward sequential knowledge from the strengthened AR decoder. Experimental results on PHOENIX2014T and CSL-Daily demonstrate that our model consistently outperforms all competitive baselines and achieves 7.92/8.02× speed-up compared to the AR SLT model respectively. Our source code is available at https://github.com/yp20000921/CND.","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Efficient Sign Language Translation with a Curriculum-based Non-autoregressive Decoder\",\"authors\":\"Pei-Ju Yu, Liang Zhang, Biao Fu, Yidong Chen\",\"doi\":\"10.24963/ijcai.2023/584\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most existing studies on Sign Language Translation (SLT) employ AutoRegressive Decoding Mechanism (AR-DM) to generate target sentences. However, the main disadvantage of the AR-DM is high inference latency. To address this problem, we introduce Non-AutoRegressive Decoding Mechanism (NAR-DM) into SLT, which generates the whole sentence at once. Meanwhile, to improve its decoding ability, we integrate the advantages of curriculum learning and NAR-DM and propose a Curriculum-based NAR Decoder (CND). Specifically, the lower layers of the CND are expected to predict simple tokens that could be predicted correctly using source-side information solely. Meanwhile, the upper layers could predict complex tokens based on the lower layers' predictions. Therefore, our CND significantly reduces the model's inference latency while maintaining its competitive performance. Moreover, to further boost the performance of our CND, we propose a mutual learning framework, containing two decoders, i.e., an AR decoder and our CND. We jointly train the two decoders and minimize the KL divergence between their outputs, which enables our CND to learn the forward sequential knowledge from the strengthened AR decoder. Experimental results on PHOENIX2014T and CSL-Daily demonstrate that our model consistently outperforms all competitive baselines and achieves 7.92/8.02× speed-up compared to the AR SLT model respectively. Our source code is available at https://github.com/yp20000921/CND.\",\"PeriodicalId\":394530,\"journal\":{\"name\":\"International Joint Conference on Artificial Intelligence\",\"volume\":\"3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Joint Conference on Artificial Intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.24963/ijcai.2023/584\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Joint Conference on Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.24963/ijcai.2023/584","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

现有的手语翻译研究大多采用自回归译码机制(AR-DM)生成目标句。然而,AR-DM的主要缺点是高推理延迟。为了解决这一问题,我们在SLT中引入了非自回归解码机制(Non-AutoRegressive Decoding Mechanism, NAR-DM),该机制可以一次生成整个句子。同时,为了提高其解码能力,我们结合课程学习和NAR- dm的优势,提出了一种基于课程的NAR解码器(CND)。具体地说,CND的较低层应该预测可以仅使用源端信息正确预测的简单令牌。同时,上层可以根据下层的预测预测复杂的令牌。因此,我们的CND在保持其竞争性能的同时显著降低了模型的推理延迟。此外,为了进一步提高我们的CND的性能,我们提出了一个相互学习框架,包含两个解码器,即AR解码器和我们的CND。我们联合训练两个解码器,并最小化它们输出之间的KL散度,使我们的CND能够从增强的AR解码器中学习前向顺序知识。在PHOENIX2014T和CSL-Daily上的实验结果表明,我们的模型始终优于所有竞争基准,与AR SLT模型相比,分别实现了7.92/8.02倍的加速。我们的源代码可从https://github.com/yp20000921/CND获得。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Efficient Sign Language Translation with a Curriculum-based Non-autoregressive Decoder
Most existing studies on Sign Language Translation (SLT) employ AutoRegressive Decoding Mechanism (AR-DM) to generate target sentences. However, the main disadvantage of the AR-DM is high inference latency. To address this problem, we introduce Non-AutoRegressive Decoding Mechanism (NAR-DM) into SLT, which generates the whole sentence at once. Meanwhile, to improve its decoding ability, we integrate the advantages of curriculum learning and NAR-DM and propose a Curriculum-based NAR Decoder (CND). Specifically, the lower layers of the CND are expected to predict simple tokens that could be predicted correctly using source-side information solely. Meanwhile, the upper layers could predict complex tokens based on the lower layers' predictions. Therefore, our CND significantly reduces the model's inference latency while maintaining its competitive performance. Moreover, to further boost the performance of our CND, we propose a mutual learning framework, containing two decoders, i.e., an AR decoder and our CND. We jointly train the two decoders and minimize the KL divergence between their outputs, which enables our CND to learn the forward sequential knowledge from the strengthened AR decoder. Experimental results on PHOENIX2014T and CSL-Daily demonstrate that our model consistently outperforms all competitive baselines and achieves 7.92/8.02× speed-up compared to the AR SLT model respectively. Our source code is available at https://github.com/yp20000921/CND.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信