BIT-Xiaomi's System for AutoSimTrans 2022

Proceedings of the Third Workshop on Automatic Simultaneous Translation Pub Date : 1900-01-01 DOI:10.18653/v1/2022.autosimtrans-1.6

Mengge Liu, Xiang Li, Bao Chen, Yanzhi Tian, Tianwei Lan, Silin Li, Yuhang Guo, Jian Luan, Bin Wang


Citations: 1

Abstract


This system paper describes the BIT-Xiaomi simultaneous translation system for the AutoSimTrans 2022 simultaneous translation challenge. We participated in three tracks: the Zh-En text-to-text track, the Zh-En audio-to-text track, and the En-Es text-to-text track. In our system, wait-k is employed to train prefix-to-prefix translation models. We integrate streaming chunking to detect boundaries as the source stream is read in. We further improve our system with data selection, data augmentation, and R-drop training methods. Results show that our wait-k implementation outperforms the organizers' baseline by up to 8 BLEU, and our proposed streaming chunking method yields a further improvement of about 2 BLEU in the low-latency regime.
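As a rough illustration of the wait-k policy mentioned in the abstract, the sketch below computes the standard wait-k read/write schedule: the decoder reads k source tokens, then alternates between writing one target token and reading one source token until the source is exhausted. The function names `wait_k_schedule` and `wait_k_actions` are hypothetical and chosen for this example; this is not the authors' implementation, and it does not model their streaming chunking or training setup.

```python
from typing import List


def wait_k_schedule(k: int, src_len: int, tgt_len: int) -> List[int]:
    """Return g(t) for t = 1..tgt_len: the number of source tokens
    that have been read when target token t is emitted."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Standard wait-k: g(t) = min(k + t - 1, |source|).
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


def wait_k_actions(k: int, src_len: int, tgt_len: int) -> List[str]:
    """Expand the schedule into an explicit READ/WRITE action sequence."""
    actions: List[str] = []
    read = 0
    for g in wait_k_schedule(k, src_len, tgt_len):
        # Read until the source prefix needed for this target token is available.
        while read < g:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions


if __name__ == "__main__":
    # Illustrative lengths only (assumed values, not from the paper).
    print(wait_k_schedule(k=3, src_len=5, tgt_len=6))
    print(" ".join(wait_k_actions(k=3, src_len=5, tgt_len=6)))
```

A smaller k lowers latency because the first target token is emitted sooner, at the cost of translating from a shorter source prefix; this is the trade-off behind reporting BLEU separately for the low-latency regime.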
