{"title":"A Mongolian-Chinese neural machine translation model based on Transformer’s two-branch gating structure","authors":"Genmao Zhang, Yonghong Tian, Jia Hao, Junjin Zhang","doi":"10.1109/IIP57348.2022.00085","DOIUrl":null,"url":null,"abstract":"We propose a new two-branch gating structure for the Transformer. The attention computation is split into two branches: one retains the attention mechanism to capture global information, while the other uses dynamic convolution to capture local information. The two branch outputs are fused through a gating mechanism that replaces both the attention mechanism and the feed-forward network, giving the model fewer parameters and a stronger ability to capture information. Experimental results show a BLEU-4 improvement of 3.07 over the Transformer baseline.","PeriodicalId":412907,"journal":{"name":"2022 4th International Conference on Intelligent Information Processing (IIP)","volume":"131 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 4th International Conference on Intelligent Information Processing (IIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IIP57348.2022.00085","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
We propose a new two-branch gating structure for the Transformer. The attention computation is split into two branches: one retains the attention mechanism to capture global information, while the other uses dynamic convolution to capture local information. The two branch outputs are fused through a gating mechanism that replaces both the attention mechanism and the feed-forward network, giving the model fewer parameters and a stronger ability to capture information. Experimental results show a BLEU-4 improvement of 3.07 over the Transformer baseline.
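The abstract's two-branch design can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses a single-head self-attention branch for global context, a fixed softmax-normalized depthwise kernel as a simplified stand-in for dynamic convolution on the local branch, and a per-position sigmoid gate to fuse the two. All function names and weight shapes here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_branch(x, Wq, Wk, Wv):
    # Global branch: single-head scaled dot-product self-attention.
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def conv_branch(x, kernel):
    # Local branch: depthwise convolution with a softmax-normalized
    # kernel shared across positions (simplified dynamic convolution).
    T, d = x.shape
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    w = softmax(kernel, axis=0)          # (k,) normalized kernel weights
    out = np.zeros_like(x)
    for t in range(T):
        out[t] = (w[:, None] * xp[t:t + k]).sum(axis=0)
    return out

def gated_fusion(x, Wq, Wk, Wv, kernel, Wg):
    # Fuse the two branches with a per-position, per-channel sigmoid gate,
    # in place of a separate attention sublayer + feed-forward network.
    a = attention_branch(x, Wq, Wk, Wv)
    c = conv_branch(x, kernel)
    g = 1.0 / (1.0 + np.exp(-(x @ Wg)))  # gate in (0, 1)
    return g * a + (1.0 - g) * c
```

Because the gate interpolates between the attention output and the convolution output channel-wise, the block keeps the input shape `(T, d)`, so it can slot into an encoder layer where the attention sublayer would otherwise sit.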