{"title":"基于混合输入的改进注意力转换器的机器翻译","authors":"M. Abrishami, Mohammad J. Rashti, M. Naderan","doi":"10.1109/ICWR49608.2020.9122317","DOIUrl":null,"url":null,"abstract":"Machine Translation (MT) refers to the automated software-based translation of natural language text. The embedded complexities and incompatibilities of natural languages have made MT a daunting task facing numerous challenges, especially when it is to be compared to a manual translation. With the emergence of deep-learning AI approaches, the Neural Machine Translation (NMT) has pushed MT results closer to human expectations. One of the newest deep learning approaches is the sequence-to-sequence approach based on Recurrent Neural Networks (RNN), complex convolutions, and transformers, and employing encoders/decoder pairs. In this study, an attention-based deep learning architecture is proposed for MT, with all layers focused exclusively on multi-head attention and based on a transformer that includes multi-layer encoders/decoders. The main contributions of the proposed model lie in the weighted combination of layers' primary input and output of the previous layers, feeding into the next layer. This mechanism results in a more accurate transformation compared to non-hybrid inputs. The model is evaluated using two datasets for German/English translation, the WMT'14 dataset for training, and the newstest'2012 dataset for testing. The experiments are run on GPD-equipped Google Colab instances and the results show an accuracy of 36.7 BLEU, a 5% improvement over the previous work without the hybrid-input technique.","PeriodicalId":231982,"journal":{"name":"2020 6th International Conference on Web Research (ICWR)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Machine Translation Using Improved Attention-based Transformer with Hybrid Input\",\"authors\":\"M. Abrishami, Mohammad J. Rashti, M. Naderan\",\"doi\":\"10.1109/ICWR49608.2020.9122317\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine Translation (MT) refers to the automated software-based translation of natural language text. The embedded complexities and incompatibilities of natural languages have made MT a daunting task facing numerous challenges, especially when it is to be compared to a manual translation. With the emergence of deep-learning AI approaches, the Neural Machine Translation (NMT) has pushed MT results closer to human expectations. One of the newest deep learning approaches is the sequence-to-sequence approach based on Recurrent Neural Networks (RNN), complex convolutions, and transformers, and employing encoders/decoder pairs. In this study, an attention-based deep learning architecture is proposed for MT, with all layers focused exclusively on multi-head attention and based on a transformer that includes multi-layer encoders/decoders. The main contributions of the proposed model lie in the weighted combination of layers' primary input and output of the previous layers, feeding into the next layer. This mechanism results in a more accurate transformation compared to non-hybrid inputs. The model is evaluated using two datasets for German/English translation, the WMT'14 dataset for training, and the newstest'2012 dataset for testing. 
The experiments are run on GPD-equipped Google Colab instances and the results show an accuracy of 36.7 BLEU, a 5% improvement over the previous work without the hybrid-input technique.\",\"PeriodicalId\":231982,\"journal\":{\"name\":\"2020 6th International Conference on Web Research (ICWR)\",\"volume\":\"12 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 6th International Conference on Web Research (ICWR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICWR49608.2020.9122317\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 6th International Conference on Web Research (ICWR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICWR49608.2020.9122317","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Machine Translation Using Improved Attention-based Transformer with Hybrid Input
Machine Translation (MT) refers to the automated, software-based translation of natural language text. The inherent complexities and incompatibilities of natural languages make MT a daunting task facing numerous challenges, especially when compared to manual translation. With the emergence of deep-learning AI approaches, Neural Machine Translation (NMT) has pushed MT results closer to human expectations. Among the newest deep-learning approaches are sequence-to-sequence models based on Recurrent Neural Networks (RNNs), convolutional networks, and transformers, all employing encoder/decoder pairs. In this study, an attention-based deep learning architecture is proposed for MT, in which every layer relies exclusively on multi-head attention, built on a transformer with multi-layer encoders and decoders. The main contribution of the proposed model is a hybrid input: each layer's primary input is combined, with learned weights, with the outputs of preceding layers before being fed into the next layer. This mechanism yields more accurate translations than non-hybrid inputs. The model is evaluated on German-English translation using two datasets: WMT'14 for training and newstest2012 for testing. The experiments are run on GPU-equipped Google Colab instances, and the results show a BLEU score of 36.7, a 5% improvement over previous work that did not use the hybrid-input technique.
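To make the hybrid-input idea concrete, the following is a minimal sketch (not the authors' code) of a transformer encoder in which each layer receives a learned weighted blend of the previous layer's output and the original (primary) embeddings. The class name, the per-layer mixing parameter, and the use of PyTorch's TransformerEncoderLayer are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of a hybrid-input transformer encoder (not the authors' implementation).
import torch
import torch.nn as nn


class HybridInputEncoder(nn.Module):
    def __init__(self, d_model=512, nhead=8, num_layers=6, dim_ff=2048):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model, nhead, dim_ff, batch_first=True)
            for _ in range(num_layers)
        ])
        # One learned mixing weight per layer; a sigmoid keeps it in (0, 1).
        self.mix = nn.Parameter(torch.zeros(num_layers))

    def forward(self, embeddings):
        out = embeddings
        for i, layer in enumerate(self.layers):
            alpha = torch.sigmoid(self.mix[i])
            # Hybrid input: weighted combination of the previous layer's output
            # and the layer's primary input (the original embeddings).
            hybrid = alpha * out + (1.0 - alpha) * embeddings
            out = layer(hybrid)
        return out


if __name__ == "__main__":
    x = torch.randn(2, 10, 512)           # (batch, sequence, d_model)
    print(HybridInputEncoder()(x).shape)   # torch.Size([2, 10, 512])
```

The paper describes combining the outputs of previous layers with each layer's primary input; the sketch above uses only the original embeddings as that primary input and a single scalar weight per layer, which is one simple way such a weighted combination could be realized.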