Kanyanut Kriengket, Kanchana Saengthongpattana, Peerachet Porkaew, Vorapon Luantangsrisuk, P. Boonkwan, T. Supnithi
{"title":"复杂语法结构转换模型的行为分析","authors":"Kanyanut Kriengket, Kanchana Saengthongpattana, Peerachet Porkaew, Vorapon Luantangsrisuk, P. Boonkwan, T. Supnithi","doi":"10.1109/iSAI-NLP51646.2020.9376782","DOIUrl":null,"url":null,"abstract":"State-of-the-art neural MT, e.g. Transformer, yields quite promising translation accuracy. However, these models are easy to be interfered by noises, causing over- and undertranslation issues. This paper presents a behavioral analysis of Transformer models in translating complex grammatical structures, i.e. multiple-word expressions and long-distance dependency. Results consistently show that the more complex structures, the less translation accuracy the models yield. We imply that as phrase structures become more complex, the focus patterns learned by the attention mechanism may get erratically sporadic due to the issue of data sparseness. We suggest the use of locality penalty and the increase of attention heads to mitigate the issue, but their trade-offs should also be aware.","PeriodicalId":311014,"journal":{"name":"2020 15th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP)","volume":"4 4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Behavioral Analysis of Transformer Models on Complex Grammatical Structures\",\"authors\":\"Kanyanut Kriengket, Kanchana Saengthongpattana, Peerachet Porkaew, Vorapon Luantangsrisuk, P. Boonkwan, T. Supnithi\",\"doi\":\"10.1109/iSAI-NLP51646.2020.9376782\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"State-of-the-art neural MT, e.g. Transformer, yields quite promising translation accuracy. However, these models are easy to be interfered by noises, causing over- and undertranslation issues. This paper presents a behavioral analysis of Transformer models in translating complex grammatical structures, i.e. multiple-word expressions and long-distance dependency. Results consistently show that the more complex structures, the less translation accuracy the models yield. We imply that as phrase structures become more complex, the focus patterns learned by the attention mechanism may get erratically sporadic due to the issue of data sparseness. 
We suggest the use of locality penalty and the increase of attention heads to mitigate the issue, but their trade-offs should also be aware.\",\"PeriodicalId\":311014,\"journal\":{\"name\":\"2020 15th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP)\",\"volume\":\"4 4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-11-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 15th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/iSAI-NLP51646.2020.9376782\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 15th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/iSAI-NLP51646.2020.9376782","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Behavioral Analysis of Transformer Models on Complex Grammatical Structures
State-of-the-art neural MT models, e.g. the Transformer, yield promising translation accuracy. However, these models are easily disturbed by noise, causing over- and under-translation issues. This paper presents a behavioral analysis of Transformer models in translating complex grammatical structures, i.e. multi-word expressions and long-distance dependencies. Results consistently show that the more complex the structure, the lower the translation accuracy the models yield. We infer that as phrase structures become more complex, the focus patterns learned by the attention mechanism may become erratic and sporadic due to data sparseness. We suggest using a locality penalty and increasing the number of attention heads to mitigate the issue, but their trade-offs should also be taken into account.
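The abstract proposes a locality penalty on attention without specifying its exact form. The following is a minimal sketch, assuming a Gaussian distance-based penalty subtracted from the attention logits before the softmax; the function name, the `sigma` parameter, and the penalty shape are illustrative assumptions, not the authors' formulation.

```python
# Sketch of a hypothetical locality penalty on scaled dot-product attention.
# Assumes a Gaussian penalty on query-key distance; not the paper's exact method.
import numpy as np

def locality_penalized_attention(Q, K, V, sigma=2.0):
    """Scaled dot-product attention with a distance-based locality penalty.

    Q, K, V: arrays of shape (seq_len, d_model).
    sigma:   assumed hyperparameter controlling how fast distant positions
             are down-weighted.
    """
    seq_len, d_model = Q.shape
    scores = Q @ K.T / np.sqrt(d_model)  # standard attention logits

    # Penalize positions far from the query so attention stays local,
    # which may stabilize focus patterns on long, complex structures.
    positions = np.arange(seq_len)
    distance = np.abs(positions[:, None] - positions[None, :])
    scores = scores - (distance ** 2) / (2.0 * sigma ** 2)

    # Softmax over keys, then weighted sum of values.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage with random tensors.
rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 8))
K = rng.normal(size=(6, 8))
V = rng.normal(size=(6, 8))
print(locality_penalized_attention(Q, K, V, sigma=1.5).shape)  # (6, 8)
```

Increasing the number of attention heads, the second suggested mitigation, would be a model-configuration change (e.g. the `nhead` setting of a standard Transformer implementation) rather than a change to this scoring function.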