{"title":"A Structural Transformer with Relative Positions in Trees for Code-to-Sequence Tasks","authors":"Johannes Villmow, A. Ulges, Ulrich Schwanecke","doi":"10.1109/IJCNN52387.2021.9533717","DOIUrl":null,"url":null,"abstract":"We suggest two approaches to incorporate syntactic information into transformer models encoding trees (e.g. abstract syntax trees) and generating sequences. First, we use self-attention with relative position representations to consider structural relationships between nodes using a representation that encodes movements between any pair of nodes in the tree, and demonstrate how those movements can be computed efficiently on the fly. Second, we suggest an auxiliary loss enforcing the network to predict the lowest common ancestor of node pairs. We apply both methods to source code summarization tasks, where we outperform the state-of-the-art by up to 6 % F1. On natural language machine translation, our models yield competitive results. We also consistently outperform sequence-based transformers, and demonstrate that our method yields representations that are more closely aligned with the AST structure.","PeriodicalId":396583,"journal":{"name":"2021 International Joint Conference on Neural Networks (IJCNN)","volume":"5 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN52387.2021.9533717","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3
Abstract
We suggest two approaches to incorporating syntactic information into transformer models that encode trees (e.g., abstract syntax trees) and generate sequences. First, we use self-attention with relative position representations to capture structural relationships between nodes, using a representation that encodes the movement between any pair of nodes in the tree, and we demonstrate how these movements can be computed efficiently on the fly. Second, we suggest an auxiliary loss that forces the network to predict the lowest common ancestor of node pairs. We apply both methods to source code summarization tasks, where we outperform the state of the art by up to 6% F1. On natural language machine translation, our models yield competitive results. We also consistently outperform sequence-based transformers and demonstrate that our method yields representations that are more closely aligned with the AST structure.
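To make the first idea concrete, the sketch below shows one plausible way to compute the movement between a pair of tree nodes on the fly, as the number of steps up from the first node to their lowest common ancestor (LCA) and the number of steps down from the LCA to the second node. This is a minimal illustration under assumptions, not the paper's implementation: all function and variable names are invented here, and in the actual model such movements would index learned relative-position representations inside self-attention.

```python
# Hedged sketch: compute the (up, down) movement between two tree nodes via
# their lowest common ancestor. Names and the parent-pointer tree encoding
# are illustrative assumptions, not the authors' code.
from typing import Dict, List, Tuple


def ancestors(node: int, parent: Dict[int, int]) -> List[int]:
    """Path from `node` up to the root, inclusive (the root has no parent entry)."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def tree_movement(i: int, j: int, parent: Dict[int, int]) -> Tuple[int, int]:
    """Return (steps up from i to the LCA, steps down from the LCA to j)."""
    up_path = ancestors(i, parent)
    down_index = {node: depth for depth, node in enumerate(ancestors(j, parent))}
    for ups, node in enumerate(up_path):
        if node in down_index:  # first shared ancestor is the LCA
            return ups, down_index[node]
    raise ValueError("nodes do not belong to the same tree")


if __name__ == "__main__":
    # Toy tree:      0
    #              /   \
    #             1     2
    #            / \
    #           3   4
    parent = {1: 0, 2: 0, 3: 1, 4: 1}
    print(tree_movement(3, 4, parent))  # (1, 1): one step up to node 1, one step down to 4
    print(tree_movement(3, 2, parent))  # (2, 1): two steps up to the root, one step down to 2
```

The node at which the upward and downward paths meet is the lowest common ancestor itself, which is also the quantity that the paper's auxiliary loss asks the network to predict for node pairs.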