Marking Mechanism in Sequence-to-sequence Model for Mapping Language to Logical Form
Phuong Minh Nguyen, Khoat Than, Minh Le Nguyen
2019 11th International Conference on Knowledge and Systems Engineering (KSE), 2019-10-01
DOI: 10.1109/KSE.2019.8919471 (https://doi.org/10.1109/KSE.2019.8919471)
Citation count: 0
Abstract
Semantic parsing in natural language processing (NLP) is a challenging task that has been studied for many years. Its main purpose is to map natural language to a logical form, much like a machine translation task. Recently, approaches that use a neural sequence-to-sequence (Seq2seq) model have achieved positive results. However, many challenges have not yet been solved thoroughly, especially the problem of rare words. Rare words in a natural sentence are usually the names of objects or places, numbers, times, etc. Although these words vary widely and their meaning is difficult for the model to capture, they carry key information in human communication (for example: name all the rivers in colorado ?). Some methods address this problem, such as attention or the copy mechanism. However, these methods still have difficulty copying rare-word phrases, especially when the phrases vary in length. This paper proposes a novel approach to this problem, namely the Marking mechanism in Seq2seq. The main idea is that the encoder labels the special words in a sentence, that is, the rare words (marking step), and the decoder then produces the logical form based on those labels (transforming step). Our experiments demonstrate that this approach works effectively: it achieves results competitive with previous methods on all three datasets (Geo, Atis, Jobs) and, in particular, outperforms them on our Artificial dataset.
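
The marking and transforming steps described above can be illustrated with a toy, rule-based sketch. It is not the paper's neural model: an out-of-vocabulary check stands in for the encoder's learned labelling, and a hand-written logical form stands in for the decoder's output. The function names, the `<m0>` marker format, the small vocabulary, the "new mexico" sentence, and the `rivers_in` predicate are all assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple


def mark_rare_spans(tokens: List[str], vocab: Set[str]) -> Tuple[List[str], Dict[str, str]]:
    """Marking step (toy version): replace each maximal run of
    out-of-vocabulary tokens with one marker such as <m0>."""
    marked: List[str] = []
    spans: Dict[str, str] = {}
    run: List[str] = []

    def flush() -> None:
        # Close the current run of rare tokens, if there is one.
        if run:
            marker = f"<m{len(spans)}>"
            spans[marker] = " ".join(run)
            marked.append(marker)
            run.clear()

    for tok in tokens:
        if tok in vocab:
            flush()
            marked.append(tok)
        else:
            run.append(tok)
    flush()
    return marked, spans


def fill_markers(logical_form: List[str], spans: Dict[str, str]) -> str:
    """Transforming step (toy version): substitute the marked phrases
    back into a logical form that refers to them only by marker."""
    return " ".join(spans.get(tok, tok) for tok in logical_form)


# Hypothetical vocabulary of frequent words; anything else counts as rare.
VOCAB = {"name", "all", "the", "rivers", "in", "?"}

# Hypothetical decoder output that mentions the rare phrase only via its marker.
LOGICAL_FORM = ["rivers_in", "(", "<m0>", ")"]

for sentence in ("name all the rivers in colorado ?",
                 "name all the rivers in new mexico ?"):
    marked, spans = mark_rare_spans(sentence.split(), VOCAB)
    print(" ".join(marked), "->", fill_markers(LOGICAL_FORM, spans))
```

Because a whole run of rare tokens collapses into a single marker, the one-word and two-word place names yield the same marked sentence, which is the property that makes variable-length phrases easier to handle in this sketch.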