Paraphrase Identification Using Dependency Tree and Word Embeddings
V. Vrublevskyi, O. Marchenko
2020 IEEE 2nd International Conference on Advanced Trends in Information Theory (ATIT), published 2020-11-25
DOI: https://doi.org/10.1109/ATIT50783.2020.9349338
Citations: 1
Abstract
In this paper, we develop a simple and efficient model for detecting paraphrases among English sentences. We use the dependency tree as the main structure for representing the relationships between words in a sentence, and pre-trained general-purpose word embeddings to represent word semantics. Based on these two components, we design a small set of features that help to identify paraphrases. Our experiments show that the model is efficient and achieves results relatively close to those of state-of-the-art models.
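
As a rough illustration of how dependency structure and word embeddings could be combined into paraphrase features, consider the sketch below. It is not the authors' feature set, which the abstract does not specify. The toy 3-dimensional embeddings, the hand-written (head, dependent) edges, and the function names `edge_overlap` and `features` are all assumptions made for the example; a real system would take edges from a dependency parser and vectors from a pre-trained embedding model.

```python
import math

# Toy embeddings with made-up values (assumption: stand-in for pre-trained vectors).
EMB = {
    "cat": (0.9, 0.1, 0.0), "kitten": (0.8, 0.2, 0.1),
    "sat": (0.1, 0.9, 0.0), "rested": (0.2, 0.8, 0.1),
    "mat": (0.0, 0.1, 0.9), "rug": (0.1, 0.2, 0.8),
    "stocks": (-0.8, 0.1, 0.3), "fell": (0.2, -0.9, 0.1),
    "sharply": (0.1, 0.2, -0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def words_of(edges):
    """Unique words appearing in a list of (head, dependent) edges, in order."""
    seen = []
    for head, dep in edges:
        for word in (head, dep):
            if word not in seen:
                seen.append(word)
    return seen


def mean_vector(words):
    """Component-wise average of the embeddings of the given words."""
    vecs = [EMB[w] for w in words]
    return tuple(sum(component) / len(vecs) for component in zip(*vecs))


def edge_similarity(e1, e2):
    """Average of head-to-head and dependent-to-dependent cosine similarity."""
    (h1, d1), (h2, d2) = e1, e2
    return (cosine(EMB[h1], EMB[h2]) + cosine(EMB[d1], EMB[d2])) / 2


def edge_overlap(edges_a, edges_b):
    """Symmetric soft overlap: each edge is matched to its best counterpart."""
    def directed(src, dst):
        return sum(max(edge_similarity(e, f) for f in dst) for e in src) / len(src)
    return (directed(edges_a, edges_b) + directed(edges_b, edges_a)) / 2


def features(edges_a, edges_b):
    """Two hypothetical features for a sentence pair."""
    return {
        "mean_cos": cosine(mean_vector(words_of(edges_a)),
                           mean_vector(words_of(edges_b))),
        "edge_overlap": edge_overlap(edges_a, edges_b),
    }


# Hand-written dependency edges (assumption: normally produced by a parser).
SENT_A = [("sat", "cat"), ("sat", "mat")]         # "The cat sat on the mat."
SENT_B = [("rested", "kitten"), ("rested", "rug")]  # "A kitten rested on the rug."
SENT_C = [("fell", "stocks"), ("fell", "sharply")]  # "Stocks fell sharply."

if __name__ == "__main__":
    print("paraphrase pair:", features(SENT_A, SENT_B))
    print("unrelated pair: ", features(SENT_A, SENT_C))
```

In a setup like this, the feature values for each sentence pair would typically be passed to a standard classifier trained on labeled paraphrase data. Matching edges rather than isolated words lets the score reflect who-did-what structure as well as word meaning.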