A method Based on an Attention Mechanism to Measure the Similarity of two Sentences
Seyed Vahid Moravvej, Mehdi Joodaki, Mohammad Javad Maleki Kahaki, Moein Salimi Sartakhti
2021 7th International Conference on Web Research (ICWR), published 2021-05-19
DOI: 10.1109/ICWR51868.2021.9443135 (https://doi.org/10.1109/ICWR51868.2021.9443135)
Citations: 18
Abstract
Bidirectional LSTMs and the attention mechanism play an essential role in many areas of natural language processing. Many studies give every word equal importance, which weakens the resulting model. This research proposes a method based on an attention-based Bidirectional Long Short-Term Memory (BLSTM) network to detect plagiarism at the sentence level. Words are first embedded with GloVe and Word2Vec, and these vectors serve as the initial embeddings. Two BLSTM networks then embed the two sentences separately. Finally, the sentence embeddings and their difference are concatenated and passed through a classifier. We evaluate the model on two datasets, in Persian and in English. The results show that the proposed model outperforms the compared methods.
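The last two steps of the pipeline described in the abstract (weighting words by attention, then combining the two sentence embeddings with their difference) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the toy hidden states stand in for BLSTM outputs, the attention score is a plain dot product with a hypothetical `context` vector, and the difference is taken as a signed element-wise subtraction. The BLSTM itself and the classifier are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Collapse per-word hidden states into one sentence vector.

    Each word is scored by its dot product with `context` (an assumed
    scoring function), so words receive unequal weights.
    """
    scores = [sum(h * c for h, c in zip(state, context)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


def pair_features(u, v):
    """Concatenate both sentence embeddings with their element-wise difference."""
    diff = [a - b for a, b in zip(u, v)]
    return u + v + diff


# Toy stand-ins for BLSTM outputs: one 2-dimensional hidden state per word.
states_a = [[0.1, 0.3], [0.9, 0.2], [0.4, 0.8]]
states_b = [[0.2, 0.1], [0.7, 0.6]]
context = [1.0, 0.5]  # hypothetical attention context vector (learned in practice)

emb_a, weights_a = attention_pool(states_a, context)
emb_b, weights_b = attention_pool(states_b, context)
features = pair_features(emb_a, emb_b)  # input to a downstream classifier
```

In a trained model the context vector and the hidden states would be learned jointly with the classifier; the sketch only shows how unequal word weights and the concatenated feature vector arise.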