Veena Gangadharan, Deepa Gupta, Amritha L, Athira T A
2020 4th International Conference on Trends in Electronics and Informatics (ICOEI), published 2020-06-01. DOI: 10.1109/ICOEI48184.2020.9142877 (https://doi.org/10.1109/ICOEI48184.2020.9142877)
Paraphrase Detection Using Deep Neural Network Based Word Embedding Techniques
This paper focuses on detecting paraphrases in sentences using different word vectorization techniques and on determining which technique is most efficient. Word vectorization is used to retrieve information from large collections of textual data, such as corpora or documents, by representing each word as a vector. Because textual data are massive, the main difficulty is that text must be expressed in numerical form before mathematical methods can be applied to it. Methods for doing so range from elementary to composite. In this paper we compare the following word vectorization techniques: Count Vectorizer, Hashing Vectorizer, TF-IDF Vectorizer, fastText, ELMo, GloVe, and BERT.
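To make the idea of turning sentences into numbers concrete, the sketch below illustrates the two most elementary techniques in the list, count vectors and TF-IDF vectors, using only the Python standard library. It is not the authors' implementation. The whitespace tokenizer, the helper names (`tokenize`, `count_vectors`, `tfidf_vectors`, `cosine`), and the example sentences are all hypothetical; the smoothed IDF formula is one common variant, and using cosine similarity as a paraphrase score is an assumption, since the abstract does not say how vectors are compared or classified.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return sentence.lower().split()


def count_vectors(sentences):
    """Map each sentence to a term-count vector over the shared vocabulary."""
    tokenized = [tokenize(s) for s in sentences]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf_vectors(sentences):
    """Reweight term counts by a smoothed inverse document frequency."""
    vocab, counts = count_vectors(sentences)
    n_docs = len(sentences)
    # Number of sentences that contain each vocabulary term.
    doc_freq = [sum(1 for vec in counts if vec[i] > 0) for i in range(len(vocab))]
    # Assumed variant: idf = ln((1 + n) / (1 + df)) + 1, so no weight is zero.
    idf = [math.log((1 + n_docs) / (1 + df)) + 1.0 for df in doc_freq]
    return vocab, [[c * w for c, w in zip(vec, idf)] for vec in counts]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    # Hypothetical sentences: the first two are loose paraphrases.
    sentences = [
        "the cat sat on the mat",
        "a cat was sitting on the mat",
        "stock prices fell sharply today",
    ]
    _, vecs = tfidf_vectors(sentences)
    print("paraphrase pair:", cosine(vecs[0], vecs[1]))
    print("unrelated pair: ", cosine(vecs[0], vecs[2]))
```

Vectors like these only capture word overlap, which is why "sat" and "sitting" contribute nothing to the similarity here. The embedding models named in the abstract (fastText, ELMo, GloVe, BERT) are aimed at that gap, but they need pretrained weights and are outside the scope of this sketch.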