Authors: R. Menon, R. Akhil dev, Sreehari G Bhattathiri
Venue: 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC)
Published: 2020-03-01
DOI: https://doi.org/10.1109/ICCMC48092.2020.ICCMC-00040
An Insight into the Relevance of Word Ordering for Text Data Analysis
Sentence ordering and word ordering remain critical tasks in natural language processing applications. Introducing word order information is expected to improve document-related tasks such as keyword extraction, context identification, topic analysis, intent identification, summary generation, document classification, sentiment analysis, and clustering. In this paper, we preserve the structure of document data using various deep learning techniques. Most of these techniques can be compared on the basis of vector similarity. The proposed work helps improve accuracy by taking the order of word occurrence into account. We also compare different word ordering techniques for preserving the structure of the document. The results indicate that the Doc2Vec model outperforms the TF-IDF model in terms of word order similarity.
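To illustrate what a comparison by vector similarity looks like, and why a bag-of-words representation cannot reflect word order, here is a minimal sketch in plain Python. It is not the authors' implementation. The helper names `tfidf_vectors` and `cosine`, the smoothed IDF formula, and the three toy sentences are assumptions chosen for illustration. Doc2Vec is not covered here, since it requires a trained model.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using a smoothed IDF.

    Hypothetical helper for illustration: weight = tf * (ln((1 + N) / (1 + df)) + 1).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # term counts only; token positions are discarded here
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (assumed): the first two sentences use the same words in a different order.
docs = ["dog bites man", "man bites dog", "the cat sleeps"]
vecs = tfidf_vectors(docs)

same_words = cosine(vecs[0], vecs[1])       # reordered sentence
different_words = cosine(vecs[0], vecs[2])  # no shared vocabulary
print(same_words, different_words)
```

Because the term counts are identical, the first two sentences receive the same TF-IDF vector, and their cosine similarity is 1 up to floating-point rounding even though their meanings differ. Any sensitivity to word order therefore has to come from a representation that uses token positions or context.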