Title: Unsupervised Topic Detection based on 2D Vector Space model using Apriori Algorithm and NLP
Author: Michael George
Venue: 2018 Thirteenth International Conference on Digital Information Management (ICDIM)
Published: 2018-09-01
DOI: 10.1109/ICDIM.2018.8846982 (https://doi.org/10.1109/ICDIM.2018.8846982)
Citation count: 1
Abstract
Topic modelling is a data mining approach that uses machine learning methods to discover patterns in large amounts of unstructured text. It takes a collection of documents, groups their words into clusters that we call bags of words, and identifies topics by means of a similarity measure. Topic modelling gives us methods to organize, understand and summarize large collections of textual information. Many approaches to topic modelling have been proposed; the most widely used are Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA) and Explicit Semantic Analysis (ESA). In this study we describe an approach that refines topic detection based on a 2D vector space model (VSM) by using the Apriori algorithm together with natural language processing, in order to form better-connected terms in the vector space for clean engagement with the query.
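
The abstract does not say how the Apriori algorithm is applied, so the following is only a minimal sketch of generic Apriori frequent-itemset mining over tokenized documents, not the paper's implementation. It assumes each document has already been reduced to a bag of words; the function name `apriori`, the toy `docs` list and the `min_support` threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(terms): support_count} for every frequent term set.

    Illustrative sketch only: `transactions` is a list of token lists
    (one per document) and `min_support` is an absolute document count.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many documents contain each single term.
    counts = {}
    for doc in transactions:
        for term in doc:
            key = frozenset([term])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-sets into k-set candidates.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count the documents that contain each surviving candidate.
        counts = {c: sum(1 for doc in transactions if c <= doc)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1

    return frequent


# Hypothetical toy corpus: each inner list is one document's bag of words.
docs = [
    ["topic", "model", "text"],
    ["topic", "model", "cluster"],
    ["topic", "text", "query"],
    ["model", "text", "topic"],
]
frequent_sets = apriori(docs, min_support=3)
```

With this toy corpus and threshold, the pairs topic/model and topic/text are kept as frequent, while model/text appears in only two documents and is dropped. Term sets that survive in this way could then serve as the connected terms placed in a vector space, though how the paper builds its 2D VSM from them is not stated in the abstract.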