Vishnu Murthy G, Vishnu Vardhan B, Sarangam K, V. P
Title: A Comparative study on Term Weighting Methods for Automated Telugu Text Categorization with Effective Classifiers
Journal: International Journal of Data Mining & Knowledge Management Process
Published: 2013-11-30
DOI: 10.5121/IJDKP.2013.3606
Citation count: 16
Abstract
Automatic text categorization is the process of automatically assigning a text to one or more categories from a predefined set. Text categorization is challenging for Indian languages because they are morphologically rich, which leads to a large number of word forms and large feature spaces. This paper compares the performance of several classifiers under different term weighting methods in order to identify the approach best suited to the Telugu text classification problem. We evaluate the term weighting methods on a Telugu corpus in combination with Naive Bayes (NB), Support Vector Machine (SVM), and k-Nearest Neighbor (kNN) classifiers.
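
To make the idea of term weighting concrete, here is a minimal Python sketch of two common schemes, raw term frequency (TF) and TF-IDF. The abstract does not name the specific weighting methods compared in the paper, so the choice of these two schemes, the function names, the unsmoothed `log(N / df)` formula, and the toy tokens are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def tf_weights(tokens):
    """Raw term frequency: how often each term occurs in one document."""
    return dict(Counter(tokens))


def tfidf_weights(docs):
    """TF-IDF for each document, using idf = log(N / df) (no smoothing)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    weighted = []
    for tokens in docs:
        tf = Counter(tokens)
        weighted.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weighted


# Hypothetical toy corpus of already-tokenized documents (placeholder tokens,
# not taken from the paper's Telugu corpus).
docs = [
    ["cricket", "match", "score"],
    ["election", "vote", "match"],
    ["cricket", "cricket", "bat"],
]
weights = tfidf_weights(docs)
```

Each resulting dictionary is a sparse feature vector for one document. Vectors of this kind are what a classifier such as NB, SVM, or kNN would take as input, and changing the weighting function changes those inputs without changing the classifier.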