Likelihood calculation classification for Indonesian language news documents
Aini Rachmania, J. Jaafar, N. Zamin
2013 International Conference on Information Technology and Electrical Engineering (ICITEE), published 2013-12-02
DOI: 10.1109/ICITEED.2013.6676229 (https://doi.org/10.1109/ICITEED.2013.6676229)
Citations: 4
Abstract
Text categorization is an important research area that seeks to classify textual documents into a set of predetermined categories. Unfortunately, Indonesian news classification has received very little attention. In this paper, we propose a text categorization algorithm based on the Bracewell method, which uses a likelihood calculation between the article and each category's keywords. In experiments, the algorithm classified an Indonesian news corpus with accuracy as high as 93.84% in an offline environment, 93.82% in an online environment, and 80% when benchmarked against human evaluation.
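To make the idea of scoring an article against category keywords concrete, here is a minimal Python sketch. It is not the paper's algorithm or the Bracewell method itself: the abstract does not give the likelihood formula, so the score used here (the fraction of article tokens that match a category's keywords) is an assumption, and the keyword lists, category names, and function names are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical keyword lists for illustration only; the paper's actual
# category keywords are not given in the abstract.
CATEGORY_KEYWORDS = {
    "olahraga": {"gol", "pertandingan", "pemain", "liga"},   # sports
    "ekonomi": {"rupiah", "saham", "inflasi", "bank"},       # economy
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def likelihood(tokens, keywords):
    """Assumed score: share of article tokens that are category keywords."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    hits = sum(counts[word] for word in keywords)
    return hits / len(tokens)


def classify(text, category_keywords=CATEGORY_KEYWORDS):
    """Return (best_category, scores); best_category is None if nothing matches."""
    tokens = tokenize(text)
    scores = {
        category: likelihood(tokens, keywords)
        for category, keywords in category_keywords.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] == 0.0:
        return None, scores
    return best, scores


if __name__ == "__main__":
    label, scores = classify("Pemain itu mencetak gol pada pertandingan liga.")
    print(label, scores)
```

In this sketch an article is assigned to the category with the highest score, and left unclassified when no keyword matches. A real system would likely also need stemming and stop-word removal for Indonesian, which are omitted here.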