{"title":"Automatic multilabel classification for Indonesian news articles","authors":"Dyah Rahmawati, M. L. Khodra","doi":"10.1109/ICAICTA.2015.7335382","DOIUrl":"https://doi.org/10.1109/ICAICTA.2015.7335382","url":null,"abstract":"Problem transformation and algorithm adaptation are the two main machine learning approaches to multilabel classification. This paper investigates both approaches for multilabel classification of Indonesian news articles. Because this task involves a large number of features, we also apply feature selection methods to reduce the feature dimension. The paper focuses on four factors: feature weighting method, feature selection method, multilabel classification approach, and single-label classification algorithm. These factors are combined to determine the best combination. The experiments show that the best performer for multilabel classification of Indonesian news articles is the combination of TF-IDF feature weighting, Symmetrical Uncertainty feature selection, Calibrated Label Ranking (a problem transformation approach), and the SVM algorithm. This combination achieves an F-measure of 85.13% in 10-fold cross-validation, but the F-measure decreases to 76.73% in testing because of out-of-vocabulary (OOV) words.","PeriodicalId":319020,"journal":{"name":"2015 2nd International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129098963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}