Analysis of Non-Negative Double Singular Value Decomposition Initialization Method on Eigenspace-based Fuzzy C-Means Algorithm for Indonesian Online News Topic Detection
Authors: Raden Trivan Sutrisman, H. Murfi
Published in: 2018 6th International Conference on Information and Communication Technology (ICoICT), 2018-11-08
DOI: 10.1109/ICOICT.2018.8528791 (https://doi.org/10.1109/ICOICT.2018.8528791)
Citations: 1
Abstract
The rapid growth of online news in Indonesia creates a need for news analysis that extracts information as quickly as possible. Topics are basic components often used to analyze textual data such as news articles. Topic modeling detects topics automatically in large collections of news documents, a task that is difficult to perform manually. One clustering-based topic modeling method is Eigenspace-based Fuzzy C-Means (EFCM). EFCM is commonly initialized randomly. However, random initialization usually produces different topics on each run. Therefore, we consider Non-Negative Double Singular Value Decomposition (NNDSVD) as an initialization method for EFCM. Besides the advantage of being non-random, our simulations show that NNDSVD gives better accuracy, measured by interpretability score, than random initialization.
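
To illustrate why an NNDSVD-based start is repeatable while a random start is not, the following is a minimal sketch of the standard NNDSVD construction applied to a toy term-document matrix. It is not the authors' implementation: the abstract does not say how the NNDSVD factors are mapped to EFCM's initial cluster centers, so that step is omitted. The function name `nndsvd`, the matrix `X`, and the choice of two topics are assumptions made only for illustration.

```python
import numpy as np


def nndsvd(X, k):
    """Build non-negative factors (W, H) from the truncated SVD of X.

    Hypothetical helper for illustration; assumes X is non-negative and
    k <= min(X.shape).
    """
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    W = np.zeros((X.shape[0], k))
    H = np.zeros((k, X.shape[1]))

    # Leading singular pair: take absolute values, scaled by sqrt(sigma_1).
    W[:, 0] = np.sqrt(S[0]) * np.abs(U[:, 0])
    H[0, :] = np.sqrt(S[0]) * np.abs(Vt[0, :])

    for j in range(1, k):
        x, y = U[:, j], Vt[j, :]
        # Split each singular vector into its positive and negative parts.
        x_pos, y_pos = np.maximum(x, 0), np.maximum(y, 0)
        x_neg, y_neg = np.maximum(-x, 0), np.maximum(-y, 0)
        n_xp, n_yp = np.linalg.norm(x_pos), np.linalg.norm(y_pos)
        n_xn, n_yn = np.linalg.norm(x_neg), np.linalg.norm(y_neg)
        m_pos, m_neg = n_xp * n_yp, n_xn * n_yn

        if max(m_pos, m_neg) == 0:
            continue  # degenerate pair: leave this component at zero

        # Keep whichever part (positive or negative) carries more weight.
        if m_pos > m_neg:
            u, v, m = x_pos / n_xp, y_pos / n_yp, m_pos
        else:
            u, v, m = x_neg / n_xn, y_neg / n_yn, m_neg

        scale = np.sqrt(S[j] * m)
        W[:, j] = scale * u
        H[j, :] = scale * v
    return W, H


# Toy term-document counts (rows: terms, columns: documents); made-up data.
X = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [1, 2, 0, 0],
    [0, 0, 4, 3],
    [0, 1, 3, 4],
    [0, 0, 2, 2],
], dtype=float)

W, H = nndsvd(X, k=2)
W_again, _ = nndsvd(X, k=2)

# Two random starts drawn with different seeds, for comparison.
rand_a = np.random.default_rng(0).random(W.shape)
rand_b = np.random.default_rng(1).random(W.shape)

print("NNDSVD start repeatable:", np.array_equal(W, W_again))
print("Random starts identical:", np.array_equal(rand_a, rand_b))
```

Because the factors depend only on the SVD of the input, repeated calls on the same matrix return the same starting point, whereas a random start changes with the seed. That difference is the source of the run-to-run variation in topics described above.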