{"title":"基于特征空间的模糊c均值算法非负双奇异值分解初始化方法在印尼语在线新闻主题检测中的应用","authors":"Raden Trivan Sutrisman, H. Murfi","doi":"10.1109/ICOICT.2018.8528791","DOIUrl":null,"url":null,"abstract":"The rapid increasing of online news in Indonesia creates the need for news analysis to obtain information as fast as possible. Topics are basic components that are often used to analyze data in the textual forms, such as the news article. By using topic modeling, topics can be detected automatically on large news documents which are difficult to perform manually. One of the topic modeling that can be used is the clustering-based method, i.e., Eigenspace-based Fuzzy C-Means (EFCM). The common initialization method of EFCM is random. However, this random initialization usually produces different topics for each run. Therefore, we consider Non-Negative Double Singular Value Decomposition (NNDSVD) as an initialization method of EFCM. Besides the advantage of non-randomness, our simulations show that the NNDSVD method gives better accuracies in term of interpretability score than the random method.","PeriodicalId":266335,"journal":{"name":"2018 6th International Conference on Information and Communication Technology (ICoICT)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2018-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Analysis of Non-Negative Double Singular Value Decomposition Initialization Method on Eigenspace-based Fuzzy C-Means Algorithm for Indonesian Online News Topic Detection\",\"authors\":\"Raden Trivan Sutrisman, H. Murfi\",\"doi\":\"10.1109/ICOICT.2018.8528791\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The rapid increasing of online news in Indonesia creates the need for news analysis to obtain information as fast as possible. Topics are basic components that are often used to analyze data in the textual forms, such as the news article. By using topic modeling, topics can be detected automatically on large news documents which are difficult to perform manually. One of the topic modeling that can be used is the clustering-based method, i.e., Eigenspace-based Fuzzy C-Means (EFCM). The common initialization method of EFCM is random. However, this random initialization usually produces different topics for each run. Therefore, we consider Non-Negative Double Singular Value Decomposition (NNDSVD) as an initialization method of EFCM. Besides the advantage of non-randomness, our simulations show that the NNDSVD method gives better accuracies in term of interpretability score than the random method.\",\"PeriodicalId\":266335,\"journal\":{\"name\":\"2018 6th International Conference on Information and Communication Technology (ICoICT)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-11-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 6th International Conference on Information and Communication Technology (ICoICT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICOICT.2018.8528791\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 6th International Conference on Information and Communication Technology (ICoICT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICOICT.2018.8528791","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Analysis of Non-Negative Double Singular Value Decomposition Initialization Method on Eigenspace-based Fuzzy C-Means Algorithm for Indonesian Online News Topic Detection
The rapid increasing of online news in Indonesia creates the need for news analysis to obtain information as fast as possible. Topics are basic components that are often used to analyze data in the textual forms, such as the news article. By using topic modeling, topics can be detected automatically on large news documents which are difficult to perform manually. One of the topic modeling that can be used is the clustering-based method, i.e., Eigenspace-based Fuzzy C-Means (EFCM). The common initialization method of EFCM is random. However, this random initialization usually produces different topics for each run. Therefore, we consider Non-Negative Double Singular Value Decomposition (NNDSVD) as an initialization method of EFCM. Besides the advantage of non-randomness, our simulations show that the NNDSVD method gives better accuracies in term of interpretability score than the random method.