A. Guran, M. Ganiz, H. S. Naiboglu, Halil Oguz Kaptikacti
2013 IEEE INISTA, published 2013-06-19. DOI: https://doi.org/10.1109/INISTA.2013.6577618
NMF based dimension reduction methods for Turkish text clustering
In this work, we analyze the effects of dimension reduction methods based on non-negative matrix factorization (NMF) on the clustering of Turkish documents using the k-means clustering algorithm. All experiments are conducted on two datasets that we call Milliyet4c1k and 1150haber. The NMF-based dimension reduction methods serve two purposes: to reduce the original vector space by transformation, and to reduce size and dimension by summarizing the original documents. Experimental results show that NMF transformation yields better clustering results on both datasets. Running k-means on summarized documents produces results almost identical to those of k-means on the original documents. Although using summaries instead of full documents does not improve clustering quality, we show that it significantly reduces the size of the processed data and the execution time of the k-means clustering algorithm.
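The transformation step described above (factor the document-term matrix, then cluster the reduced document vectors with k-means) can be illustrated with a minimal sketch. The abstract does not specify the NMF algorithm, the rank, the preprocessing, or the k-means initialization, so everything here is an assumption: the factorization uses the standard multiplicative updates for squared error, the k-means seeding is a simple deterministic farthest-first rule, the reduced vectors are length-normalized before clustering, and the four toy documents are invented for illustration and are not drawn from Milliyet4c1k or 1150haber.

```python
import math
import random

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def nmf(v, rank, iters=500, seed=0, eps=1e-9):
    """Approximate non-negative v (docs x terms) as w @ h using
    multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # w <- w * (v h^T) / (w h h^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-first seeding (an assumption made
    here for determinism, not the paper's stated setup)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy corpus: two sports-like and two finance-like documents.
docs = [
    "futbol gol lig futbol transfer",
    "lig gol futbol transfer gol",
    "borsa dolar ekonomi banka borsa",
    "ekonomi banka dolar transfer ekonomi borsa",
]
vocab = sorted({word for d in docs for word in d.split()})
v = [[d.split().count(term) for term in vocab] for d in docs]  # term counts

w, h = nmf(v, rank=2)  # each document is now a 2-dimensional row of w
# Length-normalize the reduced vectors so clustering compares direction only.
reduced = [[x / (math.sqrt(sum(y * y for y in row)) or 1.0) for x in row]
           for row in w]
labels = kmeans(reduced, k=2)
print(labels)
```

In this sketch k-means operates on 2 coordinates per document instead of one coordinate per vocabulary term, which is the sense in which the transformation reduces the vector space. On a real corpus one would likely use a library implementation and term weighting such as TF-IDF, neither of which the abstract confirms.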