Laith Mohammad Abualigah, A. Khader, M. Al-Betar, M. Awadallah
{"title":"A krill herd algorithm for efficient text documents clustering","authors":"Laith Mohammad Abualigah, A. Khader, M. Al-Betar, M. Awadallah","doi":"10.1109/ISCAIE.2016.7575039","DOIUrl":null,"url":null,"abstract":"Recently, due to the huge growth of web pages, social media and modern applications, text clustering technique has emerged as a significant task to deal with a huge amount of text documents. Some web pages are easily browsed and tidily presented via applying the clustering technique in order to partition the documents into a subset of homogeneous clusters. In this paper, two novel text clustering algorithms based on krill herd (KH) algorithm are proposed to improve the web text documents clustering. In the first method, the basic KH algorithm with all its operators is utilized while in the second method, the genetic operators in the basic KH algorithm are neglected. The performance of the proposed KH algorithms is analyzed and compared with the k-mean algorithm. The experiments were conducted using four standard benchmark text datasets. The results showed that the proposed KH algorithms outperformed the k-mean algorithm in term of clusters quality that is evaluated using two common clustering measures, namely, Purity and Entropy.","PeriodicalId":412517,"journal":{"name":"2016 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE)","volume":"153 6","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"54","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISCAIE.2016.7575039","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 54
Abstract
Recently, due to the huge growth of web pages, social media and modern applications, text clustering technique has emerged as a significant task to deal with a huge amount of text documents. Some web pages are easily browsed and tidily presented via applying the clustering technique in order to partition the documents into a subset of homogeneous clusters. In this paper, two novel text clustering algorithms based on krill herd (KH) algorithm are proposed to improve the web text documents clustering. In the first method, the basic KH algorithm with all its operators is utilized while in the second method, the genetic operators in the basic KH algorithm are neglected. The performance of the proposed KH algorithms is analyzed and compared with the k-mean algorithm. The experiments were conducted using four standard benchmark text datasets. The results showed that the proposed KH algorithms outperformed the k-mean algorithm in term of clusters quality that is evaluated using two common clustering measures, namely, Purity and Entropy.