INGC: Graph Clustering & Outlier Detection Algorithm Using Label Propagation
Authors: Vandana Bhatia, Bharti Saneja, Rinkle Rani
Published in: 2017 International Conference on Machine Learning and Data Science (MLDS), 2017-12-01
DOI: 10.1109/MLDS.2017.14 (https://doi.org/10.1109/MLDS.2017.14)
Citations: 2
Abstract
In the last decade, the volume of data has grown at a tremendous rate. Extracting useful insights from this data requires data mining, and the connections between data items are often of particular interest. These connections can be represented efficiently as graphs, which provide a powerful way to model many applications, ranging from biological and social networks to web networks. Graph mining techniques such as clustering and outlier detection help extract useful information. In this paper, an efficient influence-based graph clustering and outlier detection algorithm (INGC) is proposed, based on label propagation. The proposed algorithm improves on the traditional Label Propagation algorithm by making it more robust. INGC saves time by initially labeling only the highly influential vertices of the network. The labels are then propagated among the remaining nodes, and nodes with the same label are gathered to form a cluster. Vertices to which no label has been assigned during clustering are identified as outliers. Experiments were carried out on three real-life graph datasets. The results show that the proposed INGC outperforms state-of-the-art clustering algorithms in terms of F-measure and modularity. INGC also proved efficient in terms of outlier detection rate.
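The four steps named in the abstract (seed labels at influential vertices, propagate, group by label, flag unlabeled vertices) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the abstract does not define the influence measure, so vertex degree is assumed as a stand-in, and the function name, the `min_degree` threshold, the rule that a seed may not be adjacent to another seed, and the tie-breaking rule are all hypothetical choices made for the example.

```python
from collections import Counter


def influence_seeded_label_propagation(adjacency, min_degree=3):
    """Return (clusters, outliers) for an undirected graph.

    `adjacency` maps each vertex to the set of its neighbours.
    """
    # Assumption: degree stands in for "influence" (not defined in the abstract).
    ranked = sorted(adjacency, key=lambda v: (-len(adjacency[v]), v))

    # Step 1: label only high-influence vertices. A candidate adjacent to an
    # existing seed is skipped so that one dense region gets a single label.
    labels = {}
    for v in ranked:
        if len(adjacency[v]) < min_degree:
            break
        if not any(u in labels for u in adjacency[v]):
            labels[v] = v

    # Step 2: propagate labels to the remaining vertices in synchronous rounds.
    # Simplification: once a vertex is labelled, its label is never revised.
    changed = True
    while changed:
        updates = {}
        for v in adjacency:
            if v in labels:
                continue
            counts = Counter(labels[u] for u in adjacency[v] if u in labels)
            if counts:
                # Most frequent neighbouring label; ties go to the smaller label.
                updates[v] = min(counts, key=lambda lab: (-counts[lab], lab))
        labels.update(updates)
        changed = bool(updates)

    # Step 3: vertices sharing a label form a cluster.
    clusters = {}
    for v, lab in labels.items():
        clusters.setdefault(lab, set()).add(v)

    # Step 4: vertices that never received a label are reported as outliers.
    outliers = set(adjacency) - set(labels)
    return clusters, outliers


# Toy graph: two 4-cliques joined by the edge 3-4, a pendant vertex 9
# attached to vertex 0, and an isolated vertex 8.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4), (0, 9)]
graph = {v: set() for v in range(10)}
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

clusters, outliers = influence_seeded_label_propagation(graph)
print(clusters)
print(outliers)
```

On this toy graph the two cliques come out as separate clusters and the isolated vertex is the only outlier. In this sketch a vertex stays unlabeled only when no label can reach it; the paper's actual outlier criterion may be stricter.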