Mustafa H. Hajeer, D. Dasgupta, Alexander Semenov, J. Veijalainen
2014 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), published 2014-12-01. DOI: 10.1109/CIDM.2014.7008660 (https://doi.org/10.1109/CIDM.2014.7008660)
Distributed evolutionary approach to data clustering and modeling
In this article we describe DEGA-Gen, a framework for applying distributed genetic algorithms to community detection in networks. The framework encodes the network compactly in the chromosomes, which greatly reduces memory use and computation and makes the framework scalable. Different objective functions may be used to divide the network into communities. The framework is implemented on Hadoop, an open-source implementation of the MapReduce paradigm. We validate the framework by developing a community detection algorithm that uses modularity as the measure of division quality. The algorithm outputs the network partitioned into non-overlapping communities such that network modularity is maximized. We apply the algorithm to well-known data sets: the Zachary karate club, the bottlenose dolphins network, the college football dataset, and the US political books dataset. The framework achieves comparable modularity while using much less memory for network representation. Further, the framework is scalable and can handle large graphs, as shown by tests on a larger youtube.com dataset.
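
To make the modularity objective concrete, the following is a minimal Python sketch of how a candidate partition could be scored. It is not the DEGA-Gen implementation: the abstract does not specify the chromosome encoding, so the label vector used here (one community id per node) is an assumption, as are the names `modularity` and `two_triangles`. The sketch runs on a single machine and does not use Hadoop.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges  : list of (u, v) pairs, each undirected edge listed once
    labels : labels[i] is the community id of node i
             (an assumed label-vector encoding, not the paper's chromosome)
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # sum of node degrees in community c
    for u, v in edges:
        cu, cv = labels[u], labels[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (L_c / m) - (d_c / 2m)^2
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by a single edge.
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

split = modularity(two_triangles, [0, 0, 0, 1, 1, 1])   # one community per triangle
merged = modularity(two_triangles, [0, 0, 0, 0, 0, 0])  # everything in one community
print(split, merged)
```

In a genetic algorithm, a function of this kind would serve as the fitness of each individual, so the partition that separates the two triangles scores higher than the one that merges all nodes.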