Comparing the performance of group detection algorithm in serial and parallel processing environments

Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.) Pub Date : 2012-07-16 DOI:10.1145/2335755.2335817

Channing Brown, Iftekhar Ahmed, Y. D. Cai, M. S. Poole, Andrew Pilny, Yannick Atouba Ada

Pages: 21:1-21:4

Citations: 2

Abstract

Developing an algorithm that identifies groups in a collection of individuals, when no grouping data are available, has received significant attention because of the need to better understand groups and teams in online environments. This study used space, time, task, and players' virtual behavioral indicators from a game database to develop an algorithm that detects groups over time. The group detection algorithm was developed primarily for a serial processing environment and later modified to allow parallel processing on Gordon. For a collection of data representing 192 days of game play (approximately 140 gigabytes of log data), the major steps of the analysis required 266 minutes when running on a single processor. The same computation required 25 minutes when running on Gordon with 16 processors. The massive compute nodes and rich shared memory environment on Gordon improved the performance of our analysis by a factor of about 11 (266 / 25 ≈ 10.6). Besides demonstrating the potential to save time and effort, this study also highlights lessons learned in transforming a serial detection algorithm for parallel environments.
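The timings reported in the abstract can be expressed as a speedup and a parallel efficiency. The sketch below is illustrative arithmetic only, not code from the paper. The function and constant names are hypothetical, and it assumes both timings cover the same "major steps" of the analysis.

```python
# Illustrative arithmetic on the timings quoted in the abstract.
# Names are hypothetical; nothing here comes from the authors' code.

SERIAL_MINUTES = 266.0    # single processor, as reported
PARALLEL_MINUTES = 25.0   # Gordon with 16 processors, as reported
PROCESSORS = 16


def speedup(serial_minutes: float, parallel_minutes: float) -> float:
    """Ratio of serial to parallel wall-clock time."""
    return serial_minutes / parallel_minutes


def efficiency(serial_minutes: float, parallel_minutes: float, processors: int) -> float:
    """Speedup divided by processor count (1.0 would be ideal linear scaling)."""
    return speedup(serial_minutes, parallel_minutes) / processors


s = speedup(SERIAL_MINUTES, PARALLEL_MINUTES)
e = efficiency(SERIAL_MINUTES, PARALLEL_MINUTES, PROCESSORS)

print(f"speedup: {s:.2f}x")              # speedup: 10.64x
print(f"parallel efficiency: {e:.1%}")   # parallel efficiency: 66.5%
```

The speedup of about 10.6 rounds to the "factor of 11" quoted in the abstract. An efficiency of roughly two thirds on 16 processors means the run fell short of ideal linear scaling.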

