{"title":"Performance Issues in Parallelizing Data-Intensive Applications on a Multi-core Cluster","authors":"Vignesh T. Ravi, G. Agrawal","doi":"10.1109/CCGRID.2009.83","journal":{"name":"2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid"},"publicationDate":"2009-05-18","citationCount":"23","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/CCGRID.2009.83"}
Performance Issues in Parallelizing Data-Intensive Applications on a Multi-core Cluster
The deluge of data available for analysis makes it necessary to scale the performance of data mining implementations. Given current architectural trends, one of the major challenges today is achieving both programmability and performance for data mining applications on multi-core machines and clusters of multi-core machines. To address this problem, we have been developing a runtime framework, FREERIDE, that enables parallel execution of data mining and data analysis tasks. The contributions of this paper are two-fold: 1) we describe and evaluate various shared-memory parallelization techniques developed in our runtime system on a cluster of multi-cores, and 2) we report on a detailed performance study to understand why certain parallelization techniques outperform other techniques for a particular application.