A Hierarchical Model for the Analysis of Efficiency and Speed-Up of Multi-core Cluster-Computers
Heinz Kredel, H. Kruse, S. Richling
2015 10th International Conference on P2P, Parallel, Grid, Cloud and Internet Computing (3PGCIC), published 2015-11-04
DOI: 10.1109/3PGCIC.2015.49 (https://doi.org/10.1109/3PGCIC.2015.49)
Citations: 3
Abstract
We develop a simple hierarchical model for the performance analysis of compute clusters assembled from multi-core compute nodes connected by a (high-speed) network. Performance is described by the dimensionless speed-up and efficiency as functions of key hardware and application parameters. The hardware parameters are the number of compute nodes and the bandwidth of the network, together with the number of cores per node, the theoretical performance of each core, and the bandwidth of the main memory. The application parameters are the total number of operations performed on a given number of bytes and the total number of bytes communicated between the processing units. To illustrate the concept, we apply it to the scalar product of vectors, matrix multiplication, Linpack, and FFT. Our previous performance models are contained as special cases in the new, more comprehensive approach.
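The abstract names the model's parameters but does not give its equations. As a rough illustration of how such parameters could combine into speed-up and efficiency, the sketch below uses a simple bandwidth-bound estimate. The formula in `run_time` (the slower of compute and memory traffic, plus network transfer) and all class, field, and function names are assumptions made for this example and are not the authors' model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hardware:
    nodes: int            # number of compute nodes
    cores_per_node: int   # cores per node
    core_perf: float      # theoretical performance per core, operations/s
    mem_bandwidth: float  # main-memory bandwidth per node, bytes/s
    net_bandwidth: float  # network bandwidth, bytes/s


@dataclass(frozen=True)
class Application:
    operations: float     # total number of operations
    data_bytes: float     # bytes the operations are performed on
    comm_bytes: float     # bytes communicated between processing units


def run_time(hw: Hardware, app: Application, nodes: int, cores: int) -> float:
    """Estimated run time on `nodes` nodes with `cores` cores each.

    Assumption (not from the paper): compute and memory traffic overlap,
    so the slower of the two dominates; network transfer is added on top
    and only occurs when more than one node is used.
    """
    compute = app.operations / (nodes * cores * hw.core_perf)
    memory = app.data_bytes / (nodes * hw.mem_bandwidth)
    comm = app.comm_bytes / hw.net_bandwidth if nodes > 1 else 0.0
    return max(compute, memory) + comm


def speed_up(hw: Hardware, app: Application) -> float:
    """Dimensionless speed-up relative to one core on one node."""
    t_single = run_time(hw, app, 1, 1)
    t_cluster = run_time(hw, app, hw.nodes, hw.cores_per_node)
    return t_single / t_cluster


def efficiency(hw: Hardware, app: Application) -> float:
    """Speed-up divided by the total number of cores."""
    return speed_up(hw, app) / (hw.nodes * hw.cores_per_node)
```

With these assumptions, a compute-bound application with no communication scales ideally, while adding communicated bytes lowers both speed-up and efficiency. That is the qualitative dependence on network bandwidth the abstract describes.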