Craig A. Lee, C. DeMatteis, J. Stepanek, John Wang
Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556). Published 2000-05-01. DOI: 10.1109/HCW.2000.843749 (https://doi.org/10.1109/HCW.2000.843749)
Cluster performance and the implications for distributed, heterogeneous grid performance
This paper examines the issues surrounding efficient execution in heterogeneous grid environments. The performance of a Linux cluster and that of a parallel supercomputer are first compared using both benchmarks and an application. With an understanding of how benchmark and application performance are affected by processor and interconnect speed, a comparison is then made with the bandwidths and latencies available in a tested grid. Of significant concern is that the available communication bandwidths and latencies have a dynamic range of 3 to 4 orders of magnitude, while processor speeds have a range of only about one-half order of magnitude. Also, while both processor speed and network bandwidth are increasing very rapidly, simple propagation delay will account for a growing share of the network latency seen by many grid applications. That is to say, the pipes in a grid will be getting fatter but not commensurately shorter. How are we to utilize such an infrastructure effectively? An attractive approach is to require sufficient concurrency in the application so that a coarse-grain, data-driven model of execution can hide latencies while, ideally, keeping context-switching overheads low. If the "spatial component" of an application is understood, then runtime systems could also apply established techniques such as caching, compression, estimation, and speculative pre-fetching. Ideally, this low-level performance management should be encapsulated in an easy-to-use abstraction.
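
The "fatter but not shorter" observation can be illustrated with a minimal sketch in Python. It assumes a simple model in which transfer time is one-way propagation delay plus message size divided by bandwidth. The link length, message size, bandwidth values, and signal speed below are illustrative assumptions, not measurements from the paper, and the function names are hypothetical.

```python
# Illustrative sketch only: the distance, message size, and bandwidth values
# are assumptions chosen for demonstration, not data from the paper.

# Approximate signal speed in optical fiber (about two-thirds of c).
SPEED_IN_FIBER_M_PER_S = 2.0e8


def propagation_delay_s(distance_m: float) -> float:
    """One-way propagation delay over a link of the given length."""
    return distance_m / SPEED_IN_FIBER_M_PER_S


def transfer_time_s(message_bytes: float, bandwidth_bytes_per_s: float,
                    distance_m: float) -> float:
    """Simple model: propagation delay plus time to push the bytes out."""
    return propagation_delay_s(distance_m) + message_bytes / bandwidth_bytes_per_s


def latency_fraction(message_bytes: float, bandwidth_bytes_per_s: float,
                     distance_m: float) -> float:
    """Share of the total transfer time spent on propagation delay."""
    total = transfer_time_s(message_bytes, bandwidth_bytes_per_s, distance_m)
    return propagation_delay_s(distance_m) / total


if __name__ == "__main__":
    message = 1.0e6    # 1 MB message (assumed)
    distance = 4.0e6   # 4,000 km wide-area link (assumed)
    # Bandwidth grows by three orders of magnitude; distance stays fixed.
    for bandwidth in (1.25e6, 1.25e7, 1.25e8, 1.25e9):  # 10 Mb/s to 10 Gb/s
        frac = latency_fraction(message, bandwidth, distance)
        print(f"{bandwidth * 8 / 1e6:>8.0f} Mb/s: "
              f"{frac:6.1%} of transfer time is propagation delay")
```

Under these assumptions, the propagation share of the transfer time rises steadily as bandwidth increases while the link length stays the same, which is why latency-hiding techniques such as added concurrency and pre-fetching become more attractive.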