Takashi Soga, A. Musa, Y. Shimomura, Ryusuke Egawa, K. Itakura, H. Takizawa, Koki Okabe, Hiroaki Kobayashi
{"title":"Performance evaluation of NEC SX-9 using real science and engineering applications","authors":"Takashi Soga, A. Musa, Y. Shimomura, Ryusuke Egawa, K. Itakura, H. Takizawa, Koki Okabe, Hiroaki Kobayashi","doi":"10.1145/1654059.1654088","DOIUrl":null,"url":null,"abstract":"This paper describes a new-generation vector parallel supercomputer, NEC SX-9 system. The SX-9 processor has an outstanding core to achieve over 100Gflop/s, and a software-controllable on-chip cache to keep the high ratio of the memory bandwidth to the floating-point operation rate. Moreover, its large SMP nodes of 16 vector processors with 1.6Tflop/s performance and 1TB memory are connected with dedicated network switches, which can achieve inter-node communication at 128GB/s per direction. The sustained performance of the SX-9 processor is evaluated using six practical applications in comparison with conventional vector processors and the latest scalar processor such as Nehalem-EP. Based on the results, this paper discusses the performance tuning strategies for new-generation vector systems. An SX-9 system of 16 nodes is also evaluated by using the HPC challenge benchmark suite and a CFD code. Those evaluation results clarify the highest sustained performance and scalability of the SX-9 system.","PeriodicalId":371415,"journal":{"name":"Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"43","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1654059.1654088","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 43
Abstract
This paper describes a new-generation vector parallel supercomputer, NEC SX-9 system. The SX-9 processor has an outstanding core to achieve over 100Gflop/s, and a software-controllable on-chip cache to keep the high ratio of the memory bandwidth to the floating-point operation rate. Moreover, its large SMP nodes of 16 vector processors with 1.6Tflop/s performance and 1TB memory are connected with dedicated network switches, which can achieve inter-node communication at 128GB/s per direction. The sustained performance of the SX-9 processor is evaluated using six practical applications in comparison with conventional vector processors and the latest scalar processor such as Nehalem-EP. Based on the results, this paper discusses the performance tuning strategies for new-generation vector systems. An SX-9 system of 16 nodes is also evaluated by using the HPC challenge benchmark suite and a CFD code. Those evaluation results clarify the highest sustained performance and scalability of the SX-9 system.