Performance Evaluation of OpenMP Constructs and Kernel Benchmarks on a Loongson-3A Quad-Core SMP System
Authors: Qiuming Luo, Chang Kong, Ye Cai, Gang Liu
Published in: 2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies (2011-10-20)
DOI: 10.1109/PDCAT.2011.66 (https://doi.org/10.1109/PDCAT.2011.66)
Loongson is a family of general-purpose, MIPS-compatible CPUs developed at the Institute of Computing Technology (ICT) of the Chinese Academy of Sciences (CAS) as a competitor and alternative to mainstream general-purpose CPUs (Intel, AMD, etc.). This paper evaluates the quad-core Loongson 3A. The performance of the basic OpenMP constructs on a Loongson-3A quad-core SMP system is measured with the EPCC microbenchmarks, and the performance of the NAS kernel codes is measured with the NAS Parallel Benchmarks (NPB). The benchmarks are run with three different OpenMP compilers (and their runtime systems): GCC, OMPi-pth (OMPi with the pthread library), and OMPi-psth (OMPi with the psthread library). The results show that OMPi-pth performs best and OMPi-psth performs worst. These results may help both in writing OpenMP code and in selecting an appropriate compiler and runtime system. An Intel Core i5 quad-core platform running NPB is used for comparison; the results imply that the Loongson 3A delivers nearly one tenth of the i5's performance. The NPB results can help determine the scale of a Loongson system that is to replace an Intel i5 system for a given problem size.
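The sizing use of the NPB results mentioned at the end of the abstract can be illustrated with a back-of-envelope calculation. The sketch below is not taken from the paper: the function name `loongson_nodes_needed`, its parameters, and the run times are hypothetical, and it assumes perfectly linear scaling across nodes, which real NPB workloads rarely achieve. The result is therefore best read as an optimistic lower bound.

```python
import math


def loongson_nodes_needed(i5_nodes, t_loongson, t_i5):
    """Estimate how many Loongson 3A nodes match `i5_nodes` Intel i5 nodes.

    t_loongson and t_i5 are run times (same unit) of one NPB kernel at the
    same problem class on a single node of each type.
    Assumption (not from the paper): throughput scales linearly with nodes.
    """
    if i5_nodes <= 0 or t_loongson <= 0 or t_i5 <= 0:
        raise ValueError("node count and run times must be positive")
    # How many times slower a single Loongson node is than a single i5 node.
    slowdown = t_loongson / t_i5
    # Round up: a fractional node cannot be deployed.
    return math.ceil(i5_nodes * slowdown)


# Hypothetical run times in seconds, chosen to mirror the "nearly one tenth"
# ratio from the abstract; these are placeholders, not measured values.
example = loongson_nodes_needed(i5_nodes=4, t_loongson=100.0, t_i5=10.0)
print(example)
```

Because the performance ratio can differ from one NPB kernel to another, such an estimate would be computed per kernel, using the one closest to the target workload.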