STAC-A2 on Intel Architecture: From Scalar Code to Heterogeneous Application
Authors: Evgeny Fiksman, S. Salahuddin
Published in: 2014 Seventh Workshop on High Performance Computational Finance, 2014-11-16
DOI: 10.1109/WHPCF.2014.6
Citations: 3
Abstract
STAC-A2™ is a compute- and memory-intensive industry benchmark in the field of market risk analysis. The benchmark specifications were created by the Securities Technology Analysis Center (STAC®) and are based on input collected from leading trading companies, universities, and high-performance computing vendors. The specifications describe models that represent realistic market risk analysis workloads. In this paper we discuss the development steps that led to competitive performance of the STAC-A2 benchmark on systems consisting of Intel® Xeon® processor(s) and an Intel® Xeon Phi™ coprocessor. We show the importance of using all parallel resources available on Intel architectures to achieve maximum performance. We demonstrate that the offload extension supported by Intel® Composer XE minimizes the effort required to create accelerated applications, using only the C/C++ language. With Intel's latest implementation of the STAC-A2 benchmark we achieved a significant (800%) performance gain by using a heterogeneous approach running on two Intel Xeon E5-2699 v3 processors and a single Intel® Xeon Phi™ 7120A card, compared to an earlier version running on only two Intel Xeon E5-2697 v2 processors. This implementation outperforms NVIDIA's implementation running on an Intel Xeon processor-based server with two NVIDIA* K20Xm cards.