STAC-A2 on Intel Architecture: From Scalar Code to Heterogeneous Application

2014 Seventh Workshop on High Performance Computational Finance. Pub Date: 2014-11-16. DOI: 10.1109/WHPCF.2014.6

Evgeny Fiksman, S. Salahuddin


Citations: 3

Abstract

STAC-A2™ is a compute- and memory-intensive industry benchmark in the field of market risk analysis. The benchmark specifications were created by the Securities Technology Analysis Center (STAC®) and are based on input collected from leading trading firms, universities, and high-performance computing vendors. The specifications describe models that represent realistic market risk analysis workloads. In this paper we discuss the development steps that led to competitive performance of the STAC-A2 benchmark on systems consisting of one or more Intel® Xeon® processors and an Intel® Xeon Phi™ coprocessor. We show that using all of the parallel resources available on Intel architectures is essential to achieving maximum performance. We demonstrate that the offload extension supported by Intel® Composer XE minimizes the effort required to create accelerated applications, because it requires only the C/C++ language. With Intel's latest implementation of the STAC-A2 benchmark we achieved a significant (800%) performance gain by using a heterogeneous approach running on two Intel Xeon E5-2699 v3 processors and a single Intel® Xeon Phi™ 7120A card, compared to an earlier version running on only two Intel Xeon E5-2697 v2 processors. This implementation outperforms NVIDIA's implementation, which runs on an Intel Xeon processor-based server with two NVIDIA* K20Xm cards.
