High Performance Computational Finance: Latest Articles
Intel® version of STAC-A2 benchmark: toward better performance with less effort
Andrey Nikolaev, Ilya Burylov, S. Salahuddin
High Performance Computational Finance | Pub Date: 2013-11-18 | DOI: 10.1145/2535557.2535566

Abstract: Market risk analysis is a computationally intensive problem which requires powerful computing resources. To enable consistent comparisons of vendors' technologies in this area, the Securities Technology Analysis Center (STAC*), with input from leading trading companies, universities, and high-performance computing vendors, has created the STAC-A2* specifications, which describe realistic market risk analysis workloads.

In this paper we analyze and compare the performance of STAC-A2 workloads on two systems based on Intel® processors: the Intel® Xeon® processor E5 family and the Intel® Xeon Phi™ coprocessor. We show the impact of algorithmic optimizations and of a few mathematical building blocks (random number generation, mathematical functions, and matrix multiplication) on the overall performance of the benchmark. We demonstrate that the changes made in response to this analysis provide an additional ~1.6x performance improvement of the STAC-A2 benchmark on the Intel Xeon processor E5 family, and up to ~15x on Intel Xeon Phi coprocessor-based systems, compared with the previous version of the benchmark. The Intel Xeon Phi coprocessor architecture is ~1.10-1.38x faster than 16-core Intel Xeon processor E5 family-based systems, depending on the problem size, while the 32-core Intel Xeon processor E5 is the fastest among all analyzed platforms.

Citations: 7
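The building-block effect the authors describe (random number generation and vectorized math functions dominating runtime) can be illustrated with a minimal NumPy sketch. This shows only the class of optimization, not the paper's MKL/Xeon Phi implementation; the function names are invented for illustration:

```python
import numpy as np

def mc_asset_prices_scalar(s0, r, sigma, t, n, seed=0):
    """Naive per-sample loop: one RNG draw and one exp() call at a time."""
    rng = np.random.default_rng(seed)
    out = np.empty(n)
    for i in range(n):
        z = rng.standard_normal()
        out[i] = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)
    return out

def mc_asset_prices_batched(s0, r, sigma, t, n, seed=0):
    """Same computation with batched RNG and one vectorized exp() call,
    the kind of building-block optimization the paper reports as decisive."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    return s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)
```

The batched version does the same work in a handful of library calls, which is where an optimized vector math library (or coprocessor offload) can take over.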
Optimizing IBM Algorithmics' mark-to-future aggregation engine for real-time counterparty credit risk scoring
Amy Wang, Jan Treibig, Bob Blainey, Peng Wu, Yaoqing Gao, Barnaby Dalton, D. Gupta, Fahham Khan, Neil Bartlett, Lior Velichover, James Sedgwick, Louis Ly
High Performance Computational Finance | Pub Date: 2013-11-18 | DOI: 10.1145/2535557.2535567

Abstract: The concept of default and its associated painful repercussions have been a particular area of focus for financial institutions, especially after the 2007/2008 global financial crisis. Counterparty credit risk (CCR), i.e. the risk associated with a counterparty default prior to the expiration of a contract, has gained a tremendous amount of attention, which has resulted in new CCR measures and regulations being introduced. In particular, users would like to measure the potential impact of each real-time trade, or potential real-time trade, against exposure limits for the counterparty using Monte Carlo simulations of the trade value, and also to calculate the Credit Value Adjustment (i.e., how much it will cost to cover the risk of default with this particular counterparty if/when the trade is made). These rapid limit checks and CVA calculations demand more compute power from the hardware. Furthermore, with the emergence of electronic trading, the extremely low latency and high throughput required for real-time computation push both software and hardware capabilities to the limit. Our work focuses on optimizing the computation of risk measures and trade processing in the existing Mark-to-Future Aggregation (MAG) engine in the IBM Algorithmics product offering. We propose a new software approach to speed up end-to-end trade processing based on pre-compilation. The net result is an impressive speedup of 3-5x over the existing MAG engine on a real client workload, for processing trades that perform limit checks and CVA reporting on exposures while taking full collateral modelling into account.

Citations: 2
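The CVA quantity this engine accelerates can be sketched in its simplest Monte Carlo form. The code below is a hedged illustration, not the MAG engine's method: unilateral CVA for a single long forward contract under geometric Brownian motion, with a flat hazard rate and no collateral modelling. All names and parameters are hypothetical:

```python
import numpy as np

def cva_forward(S0, K, r, sigma, T, hazard, lgd=0.6,
                n_paths=100_000, n_steps=50, seed=0):
    """Unilateral CVA of a long forward:
    CVA = LGD * sum_i EE(t_i) * PD(t_{i-1}, t_i),
    with EE the discounted expected positive exposure from Monte Carlo."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(dt, T, n_steps)
    # GBM paths of the underlying
    z = rng.standard_normal((n_steps, n_paths))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * z, axis=0))
    # Mark-to-market of the forward at each date: S_t - K * e^{-r(T-t)}
    V = S - K * np.exp(-r * (T - t))[:, None]
    # Discounted expected positive exposure per date
    EE = np.exp(-r * t) * np.maximum(V, 0.0).mean(axis=1)
    # Incremental default probabilities from a flat hazard rate
    surv = np.exp(-hazard * np.concatenate(([0.0], t)))
    pd_incr = surv[:-1] - surv[1:]
    return lgd * float(np.sum(EE * pd_incr))
```

A real-time limit check against this quantity is what forces the pre-compiled, low-latency design the paper describes.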
Pricing American options with least squares Monte Carlo on GPUs
M. Fatica, E. Phillips
High Performance Computational Finance | Pub Date: 2013-11-18 | DOI: 10.1145/2535557.2535564

Abstract: This paper presents an implementation of the Least Squares Monte Carlo (LSMC) method of Longstaff and Schwartz [1] to price American options on GPUs using CUDA. We focused our attention on the calibration phase and performed several experiments to assess the quality of the results. The implementation can price a put option with 200,000 paths and 50 time steps in less than 10 ms on a Tesla K20X.

Citations: 26
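The Longstaff-Schwartz algorithm the paper implements can be sketched on the CPU in a few lines. The following is a minimal NumPy version with a quadratic regression basis, a common simplification; it is not the paper's CUDA code, and the default parameters (the classic S0=36, K=40 put) are chosen for illustration:

```python
import numpy as np

def lsmc_american_put(S0=36.0, K=40.0, r=0.06, sigma=0.2, T=1.0,
                      n_paths=200_000, n_steps=50, seed=0):
    """Least Squares Monte Carlo (Longstaff-Schwartz) for an American put."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    disc = np.exp(-r * dt)
    # Simulate GBM paths; S[t] holds all paths at time (t+1)*dt
    z = rng.standard_normal((n_steps, n_paths))
    log_paths = np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=0)
    S = S0 * np.exp(log_paths)
    # Cashflows at maturity
    cash = np.maximum(K - S[-1], 0.0)
    # Backward induction: regress continuation value on in-the-money paths
    for t in range(n_steps - 2, -1, -1):
        cash *= disc
        itm = K - S[t] > 0
        if not itm.any():
            continue
        x = S[t, itm]
        coeffs = np.polyfit(x, cash[itm], 2)      # quadratic basis
        continuation = np.polyval(coeffs, x)
        exercise = K - x
        ex_now = exercise > continuation
        idx = np.where(itm)[0][ex_now]
        cash[idx] = exercise[ex_now]              # exercise replaces future cashflow
    return disc * cash.mean()
```

The regression step (here `np.polyfit`) is the "calibration phase" the authors move to the GPU; with these parameters the price is close to the reference value of about 4.48.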
System architecture for on-line optimization of automated trading strategies
Fábio Daros Freitas, C. D. Freitas, A. D. Souza
High Performance Computational Finance | Pub Date: 2013-11-18 | DOI: 10.1145/2535557.2535563

Abstract: This work proposes a new automated trading system (ATS) architecture that supports multiple strategies for multiple market conditions through hierarchical trading-signal generation employing h-signals, which are trading signals generated from other trading signals. The central idea of the proposed system architecture is to decompose the trading problem into a set of tasks handled by distributed autonomous agents under minimal central coordination. We implemented the proposed ATS using a software architecture that employs a publish/subscribe communication model. At the current stage of development, we are able to run our ATS in back-test mode with moving-average crossover strategies on minute-by-minute market databases. We achieved very satisfactory performance, processing 306,791 database rows, representing more than two years of data, in only 47 seconds.

Citations: 2
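The h-signal idea (signals computed from other signals) can be illustrated with a toy sketch. The function names and the agreement rule below are hypothetical, invented for illustration; the paper's actual strategy set and agent decomposition are richer:

```python
import numpy as np

def ma_crossover_signal(prices, fast=10, slow=30):
    """Base signal: +1 while the fast moving average is above the slow one, else -1."""
    p = np.asarray(prices, dtype=float)

    def sma(x, w):
        return np.convolve(x, np.ones(w) / w, mode="valid")

    f = sma(p, fast)[slow - fast:]   # trim so both series align on the same dates
    s = sma(p, slow)
    return np.where(f > s, 1, -1)

def h_signal(sig_a, sig_b):
    """Hierarchical signal built from two child signals: trade only on agreement,
    stay flat (0) otherwise."""
    return np.where(sig_a == sig_b, sig_a, 0)
```

In the paper's architecture, each such generator would be an autonomous agent publishing its signal stream, and h-signal agents would subscribe to the child streams.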
Many-core architectures boost the pricing of basket options on adaptive sparse grids
A. Heinecke, J. Jepsen, H. Bungartz
High Performance Computational Finance | Pub Date: 2013-11-18 | DOI: 10.1145/2535557.2535560

Abstract: In this work, we present a highly scalable approach for numerically solving the Black-Scholes PDE in order to price basket options. Our method is based on a spatially adaptive sparse-grid discretization with finite elements. Since we cannot unleash the compute capabilities of modern many-core chips such as GPUs using the complexity-optimal Up-Down method, we implemented an embarrassingly parallel direct method. This operator is paired with a distributed-memory parallelization using MPI, and we achieved very good scalability results compared to the standard Up-Down approach. Since we exploit all levels of the operator's parallelism, we are able to achieve nearly perfect strong scaling for the Black-Scholes solver. Our results show that typical problem sizes (5-dimensional basket options) require at least 4 NVIDIA K20X Kepler GPUs (inside a Cray XK7) in order to be faster than the Up-Down scheme running on 16 Intel Sandy Bridge cores (one box). On a Cray XK7 machine we outperform our highly parallel Up-Down implementation by 55x with respect to time to solution. Both results emphasize the competitiveness of our proposed operator.

Citations: 3
Heterogeneous COS pricing of rainbow options
A. Cassagnes, Yu Chen, H. Ohashi
High Performance Computational Finance | Pub Date: 2013-11-18 | DOI: 10.1145/2535557.2535561

Abstract: This paper focuses on comparing different heterogeneous computational designs for the calculation of rainbow option prices using the Fourier-cosine series expansion (COS) method. We also propose a simple way to decide the load-balancing ratio automatically at runtime. A GPGPU implementation of the two-dimensional composite Simpson rule, free of conditional statements and with some degree of loop unrolling, is also introduced. We further show how to reduce the integration domain of the coefficients appearing in the option pricing and, by doing so, achieve a substantial speed-up and improved accuracy compared with a straightforward implementation.

Citations: 0
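For reference, the COS method in one dimension: a sketch pricing a European call under geometric Brownian motion, following the Fang-Oosterlee expansion. The paper's setting is two-dimensional rainbow payoffs with Simpson integration of the series coefficients; this shows only the core expansion, with illustrative parameters:

```python
import numpy as np

def cos_call(S0, K, r, sigma, T, N=256, L=10.0):
    """European call via the Fourier-cosine (COS) expansion under GBM."""
    # Truncation range [a, b] from the cumulants of x = ln(S_T / K)
    c1 = np.log(S0 / K) + (r - 0.5 * sigma**2) * T
    c2 = sigma**2 * T
    a, b = c1 - L * np.sqrt(c2), c1 + L * np.sqrt(c2)

    k = np.arange(N)
    u = k * np.pi / (b - a)

    # Characteristic function of x under GBM
    phi = np.exp(1j * u * c1 - 0.5 * u**2 * c2)

    # Cosine coefficients of the call payoff over [0, b]
    chi = (np.cos(k * np.pi) * np.exp(b) - np.cos(-u * a)
           + u * (np.sin(k * np.pi) * np.exp(b) - np.sin(-u * a))) / (1.0 + u**2)
    psi = np.empty(N)
    psi[0] = b
    psi[1:] = (np.sin(k[1:] * np.pi) - np.sin(-u[1:] * a)) / u[1:]
    V = 2.0 / (b - a) * K * (chi - psi)

    w = np.ones(N)
    w[0] = 0.5                       # first series term is halved
    return np.exp(-r * T) * np.sum(w * np.real(phi * np.exp(-1j * u * a)) * V)
```

With a smooth density, a few hundred terms already reproduce the Black-Scholes price to high accuracy, which is why the method maps well to the GPU designs the paper compares.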
Accounting for secondary uncertainty: efficient computation of portfolio risk measures on multi and many core architectures
B. Varghese, A. Rau-Chaplin
High Performance Computational Finance | Pub Date: 2013-10-08 | DOI: 10.1145/2535557.2535562

Abstract: Aggregate risk analysis is a computationally and data-intensive problem, which makes the application of high-performance computing techniques interesting. In this paper, the design and implementation of a parallel aggregate risk analysis algorithm on multi-core CPU and many-core GPU platforms are explored. We consider the efficient computation of key risk measures, including Probable Maximum Loss (PML) and Tail Value-at-Risk (TVaR), in the presence of both primary and secondary uncertainty for a portfolio of property catastrophe insurance treaties. Primary uncertainty is the uncertainty associated with whether a catastrophe event occurs in a simulated year, while secondary uncertainty is the uncertainty in the amount of loss when the event occurs.

A number of statistical algorithms are investigated for computing secondary uncertainty. Numerous challenges, such as loading large data onto hardware with limited memory and organising it, are addressed. The results obtained from experimental studies are encouraging. Consider, for example, an aggregate risk analysis involving 800,000 trials, with 1,000 catastrophic events per trial, a million locations, and a complex contract structure taking secondary uncertainty into account: the analysis can be performed in just 41 seconds on a GPU, 24x faster than the sequential counterpart on a fast multi-core CPU. The results indicate that GPUs can be used to efficiently accelerate aggregate risk analysis even in the presence of secondary uncertainty.

Citations: 3
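The two risk measures and the two layers of uncertainty can be sketched with a toy year loss table. The simulation below is illustrative only (a lognormal severity stands in for the paper's secondary-uncertainty distributions, and all names and parameters are hypothetical):

```python
import numpy as np

def simulate_annual_losses(n_trials, lam, mean_loss, sigma=1.0, seed=0):
    """Toy year loss table: primary uncertainty modelled as a Poisson event
    count per simulated year, secondary uncertainty as a lognormal loss
    amount given that an event occurs."""
    rng = np.random.default_rng(seed)
    mu = np.log(mean_loss) - 0.5 * sigma**2   # each event loss has mean `mean_loss`
    counts = rng.poisson(lam, n_trials)
    return np.array([rng.lognormal(mu, sigma, c).sum() for c in counts])

def pml_tvar(losses, p=0.99):
    """PML = p-quantile of the annual loss distribution;
    TVaR = mean of the losses at or beyond that quantile."""
    losses = np.sort(np.asarray(losses, dtype=float))
    pml = np.quantile(losses, p)
    tvar = losses[losses >= pml].mean()
    return pml, tvar
```

Scaling this from thousands of trials to the paper's 800,000 trials with 1,000 events each is what motivates the GPU implementation.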
DSL programmable engine for high frequency trading acceleration
Heiner Litz, Christian Leber, Benjamin Geib
High Performance Computational Finance | Pub Date: 2011-11-13 | DOI: 10.1145/2088256.2088268

Abstract: In high-frequency trading systems, a large number of orders must be processed with minimal latency at very high data rates. We propose an FPGA-based accelerator for high-frequency trading that decreases latency by an order of magnitude and increases the data rate by a similar factor compared with software-based CPU approaches. In particular, we focus on the acceleration of FAST, the most commonly used protocol for distributing pricing information for stocks and options over the network. As FPGAs are hard to program, we present a novel domain-specific language that enables our engine to be programmed via software. The code is compiled by our own compiler into binary microcode that is then executed on a microcode engine. In this paper we provide detailed insights into our hardware structure and the optimizations we applied to increase the data rate and the overall processing performance.

Citations: 4
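FAST's transfer encoding packs integers into 7-bit groups and uses the high bit of each byte as a stop bit, which is the kind of field-level decoding the accelerator performs in hardware. A minimal software sketch of that primitive follows; the full protocol also involves templates, presence maps, and field operators, none of which are shown here:

```python
def encode_stop_bit_uint(value):
    """Encode an unsigned integer in FAST stop-bit format: 7 payload bits
    per byte, with the high bit set only on the final byte."""
    out = [(value & 0x7F) | 0x80]        # last byte carries the stop bit
    value >>= 7
    while value:
        out.append(value & 0x7F)
        value >>= 7
    return bytes(reversed(out))

def decode_stop_bit_uint(data, offset=0):
    """Decode one stop-bit integer from `data` starting at `offset`;
    returns (value, next_offset)."""
    value = 0
    while True:
        byte = data[offset]
        offset += 1
        value = (value << 7) | (byte & 0x7F)
        if byte & 0x80:                   # stop bit reached: field is complete
            return value, offset
```

Because field boundaries are data-dependent (the stop bit), decoding is inherently sequential per field, which is one reason a dedicated hardware pipeline pays off.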
Algorithmic complexity in the Heston model: an implementation view
H. Marxen, A. Kostiuk, R. Korn, C. D. Schryver, S. Wurm, I. Shcherbakov, N. Wehn
High Performance Computational Finance | Pub Date: 2011-11-13 | DOI: 10.1145/2088256.2088261

Abstract: In this paper, we present an in-depth investigation of the influence of algorithmic parameters on barrier option pricing with the Heston model. For that purpose we focus on single- and multi-level Monte Carlo simulation methods. We investigate the impact of algorithmic variations on simulation time and energy consumption, giving detailed measurement results for a state-of-the-art 8-core CPU server and an Nvidia Tesla C2050 GPU. We show in particular that a naive algorithm on a powerful GPU can even increase the energy consumption and computation time compared with a better algorithm running on a standard CPU. Furthermore, we give preliminary results for a dedicated FPGA implementation and comment on the speedup and energy-saving potential of this architecture.

Citations: 3
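The core of the workload (path simulation of the Heston model with discrete barrier monitoring) can be sketched with an Euler scheme using full truncation of the variance, one of the standard discretizations such studies compare. This is an illustrative single-level sketch with hypothetical parameters, not the paper's implementation:

```python
import numpy as np

def heston_barrier_call(S0, v0, K, B, r, kappa, theta, xi, rho, T,
                        n_steps=250, n_paths=100_000, seed=0):
    """Down-and-out call under Heston: Euler scheme with full truncation,
    i.e. the variance is floored at 0 inside drift and diffusion terms."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    S = np.full(n_paths, float(S0))
    v = np.full(n_paths, float(v0))
    alive = np.ones(n_paths, dtype=bool)        # barrier not yet breached
    for _ in range(n_steps):
        z1 = rng.standard_normal(n_paths)
        z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
        vp = np.maximum(v, 0.0)                 # full truncation
        S *= np.exp((r - 0.5 * vp) * dt + np.sqrt(vp * dt) * z1)
        v += kappa * (theta - vp) * dt + xi * np.sqrt(vp * dt) * z2
        alive &= S > B                          # discrete barrier monitoring
    payoff = np.where(alive, np.maximum(S - K, 0.0), 0.0)
    return np.exp(-r * T) * payoff.mean()
```

The per-step RNG, square roots, and exponentials are exactly the operations whose cost profile differs between CPU, GPU, and FPGA, which is what the paper's time and energy measurements quantify.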
Autotuning for high performance computing
D. Padua
High Performance Computational Finance | Pub Date: 2011-11-13 | DOI: 10.1145/2088256.2088264

Abstract: Program performance depends not only on the algorithms and data structures implemented in the program but also on coding parameters. These parameters include the frequency and size of messages, the shape of loop tiles, and the minimum number of iterations required for parallel execution of a loop. Making the right selection of algorithms, data structures, and coding parameters for a given target machine can be an onerous task, in part because of the many machine parameters that must be taken into account and the interactions between them. Important machine parameters include cache size, memory bandwidth, communication costs, and overhead. Furthermore, some of the selections must often be reassessed when porting to a different machine, even when that machine does not differ significantly from the original target.

It is clearly advantageous to make use of tools and techniques that help reduce the initial effort of programming for performance as well as the cost of porting. The tool that comes first to mind is the compiler. Compilers were developed to enable machine-independent programming and, to this end, apply powerful code generation and optimization strategies that take machine parameters into account. However, compilers do not always suffice. They operate almost exclusively at the coding level, and even at this low level they are not always effective. For example, compilers often fail to reorganize loops in the best manner when generating code for microprocessor vector extensions; good use of these extensions today requires manual intervention.

Autotuning programs are those capable of generating one or several versions of a program. These versions can be derived from a parameterized program, or from descriptions at a higher level of abstraction that may take the form of algorithms or even problem specifications. It is also desirable to take target machine parameters and the characteristics of the input data into account in the generation process.

When multiple versions are generated, one is selected at compile time or at run time by carrying out an empirical search that executes the versions on representative data and measures program performance to guide the selection.

Autotuning programs can be written in conventional code such as Fortran, C, C++, or Java, annotated with transformations that can be applied to the whole program or to code segments. Alternatively, autotuning programs can be written in a very high-level declarative notation that represents the algorithms or problems to be solved.

Although the initial cost of developing an autotuning program is higher than that of developing a conventional program, it has the advantage that much of the analysis required for the first target machine is done automatically, and that the program can be ported across machines and machine classes while maintaining good performance.

Citations: 1
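The empirical-search step described in the abstract can be sketched in a few lines. `autotune` and the toy kernel generator `make_chunked_sum` below are invented for illustration; real autotuners such as those the talk surveys search far larger parameter spaces:

```python
import time

def autotune(make_kernel, candidates, data, repeats=3):
    """Empirical autotuning: build each candidate version, time it on
    representative data, and keep the fastest configuration."""
    best_params, best_time = None, float("inf")
    for params in candidates:
        kernel = make_kernel(params)
        kernel(data)                          # warm-up run, excluded from timing
        times = []
        for _ in range(repeats):
            start = time.perf_counter()
            kernel(data)
            times.append(time.perf_counter() - start)
        if min(times) < best_time:
            best_params, best_time = params, min(times)
    return best_params, best_time

def make_chunked_sum(chunk):
    """Toy 'coding parameter': sum a list in blocks of the given size."""
    def kernel(xs):
        return sum(sum(xs[i:i + chunk]) for i in range(0, len(xs), chunk))
    return kernel
```

The same selection loop applies whether the versions differ in a tile size, a message size, or an entire algorithm, and it can run at install time or be repeated when porting to a new machine.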