{"title":"Havens: Explicit reliable memory regions for HPC applications","authors":"Saurabh Hukerikar, C. Engelmann","doi":"10.1109/HPEC.2016.7761593","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761593","url":null,"abstract":"Supporting error resilience in future exascale-class supercomputing systems is a critical challenge. Due to transistor scaling trends and increasing memory density, scientific simulations are expected to experience more interruptions caused by transient errors in the system memory. Existing hardware-based detection and recovery techniques will be inadequate to manage the presence of high memory fault rates. In this paper we propose a partial memory protection scheme based on region-based memory management. We define the concept of regions called havens that provide fault protection for program objects. We provide reliability for the regions through a software-based parity protection mechanism. Our approach enables critical program objects to be placed in these havens. The fault coverage provided by our approach is application agnostic, unlike algorithm-based fault tolerance techniques.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124241537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In-storage embedded accelerator for sparse pattern processing","authors":"S. Jun, H. Nguyen, V. Gadepally, Arvind","doi":"10.1109/HPEC.2016.7761588","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761588","url":null,"abstract":"We present a novel architecture for sparse pattern processing, using flash storage with embedded accelerators. Sparse pattern processing on large data sets is the essence of applications such as document search, natural language processing, bioinformatics, subgraph matching, machine learning, and graph processing. One slice of our prototype accelerator is capable of handling up to 1TB of data, and experiments show that it can outperform C/C++ software solutions on a 16-core system at a fraction of the power and cost; an optimized version of the accelerator can match the performance of a 48-core server.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115301250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rapid prototyping with symbolic computation: Fast development of quantum annealing solutions","authors":"Mark Hodson, D. Fletcher, Dan Padilha, Tristan Cook","doi":"10.1109/HPEC.2016.7761632","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761632","url":null,"abstract":"Quantum computing promises to improve the speed and scalability of computations over that of classical computing hardware. At this early stage of quantum computer hardware development, software frameworks which support rapid prototyping of quantum solutions on small-scale hardware or simulators are necessary to explore the application of quantum algorithms to hard computational problems. We present a software library, “QxLib,” which incorporates symbolic computation of optimization functions for quantum annealers as one means to enable rapid prototyping. We demonstrate its effectiveness on integer linear programming and integer factorization problems.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"165 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127539091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data transformation and migration in polystores","authors":"Adam Dziedzic, Aaron J. Elmore, M. Stonebraker","doi":"10.1109/HPEC.2016.7761594","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761594","url":null,"abstract":"Ever-increasing data size and new requirements in data processing have fostered the development of many new database systems. The result is that many data-intensive applications are underpinned by different engines. To enable data mobility there is a need to transfer data between systems easily and efficiently. We analyze the state of the art in data migration and outline research opportunities for rapid data transfer. Our experiments explore data migration between a diverse set of databases, including PostgreSQL, SciDB, S-Store and Accumulo. Each of the systems excels at specific application requirements, such as transactional processing, numerical computation, streaming data, and large-scale text processing. Providing an efficient data migration tool is essential to take advantage of superior processing from these specialized databases. Our goal is to build a data migration framework that will take advantage of recent advances in hardware and software.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122075142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A scale-free structure for power-law graphs","authors":"R. Veras, Tze Meng Low, F. Franchetti","doi":"10.1109/HPEC.2016.7761608","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761608","url":null,"abstract":"Many real-world graphs, such as those that arise from the web, biology and transportation, appear random and without a structure that can be exploited for performance on modern computer architectures. However, these graphs have a scale-free graph topology that can be leveraged for locality. Existing sparse data formats are not designed to take advantage of this structure. They focus primarily on reducing storage requirements and improving the cost of certain matrix operations for these large data sets. Therefore, we propose a data structure for storing real-world scale-free graphs in a sparse and hierarchical fashion. By maintaining the structure of the graph, we preserve locality in the graph and in the cache. For synthetic scale-free graph data we outperform the state of the art for graphs with up to 10^7 non-zero edges.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"110 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120885190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing simulation speed of FPGA model-based synthesis","authors":"Jeffrey Caldwell, B. Marr, David Bloom, D. Thompson","doi":"10.1109/HPEC.2016.7761627","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761627","url":null,"abstract":"FPGA capacity is quickly outpacing designer's productivity and limiting the ability to exploit FPGA processing resources. Model-based synthesis, where a high level behavioral model is used for fast design iteration, which is then synthesizable directly into FPGA object code has been proposed as a solution. Several orders of magnitude difference in simulation speed have been observed between different variants of behavioral and model-based design tools and thus understanding and optimizing the trade between simulation speed and abstraction is critical. A dynamic level of abstraction of the model is also examined to study trades between abstraction, simulation speed and accuracy. Mathworks' HDL Coder tool, hand optimized behavioral VHDL, and vendor optimized Xilinx's System Generator are compared. Results are shown of directly synthesizable models with up to 894X simulation speedups compared to hand coded HDL simulations and 4356X speedups compared to other model-based synthesis tools.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134624452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A sparse multi-dimensional Fast Fourier Transform with stability to noise in the context of image processing and change detection","authors":"Pierre-David Létourneau, M. H. Langston, R. Lethin","doi":"10.1109/HPEC.2016.7761579","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761579","url":null,"abstract":"We present the sparse multidimensional FFT (sMFFT) for positive real vectors with application to image processing. Our algorithm works in any fixed dimension, requires an (almost)-optimal number of samples (O(R log(N/R))) and runs in O(R log(N/R)) complexity (to first order) for N unknowns and R nonzeros. It is stable to noise and exhibits an exponentially small probability of failure. Numerical results show sMFFT's large quantitative and qualitative strengths as compared to ℓ1-minimization for Compressive Sensing as well as advantages in the context of image processing and change detection.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131304416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Abstractions considered helpful: A tools architecture for quantum annealers","authors":"Michael Booth, E. Dahl, M. Furtney, Steven P. Reinhardt","doi":"10.1109/HPEC.2016.7761625","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761625","url":null,"abstract":"Today's usable quantum computers, variously known as adiabatic quantum computers or quantum annealers and exemplified by the D-Wave 2X™ system, have an instruction set architecture foreign to mainstream classical computers and thus require a new class of programming tools to enable their widespread use. We submit that well-chosen abstractions, each balancing the ability of high- and low-level tools to use it, will play an essential role in fostering a vibrant ecosystem of such new tools. We propose the virtual quadratic unconstrained binary optimization (vQUBO) problem as one such abstraction and describe our experience in implementing and using it. As one step toward an effective quantum computing ecosystem, we invite other tool developers to create complementary tools that map from user problems to the vQUBO form for end-to-end usability and performance.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124486358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing heterogeneous computing architectures for ADAS and Mobile Imaging applications","authors":"Rafal Malewski, Markus Levy, P. Torelli","doi":"10.1109/HPEC.2016.7761611","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761611","url":null,"abstract":"This document describes a benchmark suite that utilizes real-world workloads from ADAS and Mobile Imaging, to stress various forms of compute resources on embedded heterogeneous architectures, for determination of optimal distribution of compute load across accelerators.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129377982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating real-time and batch processing in a polystore","authors":"John Meehan, S. Zdonik, Shaobo Tian, Yulong Tian, Nesime Tatbul, Adam Dziedzic, Aaron J. Elmore","doi":"10.1109/HPEC.2016.7761585","DOIUrl":"https://doi.org/10.1109/HPEC.2016.7761585","url":null,"abstract":"This paper describes a stream processing engine called S-Store and its role in the BigDAWG polystore. Fundamentally, S-Store acts as a frontend processor that accepts input from multiple sources, and massages it into a form that has eliminated errors (data cleaning) and translates that input into a form that can be efficiently ingested into BigDAWG. S-Store also acts as an intelligent router that sends input tuples to the appropriate components of BigDAWG. All updates to S-Store's shared memory are done in a transactionally consistent (ACID) way, thereby eliminating new errors caused by non-synchronized reads and writes. The ability to migrate data from component to component of BigDAWG is crucial. We have described a migrator from S-Store to Postgres that we have implemented as a first proof of concept. We report some interesting results using this migrator that impact the evaluation of query plans.","PeriodicalId":308129,"journal":{"name":"2016 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128959983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}