Title: Distributed Hardware Accelerated Secure Joint Computation on the COPA Framework
Authors: Rushi Patel, Pouya Haghi, Shweta Jain, A. Kot, V. Krishnan, Mayank Varia, Martin C. Herbordt
Venue: 2022 IEEE High Performance Extreme Computing Conference (HPEC)
DOI: 10.1109/HPEC55821.2022.9926388

Abstract: Performance of distributed data center applications can be improved through the use of FPGA-based SmartNICs, which provide additional functionality while enabling higher-bandwidth, lower-latency communication. Until recently, however, the lack of a simple approach for customizing SmartNICs to application requirements has limited the potential benefits. Intel's Configurable Network Protocol Accelerator (COPA) provides a customizable FPGA framework that integrates both hardware and software development to improve computation and communication performance. In this first case study, we demonstrate the capabilities of the COPA framework with an application from cryptography, secure Multi-Party Computation (MPC), that utilizes hardware accelerators connected directly to host memory and the COPA network. We find that the COPA framework yields significant improvements in both computation and communication compared to traditional implementations of MPC that use CPUs and NICs. A single MPC accelerator running on COPA sustains more than 17 Gb/s of communication bandwidth while using only 3% of Stratix 10 resources. We show that the COPA framework enables multiple MPC accelerators running in parallel to fully saturate a 100 Gb/s link, delivering higher performance than traditional NICs.
Title: GraphBLAS on the Edge: Anonymized High Performance Streaming of Network Traffic
Authors: Michael Jones, J. Kepner, Daniel Andersen, A. Buluç, C. Byun, K. Claffy, Tim Davis, W. Arcand, Jonathan Bernays, David Bestor, William Bergeron, V. Gadepally, Micheal Houle, M. Hubbell, Hayden Jananthan, Anna Klein, C. Meiners, Lauren Milechin, J. Mullen, Sandeep Pisharody, Andrew Prout, A. Reuther, Antonio Rosa, S. Samsi, Jon Sreekanth, Douglas Stetson, Charles Yee, P. Michaleas
Venue: 2022 IEEE High Performance Extreme Computing Conference (HPEC)
DOI: 10.1109/HPEC55821.2022.9926332

Abstract: Long range detection is a cornerstone of defense in many operating domains (land, sea, undersea, air, space, ...). In the cyber domain, long range detection requires the analysis of significant network traffic from a variety of observatories and outposts. Constructing anonymized hypersparse traffic matrices on edge network devices can be a key enabler, providing significant data compression in a rapidly analyzable format that protects privacy. GraphBLAS is ideally suited for both constructing and analyzing anonymized hypersparse traffic matrices. The performance of GraphBLAS on an Accolade Technologies edge network device is demonstrated on a near worst-case traffic scenario using a continuous stream of CAIDA Telescope darknet packets. Performance is explored for varying numbers of traffic buffers, threads, and processor cores. Anonymized hypersparse traffic matrices can be constructed at a rate of over 50,000,000 packets per second, exceeding a typical 400 Gigabit network link. This performance demonstrates that anonymized hypersparse traffic matrices are readily computable on edge network devices with minimal compute resources and can be a viable data product for such devices.
Title: Benchmarking Resource Usage for Efficient Distributed Deep Learning
Authors: Nathan C Frey, Baolin Li, Joseph McDonald, Dan Zhao, Michael Jones, David Bestor, Devesh Tiwari, V. Gadepally, S. Samsi
Venue: 2022 IEEE High Performance Extreme Computing Conference (HPEC)
DOI: 10.1109/HPEC55821.2022.9926375

Abstract: Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. It is therefore essential to understand how different deep neural networks (DNNs) and training regimes leverage increasing compute and energy resources, especially for specialized, computationally intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains and tasks (natural language processing, computer vision, and chemistry) on up to 424 graphics processing units (GPUs). During training, our experiments systematically vary compute resource characteristics and energy-saving mechanisms such as power utilization and GPU clock-rate limits to capture and illustrate the trade-offs and scaling behaviors each representative model exhibits under various resource- and energy-constrained regimes. We fit power-law models that describe how training time scales with available compute resources and energy constraints. We anticipate that these findings will help high-performance computing providers optimize resource utilization by selectively reducing energy consumption for different deep learning tasks and workflows with minimal impact on training.
{"title":"A Hierarchical Jacobi Iteration for Structured Matrices on GPUs using Shared Memory","authors":"M. S. Islam, Qiqi Wang","doi":"10.1109/HPEC55821.2022.9926410","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926410","url":null,"abstract":"This paper presents an algorithm to accelerate the Jacobi iteration for solving linear systems of equations arising from structured problems on graphics processing units (GPUs). Acceleration is achieved by utilization of on-chip GPU shared memory via a domain decomposition procedure. In particular, the problem domain is partitioned into subdomains whose data is copied to the shared memory of each GPU block. Jacobi iterations are performed internally within each block's shared memory while avoiding expensive global memory accesses every iteration, resulting in a hierarchical algorithm (which takes advantage of the GPU memory hierarchy). We investigate the algorithm performance on the linear systems arising from the discretization of Poisson's equation in 1D and 2D, and observe an 8x speedup in convergence in the 1D problem and a nearly 6x speedup in 2D compared to a conventional GPU implementation of Jacobi iteration which only relies on global memory.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117191476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}