{"title":"The Viability of Using Online Prediction to Perform Extra Work while Executing BSP Applications","authors":"P. Chen, Pouya Haghi, J.-Y. Chung, Tong Geng, R. West, A. Skjellum, Martin C. Herbordt","doi":"10.1109/HPEC55821.2022.9926405","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926405","url":null,"abstract":"A fundamental problem in parallel processing is the difficulty in efficiently partitioning work: the result is that much of a parallel program's execution time is often spent idle or performing overhead operations. We propose to improve the efficiency of system resource utilization by having idle processes execute extra work. We develop a method whereby the execution of extra work is optimized through performance prediction and the setting of limits (a deadline) on the duration of the extra work execution. In our preliminary experiments of proxy BSP applications on a production supercomputer we find that this approach is promising with all five applications benefiting from this approach, with an average of 12 % improvement.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117070654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Powering Practical Performance: Accelerated Numerical Computing in Pure Python","authors":"Matthew Penn, Chris Milroy","doi":"10.1109/HPEC55821.2022.9926309","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926309","url":null,"abstract":"In this paper, we tackle a generic n-dimensional numerical computing problem to compare performance and analyze tradeoffs between popular frameworks using open source Jupyter notebook examples. Most data science practitioners perform their work in Python because of its high-level abstraction and rich set of numerical computing libraries. However, the choice of library and methodology is driven by complexity-impacting constraints like problem size, latency, memory, physical size, weight, power, hardware, and others. To that end, we demonstrate that a wide selection of GPU-accelerated libraries (RAPIDS, CuPy, Numba, Dask), including the development of hand-tuned CUDA kernels, are accessible to data scientists without ever leaving Python. We address the Python developer community by showing C/C++ is not necessary to access single/multi-GPU acceleration for data science applications. We solve a common numerical computing problem - finding the closest point in array B from every point (and its index) in array A, requiring up to 8.8 trillion distance comparisons - on a GPU-equipped workstation without writing a line of C/C++.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114374474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Ankle Moment Trajectory with Adaptive Weighted Ensemble of LSTM Networks","authors":"E. Grzesiak, Jennifer Sloboda, H. Siu","doi":"10.1109/HPEC55821.2022.9926370","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926370","url":null,"abstract":"Estimations of ankle moments can provide clinically helpful information on the function of lower extremities and further lead to insight on patient rehabilitation and assistive wearable exoskeleton design. Current methods for estimating ankle moments leave room for improvement, with most recent cutting-edge methods relying on machine learning models trained on wearable sEMG and IMU data. While machine learning eliminates many practical challenges that troubled more traditional human body models for this application, we aim to expand on prior work that showed the feasibility of using LSTM models by employing an ensemble of LSTM networks. We present an adaptive weighted LSTM ensemble network and demonstrate its performance during standing, walking, running, and sprinting. Our result show that the LSTM ensemble outperformed every single LSTM model component within the ensemble. Across every activity, the ensemble reduced median root mean squared error (RMSE) by 0.0017-0.0053 N. m/kg, which is 2.7 – 10.3% lower than the best performing single LSTM model. Hypothesis testing revealed that most reductions in RMSE were statistically significant between the ensemble and other single models across all activities and subjects. Future work may analyze different trajectory lengths and different combinations of LSTM submodels within the ensemble.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125903796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Interactive Autonomous Navigation Simulations on HPC","authors":"W. Brewer, Joel U. Bretheim, John Kaniarz, Peilin Song, Burhman Q. Gates","doi":"10.1109/HPEC55821.2022.9926384","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926384","url":null,"abstract":"We present our work of enabling HPC in an interactive real-time autonomy loop. The workflow consists of many different software components deployed within Singu-larity containers and communicating using both the Robotic Operating System's (ROS) publish-subscribe system and the Message Passing Interface (MPI). We use Singularity's container networking interface (CNI) to enable virtual networking within the containers, so that multiple containers can run the various components using different IP addresses on the same compute node. The Virtual Autonomous Navigation Environment Environmental Sensor Engine (VANE: ESE) is used for physically-realistic simulation of LIDAR along with the Autonomous Navigation Virtual Environment Laboratory (ANVEL) for vehicle simulation. VANE: ESE sends Velodyne UDP LIDAR packets directly to the Robotic Technology Kernel (RTK) and is distributed across multiple compute nodes via MPI along with OpenMP for shared memory parallelism within each compute node. The user interfaces with the navigation environment using an XFCE desk-top with virtual workspaces running over a VNC containerized deployment through a double-hop ssh tunnel, which uses noVNC (a JavaScript-based VNC client) to provide a browser-based client interface. We automate the complete launch process using a custom iLauncher plugin. We benchmark scalable performance with multiple vehicle simulations on four different HPC systems and discuss our findings.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127271769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AutoPager: Auto-tuning Memory-Mapped I/O Parameters in Userspace","authors":"Karim Youssef, Niteya Shah, M. Gokhale, R. Pearce, Wu-chun Feng","doi":"10.1109/HPEC55821.2022.9926409","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926409","url":null,"abstract":"The exponential growth in dataset sizes has shifted the bottleneck of high-performance data analytics from the compute subsystem to the memory and storage subsystems. This bottleneck has led to the proliferation of non-volatile memory (NVM). To bridge the performance gap between the Linux I/O subsystem and NVM, userspace memory-mapped I/O enables application-specific I/O optimizations. Specifically, UMap, an open-source userspace memory-mapping tool, exposes tunable paging parameters to application users, such as page size and degree of paging concurrency. Tuning these parameters is computationally intractable due to the vast search space and the cost of evaluating each parameter combination. To address this challenge, we present Autopager, a tool for auto-tuning userspace paging parameters. Our evaluation, using five data-intensive applications with UMap, shows that Autopager automatically achieves comparable performance to exhaustive tuning with 10 x less tuning overhead. and 16.3 x and 1.52 x speedup over UMap with default parameters and UMap with page-size only tuning, respectively.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"8 7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126282235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Edge-Connected Jaccard Similarity for Graph Link Prediction on FPGA","authors":"P. Sathre, Atharva Gondhalekar, Wu-chun Feng","doi":"10.1109/HPEC55821.2022.9926326","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926326","url":null,"abstract":"Graph analysis is a critical task in many fields, such as social networking, epidemiology, bioinformatics, and fraud de-tection. In particular, understanding and inferring relationships between graph elements lies at the core of many graph-based workloads. Real-world graph workloads and their associated data structures create irregular computational patterns that compli-cate the realization of high-performance kernels. Given these complications, there does not exist a de facto “best” architecture, language, or algorithmic approach that simultaneously balances performance, energy efficiency, portability, and productivity. In this paper, we realize different algorithms of edge-connected Jaccard similarity for graph link prediction and characterize their performance across a broad spectrum of graphs on an Intel Stratix 10 FPGA. By utilizing a high-level synthesis (HLS)-driven, high-productivity approach (via the C++-based SYCL language) we rapidly prototype two implementations - a from-scratch edge-centric version and a faithfully-ported commodity GPU implementation - which would have been intractable via a hardware description language. With these implementations, we further consider the benefit and necessity of four HLS-enabled optimizations, both in isolation and in concert - totaling seven distinct synthesized hardware pipelines. Leveraging real-world graphs of up to 516 million edges, we show empirically-measured speedups of up to 9.5 x over the initial HLS implementations when all optimizations work in concert.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130369622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How to Prevent a Sick ASIC","authors":"W. Ellersick","doi":"10.1109/HPEC55821.2022.9926305","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926305","url":null,"abstract":"High performance computing systems increasingly require mixed-signal ASICs to achieve competitive speed, power efficiency and cost. The integration of processing, transceivers, sensors and power management results in dramatic reductions in size, which can yield great savings in power, enabling higher performance. However, few design elements demand such high quality as a mixed-signal ASIC. In this paper, actual near-disasters from decades of integrated circuit design are presented along with methods to prevent potentially severe damage to projects, careers, and even companies. Such stories of failure are rarely told, but acknowledging them is crucial to avoid repeating the mistakes and to reduce ASIC development risk to ultimately ensure success. Key takeaways include planning for failure with designed-in observability, controllability and workarounds; the use of simple and robust circuits; and that organizing the people can be as challenging and important as arranging the transistors.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115075339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Distributed-memory Triangle Counting by Exploiting the Graph Structure","authors":"Sayan Ghosh","doi":"10.1109/HPEC55821.2022.9926376","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926376","url":null,"abstract":"Graphs are ubiquitous in modeling complex systems and representing interactions between entities to uncover structural information of the domain. Traditionally, graph analytics workloads are challenging to efficiently scale (both strong and weak cases) on distributed memory due to the irregular memory-access driven nature (with little or no computations) of the meth-ods. The structure of graphs and their relative distribution over the processing elements poses another level of complexity, making it difficult to attain sustainable scalability across platforms. In this paper, we discuss enhancements to TriC, a distributed-memory implementation of graph triangle counting using Mes-sage Passing Interface (MPI), which was featured in the 2020 Graph Challenge competition. We have made some incremental enhancements to TriC, primarily adopting a user-defined buffering strategy to overcome the startup problem for large graphs (by fixing the memory for intermediate data), and experimenting with probabilistic data structures such as bloom filter to improve the query response time for assessing edge existence, at the expense of increasing the overall false positive rate. These adjustments have led to a modest improvements in most cases, as compared to the previous version.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134090681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Systolic Array based FPGA accelerator for Yolov3-tiny","authors":"Prithvi Velicheti, Sivani Pentapati, Suresh Purini","doi":"10.1109/HPEC55821.2022.9926371","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926371","url":null,"abstract":"FPGAs are increasingly significant for deploying convolutional neural network (CNN) inference models because of performance demands and power constraints in embedded and data centre applications. Object detection and classification are essential tasks in computer vision. You Only Look Once (YOLO) is a very efficient algorithm for object detection and classification with its variant Yolov3-tiny specially designed for embedded applications. This paper presents the FPGA accelerator for multiple precisions (FIXED-8, FIXED-16, FLOAT32) of YoloV3-tiny. We use a homogenous systolic array architecture with a synchronized pipeline adder tree for convolution, allowing it to be scalable for multiple variants of Yolo with a change in host driver. We evaluated the design on Terasic DE5a-Net-DDR4. The Fixed point (FP-8, FP-16) implementations attain a throughput of 57 GOPs/s (> 23%) and 46.16 GOPs/s (> 340 %). We synthesized the first FLOAT32 imnlementation attaining 11.22 GFLOPs/s.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130849942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of a Novel Scratchpad Memory through Compiler Supported Simulation","authors":"Essa Imhmed, Jonathan J. Cook, Abdel-Hameed A. Badawy","doi":"10.1109/HPEC55821.2022.9926335","DOIUrl":"https://doi.org/10.1109/HPEC55821.2022.9926335","url":null,"abstract":"Local Memory Store (LMStore) is a novel hardware-controlled, compiler-managed Scratchpad memory (SPM) design [1], with an initial research evaluation that showed its possibility for improving program performance. This initial evaluation was performed over memory traces prior to the development of compiler support for LMStore. In this paper, we present compiler support for the LMStore design, and present experimental results that better evaluate LMStore performance. Experimental results on benchmarks from Malardalen benchmark suite [2] executing on the LMStore architecture modeled in Multi2Sim demonstrate that a hybrid LMStore-Cache architecture improves execution time by an average of 19.8 %, compared to a conventional cache-only architecture.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132518558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}