Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region: Latest Articles, Page 2

AshPipe: Asynchronous Hybrid Pipeline Parallel for DNN Training

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2024-01-18 DOI: 10.1145/3635035.3635045

Ryubu Hosoki, Toshio Endo, Takahiro Hirofuchi, Tsutomu Ikegami

Citations: 0

QUBO formulation using inequalities for problems with complex constraints

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2024-01-18 DOI: 10.1145/3635035.3635042

Tomoko Komiyama, Tomohiro Suzuki

Citations: 0

Bruck Algorithm Performance Analysis for Multi-GPU All-to-All Communication

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2024-01-18 DOI: 10.1145/3635035.3635047

Andres Sewell, Ke Fan, Ahmedur Rahman Shovon, Landon Dyken, Sidharth Kumar, Steve Petruzza

Citations: 0

Non-Blocking GPU-CPU Notifications to Enable More GPU-CPU Parallelism

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2024-01-18 DOI: 10.1145/3635035.3635036

Bengisu Elis, Olga Pearce, David Boehme, J. Burmark, Martin Schulz

Citations: 0

Parallelized Remapping Algorithms for km-scale Global Weather and Climate Simulations with Icosahedral Grid System

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2024-01-18 DOI: 10.1145/3635035.3635040

C. Kodama, H. Yashiro, Takashi Arakawa, Daisuke Takasuka, Shuhei Matsugishi, Hirofumi Tomita

Citations: 0

Information Entropy-based Camera Focus Point and Zoom Level Adjustment for Smart In-Situ Visualization

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2024-01-18 DOI: 10.1145/3635035.3635049

Taisei Matsushima, Ken Iwata, Naohisa Sakamoto, J. Nonaka, Chongke Bi

Citations: 0

An Efficient Task-Parallel Pipeline Programming Framework

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2024-01-18 DOI: 10.1145/3635035.3635037

Cheng-Hsiang Chiu, Zhicheng Xiong, Zizheng Guo, Tsung-Wei Huang, Yibo Lin

Citations: 1

Associative Operator Precedence Parsing: A Method To Increase Data Parsing Parallelism

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2023-02-27 DOI: 10.1145/3578178.3578233

Le Li, K. Taura

Abstract: Data often arrive in high volume in textual formats (JSON, XML, CSV). Because parsing can easily dominate data analysis time, researchers have been working on parallelizing parsing. Among candidate parsing methods, Operator Precedence Parsing (OPP) is amenable to parallelization, and a practical algorithm has been proposed. Its "locally parsable" property allows the parser to deduce whether a reduction is safe with limited context. However, when the grammar has productions that tend to produce a highly skewed parse tree, OPP performs reductions mostly in serial, and parsing still suffers from a long critical path. In practice, OPP achieves little or even no speedup when parsing data, because data often contain a high percentage of parallel elements (e.g., JSON array elements separated by commas) produced from such productions, a situation that frequently occurs when processing big data. To address this issue and scale textual data parsing, we propose a parsing algorithm that lifts the restriction of deterministic parsing. For an ambiguous grammar, the parser non-deterministically produces a subtree for parallel elements. Such parsers can still produce deterministic semantics when the operator that connects these subtrees is considered associative for data analysis (e.g., map-union). We thus name the algorithm Associative OPP (AOPP), where parsing a large sequence of parallel elements can enjoy much parallelism as reductions can happen in any order. We show that AOPP is of practical use and scales in most cases through textual data parsing.
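
The abstract's central point is that an associative connecting operator lets reductions happen in any order. The sketch below is only an illustration of that property, not the AOPP algorithm from the paper: `parse_element`, `merge`, `reduce_left`, and `reduce_tree` are hypothetical names, and the `key=value` element format is an assumption made for the example. It compares a serial left-to-right reduction with a balanced pairwise reduction, whose merges within each level are independent and could run in parallel.

```python
from functools import reduce


def parse_element(text: str) -> dict:
    """Hypothetical stand-in for parsing one element of the form 'key=value'."""
    key, value = text.split("=")
    return {key.strip(): int(value)}


def merge(a: dict, b: dict) -> dict:
    """Map-union where keys from b override keys from a (associative)."""
    out = dict(a)
    out.update(b)
    return out


def reduce_left(parts: list) -> dict:
    """Serial reduction: the critical path grows linearly with len(parts)."""
    return reduce(merge, parts, {})


def reduce_tree(parts: list) -> dict:
    """Balanced pairwise reduction: about log2(n) levels, and the merges
    inside one level do not depend on each other."""
    if not parts:
        return {}
    while len(parts) > 1:
        parts = [
            merge(parts[i], parts[i + 1]) if i + 1 < len(parts) else parts[i]
            for i in range(0, len(parts), 2)
        ]
    return parts[0]


elements = "a=1, b=2, c=3, d=4, e=5".split(",")
parts = [parse_element(e) for e in elements]

# Both groupings give the same map because merge is associative.
assert reduce_left(parts) == reduce_tree(parts)
```

The pairwise version keeps the elements in their original left-to-right order, so it relies only on associativity and not on commutativity.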

Citations: 1

Exploiting Data Parallelism in Graph-Based Simultaneous Localization and Mapping: A Case Study with GPU Accelerations

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2023-02-27 DOI: 10.1145/3578178.3578237

Junyuan Zheng, Yuan He, Masaaki Kondo

Abstract: Graph-based simultaneous localization and mapping (G-SLAM) is an intuitive SLAM implementation in which graphs represent poses, landmarks, and sensor measurements as a mobile robot builds a map of the environment and locates itself in it. SLAM is an important application employed in many realistic scenarios, but estimating the whole environment and all trajectories by solving graph problems can incur a large amount of computation and consume a significant amount of energy. To improve both performance and energy efficiency, we identify the critical path of the G-SLAM algorithm in this paper and implement a GPU-based solution to aid it. Furthermore, we offload performance-critical components of the G-SLAM process (such as matrix inversions when updating the trajectory) to GPUs through CUDA to exploit data parallelism. With our solution, we observe a speed-up of up to 19.7x and an energy saving of up to 83.7% over a modern workstation-class x86 CPU, while on a platform dedicated to edge computing (NVIDIA Jetson Nano), we achieve a speed-up of up to 2.5x and an energy saving of up to 6.4% with its integrated GPU.

Citations: 0

Memory Usage Prediction of HPC Workloads Using Feature Engineering and Machine Learning

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Pub Date: 2023-02-27 DOI: 10.1145/3578178.3578241

Md Nahid Newaz, Md Atiqul Mollah

Citations: 1