Characterizing the performance of node-aware strategies for irregular point-to-point communication on heterogeneous architectures
Shelby Lockhart, Amanda Bienz, William D. Gropp, Luke N. Olson
Parallel Computing, Volume 116 (July 2023), Article 103021. DOI: 10.1016/j.parco.2023.103021

Abstract: Supercomputer architectures are trending toward higher computational throughput due to the inclusion of heterogeneous compute nodes. These multi-GPU nodes increase on-node computational efficiency while also increasing the amount of data to be communicated and the number of potential data flow paths. In this work, we characterize the performance of irregular point-to-point communication with MPI in heterogeneous compute environments through performance modeling, demonstrating the limitations of standard communication strategies for both device-aware and staging-through-host communication techniques. The presented models suggest staging communicated data through host processes and then using node-aware communication strategies when inter-node message counts are high. Notably, the models also predict that node-aware communication utilizing all available CPU cores to communicate inter-node data is the most performant strategy when communicating with a high number of nodes. Model validation is provided via a case study of irregular point-to-point communication patterns in distributed sparse matrix–vector products. Importantly, we include a discussion of the implications the model predictions have for communication strategy design on emerging supercomputer architectures.
Segment based power-efficient scheduling for real-time DAG tasks on edge devices
Lei Yu, Tianqi Zhong, Peng Bi, Lan Wang, Fei Teng
Parallel Computing, Volume 116 (July 2023), Article 103022. DOI: 10.1016/j.parco.2023.103022

Abstract: Smart Mobile Devices (SMDs) are crucial for real-world sensing in the edge computing paradigm. Real-world sensing is typically represented by real-time applications, which are computationally intensive and periodic, with strict time constraints. Such applications call for increased processing speed, memory capacity, and battery life on SMDs, which are typically resource-constrained due to physical size restrictions. As a result, power-efficient scheduling of real-time applications on SMDs is crucial for the regular operation of edge computing platforms, and downstream decision-making tasks such as computation offloading require predicting the power consumption of power-saving approaches such as DVFS. The main question is how to quickly obtain a good solution to the NP-hard power-efficient scheduling problem with DVFS. By segmenting the aligned tasks on an SMD, we present a segment-based analysis approach, and we offer a segment-based scheduling algorithm (SEDF), inspired by this analysis, that achieves power-efficient scheduling for these real-time workloads. The segment-based approach yields a power consumption bound (PB), and a computation offloading use case is developed to demonstrate the application of PB in subsequent decision-making processes. Both simulations and tests on actual devices confirm PB, SEDF, and the effectiveness of the offloading decision-making. We demonstrate empirically that PB can be used to make approximately optimal decisions in computation offloading problems, and that SEDF is a straightforward and effective scheduling approach that can cut the power consumption of a multi-core SMD by roughly 30%.
{"title":"Efficient checkpoint/Restart of CUDA applications","authors":"Akira Nukada , Taichiro Suzuki , Satoshi Matsuoka","doi":"10.1016/j.parco.2023.103018","DOIUrl":"https://doi.org/10.1016/j.parco.2023.103018","url":null,"abstract":"<div><p>We present NVCR<span> which enables transparent checkpoint and restart of CUDA applications. NVCR, works as an extension of major system-level checkpoint software such as BLCR and DMTCP, employs proxy-process and application accesses GPU devices via the proxy-process to improve the compatibility with latest CUDA runtime software. To reduce the overhead of inter-process communications, NVCR efficiently uses SYSV IPC shared memory as CUDA pinned memory. Performance evaluations using micro benchmarks and Amber as a real application show that NVCR’ overhead is acceptably low.</span></p></div>","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"116 ","pages":"Article 103018"},"PeriodicalIF":1.4,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49728437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPU acceleration of Levenshtein distance computation between long strings","authors":"David Castells-Rufas","doi":"10.1016/j.parco.2023.103019","DOIUrl":"https://doi.org/10.1016/j.parco.2023.103019","url":null,"abstract":"<div><p>Computing edit distance for very long strings has been hampered by quadratic time complexity with respect to string length. The WFA algorithm reduces the time complexity to a quadratic factor with respect to the edit distance between the strings. This work presents a GPU implementation of the WFA algorithm and a new optimization that can halve the elements to be computed, providing additional performance gains. The implementation allows to address the computation of the edit distance between strings having hundreds of millions of characters. The performance of the algorithm depends on the similarity between the strings. For strings longer than million characters, the performance is the best ever reported, which is above TCUPS for strings with similarities greater than 70% and above one hundred TCUPS for 99.9% similarity.</p></div>","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"116 ","pages":"Article 103019"},"PeriodicalIF":1.4,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49728657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NPDP benchmark suite for the evaluation of the effectiveness of automatic optimizing compilers","authors":"Marek Palkowski, Wlodzimierz Bielecki","doi":"10.1016/j.parco.2023.103016","DOIUrl":"https://doi.org/10.1016/j.parco.2023.103016","url":null,"abstract":"<div><p><span>The paper presents a benchmark suite of ten non-serial polyadic dynamic programming<span> (NPDP) kernels, which are designed to test the efficiency of tiled code generated by polyhedral optimization compilers. These kernels are mainly derived from bioinformatics algorithms, which pose a significant challenge for automatic loop nest tiling transformations. The paper describes algorithms implemented with examined kernels and unifies them in the form of loop nests presented in the C language. The purpose is to reconsider the execution and monitoring of codes, typically used in past and current publications. For carrying out experiments with introduced benchmarks, we applied the two source-to-source compilers, PLuTo and TRACO, to generate cache-efficient codes and analyzed their performance on four multi-core machines. We discuss the limitations of well-known tiling approaches and outline future tiling strategies to generate effective tiled code by means of </span></span>optimizing compilers for introduced benchmarks.</p></div>","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"116 ","pages":"Article 103016"},"PeriodicalIF":1.4,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49728698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A parallel non-convex approximation framework for risk parity portfolio design","authors":"Yidong Chen , Chen Li , Yonghong Hu , Zhonghua Lu","doi":"10.1016/j.parco.2023.102999","DOIUrl":"https://doi.org/10.1016/j.parco.2023.102999","url":null,"abstract":"<div><p>In this paper, we propose a parallel non-convex approximation framework (NCAQ) for optimization problems whose objective is to minimize a convex function plus the sum of non-convex functions. Based on the structure of the objective function, our framework transforms the non-convex constraints to the logarithmic barrier function and approximates the non-convex problem by a parallel quadratic approximation scheme, which will allow the original problem to be solved by accelerated inexact gradient descent in the parallel environment. Moreover, we give a detailed convergence analysis for the proposed framework. The numerical experiments show that our framework outperforms the state-of-art approaches in terms of accuracy and computation time on the high dimension non-convex Rosenbrock test functions and the risk parity problems. In particular, we implement the proposed framework on CUDA, showing a more than 25 times speed-up ratio and removing the computational bottleneck for non-convex risk-parity portfolio design. Finally, we construct the high dimension risk parity portfolio which can consistently outperform the equal weight portfolio in the application of Chinese stock markets.</p></div>","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"116 ","pages":"Article 102999"},"PeriodicalIF":1.4,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49756831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey of software techniques to emulate heterogeneous memory systems in high-performance computing","authors":"Clément Foyer, Brice Goglin, Andrès Rubio Proaño","doi":"10.1016/j.parco.2023.103023","DOIUrl":"https://doi.org/10.1016/j.parco.2023.103023","url":null,"abstract":"<div><p><span>Heterogeneous memory will be involved in several upcoming platforms on the way to exascale. Combining technologies such as HBM, DRAM and/or </span>NVDIMM<span> allows to tackle the needs of different applications in terms of bandwidth, latency or capacity. And new memory interconnects such as CXL bring easy ways to attach these technologies to the processors.</span></p><p>High-performance computing developers must prepare their runtimes and applications for these architectures, even before they are actually available. Hence, we survey software solutions for emulating them. First, we list many ways to modify the performance of platforms so that developers may test their code under different memory performance profiles. This is required to identify kernels and data buffers that are sensitive to memory performance.</p><p>Then, we present several techniques for exposing fake heterogeneous memory information to the software stack. This is useful for adapting runtimes and applications to heterogeneous memory so that different kinds of memory are detected at runtime and so that buffers are allocated in the appropriate one.</p></div>","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"116 ","pages":"Article 103023"},"PeriodicalIF":1.4,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49756349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lightweight semi-centralized strategy for the massive parallelization of branching algorithms","authors":"Andres Pastrana-Cruz, Manuel Lafond","doi":"10.1016/j.parco.2023.103024","DOIUrl":"https://doi.org/10.1016/j.parco.2023.103024","url":null,"abstract":"<div><p>Several NP-hard problems are solved exactly using exponential-time branching strategies, whether it be branch-and-bound algorithms, or bounded search trees in fixed-parameter algorithms. The number of tractable instances that can be handled by sequential algorithms is usually small, whereas massive parallelization has been shown to significantly increase the space of instances that can be solved exactly. However, previous centralized approaches require too much communication to be efficient, whereas decentralized approaches are more efficient but have difficulty keeping track of the global state of the exploration.</p><p>In this work, we propose to revisit the centralized paradigm while avoiding previous bottlenecks. In our strategy, the center has lightweight responsibilities, requires only a few bits for every communication, but is still able to keep track of the progress of every worker. In particular, the center never holds any task but is able to guarantee that a process with no work always receives the highest priority task globally.</p><p>Our strategy was implemented in a generic C++ library called GemPBA, which allows a programmer to convert a sequential branching algorithm into a parallel version by changing only a few lines of code. An experimental case study on the vertex cover problem demonstrates that some of the toughest instances from the DIMACS challenge graphs that would take months to solve sequentially can be handled within two hours with our approach.</p></div>","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"116 ","pages":"Article 103024"},"PeriodicalIF":1.4,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49756350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lifeline-based load balancing schemes for Asynchronous Many-Task runtimes in clusters
Lukas Reitz, Kai Hardenbicker, Tobias Werner, Claudia Fohry
Parallel Computing, Volume 116 (July 2023), Article 103020. DOI: 10.1016/j.parco.2023.103020

Abstract: A popular approach to programming scalable irregular applications is Asynchronous Many-Task (AMT) programming. Here, programs define tasks according to task models such as dynamic independent tasks (DIT) or nested fork-join (NFJ). We consider cluster AMTs, in which a runtime system maps the tasks to worker threads in multiple processes.

Dynamic load balancing can be achieved via cooperative work stealing, coordinated work stealing, or work sharing. A well-performing cooperative work-stealing variant is the lifeline scheme. While previous implementations of this scheme are restricted to single-worker processes, a recent hybrid extension combines it with intra-process work sharing between multiple workers. The hybrid scheme, which was proposed for both DIT and NFJ, comes at the price of higher complexity.

This paper investigates whether this complexity is indispensable for multi-worker processes by contrasting the hybrid scheme with a novel pure work-stealing extension of the lifeline scheme to multiple workers. We independently implemented the extension for DIT and NFJ. In experiments based on four benchmarks, we observed the pure scheme to be on a par with or even outperform the hybrid one by up to 18% for DIT and up to 5% for NFJ.

Building on this main result, we studied a modification of the pure scheme that prefers local over global victims, and more heavily loaded over less loaded ones. The modification improves the performance of the pure scheme by up to 15%. Finally, we explored whether the lifeline scheme can profit from a change to coordinated work stealing. We developed a coordinated multi-worker implementation for DIT and observed a performance improvement over the cooperative scheme of up to 17%.