Workshop Proceedings of the 49th International Conference on Parallel Processing: Latest Articles

Symmetric Tokens based Group Mutual Exclusion
A. Aravind
{"title":"Symmetric Tokens based Group Mutual Exclusion","authors":"A. Aravind","doi":"10.1145/3409390.3409395","DOIUrl":"https://doi.org/10.1145/3409390.3409395","url":null,"abstract":"The group mutual exclusion (GME) problem is a generalization of the mutual exclusion problem. The problem is fundamental to parallel and distributed processing, as it is inherent in several applications in the modern multicore-integrated cloud era of the distributed computing world. This paper proposes a First-Come-First-Served (FCFS) GME algorithm that only uses atomic read/write operations for n threads. The proposed algorithm has three key features: (i) its simplicity; (ii) it has complexity in both space (shared variable requirement) and time (remote memory references (RMR)) in cache coherent (CC) models; and (iii) it settles the open problem posed in 2001.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115081700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Fast Modeling of Network Contention in Batch Point-to-point Communications by Packet-level Simulation with Dynamic Time-stepping
Zhang Yang, Jintao Peng, Qingkai Liu
{"title":"Fast Modeling of Network Contention in Batch Point-to-point Communications by Packet-level Simulation with Dynamic Time-stepping","authors":"Zhang Yang, Jintao Peng, Qingkai Liu","doi":"10.1145/3409390.3409398","DOIUrl":"https://doi.org/10.1145/3409390.3409398","url":null,"abstract":"Network contention has long been one of the root causes of performance loss in large-scale parallel applications. With the increasing importance of performance modeling to both large-scale application optimization and application-system co-design, the conflict of speed and accuracy in contention modeling is becoming prominent. Cycle-accurate network simulators are often too slow for large scale applications, while point-to-point analytical models are not accurate enough to capture the contention effects. To model the network contention in batch point-to-point communications, we propose a unified contention model after the flow-fair end-to-end congestion control mechanism. The model uses packet-level simulations to be accurate, but can be approximated by a flow-level semi-analytical model when messages are large enough, thus is fast. Furthermore, we propose a dynamic time-stepping technique which significantly speeds up the packet-level simulation with only minor accuracy loss. 
Experiments with typical communication patterns and application traces show that our model accurately predicts the communication time with an average error of 9% (fixed time step), and the dynamic time-stepping technique improves the simulation performance by up to 131-fold with an average accuracy loss of 10.5% for real application traces.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115218870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Communication-aware Job Scheduling using SLURM
P. Mishra, Tushar Agrawal, Preeti Malakar
{"title":"Communication-aware Job Scheduling using SLURM","authors":"P. Mishra, Tushar Agrawal, Preeti Malakar","doi":"10.1145/3409390.3409410","DOIUrl":"https://doi.org/10.1145/3409390.3409410","url":null,"abstract":"Job schedulers play an important role in selecting optimal resources for the submitted jobs. However, most of the current job schedulers do not consider job-specific characteristics such as communication patterns during resource allocation. This often leads to sub-optimal node allocations. We propose three node allocation algorithms that consider the job’s communication behavior to improve the performance of communication-intensive jobs. We develop our algorithms for tree-based network topologies. The proposed algorithms aim at minimizing network contention by allocating nodes on the least contended switches. We also show that allocating nodes in powers of two leads to a decrease in inter-switch communication for MPI communications, which further improves performance. We implement and evaluate our algorithms using SLURM, a widely-used and well-known job scheduler. We show that the proposed algorithms can reduce the execution times of communication-intensive jobs by 9% (326 hours) on average. The average wait time of jobs is reduced by 31% across three supercomputer job logs.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129069036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Preference Aware Smart Hospital Selection System for Patients
Md. Solaiman Chowdhury, Jenifar Rahman, Md. Mahfuzur Rahman
{"title":"Preference Aware Smart Hospital Selection System for Patients","authors":"Md. Solaiman Chowdhury, Jenifar Rahman, Md. Mahfuzur Rahman","doi":"10.1145/3409390.3409391","DOIUrl":"https://doi.org/10.1145/3409390.3409391","url":null,"abstract":"With the rapid enhancement of wireless and mobile technologies, the context information of the user or environment can now easily be collected and analyzed to create useful services. The traditional healthcare facilities in most developing countries do not provide their medical services with equal quality. The patients face lots of difficulties in choosing the best-suited medical services or hospitals when they become sick. To make the proper decision for appropriate services, the patients need to consider many criteria that often create complexity. An efficient system is required to help the patients automatically accumulate the information necessary in making correct medical service selection. In this paper, we have proposed a preference-aware hospital selection model integrated into a cloud computing based context-aware system to satisfy the patients in selecting appropriate services. Through experimentation, we have shown that the developed system makes decisions accurately for the patients.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125589167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A GCC-based Compliance Checker for Single-translation-unit, Identifier-related MISRA-C Rules
Guan-Ren Wang, Peng-Sheng Chen
{"title":"A GCC-based Compliance Checker for Single-translation-unit, Identifier-related MISRA-C Rules","authors":"Guan-Ren Wang, Peng-Sheng Chen","doi":"10.1145/3409390.3409396","DOIUrl":"https://doi.org/10.1145/3409390.3409396","url":null,"abstract":"MISRA-C is a well-defined software specification for the C programming language that gives programmers criteria to develop reliable programs. This paper implements a MISRA-C compliance checker based on the GCC compiler infrastructure. It focuses on identifier-related rules that are single-translation-unit-labeled. We describe and develop strategies for implementing the checking codes. We also discuss the rules that can be detected by existing GCC options. For the tested benchmark programs, the modified GCC compiler can correctly assess compliance with the target MISRA-C rules.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"2030 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129774815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Assessing the Overhead of Offloading Compression Tasks
L. Promberger, R. Schwemmer, H. Fröning
{"title":"Assessing the Overhead of Offloading Compression Tasks","authors":"L. Promberger, R. Schwemmer, H. Fröning","doi":"10.1145/3409390.3409405","DOIUrl":"https://doi.org/10.1145/3409390.3409405","url":null,"abstract":"Exploring compression is increasingly promising as trade-off between computations and data movement. There are two main reasons: First, the gap between processing speed and I/O continues to grow, and technology trends indicate a continuation of this. Second, performance is determined by energy efficiency, and the overall power consumption is dominated by the consumption of data movements. For these reasons there is already a plethora of related works on compression from various domains. Most recently, a couple of accelerators have been introduced to offload compression tasks from the main processor, for instance by AHA, Intel and Microsoft. Yet, one lacks the understanding of the overhead of compression when offloading tasks. In particular, such offloading is most beneficial for overlap with other tasks, if the associated overhead on the main processor is negligible. This work evaluates the integration costs compared to a solely software-based solution considering multiple compression algorithms. Among others, High Energy Physics data are used as a prime example of big data sources. The results imply that on average the zlib implementation on the accelerator achieves a comparable compression ratio to zlib level 2 on a CPU, while having up to 17 times the throughput and utilizing over 80 % less CPU resources. These results suggest that, given the right orchestration of compression and data movement tasks, the overhead of offloading compression is limited but present. 
Considering that compression is only a single task of a larger data processing pipeline, this overhead cannot be neglected.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122763264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Improving the Space-Time Efficiency of Matrix Multiplication Algorithms
Yuan Tang
{"title":"Improving the Space-Time Efficiency of Matrix Multiplication Algorithms","authors":"Yuan Tang","doi":"10.1145/3409390.3409404","DOIUrl":"https://doi.org/10.1145/3409390.3409404","url":null,"abstract":"Classic cache-oblivious parallel matrix multiplication algorithms achieve optimality either in time or space, but not both, which promotes lots of research on the best possible balance or trade-off of such algorithms. We study modern processor-oblivious runtime systems and figure out several ways to improve algorithm’s time complexity while still bounding space and cache requirements to be asymptotically optimal. By our study, we give out sub-linear time, optimal work, space and caching algorithms for both general matrix multiplication on a semiring and Strassen-like fast algorithms on a ring. Our experiments show such algorithms have empirical advantages over classic counterparts. Our study provides new insights and research angles on how to optimize cache-oblivious parallel algorithms from both theoretical and empirical perspectives.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122846715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Network and Load-Aware Resource Manager for MPI Programs
Ashish Kumar Kumar, N. Jain, Preeti Malakar
{"title":"Network and Load-Aware Resource Manager for MPI Programs","authors":"Ashish Kumar Kumar, N. Jain, Preeti Malakar","doi":"10.1145/3409390.3409406","DOIUrl":"https://doi.org/10.1145/3409390.3409406","url":null,"abstract":"We present a resource broker for MPI jobs in a shared cluster, considering the current compute load and available network bandwidths. MPI programs are generally communication-intensive. Thus the current network availability between the compute nodes impacts performance. Many existing resource allocation techniques mostly consider static node attributes and some dynamic resource attributes. This does not lead to a good allocation in case of shared clusters because the network usage and system load vary. We developed a load and network-aware heuristic for resource allocation. We incorporated the current network state in our heuristic. It is able to reduce execution times by more than 38% on average as compared to the default allocation.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122013226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
BSRNG: A High Throughput Parallel BitSliced Approach for Random Number Generators
Saleh Khalaj Monfared, Omid Hajihassani, M. Kiarostami, S. M. Zanjani, Dara Rahmati, S. Gorgin
{"title":"BSRNG: A High Throughput Parallel BitSliced Approach for Random Number Generators","authors":"Saleh Khalaj Monfared, Omid Hajihassani, M. Kiarostami, S. M. Zanjani, Dara Rahmati, S. Gorgin","doi":"10.1145/3409390.3409402","DOIUrl":"https://doi.org/10.1145/3409390.3409402","url":null,"abstract":"In this work, a high throughput method for generating high-quality Pseudo-Random Numbers using the bitslicing technique is proposed. In such a technique, instead of the conventional row-major data representation, column-major data representation is employed, which allows the bitslicing implementation to take full advantage of all the available datapath of the hardware platform. By employing this data representation as building blocks of algorithms, we showcase the capability and scalability of our proposed method in various PRNG methods in the category of block and stream ciphers. The LFSR-based (Linear Feedback Shift Register) nature of the PRNG in our implementation perfectly suits the GPU’s many-core structure due to its register oriented architecture. In the proposed SIMD vectorized GPU implementation, each GPU thread can generate several 32 pseudo-random bits in each LFSR clock cycle. We then compare our implementation with some of the most significant PRNGs that display a satisfactory performance throughput and randomness criteria. The proposed implementation successfully passes the NIST test for statistical randomness and bit-wise correlation criteria. For computer-based PRNG and the optical solutions in terms of performance and performance per cost, this technique is efficient while maintaining an acceptable randomness measure. 
Our highest performance among all of the implemented CPRNGs with the proposed method is achieved by the MICKEY 2.0 algorithm, which shows 40% improvement over state of the art NVIDIA’s proprietary high-performance PRNG, cuRAND library, achieving 2.72 Tb/s of throughput on the affordable NVIDIA GTX 2080 Ti.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"111 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127508806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Randomized Authentication using IBE for Opportunistic Networks
Kai Wang, Kazuya Sakai
{"title":"Randomized Authentication using IBE for Opportunistic Networks","authors":"Kai Wang, Kazuya Sakai","doi":"10.1145/3409390.3409392","DOIUrl":"https://doi.org/10.1145/3409390.3409392","url":null,"abstract":"Opportunistic networks (ONs) are widely used in many critical network applications, and security/privacy issues in ONs are critical for their wide adoption. In this paper, we propose a randomized authentication protocol which consists of node registration and authentication phases using identity-based encryption (IBE) and a trust framework. The key ideas of our authentication protocol are to generate public keys from publicly available node IDs, and that not only the central registration server but also the nodes with a high trust value can authenticate nodes in a network. By doing this, our protocol is lightweight and the authentication process is randomized in a distributed way. In addition, to accommodate the disadvantage of IBE, we introduce the idea of distributed KGCs (key generation centers) and the trust framework. The protocol level security of the proposed scheme is proven by indistinguishability-based provable security analysis using random oracles, and the qualitative security analyses for various attacks are conducted.","PeriodicalId":350506,"journal":{"name":"Workshop Proceedings of the 49th International Conference on Parallel Processing","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128326974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2