Parallel Process. Lett.: Latest Articles

A Note on the Steiner k-Diameter of Tensor Product Networks
Parallel Process. Lett. Pub Date : 2019-06-01 DOI: 10.1142/S0129626419500087
Pranav Arunandhi, E. Cheng, Christopher Melekian
Abstract: Given a graph [Formula: see text] and [Formula: see text], the Steiner distance [Formula: see text] is the minimum size among all connected subgraphs of [Formula: see text] whose vertex sets contain [Formula: see text]. The Steiner [Formula: see text]-diameter [Formula: see text] is the maximum value of [Formula: see text] among all sets of [Formula: see text] vertices. In this short note, we study the Steiner [Formula: see text]-diameters of the tensor product of complete graphs.
Citations: 1
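The definitions in this abstract can be illustrated with a brute-force sketch. It is not the paper's method. It assumes that "size" means the number of edges, so a minimum connected subgraph is a tree and its size equals its vertex count minus one, and that the graph is connected. The function names are hypothetical, and the search is exponential, so it only suits very small graphs.

```python
from itertools import combinations

def tensor_product_complete(m, n):
    """Adjacency sets of K_m x K_n: (a, b) ~ (c, d) iff a != c and b != d."""
    verts = [(a, b) for a in range(m) for b in range(n)]
    return {v: {w for w in verts if w[0] != v[0] and w[1] != v[1]} for v in verts}

def is_connected(adj, nodes):
    """Check whether `nodes` induces a connected subgraph of `adj`."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v] & nodes:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes

def steiner_distance(adj, terminals):
    """Fewest edges in a connected subgraph containing `terminals`.

    Tries vertex supersets of `terminals` in order of increasing size;
    returns None if no connected superset exists.
    """
    terminals = set(terminals)
    others = [v for v in adj if v not in terminals]
    for extra in range(len(others) + 1):
        for added in combinations(others, extra):
            if is_connected(adj, terminals | set(added)):
                return len(terminals) + extra - 1
    return None

def steiner_k_diameter(adj, k):
    """Maximum Steiner distance over all k-vertex sets (assumes `adj` is connected)."""
    return max(steiner_distance(adj, s) for s in combinations(adj, k))
```

For k = 2 the Steiner distance is the ordinary shortest-path distance, so `steiner_k_diameter(tensor_product_complete(3, 3), 2)` gives the usual diameter of that graph.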
The Generalized Connectivity of Data Center Networks
Parallel Process. Lett. Pub Date : 2019-06-01 DOI: 10.1142/S0129626419500075
Chen Hao, Weihua Yang
Abstract: The generalized [Formula: see text]-connectivity of a graph [Formula: see text] is a parameter that can measure the reliability of a network [Formula: see text] to connect any [Formula: see text] vertices in [Formula: see text], which is a generalization of traditional connectivity. Let [Formula: see text] and [Formula: see text] denote the maximum number [Formula: see text] of edge-disjoint trees [Formula: see text] in [Formula: see text] such that [Formula: see text] for any [Formula: see text] and [Formula: see text]. For an integer [Formula: see text] with [Formula: see text], the generalized [Formula: see text]-connectivity of a graph [Formula: see text] is defined as [Formula: see text] and [Formula: see text]. Data centers are essential to the business of companies such as Google, Amazon, Facebook and Microsoft. Based on data centers, the data center networks [Formula: see text], introduced by Guo et al. in 2008, have many desirable properties. In this paper, we study the generalized [Formula: see text]-connectivity of [Formula: see text] and show that [Formula: see text] for [Formula: see text] and [Formula: see text].
Citations: 5
Round Robin Thread Selection Optimization in Multithreaded Processors
Parallel Process. Lett. Pub Date : 2019-05-10 DOI: 10.1142/S0129626419500038
Shane Carroll, Wei-Ming Lin
Abstract: We propose a variation of round-robin ordering in a multi-threaded pipeline to increase system throughput and resource distribution fairness. We show that using round robin with a typical arbitrary ordering results in inefficient use of shared resources and subsequent thread starvation. To address this but still use a simple round-robin approach, we optimally and dynamically sort the order of the round robin periodically at runtime. We show that with 4-threaded workloads, throughput can be improved by over 9% and harmonic throughput by over 3% by sorting thread order at runtime. We experiment with multiple stages of the pipeline and show consistent results throughout several experiments using the SPEC CPU 2006 benchmarks. Furthermore, since the technique is still a simple round robin, the increased performance requires little overhead to implement.
Citations: 0
Efficient Algebraic Multigrid Preconditioners on Clusters of GPUs
Parallel Process. Lett. Pub Date : 2019-05-10 DOI: 10.1142/S0129626419500014
A. A. Hassan, V. Cardellini, P. D'Ambra, D. Serafino, S. Filippone
Abstract: Many scientific applications require the solution of large and sparse linear systems of equations using Krylov subspace methods; in this case, the choice of an effective preconditioner may be crucial for the convergence of the Krylov solver. Algebraic MultiGrid (AMG) methods are widely used as preconditioners, because of their optimal computational cost and their algorithmic scalability. The wide availability of GPUs, now found in many of the fastest supercomputers, poses the problem of efficiently implementing these methods on high-throughput processors. In this work we focus on the application phase of AMG preconditioners, and in particular on the choice and implementation of smoothers and coarsest-level solvers capable of exploiting the computational power of clusters of GPUs. We consider block-Jacobi smoothers using sparse approximate inverses in the solve phase associated with the local blocks. The choice of approximate inverses instead of sparse matrix factorizations is driven by the large amount of parallelism exposed by the matrix-vector product as compared to the solution of large triangular systems on GPUs. The selected smoothers and solvers are implemented within the AMG preconditioning framework provided by the MLD2P4 library, using suitable sparse matrix data structures from the PSBLAS library. Their behaviour is illustrated in terms of execution speed and scalability, on a test case concerning groundwater modelling, provided by the Jülich Supercomputing Center within the Horizon 2020 Project EoCoE.
Citations: 4
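The abstract's point that an approximate inverse turns the solve step into matrix-vector products can be sketched as follows. This is an illustration only, not the MLD2P4 or PSBLAS API. It assumes a single local block stored as dense Python lists, and it uses the inverse of the diagonal as a stand-in for a sparse approximate inverse. The names `matvec` and `smooth` are hypothetical.

```python
def matvec(M, x):
    """Dense matrix-vector product; each output entry can be computed independently."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def smooth(A, M, b, x, sweeps):
    """Apply `sweeps` iterations of x <- x + M (b - A x), where M approximates inv(A).

    Every step is a matrix-vector product or a vector update, so no
    triangular solve (which is inherently sequential) is needed.
    """
    for _ in range(sweeps):
        residual = [b_i - ax_i for b_i, ax_i in zip(b, matvec(A, x))]
        correction = matvec(M, residual)
        x = [x_i + c_i for x_i, c_i in zip(x, correction)]
    return x

# Toy 2x2 system (an assumption for illustration, not data from the paper).
A = [[4.0, 1.0], [1.0, 3.0]]
M = [[1.0 / 4.0, 0.0], [0.0, 1.0 / 3.0]]  # inverse of diag(A), the simplest approximate inverse
b = [1.0, 2.0]
x = smooth(A, M, b, [0.0, 0.0], sweeps=50)
```

With this choice of `M` the iteration reduces to plain Jacobi, which converges here because `A` is strictly diagonally dominant. A better approximate inverse would need fewer sweeps at the cost of a denser `M`.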
Efficient Communication Induced Checkpointing Protocol for Broadcast Network-based Distributed Systems
Parallel Process. Lett. Pub Date : 2019-05-10 DOI: 10.1142/S012962641950004X
Jinho Ahn
Abstract: This paper proposes an enhanced Fully Informed Communication-Induced Checkpointing (FI-CIC) protocol that greatly improves the likelihood of detecting Z-cycle free patterns, without extra control messages, by exploiting the broadcast network more effectively than the original FI-CIC protocol. Experimental results show that our protocol outperforms the previous one in terms of the number of forced checkpoints per process.
Citations: 2
Implementing ♢P with Bounded Messages on a Network of ADD Channels
Parallel Process. Lett. Pub Date : 2019-05-10 DOI: 10.1142/S0129626419500026
Saptaparni Kumar, J. Welch
Abstract: We present an implementation of the eventually perfect failure detector [Formula: see text] from the original hierarchy of the Chandra-Toueg [3] oracles on an arbitrary partitionable network composed of unreliable channels that can lose and reorder messages. Prior implementations of [Formula: see text] have assumed different partially synchronous models ranging from bounded point-to-point message delay and reliable communication to unbounded message size and known network topologies. We implement [Formula: see text] under very weak assumptions on an arbitrary, partitionable network composed of Average Delayed/Dropped (ADD) channels [11] to model unreliable communication. Unlike older implementations, our failure detection algorithm uses bounded-sized messages to eventually detect all nodes that are unreachable (crashed or disconnected) from it.
Citations: 6
Optimizing Data Intensive Flows for Networks on Chips
Parallel Process. Lett. Pub Date : 2018-12-18 DOI: 10.1142/S0129626421500134
Junwei Zhang, Yang Liu, Shi Li, T. Robertazzi
Abstract: A novel framework is proposed to find efficient data intensive flow distributions on Networks on Chip (NoC). Voronoi diagram techniques are used to divide a NoC array of homogeneous processors and links into clusters. A new mathematical tool, named the flow matrix, is proposed to find the optimal flow distribution for individual clusters. Individual flow distributions on clusters are reconciled to be more evenly distributed. This leads to an efficient makespan and significant savings in the number of cores actually used. The approach here is described in terms of a mesh interconnection but is suitable for other interconnection topologies.
Citations: 3
Reconfigurable Hardware Generation of Multigrid Solvers with Conjugate Gradient Coarse-Grid Solution
Parallel Process. Lett. Pub Date : 2018-12-01 DOI: 10.1142/S0129626418500160
Christian Schmitt, Moritz Schmid, S. Kuckuk, H. Köstler, Jürgen Teich, Frank Hannig
Abstract: Field programmable gate arrays (FPGAs) are an increasingly popular accelerator technology, and not only in the field of high-performance computing (HPC). However, they use a completely different programming paradigm and tool set compared to central processing units (CPUs) or even graphics processing units (GPUs), adding extra development steps and requiring special knowledge, which hinders widespread use in scientific computing. To bridge this programmability gap, domain-specific languages (DSLs) are a popular choice to generate low-level implementations from an abstract algorithm description. In this work, we demonstrate our approach for the generation of numerical solver implementations based on the multigrid method for FPGAs from the same code base that is also used to generate code for CPUs using a hybrid parallelization of MPI and OpenMP. Our approach yields a hardware design that can compute up to 11 V-cycles per second with an input grid size of 4096[Formula: see text]4096 and solution on the coarsest grid using the conjugate gradient (CG) method on a mid-range FPGA, beating vectorized, multi-threaded execution on an Intel Xeon processor.
Citations: 3
Regular Connected Bipancyclic Spanning Subgraphs of Torus Networks
Parallel Process. Lett. Pub Date : 2018-12-01 DOI: 10.1142/S0129626418500135
M. Lu, Shurong Zhang, Weihua Yang
Abstract: It is well known that an [Formula: see text]-dimensional torus [Formula: see text] is Hamiltonian. Then the torus [Formula: see text] contains a spanning subgraph which is 2-regular and 2-connected. In this paper, we explore a strong property of torus networks. We prove that for any even integer [Formula: see text] with [Formula: see text], the torus [Formula: see text] contains a spanning subgraph which is [Formula: see text]-regular, k-connected and bipancyclic; and if [Formula: see text] is odd, the result holds when some [Formula: see text] is even.
Citations: 1
Fractional Matching Preclusion for (n, k)-Star Graphs
Parallel Process. Lett. Pub Date : 2018-12-01 DOI: 10.1142/S0129626418500172
Tianlong Ma, Y. Mao, E. Cheng, Jinling Wang
Abstract: The matching preclusion number of a graph is the minimum number of edges whose deletion results in a graph that has neither perfect matchings nor almost perfect matchings. As a generalization, Liu and Liu introduced the concept of fractional matching preclusion number in 2017. The Fractional Matching Preclusion Number (FMP number) of G is the minimum number of edges whose deletion leaves the resulting graph without a fractional perfect matching. The Fractional Strong Matching Preclusion Number (FSMP number) of G is the minimum number of vertices and/or edges whose deletion leaves the resulting graph without a fractional perfect matching. In this paper, we obtain the FMP number and the FSMP number for (n, k)-star graphs. In addition, all the optimal fractional strong matching preclusion sets of these graphs are categorized.
Citations: 7
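The first definition in this abstract, the classical (non-fractional) matching preclusion number, can be checked by exhaustive search on tiny graphs. The sketch below is not the paper's technique and does not cover the fractional variants, which would need a linear program. The function names are hypothetical, and the graph is assumed to be given as a vertex count plus a list of edge tuples.

```python
from itertools import combinations

def max_matching_size(edges):
    """Size of a maximum matching, by exhaustive search (tiny graphs only)."""
    if not edges:
        return 0
    (u, v), rest = edges[0], edges[1:]
    # Either skip edge (u, v), or take it and drop every edge touching u or v.
    skip = max_matching_size(rest)
    take = 1 + max_matching_size([e for e in rest if u not in e and v not in e])
    return max(skip, take)

def matching_preclusion_number(n_vertices, edges):
    """Fewest edge deletions that leave no perfect or almost perfect matching.

    A perfect matching (n even) or almost perfect matching (n odd) has
    exactly n // 2 edges, so the search stops at the first deletion set
    that pushes the maximum matching below that size.
    """
    edges = list(edges)
    target = n_vertices // 2
    for k in range(len(edges) + 1):
        for removed in combinations(edges, k):
            kept = [e for e in edges if e not in removed]
            if max_matching_size(kept) < target:
                return k
    return None

k4_edges = list(combinations(range(4), 2))       # complete graph on 4 vertices
c4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]      # 4-cycle
print(matching_preclusion_number(4, k4_edges))   # 3: isolate one vertex
print(matching_preclusion_number(4, c4_edges))   # 2: isolate one vertex
```

In both toy graphs an optimal deletion set is the set of edges at a single vertex, which matches the intuition that isolating a vertex is the cheapest way to destroy every perfect matching.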