Parallel efficient hierarchical algorithms for module placement of large chips on distributed memory architectures

Proceedings. International Conference on Parallel Computing in Electrical Engineering. Pub Date: 2002-09-22. DOI: 10.1109/PCEE.2002.1115310

L. Yang



Abstract

The PROUD module placement algorithm relies mainly on hierarchical decomposition and on the solution of sparse linear systems derived from a resistive network analogy. PROUD has been shown to produce placements for very large circuits that are comparable in quality to those of the best placement algorithm based on simulated annealing, while running several orders of magnitude faster. The modified PROUD algorithm (MPROUD), which perturbs the coefficient matrices, runs much faster than the original PROUD algorithm. Because MPROUD is unstable and its convergence is not guaranteed, we have proposed a convergent and numerically stable variant, the Improved PROUD algorithm (IPROUD), which solves the module placement problem at attractive computational cost by using the SYMMLQ and MINRES methods based on the Lanczos process (Yang, 1997). We then propose parallel versions of the improved PROUD algorithms. The parallel algorithm is derived so that all inner products and matrix-vector multiplications within a single iteration step are independent. As a result, the cost of global communication, which is the bottleneck for parallel performance on distributed memory computers, can be reduced significantly, yielding another order of magnitude improvement in runtime without loss of layout quality.
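To make the resistive network analogy concrete, the following is a minimal sketch and not the PROUD, MPROUD, or IPROUD implementation. It assumes a one-dimensional toy problem in which each net acts as a resistor between two modules, fixed pads act as voltage sources, and the module positions solve a sparse symmetric linear system. All names (`build_system`, `conjugate_gradient`, the netlist itself) are hypothetical. A plain conjugate gradient loop stands in for SYMMLQ and MINRES, which is only valid here because the toy matrix is symmetric positive definite. Hierarchical decomposition and parallelism are omitted.

```python
# Toy 1-D quadratic placement (illustrative assumption, not the paper's code).

def build_system(n_movable, nets, pads):
    """Assemble A x = b for n_movable modules.

    nets: (i, j, weight) connections between movable modules i and j.
    pads: (i, pad_x, weight) connections from module i to a fixed pad at pad_x.
    """
    A = [[0.0] * n_movable for _ in range(n_movable)]
    b = [0.0] * n_movable
    for i, j, w in nets:
        A[i][i] += w
        A[j][j] += w
        A[i][j] -= w
        A[j][i] -= w
    for i, pad_x, w in pads:
        A[i][i] += w          # the pad adds conductance on the diagonal
        b[i] += w * pad_x     # and a source term on the right-hand side
    return A, b


def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-12, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (stand-in solver)."""
    x = [0.0] * len(b)
    r = b[:]                  # residual for the zero initial guess
    p = r[:]
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs <= tol * tol:
            break
        Ap = matvec(A, p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


# Chain: pad at x=0, then module 0, module 1, module 2, then pad at x=4.
nets = [(0, 1, 1.0), (1, 2, 1.0)]
pads = [(0, 0.0, 1.0), (2, 4.0, 1.0)]
A, b = build_system(3, nets, pads)
positions = conjugate_gradient(A, b)
```

With unit weights, the three modules settle evenly between the two pads, at approximately 1, 2, and 3. Each iteration of such a solver needs inner products and a matrix-vector product; on a distributed memory machine every inner product implies a global reduction, which is the communication cost the parallel algorithm in the abstract is designed to reduce.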
