The waveform ICGS technique for parallel transient simulation of semiconductor devices

Proceedings 16th Annual International Symposium on High Performance Computing Systems and Applications. Pub Date: 2002-06-16. DOI: 10.1109/HPCSA.2002.1019148

L. Yang


Citations: 0

Abstract

This paper studies the parallelization of accelerated waveform relaxation algorithms for the transient simulation of semiconductor devices on parallel distributed-memory computers. These methods are competitive with standard pointwise methods on serial architectures and are significantly faster on parallel computers. We use an improved parallel version of the conjugate gradient squared method (ICGS), which combines elements of numerical stability and parallel algorithm design, to solve the sequence of time-varying sparse linear differential-algebraic initial-value problems that arises at each linearization step of waveform Newton. We reorganize the algorithm so that all the inner products, matrix-vector multiplications, and vector updates of a single iteration step are independent, and the communication time required for the inner products can be overlapped efficiently with the computation time of the vector updates. As a result, the performance bottleneck, namely the cost of global communication on parallel distributed-memory computers, can be significantly reduced. The resulting ICGS algorithm maintains the favorable properties of the original algorithm without increasing the computational cost.
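For orientation, the following is a minimal sketch of the standard (textbook) conjugate gradient squared iteration, not the reorganized ICGS variant studied in the paper, whose details the abstract does not give. It is meant only to show where the inner products sit in each iteration: on a distributed-memory machine each one would require a global reduction, which is the communication cost the abstract refers to. The function name `cgs`, the parameters `tol` and `max_iter`, and the dense NumPy setting are illustrative assumptions.

```python
import numpy as np

def cgs(A, b, x0=None, tol=1e-10, max_iter=200):
    """Textbook (unpreconditioned) CGS for A x = b.

    Illustrative sketch only; this is NOT the paper's ICGS reorganization.
    Returns (x, iterations_used).
    """
    n = b.shape[0]
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    r = b - A @ x                 # initial residual
    r_tilde = r.copy()            # fixed shadow residual
    p = np.zeros(n)
    q = np.zeros(n)
    rho_old = 1.0
    b_norm = np.linalg.norm(b) or 1.0

    for k in range(max_iter):
        # The norm is itself a reduction in a distributed setting.
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k

        rho = r_tilde @ r         # inner product 1: global reduction
        if rho == 0.0:
            raise ArithmeticError("CGS breakdown: rho == 0")

        if k == 0:
            u = r.copy()
            p = u.copy()
        else:
            beta = rho / rho_old  # needs rho before any update can start
            u = r + beta * q
            p = u + beta * (q + beta * p)

        v = A @ p                 # matrix-vector product 1
        sigma = r_tilde @ v       # inner product 2: global reduction
        if sigma == 0.0:
            raise ArithmeticError("CGS breakdown: sigma == 0")
        alpha = rho / sigma       # needs sigma before any update can start

        q = u - alpha * v
        w = u + q
        x = x + alpha * w         # vector updates
        r = r - alpha * (A @ w)   # matrix-vector product 2
        rho_old = rho

    return x, max_iter
```

In this standard form the vector updates cannot begin until the preceding inner product has completed, so each reduction acts as a synchronization point. That dependency is the kind of coupling the abstract says the reorganized algorithm removes, so that reduction latency can be hidden behind vector updates.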
