Polynomial preconditioning accelerates iterative methods for large-scale sparse linear systems by improving the spectral distribution and reducing the communication overhead of global reductions. The Neumann polynomial is notable for its simple construction and stable performance, which make it easy to combine with other preconditioners and widely used in high-performance computing. The choice of scaling parameter in the Neumann series is critical for polynomial acceleration and requires an accurate estimate of the eigenvalue bounds of the preconditioned system. In preconditioned systems, however, the clustering of the largest eigenvalues often slows the convergence of the iterative methods used to estimate the maximum eigenvalue, so the scaling parameter is systematically underestimated. We address this issue with a least-squares model under linear inequality constraints that learns effective combination weights of Ritz values from training samples. The Rayleigh-Ritz process, the current best eigenvalue-estimation approach, requires 20 to 30 iterations and systematically underestimates extremal eigenvalues because Ritz values lie in the interior of the spectrum. Our constrained optimization approach achieves comparable accuracy in 10 iterations by learning optimal combination weights from the Ritz value distribution, and it corrects the systematic underestimation while preserving positive definiteness, a critical stability requirement for robust preconditioning across diverse problem configurations. Our implementation of the Neumann polynomial with the proposed scaling scheme achieved acceleration ratios of 2.61 and 3.52 for ILU (incomplete LU factorization) and block-ILU preconditioned systems, respectively. In ILU-preconditioned systems it matches the acceleration of the recent state-of-the-art minimum-residual polynomial and frequently provides better convergence acceleration in practical scenarios.
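To make the pipeline concrete, the following is a minimal Python sketch, not the paper's implementation: it obtains Ritz values from a short Lanczos run, forms a maximum-eigenvalue estimate from a nonnegative combination of the largest Ritz values, sets the Neumann scaling parameter from that estimate, and applies the scaled Neumann polynomial. The 2-D Laplacian test matrix, the helper names `lanczos_ritz` and `neumann_apply`, and the weights `w` with the 1.1 inflation factor are illustrative assumptions; the paper instead learns the combination weights from training samples via least squares with linear inequality constraints.

```python
import numpy as np
import scipy.sparse as sp


def lanczos_ritz(A, k, rng):
    """Run k Lanczos steps and return the Ritz values, i.e. the eigenvalues of the
    k-by-k tridiagonal projection T_k (no reorthogonalization; illustration only)."""
    n = A.shape[0]
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    q_prev = np.zeros(n)
    alphas, betas = [], []
    beta = 0.0
    for _ in range(k):
        w = A @ q - beta * q_prev
        alpha = q @ w
        w -= alpha * q
        beta = np.linalg.norm(w)
        alphas.append(alpha)
        betas.append(beta)
        if beta < 1e-12:
            break
        q_prev, q = q, w / beta
    T = np.diag(alphas) + np.diag(betas[:-1], 1) + np.diag(betas[:-1], -1)
    return np.linalg.eigvalsh(T)


def neumann_apply(A, v, theta, m):
    """Apply p_m(A) v = theta * sum_{j=0}^{m} (I - theta*A)^j v, which approximates
    A^{-1} v whenever the spectral radius of (I - theta*A) is below 1."""
    s = v.copy()
    x = theta * s
    for _ in range(m):
        s = s - theta * (A @ s)   # s <- (I - theta*A) s
        x += theta * s
    return x


# Stand-in SPD test problem: 2-D Dirichlet Laplacian on an nx-by-nx grid.
nx = 50
T1 = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(nx, nx))
A = (sp.kron(sp.identity(nx), T1) + sp.kron(T1, sp.identity(nx))).tocsr()
lam_true = 8.0 * np.sin(np.pi * nx / (2 * (nx + 1))) ** 2  # analytic lambda_max

rng = np.random.default_rng(0)
ritz = lanczos_ritz(A, k=10, rng=rng)  # 10-iteration budget, as in the abstract

# Placeholder correction: a nonnegative combination of the two largest Ritz values,
# inflated by a safety factor. The paper instead learns such weights from training
# samples via constrained least squares.
w = np.array([0.2, 0.8])                  # hypothetical weights, w >= 0
lam_est = 1.1 * (w @ np.sort(ritz)[-2:])  # hypothetical inflation factor

theta = 1.0 / lam_est                     # Neumann scaling parameter
v = rng.standard_normal(A.shape[0])
x = neumann_apply(A, v, theta, m=4)

print(f"largest Ritz value : {ritz.max():.4f}")
print(f"corrected estimate : {lam_est:.4f}   (true lambda_max = {lam_true:.4f})")
print(f"theta * lambda_max : {theta * lam_true:.3f}   (< 2 keeps p_m(A) positive definite)")
```

The last check reflects the standard sufficient condition 0 < theta * lambda < 2 for every eigenvalue, which guarantees convergence of the Neumann series and keeps the polynomial preconditioner symmetric positive definite; an underestimated maximum eigenvalue makes theta too large and is what puts this condition at risk in practice.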