Performance Optimization of Tridiagonal Matrix Algorithm [TDMA] on Multicore Architectures: Computational Framework and Mathematical Modelling
Anishchandran Chathalingath, A. Manoharan
International Journal of Grid and High Performance Computing, pp. 1-12, published 2019-10-01
DOI: https://doi.org/10.4018/ijghpc.2019100101
Abstract
Fast and efficient tridiagonal solvers are in high demand in scientific and engineering domains, but optimizing them is a challenging task for computer engineers. State-of-the-art developments in multi-core computing pave the way to meeting this challenge to an extent: they provide opportunities to exploit lower levels of parallelism and concurrency in inherently sequential algorithms. In this article, the authors present an optimal-performance pipelined parallel variant of the conventional Tridiagonal Matrix Algorithm (TDMA), also known as the Thomas algorithm, on a multi-core CPU platform. The implementation, analysis, and performance comparison of the proposed pipelined parallel TDMA and the conventional version are performed on an Intel SIMD multi-core architecture. The results are compared in terms of elapsed time, speedup, and cache miss rate. For a system of n linear equations with n = 2^36, the presented pipelined parallel TDMA achieves a speedup of 1.294X with a parallel efficiency of 43% initially, and tends towards linear speedup as the system grows.
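
For readers unfamiliar with the baseline, the following is a minimal sketch of the conventional (sequential) Thomas algorithm that the abstract refers to. It is not the authors' pipelined parallel variant, which the abstract does not describe in detail. The function name `thomas_solve` and the array layout (`a` sub-diagonal, `b` main diagonal, `c` super-diagonal, `d` right-hand side) are assumptions made for illustration, and the sketch assumes a system that needs no pivoting, for example a diagonally dominant one.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the sequential Thomas algorithm.

    Hypothetical layout, chosen for this sketch only:
      a[i] : sub-diagonal entry of row i (a[0] is unused)
      b[i] : main-diagonal entry of row i
      c[i] : super-diagonal entry of row i (c[n-1] is unused)
      d[i] : right-hand side of row i
    Assumes no pivoting is needed (e.g. diagonal dominance).
    """
    n = len(d)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side

    # Forward elimination: each step depends on the previous one,
    # which is why the algorithm is inherently sequential.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution: again a loop-carried dependency, now on x[i + 1].
    x = [0.0] * n
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Small usage example: rows of the form -x[i-1] + 2*x[i] - x[i+1] = d[i]
a = [0.0, -1.0, -1.0, -1.0]
b = [2.0, 2.0, 2.0, 2.0]
c = [-1.0, -1.0, -1.0, 0.0]
d = [0.0, 0.0, 0.0, 5.0]
x = thomas_solve(a, b, c, d)  # close to [1.0, 2.0, 3.0, 4.0]
```

Both loops carry a dependency from one iteration to the next, so neither can be split naively across cores; this is the obstacle that a pipelined parallel formulation has to work around.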