Parallelizing Doolittle Algorithm Using TBB
S. Sah, Dinesh Naik
2014 International Conference on Parallel, Distributed and Grid Computing, 2014-12-01
DOI: 10.1109/PDGC.2014.7030707 (https://doi.org/10.1109/PDGC.2014.7030707)

This paper presents a different approach to parallelizing the Doolittle algorithm with Intel Threading Building Blocks (TBB), allowing users to exploit the multiple cores of modern CPUs. The Parallel Doolittle Algorithm (PDA) is divided into three parts: decomposing the data, processing the data in parallel, and finally composing the data. Using the PDA, a linear system of equations can be solved in considerably less time than with the Serial Doolittle Algorithm (SDA). The PDA is implemented in C++ using the TBB library, which makes it highly efficient, cross-platform, and scalable. The efficiency of the PDA relative to the SDA was verified by comparing running times on matrices of different orders. The experiments showed that the PDA outperformed the SDA by utilizing all the cores present in the CPU.
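
For orientation, the serial baseline can be sketched as follows. This is the textbook Doolittle factorization A = LU (unit diagonal on L, no pivoting) followed by forward and back substitution. It is not the authors' PDA, whose decomposition scheme the abstract does not describe. The names `Matrix`, `Vector`, `LU`, `doolittle`, and `solve` are hypothetical and chosen only for illustration. The comments mark the loops whose iterations are independent within a step, which is where a TBB construct such as `tbb::parallel_for` could plausibly be applied.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Hypothetical container for the two triangular factors.
struct LU {
    Matrix L;  // lower triangular, unit diagonal
    Matrix U;  // upper triangular
};

// Textbook serial Doolittle factorization without pivoting (an assumption:
// the abstract does not say whether the paper pivots).
LU doolittle(const Matrix& A) {
    const std::size_t n = A.size();
    for (const auto& row : A) assert(row.size() == n);  // A must be square

    LU f{Matrix(n, Vector(n, 0.0)), Matrix(n, Vector(n, 0.0))};
    for (std::size_t i = 0; i < n; ++i) {
        // Row i of U. Each j reads only rows k < i, so the iterations over j
        // are independent of one another (a candidate for a parallel loop).
        for (std::size_t j = i; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < i; ++k) sum += f.L[i][k] * f.U[k][j];
            f.U[i][j] = A[i][j] - sum;
        }
        if (f.U[i][i] == 0.0)
            throw std::runtime_error("zero pivot: factorization needs pivoting");

        f.L[i][i] = 1.0;
        // Column i of L. Iterations over j are again mutually independent,
        // but they must run after U[i][i] has been computed above.
        for (std::size_t j = i + 1; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t k = 0; k < i; ++k) sum += f.L[j][k] * f.U[k][i];
            f.L[j][i] = (A[j][i] - sum) / f.U[i][i];
        }
    }
    return f;
}

// Solves A x = b given A = L U: forward substitution, then back substitution.
Vector solve(const LU& f, const Vector& b) {
    const std::size_t n = b.size();
    assert(f.L.size() == n && f.U.size() == n);

    Vector y(n), x(n);
    for (std::size_t i = 0; i < n; ++i) {      // L y = b
        double sum = 0.0;
        for (std::size_t k = 0; k < i; ++k) sum += f.L[i][k] * y[k];
        y[i] = b[i] - sum;                     // L[i][i] == 1
    }
    for (std::size_t i = n; i-- > 0;) {        // U x = y
        double sum = 0.0;
        for (std::size_t k = i + 1; k < n; ++k) sum += f.U[i][k] * x[k];
        x[i] = (y[i] - sum) / f.U[i][i];
    }
    return x;
}
```

As a usage example, factoring A = {{2, 1, 1}, {4, 3, 3}, {8, 7, 9}} with `doolittle` and calling `solve` with b = {7, 19, 49} yields x = (1, 2, 3). The outer loop over i stays sequential because step i depends on all earlier rows of U and columns of L. Only the work inside a step can be distributed across cores.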