Clustered Software Queue for Efficient Pipelined Multithreading
Authors: Yuanming Zhang, K. Ootsu, T. Yokota, T. Baba
Published in: 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies
Publication date: 2009-12-08
DOI: 10.1109/PDCAT.2009.24 (https://doi.org/10.1109/PDCAT.2009.24)
Citations: 3
Abstract
Multi-core processors have emerged as the predominant architecture. Parallelizing applications into multithreaded programs that execute on multiple cores is the key to achieving performance improvements. Recently proposed pipelined multithreading (PMT) techniques have shown great promise for parallelizing general applications. However, significant inter-core communication overheads limit the potential performance and hinder wide commercial use. While dedicated inter-core communication mechanisms have been proposed, they demand chip redesign effort, are costly, and need extensions to the ISA. Software queues avoid these problems. In this paper, we propose a clustered software queue technique, which applies a new clustered communication mechanism to minimize the average communication overhead. Our research shows that very low average communication overheads (ACOs) can be achieved by sacrificing a certain amount of parallelism. The principle of the clustered communication mechanism and how it reduces the ACOs are presented in detail. A concurrent lock-free clustered software queue algorithm is given and then evaluated on commodity multi-core processors. Experimental results show that the communication performance of the clustered software queue is over 10x faster than that of a conventional software queue, and much higher PMT performance is achieved for real applications.
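The abstract does not give the algorithm itself, so the following is only an illustrative sketch of one common way to amortize queue communication, not the authors' implementation. It assumes a single-producer/single-consumer ring buffer in which the producer publishes its tail index once per cluster of items and the consumer releases slots once per drained cluster. The names `ClusteredQueue`, `cluster_size`, `flush`, and `shared_accesses` are hypothetical. The model is single-threaded and simply counts accesses to the shared indices; a real concurrent version would need atomic indices with release/acquire ordering.

```python
class ClusteredQueue:
    """Single-threaded model of an SPSC ring buffer with clustered publication.

    Assumption (not from the paper): the cost being amortized is the access
    to the shared head/tail indices, counted here in `shared_accesses`.
    """

    def __init__(self, capacity, cluster_size):
        if cluster_size < 1 or capacity < cluster_size:
            raise ValueError("need 1 <= cluster_size <= capacity")
        self.buf = [None] * capacity
        self.capacity = capacity
        self.cluster_size = cluster_size
        # Indices that both sides touch (the expensive ones on real hardware).
        self.shared_tail = 0   # written by the producer, read by the consumer
        self.shared_head = 0   # written by the consumer, read by the producer
        # Producer-private state.
        self.local_tail = 0    # next slot to write (monotonic counter)
        self.cached_head = 0   # producer's last snapshot of shared_head
        # Consumer-private state.
        self.local_head = 0    # next slot to read (monotonic counter)
        self.cached_tail = 0   # consumer's last snapshot of shared_tail
        self.shared_accesses = 0

    def flush(self):
        """Publish any items the producer has written but not yet exposed."""
        if self.local_tail != self.shared_tail:
            self.shared_tail = self.local_tail
            self.shared_accesses += 1

    def enqueue(self, item):
        """Write one item; return False if the queue is full."""
        if self.local_tail - self.cached_head == self.capacity:
            # Looks full: expose pending items, then refresh the head snapshot.
            self.flush()
            self.cached_head = self.shared_head
            self.shared_accesses += 1
            if self.local_tail - self.cached_head == self.capacity:
                return False
        self.buf[self.local_tail % self.capacity] = item
        self.local_tail += 1
        # Publish only once a whole cluster has accumulated.
        if self.local_tail - self.shared_tail >= self.cluster_size:
            self.flush()
        return True

    def dequeue(self):
        """Read one item; return None if nothing has been published yet."""
        if self.local_head == self.cached_tail:
            # Snapshot exhausted: look at the shared tail once.
            self.cached_tail = self.shared_tail
            self.shared_accesses += 1
            if self.local_head == self.cached_tail:
                return None
        item = self.buf[self.local_head % self.capacity]
        self.local_head += 1
        if self.local_head == self.cached_tail:
            # Whole snapshot drained: release all of its slots in one write.
            self.shared_head = self.local_head
            self.shared_accesses += 1
        return item


if __name__ == "__main__":
    for k in (1, 4, 8):
        q = ClusteredQueue(capacity=16, cluster_size=k)
        for i in range(8):
            q.enqueue(i)
        print(f"cluster_size={k}: producer-side shared accesses = {q.shared_accesses}")
```

In this model, items in a partially filled cluster stay invisible to the consumer until the cluster completes or `flush()` is called. That delay is one way to picture the trade-off the abstract describes: fewer shared-index accesses per item in exchange for some lost overlap between pipeline stages.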