José González, Qiong Cai, P. Chaparro, G. Magklis, R. Rakvic, Antonio González
{"title":"Thread fusion","authors":"José González, Qiong Cai, P. Chaparro, G. Magklis, R. Rakvic, Antonio González","doi":"10.1145/1393921.1394018","DOIUrl":null,"url":null,"abstract":"This work proposes Thread Fusion as an effective way of reducing power consumption when a Simultaneous Multi-Threaded (SMT) core is executing two threads from a homogeneous parallel application. Two dynamic instances of the same static instruction, each from a different thread are merged (fused) into a single instruction, consuming half of the resources of front-end pipeline stages. When the fused instruction is executed, it is cloned and it proceeds at full bandwidth. Our simulation results show average energy reduction of 10% with less than 1% impact on performance.","PeriodicalId":166672,"journal":{"name":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"11","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceeding of the 13th international symposium on Low power electronics and design (ISLPED '08)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1393921.1394018","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 11
Abstract
This work proposes Thread Fusion as an effective way of reducing power consumption when a Simultaneous Multi-Threaded (SMT) core is executing two threads from a homogeneous parallel application. Two dynamic instances of the same static instruction, each from a different thread are merged (fused) into a single instruction, consuming half of the resources of front-end pipeline stages. When the fused instruction is executed, it is cloned and it proceeds at full bandwidth. Our simulation results show average energy reduction of 10% with less than 1% impact on performance.