Design space exploration for aggressive core replication schemes in CMPs
Lluc Alvarez, Ramon Bertran Monfort, Marc González, X. Martorell, N. Navarro, E. Ayguadé
IEEE International Symposium on High-Performance Parallel Distributed Computing, 2011-06-08
DOI: 10.1145/1996130.1996169 (https://doi.org/10.1145/1996130.1996169)
Abstract
Chip multiprocessors (CMPs) are the dominant architecture today. Current CMP designs vary widely in the number of cores and in their memory subsystems, because they are used across a wide spectrum of domains, each with its own design goals. This paper studies different chip configurations in terms of the number of cores, the size of the shared L3 cache, and off-chip bandwidth requirements in order to find the most efficient design for High Performance Computing applications. Results show that CMP schemes that reduce the shared L3 cache to make room for additional cores achieve speedups of up to 3.31x over a baseline architecture.