APRIL: a processor architecture for multiprocessing
A. Agarwal, B. Lim, D. Kranz, J. Kubiatowicz
[1990] Proceedings. The 17th Annual International Symposium on Computer Architecture, 1990-05-01
DOI: 10.1145/325164.325119 (https://doi.org/10.1145/325164.325119)
Citations: 447
Abstract
The architecture of a rapid-context-switching processor called APRIL, with support for fine-grain threads and synchronization, is described. APRIL achieves high single-thread performance and supports virtual dynamic threads. An implementation of APRIL based on a commercial reduced-instruction-set-computer (RISC) processor and a run-time software system that can switch contexts in about 10 cycles are described. Measurements taken for several parallel applications on an APRIL simulator show that the overhead for supporting parallel tasks based on futures is reduced by a factor of 2 over a corresponding implementation on the Encore Multimax. The scalability of a multiprocessor based on APRIL is explored using a performance model. The authors show that the SPARC-based implementation of APRIL can achieve close to 80% processor utilization with as few as three resident threads per processor in a large-scale cache-based machine with an average base network latency of 55 cycles.
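
To give a feel for how context-switch cost, network latency, and the number of resident threads interact, the following is a minimal sketch of a simplified textbook-style multithreading utilization model. It is not the performance model used by the authors, which also accounts for cache and network effects. The 10-cycle switch cost and 55-cycle latency are taken from the abstract; the run length between remote misses (`run_length = 40`) is a hypothetical value chosen purely for illustration, and the function name `utilization` is likewise an assumption.

```python
def utilization(threads, run_length, switch_cost, latency):
    """Processor utilization under a simplified multithreading model.

    Assumptions (not from the paper): every thread runs for exactly
    `run_length` cycles, then stalls for `latency` cycles on a remote
    access, and each context switch costs `switch_cost` cycles.
    """
    # Linear region: too few threads to hide the latency, so the
    # processor idles for part of every (run + switch + latency) period.
    linear = threads * run_length / (run_length + switch_cost + latency)
    # Saturated region: latency is fully hidden and only the
    # context-switch overhead is lost.
    saturated = run_length / (run_length + switch_cost)
    return min(linear, saturated)


if __name__ == "__main__":
    # switch_cost and latency come from the abstract; run_length is hypothetical.
    for p in range(1, 5):
        print(p, round(utilization(p, run_length=40, switch_cost=10, latency=55), 3))
```

In this sketch utilization grows linearly with the number of threads until latency is fully hidden, after which it is capped by the context-switch overhead alone. This illustrates why a fast context switch lets a small number of resident threads suffice.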