J. Pille, D. Wendel, O. Wagner, Rolf Sautter, Wolfgang Penth, Thomas Fröhnel, Stefan Büttner, O. Torreiter, Martin Eckert, Jose Paredes, David Hrusecky, David Ray, M. Canada
2010 IEEE International Solid-State Circuits Conference (ISSCC), pp. 344-345. Published 2010-03-18. DOI: 10.1109/ISSCC.2010.5433849
A 32kB 2R/1W L1 data cache in 45nm SOI technology for the POWER7™ processor
Increasing demand for parallelism from out-of-order and multi-threaded computation requires fast, dense arrays with multi-port capabilities. The load-store unit (LSU) of the POWER7™ microprocessor core has a 32kB L1 data cache composed of four 8kB blocks. In two-cycle back-to-back operation, it concurrently supports two independent reads and one write. The cache is organized in banks of 16 cells each, and the two reads operate independently in any of these banks, including two reads within the same bank or even the same cell. A bank selected for write is blocked for any read operation: if a read and a write collide within the same bank, collision-control circuitry gives the write priority over the read. Each read port provides 4B from 1 of 256 locations, whereas the double-bandwidth write operation provides individual control of 8B to 128 locations.
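
The bank-level collision rule in the abstract can be illustrated with a small behavioral sketch. This is not the authors' circuit: the function name `arbitrate`, the integer bank indices, and the boolean "may proceed" result are all assumptions made for illustration. The abstract does not say how many banks exist or what happens to a blocked read (for example, whether it is replayed), so the sketch only reports which reads are allowed in a given cycle.

```python
from typing import List, Optional, Sequence


def arbitrate(read_banks: Sequence[int], write_bank: Optional[int]) -> List[bool]:
    """Hypothetical model of write-over-read priority at bank granularity.

    read_banks: bank index targeted by each read port (at most two ports).
    write_bank: bank index targeted by the write port, or None if no write.
    Returns one flag per read port: True if that read may proceed.
    """
    if len(read_banks) > 2:
        raise ValueError("the cache described has only two read ports")
    # A bank selected for write is blocked for any read; reads never block
    # each other, even when both target the same bank.
    return [write_bank is None or bank != write_bank for bank in read_banks]


# Two reads in the same bank, write elsewhere: both reads proceed.
print(arbitrate([3, 3], write_bank=5))  # [True, True]
# One read collides with the write bank: that read is blocked.
print(arbitrate([3, 5], write_bank=5))  # [True, False]
# No write this cycle: nothing is blocked.
print(arbitrate([7, 7], write_bank=None))  # [True, True]
```

The model treats the two read ports symmetrically and gives the write unconditional priority, which matches the abstract's description at the level of detail it provides.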