S. Hsu, A. Agarwal, S. Realov, M. Anders, Gregory K. Chen, Monodeep Kar, Raghavan Kumar, H. Sumbul, Phil C. Knag, Himanshu Kaul, Vikram B. Suresh, S. Mathew, Iqbal Rajwani, Satish Damaraju, R. Krishnamurthy, V. De
2020 IEEE Symposium on VLSI Circuits, published 2020-06-01. DOI: 10.1109/VLSICircuits18222.2020.9163007 (https://doi.org/10.1109/VLSICircuits18222.2020.9163007)
Low-Clock-Power Digital Standard Cell IPs for High-Performance Graphics/AI Processors in 10nm CMOS
Low-clock-power digital standard cell IPs in 10nm CMOS, featuring low-power shared-clock (LPSC) flip-flops (FFs), LPSC back-to-back (B2B) FFs, and pass-gate (PG) integrated clock gates (ICGs), achieve up to 14%, 45%, and 14% measured clock energy improvements, respectively, by reducing the number of clocked devices over state-of-the-art conventional transmission-gate (TG) FF and AND ICG circuits. The LPSC FF achieves a mean worst-case black-hole-time (BHT) improvement of 17ps, while the PG ICG achieves a mean enable/disable setup time improvement of 16ps/15ps, compared to conventional circuits measured at 650mV, 25°C. Power analysis of a graphics processor block with these optimized IPs results in an overall 6% clock power reduction without frequency impact.