电晕:新兴纳米光子技术的系统含义

D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. Jouppi, Marco Fiorentino, A. Davis, N. Binkert, R. Beausoleil, Jung Ho Ahn
{"title":"电晕:新兴纳米光子技术的系统含义","authors":"D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. Jouppi, Marco Fiorentino, A. Davis, N. Binkert, R. Beausoleil, Jung Ho Ahn","doi":"10.1145/1394608.1382135","DOIUrl":null,"url":null,"abstract":"We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance, memory and inter-core bandwidths will also have to scale by orders of magnitude. Pin limitations, the energy cost of electrical signaling, and the non-scalability of chip-length global wires are significant bandwidth impediments. Recent developments in silicon nanophotonic technology have the potential to meet these off- and on-stack bandwidth requirements at acceptable power levels. Corona is a 3 D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Its peak floating-point performance is 10 teraflops. Dense wavelength division multiplexed optically connected memory modules provide 10 terabyte per second memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabyte per second bandwidth. We have simulated a 1024 thread Corona system running synthetic benchmarks and scaled versions of the SPLASH-2 benchmark suite. We believe that in comparison with an electrically-connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory intensive workloads, while simultaneously reducing power.","PeriodicalId":190082,"journal":{"name":"2008 International Symposium on Computer Architecture","volume":"81 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"690","resultStr":"{\"title\":\"Corona: System Implications of Emerging Nanophotonic Technology\",\"authors\":\"D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. Jouppi, Marco Fiorentino, A. Davis, N. Binkert, R. Beausoleil, Jung Ho Ahn\",\"doi\":\"10.1145/1394608.1382135\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance, memory and inter-core bandwidths will also have to scale by orders of magnitude. Pin limitations, the energy cost of electrical signaling, and the non-scalability of chip-length global wires are significant bandwidth impediments. Recent developments in silicon nanophotonic technology have the potential to meet these off- and on-stack bandwidth requirements at acceptable power levels. Corona is a 3 D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Its peak floating-point performance is 10 teraflops. Dense wavelength division multiplexed optically connected memory modules provide 10 terabyte per second memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabyte per second bandwidth. We have simulated a 1024 thread Corona system running synthetic benchmarks and scaled versions of the SPLASH-2 benchmark suite. We believe that in comparison with an electrically-connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory intensive workloads, while simultaneously reducing power.\",\"PeriodicalId\":190082,\"journal\":{\"name\":\"2008 International Symposium on Computer Architecture\",\"volume\":\"81 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"690\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 International Symposium on Computer Architecture\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1394608.1382135\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 International Symposium on Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1394608.1382135","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 690

摘要

我们预计,在未来十年内,多核微处理器将把每片芯片的性能从10千兆次浮点运算提升到10万亿次浮点运算。为了支持这种提高的性能,内存和核间带宽也必须按数量级进行扩展。引脚的限制,电信号的能量成本,以及芯片长度的全局线的不可扩展性是显著的带宽障碍。硅纳米光子技术的最新发展有可能在可接受的功率水平下满足这些栈外和栈内带宽要求。Corona是一种3d多核架构,使用纳米光子通信进行核间通信和栈外通信到内存或I/O设备。它的峰值浮点运算性能是每秒10万亿次浮点运算。密集波分多路光连接存储模块提供每秒10tb的存储带宽。一个光子横杆以每秒20tb的带宽将其256个低功耗多线程核心完全互连。我们模拟了一个1024线程的Corona系统,运行合成基准测试和缩放版本的SPLASH-2基准测试套件。我们相信,与使用相同堆栈上互连功率的电连接多核替代方案相比,Corona可以在许多内存密集型工作负载上提供2到6倍的性能,同时降低功耗。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Corona: System Implications of Emerging Nanophotonic Technology
We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance, memory and inter-core bandwidths will also have to scale by orders of magnitude. Pin limitations, the energy cost of electrical signaling, and the non-scalability of chip-length global wires are significant bandwidth impediments. Recent developments in silicon nanophotonic technology have the potential to meet these off- and on-stack bandwidth requirements at acceptable power levels. Corona is a 3 D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Its peak floating-point performance is 10 teraflops. Dense wavelength division multiplexed optically connected memory modules provide 10 terabyte per second memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabyte per second bandwidth. We have simulated a 1024 thread Corona system running synthetic benchmarks and scaled versions of the SPLASH-2 benchmark suite. We believe that in comparison with an electrically-connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory intensive workloads, while simultaneously reducing power.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信