Author: Catherine E. Graves
Published in: Proceedings of the 16th ACM International Conference on Computing Frontiers (2019-04-30)
DOI: 10.1145/3310273.3324055 (https://doi.org/10.1145/3310273.3324055)
High performance, power efficient hardware accelerators: emerging devices, circuits and architecture co-design
General-purpose digital systems have long benefited from favorable scaling, but performance improvements have slowed dramatically in the last decade. Computing is therefore returning to custom and specialized systems, frequently using heterogeneous accelerators. Driven particularly by the data-centric workloads of machine learning and deep learning, intense development is underway not only of conventional accelerators (GPUs, FPGAs, CMOS ASICs) but also of unconventional accelerators that use novel circuits and beyond-CMOS devices. In this talk, I will discuss some common characteristics of high-performance, power-efficient accelerators in this diverse space and the ecosystem development (such as new interconnects) they need to thrive. To illustrate accelerator characteristics and their potential, I will describe our group's efforts to co-design from algorithms and architectures down to novel devices for gains in speed and power efficiency. We have developed architectures that leverage the analog and non-volatile nature of memristors (tunable resistance switches) assembled in crossbar arrays to accelerate machine learning and image and signal processing. We have also developed new circuits and assembled architectures to accelerate finite automata, enabling the rapid pattern matching used in applications from security to genomics. Significant improvements over CPUs, GPUs, and custom digital ASICs are forecast for both kinds of system, highlighting the potential of unconventional accelerators in future high-performance computing systems.
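
As a rough illustration of why a memristor crossbar suits matrix-heavy workloads: in an idealized array, each cell's conductance stores one matrix entry, input voltages are applied to the rows, and each column wire sums the resulting currents (Ohm's law per cell, Kirchhoff's current law per column), so a full matrix-vector product is produced in one analog step. The sketch below only simulates that arithmetic in software. It assumes ideal devices (no wire resistance, noise, or nonlinearity), and the function name `crossbar_mvm` and the example values are hypothetical, not taken from the talk.

```python
def crossbar_mvm(conductances, voltages):
    """Simulate an ideal crossbar read.

    conductances[i][j] is the conductance (siemens) of the cell joining
    row i to column j; voltages[i] is the voltage (volts) on row i.
    Returns the current (amperes) collected on each column:
        I[j] = sum over i of conductances[i][j] * voltages[i]
    """
    if not conductances or len(conductances) != len(voltages):
        raise ValueError("need one input voltage per crossbar row")
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for row, v in zip(conductances, voltages):
        if len(row) != n_cols:
            raise ValueError("all crossbar rows must have the same width")
        for j, g in enumerate(row):
            currents[j] += g * v  # Ohm's law per cell, summed per column
    return currents


# Hypothetical 2x2 array: conductances in siemens, inputs in volts.
G = [[1e-3, 2e-3],
     [3e-3, 4e-3]]
V = [0.1, 0.2]
print(crossbar_mvm(G, V))  # column currents, roughly 0.7 mA and 1.0 mA
```

In hardware the column sums are formed simultaneously by the physics of the array, whereas this loop computes them one multiply-accumulate at a time; the code shows only the arithmetic being performed, not the speed or energy behavior.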