{"title":"Dynamic Resource Partitioning for Multi-Tenant Systolic Array Based DNN Accelerator","authors":"M. Reshadi, David Gregg","doi":"10.1109/PDP59025.2023.00019","DOIUrl":null,"url":null,"abstract":"Deep neural networks (DNN) have become a significant applications in both cloud-server and edge devices. Meanwhile, the growing number of DNNs on those platforms raises the need to execute multiple DNNs on the same device. This paper proposes a dynamic partitioning algorithm to perform concurrent processing of multiple DNNs on asystolic-array-based accelerator. Sharing an accelerator's storage and processing resources across multiple DNNs increases resource utilization and reduces computation time and energy consumption. To this end, we propose a partitioned weight stationary dataflow with a minor modification in the logic of the processing element. We evaluate the energy consumption and computation time with both heavy and light workloads. Simulation results show a 35% and 62% improvement in energy consumption and 56% and 44% in computation time under heavy and light workloads, respectively, compared with single tenancy.","PeriodicalId":153500,"journal":{"name":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","volume":"104 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 31st Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDP59025.2023.00019","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Deep neural networks (DNNs) have become significant applications on both cloud servers and edge devices. Meanwhile, the growing number of DNNs on those platforms raises the need to execute multiple DNNs on the same device. This paper proposes a dynamic partitioning algorithm to perform concurrent processing of multiple DNNs on a systolic-array-based accelerator. Sharing an accelerator's storage and processing resources across multiple DNNs increases resource utilization and reduces computation time and energy consumption. To this end, we propose a partitioned weight-stationary dataflow with a minor modification to the logic of the processing element. We evaluate energy consumption and computation time under both heavy and light workloads. Simulation results show a 35% and 62% improvement in energy consumption and a 56% and 44% improvement in computation time under heavy and light workloads, respectively, compared with single tenancy.
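The abstract does not spell out the partitioning algorithm itself, so the following is a minimal Python sketch, assuming a proportional split of a weight-stationary systolic array's columns between two tenant layers and a first-order cycle model. All names (Layer, partition_columns, ws_cycles) and the cost model are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch: proportionally partitioning the columns of a
# weight-stationary systolic array between two tenant DNN layers.
# The partitioning heuristic and cycle model are assumptions for
# illustration only; they do not reproduce the paper's algorithm.

from dataclasses import dataclass
import math


@dataclass
class Layer:
    name: str
    rows: int   # output channels (weights held stationary along array columns)
    cols: int   # input channels (mapped along array rows)
    steps: int  # number of input activations streamed through the array


def partition_columns(array_cols: int, a: Layer, b: Layer) -> tuple[int, int]:
    """Split the array's columns between two tenants in proportion to
    their weight footprint, keeping at least one column per tenant."""
    total = a.rows * a.cols + b.rows * b.cols
    cols_a = max(1, min(array_cols - 1,
                        round(array_cols * a.rows * a.cols / total)))
    return cols_a, array_cols - cols_a


def ws_cycles(layer: Layer, array_rows: int, array_cols: int) -> int:
    """First-order cycle estimate for a weight-stationary mapping:
    one streaming pass per (row-tile, column-tile) of the weight matrix,
    plus pipeline fill/drain per pass."""
    tiles = (math.ceil(layer.rows / array_cols)
             * math.ceil(layer.cols / array_rows))
    return tiles * (layer.steps + array_rows + array_cols)


if __name__ == "__main__":
    heavy = Layer("conv_heavy", rows=256, cols=256, steps=4096)
    light = Layer("fc_light", rows=64, cols=128, steps=512)

    cols_heavy, cols_light = partition_columns(128, heavy, light)

    # Shared (multi-tenant) execution: both tenants run concurrently,
    # so the makespan is the slower of the two partitions.
    shared = max(ws_cycles(heavy, 128, cols_heavy),
                 ws_cycles(light, 128, cols_light))

    # Single-tenant baseline: each workload uses the full array in turn.
    serial = ws_cycles(heavy, 128, 128) + ws_cycles(light, 128, 128)

    print(f"partition: {cols_heavy}/{cols_light} columns, "
          f"shared makespan = {shared} cycles vs serial = {serial} cycles")
```

Under this toy model, co-locating a light tenant alongside a heavy one mostly reuses columns the heavy layer would otherwise leave idle, which is the intuition behind the reported utilization and energy gains; the paper's evaluation uses a detailed accelerator simulation rather than this kind of closed-form estimate.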