DNNMark: A Deep Neural Network Benchmark Suite for GPUs
Shi Dong, D. Kaeli
DOI: 10.1145/3038228.3038239
Proceedings of the General Purpose GPUs, published 2017-02-04
Citations: 57
Deep learning algorithms have been growing in popularity in the machine learning community, owing to their ability to accurately perform clustering and classification in a number of domains. One commonly used class of deep learning techniques is deep neural networks (DNNs), which are composed of a massive number of artificial neurons organized into many hidden layers. As a complex scientific computing problem, deep neural networks encompass a rich set of compute-intensive and data-intensive workloads, including convolution, pooling, and inner products. Each of these workloads can be used as a standalone program to benchmark hardware performance. As the GPU develops into a popular platform for running deep learning algorithms, hardware architects should be equipped with a representative set of benchmarks that can be used to explore design tradeoffs. Such a suite of workloads can be constructed from a number of primitive operations commonly found in deep neural networks. In this paper, we present DNNMark, a GPU benchmark suite that consists of a collection of deep neural network primitives, covering a rich set of GPU computing patterns. The suite is designed as a highly configurable, extensible, and flexible framework, in which benchmarks can run either individually or collectively. The goal is to provide hardware and software developers with a set of kernels that can be used to develop increasingly complex workload scenarios. We also evaluate selected benchmarks in the suite and showcase their execution behavior on an NVIDIA K40 GPU.
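To make the three primitives named in the abstract concrete, the sketch below exercises each one (convolution, max pooling, and an inner product) as a standalone, timeable workload. This is only an illustrative pure-Python analogue, not DNNMark's actual GPU implementation; the function names, input shapes, and timing harness here are all hypothetical.

```python
import time

def conv2d(image, kernel):
    """Naive 'valid' 2D convolution (cross-correlation, as used in DNNs)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * (iw - kw + 1) for _ in range(ih - kh + 1)]
    for i in range(ih - kh + 1):
        for j in range(iw - kw + 1):
            s = 0.0
            for di in range(kh):
                for dj in range(kw):
                    s += image[i + di][j + dj] * kernel[di][dj]
            out[i][j] = s
    return out

def max_pool2d(image, size=2):
    """Non-overlapping max pooling over size x size windows."""
    h, w = len(image) // size, len(image[0]) // size
    return [[max(image[i * size + di][j * size + dj]
                 for di in range(size) for dj in range(size))
             for j in range(w)] for i in range(h)]

def inner_product(x, weights):
    """Fully-connected (inner product) layer: one dot product per output."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in weights]

def bench(name, fn):
    """Time one workload in isolation, as a standalone micro-benchmark."""
    t0 = time.perf_counter()
    result = fn()
    print(f"{name}: {time.perf_counter() - t0:.4f} s")
    return result

# Hypothetical small inputs; real benchmark suites sweep these dimensions.
img = [[float((i * 7 + j) % 5) for j in range(32)] for i in range(32)]
k = [[1.0, 0.0], [0.0, -1.0]]

conv = bench("conv2d", lambda: conv2d(img, k))
pool = bench("max_pool2d", lambda: max_pool2d(img))
fc = bench("inner_product", lambda: inner_product(img[0], [[1.0] * 32] * 4))
```

Each primitive runs and is timed independently, mirroring how a suite of DNN primitives can serve either as individual micro-benchmarks or be composed into larger workload scenarios.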