W. Brewer, G. Behm, A. Scheinine, Ben Parsons, Wesley Emeneker, Robert P. Trevino
{"title":"iBench:一个分布式推理模拟和基准测试套件","authors":"W. Brewer, G. Behm, A. Scheinine, Ben Parsons, Wesley Emeneker, Robert P. Trevino","doi":"10.1109/HPEC43674.2020.9286169","DOIUrl":null,"url":null,"abstract":"We present a novel distributed inference benchmarking system, called “iBench”, that provides relevant performance metrics for high-performance edge computing systems using trained deep learning models. The proposed benchmark is unique in that it includes data transfer performance through a distributed system, such as a supercomputer, using clients and servers to provide a system-level benchmark. iBench is flexible and robust enough to allow for the benchmarking of custom-built inference servers. This was demonstrated through the development of a custom Flask-based inference server to serve MLPerf's official ResNet50v1.5 model. In this paper, we compare iBench against MLPerf inference performance on an 8-V100 GPU node. iBench is shown to provide two primary advantages over MLPerf: (1) the ability to measure distributed inference performance, and (2) a more realistic measure of benchmark performance for inference servers on HPC by taking into account additional factors to inference time, such as HTTP request-response time, payload pre-processing and packing time, and invest time.","PeriodicalId":168544,"journal":{"name":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"iBench: a Distributed Inference Simulation and Benchmark Suite\",\"authors\":\"W. Brewer, G. Behm, A. Scheinine, Ben Parsons, Wesley Emeneker, Robert P. Trevino\",\"doi\":\"10.1109/HPEC43674.2020.9286169\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a novel distributed inference benchmarking system, called “iBench”, that provides relevant performance metrics for high-performance edge computing systems using trained deep learning models. The proposed benchmark is unique in that it includes data transfer performance through a distributed system, such as a supercomputer, using clients and servers to provide a system-level benchmark. iBench is flexible and robust enough to allow for the benchmarking of custom-built inference servers. This was demonstrated through the development of a custom Flask-based inference server to serve MLPerf's official ResNet50v1.5 model. In this paper, we compare iBench against MLPerf inference performance on an 8-V100 GPU node. iBench is shown to provide two primary advantages over MLPerf: (1) the ability to measure distributed inference performance, and (2) a more realistic measure of benchmark performance for inference servers on HPC by taking into account additional factors to inference time, such as HTTP request-response time, payload pre-processing and packing time, and invest time.\",\"PeriodicalId\":168544,\"journal\":{\"name\":\"2020 IEEE High Performance Extreme Computing Conference (HPEC)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE High Performance Extreme Computing Conference (HPEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPEC43674.2020.9286169\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC43674.2020.9286169","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
iBench: a Distributed Inference Simulation and Benchmark Suite
We present a novel distributed inference benchmarking system, called “iBench”, that provides relevant performance metrics for high-performance edge computing systems using trained deep learning models. The proposed benchmark is unique in that it includes data transfer performance through a distributed system, such as a supercomputer, using clients and servers to provide a system-level benchmark. iBench is flexible and robust enough to allow for the benchmarking of custom-built inference servers. This was demonstrated through the development of a custom Flask-based inference server to serve MLPerf's official ResNet50v1.5 model. In this paper, we compare iBench against MLPerf inference performance on an 8-V100 GPU node. iBench is shown to provide two primary advantages over MLPerf: (1) the ability to measure distributed inference performance, and (2) a more realistic measure of benchmark performance for inference servers on HPC by taking into account additional factors to inference time, such as HTTP request-response time, payload pre-processing and packing time, and invest time.