ScaleServe
Ali Jahanshahi, M. Chow, Daniel Wong
Proceedings of the 14th Workshop on General Purpose Processing Using GPU, April 3, 2022
DOI: 10.1145/3530390.3532735

Abstract: We present ScaleServe, a scalable multi-GPU machine learning inference system that (1) is built on an end-to-end open-source software stack, (2) is hardware vendor-agnostic, and (3) is designed with modular components, making its configuration knobs easy to modify and extend. ScaleServe also reports detailed performance metrics from each layer of the inference server, allowing designers to pinpoint bottlenecks. We demonstrate ScaleServe's serving scalability on an 8-GPU server with several machine learning tasks, including computer vision and natural language processing. The performance results for ResNet152 show that ScaleServe scales well on a multi-GPU platform.
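The abstract does not describe ScaleServe's scheduling policy, but the core idea of a scale-out inference server — spreading incoming requests across several GPU workers so throughput grows with GPU count — can be sketched with a toy dispatcher. The round-robin policy, the `ToyScaleOutServer` name, and the 8-GPU configuration below are illustrative assumptions, not ScaleServe's actual implementation.

```python
from itertools import cycle

class ToyScaleOutServer:
    """Toy multi-GPU request dispatcher.

    Assumption: round-robin placement across workers, used here only to
    illustrate scale-out serving; the real system's policy is not stated
    in the abstract.
    """

    def __init__(self, num_gpus):
        self.num_gpus = num_gpus
        self._next_gpu = cycle(range(num_gpus))          # round-robin iterator
        self.assignments = {g: [] for g in range(num_gpus)}

    def dispatch(self, request_id):
        """Assign one inference request to the next GPU in rotation."""
        gpu = next(self._next_gpu)
        self.assignments[gpu].append(request_id)
        return gpu

# 16 requests over an 8-GPU server: each worker ends up with 2 requests,
# so aggregate throughput scales with the number of GPUs (ignoring
# batching, model placement, and PCIe/NVLink transfer costs).
server = ToyScaleOutServer(num_gpus=8)
placements = [server.dispatch(i) for i in range(16)]
loads = [len(reqs) for reqs in server.assignments.values()]
```

Under this even-load assumption, per-layer timing of each request (queueing, host-to-device copy, kernel execution) is what lets a designer pinpoint bottlenecks, which is the role the abstract attributes to ScaleServe's performance metrics.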