Authors: Nupur Sumeet, Karan Rawat, M. Nambiar
Venue: 2022 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
Published: 2022-05-01
DOI: 10.1109/ISPASS55109.2022.00022 (https://doi.org/10.1109/ISPASS55109.2022.00022)
High-Performance Deployment of Text Detection Model: Compression and Hardware Platform considerations
Network compression is often adopted to achieve high-throughput implementations on commercial accelerators. We propose a heuristic-based approach for obtaining compressed networks with a hardware-friendly architecture, as an alternative to conventional neural architecture search (NAS) algorithms, which are computationally expensive. The proposed compressed network achieves a 142$\times$ reduction in memory footprint and a 5-8$\times$ throughput improvement on the target hardware platforms, while retaining accuracy within 5% of the baseline trained model. We report performance acceleration on CPU, GPU, and FPGA platforms for a text detection task.
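To make the three reported quantities concrete, the following minimal sketch shows how a memory-footprint reduction factor, a throughput improvement factor, and an accuracy-retention check could be computed. It is an illustration only: the function names and all numbers are hypothetical and do not come from the paper, and "within 5%" is read here as an absolute drop of at most 5 percentage points, which is an assumption about the abstract's wording.

```python
def reduction_factor(baseline_bytes: int, compressed_bytes: int) -> float:
    """Memory-footprint reduction: 142.0 means the compressed model is 142x smaller."""
    return baseline_bytes / compressed_bytes


def throughput_improvement(baseline_fps: float, compressed_fps: float) -> float:
    """Throughput ratio of the compressed model to the baseline on the same hardware."""
    return compressed_fps / baseline_fps


def accuracy_retained(baseline_acc: float, compressed_acc: float,
                      tolerance: float = 0.05) -> bool:
    """True if the accuracy drop is at most `tolerance`.

    Assumption: the tolerance is an absolute difference (percentage points),
    not a relative one; the abstract does not say which is meant.
    """
    return (baseline_acc - compressed_acc) <= tolerance


if __name__ == "__main__":
    # Hypothetical measurements, chosen only to exercise the functions.
    print(reduction_factor(142_000_000, 1_000_000))
    print(throughput_improvement(100.0, 500.0))
    print(accuracy_retained(0.90, 0.86))
```

Applying the same three functions to measurements from each platform (CPU, GPU, FPGA) keeps the per-platform figures directly comparable.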