Optimization Method and Implementation of Fake Quantization from the Perspective of Hardware Performance

Eunchong Lee, Minkyu Lee, Sanghyun Kim, Soyoung Lee, Sung-Joon Jang, Sang-Seol Lee

2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC), June 25, 2023. DOI: 10.1109/ITC-CSCC58803.2023.10212718
Deep learning networks can be accelerated by reducing overall network volume through quantization or pruning. The best-known quantization techniques are Post-Training Quantization (PTQ) and Quantization-Aware Training (QAT). We applied an INT8-quantized network to the design of deep learning acceleration hardware and found that network performance deteriorated due to errors introduced in the mult/shift-based re-quantization step. This quantization error is a bigger problem during training than during inference, so an FP32 arithmetic operator is applied to prevent the resulting accuracy drop. In this paper, we investigate whether using FP32 operators can outperform mult/shift operators under specific conditions. We do this by analyzing the data flow based on output channel tiling and conducting a size analysis of the implemented hardware.
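To make the error source concrete, the following is a minimal sketch (not the authors' implementation; the constants, function names, and example scale are hypothetical) of re-quantizing an INT32 accumulator to INT8 along the two paths contrasted in the abstract: (a) an integer multiplier/shift pair that approximates the real-valued scale, and (b) a direct FP32 scale.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical sketch: re-quantize an INT32 accumulator to INT8.
 * The multiplier/shift pair approximates the real-valued scale as
 * mult / 2^shift; the approximation and rounding error of this integer
 * path is the kind of error discussed in the abstract. */

static int8_t saturate_int8(int64_t v) {
    if (v > 127) return 127;
    if (v < -128) return -128;
    return (int8_t)v;
}

/* (a) Integer path: fixed-point multiply, round to nearest, arithmetic right shift. */
static int8_t requant_mult_shift(int32_t acc, int32_t mult, int shift) {
    int64_t prod = (int64_t)acc * (int64_t)mult;
    int64_t rounded = (prod + ((int64_t)1 << (shift - 1))) >> shift;
    return saturate_int8(rounded);
}

/* (b) FP32 path: scale with a single float multiply, then round and saturate. */
static int8_t requant_fp32(int32_t acc, float scale) {
    float scaled = (float)acc * scale;
    int64_t rounded = (int64_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
    return saturate_int8(rounded);
}

int main(void) {
    float scale = 0.0123f;   /* assumed example re-quantization scale */
    int shift = 17;
    int32_t mult = 1612;     /* round(0.0123 * 2^17) */
    int32_t acc = 4321;      /* example INT32 accumulator value */

    printf("mult/shift path: %d\n", requant_mult_shift(acc, mult, shift));
    printf("FP32 path      : %d\n", requant_fp32(acc, scale));
    return 0;
}

In a data flow tiled along the output channel, each output channel can carry its own scale, so the choice between the integer and FP32 re-quantization paths affects both accumulated rounding error and per-tile hardware cost; this trade-off is what the paper's data-flow and size analysis examines.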