Lossless AI: Toward Guaranteeing Consistency between Inferences Before and After Quantization via Knowledge Distillation

T. Okuno, Yohei Nakata, Yasunori Ishii, Sotaro Tsukizawa

2021 17th International Conference on Machine Vision and Applications (MVA), 2021-07-25. DOI: 10.23919/MVA51890.2021.9511383
Deep learning model compression is necessary for real-time inference on edge devices, which have limited hardware resources. Conventional methods have focused only on suppressing degradation in accuracy. However, even if a compressed model has almost the same accuracy as its reference model, its inference results may differ on individual samples or objects. Such changes are a crucial challenge for the quality assurance of embedded products because they cause unexpected behavior in specific applications on edge devices. We therefore propose a concept called “Lossless AI” to guarantee consistency between the inference results of a reference model and its compressed counterpart. In this paper, we propose a training method that aligns the inference results of the reference and quantized models by applying knowledge distillation in which batch normalization statistics are frozen at their moving-average values from the middle of training onward. We evaluated the proposed method on several classification datasets and network architectures. In all cases, our method suppressed inferred-class mismatches between the reference and quantized models, whereas conventional quantization-aware training did not.
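The abstract only sketches the method, so below is a minimal PyTorch-style sketch of how its two ingredients could fit together: a quantization-aware student distilled from the frozen full-precision reference model, with batch-normalization running statistics pinned at their moving-average values from a mid-training epoch onward. All concrete names and hyperparameters (`freeze_epoch`, `alpha`, `temperature`) are illustrative assumptions, not values from the paper.

```python
# Sketch (not the authors' code): quantization-aware training combined
# with knowledge distillation, where batch-norm statistics are frozen
# at their accumulated moving averages from `freeze_epoch` onward.
import torch
import torch.nn as nn
import torch.nn.functional as F

def freeze_bn_stats(model: nn.Module) -> None:
    """Put every BN layer in eval mode so running_mean/running_var stop
    updating; the affine weight/bias parameters continue to train."""
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.eval()

def distillation_step(reference_model, quantized_model, optimizer,
                      images, labels, alpha=0.5, temperature=1.0):
    """One training step: hard-label loss plus a KD term that pulls the
    quantized model's output distribution toward the reference model's."""
    reference_model.eval()                       # frozen FP32 teacher
    with torch.no_grad():
        teacher_logits = reference_model(images)

    student_logits = quantized_model(images)     # fake-quantized student

    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    loss = (1 - alpha) * ce + alpha * kd

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def train(reference_model, quantized_model, optimizer, loader,
          epochs=30, freeze_epoch=15):
    """From `freeze_epoch` on, BN statistics stay pinned at the moving
    averages accumulated during the first half of training."""
    for epoch in range(epochs):
        quantized_model.train()
        if epoch >= freeze_epoch:
            freeze_bn_stats(quantized_model)
        for images, labels in loader:
            distillation_step(reference_model, quantized_model,
                              optimizer, images, labels)
```

The intent of freezing the BN statistics, as the abstract describes it, is to keep the quantized model's normalization behavior stable so its per-sample predictions can be matched to the reference model rather than drifting with each mini-batch.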