Authors: Xingzhi Zhou; Wei Jiang; Jinyu Zhan; Lingxin Jin; Lin Zuo
Journal: IEEE Transactions on Computers, vol. 74, no. 11, pp. 3818-3831 (JCR Q2, Computer Science, Hardware & Architecture; impact factor 3.8)
Publication date: 2025-08-29
DOI: 10.1109/TC.2025.3603732
URL: https://ieeexplore.ieee.org/document/11145090/
A Computation-Quantized Training Framework to Generate Accuracy Lossless QNNs for One-Shot Deployment in Embedded Systems
Quantized Neural Networks (QNNs) have received increasing attention because they can enrich intelligent applications deployed on resource-limited embedded devices, such as mobile devices and AIoT systems. Unfortunately, the numerical and computational discrepancies between training systems (i.e., servers) and deployment systems (e.g., embedded devices) may lead to a large accuracy drop for QNNs in real deployments. We propose a Computation-Quantized Training Framework (CQTF), which simulates deployment-time fixed-point computation during training to enable one-shot, lossless deployment. The training procedure of CQTF is built upon a well-formulated quantization-specific numerical representation that quantifies both numerical and computational discrepancies between training and deployment. Leveraging this representation, forward propagation executes all computations in quantization mode to simulate deployment-time inference, while backward propagation identifies and mitigates gradient vanishing through an efficient floating-point gradient update scheme. Benchmark-based experiments demonstrate the efficiency of our approach, which incurs no accuracy loss from training to deployment. Compared with five existing frameworks, CQTF improves deployed accuracy by up to 18.41%.
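The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: a forward pass computed with integer (fixed-point) arithmetic, and a backward pass that updates floating-point weights. It is not the authors' CQTF. It uses a generic straight-through gradient approximation, and every name here (`quantize`, `int_dot`, `forward`, `train_step`), the 8-bit width, and the scale values are assumptions made for illustration.

```python
def quantize(x, scale, bits=8):
    """Map a float to a signed integer code, saturating at the bit width.

    Assumption: symmetric fixed-point format with a per-tensor scale.
    """
    qmax = 2 ** (bits - 1) - 1
    qmin = -(2 ** (bits - 1))
    q = round(x / scale)  # Python rounds exact .5 ties to the even integer
    return max(qmin, min(qmax, q))


def int_dot(w_q, x_q):
    """Integer-only dot product, mimicking accumulation on the target."""
    return sum(w * x for w, x in zip(w_q, x_q))


def forward(w_float, x_float, w_scale, x_scale):
    """Forward pass in 'quantization mode': integers in, one rescale out."""
    w_q = [quantize(w, w_scale) for w in w_float]
    x_q = [quantize(x, x_scale) for x in x_float]
    return int_dot(w_q, x_q) * (w_scale * x_scale)


def train_step(w_float, x_float, target, w_scale, x_scale, lr):
    """One squared-error update on the float weights.

    The error comes from the quantized forward pass. The gradient treats
    rounding as the identity (straight-through approximation, an assumption),
    so d(output)/d(w_i) is taken to be the dequantized input x_i.
    """
    err = forward(w_float, x_float, w_scale, x_scale) - target
    x_deq = [quantize(x, x_scale) * x_scale for x in x_float]
    new_w = [w - lr * 2.0 * err * xd for w, xd in zip(w_float, x_deq)]
    return new_w, err ** 2


if __name__ == "__main__":
    scale = 2 ** -5  # hypothetical power-of-two scale
    w, x = [0.0, 0.0], [1.0, 0.5]
    for step in range(20):
        w, loss = train_step(w, x, 0.75, scale, scale, lr=0.1)
        print(step, loss)
```

Because the loss is measured on the same integer computation a device would run, the weights are fitted to the deployed arithmetic rather than to an idealized floating-point model. Keeping the weights in floating point lets updates smaller than one quantization step accumulate instead of being rounded away.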
About the journal:
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.