Guowei Zhang, Weilan Wang, Ce Zhang, Penghai Zhao, Mingkai Zhang
{"title":"HUTNet:一种高效的卷积神经网络,用于乌琴藏文手写体识别。","authors":"Guowei Zhang, Weilan Wang, Ce Zhang, Penghai Zhao, Mingkai Zhang","doi":"10.1089/big.2021.0333","DOIUrl":null,"url":null,"abstract":"<p><p>Recognition of handwritten Uchen Tibetan characters input has been considered an efficient way of acquiring mass data in the digital era. However, it still faces considerable challenges due to seriously touching letters and various morphological features of identical characters. Thus, deeper neural networks are required to achieve decent recognition accuracy, making an efficient, lightweight model design important to balance the inevitable trade-off between accuracy and latency. To reduce the learnable parameters of the network as much as possible and maintain acceptable accuracy, we introduce an efficient model named HUTNet based on the internal relationship between floating-point operations per second (FLOPs) and Memory Access Cost. The proposed network achieves a ResNet-18-level accuracy of 96.86%, with only a tenth of the parameters. The subsequent pruning and knowledge distillation strategies were applied to further reduce the inference latency of the model. Experiments on the test set (Handwritten Uchen Tibetan Data set by Wang [HUTDW]) containing 562 classes of 42,068 samples show that the compressed model achieves a 96.83% accuracy while maintaining lower FLOPs and fewer parameters. To verify the effectiveness of HUTNet, we tested it on the Chinese Handwriting Data sets Handwriting Database 1.1 (HWDB1.1), in which HUTNet achieved an accuracy of 97.24%, higher than that of ResNet-18 and ResNet-34. In general, we conduct extensive experiments on resource and accuracy trade-offs and show a stronger performance compared with other famous models on HUTDW and HWDB1.1. It also unlocks the critical bottleneck for handwritten Uchen Tibetan recognition on low-power computing devices.</p>","PeriodicalId":51314,"journal":{"name":"Big Data","volume":null,"pages":null},"PeriodicalIF":2.6000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"HUTNet: An Efficient Convolutional Neural Network for Handwritten Uchen Tibetan Character Recognition.\",\"authors\":\"Guowei Zhang, Weilan Wang, Ce Zhang, Penghai Zhao, Mingkai Zhang\",\"doi\":\"10.1089/big.2021.0333\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Recognition of handwritten Uchen Tibetan characters input has been considered an efficient way of acquiring mass data in the digital era. However, it still faces considerable challenges due to seriously touching letters and various morphological features of identical characters. Thus, deeper neural networks are required to achieve decent recognition accuracy, making an efficient, lightweight model design important to balance the inevitable trade-off between accuracy and latency. To reduce the learnable parameters of the network as much as possible and maintain acceptable accuracy, we introduce an efficient model named HUTNet based on the internal relationship between floating-point operations per second (FLOPs) and Memory Access Cost. The proposed network achieves a ResNet-18-level accuracy of 96.86%, with only a tenth of the parameters. The subsequent pruning and knowledge distillation strategies were applied to further reduce the inference latency of the model. 
Experiments on the test set (Handwritten Uchen Tibetan Data set by Wang [HUTDW]) containing 562 classes of 42,068 samples show that the compressed model achieves a 96.83% accuracy while maintaining lower FLOPs and fewer parameters. To verify the effectiveness of HUTNet, we tested it on the Chinese Handwriting Data sets Handwriting Database 1.1 (HWDB1.1), in which HUTNet achieved an accuracy of 97.24%, higher than that of ResNet-18 and ResNet-34. In general, we conduct extensive experiments on resource and accuracy trade-offs and show a stronger performance compared with other famous models on HUTDW and HWDB1.1. It also unlocks the critical bottleneck for handwritten Uchen Tibetan recognition on low-power computing devices.</p>\",\"PeriodicalId\":51314,\"journal\":{\"name\":\"Big Data\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2023-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Big Data\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1089/big.2021.0333\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2023/1/19 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Big Data","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1089/big.2021.0333","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/1/19 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
HUTNet: An Efficient Convolutional Neural Network for Handwritten Uchen Tibetan Character Recognition.
Recognition of handwritten Uchen Tibetan character input is considered an efficient way of acquiring mass data in the digital era. However, it still faces considerable challenges due to severely touching letters and the varied morphological features of identical characters. Deeper neural networks are therefore required to achieve decent recognition accuracy, making an efficient, lightweight model design important for balancing the inevitable trade-off between accuracy and latency. To reduce the learnable parameters of the network as much as possible while maintaining acceptable accuracy, we introduce an efficient model named HUTNet, based on the internal relationship between floating-point operations (FLOPs) and memory access cost (MAC). The proposed network achieves a ResNet-18-level accuracy of 96.86% with only a tenth of the parameters. Pruning and knowledge distillation strategies were subsequently applied to further reduce the inference latency of the model. Experiments on the test set of the Handwritten Uchen Tibetan Data set by Wang (HUTDW), which contains 42,068 samples across 562 classes, show that the compressed model achieves 96.83% accuracy while maintaining lower FLOPs and fewer parameters. To verify the effectiveness of HUTNet, we also tested it on the Chinese handwriting data set Handwriting Database 1.1 (HWDB1.1), on which HUTNet achieved an accuracy of 97.24%, higher than that of ResNet-18 and ResNet-34. Overall, we conduct extensive experiments on the resource-accuracy trade-off and show stronger performance than other well-known models on HUTDW and HWDB1.1. HUTNet also removes a critical bottleneck for handwritten Uchen Tibetan recognition on low-power computing devices.
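The abstract does not spell out how HUTNet exploits the FLOPs-MAC relationship, but the arithmetic behind it is easy to illustrate. The sketch below is a hypothetical example in the spirit of the well-known ShuffleNetV2 analysis, not code from the paper; the function name and layer sizes are invented. It computes both quantities for a standard convolution and shows that two layers with identical FLOPs can differ sharply in MAC:

```python
# Illustrative only: per-layer FLOPs and memory access cost (MAC) for a
# standard 2D convolution, using the usual counting conventions from the
# efficient-network literature. HUTNet's actual layers are not given here.

def conv2d_flops_and_mac(h, w, c_in, c_out, k):
    """FLOPs and MAC of a k x k conv on an h x w x c_in feature map,
    stride 1, 'same' padding; one multiply-add counts as one FLOP."""
    flops = h * w * c_in * c_out * k * k                  # weight ops per output pixel
    mac = h * w * (c_in + c_out) + k * k * c_in * c_out   # read input + write output + read weights
    return flops, mac

# Two 1x1 convolutions with identical FLOPs but very different MAC:
# for a fixed FLOP budget, MAC is minimized when c_in == c_out.
balanced = conv2d_flops_and_mac(56, 56, 128, 128, 1)  # 128 -> 128 channels
skewed = conv2d_flops_and_mac(56, 56, 32, 512, 1)     # 32 -> 512 channels
print("balanced: FLOPs=%d MAC=%d" % balanced)
print("skewed:   FLOPs=%d MAC=%d" % skewed)
```

Both layers cost about 51.4 million FLOPs, yet the skewed layer moves roughly twice as much data (about 1.72 million vs. 0.82 million accesses). This is why FLOPs alone underestimate real latency, and why a design that balances the two, as the abstract claims for HUTNet, can run faster at the same accuracy.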
Big Data · JCR categories: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS; COMPUTER SCIENCE, THEORY & METHODS
CiteScore: 9.10
Self-citation rate: 2.20%
Articles published: 60
Journal introduction:
Big Data is the leading peer-reviewed journal covering the challenges and opportunities in collecting, analyzing, and disseminating vast amounts of data. The Journal addresses questions surrounding this powerful and growing field of data science and facilitates the efforts of researchers, business managers, analysts, developers, data scientists, physicists, statisticians, infrastructure developers, academics, and policymakers to improve operations, profitability, and communications within their businesses and institutions.
Spanning a broad array of disciplines focusing on novel big data technologies, policies, and innovations, the Journal brings together the community to address current challenges and to mount effective efforts to organize, store, disseminate, protect, and manipulate data and, most importantly, to find the most effective strategies to make this incredible amount of information work to benefit society, industry, academia, and government.
Big Data coverage includes:
Big data industry standards,
New technologies being developed specifically for big data,
Data acquisition, cleaning, distribution, and best practices,
Data protection, privacy, and policy,
Business interests from research to product,
The changing role of business intelligence,
Visualization and design principles of big data infrastructures,
Physical interfaces and robotics,
Social networking advantages for Facebook, Twitter, Amazon, Google, etc.,
Opportunities around big data and how companies can harness it to their advantage.