ShuiAttNet: Fully convolutional attention network for Shuishu character recognition

IF 7.5 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications Pub Date : 2025-04-17 DOI:10.1016/j.eswa.2025.127613

Xiaojun Bi, Lu Han, Weizheng Qiao

Volume 282, Article 127613. Full text: https://www.sciencedirect.com/science/article/pii/S0957417425012357

Citations: 0

Abstract

Shuishu, one of China's most representative hieroglyphic scripts and a precious part of its cultural heritage, is at risk of extinction. Preserving this endangered script requires methods that recognize its characters accurately. Existing methods, however, struggle with the broad diversity of Shuishu characters and with the complexities of authentic ancient manuscripts. To address these issues, we present a study that combines dataset construction with deep learning. First, we establish S842, the largest and most diverse Shuishu single-character dataset to date, addressing the critical lack of publicly available resources for Shuishu. We then propose ShuiAttNet, a novel Fully Convolutional Attention Network designed specifically for Shuishu character recognition. ShuiAttNet introduces two key components: the Attentional MBConv (AMC) block and the Fully Convolutional Attention (FCA) block. The AMC block uses a novel feature fusion mechanism to capture fine-grained local details while reducing the feature redundancy caused by the low-rank characteristics of Shuishu characters. The FCA block employs Depthwise Separable Dilated Convolution to establish long-range dependencies while preserving the two-dimensional spatial structure of the images. Together, these components allow ShuiAttNet to outperform existing methods with significantly fewer parameters. Extensive quantitative and qualitative experiments validate its effectiveness: the proposed model achieves a Top-1 accuracy of 97.04%, outperforming other state-of-the-art methods.
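The abstract does not describe the layer layout of the FCA block, so the following is only a rough sketch of the general technique it names, not the authors' implementation. It illustrates why a depthwise separable dilated convolution needs few parameters yet covers a wide spatial extent while keeping the 2D layout of the feature map. All function names, the 64-channel width, the 7 x 7 kernel, and the dilation of 3 are assumptions chosen purely for illustration.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return c_in * k * k + c_in * c_out


def effective_kernel_size(k: int, dilation: int) -> int:
    """Spatial extent spanned by a k-tap kernel with the given dilation."""
    return dilation * (k - 1) + 1


def dilated_depthwise_conv2d(channel, kernel, dilation):
    """Apply one dilated k x k kernel (k odd) to a single 2D channel.

    Zero padding keeps the output the same height and width as the input,
    so the two-dimensional structure of the feature map is preserved.
    """
    h, w = len(channel), len(channel[0])
    k = len(kernel)
    half = (k // 2) * dilation  # offset from the centre to the outermost tap
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + i * dilation - half
                    xx = x + j * dilation - half
                    if 0 <= yy < h and 0 <= xx < w:  # outside the map counts as zero
                        acc += kernel[i][j] * channel[yy][xx]
            out[y][x] = acc
    return out


if __name__ == "__main__":
    # Hypothetical sizes: 64 channels in and out, 7 x 7 kernel, dilation 3.
    print(standard_conv_params(64, 64, 7))
    print(depthwise_separable_params(64, 64, 7))
    print(effective_kernel_size(7, 3))
```

Under these assumed sizes, the standard convolution has 200,704 weights and the depthwise separable one has 7,232, while a dilation of 3 stretches the 7 x 7 kernel over a 19 x 19 window. That is the general trade-off the abstract points to: wide spatial context at a small parameter cost, without flattening the image into a token sequence.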




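Source journal

Expert Systems with Applications (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months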

Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.