AncientGlyphNet: an advanced deep learning framework for detecting ancient Chinese characters in complex scene

IF 10.7 | Zone 2 (Computer Science) | JCR Q1 (Computer Science, Artificial Intelligence)

Artificial Intelligence Review | Pub Date: 2025-01-08 | DOI: 10.1007/s10462-024-11095-5

Hengnian Qi, Hao Yang, Zhaojiang Wang, Jiabin Ye, Qiuyi Xin, Chu Zhang, Qing Lang


Citations: 0

Abstract

Detecting ancient Chinese characters in various media, including stone inscriptions, calligraphy, and couplets, is challenging due to the complex backgrounds and diverse styles. This study proposes an advanced deep-learning framework for detecting ancient Chinese characters in complex scenes to improve detection accuracy. First, the framework introduces an Ancient Character Haar Wavelet Transform downsampling block (ACHaar), effectively reducing feature maps’ spatial resolution while preserving key ancient character features. Second, a Glyph Focus Module (GFM) is introduced, utilizing attention mechanisms to enhance the processing of deep semantic information and generating ancient character feature maps that emphasize horizontal and vertical features through a four-path parallel strategy. Third, a Character Contour Refinement Layer (CCRL) is incorporated to sharpen the edges of characters. Additionally, to train and validate the model, a dedicated dataset was constructed, named Huzhou University-Ancient Chinese Character Dataset for Complex Scenes (HUSAM-SinoCDCS), comprising images of stone inscriptions, calligraphy, and couplets. Experimental results demonstrated that the proposed method outperforms previous text detection methods on the HUSAM-SinoCDCS dataset, with accuracy improved by 1.36–92.84%, recall improved by 2.24–85.61%, and F1 score improved by 1.84–89.08%. This research contributes to digitizing ancient Chinese character artifacts and literature, promoting the inheritance and dissemination of traditional Chinese character culture. The source code and the HUSAM-SinoCDCS dataset can be accessed at https://github.com/youngbbi/AncientGlyphNet and https://github.com/youngbbi/HUSAM-SinoCDCS.
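The abstract describes ACHaar only as a Haar-wavelet downsampling block that halves spatial resolution while keeping character detail. The sketch below illustrates that general idea with a plain single-level 2D Haar transform in NumPy. It is not the authors' ACHaar implementation, whose internals the abstract does not give; the function name `haar_downsample` and the channel layout are assumptions made for illustration.

```python
import numpy as np


def haar_downsample(x: np.ndarray) -> np.ndarray:
    """Single-level 2D Haar transform of a (C, H, W) feature map.

    Hypothetical helper for illustration only. Returns a (4*C, H/2, W/2)
    array holding, in order along the channel axis: the local average,
    the left-right difference, the top-bottom difference, and the
    diagonal difference of every non-overlapping 2x2 block.
    """
    if x.ndim != 3:
        raise ValueError("expected a (C, H, W) array")
    _, h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")

    # The four pixels of every non-overlapping 2x2 block.
    tl = x[:, 0::2, 0::2]  # top-left
    tr = x[:, 0::2, 1::2]  # top-right
    bl = x[:, 1::2, 0::2]  # bottom-left
    br = x[:, 1::2, 1::2]  # bottom-right

    # Orthonormal 2D Haar basis (scale factor 1/2).
    approx = (tl + tr + bl + br) / 2.0      # smooth content
    left_right = (tl - tr + bl - br) / 2.0  # change along the width
    top_bottom = (tl + tr - bl - br) / 2.0  # change along the height
    diagonal = (tl - tr - bl + br) / 2.0    # diagonal change

    return np.concatenate([approx, left_right, top_bottom, diagonal], axis=0)


if __name__ == "__main__":
    features = np.random.default_rng(0).normal(size=(3, 8, 8))
    out = haar_downsample(features)
    print(features.shape, "->", out.shape)
```

Because the 2x2 Haar basis is orthogonal, this transform is invertible: the spatial size is halved, but the detail that pooling or strided sampling would discard is moved into extra channels instead of being thrown away.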


Source journal

Artificial Intelligence Review (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 22.00

Self-citation rate: 3.30%

Articles published: 194

Review time: 5.3 months

About the journal: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.