Radical-Based Chinese Character Recognition via Multi-Labeled Learning of Deep Residual Networks
Authors: Tie-Qiang Wang, Fei Yin, Cheng-Lin Liu
Published in: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), 2017-11-01
DOI: 10.1109/ICDAR.2017.100 (https://doi.org/10.1109/ICDAR.2017.100)
Citations: 30
Abstract
The digitization of Chinese historical documents poses a new challenge: within the huge set of character categories, the majority of characters are no longer in common use and have few samples for training character classifiers. To address this problem, we consider the radical-level composition of Chinese characters and propose to detect position-dependent radicals using a deep residual network with multi-labeled learning. This enables the recognition of novel characters without training samples, provided that the characters are composed of radicals appearing in the training samples. In multi-labeled learning, each training character sample is labeled as positive for every radical it contains, so that after training, all the radicals appearing in a character can be detected. Experimental results on a large-category-set database of printed Chinese characters demonstrate that the proposed method can detect radicals accurately. Moreover, based on radical configurations, our model can reliably recognize novel characters as well as trained characters.
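The labeling scheme and the radical-based lookup described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy lexicon, the position names, the 0.5 threshold, and the exact-set matching rule are all assumptions made for illustration, and the residual network is replaced by hand-written scores.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Label = Tuple[str, str]  # (radical, position): a position-dependent radical

# Hypothetical toy lexicon: character -> set of position-dependent radicals.
# Characters, radicals, and position names are illustrative assumptions.
LEXICON: Dict[str, FrozenSet[Label]] = {
    "好": frozenset({("女", "left"), ("子", "right")}),
    "妈": frozenset({("女", "left"), ("马", "right")}),
    "叶": frozenset({("口", "left"), ("十", "right")}),
    "李": frozenset({("木", "top"), ("子", "bottom")}),
    "吗": frozenset({("口", "left"), ("马", "right")}),  # held out of training
}
TRAIN_CHARS = ["好", "妈", "叶", "李"]


def build_label_index(chars: List[str]) -> List[Label]:
    """Collect every position-dependent radical seen in the training characters."""
    labels = set()
    for ch in chars:
        labels |= LEXICON[ch]
    return sorted(labels)


def multi_hot(ch: str, label_index: List[Label]) -> List[int]:
    """Multi-label target: 1 for each radical the character contains, else 0."""
    present = LEXICON[ch]
    return [1 if label in present else 0 for label in label_index]


def recognize(scores: List[float], label_index: List[Label],
              threshold: float = 0.5) -> Optional[str]:
    """Threshold per-radical scores, then look up the matching character.

    Exact set matching is an assumed decision rule, chosen for simplicity.
    """
    detected = frozenset(
        label for label, score in zip(label_index, scores) if score >= threshold
    )
    for ch, radicals in LEXICON.items():
        if radicals == detected:
            return ch
    return None


label_index = build_label_index(TRAIN_CHARS)
target = multi_hot("好", label_index)

# Simulated detector output for the held-out character (stand-in for a network).
simulated = [0.9 if label in LEXICON["吗"] else 0.1 for label in label_index]
novel = recognize(simulated, label_index)
```

In this sketch the held-out character is recoverable only because both of its radicals occur, in the same positions, in other training characters, which mirrors the condition stated in the abstract.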