Speeding-up Chinese character recognition in an automatic document reading system
Yi-Hong Tseng, Chi-Chang Kuo, Hsi-Jian Lee
Proceedings of the Fourth International Conference on Document Analysis and Recognition, 1997-08-18
https://doi.org/10.1109/ICDAR.1997.620581
We present two techniques for speeding up character recognition. Our character recognition system, which comprises a candidate cluster selection module and a detail matching module, uses two statistical features: crossing counts and contour direction counts. In the training stage, we divide characters into clusters. To maintain a very high recognition rate, the candidate cluster selection module selects the 60 clusters with the smallest distances from among 300 predefined clusters. To speed up recognition further, the detail matching module uses a modified branch-and-bound algorithm. In the automatic document reading system, characters and punctuation marks are first extracted from printed document images and sorted according to their positions and the document orientation. The system then recognizes all printed Chinese characters between pairs of punctuation marks. The results are then spoken aloud by a speech synthesis system.
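The two-stage idea can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the feature layout, or how the branch-and-bound algorithm is modified, so everything below is an assumption: squared Euclidean distance, feature vectors as plain lists, and partial-distance pruning (abandon a template as soon as its running distance reaches the best distance found so far) as a simple stand-in for the bounding step. The names `select_clusters`, `bounded_dist`, and `recognize` are hypothetical. In the system described, `k` would be 60 and there would be 300 cluster centers.

```python
import heapq


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_clusters(feature, centers, k):
    """Stage 1: indices of the k cluster centers nearest to the feature."""
    return heapq.nsmallest(
        k, range(len(centers)), key=lambda i: sq_dist(feature, centers[i])
    )


def bounded_dist(a, b, bound):
    """Accumulate the squared distance term by term.

    Returns None as soon as the running total reaches `bound`,
    so a hopeless template is abandoned without a full comparison.
    """
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
        if total >= bound:
            return None
    return total


def recognize(feature, centers, members, k):
    """Stage 2: detail matching restricted to the k candidate clusters.

    members[i] is a list of (label, template_vector) pairs for cluster i.
    Returns (best_label, best_squared_distance).
    """
    best_label, best = None, float("inf")
    for i in select_clusters(feature, centers, k):
        for label, template in members[i]:
            d = bounded_dist(feature, template, best)
            if d is not None:  # strictly better than the current best
                best_label, best = label, d
    return best_label, best


# Toy data: 3 clusters instead of 300, 2-dimensional features.
centers = [[0, 0], [10, 10], [30, 30]]
members = [
    [("A", [0, 1]), ("B", [1, 0])],
    [("C", [10, 11])],
    [("D", [30, 29])],
]
print(recognize([9, 11], centers, members, k=2))  # ('C', 1.0)
```

Visiting the nearest cluster first tends to produce a small bound early, which lets `bounded_dist` reject most remaining templates after only a few feature components.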