Multi-scale texture-based text recognition in ancient manuscripts
A. Garz, Robert Sablatnig
2010 16th International Conference on Virtual Systems and Multimedia, 2010-12-13
DOI: https://doi.org/10.1109/VSMM.2010.5665938
Text recognition in ancient documents poses specific challenges, including degradation and staining, faded ink, fluctuating text lines, superimposed text elements, and varying layouts. To cope with these challenges, a texture-based approach is proposed that exploits the fact that different kinds of texture have distinct orientation distributions. The orientation information is extracted using the autocorrelation function (ACF). The approach is applied to three different manuscripts: Glagolitic manuscripts of the 11th century, a Latin manuscript, and a composite Latin-German manuscript, the latter two both originating from the 14th century. The evaluation is based on manually labeled ground truth and shows that the chosen features remain accurate even when the method is applied to document pages that differ in writing style and line spacing from those in the training set.
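The abstract does not spell out how orientation information is derived from the ACF, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a small grayscale patch given as a list of rows, computes a direct (non-circular) autocorrelation of the mean-removed patch, and sums the positive ACF values into orientation bins. The function names `autocorrelation` and `orientation_histogram`, the lag limit of 3, and the choice of 8 bins are all hypothetical.

```python
import math


def autocorrelation(patch, max_lag):
    """Direct 2-D autocorrelation of a mean-removed patch.

    Returns a dict mapping each lag (dy, dx), with |dy| and |dx| <= max_lag,
    to the sum of products of overlapping pixels at that lag.
    """
    h, w = len(patch), len(patch[0])
    mean = sum(map(sum, patch)) / (h * w)
    z = [[v - mean for v in row] for row in patch]
    acf = {}
    for dy in range(-max_lag, max_lag + 1):
        for dx in range(-max_lag, max_lag + 1):
            total = 0.0
            # Only sum where both the pixel and its shifted partner exist.
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    total += z[y][x] * z[y + dy][x + dx]
            acf[(dy, dx)] = total
    return acf


def orientation_histogram(acf, bins=8):
    """Sum positive ACF values into orientation bins over [0, 180) degrees.

    The ACF of a real image is point-symmetric, so only the half-plane
    dy > 0, or dy == 0 with dx > 0, is visited (assumed design choice).
    """
    width = 180.0 / bins
    hist = [0.0] * bins
    for (dy, dx), value in acf.items():
        in_half_plane = dy > 0 or (dy == 0 and dx > 0)
        if not in_half_plane or value <= 0:
            continue
        angle = math.degrees(math.atan2(dy, dx))  # lies in [0, 180)
        hist[min(int(angle / width), bins - 1)] += value
    total = sum(hist)
    return [v / total for v in hist] if total else hist


# Synthetic stand-in for a text-like texture: horizontal stripes,
# two bright rows followed by two dark rows, 16 x 16 pixels.
stripes = [[1.0 if (y // 2) % 2 == 0 else 0.0 for _ in range(16)]
           for y in range(16)]

hist = orientation_histogram(autocorrelation(stripes, max_lag=3))
dominant_bin = max(range(len(hist)), key=hist.__getitem__)
print("orientation histogram:", [round(v, 3) for v in hist])
print("dominant bin:", dominant_bin)
```

For this striped patch, most of the histogram mass falls in the bin around 0 degrees, which is the direction along the stripes. A patch with a different texture would give a different distribution, and this difference is the property the texture-based approach relies on. The direct summation is used here only to keep the sketch dependency-free; for realistic patch sizes an FFT-based autocorrelation would be the usual choice.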