Author: A. Cheddad
Venue: 2016 7th International Conference on Computer Science and Information Technology (CSIT)
Published: 2016-07-01
DOI: 10.1109/CSIT.2016.7549479 (https://doi.org/10.1109/CSIT.2016.7549479)
Citations: 3
Towards Query By Text Example for pattern spotting in historical documents
Historical documents are essentially composed of handwritten text that exhibits a variety of perceptual environment complexities. The cursive, connected nature of the text lines on the one hand, and the presence of artefacts and noise on the other, hinder current image processing algorithms from achieving plausible results. In this paper, we present a new algorithm, which we term QTE (Query by Text Example), that allows training-free and binarisation-free pattern spotting in scanned handwritten historical documents. The algorithm gives promising results on a subset of our database, with a success rate of about 83% in locating word patterns supplied by the user.