An Innovative Multifunction System for Text Recognition of Digital Resources Reproducing Ancient Handwritten and Hand-Printed Artifacts
N. Barbuti, Tommaso Caldarola
Proceedings of the 1st International Conference on Digital Tools & Uses Congress, 2018-10-03
DOI: 10.1145/3240117.3240141 (https://doi.org/10.1145/3240117.3240141)
Citations: 2
Abstract
This paper outlines ICRPad, a multifunction system for recognizing text in digital images that reproduce pages of ancient handwritten or hand-printed artifacts. The system was developed to propose an innovative approach to the research and retrieval of information in historical digital libraries and archives. The approach applies the fourth knowledge paradigm, which underlies data science, to the data humanities: algorithms are used to deduce new research hypotheses by discovering models inferred directly from large digital libraries. The system has two modules, ICR++ and ICR M-Evo (Multi-Evolution). The ICR++ module performs graph or word recognition through a training process based on the segmentation of Regions of Interest (ROIs). The M-Evo module uses a graphic matching algorithm based on a shape contour recognition feature and requires no segmentation. The system was tested on case studies involving digital libraries that reproduce ancient artifacts. The experimental results showed both high accuracy of ICRPad in recognizing text and some interesting developments in the approach to digital humanities research obtained by applying the fourth knowledge paradigm.
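The abstract does not say how the M-Evo shape contour feature is computed. As a rough illustration of what contour-based shape matching can look like in general, the sketch below builds a centroid-distance histogram from the outline of a binary glyph and compares two glyphs by the L1 distance between their histograms. This is a generic technique chosen for illustration, not ICRPad's algorithm; all names (`contour_points`, `shape_signature`, `signature_distance`, `parse_glyph`) and the `#`/`.` input format are hypothetical.

```python
from math import hypot

# Illustrative only: a generic contour descriptor, NOT the ICRPad / M-Evo method.

def parse_glyph(text):
    """Turn rows of '#' (ink) and '.' (background) into a boolean grid."""
    return [[ch == "#" for ch in line] for line in text.splitlines()]

def contour_points(grid):
    """Ink cells with a background cell or the grid border as a 4-neighbour."""
    rows, cols = len(grid), len(grid[0])
    points = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                outside = not (0 <= rr < rows and 0 <= cc < cols)
                if outside or not grid[rr][cc]:
                    points.append((r, c))
                    break
    return points

def shape_signature(grid, bins=8):
    """Normalised histogram of contour-to-centroid distances."""
    points = contour_points(grid)
    if not points:
        raise ValueError("glyph has no ink pixels")
    cy = sum(p[0] for p in points) / len(points)
    cx = sum(p[1] for p in points) / len(points)
    dists = [hypot(r - cy, c - cx) for r, c in points]
    dmax = max(dists) or 1.0  # a single-pixel glyph has all distances equal to 0
    hist = [0] * bins
    for d in dists:
        # Dividing by dmax makes the descriptor roughly independent of glyph size.
        hist[min(int(d / dmax * bins), bins - 1)] += 1
    return [h / len(points) for h in hist]

def signature_distance(a, b):
    """L1 distance between two signatures; 0.0 means identical histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))

ring = parse_glyph("#####\n#...#\n#...#\n#...#\n#####")
bar = parse_glyph(".....\n.....\n#####\n.....\n.....")

print(signature_distance(shape_signature(ring), shape_signature(ring)))
print(signature_distance(shape_signature(ring), shape_signature(bar)))
```

A descriptor of this kind works on the outline of a candidate shape alone, which is one reason contour features are attractive when clean character segmentation is hard to obtain on degraded historical pages. A practical system would also need a way to locate candidate shapes on the page and a more discriminative descriptor than this toy histogram.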