A deep learning approach to handwritten text recognition in the presence of struck-out text
Hiqmat Nisa, J. Thom, V. Ciesielski, Ruwan Tennakoon
2019 International Conference on Image and Vision Computing New Zealand (IVCNZ), published 2019-12-01
DOI: 10.1109/IVCNZ48456.2019.8961024 (https://doi.org/10.1109/IVCNZ48456.2019.8961024)
Citations: 12
Abstract
The accuracy of handwritten text recognition may be affected by the presence of struck-out text in the handwritten manuscript. This paper investigates and improves the performance of a widely used handwritten text recognition approach, the Convolutional Recurrent Neural Network (CRNN), on handwritten lines containing struck-out words. For this purpose, some common types of struck-out strokes were superimposed on words in a text line. A model trained on the IAM line database was tested on lines containing struck-out words. The Character Error Rate (CER) increased from 0.09 to 0.11. This model was then re-trained on a dataset containing struck-out text. The re-trained model performed well in terms of struck-out text detection. We found that, given an adequate number of training examples, the model can learn struck-out patterns in a way that does not affect the overall recognition accuracy.
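
The abstract reports CER values without defining the metric. As a minimal sketch, assuming the common definition (total character-level Levenshtein edit distance divided by the total number of reference characters), it could be computed as below. The function names `edit_distance` and `character_error_rate` are illustrative and are not taken from the paper, and the authors' exact normalisation may differ.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty string
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Assumed definition: summed edit distance over summed reference length."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference text is empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

Under this definition, a CER of 0.09 corresponds to roughly 9 character edits per 100 reference characters, so the reported rise to 0.11 is about 2 additional edits per 100 characters.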