Issue Based OCR Error Prediction in Video Streams

Dirk Siegmund, Luís Rüger Sacco, Arjan Kuijper
Published in: 2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)
Publication date: 2020-09-23
DOI: https://doi.org/10.23919/spa50552.2020.9241245

Abstract

This paper proposes a novel Image Quality Assessment (IQA) system to increase the reliability of Optical Character Recognition (OCR) in natural scenes. The approach rests on the principle that OCR accuracy is a function of the quality of the input image. Detected text boxes are analyzed with respect to their OCR score and to different quality issues, such as blur, lighting, and reflection effects. The novelty of our approach is to model IQA as a classification task, where one class represents high-quality elements and each of the other classes represents a specific quality issue. We demonstrate how this methodology allows IQA systems to be trained for complex quality metrics, even when no data labeled with the desired metric is available. Furthermore, a single IQA system outputs both the quality score and the quality issues for a given image. We built on publicly available databases to generate 60k text boxes for each class and obtained 97.1% classification accuracy on a test set of 24k images. By evaluating on the ICDAR 2003 Robust Word Recognition dataset, we conclude that the learnt quality metric is a valid indicator of common OCR errors.
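The classification view of IQA described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the use of the "good" class probability as the quality score, the `issue_threshold` and `min_quality` values, and the helper names `assess`, `accept_for_ocr`, and `best_frame` are all assumptions made for illustration. The sketch only shows how one classifier output could yield both a quality score and a list of detected issues, and how that score might be used to pick a frame from a video stream.

```python
import math
from typing import List, Tuple

# Hypothetical class layout (an assumption, not taken from the paper):
# index 0 is the "high quality" class, every other class names one issue.
CLASSES = ["good", "blur", "light", "reflection"]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw classifier outputs into probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def assess(logits: List[float], issue_threshold: float = 0.25) -> Tuple[float, List[str]]:
    """Return (quality score, detected issues) for one text box.

    Assumption: the quality score is the probability of the "good" class,
    and an issue is reported when its class probability reaches the threshold.
    """
    probs = softmax(logits)
    quality = probs[0]
    issues = [name for name, p in zip(CLASSES[1:], probs[1:]) if p >= issue_threshold]
    return quality, issues


def accept_for_ocr(logits: List[float], min_quality: float = 0.5) -> bool:
    """Decide whether a text box is good enough to trust its OCR result."""
    quality, _ = assess(logits)
    return quality >= min_quality


def best_frame(frame_logits: List[List[float]]) -> int:
    """Pick the index of the frame whose text box has the highest quality score."""
    scores = [assess(logits)[0] for logits in frame_logits]
    return scores.index(max(scores))


if __name__ == "__main__":
    frames = [[0.0, 3.0, 0.0, 0.0], [4.0, 0.0, 0.0, 0.0]]
    for logits in frames:
        print(assess(logits), accept_for_ocr(logits))
    print("best frame:", best_frame(frames))
```

Under these assumptions, a text box dominated by the blur class is rejected and labeled with its issue, while a box dominated by the "good" class is passed on to OCR. In a video stream, the same score could be used to choose among several frames showing the same text.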