Optical Character Recognition for Audio-Visual Broadcast Transcription System
J. Chaloupka, K. Paleček, P. Cerva, J. Zdánský
2020 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), published 2020-09-23
DOI: https://doi.org/10.1109/CogInfoCom50765.2020.9237867
Citations: 2
Abstract
This paper investigates the use of optical character recognition (OCR) in an audio-visual broadcast transcription system. Characters were recognized from video frames by the open-source OCR program Tesseract. From version 4 onward, the OCR in this program is based on recurrent neural networks (RNNs) and post-processes the text with a bigram language model. However, the recognized text still contains a number of errors. In some images, the text is detected but recognized incorrectly, or it is not detected at all. We have designed and tested image pre-processing and text post-processing methods to reduce OCR errors. The word error rate (WER) was reduced from 29.4% to 15.4%.
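For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance between a reference transcript and the recognized text, divided by the number of reference words. The abstract does not say how the authors computed WER, so the function names, the whitespace tokenization, and the example strings below are illustrative assumptions, not the paper's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumption: words are split on whitespace, with no case or
    punctuation normalization (the paper's setup is not specified).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped relative to its start."""
    return (before - after) / before


# Hypothetical example: one substituted word and one missing word
# out of four reference words.
example_wer = word_error_rate("breaking news from prague", "breaking mews from")

# WER figures reported in the abstract, in percent.
reported_drop = relative_reduction(29.4, 15.4)
```

Under this definition, the reported change from 29.4% to 15.4% is an absolute reduction of 14 percentage points, or a relative reduction of roughly 48%.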