Towards audio-video based handwritten mathematical content recognition in classroom videos
Smita Vemulapalli, M. Hayes
Proceedings of 2011 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing
Publication date: 2011-10-03
DOI: 10.1109/PACRIM.2011.6032992 (https://doi.org/10.1109/PACRIM.2011.6032992)
Citation count: 1
Abstract
Recognizing handwritten mathematical content in classroom videos poses a range of interesting challenges. In this paper, we focus on improving character recognition accuracy in such videos by combining video-based and audio-based text recognizers. We propose a two-step assembly consisting of a video text recognizer (VTR) as the primary character recognizer and an audio text recognizer (ATR) that disambiguates the output of the VTR when needed. We propose techniques for (1) detecting ambiguity in the output of the VTR, so that combination with the ATR is triggered only for ambiguous characters, (2) synchronizing the output of the two recognizers to enable combination, and (3) combining the options generated by the two recognizers using measurement-based and rank-based methods. We have implemented the system using an open-source implementation of a character recognizer and a commercially available phonetic word-spotter. Through experiments conducted on video recorded in a classroom-like environment, we demonstrate the improvement in character recognition accuracy that can be achieved with our approach.
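The two-step assembly described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the ambiguity test or the fusion rules, so the score-margin test, the weighted-sum fusion, and the Borda-count fusion below are assumptions chosen as common examples of measurement-based and rank-based combination. The function names, the `margin` and `w_vtr` parameters, and the score dictionaries are hypothetical, and the sketch assumes the VTR and ATR outputs have already been synchronized to the same character.

```python
from typing import Dict

# Candidate character -> recognizer score in [0, 1] (hypothetical format).
Scores = Dict[str, float]


def is_ambiguous(vtr: Scores, margin: float = 0.2) -> bool:
    """Assumed ambiguity test: the top two VTR scores are closer than `margin`."""
    ranked = sorted(vtr.values(), reverse=True)
    return len(ranked) > 1 and (ranked[0] - ranked[1]) < margin


def combine_measurement(vtr: Scores, atr: Scores, w_vtr: float = 0.5) -> str:
    """Measurement-level fusion (assumed): weighted sum of the two score sets."""
    candidates = sorted(set(vtr) | set(atr))
    fused = {
        c: w_vtr * vtr.get(c, 0.0) + (1.0 - w_vtr) * atr.get(c, 0.0)
        for c in candidates
    }
    return max(candidates, key=lambda c: fused[c])


def combine_rank(vtr: Scores, atr: Scores) -> str:
    """Rank-level fusion (assumed): Borda count over each recognizer's ranking."""
    candidates = sorted(set(vtr) | set(atr))
    points = {c: 0 for c in candidates}
    for scores in (vtr, atr):
        order = sorted(candidates, key=lambda c: scores.get(c, 0.0), reverse=True)
        for rank, c in enumerate(order):
            # Best-ranked candidate earns the most points.
            points[c] += len(candidates) - 1 - rank
    return max(candidates, key=lambda c: points[c])


def recognize(vtr: Scores, atr: Scores, margin: float = 0.2,
              method: str = "rank") -> str:
    """Use the VTR alone unless its output is ambiguous; then consult the ATR."""
    if not is_ambiguous(vtr, margin):
        return max(sorted(vtr), key=lambda c: vtr[c])
    if method == "measurement":
        return combine_measurement(vtr, atr)
    return combine_rank(vtr, atr)


if __name__ == "__main__":
    # Illustrative scores only, not data from the paper.
    vtr_scores = {"2": 0.45, "z": 0.40, "7": 0.15}
    atr_scores = {"2": 0.70, "z": 0.10, "7": 0.20}
    print(recognize(vtr_scores, atr_scores, method="rank"))
    print(recognize(vtr_scores, atr_scores, method="measurement"))
```

In this sketch the ATR is consulted only when the VTR's top two candidates are close, which mirrors the abstract's point that combination is triggered only for ambiguous characters. Ties in the Borda count are broken by candidate sort order, which is an arbitrary choice made for the example.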