Towards audio-video based handwritten mathematical content recognition in classroom videos

Smita Vemulapalli, M. Hayes
Proceedings of 2011 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, published 2011-10-03. DOI: 10.1109/PACRIM.2011.6032992 (https://doi.org/10.1109/PACRIM.2011.6032992)
Citations: 1

Abstract

Recognizing handwritten mathematical content in classroom videos poses a range of interesting challenges. In this paper, we focus on improving character recognition accuracy in such videos by combining video- and audio-based text recognizers. We propose a two-step assembly in which a video text recognizer (VTR) serves as the primary character recognizer and an audio text recognizer (ATR) disambiguates the VTR output when needed. We propose techniques for (1) detecting ambiguity in the VTR output, so that combination with the ATR is triggered only for ambiguous characters, (2) synchronizing the output of the two recognizers to enable combination, and (3) combining the options generated by the two recognizers using measurement- and rank-based methods. We have implemented the system using an open-source implementation of a character recognizer and a commercially available phonetic word-spotter. Through experiments on video recorded in a classroom-like environment, we demonstrate the improvement in character recognition accuracy that our approach can achieve.
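The abstract does not say how ambiguity is detected or how the two recognizers are combined, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each recognizer returns a confidence score per candidate character and that the two outputs are already synchronized to the same handwritten character. All names (`is_ambiguous`, `combine_measurement`, `combine_rank`, `recognize`), the top-two margin test, the equal weighting, and the use of a Borda count for the rank-based variant are assumptions made for illustration.

```python
from typing import Dict

# Hypothetical representation: candidate character -> confidence in [0, 1].
Scores = Dict[str, float]


def is_ambiguous(vtr: Scores, margin: float = 0.2) -> bool:
    """Assumed test: the VTR output is ambiguous when its top two scores are close."""
    ranked = sorted(vtr.values(), reverse=True)
    return len(ranked) > 1 and (ranked[0] - ranked[1]) < margin


def combine_measurement(vtr: Scores, atr: Scores, w_vtr: float = 0.5) -> str:
    """Measurement-level combination: weighted sum of the two confidence scores."""
    candidates = sorted(set(vtr) | set(atr))
    return max(
        candidates,
        key=lambda c: w_vtr * vtr.get(c, 0.0) + (1.0 - w_vtr) * atr.get(c, 0.0),
    )


def combine_rank(vtr: Scores, atr: Scores) -> str:
    """Rank-level combination: Borda count over each recognizer's ordering."""
    candidates = sorted(set(vtr) | set(atr))
    points = {c: 0 for c in candidates}
    for scores in (vtr, atr):
        ordered = sorted(candidates, key=lambda c: scores.get(c, 0.0), reverse=True)
        for position, c in enumerate(ordered):
            # Best-ranked candidate earns the most points.
            points[c] += len(candidates) - 1 - position
    return max(candidates, key=lambda c: points[c])


def recognize(vtr: Scores, atr: Scores) -> str:
    """Use the VTR alone unless its output is ambiguous; then consult the ATR."""
    if not is_ambiguous(vtr):
        return max(vtr, key=lambda c: vtr[c])
    return combine_measurement(vtr, atr)


# Made-up scores: the VTR cannot separate "2" from "z", the audio favors "z".
vtr_scores = {"2": 0.45, "z": 0.40, "7": 0.15}
atr_scores = {"z": 0.70, "7": 0.20, "2": 0.10}
print(recognize(vtr_scores, atr_scores))
print(combine_rank(vtr_scores, atr_scores))
```

In this sketch the audio recognizer is consulted only when the video recognizer's top candidates are close, which mirrors the abstract's point that combination is triggered only for ambiguous characters. The scores in the usage example are invented and do not come from the paper's experiments.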