Title: OCR from Video Stream of Book Flipping
Authors: Dibyayan Chakraborty, P. Roy, J. Álvarez, U. Pal
Venue: 2013 2nd IAPR Asian Conference on Pattern Recognition
Publication date: 2013-11-05
DOI: 10.1109/ACPR.2013.24
URL: https://doi.org/10.1109/ACPR.2013.24
Optical Character Recognition (OCR) on a video stream of flipping pages is challenging because pages are flipped at varying speeds, which makes it difficult to identify the frames that contain the open page image (OPI), where the text is most readable. Low resolution, blur, and shadows add further noise to the selection of suitable frames for OCR. In this work, we focus on identifying the set of optimal representative frames for each OPI in a video stream of flipping pages and then performing OCR, without using any special hardware. To the best of our knowledge, this is the first work in this area. We present an algorithm that exploits cues from the edge information of flipping pages. These cues, extracted from the region of interest (ROI) of each frame, indicate whether a page is in the flipping state or the open state. An SVM classifier trained on the edge cues makes this determination. For each OPI we obtain a set of frames, and we choose the central frame of that set as the representative frame of the OPI and perform OCR on it. Experiments on video documents recorded with a standard-resolution camera validate the frame selection algorithm, which achieves 88% accuracy. We also obtain a character recognition accuracy of 82% and a word recognition accuracy of 77% for book-flipping OCR.
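The central-frame selection step described above can be illustrated with a minimal sketch. It assumes that each frame has already been labelled as open (True) or flipping (False), for example by the SVM classifier, and it does not reproduce the edge-cue extraction or the classifier itself. The function names, the `min_len` parameter, and the choice of the lower middle frame for runs of even length are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple


def open_page_runs(is_open: Sequence[bool], min_len: int = 1) -> List[Tuple[int, int]]:
    """Group consecutive open-page frames into (start, end) runs, end inclusive.

    `min_len` is a hypothetical filter that drops very short runs, which
    could be classifier noise; the abstract does not mention such a filter.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, flag in enumerate(is_open):
        if flag and start is None:
            start = i  # a new run of open-page frames begins
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None  # the run ended at the previous frame
    # Close a run that extends to the last frame of the video.
    if start is not None and len(is_open) - start >= min_len:
        runs.append((start, len(is_open) - 1))
    return runs


def representative_frames(is_open: Sequence[bool], min_len: int = 1) -> List[int]:
    """Return the index of the central frame of each open-page run."""
    # Assumption: for runs of even length, take the lower of the two middle frames.
    return [(start + end) // 2 for start, end in open_page_runs(is_open, min_len)]


if __name__ == "__main__":
    # Hypothetical per-frame labels: True = open page, False = flipping.
    labels = [False, False, True, True, True, False, False,
              True, True, True, True, True, False]
    print(representative_frames(labels))  # [3, 9]
```

Each returned index identifies the frame that would be passed to the OCR engine for the corresponding open page. Picking the middle of a run is a simple way to stay away from the frames nearest a flip, which are the ones most likely to be blurred.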