An efficient coarse-to-fine scheme for text detection in videos

The First Asian Conference on Pattern Recognition. Pub Date: 2011-11-01. DOI: 10.1109/ACPR.2011.6166605

Liuan Wang, Lin-Lin Huang, Yang Wu


Citations: 8

Abstract

To achieve fast and accurate text detection in videos, we propose an efficient coarse-to-fine scheme comprising three stages: key frame extraction, candidate text line detection, and fine text detection. Key frames, which are assumed to carry text, are extracted based on the multi-threshold difference of color histograms (MDCH). From the key frames, candidate text lines are detected by morphological operations and connected component analysis. Sliding window classification is then performed on the candidate text lines to obtain refined text lines. To improve classification accuracy, we use two types of features, histogram of gradients (HOG) and local assembled binary (LAB), and two classifiers, Real Adaboost and a polynomial neural network (PNN). The effectiveness of the proposed method is demonstrated by experimental results on a large video dataset, which also confirm the benefits of key frame extraction and of combining multiple features and classifiers.
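The first stage can be illustrated with a small sketch. The abstract does not say how the multiple thresholds of MDCH are defined or combined, so the code below is not the authors' method. It assumes a simple twin-threshold rule: a frame becomes a key frame when its color histogram differs strongly from the previous frame, or when moderate differences accumulate past the higher threshold. The function names, the bin count, and the threshold values `low` and `high` are all hypothetical.

```python
from itertools import chain


def color_histogram(frame, bins=4):
    """Normalized joint RGB histogram.

    `frame` is a non-empty list of rows, each row a list of (r, g, b)
    tuples with channel values in 0..255.
    """
    hist = [0.0] * (bins ** 3)
    pixels = list(chain.from_iterable(frame))
    for r, g, b in pixels:
        # Map each channel to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
    return [count / len(pixels) for count in hist]


def histogram_difference(h1, h2):
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def select_key_frames(frames, low=0.1, high=0.5):
    """Return indices of key frames under the assumed twin-threshold rule.

    Assumption (not taken from the paper): a difference >= high marks an
    abrupt change; differences in [low, high) are accumulated and trigger
    a key frame once their sum reaches high; smaller differences reset
    the accumulator.
    """
    if not frames:
        return []
    hists = [color_histogram(f) for f in frames]
    keys = [0]  # the first frame is always kept
    accumulated = 0.0
    for i in range(1, len(hists)):
        d = histogram_difference(hists[i], hists[i - 1])
        if d >= high:
            keys.append(i)
            accumulated = 0.0
        elif d >= low:
            accumulated += d
            if accumulated >= high:
                keys.append(i)
                accumulated = 0.0
        else:
            accumulated = 0.0
    return keys


# Usage: two black frames followed by two white frames.
black = [[(0, 0, 0)] * 4 for _ in range(4)]
white = [[(255, 255, 255)] * 4 for _ in range(4)]
print(select_key_frames([black, black, white, white]))  # → [0, 2]
```

Only the frames returned here would be passed on to candidate text line detection, which is where the speed-up claimed for key frame extraction would come from. A practical version would read frames with a video library and tune the thresholds on real data.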

