2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR): Latest Publications

Text detection in born-digital images by mass estimation
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486591
Jiamin Xu, P. Shivakumara, Tong Lu, C. Tan, M. Blumenstein
Abstract: There is a need for effective web-document understanding due to the explosive growth of the Internet and network technologies. In this paper, we propose a new method for text detection in born-digital images by introducing a mass estimation concept. We explore super-pixel information from different color channels to identify text atoms in images, and use similarity graphs and spectral clustering to identify candidate text regions. We propose a new idea of mapping the Gabor responses of a candidate text region to a spatial circle to study the spatial coherency of pixels, and introduce a mass estimation concept to identify text candidates from the pixel distribution in the spatial circle. Linear linkage graphs help in grouping text candidates to obtain full text lines. The same Gabor responses are used as features to eliminate false positives with an SVM classifier. We evaluate the proposed method on standard datasets such as ICDAR 2013 (challenge-1) and the Situ et al. dataset. Experimental results on both datasets show that the proposed method outperforms existing methods.
Citations: 2
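The similarity-graph-plus-spectral-clustering step can be sketched as below. This is a minimal NumPy illustration (not the authors' implementation): `features` stands in for per-super-pixel color descriptors, and a Gaussian kernel on feature distance is an assumed choice of similarity.

```python
import numpy as np

def spectral_cluster(features, k, sigma=1.0):
    """Group super-pixel 'text atoms' by spectral clustering of a similarity graph."""
    # Fully connected similarity graph with a Gaussian kernel on feature distance.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(1) + 1e-12)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # Embed each atom with the k eigenvectors of the smallest eigenvalues.
    _, vecs = np.linalg.eigh(L)
    emb = vecs[:, :k]
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    # Tiny Lloyd's k-means on the embedding, seeded with two far-apart rows.
    seeds = [0, int(np.argmax(((emb - emb[0]) ** 2).sum(1)))] if k == 2 else list(range(k))
    centers = emb[seeds]
    labels = np.zeros(len(emb), dtype=int)
    for _ in range(50):
        labels = np.argmin(((emb[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = emb[labels == j].mean(0)
    return labels
```

In practice the paper clusters super-pixel color statistics; here any well-separated feature vectors illustrate the mechanics.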
Beyond human recognition: A CNN-based framework for handwritten character recognition
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486592
Li Chen, Song Wang, Wei-liang Fan, Jun Sun, S. Naoi
Abstract: Because of its varied appearance (different writers, writing styles, noise, etc.), handwritten character recognition is one of the most challenging tasks in pattern recognition. Through decades of research, traditional methods have reached their limit, while the emergence of deep learning provides a new way to break through it. In this paper, a CNN-based handwritten character recognition framework is proposed, in which proper sample generation, training schemes and CNN network structures are employed according to the properties of handwritten characters. In the experiments, the proposed framework performed even better than humans on handwritten digit (MNIST) and Chinese character (CASIA) recognition. These experimental results demonstrate the advantage of the framework.
Citations: 120
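To make the CNN pipeline concrete, here is a minimal forward pass (conv, ReLU, max-pool, fully connected, softmax) in plain NumPy. This is a generic sketch of the layer structure such frameworks use, with random weights; the paper's actual architecture, sample generation, and training scheme are not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w):
    """Valid convolution: one input channel, C output filters of shape (kh, kw)."""
    C, kh, kw = w.shape
    H, W = x.shape
    out = np.empty((C, H - kh + 1, W - kw + 1))
    for c in range(C):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[c, i, j] = (x[i:i + kh, j:j + kw] * w[c]).sum()
    return out

def max_pool(x, s=2):
    """Non-overlapping s x s max pooling over each feature map."""
    C, H, W = x.shape
    return x[:, :H // s * s, :W // s * s].reshape(C, H // s, s, W // s, s).max((2, 4))

def forward(img, filters, fc_w):
    h = np.maximum(conv2d(img, filters), 0)   # conv + ReLU
    h = max_pool(h).ravel()                   # 2x2 max pooling, then flatten
    logits = fc_w @ h                         # fully connected scores for 10 digits
    e = np.exp(logits - logits.max())
    return e / e.sum()                        # softmax class probabilities

filters = rng.standard_normal((4, 5, 5)) * 0.1
# 28x28 input -> 5x5 conv -> 24x24 -> 2x2 pool -> 12x12, 4 maps -> 576 features
fc_w = rng.standard_normal((10, 4 * 12 * 12)) * 0.01
probs = forward(rng.standard_normal((28, 28)), filters, fc_w)
```

Training (backpropagation, data augmentation) is omitted; the point is the shape of the computation from a 28x28 MNIST-sized input to 10 class probabilities.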
New texture-spatial features for keyword spotting in video images
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486532
P. Shivakumara, Guozhu Liang, Sangheeta Roy, U. Pal, Tong Lu
Abstract: Keyword spotting in video document images is challenging due to the low resolution and complex background of video images. We propose a combination of Texture-Spatial-Features (TSF) for keyword spotting in video images without recognizing them. First, a segmentation method extracts words from text lines in each video image. Then we propose a set of texture features for identifying text candidates in the word image with the help of k-means clustering. The proposed method finds the proximity between text candidates to study the spatial arrangement of pixels, which results in feature vectors for spotting words in the input frame. The proposed method is evaluated on word images of different fonts, contrasts, backgrounds and font sizes, chosen from standard databases such as ICDAR 2013 video and our own video data. Experimental results show that the proposed method outperforms existing methods in terms of recall, precision and F-measure.
Citations: 4
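A texture descriptor of the kind used for text-candidate identification can be sketched with hand-built Gabor filters: text-like patches with strong oriented stripes respond far more than flat background. The kernel parameters below (`sigma`, `lam`, the four orientations) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=6.0):
    """Real part of a Gabor filter: Gaussian envelope times a cosine carrier."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2)) * np.cos(2 * np.pi * xr / lam)

def texture_features(patch, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """One filter-energy value per orientation -> a small texture descriptor."""
    return np.array([np.abs((patch * gabor_kernel(theta=t)).sum()) for t in thetas])
```

Descriptors like these, computed per candidate region, are what a k-means step would then split into text and non-text clusters.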
Bayesian nonparametric inference of latent topic hierarchies for multimodal data
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486501
Takuji Shimamawari, K. Eguchi, A. Takasu
Abstract: Research on multimodal data analysis, such as annotated image analysis, is becoming more important than ever due to the increase in the amount of data. One approach to this problem is multimodal topic models, an extension of latent Dirichlet allocation (LDA). Symmetric correspondence topic models (SymCorrLDA) are state-of-the-art multimodal topic models that can appropriately model multimodal data considering inter-modal dependencies. Meanwhile, hierarchically structured categories can help users find relevant data in a large collection. Hierarchical topic models such as hierarchical latent Dirichlet allocation (hLDA) can discover a tree-structured hierarchy of latent topics from a given unimodal data collection; however, no hierarchical topic model can appropriately handle multimodal data considering inter-modal mutual dependencies. In this paper, we propose h-SymCorrLDA, which discovers latent topic hierarchies from multimodal data by combining the ideas of the two models mentioned above: multimodal topic models and hierarchical topic models. We demonstrate the effectiveness of our model compared with several baseline models through experiments on two datasets of annotated images.
Citations: 0
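In hLDA-style models, each document is assigned a root-to-leaf path in the topic tree drawn from a nested Chinese restaurant process (nCRP): at each level a document joins an existing branch with probability proportional to its popularity, or opens a new one with probability proportional to `gamma`. The path sampler below is my own minimal sketch of that prior, not the paper's inference code.

```python
import numpy as np

def sample_ncrp_path(tree, depth, gamma, rng):
    """Draw one document's root-to-leaf path from a nested CRP prior."""
    path, node = [], tree
    for _ in range(depth):
        counts = [c["n"] for c in node["children"]]
        # Existing child k with prob ~ counts[k]; a brand-new child with prob ~ gamma.
        probs = np.array(counts + [gamma], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):                      # open a new branch
            node["children"].append({"n": 0, "children": []})
        child = node["children"][k]
        child["n"] += 1                           # seat this document at the branch
        path.append(k)
        node = child
    return path

rng = np.random.default_rng(0)
root = {"n": 0, "children": []}
paths = [sample_ncrp_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(20)]
```

Full hLDA (and h-SymCorrLDA) would additionally sample per-level topic assignments and word distributions; this shows only how the tree structure itself grows nonparametrically.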
Video-level violence rating with rank prediction
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486468
Yu Wang, Jien Kato
Abstract: Given a video as input, our objective is to estimate a rating that describes "how violent it is". Such an estimate can be used directly in many practical applications, such as shielding children from violent videos. However, due to the unique properties of the rating task, existing approaches to human action recognition and violent scene detection cannot be utilized directly. In this paper, we propose an approach specially developed for violence rating. The approach features: (1) a novel video descriptor called the Violent Attribute Activation (VAA) vector, which provides a high-level description of the properties of visual violence; and (2) a rank-prediction-based rating approach, which enforces ordering constraints in the learning phase. The performance of our approach has been confirmed on a novel dataset prepared for violence rating.
Citations: 2
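The core idea of rank-prediction-based rating is to learn from ordering constraints: for every pair of videos where one is rated more violent than the other, the model's score for the first should exceed the second by a margin. A minimal pairwise learner in that style (a generic RankSVM-like perceptron sketch, not the authors' model) looks like:

```python
import numpy as np

def fit_rank_weights(X, ratings, epochs=100, lr=0.1):
    """Learn w so that w.x_i > w.x_j whenever ratings[i] > ratings[j]."""
    n, d = X.shape
    w = np.zeros(d)
    # Every ordered pair (i, j) with a higher rating for i is a constraint.
    pairs = [(i, j) for i in range(n) for j in range(n) if ratings[i] > ratings[j]]
    for _ in range(epochs):
        for i, j in pairs:
            diff = X[i] - X[j]
            if w @ diff < 1.0:     # margin violated -> perceptron-style update
                w += lr * diff
    return w
```

Here `X` would hold per-video descriptors (in the paper, VAA vectors) and `ratings` the ordinal violence labels; at test time `w @ x` is the predicted rating score.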
Stereoscopic image warping for enhancing composition aesthetics
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486582
Md Baharul Islam, L. Wong, Chee-Onn Wong, Kok-Lim Low
Abstract: The increased popularity of stereo photography, due to the availability of stereoscopic lenses and cameras, has aroused research interest in stereo image editing. In this paper, we present an automatic, aesthetics-based warping approach that recomposes the left and right images of a stereo pair simultaneously using a global optimization algorithm. To maximize image aesthetics, we minimize a set of aesthetics errors formulated from selected photographic composition rules during the warping process. In addition, our algorithm attempts to preserve the stereoscopic properties by minimizing disparity change and vertical drift in the resulting image. Experimental results show that our approach successfully relocates salient objects according to the selected photographic rules to enhance compositional aesthetics, and maintains disparity consistency to create a comfortable 3D viewing experience.
Citations: 12
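The "minimize aesthetics error while preserving disparity" trade-off can be illustrated with a toy one-dimensional least-squares problem: move a salient object's mean position in the left/right views toward the nearest rule-of-thirds line, while penalizing any change to its disparity. This is a deliberately simplified sketch (two unknowns instead of a full warping mesh); the weights `alpha` and `beta` are assumed, not from the paper.

```python
import numpy as np

def recompose_1d(xl, xr, width, alpha=1.0, beta=10.0):
    """Solve for new horizontal positions (xl', xr') of a salient object.

    Energy = alpha * (cyclopean position -> nearest rule-of-thirds line)^2
           + beta  * (change in disparity xl - xr)^2, as a least-squares system.
    """
    thirds = np.array([width / 3.0, 2.0 * width / 3.0])
    target = thirds[np.argmin(np.abs(thirds - (xl + xr) / 2.0))]
    d = xl - xr                                   # original disparity
    # Row 1: 0.5*xl' + 0.5*xr' = target (aesthetics on the mean position)
    # Row 2: xl' - xr' = d               (disparity preservation)
    A = np.array([[0.5 * np.sqrt(alpha), 0.5 * np.sqrt(alpha)],
                  [np.sqrt(beta), -np.sqrt(beta)]])
    b = np.array([np.sqrt(alpha) * target, np.sqrt(beta) * d])
    new_xl, new_xr = np.linalg.lstsq(A, b, rcond=None)[0]
    return new_xl, new_xr
```

In the real method the system is heavily overdetermined (one set of terms per mesh vertex, plus vertical-drift terms), which is why a global least-squares solver is used rather than this exactly-determined 2x2 case.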
Efficient graph spanning structures for large database image retrieval
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486572
B. Mocanu, Ruxandra Tapu, T. Zaharia
Abstract: In this paper we propose a novel method to improve the performance of image retrieval at the VLAD descriptor level. The system performs image re-ranking based on relational graphs and the neighborhood relations of the top-k candidate results. The technique can treat different parts of the graph spanning structures differently by adaptively modifying the similarity scores between images. Because most of the processing is performed offline, our algorithm does not affect retrieval time. By dealing with the uneven distribution of images in the dataset, the method is effective and increases accuracy without relying on low-level information or on geometric verification of the considered features.
Citations: 1
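Neighborhood-based re-ranking of a top-k shortlist can be sketched as follows: rank first by descriptor similarity, then boost candidates whose own nearest-neighbor sets overlap the query's shortlist. This is a generic illustration of the idea (the specific blending of raw similarity and neighborhood overlap is my assumption, not the paper's adaptive scheme).

```python
import numpy as np

def rerank(query_vec, db, k=5):
    """Re-rank the top-k matches for a query using neighborhood overlap."""
    # Initial ranking by cosine similarity of (VLAD-like) global descriptors.
    dbn = db / np.linalg.norm(db, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    sims = dbn @ q
    top = np.argsort(-sims)[:k]
    # Offline part: each database image's k nearest neighbours (self excluded).
    nn = np.argsort(-(dbn @ dbn.T), axis=1)[:, 1:k + 1]
    qnn = set(top.tolist())
    # Adapted score: raw similarity plus the fraction of shared neighbours.
    scores = [(int(i), float(sims[i]) + len(qnn & set(nn[i].tolist())) / k) for i in top]
    scores.sort(key=lambda t: -t[1])
    return [i for i, _ in scores]
```

The `nn` matrix is the part that can be computed offline for the whole database, which is why re-ranking adds almost nothing to query time.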
Learning clustered sub-spaces for sketch-based image retrieval
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486573
Koustav Ghosal, Ameya Prabhu, Riddhiman Dasgupta, A. Namboodiri
Abstract: Most traditional sketch-based image retrieval systems compare sketches and images using morphological features. Since these features belong to two different modalities, they are compared either by reducing the image to a sparse, sketch-like form or by transforming the sketches into a denser, image-like representation. However, this cross-modal transformation loses information or adds undesirable noise to the system. We propose a method in which, instead of comparing the two modalities directly, a cross-modal correspondence is established between the images and sketches. Using an extended version of Canonical Correlation Analysis (CCA), the samples are projected onto a lower-dimensional subspace in which images and sketches of the same class are maximally correlated. We test the efficiency of our method on images from the Caltech and PASCAL datasets and sketches from the TU-BERLIN dataset. Our results show a significant improvement in retrieval performance with the cross-modal correspondence.
Citations: 1
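Standard linear CCA, the starting point the paper extends, can be implemented compactly: whiten each view's covariance, then take the SVD of the whitened cross-covariance. The sketch below is plain CCA with a ridge regularizer (an assumption for numerical stability), not the authors' extended clustered variant.

```python
import numpy as np

def cca_project(X, Y, dim, reg=1e-3):
    """Project two views (e.g. image and sketch features) into a shared subspace."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    Cxx = Xc.T @ Xc / len(Xc) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / len(Yc) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / len(Xc)

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wxw, Wyw = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Directions maximizing correlation = singular vectors of the whitened Cxy.
    U, s, Vt = np.linalg.svd(Wxw @ Cxy @ Wyw)
    Wx, Wy = Wxw @ U[:, :dim], Wyw @ Vt.T[:, :dim]
    return Xc @ Wx, Yc @ Wy
```

Retrieval then reduces to nearest-neighbor search between the two projected sets, since correlated pairs land close together in the shared subspace.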
Video-based object recognition with weakly supervised object localization
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486463
Yang Liu, R. Kouskouridas, Tae-Kyun Kim
Abstract: With the number of videos growing rapidly in modern society, automatically recognizing objects from video input becomes increasingly pressing. Videos contain abundant yet noisy information, with easily obtained video-level labels. This paper targets the problem of video-based object recognition while keeping the advantages of video. We propose a novel algorithm that utilizes only the weak video-level labels in training, iteratively updating the classifier and inferring the object location in each video frame. During testing we obtain more accurate recognition results by inferring the location of the object in the scene. Background and temporal information are also incorporated in the model to improve the discriminability and consistency of recognition in video. We introduce a novel and challenging YouTube dataset to demonstrate the benefits of our method over baseline methods.
Citations: 1
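The alternate-and-refine loop (update the classifier, then re-infer the object window in each frame) is the classic multiple-instance-learning pattern. Below is a minimal sketch of that loop using a trivial mean-difference linear model; the simplistic model and the bag construction are illustrative assumptions, not the paper's classifier.

```python
import numpy as np

def mil_localize(bags, labels, iters=10):
    """bags[v]: array of candidate-window features for video v; labels[v] in {0, 1}.

    Alternates between (1) picking each positive bag's best-scoring window and
    (2) refitting a linear model on the chosen windows vs. negative windows.
    """
    pos = [b for b, y in zip(bags, labels) if y == 1]
    neg = np.vstack([b for b, y in zip(bags, labels) if y == 0])
    # Initialize from the mean of all windows in positive bags.
    w = np.vstack(pos).mean(0) - neg.mean(0)
    picks = []
    for _ in range(iters):
        picks = [int(np.argmax(b @ w)) for b in pos]      # inferred object windows
        chosen = np.vstack([b[i] for b, i in zip(pos, picks)])
        w = chosen.mean(0) - neg.mean(0)                  # refit the linear model
    return w, picks
```

At test time the same `argmax` over candidate windows yields the localization that sharpens recognition.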
Accent classification with phonetic vowel representation
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR) Pub Date : 2015-11-01 DOI: 10.1109/ACPR.2015.7486559
Zhenhao Ge, Ying‐Ying Tan, A. Ganapathiraju
Abstract: Previous accent classification research focused mainly on detecting accents with pure acoustic information, without recognizing the accented speech. This work combines phonetic knowledge, such as vowels, with acoustic information to build a Gaussian Mixture Model (GMM) classifier with Perceptual Linear Predictive (PLP) features, optimized by Heteroscedastic Linear Discriminant Analysis (HLDA). With about 20 seconds of accented speech as input, the system achieves a classification rate of 51% on a 7-way classification task covering the major accent types in English, which is competitive with state-of-the-art results in this field.
Citations: 8
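Generative accent classification of this kind fits one density model per accent over frame-level features and labels an utterance by the model with the highest total log-likelihood. As a minimal sketch I use a single diagonal Gaussian per accent in place of a full GMM, and random vectors in place of real PLP/HLDA features; both simplifications are mine, not the paper's setup.

```python
import numpy as np

class AccentClassifier:
    """One diagonal-covariance Gaussian per accent over PLP-like frame features."""

    def fit(self, frames_per_accent):
        # frames_per_accent[a]: (n_frames, n_dims) training frames for accent a.
        self.models = [(f.mean(0), f.var(0) + 1e-6) for f in frames_per_accent]
        return self

    def log_likelihood(self, frames, model):
        mu, var = model
        # Sum of per-frame, per-dimension Gaussian log densities.
        return (-0.5 * (np.log(2 * np.pi * var) + (frames - mu) ** 2 / var)).sum()

    def predict(self, frames):
        # Utterance label = accent whose model explains the frames best.
        return int(np.argmax([self.log_likelihood(frames, m) for m in self.models]))
```

Replacing the single Gaussian with a trained mixture per accent, restricting frames to vowel segments, and projecting features with HLDA would move this sketch toward the system described above.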