arXiv: Computer Vision and Pattern Recognition: Latest Papers

MatriVasha: Bangla Handwritten Compound Character Dataset and Recognition
arXiv: Computer Vision and Pattern Recognition Pub Date : 2021-08-06 DOI: 10.17632/V39PC2G2WP.1
J. Ferdous, Suvrajit Karmaker, AKM SHAHARIAR AZAD RABBY, S. A. Hossain
{"title":"MatriVasha: Bangla Handwritten Compound Character Dataset and Recognition","authors":"J. Ferdous, Suvrajit Karmaker, AKM SHAHARIAR AZAD RABBY, S. A. Hossain","doi":"10.17632/V39PC2G2WP.1","DOIUrl":"https://doi.org/10.17632/V39PC2G2WP.1","url":null,"abstract":"Recognition of handwritten Bangla compound characters has been an essential problem for many years. In recent years, application-based research in machine learning and deep learning has gained interest, most notably in handwriting recognition, because it has important applications such as Bangla OCR. MatriVasha is a project that can recognize several handwritten Bangla compound characters. Compound character recognition is currently an important topic because of its varied applications, and it helps to digitize old forms and information reliably. Unfortunately, there is a lack of a comprehensive dataset that can categorize all types of Bangla compound characters. MatriVasha is an attempt to align compound characters, which is challenging because each person has a unique style of writing shapes. MatriVasha proposes a dataset intended for recognizing 120 (one hundred twenty) Bangla compound characters, consisting of 2552 (two thousand five hundred fifty-two) isolated handwritten characters written by unique writers and collected from within Bangladesh. The dataset addresses problems in district-, age-, and gender-based handwriting research, because the samples were collected from a variety of districts and age groups and from an equal number of males and females. As of now, the proposed dataset is the most extensive dataset for Bangla compound characters. It is intended to frame the recognition technique for handwritten Bangla compound characters. In the future, this dataset will be made publicly available to help widen the research.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128691714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Digit Image Recognition Using an Ensemble of One-Versus-All Deep Network Classifiers
arXiv: Computer Vision and Pattern Recognition Pub Date : 2020-06-28 DOI: 10.1007/978-981-16-0882-7_38
A. M. Hafiz, M. Hassaballah
{"title":"Digit Image Recognition Using an Ensemble of One-Versus-All Deep Network Classifiers","authors":"A. M. Hafiz, M. Hassaballah","doi":"10.1007/978-981-16-0882-7_38","DOIUrl":"https://doi.org/10.1007/978-981-16-0882-7_38","url":null,"abstract":"","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121541866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Predicting Landslides Using Contour Aligning Convolutional Neural Networks
arXiv: Computer Vision and Pattern Recognition Pub Date : 2019-11-12 DOI: 10.14288/1.0385548
Ainaz Hajimoradlou
{"title":"Predicting Landslides Using Contour Aligning Convolutional Neural Networks","authors":"Ainaz Hajimoradlou","doi":"10.14288/1.0385548","DOIUrl":"https://doi.org/10.14288/1.0385548","url":null,"abstract":"Landslides, movement of soil and rock under the influence of gravity, are common phenomena that cause significant human and economic losses every year. Experts use heterogeneous features such as slope, elevation, land cover, lithology, rock age, and rock family to predict landslides. To work with such features, we adapted convolutional neural networks to consider relative spatial information for the prediction task. Traditional filters in these networks either have a fixed orientation or are rotationally invariant. Intuitively, the filters should orient uphill, but there is not enough data to learn the concept of uphill; instead, it can be provided as prior knowledge. We propose a model called Locally Aligned Convolutional Neural Network, LACNN, that follows the ground surface at multiple scales to predict possible landslide occurrence for a single point. To validate our method, we created a standardized dataset of georeferenced images consisting of the heterogeneous features as inputs, and compared our method to several baselines, including linear regression, a neural network, and a convolutional network, using log-likelihood error and Receiver Operating Characteristic curves on the test set. 
We show that our model performs better than the other proposed baselines.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121671090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A New Benchmark Dataset for Texture Image Analysis and Surface Defect Detection.
arXiv: Computer Vision and Pattern Recognition Pub Date : 2019-06-27 DOI: 10.13140/RG.2.2.33612.46722
Shervan Fekri-Ershad
{"title":"A New Benchmark Dataset for Texture Image Analysis and Surface Defect Detection.","authors":"Shervan Fekri-Ershad","doi":"10.13140/RG.2.2.33612.46722","DOIUrl":"https://doi.org/10.13140/RG.2.2.33612.46722","url":null,"abstract":"Texture analysis plays an important role in many image processing applications for describing image content or objects. Visual surface defect detection, in turn, is a highly researched field in computer vision. A surface defect is an abnormality in the texture of a surface. This paper therefore proposes a dual-purpose benchmark dataset for texture image analysis and surface defect detection, titled stone texture image (STI dataset). The proposed benchmark dataset consists of 4 different classes of stone texture images. It has some unique properties that make it very close to real applications: local rotation, different zoom rates, unbalanced classes, and variation of textures in size. In the results section, some descriptors are applied to this dataset to evaluate the proposed STI dataset in comparison with other state-of-the-art datasets.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131300047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Understanding and Improving Deep Neural Network for Activity Recognition
arXiv: Computer Vision and Pattern Recognition Pub Date : 2018-05-18 DOI: 10.4108/EAI.21-6-2018.2276632
Li Xue, Si Xiandong, Nie Lan-shun, Liu Jiazhen, Ding Renjie, Zhang Dechen, Chu Dian-hui
{"title":"Understanding and Improving Deep Neural Network for Activity Recognition","authors":"Li Xue, Si Xiandong, Nie Lan-shun, Liu Jiazhen, Ding Renjie, Zhang Dechen, Chu Dian-hui","doi":"10.4108/EAI.21-6-2018.2276632","DOIUrl":"https://doi.org/10.4108/EAI.21-6-2018.2276632","url":null,"abstract":"Activity recognition has become a popular research branch in the field of pervasive computing in recent years. A large number of experiments show that sensor-based activity data are characterized by variety, volume, and velocity. Deep learning technology, together with its various models, is one of the most effective ways of working on activity data. Nevertheless, there is no clear understanding of why it performs so well or how to make it more effective. To address this problem, we first applied a convolutional neural network to the Human Activity Recognition Using Smartphones Data Set. Second, we visualized the sensor-based activity data features extracted by the neural network. We then analyzed the visualized features in depth, explored the relationship between activities and features, and analyzed how neural networks identify activities based on these features. After that, we extracted the significant features related to the activities and fed them to a DNN-based fusion model, which improved the classification rate to 96.1%. To our knowledge, this is the first work that visualizes abstract sensor-based activity data features. Based on the results, the method proposed in the paper promises accurate classification in sensor-based activity recognition.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130290027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Deep learning approach to Fourier ptychographic microscopy
arXiv: Computer Vision and Pattern Recognition Pub Date : 2018-04-27 DOI: 10.6084/M9.FIGSHARE.C.4113581.V1
Thanh Nguyen, Yujia Xue, Yunzhe Li, Lei Tian, G. Nehmetallah
{"title":"Deep learning approach to Fourier ptychographic microscopy","authors":"Thanh Nguyen, Yujia Xue, Yunzhe Li, Lei Tian, G. Nehmetallah","doi":"10.6084/M9.FIGSHARE.C.4113581.V1","DOIUrl":"https://doi.org/10.6084/M9.FIGSHARE.C.4113581.V1","url":null,"abstract":"We would like to thank NVIDIA Corporation for supporting us with the GeForce Titan Xp through the GPU Grant Program. (NVIDIA Corporation; GeForce Titan Xp through the GPU Grant Program)","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131221930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 24
3D video quality assessment
arXiv: Computer Vision and Pattern Recognition Pub Date : 2018-03-13 DOI: 10.14288/1.0166613
Amin Banitalebi Dehkordi
{"title":"3D video quality assessment","authors":"Amin Banitalebi Dehkordi","doi":"10.14288/1.0166613","DOIUrl":"https://doi.org/10.14288/1.0166613","url":null,"abstract":"A key factor in designing 3D systems is to understand how different visual cues and distortions affect the perceptual quality of 3D video. The ultimate way to assess video quality is through subjective tests. However, subjective evaluation is time consuming, expensive, and in most cases not even possible. An alternative solution is objective quality metrics, which attempt to model the Human Visual System (HVS) in order to assess the perceptual quality. The potential of 3D technology to significantly improve the immersiveness of video content has been hampered by the difficulty of objectively assessing Quality of Experience (QoE). A no-reference (NR) objective 3D quality metric, which could help determine capturing parameters and improve playback perceptual quality, would be welcomed by camera and display manufactures. Network providers would embrace a full-reference (FR) 3D quality metric, as they could use it to ensure efficient QoE-based resource management during compression and Quality of Service (QoS) during transmission.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"44 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132444337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Parallel Mapper
arXiv: Computer Vision and Pattern Recognition Pub Date : 2017-12-11 DOI: 10.1007/978-3-030-63089-8_47
Mustafa Hajij, Basem Assiri, P. Rosen
{"title":"Parallel Mapper","authors":"Mustafa Hajij, Basem Assiri, P. Rosen","doi":"10.1007/978-3-030-63089-8_47","DOIUrl":"https://doi.org/10.1007/978-3-030-63089-8_47","url":null,"abstract":"","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116919388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
(k, q)-Compressed Sensing for dMRI with Joint Spatial-Angular Sparsity Prior
arXiv: Computer Vision and Pattern Recognition Pub Date : 2017-07-21 DOI: 10.1007/978-3-319-73839-0_2
Evan Schwab, R. Vidal, N. Charon
{"title":"(k, q)-Compressed Sensing for dMRI with Joint Spatial-Angular Sparsity Prior","authors":"Evan Schwab, R. Vidal, N. Charon","doi":"10.1007/978-3-319-73839-0_2","DOIUrl":"https://doi.org/10.1007/978-3-319-73839-0_2","url":null,"abstract":"","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123052796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Deep Reinforcement Learning Attention Selection For Person Re-Identification
arXiv: Computer Vision and Pattern Recognition Pub Date : 2017-07-01 DOI: 10.5244/C.31.121
Xu Lan, Hangxiao Wang, S. Gong, Xiatian Zhu
{"title":"Deep Reinforcement Learning Attention Selection For Person Re-Identification","authors":"Xu Lan, Hangxiao Wang, S. Gong, Xiatian Zhu","doi":"10.5244/C.31.121","DOIUrl":"https://doi.org/10.5244/C.31.121","url":null,"abstract":"Existing person re-identification (re-id) methods assume the provision of accurately cropped person bounding boxes with minimum background noise, mostly obtained by manual cropping. This assumption is significantly breached in practice, when person bounding boxes must be detected automatically given a very large number of images and/or videos to process. Compared to carefully manually cropped boxes, auto-detected bounding boxes are far less accurate, with random amounts of background clutter that can notably degrade person re-id matching accuracy. In this work, we develop a joint learning deep model that optimises person re-id attention selection within any auto-detected person bounding boxes by reinforcement learning of background clutter minimisation subject to re-id label pairwise constraints. Specifically, we formulate a novel unified re-id architecture called Identity DiscriminativE Attention reinforcement Learning (IDEAL) to accurately select re-id attention in auto-detected bounding boxes for optimising re-id performance. Our model can improve re-id accuracy to a level comparable to that from exhaustive human manual cropping of bounding boxes, with additional advantages from identity discriminative attention selection that specifically benefits re-id tasks beyond human knowledge. Extensive comparative evaluations demonstrate the re-id advantages of the proposed IDEAL model over a wide range of state-of-the-art re-id methods on two auto-detected re-id benchmarks, CUHK03 and Market-1501.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133375469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 53