{"title":"MatriVasha: Bangla Handwritten Compound Character Dataset and Recognition","authors":"J. Ferdous, Suvrajit Karmaker, AKM SHAHARIAR AZAD RABBY, S. A. Hossain","doi":"10.17632/V39PC2G2WP.1","DOIUrl":"https://doi.org/10.17632/V39PC2G2WP.1","url":null,"abstract":"Recognition of Bangla handwritten compound characters has been an essential issue for many years. In recent years, application-based research in machine learning and deep learning has gained interest, most notably handwriting recognition, because it has important applications such as Bangla OCR. MatriVasha is a project that can recognize several Bangla handwritten compound characters. Compound character recognition is currently an important topic because of its varied applications, and it helps to digitize old forms and information reliably. Unfortunately, there is a lack of a comprehensive dataset that can categorize all types of Bangla compound characters. MatriVasha is an attempt to align compound characters, which is challenging because each person has a unique style of writing shapes. MatriVasha proposes a dataset intended for recognizing 120 (one hundred twenty) Bangla compound characters, consisting of 2552 (two thousand five hundred fifty-two) isolated handwritten characters written by unique writers and collected within Bangladesh. The dataset also addresses district-, age-, and gender-based handwriting research, because the samples cover a variety of districts and age groups and an equal number of males and females. Our proposed dataset is so far the most extensive dataset for Bangla compound characters. It is intended to frame the recognition technique for handwritten Bangla compound characters. In the future, this dataset will be made publicly available to help widen the research.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128691714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digit Image Recognition Using an Ensemble of One-Versus-All Deep Network Classifiers","authors":"A. M. Hafiz, M. Hassaballah","doi":"10.1007/978-981-16-0882-7_38","DOIUrl":"https://doi.org/10.1007/978-981-16-0882-7_38","url":null,"abstract":"","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121541866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Landslides Using Contour Aligning Convolutional Neural Networks","authors":"Ainaz Hajimoradlou","doi":"10.14288/1.0385548","DOIUrl":"https://doi.org/10.14288/1.0385548","url":null,"abstract":"Landslides, movement of soil and rock under the influence of gravity, are common phenomena that cause significant human and economic losses every year. Experts use heterogeneous features such as slope, elevation, land cover, lithology, rock age, and rock family to predict landslides. To work with such features, we adapted convolutional neural networks to consider relative spatial information for the prediction task. Traditional filters in these networks either have a fixed orientation or are rotationally invariant. Intuitively, the filters should orient uphill, but there is not enough data to learn the concept of uphill; instead, it can be provided as prior knowledge. We propose a model called Locally Aligned Convolutional Neural Network, LACNN, that follows the ground surface at multiple scales to predict possible landslide occurrence for a single point. To validate our method, we created a standardized dataset of georeferenced images consisting of the heterogeneous features as inputs, and compared our method to several baselines, including linear regression, a neural network, and a convolutional network, using log-likelihood error and Receiver Operating Characteristic curves on the test set. We show that our model performs better than the other proposed baselines.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121671090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Benchmark Dataset for Texture Image Analysis and Surface Defect Detection.","authors":"Shervan Fekri-Ershad","doi":"10.13140/RG.2.2.33612.46722","DOIUrl":"https://doi.org/10.13140/RG.2.2.33612.46722","url":null,"abstract":"Texture analysis plays an important role in many image processing applications to describe the image content or objects. Visual surface defect detection, in turn, is a highly researched field in computer vision. A surface defect refers to abnormalities in the texture of the surface. So, in this paper, a dual-purpose benchmark dataset titled Stone Texture Image (STI dataset) is proposed for texture image analysis and surface defect detection. The proposed benchmark dataset consists of 4 different classes of stone texture images. It has some unique properties that make it very close to real applications: local rotation, different zoom rates, unbalanced classes, and variation in texture size. In the results part, some descriptors are applied to this dataset to evaluate the proposed STI dataset in comparison with other state-of-the-art datasets.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131300047","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding and Improving Deep Neural Network for Activity Recognition","authors":"Li Xue, Si Xiandong, Nie Lan-shun, Liu Jiazhen, Ding Renjie, Zhang Dechen, Chu Dian-hui","doi":"10.4108/EAI.21-6-2018.2276632","DOIUrl":"https://doi.org/10.4108/EAI.21-6-2018.2276632","url":null,"abstract":"Activity recognition has become a popular research branch in the field of pervasive computing in recent years. A large number of experiments show that sensor-based activity data are characterized by variety, volume, and velocity. Deep learning technology, together with its various models, is one of the most effective ways of working on activity data. Nevertheless, there is no clear understanding of why it performs so well or how to make it more effective. To address this problem, first, we applied a convolutional neural network to the Human Activity Recognition Using Smartphones Data Set. Second, we visualized the sensor-based activity data features extracted from the neural network. Then we analyzed the visualized features in depth, explored the relationship between activities and features, and analyzed how neural networks identify activities based on these features. After that, we extracted the significant features related to the activities and fed them to a DNN-based fusion model, which improved the classification rate to 96.1%. To our knowledge, this is the first work that visualizes abstract sensor-based activity data features. Based on the results, the method proposed in the paper promises accurate classification in sensor-based activity recognition.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130290027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep learning approach to Fourier ptychographic microscopy","authors":"Thanh Nguyen, Yujia Xue, Yunzhe Li, Lei Tian, G. Nehmetallah","doi":"10.6084/M9.FIGSHARE.C.4113581.V1","DOIUrl":"https://doi.org/10.6084/M9.FIGSHARE.C.4113581.V1","url":null,"abstract":"We would like to thank NVIDIA Corporation for supporting us with the GeForce Titan Xp through the GPU Grant Program. (NVIDIA Corporation; GeForce Titan Xp through the GPU Grant Program)","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131221930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D video quality assessment","authors":"Amin Banitalebi Dehkordi","doi":"10.14288/1.0166613","DOIUrl":"https://doi.org/10.14288/1.0166613","url":null,"abstract":"A key factor in designing 3D systems is to understand how different visual cues and distortions affect the perceptual quality of 3D video. The ultimate way to assess video quality is through subjective tests. However, subjective evaluation is time consuming, expensive, and in most cases not even possible. An alternative solution is objective quality metrics, which attempt to model the Human Visual System (HVS) in order to assess the perceptual quality. The potential of 3D technology to significantly improve the immersiveness of video content has been hampered by the difficulty of objectively assessing Quality of Experience (QoE). A no-reference (NR) objective 3D quality metric, which could help determine capturing parameters and improve playback perceptual quality, would be welcomed by camera and display manufactures. Network providers would embrace a full-reference (FR) 3D quality metric, as they could use it to ensure efficient QoE-based resource management during compression and Quality of Service (QoS) during transmission.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"44 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132444337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"(k, q)-Compressed Sensing for dMRI with Joint Spatial-Angular Sparsity Prior","authors":"Evan Schwab, R. Vidal, N. Charon","doi":"10.1007/978-3-319-73839-0_2","DOIUrl":"https://doi.org/10.1007/978-3-319-73839-0_2","url":null,"abstract":"","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123052796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Reinforcement Learning Attention Selection For Person Re-Identification","authors":"Xu Lan, Hangxiao Wang, S. Gong, Xiatian Zhu","doi":"10.5244/C.31.121","DOIUrl":"https://doi.org/10.5244/C.31.121","url":null,"abstract":"Existing person re-identification (re-id) methods assume the provision of accurately cropped person bounding boxes with minimum background noise, mostly by manual cropping. This is significantly breached in practice when person bounding boxes must be detected automatically given a very large number of images and/or videos processed. Compared to carefully manually cropped boxes, auto-detected bounding boxes are far less accurate, with a random amount of background clutter that can notably degrade person re-id matching accuracy. In this work, we develop a joint learning deep model that optimises person re-id attention selection within any auto-detected person bounding boxes by reinforcement learning of background clutter minimisation subject to re-id label pairwise constraints. Specifically, we formulate a novel unified re-id architecture called Identity DiscriminativE Attention reinforcement Learning (IDEAL) to accurately select re-id attention in auto-detected bounding boxes for optimising re-id performance. Our model can improve re-id accuracy comparable to that from exhaustive human manual cropping of bounding boxes with additional advantages from identity discriminative attention selection that specially benefits re-id tasks beyond human knowledge. Extensive comparative evaluations demonstrate the re-id advantages of the proposed IDEAL model over a wide range of state-of-the-art re-id methods on two auto-detected re-id benchmarks, CUHK03 and Market-1501.","PeriodicalId":185904,"journal":{"name":"arXiv: Computer Vision and Pattern Recognition","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133375469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}