Electronic Letters on Computer Vision and Image Analysis: Latest Articles

Edge detection algorithm for omnidirectional images, based on superposition laws on Blach’s sphere and quantum entropy
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-02-25. DOI: 10.5565/REV/ELCVIA.1338
Authors: Ayoub Ezzaki, Dirar Benkhedra, Mohamed El Ansari, L. Masmoudi
Abstract: This paper presents an edge detection algorithm for omnidirectional images based on the superposition law on Bloch's sphere and quantum local entropy. Omnidirectional vision systems have become an essential tool in computer vision due to their large field of view. However, classical image-processing algorithms cannot be applied directly to this type of image without taking into account the spatial information around each pixel. To show the performance of the proposed method, a set of experiments was carried out on synthetic and real images devoted to agricultural applications. The Fram and Deutsch criterion was then adopted to evaluate its performance against three algorithms from the literature developed for omnidirectional images. The results show good performance of the proposed method in terms of edge quality, edge continuity and sensitivity to noise.
Pages: 70-83. Citations: 3.
(A code sketch of a classical local-entropy edge map follows this entry.)
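The paper's quantum formulation is not reproduced here, but its core ingredient, a local entropy measure that highlights pixels whose neighbourhood is highly disordered, can be illustrated with a purely classical local Shannon entropy from scikit-image. This is only a minimal sketch under that substitution: the Bloch-sphere superposition step and the spherical (omnidirectional) geometry are not implemented, and the file names are placeholders.

```python
import numpy as np
from skimage import io, img_as_ubyte
from skimage.filters.rank import entropy
from skimage.filters import threshold_otsu
from skimage.morphology import disk

# Read the image as 8-bit grayscale (placeholder path).
gray = img_as_ubyte(io.imread("omni_image.png", as_gray=True))

# Local Shannon entropy in a small circular neighbourhood of each pixel;
# high entropy indicates strong local intensity variation, i.e. likely edges.
ent = entropy(gray, disk(3))

# Binarise the entropy map with Otsu's threshold to obtain an edge mask.
edges = ent > threshold_otsu(ent)
io.imsave("edges.png", img_as_ubyte(edges))
```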
Recognition of Devanagari Scene Text Using Autoencoder CNN
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-02-02. DOI: 10.5565/REV/ELCVIA.1344
Authors: S. Shiravale, Jayadevan R, S. Sannakki
Abstract: Scene text recognition is a well-rooted research domain covering a diverse application area. Recognition of scene text is challenging due to the complex nature of scene images, and various structural characteristics of the script also influence the recognition process. Text/background segmentation is a mandatory step in the scene text recognition process, and a text recognition system produces the most accurate results when the segmentation technique preserves the structural and contextual information. Therefore, an attempt is made here to develop a robust foreground/background segmentation (separation) technique that yields the highest recognition results. A ground-truth dataset containing Devanagari scene text images was prepared for the experimentation. An encoder-decoder convolutional neural network model is used for text/background segmentation; the model is trained on Devanagari scene text images for pixel-wise classification of text and background. The segmented text is then recognized using an existing OCR engine (Tesseract). Word- and character-level recognition rates are computed and compared with other existing segmentation techniques to establish the effectiveness of the proposed technique.
Pages: 55-69. Citations: 4.
(A code sketch of a segment-then-recognise pipeline follows this entry.)
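A minimal sketch of the segment-then-recognise idea described above, assuming a hypothetical pre-trained PyTorch encoder-decoder (`DevanagariSegmenter`, weights file `devanagari_segmenter.pt`) and the Tesseract Hindi (Devanagari) language pack. The pre-processing, the 0.5 probability cut-off and the file names are assumptions, not the authors' settings.

```python
import cv2
import numpy as np
import pytesseract
import torch

from my_models import DevanagariSegmenter   # hypothetical encoder-decoder definition

# Load hypothetical pre-trained weights for the text/background segmenter.
model = DevanagariSegmenter()
model.load_state_dict(torch.load("devanagari_segmenter.pt", map_location="cpu"))
model.eval()

img = cv2.imread("scene.jpg")                               # placeholder image
x = torch.from_numpy(img.transpose(2, 0, 1)).float()[None] / 255.0

with torch.no_grad():
    prob = torch.sigmoid(model(x))[0, 0].numpy()            # per-pixel text probability
mask = prob > 0.5                                           # binary text mask

# Keep only the text pixels on a white background, then hand the result to
# Tesseract with the Hindi (Devanagari) language pack.
clean = np.where(mask[..., None], img, 255).astype(np.uint8)
print(pytesseract.image_to_string(clean, lang="hin"))
```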
A comparison of an RGB-D cameras performance and a stereo camera in relation to object recognition and spatial position determination
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-01-26. DOI: 10.5565/REV/ELCVIA.1238
Author: Julián S. Rodríguez
Abstract: This paper presents results of using an RGB-D camera (Kinect sensor) and a stereo camera, separately, to determine the real 3D position of characteristic points of a predetermined object in a scene. The KAZE algorithm, which exploits a nonlinear scale space built through nonlinear diffusion filtering, was used for recognition; the 3D coordinates of the centroid of the predetermined object were calculated from the camera calibration information and the depth provided by the Kinect sensor and the stereo camera. Experimental results show that the required coordinates can be obtained with both cameras in order to locate a robot, although a balance in sensor placement must be maintained: at least 0.8 m from the object to obtain valid depth information with the Kinect (due to its operating range) and 0.5 m with the stereo camera, while remaining within about 1 m to keep a suitable object-recognition rate. In addition, the Kinect sensor measures distance more precisely than the stereo camera.
Pages: 16-27. Citations: 7.
(A code sketch of KAZE detection and depth back-projection follows this entry.)
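The two building blocks named in the abstract, KAZE keypoint detection and depth-based 3D localisation, can be sketched with OpenCV as follows. The intrinsics, the file paths and the use of the keypoint centroid (standing in for the matched object centroid) are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np

fx, fy, cx, cy = 525.0, 525.0, 319.5, 239.5   # assumed Kinect-like intrinsics

img = cv2.imread("object_view.png", cv2.IMREAD_GRAYSCALE)   # placeholder paths
depth = np.load("depth_metres.npy")           # per-pixel depth in metres (placeholder)

kaze = cv2.KAZE_create()                      # nonlinear scale-space detector
keypoints, descriptors = kaze.detectAndCompute(img, None)

# Centroid of the detected keypoints; the paper uses the centroid of the
# recognised object, so this is a simplification.
u, v = np.mean([kp.pt for kp in keypoints], axis=0)
z = float(depth[int(round(v)), int(round(u))])

# Pinhole back-projection: pixel (u, v) with depth z -> camera coordinates.
x = (u - cx) * z / fx
y = (v - cy) * z / fy
print(f"3D position in the camera frame: ({x:.3f}, {y:.3f}, {z:.3f}) m")
```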
Adaptive Window Selection for Non-uniform Lighting Image Thresholding
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-01-19. DOI: 10.5565/REV/ELCVIA.1301
Authors: Tapaswini Pattnaik, P. Kanungo
Abstract: Selecting windows or sub-images of appropriate size is the most important step in thresholding images with non-uniform lighting. In this paper, a novel criterion function is developed to partition an image into sub-images of different sizes that are appropriate for thresholding. After the partitioning, each sub-image is segmented by Otsu's thresholding approach. The performance of the proposed method is validated on benchmark test images with different degrees of uneven lighting. Based on qualitative and quantitative measures, the proposed method is fully automatic, fast and efficient in comparison with many landmark approaches.
Pages: 42-54. Citations: 0.
(A code sketch of window splitting with per-window Otsu thresholding follows this entry.)
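A rough sketch of the split-then-threshold idea: the image is recursively divided into windows until each one is homogeneous enough, and Otsu's threshold is applied per window. The splitting criterion used here (intensity standard deviation against a fixed limit) is a simple placeholder, not the criterion function developed in the paper.

```python
import cv2
import numpy as np

def threshold_adaptive(img, std_limit=30.0, min_size=32):
    """Quadtree-style splitting with per-window Otsu thresholding."""
    out = np.zeros_like(img)

    def process(y0, y1, x0, x1):
        block = img[y0:y1, x0:x1]
        small = (y1 - y0) <= min_size or (x1 - x0) <= min_size
        if small or block.std() <= std_limit:
            # Window is roughly uniformly lit (or too small to split): Otsu here.
            _, binary = cv2.threshold(block, 0, 255,
                                      cv2.THRESH_BINARY + cv2.THRESH_OTSU)
            out[y0:y1, x0:x1] = binary
        else:
            my, mx = (y0 + y1) // 2, (x0 + x1) // 2
            for ys, ye, xs, xe in [(y0, my, x0, mx), (y0, my, mx, x1),
                                   (my, y1, x0, mx), (my, y1, mx, x1)]:
                process(ys, ye, xs, xe)

    process(0, img.shape[0], 0, img.shape[1])
    return out

gray = cv2.imread("uneven_lighting.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
cv2.imwrite("binarised.png", threshold_adaptive(gray))
```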
Investigation of Solar Flare Classification to Identify Optimal Performance
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-01-19. DOI: 10.5565/REV/ELCVIA.1274
Authors: Aditya Kakde, Durgansh Sharma, B. Kaushik, N. Arora
Abstract: A solar flare is an intense brightening of the Sun observed for a short period of time. Solar flares consist of high-energy photons and particles, which induce strong electric fields and currents and can therefore disrupt space-borne and ground-based technological systems. Extracting their important features for prediction is also a challenging task. Convolutional neural networks have gained significant popularity in classification and localization tasks. This paper focuses on the classification of solar flares that emerged in different years, using stacked convolutional layers followed by max-pooling layers. Following AlexNet, the pooling employed in this paper is overlapping pooling. Two activation functions, ELU and CReLU, are used to investigate how many convolutional layers with a particular activation function give the best results on this dataset, since datasets in this domain are always small. The proposed investigation can further be used in novel solar-flare prediction systems.
Pages: 28-41. Citations: 2.
(A code sketch of overlapping pooling and the CReLU activation follows this entry.)
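A small PyTorch model showing the two ingredients the abstract singles out: AlexNet-style overlapping max pooling (kernel 3, stride 2) and the CReLU activation, implemented as the concatenation of ReLU(x) and ReLU(-x), alongside ELU. The layer widths, depth, number of classes and input resolution are arbitrary assumptions, not the configurations investigated in the paper.

```python
import torch
import torch.nn as nn

class CReLU(nn.Module):
    def forward(self, x):
        # CReLU doubles the channel count: [ReLU(x), ReLU(-x)].
        return torch.cat([torch.relu(x), torch.relu(-x)], dim=1)

class FlareNet(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2), CReLU(),   # 16 -> 32 channels
            nn.MaxPool2d(kernel_size=3, stride=2),                 # overlapping pooling
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ELU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

logits = FlareNet()(torch.randn(2, 1, 64, 64))   # dummy batch of 64x64 images
print(logits.shape)                              # torch.Size([2, 3])
```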
Increasing the Segmentation Accuracy of Aerial Images with Dilated Spatial Pyramid Pooling
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-01-13. DOI: 10.5565/REV/ELCVIA.1337
Author: Manuel Eugenio Morocho-Cayamcela
Abstract: This thesis addresses environmental uncertainty in satellite images as a computer vision task using semantic image segmentation. We focus on reducing the error caused by using single-environment propagation models in wireless communications. We propose to use computer vision and image analysis to segment geographical terrain in order to employ a specific propagation model in each segment of the link. Our computer vision architecture achieved segmentation accuracies of 89.41%, 86.47%, and 87.37% for the urban, suburban, and rural classes, respectively. Results indicate that estimating propagation loss with our multi-environment model reduced the root mean square deviation (RMSD) with respect to two publicly available tracing datasets.
Citations: 1.
(A code sketch of a dilated spatial pyramid pooling block follows this entry.)
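A minimal dilated (atrous) spatial pyramid pooling block of the kind the title refers to: parallel 3x3 convolutions with different dilation rates whose outputs are concatenated and projected. The channel sizes and dilation rates are assumptions, not the configuration used in the thesis.

```python
import torch
import torch.nn as nn

class DilatedSPP(nn.Module):
    def __init__(self, in_ch=256, out_ch=64, rates=(1, 6, 12, 18)):
        super().__init__()
        # One branch per dilation rate; padding = rate keeps the spatial size.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # 1x1 projection after concatenating all branches.
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

feat = torch.randn(1, 256, 32, 32)        # a dummy backbone feature map
print(DilatedSPP()(feat).shape)           # torch.Size([1, 64, 32, 32])
```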
Saliency-Based Image Retrieval as a Refinement to Content-Based Image Retrieval
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-01-13. DOI: 10.5565/REV/ELCVIA.1325
Author: Mohammad A. N. Al-Azawi
Abstract: Searching for an image in a database is important in many applications; hence, many algorithms have been proposed to identify the contents of an image. In some applications, but not all, identifying the content of the image as a whole can offer good results; in most applications, however, searching for an object inside the image is more important than identifying the image as a whole. Studies have therefore focused on segmenting the image into small sub-images and identifying their contents. Drawing on concepts of human attention, the literature defines saliency as a computational representation of attention, and different algorithms have been developed to extract salient regions. These salient regions, which attract human attention, are used to identify the most important regions containing important objects in the image. In this paper, we introduce a new algorithm that utilises saliency principles to identify the contents of an image and to search for similar objects in the images stored in a database. We also demonstrate that the use of salient objects produces better and more accurate results in the image retrieval process. A new retrieval algorithm is therefore presented, focused on identifying the objects extracted from the salient regions. To assess the efficiency of the proposed algorithm, a new evaluation method is also proposed that takes the order of the retrieved images into account.
Pages: 1-15. Citations: 5.
(A code sketch of saliency-restricted feature matching follows this entry.)
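An illustrative retrieval loop that restricts feature extraction to the salient region, using the spectral-residual saliency detector (requires opencv-contrib-python) and ORB descriptors; scoring database images by raw match counts is a stand-in, not the retrieval algorithm or the rank-aware evaluation measure proposed in the paper. The image paths are placeholders.

```python
import cv2
import numpy as np

def salient_descriptors(path, orb):
    """Compute ORB descriptors only inside the salient region of an image."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    sal = cv2.saliency.StaticSaliencySpectralResidual_create()
    ok, smap = sal.computeSaliency(img)
    mask = (smap > smap.mean()).astype(np.uint8) * 255   # keep the salient region
    _, desc = orb.detectAndCompute(img, mask)
    return desc

orb = cv2.ORB_create()
query = salient_descriptors("query.jpg", orb)            # placeholder paths
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

scores = {}
for path in ["db_001.jpg", "db_002.jpg"]:
    desc = salient_descriptors(path, orb)
    if desc is None or query is None:
        scores[path] = 0
        continue
    scores[path] = len(matcher.match(query, desc))       # more matches = more similar

print(sorted(scores.items(), key=lambda kv: -kv[1]))     # ranked retrieval result
```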
Processing and Representation of Multispectral Images Using Deep Learning Techniques
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-01-12. DOI: 10.5565/REV/ELCVIA.1246
Author: Patricia L. Suárez
Abstract: This thesis implements innovative computer vision techniques for visible and near-infrared spectrum images, applying deep learning through convolutional networks, especially GAN architectures, which specialise in generating information, and also includes meta-learning techniques to tackle the problem of determining the similarity of images from different spectra. With this type of convolutional network, different supervised and unsupervised techniques have been created to solve challenging problems such as detecting the similarity of patches from different spectra (visible and infrared), colourising near-infrared images, estimating the vegetation index (NDVI), and removing the haze present in RGB images using NIR images. For all these techniques, different GAN variants, such as standard, conditional, stacked, and cyclic, have been used, and a metric-based meta-learning approach has also been implemented. Together with the adversarial network models, the use of multiple loss functions has been proposed to improve generalisation and increase the effectiveness of the models. The experiments were performed with paired and unpaired images, given the supervised and unsupervised architectures implemented, respectively. In each of the approaches implemented in this doctoral work, the experimental results were shown to be more effective than state-of-the-art techniques.
Citations: 0.
(A code sketch of the NDVI computation follows this entry.)
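For reference, the vegetation index mentioned above has a simple closed form, NDVI = (NIR - Red) / (NIR + Red); the thesis estimates it with GANs rather than computing it from the bands directly, which is not reproduced here. The band file names are placeholders.

```python
import numpy as np
from skimage import io

nir = io.imread("band_nir.tif").astype(np.float64)   # near-infrared band (placeholder)
red = io.imread("band_red.tif").astype(np.float64)   # visible red band (placeholder)

# Standard NDVI definition; the small epsilon avoids division by zero.
ndvi = (nir - red) / (nir + red + 1e-9)
print("NDVI range:", ndvi.min(), ndvi.max())         # values lie in [-1, 1]
```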
Processing historical photographs and film footage with Photogrammetry and Artificial Intelligence for Cultural Heritage documentation and virtual reconstruction
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-01-12. DOI: 10.5565/REV/ELCVIA.1323
Author: F. Condorelli
Abstract: The specific objective of this thesis is to offer an excursion through the metric potential of the different data available in historical archives, considering the essential role of photogrammetry. The aim is to explore how metric information about buildings that no longer exist or have been transformed over time can be extracted from old photographs and videos of varying quality for their 3D virtual reconstruction, analysing the material stored in historical archives to support researchers and experts in the historical study of Cultural Heritage. In order to process these data and obtain metrically certified results, the algorithms of the standard photogrammetric pipeline had to be modified. This was achieved with open-source Structure-from-Motion algorithms and the creation of a specific benchmark to compare the results. Besides the processing of historical photographs, photogrammetry is combined with Artificial Intelligence to improve the search for architectural heritage in video material and to reduce the time and effort required for an archive operator to examine it manually.
Citations: 0.
(A code sketch of two-view pose recovery follows this entry.)
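A sketch of the core two-view step of an open-source Structure-from-Motion pipeline applied to two archival frames: feature matching followed by relative pose recovery with OpenCV. The intrinsic matrix K is an assumption (historical cameras are usually uncalibrated), the frame paths are placeholders, and this is only one stage of the full pipeline and benchmark described in the thesis.

```python
import cv2
import numpy as np

img1 = cv2.imread("archive_frame_001.png", cv2.IMREAD_GRAYSCALE)  # placeholder frames
img2 = cv2.imread("archive_frame_002.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Ratio-test matching, as commonly used in open-source SfM pipelines.
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

K = np.array([[1000.0, 0, img1.shape[1] / 2],       # assumed focal length and
              [0, 1000.0, img1.shape[0] / 2],       # principal point
              [0, 0, 1.0]])

# Essential matrix with RANSAC, then the relative rotation and translation.
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
print("Relative rotation:\n", R, "\nTranslation direction:", t.ravel())
```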
Distilling Structure from Imagery: Graph-based Models for the Interpretation of Document Images
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-01-12. DOI: 10.5565/REV/ELCVIA.1313
Author: Pau Riba
Abstract: From its early stages, the Pattern Recognition and Computer Vision community has recognised the importance of leveraging structural information when understanding images. Graphs are usually proposed as a suitable model to represent this kind of information because of their flexibility and representational power, able to encode both the components (objects or entities) and their pairwise relationships. Even though graphs have been successfully applied to a huge variety of tasks, as a result of their symbolic and relational nature they have always suffered from some limitations compared with statistical approaches. Indeed, some trivial mathematical operations have no equivalent in the graph domain. For instance, at the core of many pattern recognition applications lies the need to compare two objects; this operation, which is trivial for feature vectors defined in ℝ^n, is not properly defined for graphs. In this thesis, we investigate the importance of structural information from two perspectives: traditional graph-based methods and recent advances in Geometric Deep Learning. On the one hand, we explore the problem of defining a graph representation and how to deal with it in a large-scale, noisy scenario. On the other hand, Graph Neural Networks are proposed, first, to redefine Graph Edit Distance methodologies as a metric-learning problem and, second, to apply them in a real use case: the detection of the repetitive patterns that define tables in invoice documents. As an experimental framework, we validate the different methodological contributions in the domain of Document Image Analysis and Recognition.
Citations: 0.
(A code sketch of exact graph edit distance follows this entry.)
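For context, the exact graph edit distance that the thesis recasts as a metric-learning problem can be computed directly on toy graphs with NetworkX; the exponential cost of this exact computation is precisely why a learned approximation is attractive. The graphs below are toy examples with arbitrary node names, not document graphs from the thesis.

```python
import networkx as nx

# Two small "document" graphs: nodes could stand for words or table cells,
# edges for their spatial relations.
g1 = nx.Graph()
g1.add_edges_from([("a", "b"), ("b", "c"), ("c", "a")])   # a triangle

g2 = nx.Graph()
g2.add_edges_from([("a", "b"), ("b", "c")])               # a path

# Minimum number of node/edge insertions, deletions and substitutions that
# turn g1 into g2 (unit costs by default).
print(nx.graph_edit_distance(g1, g2))   # -> 1.0 (delete the edge c-a)
```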