Electronic Letters on Computer Vision and Image Analysis: Latest Articles

Multi-staged Feature-Attentive Network for Fashion Clothing Classification and Attribute Prediction
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2022-01-25. DOI: 10.5565/rev/elcvia.1409
Majuran Shajini, A. Ramanan
The success of deep learning has drawn many researchers to visual fashion clothing analysis. In this work, we introduce a multi-staged feature-attentive network for clothing category classification and attribute prediction. The proposed network is landmark-independent, whereas existing landmark-dependent structures require considerable manual effort for landmark annotation and suffer from inter- and intra-individual variability. Our focus is on strengthening feature extraction by fusing low-level and high-level features within the fashion network, using multi-level contextual features that exploit spatial and channel-wise information to provide contextual feature supervision. We further incorporate a semi-supervised learning approach that shares knowledge between labelled and unlabelled data. To the best of our knowledge, this is the first attempt to investigate semi-supervised learning in fashion clothing analysis with a multitask architecture that jointly learns clothing categories and their attributes. We evaluated the proposed approach on the large-scale DeepFashion-C dataset, with unlabelled data drawn from six publicly available fashion datasets. Experimental results show that the proposed supervised and semi-supervised deep convolutional architectures considerably outperform many state-of-the-art techniques in fashion clothing analysis.
Citations: 1
Analysis of the Measurement Matrix in Directional Predictive Coding for Compressive Sensing of Medical Images
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2022-01-25. DOI: 10.5565/rev/elcvia.1412
Hepzibah Christinal A, Kowsalya G, Abraham Chandy D, J. S, Chandrajit L. Bajaj
Compressive sensing of 2D signals involves three fundamental steps: sparse representation, a linear measurement matrix, and recovery of the signal. This paper analyses the efficiency of various measurement matrices for compressive sensing of medical images based on directional predictive coding. During encoding, the prediction is chosen from four directional predictive modes for block-based compressive sensing measurements; Gaussian, Bernoulli, Laplace, Logistic, and Cauchy random matrices are used as measurement matrices. During decoding, the same optimal prediction is de-quantized. Peak signal-to-noise ratio and sparsity are used to evaluate the measurement matrices. Experimental results show that spatially directional predictive coding (SDPC) with Laplace measurement matrices performs better than scalar quantization (SQ) and differential pulse code modulation (DPCM), and indicate that the Laplace measurement matrix is the most suitable for compressive sensing of medical images.
Citations: 1
Deep Learning Based Models for Offline Gurmukhi Handwritten Character and Numeral Recognition
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2022-01-18. DOI: 10.5565/rev/elcvia.1282
M. K. Mahto, K. Bhatia, R. Sharma
Over the last few years, several researchers have worked on handwritten character recognition and proposed various techniques to improve recognition performance for Indic and non-Indic scripts. Here, a deep convolutional neural network is proposed that learns deep features for offline Gurmukhi handwritten character and numeral recognition (HCNR). The proposed network trains and tests efficiently and exhibits good recognition performance. Two primary datasets, comprising offline handwritten Gurmukhi characters and Gurmukhi numerals, are employed in the present work. The testing accuracies achieved with the proposed network are 98.5% for characters and 98.6% for numerals.
Citations: 5
An Efficient BoF Representation for Object Classification
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-12-16. DOI: 10.5565/rev/elcvia.1403
V. Vinoharan, A. Ramanan
The bag-of-features (BoF) approach has proved to yield good performance in patch-based object classification systems owing to its simplicity. However, the very large number of patch-based descriptors (such as scale-invariant feature transform and speeded-up robust features) extracted from images to create a BoF vector often leads to huge computational cost and increased storage requirements. This paper demonstrates a two-stage approach to creating a discriminative and compact BoF representation for object classification. As a preprocessing stage to codebook construction, ambiguous patch-based descriptors are eliminated with an entropy-based, one-pass feature selection approach so that only high-quality descriptors are retained. As a post-processing stage to codebook construction, codewords that are not activated often enough in images are eliminated from the initially constructed codebook based on statistical measures. Finally, each patch-based descriptor of an image is assigned to the closest codeword to create a histogram representation, which is classified with a one-versus-all support vector machine. The proposed methods are evaluated on benchmark image datasets. Test results show that they make the codebook more discriminative and compact in moderately sized visual object classification tasks.
Citations: 0
Underwater Acoustic Image Denoising Using Stationary Wavelet Transform and Various Shrinkage Functions
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-09-14. DOI: 10.5565/rev/elcvia.1360
P. Ravisankar
Underwater acoustic images are captured by sonar, which uses sound as its source. Noise in acoustic images arises during acquisition; it may be multiplicative in nature and can seriously degrade visual quality. Image denoising techniques that remove such noise commonly use linear and non-linear filters. In this paper, a wavelet-based denoising method is used to reduce noise in the images. The image is decomposed by the Stationary Wavelet Transform (SWT) into low- and high-frequency components. Shrinkage functions such as VisuShrink and SureShrink are used to select the threshold that removes the undesirable signal in the low-frequency component, while high-frequency components such as edges and corners are retained. The inverse SWT then reconstructs the denoised image by combining the modified low-frequency components with the high-frequency components. Peak signal-to-noise ratio (PSNR) is reported for various wavelets, such as Haar, Daubechies, and Coiflet, and for different thresholding methods.
Citations: 1
Accuracy improvement of the InSAR quality-guided phase unwrapping based on a modified PDV map
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-08-18. DOI: 10.5565/REV/ELCVIA.1220
Tarek Bentahar
In this paper, an accuracy improvement of the quality-guided phase unwrapping algorithm is proposed. The proposal is based on a modified phase derivative variance (PDV) that captures more detail about local variations, especially for important patterns such as fringes and edges, so that distorted regions can be re-unwrapped according to this more reliable PDV. The improvement is effective not only in accuracy but also in time: the results show that the running time of the proposal is lower than that of a capable optimization-based algorithm. To demonstrate effectiveness, experiments are carried out on simulated and real data, and comparisons are made under several relevant criteria.
Citations: 0
Video Summarization for Multiple Sports Using Deep Learning
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-08-14. DOI: 10.5565/rev/elcvia.1286
Chakradhar Guntuboina, Aditya Porwal, Preety Jain, Hansa Shingrakhia
This paper proposes a computationally inexpensive method for automatic key-event extraction and subsequent summarization of sports videos using scoreboard detection. A database of 1300 images was used to train a supervised object detection algorithm, YOLO (You Only Look Once). For each frame of the video, once the scoreboard is detected by YOLO, it is cropped out of the image, and image processing techniques are applied to the cropped scoreboard to reduce noise and false positives. Finally, the processed image is passed through an optical character recognizer (OCR) to read the score, and a rule-based algorithm is run on the OCR output to generate the timestamps of key events for the given game. The proposed method is best suited to users who want to analyse games and need precise timestamps of important events. The performance of the proposed design was tested on videos of the Bundesliga, the English Premier League, ICC WC 2019, IPL 2019, and the Pro Kabaddi League; an average F1 score of 0.979 was achieved in the simulations. The algorithm is trained on five different classes across three separate games (soccer, cricket, kabaddi). The design is implemented in Python 3.7.
Citations: 9
Modelling and Analysis of Facial Expressions Using Optical Flow Derived Divergence and Curl Templates
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-06-01. DOI: 10.5565/REV/ELCVIA.1275
Shivangi Anthwal
Facial expressions are an integral part of non-verbal, paralinguistic communication, as they provide cues significant in perceiving a person's emotional state. Assessing emotions through expressions is an active research area in computer vision because of its potential applications across many domains. In this work, facial expressions are modelled and analysed with dense-optical-flow-derived divergence and curl templates that embody the ideal motion pattern of facial features as an expression unfolds on the face. Two classification schemes, based on a multi-class support vector machine and k-nearest neighbours, are employed for evaluation. Promising results from a comparative analysis of the proposed approach with state-of-the-art techniques on the Extended Cohn-Kanade database, and with human cognition and the pre-trained Microsoft Face application programming interface on the Karolinska Directed Emotional Faces database, validate the efficiency of the approach.
Citations: 0
Identification of Suitable Contrast Enhancement Technique for Improving the Quality of Astrocytoma Histopathological Images
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-05-27. DOI: 10.5565/REV/ELCVIA.1256
F. A. Dzulkifli
Contrast enhancement plays an important part in image processing. In histology, contrast enhancement is necessary because it helps pathologists diagnose sample slides by increasing the visibility of the morphological features of cells in an image. Various techniques have been proposed to enhance the contrast of microscopic images, and this paper studies their effectiveness in enhancing Ki67 images of astrocytoma. Three techniques (contrast stretching, histogram equalization, and CLAHE) were applied to the sample images, and the performance of each was compared by computing seven quantitative measures. The CLAHE technique was preferred for enhancing the contrast of the astrocytoma images: it produced good results particularly in contrast enhancement, edge conservation and enhancement, brightness preservation, and minimal distortion of the enhanced images.
Citations: 1
Social Video Advertisement Replacement and its Evaluation in Convolutional Neural Networks
Electronic Letters on Computer Vision and Image Analysis. Pub Date: 2021-05-27. DOI: 10.5565/REV/ELCVIA.1347
Cheng Yang, Xiang Yu, Arun Kumar, G. Ali, P. H. Chong, P. P. Lam
This paper introduces a method that uses deep convolutional neural networks (CNNs) to automatically replace an advertisement (AD) photo in social (or self-media) videos, and provides a suitable evaluation method for comparing different CNNs. An AD photo can replace a picture inside a video; however, if a person occludes the replaced picture in the original video, the newly pasted AD photo will cover the occluding part of the person. A deep learning algorithm is therefore used to segment the person from the video, and the segmented human pixels are pasted back over the occluded area so that the AD photo replacement looks natural in the video. This process requires the predicted occlusion edge to be close to the ground-truth occlusion edge so that the AD photo is occluded naturally; this research therefore introduces a curve-fitting method to measure the error of the predicted occlusion edge. Using this method, three CNN approaches are applied and compared for AD replacement: Mask R-CNN (mask of regions convolutional neural network), ROVS (a recurrent network for video object segmentation), and DeepLabV3. The experimental results compare the segmentation accuracy of the different models, with DeepLabV3 showing the best performance.
Citations: 0