2020 IEEE International Conference on Image Processing (ICIP): Latest Publications

Video-Based Coding Of Volumetric Data
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190689
D. Graziosi, B. Kroon
{"title":"Video-Based Coding Of Volumetric Data","authors":"D. Graziosi, B. Kroon","doi":"10.1109/ICIP40778.2020.9190689","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190689","url":null,"abstract":"New standards are emerging for the coding of volumetric 3D data such as immersive video and point clouds. Some of these volumetric encoders similarly utilize video codecs as the core of their compression approach, but apply different techniques to convert volumetric 3D data into 2D content for subsequent 2D video compression. Currently in MPEG there are two activities that follow this paradigm: ISO/IEC 23090-5 Video-based Point Cloud Compression (V-PCC) and ISO/IEC 23090-12 MPEG Immersive Video (MIV). In this article we propose for both standards to define 2D projection as common transmission format. We then describe a procedure based on camera projections that is applicable to both standards to convert 3D information into 2D images for efficient 2D compression. Results show that our approach successfully encodes both point clouds and immersive video content with the same performance as the current test models that MPEG experts developed separately for the respective standards. We conclude the article by discussing further integration steps and future directions.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114972387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Improved Intra Coding Beyond AV1 Using Adaptive Prediction Angles and Reference Lines
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191279
Liang Zhao, Xin Zhao, Shan Liu
{"title":"Improved Intra Coding Beyond AV1 Using Adaptive Prediction Angles and Reference Lines","authors":"Liang Zhao, Xin Zhao, Shan Liu","doi":"10.1109/ICIP40778.2020.9191279","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191279","url":null,"abstract":"AV1 employs a fixed set of intra prediction angles, using the reconstructed samples in the adjacent reference line, to remove the spatial redundancy of video signals. Two methods are proposed in this paper to further improve the intra coding performance of AV1. Firstly, to better signal the intra prediction modes, only a subset of the intra prediction modes (IPMs) is allowed and signaled for each block, which is adaptively selected according to the IPMs of neighboring blocks. Secondly, to reduce the prediction errors when there is a strong discontinuity between the samples in the current block and its adjacent reference samples, an adaptive reference line selection method is proposed by enabling farther reference lines for intra prediction. Experimental results show that the proposed methods achieve 2.2% luma BD-rate savings with around 150% encoding time for intra coding on top of the libaom implementation of AV1.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116683439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A Cross-Modal Variational Framework For Food Image Analysis
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190758
T. Theodoridis, V. Solachidis, K. Dimitropoulos, P. Daras
{"title":"A Cross-Modal Variational Framework For Food Image Analysis","authors":"T. Theodoridis, V. Solachidis, K. Dimitropoulos, P. Daras","doi":"10.1109/ICIP40778.2020.9190758","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190758","url":null,"abstract":"Food analysis resides at the core of modern nutrition recommender systems, providing the foundation for a high-level understanding of users’ eating habits. This paper focuses on the sub-task of ingredient recognition from food images using a variational framework. The framework consists of two variational encoder-decoder branches, aimed at processing information from different modalities (images and text), as well as a variational mapper branch, which accomplishes the task of aligning the distributions of the individual branches. Experimental results on the Yummly-28K data-set showcase that the proposed framework performs better than similar variational frameworks, while it surpasses current state-of-the-art approaches on the large-scale Recipe1M data-set.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116956258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Optimization Of Motion Compensation Based On GPU And CPU For VVC Decoding
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190708
Xu Han, Shanshe Wang, Siwei Ma, Wen Gao
{"title":"Optimization Of Motion Compensation Based On GPU And CPU For VVC Decoding","authors":"Xu Han, Shanshe Wang, Siwei Ma, Wen Gao","doi":"10.1109/ICIP40778.2020.9190708","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190708","url":null,"abstract":"To achieve higher compression efficiency, the emerging video coding standard Versatile Video Coding (VVC) introduces a large number of new coding technologies, which significantly increase the computational complexity of the decoder. Among these technologies, the inter prediction methods, including affine motion compensation and decoder-side motion vector refinement (DMVR), make inter prediction the most time-consuming module and bring new challenges for real-time decoding. In this paper, we propose an efficient GPU-based motion compensation scheme to speed up decoding. Through re-partitioning of coding units (CUs) according to data dependency and different thread organization methods for different situations, the computational resources of the GPU are utilized efficiently. Experiments on an NVIDIA GeForce RTX 2080Ti GPU show that motion compensation can be done in 5 ms for Ultra HD 4K, which means the decoding speed is accelerated by 16 times compared to the VVC reference software on CPU.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"179 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120973594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Residual Networks Based Distortion Classification and Ranking for Laparoscopic Image Quality Assessment
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191111
Zohaib Amjad Khan, Azeddine Beghdadi, M. Kaaniche, F. A. Cheikh
{"title":"Residual Networks Based Distortion Classification and Ranking for Laparoscopic Image Quality Assessment","authors":"Zohaib Amjad Khan, Azeddine Beghdadi, M. Kaaniche, F. A. Cheikh","doi":"10.1109/ICIP40778.2020.9191111","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191111","url":null,"abstract":"Laparoscopic images and videos are often affected by different types of distortion like noise, smoke, blur and nonuniform illumination. Automatic detection of these distortions, followed generally by application of appropriate image quality enhancement methods, is critical to avoid errors during surgery. In this context, a crucial step involves an objective assessment of the image quality, which is a two-fold problem requiring both the classification of the distortion type affecting the image and the estimation of the severity level of that distortion. Unlike existing image quality measures which focus mainly on estimating a quality score, we propose in this paper to formulate the image quality assessment task as a multi-label classification problem taking into account both the type as well as the severity level (or rank) of distortions. Here, this problem is then solved by resorting to a deep neural networks based approach. The obtained results on a laparoscopic image dataset show the efficiency of the proposed approach.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127365903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Parallax Motion Effect Generation Through Instance Segmentation And Depth Estimation
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191168
A. Pinto, Manuel Alberto Cordova Neira, L. G. L. Decker, J. L. Flores-Campana, M. R. Souza, A. Santos, Jhonatas S. Conceição, H. F. Gagliardi, D. Luvizon, R. Torres, H. Pedrini
{"title":"Parallax Motion Effect Generation Through Instance Segmentation And Depth Estimation","authors":"A. Pinto, Manuel Alberto Cordova Neira, L. G. L. Decker, J. L. Flores-Campana, M. R. Souza, A. Santos, Jhonatas S. Conceição, H. F. Gagliardi, D. Luvizon, R. Torres, H. Pedrini","doi":"10.1109/ICIP40778.2020.9191168","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191168","url":null,"abstract":"Stereo vision is a growing topic in computer vision due to the innumerable opportunities and applications this technology offers for the development of modern solutions, such as virtual and augmented reality applications. To enhance the user’s experience in three-dimensional virtual environments, the motion parallax estimation is a promising technique to achieve this objective. In this paper, we propose an algorithm for generating parallax motion effects from a single image, taking advantage of state-of-the-art instance segmentation and depth estimation approaches. This work also presents a comparison against such algorithms to investigate the trade-off between efficiency and quality of the parallax motion effects, taking into consideration a multi-task learning network capable of estimating instance segmentation and depth estimation at once. Experimental results and visual quality assessment indicate that the PyD-Net network (depth estimation) combined with Mask R-CNN or FBNet networks (instance segmentation) can produce parallax motion effects with good visual quality.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127434208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Online Learning for Beta-Liouville Hidden Markov Models: Incremental Variational Learning for Video Surveillance and Action Recognition
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191144
Samr Ali, N. Bouguila
{"title":"Online Learning for Beta-Liouville Hidden Markov Models: Incremental Variational Learning for Video Surveillance and Action Recognition","authors":"Samr Ali, N. Bouguila","doi":"10.1109/ICIP40778.2020.9191144","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191144","url":null,"abstract":"Challenges in the real-time installation of surveillance systems are an active area of research, especially with the use of adaptable machine learning techniques. In this paper, we propose the use of variational learning of Beta-Liouville (BL) hidden Markov models (HMM) for action recognition (AR) in an online setup. This proposed incremental framework enables continuous adjustment of the system for better modelling. We evaluate the proposed model on the visible IOSB dataset to validate the framework.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124824291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Prediction-Decision Network For Video Object Tracking
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191145
Yasheng Sun, Tao He, Ying-hong Peng, Jin Qi, Jie Hu
{"title":"Prediction-Decision Network For Video Object Tracking","authors":"Yasheng Sun, Tao He, Ying-hong Peng, Jin Qi, Jie Hu","doi":"10.1109/ICIP40778.2020.9191145","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191145","url":null,"abstract":"In this paper, we introduce an approach for visual tracking in videos that predicts the bounding box location of a target object at every frame. This tracking problem is formulated as a sequential decision-making process where both historical and current information are taken into account to decide the correct object location. We develop a deep reinforcement learning based strategy, via which the target object position is predicted and decided in a unified framework. Specifically, an RNN-based prediction network is developed where local features and global features are fused together to predict object movement. Together with the predicted movement, some predefined possible offsets and detection results form an action space. A decision network is trained with reinforcement learning to select the most reasonable tracking box from the action space, through which the target object is tracked at each frame. Experiments on an existing tracking benchmark demonstrate the effectiveness and robustness of our proposed strategy.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"832 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125075095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Spatio-Angular Binary Descriptor For Fast Light Field Inter View Matching
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9191118
Martin Alain, A. Smolic
{"title":"A Spatio-Angular Binary Descriptor For Fast Light Field Inter View Matching","authors":"Martin Alain, A. Smolic","doi":"10.1109/ICIP40778.2020.9191118","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9191118","url":null,"abstract":"Light fields are able to capture light rays from a scene arriving at different angles, effectively creating multiple perspective views of the same scene. Thus, one of the flagship applications of light fields is to estimate the captured scene geometry, which can notably be achieved by establishing correspondences between the perspective views, usually in the form of a disparity map. Such correspondence estimation has been a long standing research topic in computer vision, with application to stereo vision or optical flow. Research in this area has shown the importance of well designed descriptors to enable fast and accurate matching. We propose in this paper a binary descriptor exploiting the light field gradient over both the spatial and the angular dimensions in order to improve inter view matching. We demonstrate in a disparity estimation application that it can achieve comparable accuracy compared to existing descriptors while being faster to compute.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126164837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
CDVA/VCM: Language for Intelligent and Autonomous Vehicles
2020 IEEE International Conference on Image Processing (ICIP) Pub Date : 2020-10-01 DOI: 10.1109/ICIP40778.2020.9190735
Baohua Sun, Hao Sha, M. Rafie, Lin Yang
{"title":"CDVA/VCM: Language for Intelligent and Autonomous Vehicles","authors":"Baohua Sun, Hao Sha, M. Rafie, Lin Yang","doi":"10.1109/ICIP40778.2020.9190735","DOIUrl":"https://doi.org/10.1109/ICIP40778.2020.9190735","url":null,"abstract":"Intelligent transportation is a complex system that involves the interaction of connected technologies, including Smart Sensors, Intelligent and Autonomous Vehicles, High Precision Maps, and 5G. The coordination of all these machines mandates a common language that serves as a protocol for intelligent machines to communicate. International standards serve as the global protocol to satisfy industry needs at the product level. MPEG-CDVA is the official ISO standard for search and retrieval applications, providing Compact Descriptors for Video Analysis (CDVA). It is robust and enables efficient implementations on embedded systems. CDVA is the first-generation language for images/videos. MPEG-VCM is developing advanced features beyond CDVA for the next generation, Video Coding for Machines (VCM). With the wide availability of low-power AI chips, CDVA and VCM are ready to deploy and serve as the language for intelligent and autonomous vehicles. In this paper, we demonstrate the use of the SuperCDVA and Closed Captioning CDVA algorithms for intelligent and autonomous vehicles. Concepts are borrowed from the Super Characters algorithm in Natural Language Processing. In order for intelligent and autonomous vehicles to understand events on the road, the CDVA vectors are organized into an image to represent the story of the video.","PeriodicalId":405734,"journal":{"name":"2020 IEEE International Conference on Image Processing (ICIP)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123288000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4