Texture-Guided End-to-End Depth Map Compression
Bo Peng, Yuying Jing, Dengchao Jin, Xiangrui Liu, Zhaoqing Pan, Jianjun Lei
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897569
Abstract: End-to-end compression methods designed for texture images have achieved excellent coding performance. However, because of the characteristic differences between depth maps and texture images, texture-oriented methods are limited in depth map compression. To address this problem, this paper proposes a texture-guided end-to-end depth map compression network (TDMC-Net). TDMC-Net is mainly composed of a texture-guided transform module (TTM), which performs a nonlinear transform with texture context to reduce redundancy in the depth features, and a texture-guided conditional entropy model (TCEM), which improves the entropy model by introducing a texture conditional prior. Experimental results show that TDMC-Net boosts depth coding efficiency by utilizing texture information and achieves superior performance.

Representation Learning Using Rank Loss for Robust Neurosurgical Skills Evaluation
Britty Baby, Mustafa Chasmai, Tamajit Banerjee, A. Suri, Subhashis Banerjee, Chetan Arora
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897932
Abstract: Surgical simulators provide hands-on training of the necessary psychomotor skills. Automated skill evaluation of trainee doctors from videos of the tasks they perform is a key step toward the optimal utilization of such simulators. However, current skill evaluation techniques require accurate tracking information for the instruments, which restricts their applicability to robot-assisted surgeries. In this paper, we propose a novel neural network architecture that performs skill evaluation from video data alone, with no tracking information. Given the small dataset available for training such a system, a network trained with an ℓ2 regression loss easily overfits the training data. We propose a novel rank loss that helps learn a robust representation, leading to a 5% improvement in skill score prediction on the benchmark JIGSAWS dataset. To demonstrate the applicability of our method to non-robotic surgeries, we contribute a new neuro-endoscopic technical skills (NETS) training dataset comprising 100 short videos of 12 subjects. Our method achieves a 27% improvement over the state of the art on the NETS dataset. The project page, with source code and data, is available at nets-iitd.github.io/nets-v1.

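The abstract does not give the exact form of the rank loss. As a rough illustration only, a pairwise margin-based rank term can be added to the ℓ2 regression objective, so that mis-ordered skill-score pairs are penalised even when the plain regression error is small. The function below is a minimal numpy sketch under that assumption (the `margin`, `alpha` hyperparameters are invented for the example); it is not the paper's implementation.

```python
import numpy as np

def rank_regression_loss(pred, target, margin=0.1, alpha=1.0):
    """Combine l2 regression with a pairwise ranking hinge.

    For every pair (i, j) with target[i] > target[j], penalise the
    prediction if pred[i] does not exceed pred[j] by at least `margin`.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    l2 = np.mean((pred - target) ** 2)

    # Pairwise differences; keep only pairs whose targets are ordered.
    dp = pred[:, None] - pred[None, :]      # pred[i] - pred[j]
    dt = target[:, None] - target[None, :]  # target[i] - target[j]
    ordered = dt > 0
    hinge = np.maximum(0.0, margin - dp)    # violated when pred[i] < pred[j] + margin
    rank = hinge[ordered].mean() if ordered.any() else 0.0
    return l2 + alpha * rank
```

With this combined objective, predictions that preserve the ground-truth ordering by a comfortable margin incur no ranking penalty, while swapped pairs are penalised regardless of how small their ℓ2 error is.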
Panoptic-Deeplab-DVA: Improving Panoptic Deeplab with Dual Value Attention and Instance Boundary Aware Regression
Qingfeng Liu, Mostafa El-Khamy
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897430
Abstract: Panoptic DeepLab is a state-of-the-art framework that has shown a good trade-off between performance and complexity. In this paper, we focus on improving it so that low-complexity panoptic segmentation can be widely deployed on mobile devices. Specifically, we first present a novel Dual Value Attention (DVA) module that enables context-information exchange between the semantic segmentation branch and the instance segmentation branch. Second, we propose a new instance Boundary Aware Regression (iBAR) loss that places more emphasis on instance boundaries during instance regression. To assess the effectiveness of our approach, we evaluate its panoptic segmentation performance on the MSCOCO dataset, showing that it improves upon the state-of-the-art Panoptic DeepLab with both the lightweight backbone MobileNetV3 and the heavyweight backbone HRNetV2.

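The abstract does not define the iBAR loss precisely. A common way to emphasise instance boundaries in a regression loss is to build a per-pixel weight map that up-weights pixels whose neighbours carry a different instance id, then use it in a weighted regression. The numpy sketch below illustrates only that generic idea (the boundary weight of 5 is an assumed value), not the paper's loss.

```python
import numpy as np

def boundary_weight_map(mask, w_boundary=5.0):
    """Per-pixel weights that emphasise instance boundaries.

    A pixel is a boundary pixel if any 4-neighbour has a different
    instance id in `mask`; boundary pixels get weight `w_boundary`,
    all other pixels get weight 1.
    """
    m = np.asarray(mask)
    edge = np.zeros(m.shape, dtype=bool)
    edge[:-1, :] |= m[:-1, :] != m[1:, :]   # differs from neighbour below
    edge[1:, :] |= m[1:, :] != m[:-1, :]    # differs from neighbour above
    edge[:, :-1] |= m[:, :-1] != m[:, 1:]   # differs from neighbour right
    edge[:, 1:] |= m[:, 1:] != m[:, :-1]    # differs from neighbour left
    return np.where(edge, w_boundary, 1.0)

def boundary_aware_l2(pred, target, mask, w_boundary=5.0):
    """Weighted l2 regression loss with extra emphasis on boundaries."""
    w = boundary_weight_map(mask, w_boundary)
    diff = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return float(np.sum(w * diff ** 2) / np.sum(w))
```

The normalisation by the weight sum keeps the loss scale comparable to an unweighted mean, so the boundary weight changes only the relative emphasis, not the overall magnitude.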
GCN-Based Multi-Modal Multi-Label Attribute Classification in Anime Illustration Using Domain-Specific Semantic Features
Ziwen Lan, Keisuke Maeda, Takahiro Ogawa, M. Haseyama
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9898071
Abstract: This paper presents a multi-modal multi-label attribute classification model for anime illustrations based on Graph Convolutional Networks (GCN) using domain-specific semantic features. In animation production, creators often intentionally highlight subtle characteristics of the characters and objects when drawing anime illustrations, so we focus on the task of multi-label attribute classification. To capture the relationships between attributes, we construct a multi-modal GCN model that can adopt semantic features specific to anime illustrations. To generate the domain-specific semantic features that represent the semantic content of anime illustrations, we construct a new captioning framework for anime illustrations by combining real images and their style transformations. The contributions of the proposed method are twofold: 1) more comprehensive relationships between attributes are captured by introducing a GCN with semantic features into the multi-label attribute classification task for anime illustrations; 2) more accurate image captions for anime illustrations can be generated by a model trained using only real-world images. To the best of our knowledge, this is the first work dealing with multi-label attribute classification in anime illustration. Experimental results show the effectiveness of the proposed method in comparison with existing methods, including the state of the art.

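For intuition on the GCN component: a single graph-convolution layer propagates node (here, attribute) features over the normalised relation graph. The following is a generic numpy sketch of one Kipf-and-Welling-style layer, not the paper's specific network; the adjacency, features, and weights are placeholders.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN layer: relu(D^{-1/2} (A + I) D^{-1/2} X W).

    A: (N, N) adjacency of the attribute-relation graph.
    X: (N, F) node features (e.g. attribute semantic embeddings).
    W: (F, F') learnable weight matrix.
    """
    A_hat = A + np.eye(len(A))                   # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(0.0, A_norm @ X @ W)       # ReLU activation
```

With no edges, the layer degenerates to a per-node dense layer; with edges, each attribute's representation mixes in its related attributes, which is how co-occurrence relationships between labels are captured.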
Undersampled Dynamic Fourier Ptychography via Phaseless PCA
Zhengyu Chen, Seyedehsara Nayer, Namrata Vaswani
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897747
Abstract: In recent work, we studied the phaseless PCA (low-rank phase retrieval) problem and developed a provably correct and fast alternating-minimization (AltMin) solution called AltMinLowRaP. In this work, we develop a modification of AltMinLowRaP, called AltMinLowRaP-Ptych, designed to reduce the sample complexity (the number of measurements required for accurate recovery) of dynamic Fourier ptychographic imaging. Fourier ptychography is a computational imaging technique that enables high-resolution microscopy using multiple low-resolution cameras. Through extensive experiments on real image sequences with simulated ptychographic measurements, we show the power of our algorithm for reducing the number of samples required for accurate recovery.

PCA Event-Based Optical Flow: A Fast and Accurate 2D Motion Estimation
M. Khairallah, Fabien Bonardi, D. Roussel, S. Bouchafa
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897875
Abstract: For neuromorphic vision sensors such as event-based cameras, a paradigm shift is required to adapt optical flow estimation, which is critical for many applications. To address the costly computations involved, a Principal Component Analysis (PCA) approach is adapted to the problem of event-based optical flow estimation. We propose several PCA regularization methods that efficiently enhance optical flow estimation. Furthermore, we show that the variants of our method dedicated to real-time contexts are about two times faster than state-of-the-art implementations while significantly improving optical flow accuracy.

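The abstract does not detail the proposed PCA regularizers, but the classical role of PCA in event-based flow is local plane fitting: the events fired by a moving edge lie near a plane in (x, y, t), whose normal (the smallest-eigenvalue eigenvector of the event covariance) gives the time-surface gradient and hence the normal flow. The numpy sketch below shows only that textbook baseline for intuition, not the paper's method.

```python
import numpy as np

def flow_from_events(events):
    """Estimate local optical flow from an event cloud of rows (x, y, t).

    Fits a plane to the events via PCA: the eigenvector of the covariance
    with the smallest eigenvalue is the plane normal (a, b, c). The
    time-surface gradient is then (-a/c, -b/c), and the (normal) flow
    is grad_t / |grad_t|^2.
    """
    pts = np.asarray(events, dtype=float)
    centered = pts - pts.mean(axis=0)
    cov = centered.T @ centered / len(pts)
    eigval, eigvec = np.linalg.eigh(cov)       # eigenvalues ascending
    a, b, c = eigvec[:, 0]                     # normal of the fitted plane
    grad = np.array([-a / c, -b / c])          # d t / d(x, y) on the plane
    return grad / (grad @ grad)

# Synthetic events: a vertical edge sweeping in +x at 2 px per time unit,
# so a pixel in column x fires at t = x / 2.
xs = np.repeat(np.arange(10.0), 5)
ys = np.tile(np.arange(5.0), 10)
ts = xs / 2.0
v = flow_from_events(np.stack([xs, ys, ts], axis=1))  # recovers roughly (2, 0)
```

Note that a straight edge only constrains the flow component normal to it (the aperture problem), which is exactly what the plane fit returns.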
Efficient One-Shot Sports Field Image Registration with Arbitrary Keypoint Segmentation
Nicolas Jacquelin, Romain Vuillemot, S. Duffner
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897170
Abstract: Automatic sports field registration aims to project a given image, taken with unknown camera parameters, into a known 3D coordinate system in order to obtain higher-level information such as the positions and speeds of players. Existing methods generally detect specific visual landmarks on the field and then apply iterative refinement to approach the desired calibration. They are usually compared only in terms of precision on a standard benchmark, without considering other metrics. However, execution speed is also important, particularly in the context of live broadcast TV and sports analysis. This work introduces a new automatic field registration method that achieves excellent performance on the WorldCup Soccer benchmark while depending on neither specific visible landmarks nor any refinement, resulting in a one-shot model with very high execution speed. Finally, to complement the usual soccer benchmark, we introduce a new swimming pool registration benchmark that is more challenging for the task at hand. Code and dataset are available at https://github.com/njacquelin/sportsfieldregistration.

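Once keypoints have been located, registration reduces to estimating the homography between their image positions and their known positions in the field model. The mapping step itself is standard; below is a minimal numpy Direct Linear Transform (DLT) sketch of that step, not the authors' code.

```python
import numpy as np

def homography_dlt(src, dst):
    """Direct Linear Transform: find H such that dst ~ H @ src (homogeneous).

    src, dst: (N, 2) arrays of corresponding points, N >= 4, no noise
    handling (real pipelines would add normalisation and RANSAC).
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the right singular vector of A with smallest
    # singular value (the null space of A for exact correspondences).
    _, _, vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, pts):
    """Apply homography H to (N, 2) points and dehomogenise."""
    pts = np.asarray(pts, dtype=float)
    ph = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return ph[:, :2] / ph[:, 2:3]
```

With exact correspondences, four points determine H; extra points make the SVD solution a least-squares fit, which is why detecting many "arbitrary" keypoints (rather than a few fixed landmarks) can make the registration more robust.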
Guided Sampling Based Feature Aggregation for Video Object Detection
Jun Liang, Haosheng Chen, Y. Yan, Yang Lu, Hanzi Wang
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897210
Abstract: Video object detection is a challenging task due to appearance deterioration in video frames. Recently, feature aggregation based methods, which aggregate context information from object proposals in different frames to improve performance, have dominated the task. However, much invalid information may be introduced during feature aggregation, since frames and proposals are usually selected at random. In this paper, we propose a guided sampling based feature aggregation network (GSFA) to perform more effective feature aggregation. Specifically, we introduce a frame-level sampling module and a proposal-level sampling module to adaptively sample informative frames and proposals from a video sequence. As a result, the proposed GSFA can effectively aggregate context information from semantically rich frames and proposals to boost performance. Experimental results on the ImageNet VID dataset show that the proposed GSFA achieves state-of-the-art performance: 84.8% mAP with ResNet-101 and 85.8% mAP with ResNeXt-101.

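The abstract does not spell out the aggregation operator. Proposal-level feature aggregation in this line of work is typically attention-style: a target proposal's feature is refined as a similarity-weighted average of support-proposal features from other frames. The sketch below is a generic numpy illustration of that pattern (the `temperature` parameter is an assumption), not GSFA itself.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_proposals(target_feat, support_feats, temperature=1.0):
    """Attention-style aggregation of support-proposal features.

    target_feat:   (F,) feature of the proposal being refined.
    support_feats: (N, F) features of proposals sampled from other frames.
    Returns the similarity-weighted average of the support features.
    """
    sims = support_feats @ target_feat / temperature  # (N,) dot-product similarity
    w = softmax(sims)                                 # normalised attention weights
    return w @ support_feats
```

Under this scheme, randomly sampled but irrelevant proposals still receive nonzero weight, which is the "invalid information" problem the paper's guided sampling modules aim to reduce by choosing informative frames and proposals in the first place.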
An Efficient End-To-End Image Compression Transformer
Afsana Ahsan Jeny, Masum Shah Junayed, Md Baharul Islam
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897663
Abstract: Image and video compression have received significant research attention and expanded their applications. Existing entropy estimation-based methods combine a hyperprior with local context, which limits their efficacy. This paper introduces an efficient end-to-end transformer-based image compression model that generates a global receptive field to tackle long-range correlation issues. A hyper encoder-decoder transformer block employs a multi-head spatial-reduction self-attention (MHSRSA) layer to minimize the computational cost of the self-attention layer and to enable rapid learning of multi-scale, high-resolution features. A Causal Global Anticipation Module (CGAM) is designed to construct highly informative adjacent contexts using channel-wise linkages and to identify global reference points in the latent space for end-to-end rate-distortion optimization (RDO). Experimental results on the Kodak dataset demonstrate the effectiveness and competitive performance of the proposed model.

A Lightweight Network with Multi-Stage Feature Fusion Module for Single-View 3D Face Reconstruction
Jing Wang, Shikun Zhang, F. Song, Ge Song, Ming Yang
2022 IEEE International Conference on Image Processing (ICIP), 2022-10-16. DOI: 10.1109/ICIP46576.2022.9897570
Abstract: 3D face reconstruction has attracted great attention from researchers in both academia and industry for its potential applications in scenarios such as face alignment and recognition across large poses. The 3D Morphable Model, which reconstructs a 3D face by predicting basis coefficients, is usually adopted as the parametric framework for 3D faces and combines well with deep learning. Existing cascade regression methods predict coefficients over multiple iterations, which is time-consuming. In this paper, we propose an efficient, end-to-end method for single-view 3D face reconstruction. We build a lightweight network based on mobile blocks for faster parameter extraction and a smaller model size. In particular, a multi-stage feature fusion module is designed to enhance the end-to-end learning. To match the setting of the input image size, we update the pose labels of images of various sizes in the training dataset before training. Extensive experiments on challenging datasets validate the efficiency of our method for both 3D face reconstruction and face alignment.
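The 3D Morphable Model that such methods build on represents a face shape as a mean shape plus a linear combination of basis shapes, so the network only has to regress the coefficient vector. A minimal numpy sketch of that decoding step (the shapes and basis here are toy placeholders, not a real 3DMM):

```python
import numpy as np

def decode_3dmm(mean_shape, basis, coeffs):
    """3DMM decoding: shape = mean + basis @ coeffs.

    mean_shape: (3N,) stacked xyz coordinates of the mean face.
    basis:      (3N, K) shape (and/or expression) basis vectors.
    coeffs:     (K,) coefficients predicted by the network.
    """
    return mean_shape + basis @ coeffs

# Toy example: 2 vertices (6 coordinates), 2 basis vectors that each
# perturb one coordinate of the first vertex.
mean_shape = np.zeros(6)
basis = np.eye(6)[:, :2]
shape = decode_3dmm(mean_shape, basis, np.array([0.5, -1.0]))
```

Because the decoding is a fixed linear map, all of the learning effort (and hence the lightweight-network design in the paper) goes into predicting the low-dimensional coefficient vector from the image.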