{"title":"A joint feature aggregation method for robust masked face recognition","authors":"Xinmeng Xu, Yuesheng Zhu, Zhiqiang Bai","doi":"10.1117/12.2643615","DOIUrl":"https://doi.org/10.1117/12.2643615","url":null,"abstract":"Masked face recognition becomes an important issue of prevention and monitor in outbreak of COVID-19. Due to loss of facial features caused by masks, unmasked face recognition could not identify the specific person well. Current masked faces methods focus on local features from the unmasked regions or recover masked faces to fit standard face recognition models. These methods only focus on partial information of faces thus these features are not robust enough to deal with complex situations. To solve this problem, we propose a joint feature aggregation method for robust masked face recognition. Firstly, we design a multi-module feature extraction network to extract different features, including local module (LM), global module (GM), and recovery module (RM). Our method not only extracts global features from the original masked faces but also extracts local features from the unmasked area since it is a discriminative part of masked faces. Specially, we utilize a pretrained recovery model to recover masked faces and get some recovery features from the recovered faces. Finally, features from three modules are aggregated as a joint feature of masked faces. The joint feature enhances the feature representation of masked faces thus it is more discriminative and robust than that in previous methods. Experiments show that our method can achieve better performance than previous methods on LFW dataset.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125393058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatio-temporal dual-attention network for view-invariant human action recognition","authors":"Kumie Gedamu, Getinet Yilma, Maregu Assefa, Melese Ayalew","doi":"10.1117/12.2643446","DOIUrl":"https://doi.org/10.1117/12.2643446","url":null,"abstract":"Due to the action occlusion and information loss caused by the view changes, view-invariant human action recognition is challenging in plenty of real-world applications. One possible solution to this problem is minimizing representation discrepancy in different views while learning discriminative feature representation for view-invariant action recognition. To solve the problem, we propose a Spatio-temporal Dual-Attention Network (SDA-Net) for view-invariant human action recognition. The SDA-Net is composed of a spatial/temporal self-attention and spatial/temporal cross-attention modules. The spatial/temporal self-attention module captures global long-range dependencies of action features. The cross-attention module is designed to learn view-invariant co-occurrence attention maps and generates discriminative features for a semantic representation of actions in different views. We exhaustively evaluate our approach on the NTU- 60, NTU-120, and UESTC datasets with multi-type evaluations, i.e., Cross-Subject, Cross-View, Cross-Set, and Arbitrary-view. Extensive experiment results demonstrate that our approach exceeds the state-of-the-art approaches with a significant margin in view-invariant human action recognition.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"233 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121413001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ship detection in optical remote sensing images based on saliency and rotation-invariant feature","authors":"Donglai Wu, Bingxin Liu, Wanhan Zhang","doi":"10.1117/12.2644322","DOIUrl":"https://doi.org/10.1117/12.2644322","url":null,"abstract":"Ship detection is important to guarantee maritime safety at sea. In optical remote sensing images, the detection efficiency and accuracy are limited due to the complex ocean background and variant ship directions. Therefore, we propose a novel ship detection method, which consists of two main stages: candidate area location and target discrimination. In the first stage, we use the spectral residual method to detect the saliency map of the original image, get the saliency sub-map containing the ship target, and then use the threshold segmentation method to obtain the ship candidate region. In the second stage, we obtain the radial gradient histogram of the ship candidate region and transform it into a radial gradient feature, which is rotation-invariant. Afterward, radial gradient features and LBP features are fused, and SVM is used for ship detection. Data experimental results show that the method has the characteristics of low complexity and high detection accuracy.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"321 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114015131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time image distortion correction based on FPGA","authors":"R. Ou, Danni Ai, X. Hu, Zhao Zheng, Yu Qiu, Jian Yang","doi":"10.1117/12.2644433","DOIUrl":"https://doi.org/10.1117/12.2644433","url":null,"abstract":"As the primary method for real-time image processing, a field-programmable gate array (FPGA) is widely used in binocular vision systems. Distortion correction is an important component of binocular stereo vision systems. When implementing a real-time image distortion correction algorithm on FPGA, problems, such as insufficient on-chip storage space and high complexity of coordinate correction calculation methods, occur. These problems are analyzed in detail in this study. On the basis of the reverse mapping method, a distortion correction algorithm that uses a lookup table (LUT) is proposed. A compression with restoration method is established for this LUT to reduce space occupation. The corresponding cache method of LUT and the image data are designed. The algorithm is verified on our binocular stereo vision system based on Xilinx Zynq-7020. The experiments show that the proposed algorithm can achieve real-time and high precision gray image distortion correction effect and significantly reduce the consumption of on-chip resources. Enough to meet the requirements of accurate binocular stereo vision system.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127783528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SlowFast with DropBlock and smooth samples loss for student action recognition","authors":"Chuanming Li, Wenxing Bao, Xu Chen, Yongjun Jing, Xiudong Qu","doi":"10.1117/12.2644370","DOIUrl":"https://doi.org/10.1117/12.2644370","url":null,"abstract":"Due to the advent of large-scale video datasets, action recognition using three-dimensional convolutions (3D CNNs) containing spatiotemporal information has become mainstream. Aiming at the problem of classroom student behavior recognition, the paper adopts the improved SlowFast network structure to deal with spatial structure and temporal events respectively. First, DropBlock (a regularization method) is added to the SlowFast network to solve the overfitting problem. Second, for the problem of Long-Tailed Distribution, the designed Smooth Sample (SS) Loss function is added to the network to smooth the number of samples. Classification experiments show that compared with similar methods, the model accuracy of our method on the Kinetics and Student Action Dataset is increased by 2.1% and 2.9%, respectively.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132665226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accurate neuroanatomy segmentation using 3D spatial and anatomical attention neural networks","authors":"Hewei Cheng, Zhengyu Ren, Peiyang Li, Yin Tian, Wei Wang, Zhangyong Li, Yongjiao Fan","doi":"10.1117/12.2644416","DOIUrl":"https://doi.org/10.1117/12.2644416","url":null,"abstract":"Brain structure segmentation from 3D magnetic resonance (MR) images is a prerequisite for quantifying brain morphology. Since typical 3D whole brain deep learning models demand large GPU memory, 3D image patch-based deep learning methods are favored for their GPU memory efficiency. However, existing 3D image patch-based methods are not well equipped to capture spatial and anatomical contextual information that is necessary for accurate brain structure segmentation. To overcome this limitation, we develop a spatial and anatomical context-aware network to integrate spatial and anatomical contextual information for accurate brain structure segmentation from MR images. Particularly, a spatial attention block is adopted to encode spatial context information of the 3D patches, an anatomical attention block is adopted to aggregate image information across channels of the 3D patches, and finally the spatial and anatomical attention blocks are adaptively fused by an element-wise convolution operation. Moreover, an online patch sampling strategy is utilized to train a deep neural network with all available patches of the training MR images, facilitating accurate segmentation of brain structures. Ablation and comparison results have demonstrated that our method is capable of achieving promising segmentation performance, better than state-of-the-art alternative methods by 3.30% in terms of Dice scores.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134373062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face tampering detection based on spatiotemporal attention residual network","authors":"Z. Cai, Weimin Wei, Fanxing Meng, Changan Liu","doi":"10.1117/12.2644654","DOIUrl":"https://doi.org/10.1117/12.2644654","url":null,"abstract":"Fake technology has evolved to the point where fake faces are increasingly difficult to distinguish from real ones. If the forged face videos spread wildly on social media, social unrest or personal reputation damage may lead to social unrest. A face tampering detection method (RALNet) with spatiotemporal attention residual network is designed to reduce the misuse of face data due to malicious dissemination. Firstly, we propose a process to extract video face data, which reduces the interference of irrelevant information and improves the utilization of data processing. Then, based on the characteristics of incoherence and inconsistency in spatial and temporal information of tampered videos, the spatial domain features and temporal domain features of the target face video are extracted by introducing an attention mechanism of residual network and long short-term memory network to classify the targets as true or fake. The experimental results show that the method can effectively detect whether the face data is tampered, and its detection accuracy is better than other methods. In addition, it also achieves good performance in terms of recall, precision, and F1 score.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"377 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115174057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying Alzheimer’s disease from 4D fMRI using hybrid 3DCNN and GRU networks","authors":"Yifan Cao, Meili Lu, Jiajun Fu, Zhaohua Guo, Zicheng Gao","doi":"10.1117/12.2644454","DOIUrl":"https://doi.org/10.1117/12.2644454","url":null,"abstract":"In recently years, motivated by the excellent performance in automatic feature extraction and complex patterns detecting from raw data, recently, deep learning technologies have been widely used in analyzing fMRI data for Alzheimer’s disease classification. However, most current studies did not take full advantage of the temporal and spatial features of fMRI, which may result in ignoring some important information and influencing classification performance. In this paper, we propose a novel approach based on deep learning to learn temporal and spatial features of 4D fMRI for Alzheimer’s disease classification. This model is composed of 3D Convolutional Neural Network(3DCNN) and recurrent neural network. Experimental results demonstrated that the proposed approach could discriminate Alzheimer’s patients from healthy controls with a high accuracy rate.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117122171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Blind image quality assessment based on transformer","authors":"Linxin Li, Chu Chen, Naixuan Zhao","doi":"10.1117/12.2643493","DOIUrl":"https://doi.org/10.1117/12.2643493","url":null,"abstract":"Transformer has achieved milestones in natural language processing (NLP). Due to its excellent global and remote semantic information interaction performance, it has gradually been applied in vision tasks. In this paper, we propose PTIQ, which is a pure Transformer structure for Image Quality Assessment. Specifically, we use Swin Transformer Blocks as backbone to extract image features. The extracted feature vectors after extra state embedding and position embedding are fed into the original transformer encoder. Then, the output is passed to the MLP head to predict quality score. Experimental results demonstrate that the proposed architecture achieves outstanding performance.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124943319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monocular inertial indoor location algorithm considering point and line features","authors":"Ju Huo, Liang Wei, Chuwei Mao","doi":"10.1117/12.2644842","DOIUrl":"https://doi.org/10.1117/12.2644842","url":null,"abstract":"Compared with point features, line features in the environment have more structural information. When indoor texture is not rich, making full use of the structural information of line features can improve the robustness and accuracy of simultaneous location and mapping algorithm. In this paper, we propose an improved monocular inertial indoor location algorithm considering point and line features. Firstly, the point features and line features in the environment are extracted, matched and parameterized, and then the inertial sensor is used to estimate the initial pose, and the tightly coupled method is adopted to optimize the observation error of the point and line features and the measurement error of the inertial sensor simultaneously in the back optimization to achieve accurate estimation of the pose of unmanned aerial vehicle. Finally, loop closure detection and pose graph optimization are used to optimize the pose in real time. The test results on public datasets show that the location accuracy of the proposed method is superior to 10 cm under sufficient light and texture conditions. The angle measurement accuracy is better than 0.05 rad, and the output frequency of positioning results is 10Hz, which effectively improves the accuracy of traditional visual inertial location method and meets the requirements of real-time.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123054438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}