International Conference on Digital Image Processing: Latest Publications

A joint feature aggregation method for robust masked face recognition
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643615
Xinmeng Xu, Yuesheng Zhu, Zhiqiang Bai
{"title":"A joint feature aggregation method for robust masked face recognition","authors":"Xinmeng Xu, Yuesheng Zhu, Zhiqiang Bai","doi":"10.1117/12.2643615","DOIUrl":"https://doi.org/10.1117/12.2643615","url":null,"abstract":"Masked face recognition becomes an important issue of prevention and monitor in outbreak of COVID-19. Due to loss of facial features caused by masks, unmasked face recognition could not identify the specific person well. Current masked faces methods focus on local features from the unmasked regions or recover masked faces to fit standard face recognition models. These methods only focus on partial information of faces thus these features are not robust enough to deal with complex situations. To solve this problem, we propose a joint feature aggregation method for robust masked face recognition. Firstly, we design a multi-module feature extraction network to extract different features, including local module (LM), global module (GM), and recovery module (RM). Our method not only extracts global features from the original masked faces but also extracts local features from the unmasked area since it is a discriminative part of masked faces. Specially, we utilize a pretrained recovery model to recover masked faces and get some recovery features from the recovered faces. Finally, features from three modules are aggregated as a joint feature of masked faces. The joint feature enhances the feature representation of masked faces thus it is more discriminative and robust than that in previous methods. Experiments show that our method can achieve better performance than previous methods on LFW dataset.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125393058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Spatio-temporal dual-attention network for view-invariant human action recognition
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643446
Kumie Gedamu, Getinet Yilma, Maregu Assefa, Melese Ayalew
{"title":"Spatio-temporal dual-attention network for view-invariant human action recognition","authors":"Kumie Gedamu, Getinet Yilma, Maregu Assefa, Melese Ayalew","doi":"10.1117/12.2643446","DOIUrl":"https://doi.org/10.1117/12.2643446","url":null,"abstract":"Due to the action occlusion and information loss caused by the view changes, view-invariant human action recognition is challenging in plenty of real-world applications. One possible solution to this problem is minimizing representation discrepancy in different views while learning discriminative feature representation for view-invariant action recognition. To solve the problem, we propose a Spatio-temporal Dual-Attention Network (SDA-Net) for view-invariant human action recognition. The SDA-Net is composed of a spatial/temporal self-attention and spatial/temporal cross-attention modules. The spatial/temporal self-attention module captures global long-range dependencies of action features. The cross-attention module is designed to learn view-invariant co-occurrence attention maps and generates discriminative features for a semantic representation of actions in different views. We exhaustively evaluate our approach on the NTU- 60, NTU-120, and UESTC datasets with multi-type evaluations, i.e., Cross-Subject, Cross-View, Cross-Set, and Arbitrary-view. Extensive experiment results demonstrate that our approach exceeds the state-of-the-art approaches with a significant margin in view-invariant human action recognition.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"233 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121413001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Ship detection in optical remote sensing images based on saliency and rotation-invariant feature
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644322
Donglai Wu, Bingxin Liu, Wanhan Zhang
{"title":"Ship detection in optical remote sensing images based on saliency and rotation-invariant feature","authors":"Donglai Wu, Bingxin Liu, Wanhan Zhang","doi":"10.1117/12.2644322","DOIUrl":"https://doi.org/10.1117/12.2644322","url":null,"abstract":"Ship detection is important to guarantee maritime safety at sea. In optical remote sensing images, the detection efficiency and accuracy are limited due to the complex ocean background and variant ship directions. Therefore, we propose a novel ship detection method, which consists of two main stages: candidate area location and target discrimination. In the first stage, we use the spectral residual method to detect the saliency map of the original image, get the saliency sub-map containing the ship target, and then use the threshold segmentation method to obtain the ship candidate region. In the second stage, we obtain the radial gradient histogram of the ship candidate region and transform it into a radial gradient feature, which is rotation-invariant. Afterward, radial gradient features and LBP features are fused, and SVM is used for ship detection. Data experimental results show that the method has the characteristics of low complexity and high detection accuracy.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"321 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114015131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Real-time image distortion correction based on FPGA
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644433
R. Ou, Danni Ai, X. Hu, Zhao Zheng, Yu Qiu, Jian Yang
{"title":"Real-time image distortion correction based on FPGA","authors":"R. Ou, Danni Ai, X. Hu, Zhao Zheng, Yu Qiu, Jian Yang","doi":"10.1117/12.2644433","DOIUrl":"https://doi.org/10.1117/12.2644433","url":null,"abstract":"As the primary method for real-time image processing, a field-programmable gate array (FPGA) is widely used in binocular vision systems. Distortion correction is an important component of binocular stereo vision systems. When implementing a real-time image distortion correction algorithm on FPGA, problems, such as insufficient on-chip storage space and high complexity of coordinate correction calculation methods, occur. These problems are analyzed in detail in this study. On the basis of the reverse mapping method, a distortion correction algorithm that uses a lookup table (LUT) is proposed. A compression with restoration method is established for this LUT to reduce space occupation. The corresponding cache method of LUT and the image data are designed. The algorithm is verified on our binocular stereo vision system based on Xilinx Zynq-7020. The experiments show that the proposed algorithm can achieve real-time and high precision gray image distortion correction effect and significantly reduce the consumption of on-chip resources. Enough to meet the requirements of accurate binocular stereo vision system.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127783528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
SlowFast with DropBlock and smooth samples loss for student action recognition
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644370
Chuanming Li, Wenxing Bao, Xu Chen, Yongjun Jing, Xiudong Qu
{"title":"SlowFast with DropBlock and smooth samples loss for student action recognition","authors":"Chuanming Li, Wenxing Bao, Xu Chen, Yongjun Jing, Xiudong Qu","doi":"10.1117/12.2644370","DOIUrl":"https://doi.org/10.1117/12.2644370","url":null,"abstract":"Due to the advent of large-scale video datasets, action recognition using three-dimensional convolutions (3D CNNs) containing spatiotemporal information has become mainstream. Aiming at the problem of classroom student behavior recognition, the paper adopts the improved SlowFast network structure to deal with spatial structure and temporal events respectively. First, DropBlock (a regularization method) is added to the SlowFast network to solve the overfitting problem. Second, for the problem of Long-Tailed Distribution, the designed Smooth Sample (SS) Loss function is added to the network to smooth the number of samples. Classification experiments show that compared with similar methods, the model accuracy of our method on the Kinetics and Student Action Dataset is increased by 2.1% and 2.9%, respectively.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132665226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Accurate neuroanatomy segmentation using 3D spatial and anatomical attention neural networks
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644416
Hewei Cheng, Zhengyu Ren, Peiyang Li, Yin Tian, Wei Wang, Zhangyong Li, Yongjiao Fan
{"title":"Accurate neuroanatomy segmentation using 3D spatial and anatomical attention neural networks","authors":"Hewei Cheng, Zhengyu Ren, Peiyang Li, Yin Tian, Wei Wang, Zhangyong Li, Yongjiao Fan","doi":"10.1117/12.2644416","DOIUrl":"https://doi.org/10.1117/12.2644416","url":null,"abstract":"Brain structure segmentation from 3D magnetic resonance (MR) images is a prerequisite for quantifying brain morphology. Since typical 3D whole brain deep learning models demand large GPU memory, 3D image patch-based deep learning methods are favored for their GPU memory efficiency. However, existing 3D image patch-based methods are not well equipped to capture spatial and anatomical contextual information that is necessary for accurate brain structure segmentation. To overcome this limitation, we develop a spatial and anatomical context-aware network to integrate spatial and anatomical contextual information for accurate brain structure segmentation from MR images. Particularly, a spatial attention block is adopted to encode spatial context information of the 3D patches, an anatomical attention block is adopted to aggregate image information across channels of the 3D patches, and finally the spatial and anatomical attention blocks are adaptively fused by an element-wise convolution operation. Moreover, an online patch sampling strategy is utilized to train a deep neural network with all available patches of the training MR images, facilitating accurate segmentation of brain structures. Ablation and comparison results have demonstrated that our method is capable of achieving promising segmentation performance, better than state-of-the-art alternative methods by 3.30% in terms of Dice scores.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134373062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Face tampering detection based on spatiotemporal attention residual network
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644654
Z. Cai, Weimin Wei, Fanxing Meng, Changan Liu
{"title":"Face tampering detection based on spatiotemporal attention residual network","authors":"Z. Cai, Weimin Wei, Fanxing Meng, Changan Liu","doi":"10.1117/12.2644654","DOIUrl":"https://doi.org/10.1117/12.2644654","url":null,"abstract":"Fake technology has evolved to the point where fake faces are increasingly difficult to distinguish from real ones. If the forged face videos spread wildly on social media, social unrest or personal reputation damage may lead to social unrest. A face tampering detection method (RALNet) with spatiotemporal attention residual network is designed to reduce the misuse of face data due to malicious dissemination. Firstly, we propose a process to extract video face data, which reduces the interference of irrelevant information and improves the utilization of data processing. Then, based on the characteristics of incoherence and inconsistency in spatial and temporal information of tampered videos, the spatial domain features and temporal domain features of the target face video are extracted by introducing an attention mechanism of residual network and long short-term memory network to classify the targets as true or fake. The experimental results show that the method can effectively detect whether the face data is tampered, and its detection accuracy is better than other methods. In addition, it also achieves good performance in terms of recall, precision, and F1 score.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"377 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115174057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Identifying Alzheimer’s disease from 4D fMRI using hybrid 3DCNN and GRU networks
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644454
Yifan Cao, Meili Lu, Jiajun Fu, Zhaohua Guo, Zicheng Gao
{"title":"Identifying Alzheimer’s disease from 4D fMRI using hybrid 3DCNN and GRU networks","authors":"Yifan Cao, Meili Lu, Jiajun Fu, Zhaohua Guo, Zicheng Gao","doi":"10.1117/12.2644454","DOIUrl":"https://doi.org/10.1117/12.2644454","url":null,"abstract":"In recently years, motivated by the excellent performance in automatic feature extraction and complex patterns detecting from raw data, recently, deep learning technologies have been widely used in analyzing fMRI data for Alzheimer’s disease classification. However, most current studies did not take full advantage of the temporal and spatial features of fMRI, which may result in ignoring some important information and influencing classification performance. In this paper, we propose a novel approach based on deep learning to learn temporal and spatial features of 4D fMRI for Alzheimer’s disease classification. This model is composed of 3D Convolutional Neural Network(3DCNN) and recurrent neural network. Experimental results demonstrated that the proposed approach could discriminate Alzheimer’s patients from healthy controls with a high accuracy rate.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117122171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Blind image quality assessment based on transformer
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643493
Linxin Li, Chu Chen, Naixuan Zhao
{"title":"Blind image quality assessment based on transformer","authors":"Linxin Li, Chu Chen, Naixuan Zhao","doi":"10.1117/12.2643493","DOIUrl":"https://doi.org/10.1117/12.2643493","url":null,"abstract":"Transformer has achieved milestones in natural language processing (NLP). Due to its excellent global and remote semantic information interaction performance, it has gradually been applied in vision tasks. In this paper, we propose PTIQ, which is a pure Transformer structure for Image Quality Assessment. Specifically, we use Swin Transformer Blocks as backbone to extract image features. The extracted feature vectors after extra state embedding and position embedding are fed into the original transformer encoder. Then, the output is passed to the MLP head to predict quality score. Experimental results demonstrate that the proposed architecture achieves outstanding performance.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124943319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Monocular inertial indoor location algorithm considering point and line features
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644842
Ju Huo, Liang Wei, Chuwei Mao
{"title":"Monocular inertial indoor location algorithm considering point and line features","authors":"Ju Huo, Liang Wei, Chuwei Mao","doi":"10.1117/12.2644842","DOIUrl":"https://doi.org/10.1117/12.2644842","url":null,"abstract":"Compared with point features, line features in the environment have more structural information. When indoor texture is not rich, making full use of the structural information of line features can improve the robustness and accuracy of simultaneous location and mapping algorithm. In this paper, we propose an improved monocular inertial indoor location algorithm considering point and line features. Firstly, the point features and line features in the environment are extracted, matched and parameterized, and then the inertial sensor is used to estimate the initial pose, and the tightly coupled method is adopted to optimize the observation error of the point and line features and the measurement error of the inertial sensor simultaneously in the back optimization to achieve accurate estimation of the pose of unmanned aerial vehicle. Finally, loop closure detection and pose graph optimization are used to optimize the pose in real time. The test results on public datasets show that the location accuracy of the proposed method is superior to 10 cm under sufficient light and texture conditions. The angle measurement accuracy is better than 0.05 rad, and the output frequency of positioning results is 10Hz, which effectively improves the accuracy of traditional visual inertial location method and meets the requirements of real-time.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123054438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0