International Conference on Digital Image Processing — Latest Publications

A joint feature aggregation method for robust masked face recognition
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2643615
Xinmeng Xu, Yuesheng Zhu, Zhiqiang Bai
{"title":"A joint feature aggregation method for robust masked face recognition","authors":"Xinmeng Xu, Yuesheng Zhu, Zhiqiang Bai","doi":"10.1117/12.2643615","DOIUrl":"https://doi.org/10.1117/12.2643615","url":null,"abstract":"Masked face recognition becomes an important issue of prevention and monitor in outbreak of COVID-19. Due to loss of facial features caused by masks, unmasked face recognition could not identify the specific person well. Current masked faces methods focus on local features from the unmasked regions or recover masked faces to fit standard face recognition models. These methods only focus on partial information of faces thus these features are not robust enough to deal with complex situations. To solve this problem, we propose a joint feature aggregation method for robust masked face recognition. Firstly, we design a multi-module feature extraction network to extract different features, including local module (LM), global module (GM), and recovery module (RM). Our method not only extracts global features from the original masked faces but also extracts local features from the unmasked area since it is a discriminative part of masked faces. Specially, we utilize a pretrained recovery model to recover masked faces and get some recovery features from the recovered faces. Finally, features from three modules are aggregated as a joint feature of masked faces. The joint feature enhances the feature representation of masked faces thus it is more discriminative and robust than that in previous methods. Experiments show that our method can achieve better performance than previous methods on LFW dataset.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125393058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
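To make the aggregation step concrete, below is a minimal PyTorch sketch of a three-branch joint feature extractor; the ResNet-18 backbones, the 256-dimensional embedding, and concatenation followed by a linear projection are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class JointFeatureNet(nn.Module):
    def __init__(self, embed_dim=256):
        super().__init__()
        # One backbone per module: global (full masked face),
        # local (unmasked eye/forehead crop), recovery (inpainted face).
        self.global_net = self._backbone(embed_dim)
        self.local_net = self._backbone(embed_dim)
        self.recovery_net = self._backbone(embed_dim)
        # Project the concatenated features to the joint embedding.
        self.fuse = nn.Linear(3 * embed_dim, embed_dim)

    @staticmethod
    def _backbone(embed_dim):
        net = resnet18(weights=None)  # backbone choice is an assumption
        net.fc = nn.Linear(net.fc.in_features, embed_dim)
        return net

    def forward(self, masked_face, unmasked_crop, recovered_face):
        g = self.global_net(masked_face)       # global features (GM)
        l = self.local_net(unmasked_crop)      # local features (LM)
        r = self.recovery_net(recovered_face)  # recovery features (RM)
        joint = torch.cat([g, l, r], dim=1)    # aggregate by concatenation
        return self.fuse(joint)

# Toy usage: random tensors stand in for the three aligned inputs.
net = JointFeatureNet()
x = torch.randn(2, 3, 112, 112)
print(net(x, x, x).shape)  # torch.Size([2, 256])
```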
Spatio-temporal dual-attention network for view-invariant human action recognition
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2643446
Kumie Gedamu, Getinet Yilma, Maregu Assefa, Melese Ayalew
{"title":"Spatio-temporal dual-attention network for view-invariant human action recognition","authors":"Kumie Gedamu, Getinet Yilma, Maregu Assefa, Melese Ayalew","doi":"10.1117/12.2643446","DOIUrl":"https://doi.org/10.1117/12.2643446","url":null,"abstract":"Due to the action occlusion and information loss caused by the view changes, view-invariant human action recognition is challenging in plenty of real-world applications. One possible solution to this problem is minimizing representation discrepancy in different views while learning discriminative feature representation for view-invariant action recognition. To solve the problem, we propose a Spatio-temporal Dual-Attention Network (SDA-Net) for view-invariant human action recognition. The SDA-Net is composed of a spatial/temporal self-attention and spatial/temporal cross-attention modules. The spatial/temporal self-attention module captures global long-range dependencies of action features. The cross-attention module is designed to learn view-invariant co-occurrence attention maps and generates discriminative features for a semantic representation of actions in different views. We exhaustively evaluate our approach on the NTU- 60, NTU-120, and UESTC datasets with multi-type evaluations, i.e., Cross-Subject, Cross-View, Cross-Set, and Arbitrary-view. Extensive experiment results demonstrate that our approach exceeds the state-of-the-art approaches with a significant margin in view-invariant human action recognition.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"233 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121413001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
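The self-attention component is a standard construction; the following is a hedged PyTorch sketch of a spatial self-attention block that captures long-range dependencies over a frame's feature map (the temporal branch would attend over frames instead). The channel reduction factor of 8 and the learned residual weight are assumptions, not the paper's exact module design.

```python
import torch
import torch.nn as nn

class SpatialSelfAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):                              # x: (B, C, H, W)
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (B, HW, C//8)
        k = self.key(x).flatten(2)                     # (B, C//8, HW)
        # Attention over all spatial positions: global dependencies.
        attn = torch.softmax(q @ k / (c // 8) ** 0.5, dim=-1)  # (B, HW, HW)
        v = self.value(x).flatten(2)                   # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection

x = torch.randn(2, 64, 14, 14)
print(SpatialSelfAttention(64)(x).shape)  # torch.Size([2, 64, 14, 14])
```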
Ship detection in optical remote sensing images based on saliency and rotation-invariant feature
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2644322
Donglai Wu, Bingxin Liu, Wanhan Zhang
{"title":"Ship detection in optical remote sensing images based on saliency and rotation-invariant feature","authors":"Donglai Wu, Bingxin Liu, Wanhan Zhang","doi":"10.1117/12.2644322","DOIUrl":"https://doi.org/10.1117/12.2644322","url":null,"abstract":"Ship detection is important to guarantee maritime safety at sea. In optical remote sensing images, the detection efficiency and accuracy are limited due to the complex ocean background and variant ship directions. Therefore, we propose a novel ship detection method, which consists of two main stages: candidate area location and target discrimination. In the first stage, we use the spectral residual method to detect the saliency map of the original image, get the saliency sub-map containing the ship target, and then use the threshold segmentation method to obtain the ship candidate region. In the second stage, we obtain the radial gradient histogram of the ship candidate region and transform it into a radial gradient feature, which is rotation-invariant. Afterward, radial gradient features and LBP features are fused, and SVM is used for ship detection. Data experimental results show that the method has the characteristics of low complexity and high detection accuracy.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"321 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114015131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
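The first-stage saliency step follows the well-known spectral residual method (Hou & Zhang, 2007); below is a sketch of that stage, with an Otsu threshold and the kernel sizes as assumptions standing in for the paper's unspecified parameters.

```python
import cv2
import numpy as np

def spectral_residual_saliency(gray):
    """Saliency map via the spectral residual of the log amplitude."""
    f = np.fft.fft2(gray.astype(np.float64))
    log_amp = np.log(np.abs(f) + 1e-8)
    phase = np.angle(f)
    # Spectral residual = log amplitude minus its local average.
    residual = log_amp - cv2.blur(log_amp, (3, 3))
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    saliency = cv2.GaussianBlur(saliency, (9, 9), 2.5)
    return cv2.normalize(saliency, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

def ship_candidates(gray):
    """Binary candidate mask from the saliency map (Otsu threshold assumed)."""
    sal = spectral_residual_saliency(gray)
    _, mask = cv2.threshold(sal, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return mask

img = (np.random.rand(256, 256) * 255).astype(np.uint8)  # stand-in image
print(ship_candidates(img).shape)  # (256, 256)
```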
Real-time image distortion correction based on FPGA
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2644433
R. Ou, Danni Ai, X. Hu, Zhao Zheng, Yu Qiu, Jian Yang
{"title":"Real-time image distortion correction based on FPGA","authors":"R. Ou, Danni Ai, X. Hu, Zhao Zheng, Yu Qiu, Jian Yang","doi":"10.1117/12.2644433","DOIUrl":"https://doi.org/10.1117/12.2644433","url":null,"abstract":"As the primary method for real-time image processing, a field-programmable gate array (FPGA) is widely used in binocular vision systems. Distortion correction is an important component of binocular stereo vision systems. When implementing a real-time image distortion correction algorithm on FPGA, problems, such as insufficient on-chip storage space and high complexity of coordinate correction calculation methods, occur. These problems are analyzed in detail in this study. On the basis of the reverse mapping method, a distortion correction algorithm that uses a lookup table (LUT) is proposed. A compression with restoration method is established for this LUT to reduce space occupation. The corresponding cache method of LUT and the image data are designed. The algorithm is verified on our binocular stereo vision system based on Xilinx Zynq-7020. The experiments show that the proposed algorithm can achieve real-time and high precision gray image distortion correction effect and significantly reduce the consumption of on-chip resources. Enough to meet the requirements of accurate binocular stereo vision system.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127783528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
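As a software sketch of the reverse-mapping LUT idea: for every output (corrected) pixel, a precomputed table stores the distorted source coordinate to read from. The simple radial distortion model, its coefficient, and nearest-neighbor sampling are assumptions for illustration; the paper's FPGA pipeline and LUT compression scheme are not reproduced.

```python
import numpy as np

def build_reverse_lut(h, w, k1=-2e-7):
    """LUT mapping each corrected pixel to its distorted source pixel."""
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float64)
    cx, cy = w / 2.0, h / 2.0
    r2 = (xx - cx) ** 2 + (yy - cy) ** 2
    # Assumed radial model: source = corrected scaled by (1 + k1 * r^2).
    map_x = cx + (xx - cx) * (1 + k1 * r2)
    map_y = cy + (yy - cy) * (1 + k1 * r2)
    return map_x, map_y

def correct(img, map_x, map_y):
    """Nearest-neighbor reverse mapping (an FPGA design would interpolate)."""
    h, w = img.shape
    xi = np.clip(np.rint(map_x), 0, w - 1).astype(np.intp)
    yi = np.clip(np.rint(map_y), 0, h - 1).astype(np.intp)
    return img[yi, xi]

img = (np.random.rand(480, 640) * 255).astype(np.uint8)
mx, my = build_reverse_lut(*img.shape)
print(correct(img, mx, my).shape)  # (480, 640)
```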
SlowFast with DropBlock and smooth samples loss for student action recognition
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2644370
Chuanming Li, Wenxing Bao, Xu Chen, Yongjun Jing, Xiudong Qu
{"title":"SlowFast with DropBlock and smooth samples loss for student action recognition","authors":"Chuanming Li, Wenxing Bao, Xu Chen, Yongjun Jing, Xiudong Qu","doi":"10.1117/12.2644370","DOIUrl":"https://doi.org/10.1117/12.2644370","url":null,"abstract":"Due to the advent of large-scale video datasets, action recognition using three-dimensional convolutions (3D CNNs) containing spatiotemporal information has become mainstream. Aiming at the problem of classroom student behavior recognition, the paper adopts the improved SlowFast network structure to deal with spatial structure and temporal events respectively. First, DropBlock (a regularization method) is added to the SlowFast network to solve the overfitting problem. Second, for the problem of Long-Tailed Distribution, the designed Smooth Sample (SS) Loss function is added to the network to smooth the number of samples. Classification experiments show that compared with similar methods, the model accuracy of our method on the Kinetics and Student Action Dataset is increased by 2.1% and 2.9%, respectively.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132665226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
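DropBlock is a published regularizer (Ghiasi et al., 2018); a minimal 2-D functional sketch follows. Note that the paper applies it inside a 3-D video network, and the block size and drop probability here are illustrative.

```python
import torch
import torch.nn.functional as F

def dropblock(x, drop_prob=0.1, block_size=5, training=True):
    """Zero out contiguous block_size x block_size regions of the feature map."""
    if not training or drop_prob == 0.0:
        return x
    b, c, h, w = x.shape
    # gamma: per-position seed probability chosen so the expected fraction
    # of dropped units is roughly drop_prob.
    gamma = (drop_prob * h * w / block_size ** 2
             / ((h - block_size + 1) * (w - block_size + 1)))
    seeds = (torch.rand(b, c, h, w, device=x.device) < gamma).float()
    # Expand each seed into a full block via max pooling.
    mask = 1.0 - F.max_pool2d(seeds, block_size, stride=1,
                              padding=block_size // 2)
    # Rescale so the expected activation magnitude is preserved.
    return x * mask * mask.numel() / mask.sum().clamp(min=1.0)

x = torch.randn(2, 16, 32, 32)
print(dropblock(x).shape)  # torch.Size([2, 16, 32, 32])
```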
Accurate neuroanatomy segmentation using 3D spatial and anatomical attention neural networks
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2644416
Hewei Cheng, Zhengyu Ren, Peiyang Li, Yin Tian, Wei Wang, Zhangyong Li, Yongjiao Fan
{"title":"Accurate neuroanatomy segmentation using 3D spatial and anatomical attention neural networks","authors":"Hewei Cheng, Zhengyu Ren, Peiyang Li, Yin Tian, Wei Wang, Zhangyong Li, Yongjiao Fan","doi":"10.1117/12.2644416","DOIUrl":"https://doi.org/10.1117/12.2644416","url":null,"abstract":"Brain structure segmentation from 3D magnetic resonance (MR) images is a prerequisite for quantifying brain morphology. Since typical 3D whole brain deep learning models demand large GPU memory, 3D image patch-based deep learning methods are favored for their GPU memory efficiency. However, existing 3D image patch-based methods are not well equipped to capture spatial and anatomical contextual information that is necessary for accurate brain structure segmentation. To overcome this limitation, we develop a spatial and anatomical context-aware network to integrate spatial and anatomical contextual information for accurate brain structure segmentation from MR images. Particularly, a spatial attention block is adopted to encode spatial context information of the 3D patches, an anatomical attention block is adopted to aggregate image information across channels of the 3D patches, and finally the spatial and anatomical attention blocks are adaptively fused by an element-wise convolution operation. Moreover, an online patch sampling strategy is utilized to train a deep neural network with all available patches of the training MR images, facilitating accurate segmentation of brain structures. Ablation and comparison results have demonstrated that our method is capable of achieving promising segmentation performance, better than state-of-the-art alternative methods by 3.30% in terms of Dice scores.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134373062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
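As a rough illustration of the two attention blocks and their fusion, here is a hedged PyTorch sketch: a per-voxel spatial attention, a squeeze-and-excitation-style channel ("anatomical") attention, and a 1×1×1 convolution fusing the concatenated branches. The dimensions, squeeze ratio, and exact block designs are assumptions and may differ from the paper's.

```python
import torch
import torch.nn as nn

class DualAttention3D(nn.Module):
    def __init__(self, channels, reduction=4):
        super().__init__()
        # Spatial attention: one weight per voxel of the 3-D patch.
        self.spatial = nn.Sequential(nn.Conv3d(channels, 1, 1), nn.Sigmoid())
        # Channel ("anatomical") attention: squeeze-and-excitation style.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv3d(channels // reduction, channels, 1), nn.Sigmoid())
        # Element-wise (1x1x1) convolution fuses the two branches.
        self.fuse = nn.Conv3d(2 * channels, channels, 1)

    def forward(self, x):                     # x: (B, C, D, H, W)
        xs = x * self.spatial(x)              # spatially re-weighted
        xc = x * self.channel(x)              # channel re-weighted
        return self.fuse(torch.cat([xs, xc], dim=1))

x = torch.randn(1, 32, 16, 16, 16)
print(DualAttention3D(32)(x).shape)  # torch.Size([1, 32, 16, 16, 16])
```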
A foreground detection based video stabilization method and its application in aerospace measurement and control
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2644567
L. Zhang, Chen Chen, Jinqian Tao, Zhaodun Huang, Hao Ding
{"title":"A foreground detection based video stabilization method and its application in aerospace measurement and control","authors":"L. Zhang, Chen Chen, Jinqian Tao, Zhaodun Huang, Hao Ding","doi":"10.1117/12.2644567","DOIUrl":"https://doi.org/10.1117/12.2644567","url":null,"abstract":"The output video of the optical equipment in the aerospace measurement and control field is prone to the problem of image quality degradation caused by the operator’s unstable manual operation. to improve the classical motion estimation based video stabilization algorithm, a novel video stabilization method based on foreground detection is proposed in this paper. Firstly, a object detection datasets based on historical images of the launch center is collected and labeled. Secondly, inspired by transfer learning and prior knowledge of the image in launch center, a YOLO-based object detection method for rocket launching scene is designed. Then, the object detection method is introduced into the motion estimation based video stabilization pipeline in which the object detection is used for foreground detection so the tracked feature points are filtered to reduce the global motion estimation error caused by the motion of the background area. Thus, the error stabilization problem in the classic motion estimation-based video stabilization method is avoided. Experiments show that the video stabilization method proposed in this paper achieved better image stabilization effect in subject and object evaluation. This paper has certain reference significance for exploring the application of deep learning and artificial intelligence technology in the field of aerospace measurement and control field.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134645578","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
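The core filtering step — discarding tracked feature points that fall inside detected foreground boxes before fitting the global motion model — can be sketched as follows. The box format, the synthetic points, and the use of cv2.findHomography with RANSAC are assumptions, not the paper's exact pipeline.

```python
import cv2
import numpy as np

def filter_background_points(pts, boxes):
    """Keep points outside all detection boxes given as (x1, y1, x2, y2)."""
    keep = np.ones(len(pts), dtype=bool)
    for x1, y1, x2, y2 in boxes:
        inside = ((pts[:, 0] >= x1) & (pts[:, 0] <= x2) &
                  (pts[:, 1] >= y1) & (pts[:, 1] <= y2))
        keep &= ~inside
    return pts[keep], keep

# prev_pts/curr_pts: matched points (e.g., from optical flow tracking);
# boxes: hypothetical YOLO detections of the rocket (foreground).
prev_pts = (np.random.rand(100, 2) * 640).astype(np.float32)
curr_pts = prev_pts + np.float32([2.0, 1.0])   # fake global camera shift
boxes = [(200, 100, 400, 300)]                 # fake foreground detection
bg_prev, keep = filter_background_points(prev_pts, boxes)
# Global motion estimated from background points only.
H, _ = cv2.findHomography(bg_prev, curr_pts[keep], cv2.RANSAC)
print(H.round(2))  # ~identity with translation (2, 1)
```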
Haze removal using a hybrid convolutional sparse representation model
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2643362
Ye Cai, Lan Luo, Hongxia Gao, Shicheng Niu, Weipeng Yang, Tian Qi, Guoheng Liang
{"title":"Haze removal using a hybrid convolutional sparse representation model","authors":"Ye Cai, Lan Luo, Hongxia Gao, Shicheng Niu, Weipeng Yang, Tian Qi, Guoheng Liang","doi":"10.1117/12.2643362","DOIUrl":"https://doi.org/10.1117/12.2643362","url":null,"abstract":"Haze removal is a challenging task in image recovery, because hazy images are always degraded by turbid media in atmosphere, showing limited visibility and low contrast. Analysis Sparse Representation (ASR) and Synthesis Sparse Representation (SSR) has been widely used to recover degraded images. But there are always unexpected noise and details loss in the recovered images, as they take relatively less account of the images’ inherent coherence between image patches. Thus, in this paper, we propose a new haze removal method based on hybrid convolutional sparse representation, with consideration of the adjacent relationship by convolution and superposition. To integrate optical model into a convolutional sparse framework, we separate transmission map by transforming it into logarithm domain. And then a structure-based constraint on transmission map is proposed to maintain piece-wise smoothness and reduce the influence brought by pseudo depth abrupt edges. Experiment results demonstrate that the proposed method can restore fine structure of hazy images and suppress boosted noise.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133529156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
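A small numeric check of the log-domain separation the abstract relies on: under the standard atmospheric scattering model I = J·t + A·(1−t), subtracting the airlight term and taking logarithms turns the transmission map into an additive component, log(J) + log(t). The values below are toy data, and the global airlight A is assumed known.

```python
import numpy as np

A = 0.9                                  # global atmospheric light (assumed)
J = np.random.rand(4, 4) * 0.5 + 0.25    # toy haze-free scene radiance
t = np.full((4, 4), 0.6)                 # toy transmission map

I = J * t + A * (1 - t)                  # hazy observation

# Log-domain separation: direct attenuation becomes a sum of two terms,
# so the transmission can be treated as a separate additive unknown.
log_direct = np.log(I - A * (1 - t))     # = log(J) + log(t)
assert np.allclose(log_direct, np.log(J) + np.log(t))
print("log-domain separation holds")
```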
Hyperspectral remote sensing image semantic segmentation using extended extrema morphological profiles
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2643022
Tengyu Ma, Yunfei Liu, Weijian Huang, Chun Wang, Shuangquan Ge
{"title":"Hyperspectral remote sensing image semantic segmentation using extended extrema morphological profiles","authors":"Tengyu Ma, Yunfei Liu, Weijian Huang, Chun Wang, Shuangquan Ge","doi":"10.1117/12.2643022","DOIUrl":"https://doi.org/10.1117/12.2643022","url":null,"abstract":"Hyperspectral remote sensing images have been shown to be particularly beneficial for detecting the types of materials in a scene due to their unique spectral properties. This paper proposes a novel semantic segmentation method for hyperspectral image (HSI), which is based on a new spatial-spectral filtering, called extended extrema morphological profiles (EEMPs). Firstly, principal component analysis (PCA) is used as the feature extractor to construct the feature maps by extracting the first informative feature from the hyperspectral image (HSI). Secondly, the extrema morphological profiles (EMPs) are used to extract the spatial-spectral feature from the informative feature maps to construct the EEMPs. Finally, support vector machine (SVM) is utilized to obtain accurate semantic segmentation from the EEMPs. In order to evaluate the semantic segmentation results, the proposed method is tested on a widely used hyperspectral dataset, i.e., Houston dataset, and four metrics, i.e., class accuracy (CA), overall accuracy (OA), average accuracy (AA), and Kappa coefficient, are used to quantitatively measure the segmentation accuracy. The experimental results demonstrate that EEMPs can efficiently achieve good semantic segmentation accuracy.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133170697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
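Below is a hedged sketch of the PCA → morphological profile → SVM pipeline on synthetic data; plain grey openings/closings of increasing size stand in for the paper's extrema morphological profiles, and the structuring-element sizes are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage
from sklearn.decomposition import PCA
from sklearn.svm import SVC

h, w, bands = 32, 32, 50
cube = np.random.rand(h, w, bands)        # toy HSI cube (stand-in data)
labels = np.random.randint(0, 3, (h, w))  # toy ground-truth classes

# First principal component as the informative feature map.
pc1 = PCA(n_components=1).fit_transform(cube.reshape(-1, bands)).reshape(h, w)

# Morphological profile of the first principal component: openings and
# closings of increasing size (a stand-in for extrema profiles).
profile = [pc1]
for size in (3, 5, 7):
    profile.append(ndimage.grey_opening(pc1, size=size))
    profile.append(ndimage.grey_closing(pc1, size=size))
features = np.stack(profile, axis=-1).reshape(-1, len(profile))

# Pixel-wise SVM on the profile features.
svm = SVC(kernel="rbf").fit(features, labels.ravel())
print("train accuracy:", svm.score(features, labels.ravel()))
```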
Hyperspectral image classification based on dual-branch attention network with 3-D octave convolution
International Conference on Digital Image Processing Pub Date : 2022-10-12 DOI: 10.1117/12.2644256
Ling Xu, Guo Cao, Lin Deng, Lanwei Ding, Hao Xu, Qikun Pan
{"title":"Hyperspectral image classification based on dual-branch attention network with 3-D octave convolution","authors":"Ling Xu, Guo Cao, Lin Deng, Lanwei Ding, Hao Xu, Qikun Pan","doi":"10.1117/12.2644256","DOIUrl":"https://doi.org/10.1117/12.2644256","url":null,"abstract":"Hyperspectral Image (HSI) classification aims to assign each hyperspectral pixel with an appropriate land-cover category. In recent years, deep learning (DL) has received attention from a growing number of researchers. Hyperspectral image classification methods based on DL have shown admirable performance, but there is still room for improvement in terms of exploratory capabilities in spatial and spectral dimensions. To improve classification accuracy and reduce training samples, we propose a double branch attention network (OCDAN) based on 3-D octave convolution and dense block. Especially, we first use a 3-D octave convolution model and dense block to extract spatial features and spectral features respectively. Furthermore, a spatial attention module and a spectral attention module are implemented to highlight more discriminative information. Then the extracted features are fused for classification. Compared with the state-of-the-art methods, the proposed framework can achieve superior performance on two hyperspectral datasets, especially when the training samples are signally lacking. In addition, ablation experiments are utilized to validate the role of each part of the network.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"166 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133156087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
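Octave convolution is a published operator (Chen et al., 2019); a minimal 3-D sketch follows, with the channel split alpha = 0.5 and the up/down-sampling choices as assumptions — OCDAN's exact configuration is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctaveConv3d(nn.Module):
    def __init__(self, in_ch, out_ch, alpha=0.5):
        super().__init__()
        in_lo, out_lo = int(alpha * in_ch), int(alpha * out_ch)
        in_hi, out_hi = in_ch - in_lo, out_ch - out_lo
        # Four paths exchanging information between frequency groups.
        self.hh = nn.Conv3d(in_hi, out_hi, 3, padding=1)  # high -> high
        self.hl = nn.Conv3d(in_hi, out_lo, 3, padding=1)  # high -> low
        self.lh = nn.Conv3d(in_lo, out_hi, 3, padding=1)  # low  -> high
        self.ll = nn.Conv3d(in_lo, out_lo, 3, padding=1)  # low  -> low

    def forward(self, x_hi, x_lo):
        # x_hi: (B, C_hi, D, H, W); x_lo is kept at half resolution.
        y_hi = self.hh(x_hi) + F.interpolate(self.lh(x_lo), scale_factor=2)
        y_lo = self.ll(x_lo) + self.hl(F.avg_pool3d(x_hi, 2))
        return y_hi, y_lo

hi, lo = torch.randn(1, 8, 8, 16, 16), torch.randn(1, 8, 4, 8, 8)
y_hi, y_lo = OctaveConv3d(16, 16)(hi, lo)
print(y_hi.shape, y_lo.shape)  # (1, 8, 8, 16, 16) (1, 8, 4, 8, 8)
```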