International Conference on Digital Image Processing: Latest Publications

Multi-visual information fusion and aggregation for video action classification
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644312
Xuchao Gong, Zongmin Li, Xiangdong Wang
{"title":"Multi-visual information fusion and aggregation for video action classification","authors":"Xuchao Gong, Zongmin Li, Xiangdong Wang","doi":"10.1117/12.2644312","DOIUrl":"https://doi.org/10.1117/12.2644312","url":null,"abstract":"In order to fully mine the performance improvement of spatio-temporal features in video action classification, we propose a multi-visual information fusion time sequence prediction network (MI-TPN) which based on the feature aggregation model ActionVLAD. The method includes three parts: multi-visual information fusion, time sequence feature modeling and spatiotemporal feature aggregation. In the multi-visual information fusion, the RGB features and optical flow features are combined, the visual context and action description details are fully considered. In time sequence feature modeling, the temporal relationship is modeled by LSTM to obtain the importance measurement between temporal description features. Finally, in feature aggregation, time step feature and spatiotemporal center attention mechanism are used to aggregate features and projected them into a common feature space. This method obtains good results on three commonly used comparative datasets UCF101, HMDB51 and Something.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114891973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
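A minimal sketch of the temporal-attention idea described above: an LSTM scores per-frame features and the softmax-normalized scores weight the aggregation. Dimensions, module names, and the assumption that RGB and optical-flow features arrive already fused per frame are ours, not the paper's:

```python
import torch
import torch.nn as nn

class TemporalAttentionAggregator(nn.Module):
    """Toy stand-in for MI-TPN's temporal modeling: an LSTM scores each
    time step, and softmax-normalized scores weight the aggregation."""
    def __init__(self, feat_dim=512, hidden_dim=256):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)    # importance per time step

    def forward(self, frame_feats):              # (B, T, feat_dim)
        h, _ = self.lstm(frame_feats)            # (B, T, hidden_dim)
        w = torch.softmax(self.score(h), dim=1)  # (B, T, 1) attention weights
        return (w * frame_feats).sum(dim=1)      # (B, feat_dim) clip descriptor

# Fused per-frame RGB + optical-flow features go in; a linear classifier on
# the aggregated vector would predict the action class.
feats = torch.randn(2, 16, 512)                  # 2 clips, 16 frames each
clip_vec = TemporalAttentionAggregator()(feats)
```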
A defect detection method for plastic gears based on deep learning and machine vision
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644273
Y. Hao, Meng Xiang, Zichao Zhu
{"title":"A defect detection method for plastic gears based on deep learning and machine vision","authors":"Y. Hao, Meng Xiang, Zichao Zhu","doi":"10.1117/12.2644273","DOIUrl":"https://doi.org/10.1117/12.2644273","url":null,"abstract":"For the detection of plastic gears, most factories still use manual method with measurement tools. Therefore, the efforts expended in their defect detection are tremendous in the production processes. This paper proposes a new method that detects defection for plastic gears during their production and recycling processes. An image dataset of different kind of plastic gears was created. Then, a defect detection DL model was proposed based on GoogLeNet; it detected whether the plastic gears have missing teeth (MT), edge fin (EF), or good quality (GQ). An independent dataset was created to test the DL model: the accuracy of this model reached 94.8%. Combined with MV and DL methods, this paper realizes the automatic detection of plastic gear defects. Based on the independent plastic gear data set, the effect of defect detection method is verified by experiments. The results have important theoretical value and practical significance for liberating manpower and promoting the automatic process of plastic gear defect detection.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125987872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
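The GoogLeNet-based classifier implied by the abstract follows the standard fine-tuning recipe; a minimal sketch, assuming torchvision's pretrained model stands in for the authors' network:

```python
import torch
import torch.nn as nn
from torchvision import models

# Swap the 1000-class ImageNet head for the three gear classes:
# missing teeth (MT), edge fin (EF), good quality (GQ).
model = models.googlenet(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, 3)

model.eval()
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))   # one normalized gear image
    label = ["MT", "EF", "GQ"][logits.argmax(dim=1).item()]
```

In practice the new head (and usually some later layers) would be trained on the gear dataset before inference.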
Reference-driven undersampled MRI reconstruction using automated stopping deep image prior
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644282
Guisong Wang, Xiaofeng Du, Yanhua Qin, Yifan He
{"title":"Reference-driven undersampled MRI reconstruction using automated stopping deep image prior","authors":"Guisong Wang, Xiaofeng Du, Yanhua Qin, Yifan He","doi":"10.1117/12.2644282","DOIUrl":"https://doi.org/10.1117/12.2644282","url":null,"abstract":"Magnetic resonance image (MRI) reconstruction from undersampled k-space data using unsupervised learning methods suffers from insufficient a priori knowledge and the lack of stopping criterion. This work introduces a high-resolution reference image to tackle these issues. Specifically, we explicitly broadcast the reference image into the proposed network, transferring the reference image structure priors to the recovered image. In addition, the reference image helps to develop a criterion to determine the best-reconstructed image, so training stops automatically once the conditions are met. Experimental results show that the proposed method can reduce artifacts without using a priori training set.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"12342 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130310753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
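A schematic of the automated-stopping idea, under stated assumptions: `net` is any image-generating CNN, the data term enforces k-space consistency on the sampling mask, and the patience-based stop rule is one reading of "training stops automatically once the conditions are met":

```python
import torch
import torch.nn.functional as F

def dip_reconstruct(net, z, y_under, mask, ref, max_iters=3000, patience=100):
    """Deep-image-prior loop: fit net(z) to the undersampled k-space data
    y_under, and stop once similarity to the high-resolution reference
    image stops improving (a proxy for the paper's stopping criterion)."""
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    best_score, best_img, stall = float("-inf"), None, 0
    for _ in range(max_iters):
        x = net(z)                                         # candidate image
        k = torch.fft.fft2(x)                              # to k-space
        loss = (mask * (k - y_under)).abs().pow(2).mean()  # data consistency
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():                              # reference similarity
            score = -F.mse_loss(x, ref).item()             # crude proxy metric
        if score > best_score:
            best_score, best_img, stall = score, x.detach().clone(), 0
        else:
            stall += 1
            if stall >= patience:                          # automated stopping
                break
    return best_img
```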
Tobacco plant disease dataset
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644288
Hong Lin, Rita Tse, Su-Kit Tang, Z. Qiang, Jinliang Ou, Giovanni Pau
{"title":"Tobacco plant disease dataset","authors":"Hong Lin, Rita Tse, Su-Kit Tang, Z. Qiang, Jinliang Ou, Giovanni Pau","doi":"10.1117/12.2644288","DOIUrl":"https://doi.org/10.1117/12.2644288","url":null,"abstract":"Tobacco is a valuable plant in agricultural and commercial industry. Any disease infection to the plant may lower the harvest and interfere the operation of supply chain in the market. Image-based deep learning methods are cutting-edge technologies that can facilitate the diagnosis of diseases efficiently and effectively when large-scale dataset is available for training. However, there is not a public dataset about tobacco currently. A comprehensive dataset is appealed to take advantage of deep learning methods in tobacco cultivation urgently. In this paper, we propose to create a specific dataset for tobacco diseases, called Tobacco Plant Disease Dataset (TPDD). 2721 tobacco leaf images are taken in field. The dataset serves for two purposes: disease classification and leaf detection. For classification, we identify 12 classes and provide two types of disease annotations: 1) Whole Leaf Section; 2) Disease Fragment Section. For leaf detection, we provide two kinds of bounding box: rectangle bounding box and polygon bounding box. In addition, we conduct baseline experiments to illustrate the usefulness of TPDD: 1) using deep learning model to detect single disease and multiple diseases; 2) using YOLO-v3 and Mask-RCNN to detect leaves. We hope that the dataset could support the tobacco industry, also be a benchmark in fine-grained vision classification.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127360859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
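A loader sketch for the classification task; TPDD's on-disk layout is not given in the abstract, so the annotations.csv format below is purely hypothetical:

```python
import csv
from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset

class TPDDClassification(Dataset):
    """Illustrative loader for the 12-class disease-classification split.
    Assumes a hypothetical annotations.csv with image_path,label columns."""
    def __init__(self, root, transform=None):
        self.root = Path(root)
        self.transform = transform
        with open(self.root / "annotations.csv") as f:
            self.items = [(row["image_path"], int(row["label"]))
                          for row in csv.DictReader(f)]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        path, label = self.items[idx]
        img = Image.open(self.root / path).convert("RGB")
        if self.transform is not None:
            img = self.transform(img)
        return img, label
```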
Measuring system for elongation at break of cable insulation sheath based on machine vision
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643116
X. Su, Gangwei Wang, Zhiqiang Zhang, Jiale Yang, Zhijia Zhang
{"title":"Measuring system for elongation at break of cable insulation sheath based on machine vision","authors":"X. Su, Gangwei Wang, Zhiqiang Zhang, Jiale Yang, Zhijia Zhang","doi":"10.1117/12.2643116","DOIUrl":"https://doi.org/10.1117/12.2643116","url":null,"abstract":"In the production of power cables, the performance test of the cable insulation sheath is an important part. Compared with traditional testing methods, machine vision has the advantages of stable operation, high precision, and high efficiency. Because of this situation, firstly, based on machine vision theory, the structure of the old-fashioned tensile machine was reconstructed, and the whole tensile test process of the cable insulation sheath test was imaged by a CMOS camera, and the color recognition algorithm, effective area segmentation algorithm, and workpiece were proposed. The fracture judgment detection algorithm and the corrosion difference algorithm are used to calculate the distance between the marked lines and then calculate the elongation at the break of the cable material. Through systematic experiments on the same batch of cable jackets, the deviation of the elongation at break measured by visual inspection is the largest, no more than 1%. The experimental results and practical applications show that the machine vision-based visual inspection system has higher accuracy, faster efficiency, and more stable and reliable operation than the traditional inspection system.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"12342 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129415939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
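The final computation — elongation from the distance between the two gauge marks — is easy to sketch. The HSV color range and the two-largest-blob heuristic are assumptions; the paper's color-recognition and segmentation algorithms are not detailed in the abstract:

```python
import cv2
import numpy as np

def mark_distance(frame_bgr, lower_hsv, upper_hsv):
    """Pixel distance between the two colored gauge marks on the specimen."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    marks = sorted(contours, key=cv2.contourArea, reverse=True)[:2]
    centroids = []
    for c in marks:                  # centroid of each mark via image moments
        m = cv2.moments(c)
        centroids.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    (x0, y0), (x1, y1) = centroids
    return float(np.hypot(x1 - x0, y1 - y0))

def elongation_at_break(initial_px, at_break_px):
    """Elongation at break as a percentage of the initial gauge length."""
    return 100.0 * (at_break_px - initial_px) / initial_px
```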
3D face alignment and face reconstruction based on image sequence
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644477
Y. Wei, Biao Qiao, Hua-bin Wang, Mengxin Zhang, Shijun Liu, L. Tao
{"title":"3D face alignment and face reconstruction based on image sequence","authors":"Y. Wei, Biao Qiao, Hua-bin Wang, Mengxin Zhang, Shijun Liu, L. Tao","doi":"10.1117/12.2644477","DOIUrl":"https://doi.org/10.1117/12.2644477","url":null,"abstract":"Existing 3D face alignment and face reconstruction methods mainly focus on the accuracy of the model. When the existing methods are applied to dynamic videos, the stability and accuracy are significantly reduced. To overcome this problem, we propose a novel regression framework that strikes a balance between accuracy and stability. First, on the basis of lightweight backbone, encoder-decoder structure is used to jointly learn expression details and detailed 3D face from video images to recover shape details and their relationship to facial expression, and dynamic regression of a small number of 3D face parameters, effectively improve the speed and accuracy. Secondly, in order to further improve the stability of face landmarks in video, a jitter loss function of multi-frame image joint learning is proposed to strengthen the correlation between frames and face landmarks in video, and reduce the difference amplitude of face landmarks between adjacent frames to reduce the jitter of face landmarks. Experiments on several challenging datasets verify the effectiveness of our method.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127625035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
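The abstract does not give the jitter loss in closed form; one plausible reading — penalizing the change of frame-to-frame landmark motion across a clip — can be sketched as:

```python
import torch

def jitter_loss(landmarks):
    """landmarks: (T, N, 3) — T video frames, N 3D landmarks per frame.
    Penalizes how much the per-landmark motion changes between adjacent
    frames, which damps high-frequency jitter while allowing smooth motion."""
    delta = landmarks[1:] - landmarks[:-1]   # motion between frames (T-1, N, 3)
    accel = delta[1:] - delta[:-1]           # change of motion      (T-2, N, 3)
    return accel.pow(2).mean()
```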
Semantic segmentation of road scene based on multi-scale feature extraction and deep supervision
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644695
Longfei Wang, Chunman Yan
{"title":"Semantic segmentation of road scene based on multi-scale feature extraction and deep supervision","authors":"Longfei Wang, Chunman Yan","doi":"10.1117/12.2644695","DOIUrl":"https://doi.org/10.1117/12.2644695","url":null,"abstract":"Aiming at the problems of inaccurate segmentation edges, poor adaptability to multi-scale road targets, prone to false segmentation and missing segmentation when segmenting road targets with various and changeable occlusions in the traditional U-Net model, a semantic segmentation model of road scene based on multi-scale feature extraction and deep supervision module is proposed. Firstly, the dual attention module is embedded in the U-Net encoder, which can make the model have the ability to capture the context information of channel dimension and spatial dimension in the global range, and enhance the road features; Secondly, before upsampling, the feature map containing high-level semantic information is input into ASPP module to obtain road features of different scales; Finally, the deep supervision module is introduced into the upsampling part to learn the feature representation at different levels and retain more road detail features. Experiments are carried out on CamVid dataset and Cityscapes dataset. The results show that our Network can effectively segment road targets with different scales, and the segmented road contour is more complete and clear, which improves the accuracy of semantic segmentation while ensuring a certain segmentation speed.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122639747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
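The ASPP module invoked above is the standard DeepLab-style component; a minimal sketch with illustrative dilation rates and channel widths (the paper's exact configuration is not stated in the abstract):

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling: parallel dilated convolutions gather
    context at several scales, then a 1x1 convolution fuses the branches."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates
        )
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))
```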
Cluster-based point cloud attribute compression using inter prediction and graph Fourier transform
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644218
Jiaying Liu, Jin Wang, Longhua Sun, Jie Pei, Qing Zhu
{"title":"Cluster-based point cloud attribute compression using inter prediction and graph Fourier transform","authors":"Jiaying Liu, Jin Wang, Longhua Sun, Jie Pei, Qing Zhu","doi":"10.1117/12.2644218","DOIUrl":"https://doi.org/10.1117/12.2644218","url":null,"abstract":"With the rapid development of 3D capture technologies, point cloud has been widely used in many emerging applications such as augmented reality, autonomous driving, and 3D printing. However, point cloud, used to represent real world objects in these applications, may contain millions of points, which results in huge data volume. Therefore, efficient compression algorithms are essential for point cloud when it comes to storage and real-time transmission issues. Specially, the attribute compression of point cloud is still challenging owing to the sparsity and irregular distribution of corresponding points in 3D space. In this paper, we present a novel point cloud attribute compression scheme based on inter-prediction of blocks and graph Laplacian transforms for attributes residual. Firstly, we divide the entire point cloud into adaptive sub-clouds via K-means based on the geometry to acquire sub-clouds, which enables efficient representation with less cost. Secondly, the sub-clouds are divided into two parts, one is the attribute means of the sub clouds, another is the attribute residual by removing the means. For the attribute means, we use inter-prediction between sub-clouds to remove the attribute redundancy, and the attribute residual is encoded after graph Fourier transforming. Experimental results demonstrate that the proposed scheme is much more efficient than traditional attribute compression schemes.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115594123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
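The transform step can be sketched directly from the description: build a graph on the sub-cloud geometry, eigendecompose its Laplacian, and project the mean-removed attribute residuals onto the eigenvectors. The Gaussian-weighted graph is an assumption, and quantization and entropy coding are omitted:

```python
import numpy as np

def graph_fourier_transform(points, residuals, sigma=1.0):
    """Forward GFT of attribute residuals within one sub-cloud."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma**2))     # Gaussian similarity graph
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W         # combinatorial graph Laplacian
    _, U = np.linalg.eigh(L)               # eigenvectors = GFT basis
    return U.T @ residuals, U              # coefficients; invert via U @ coeffs

pts = np.random.rand(64, 3)                # one sub-cloud's geometry
res = np.random.rand(64, 3) - 0.5          # RGB residuals after mean removal
coeffs, U = graph_fourier_transform(pts, res)
```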
Automatic heart segmentation based on convolutional networks using attention mechanism
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643378
Guodong Zhang, Yu Liu, Wei Guo, Wenjun Tan, Zhaoxuan Gong, M. Farooq
{"title":"Automatic heart segmentation based on convolutional networks using attention mechanism","authors":"Guodong Zhang, Yu Liu, Wei Guo, Wenjun Tan, Zhaoxuan Gong, M. Farooq","doi":"10.1117/12.2643378","DOIUrl":"https://doi.org/10.1117/12.2643378","url":null,"abstract":"Heart segmentation is challenging due to the poor image contrast of heart in the CT images. Since manual segmentation of the heart is tedious and time-consuming, we propose an attention-based Convolution Neural Network (CNN) for heart segmentation. First, one-hot preprocessing is performed on the multi-tissue CT images. U-Net network with Attention-gate is then applied to obtain the heart region. We compared our method with several CNN methods in terms of dice coefficient. Results show that our method outperforms other methods for segmentation.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115969236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
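The attention gate here is the standard Attention U-Net component (Oktay et al.); a minimal single-scale sketch, with the usual resampling between decoder and encoder resolutions omitted:

```python
import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    """The decoder's gating signal g suppresses irrelevant regions in the
    encoder skip connection x before the two are concatenated."""
    def __init__(self, g_ch, x_ch, inter_ch):
        super().__init__()
        self.wg = nn.Conv2d(g_ch, inter_ch, 1)
        self.wx = nn.Conv2d(x_ch, inter_ch, 1)
        self.psi = nn.Conv2d(inter_ch, 1, 1)

    def forward(self, g, x):  # g, x assumed to share spatial size here
        a = torch.sigmoid(self.psi(torch.relu(self.wg(g) + self.wx(x))))
        return x * a          # re-weighted skip features
```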
Radio frequency interference suppression based on two-dimensional frequency domain notch for P-band ultra-wideband SAR
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643264
Kang Liang, Hongtu Xie, Xinqiao Jiang, Xiao Hu, Kaipeng Chen, Guoqian Wang
{"title":"Radio frequency interference suppression based on two-dimensional frequency domain notch for P-band ultra-wideband SAR","authors":"Kang Liang, Hongtu Xie, Xinqiao Jiang, Xiao Hu, Kaipeng Chen, Guoqian Wang","doi":"10.1117/12.2643264","DOIUrl":"https://doi.org/10.1117/12.2643264","url":null,"abstract":"P-band ultra-wideband synthetic aperture radar (UWB SAR) not only has the characteristics of the high-resolution imaging, but also has the well capability of the foliage penetrating, which is potential of detecting and imaging the concealed target under the vegetation. However, there are a lot of the radio, television and mobile communication signals in the P-band, which are called as the radio frequency interference (RFI) signals. These RFI signals are mixed with target echo signals, which will cause the serious interference in the P-band UWB SAR imaging. The traditional notch method is easy to implement the RFI suppression, so it has been widely used. However, the traditional notch method is to notch each pulse echo individually, which has a high computational complexity. At the same time, the RFI suppression of each pulse echo separately will always lead to a large amount of the residual interference, so the traditional notch method has the poor RFI suppression effect. Based on the traditional notch method, this paper proposes an RFI suppression method based on the two-dimensional frequency domain (2DFD) notch, which can realize one-time processing of all echo pulses so that improve the efficiency of the RFI suppression. Meanwhile, because the bandwidth of the RFI signal is much smaller than that of the SAR echo signal, converting the received SAR echo signal to the 2DFD can further concentrate the energy of the RFI signals, so it has the better RFI suppression effect. The simulation results show that the proposed RFI suppression method based on the 2DFD notch can not only improve the efficiency of the RFI suppression but also have the better effect of the RFI suppression.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"617 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115827208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
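A schematic of the 2DFD notch: one 2D FFT over all pulses at once, spike detection, zeroing, inverse transform. The median-based threshold is a stand-in, since the abstract does not state the notch rule:

```python
import numpy as np

def rfi_notch_2dfd(echo, k=5.0):
    """echo: (num_pulses, num_range_samples) complex raw data.
    Narrowband RFI concentrates into isolated spikes in the 2D spectrum,
    so zeroing bins far above the median magnitude suppresses it."""
    spec = np.fft.fft2(echo)              # all pulses transformed in one pass
    mag = np.abs(spec)
    spec[mag > k * np.median(mag)] = 0.0  # notch the RFI spikes
    return np.fft.ifft2(spec)

echo = np.random.randn(128, 256) + 1j * np.random.randn(128, 256)
clean = rfi_notch_2dfd(echo)
```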