International Conference on Digital Image Processing: Latest Publications

Deep learning techniques for image recognition of counterfeited luxury handbags materials
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644669
P. Apipawinwongsa, Y. Limpiyakorn
{"title":"Deep learning techniques for image recognition of counterfeited luxury handbags materials","authors":"P. Apipawinwongsa, Y. Limpiyakorn","doi":"10.1117/12.2644669","DOIUrl":"https://doi.org/10.1117/12.2644669","url":null,"abstract":"Due to the fact that counterfeit in second-handed goods terribly affects trading in markets of second-handed luxury bags, users in this research thus present studies of methods to classify genuineness of ‘Gucci GG Canvas’ with the pretrained model from Model VGG16 and with DenseNet121 to design deep Convolutional Neural Networks (CNN) model for binary classification. The CNN together with DenseNet121 model comprises accuracy at 95%, which is more than the 2 prior models, i.e., CNN from scratch and CNN together with VGG16.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"236 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134192150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
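A minimal sketch of the transfer-learning setup the abstract describes, using Keras. The input size, head layers, and training hyperparameters are illustrative assumptions, not the authors' configuration.

```python
# Sketch: DenseNet121 backbone with a binary-classification head
# (genuine vs. counterfeit). Layer sizes and hyperparameters are
# assumptions, not the paper's exact setup.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import DenseNet121

base = DenseNet121(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained features first

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),  # genuine vs. counterfeit
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)
```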
Stochastic recursive gradient descent optimization-based on foreground features of Fisher vector
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644640
Mohamed Gamal M. Kamaleldin, S. Abu-Bakar, U. U. Sheikh
{"title":"Stochastic recursive gradient descent optimization-based on foreground features of Fisher vector","authors":"Mohamed Gamal M. Kamaleldin, S. Abu-Bakar, U. U. Sheikh","doi":"10.1117/12.2644640","DOIUrl":"https://doi.org/10.1117/12.2644640","url":null,"abstract":"Human action recognition has been one of the hot topics in computer vision both from the handcrafted and deep learning approaches. In the handcrafted approach, the extracted features are encoded for reducing the size of these features. Amonsgt the state-of-the-art approaches is to encode these visual features using the Gaussian mixture model. However, the size of the codebook is an issue in terms of the computation complexity, especially for large-scale data as it requires encoding using a large codebook. In this paper, we introduced the use of different optimizers to reduce the codebook size while boosting its accuracy. To illustrate the performance , first we use the improved dense trajectories (IDT) to extract the handcrafted features. This is followed with encoding the descriptor using Fisher kernel-based codebook using the Gaussian mixture model. Next, the support vector machine is used to classify the categories. We then use and compare five different Stochastic gradient descent optimization techniques to modify the number of Gaussian components. In this manner we are able to select the discriminative foreground features (as represented by the final number of Gaussian components), and omit the background features. Finally, to show the performance improvement of the proposed method, we implement this technique to two datasets UCF101 and HMDB51.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131472787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
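A compact sketch of the encoding-and-classification pipeline the abstract outlines: a diagonal-covariance GMM codebook, a first-order Fisher vector (mean-gradient terms only, for brevity), and a linear SVM. The number of components K is the quantity the paper tunes; 32 here, and the random stand-in descriptors, are arbitrary assumptions.

```python
# Sketch: Fisher-vector encoding over a GMM codebook + linear SVM.
# Only the mean-gradient terms are computed; K = 32 is arbitrary.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import LinearSVC

K, D = 32, 64                             # Gaussian components, descriptor dim
rng = np.random.default_rng(0)
train_desc = rng.normal(size=(5000, D))   # stand-in for IDT descriptors

gmm = GaussianMixture(n_components=K, covariance_type="diag").fit(train_desc)

def fisher_vector(X, gmm):
    """First-order Fisher vector: gradients w.r.t. the Gaussian means."""
    q = gmm.predict_proba(X)                        # (N, K) posteriors
    diff = (X[:, None, :] - gmm.means_) / np.sqrt(gmm.covariances_)
    fv = (q[:, :, None] * diff).sum(axis=0)         # (K, D)
    fv /= X.shape[0] * np.sqrt(gmm.weights_)[:, None]
    fv = fv.ravel()
    fv = np.sign(fv) * np.sqrt(np.abs(fv))          # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)        # L2 normalization

# One Fisher vector per clip, then a linear SVM over the encodings:
# clf = LinearSVC().fit(np.stack([fisher_vector(x, gmm) for x in clips]), y)
```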
PointIt3D: a benchmark dataset and baseline for pointed object detection task
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2645330
Chun-Tse Lin, Hongxin Zhang, Hao Zheng
{"title":"PointIt3D: a benchmark dataset and baseline for pointed object detection task","authors":"Chun-Tse Lin, Hongxin Zhang, Hao Zheng","doi":"10.1117/12.2645330","DOIUrl":"https://doi.org/10.1117/12.2645330","url":null,"abstract":"Pointed object detection is of great importance for human-machine interaction, but attempts to solve this task may run into the difficulties of lack of available large scale datasets since people hardly record 3D scenes with a human pointing at specific objects. In efforts to mitigate this gap, we cultivate the first benchmark dataset for this task: PointIt3D (available at https://pan.baidu.com/share/init?surl=E3u96E7dEXnrR1dDris_1w (access code: jps5)), containing 347 scans now and can be easily scaled up to facilitate future utilizations, which is automatically constructed from existing 3D scenes from ScanNet1 and 3D people models using our novel synthetic algorithm that achieves a high acceptable rate of more than 85% according to three experts’ assessments, which hopefully would pave the way for further studies. We also provide a simple yet effective baseline based on anomaly detection and majority voting pointline generation to solve this task based on our dataset, which achieves accuracy of 55.33%, leaving much room for further improvements. Code will be released at https://github.com/XHRlyb/PointIt3D.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"14 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127596998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
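A sketch of just the final selection step such a baseline needs: given a pointing ray (origin plus direction), pick the object whose center lies closest to the ray. The ray itself would come from the paper's anomaly-detection and majority-voting stages, which are not shown; the function and its inputs are illustrative assumptions.

```python
# Sketch: select the object nearest to an estimated pointing ray.
# The ray estimation (anomaly detection + majority voting) is omitted.
import numpy as np

def pointed_object(origin, direction, object_centers):
    d = direction / np.linalg.norm(direction)
    v = object_centers - origin                   # vectors to each center
    t = np.clip(v @ d, 0.0, None)                 # project onto the ray
    dist = np.linalg.norm(v - t[:, None] * d, axis=1)
    return int(np.argmin(dist))                   # index of the nearest object

centers = np.array([[1.0, 0.2, 0.5], [3.0, 1.1, 0.4], [2.0, -0.8, 1.2]])
print(pointed_object(np.zeros(3), np.array([1.0, 0.3, 0.2]), centers))
```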
A new zero-watermarking algorithm based on deep learning
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643017
Jing Liu, Qian Li, Hui Yang
{"title":"A new zero-watermarking algorithm based on deep learning","authors":"Jing Liu, Qian Li, Hui Yang","doi":"10.1117/12.2643017","DOIUrl":"https://doi.org/10.1117/12.2643017","url":null,"abstract":"A new zero-watermarking algorithm based on deep learning is proposed to improve the robustness of the zero-watermarking, in which zero-watermarking image generation and copyright verification are both completed using neural networks. First, a stylized image is generated from a host image and a logo image with a time stamp through VGG network. Then, the stylized image is encrypted by the Arnold transform and registered as a zero-watermarking image in Intellectual Property Protection (IPR). Finally, the RCNN network is designed to extract the logo image to verify the copyright of host images. The experimental results show that the security and robustness of the algorithm are better than the existing zero-watermarking algorithm.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132921018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
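A minimal sketch of the Arnold (cat map) scrambling step the abstract applies to the stylized image before registering it; the iteration count serves as the key. This shows the standard transform only, not the paper's full pipeline.

```python
# Sketch: Arnold cat-map scrambling of a square image.
# (x, y) -> ((x + y) mod N, (x + 2y) mod N), iterated n times.
import numpy as np

def arnold(img, n=1):
    assert img.shape[0] == img.shape[1], "Arnold map needs a square image"
    N = img.shape[0]
    out = img.copy()
    x, y = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    for _ in range(n):
        nx, ny = (x + y) % N, (x + 2 * y) % N
        scrambled = np.empty_like(out)
        scrambled[nx, ny] = out[x, y]   # the map is a bijection on the grid
        out = scrambled
    return out
```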
Multi-modal transformer for video retrieval using improved sentence embeddings
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643741
Zhi Liu, Fangyuan Zhao, Mengmeng Zhang
{"title":"Multi-modal transformer for video retrieval using improved sentence embeddings","authors":"Zhi Liu, Fangyuan Zhao, Mengmeng Zhang","doi":"10.1117/12.2643741","DOIUrl":"https://doi.org/10.1117/12.2643741","url":null,"abstract":"With the explosive growth of the number of online videos, video retrieval becomes increasingly difficult. Multi-modal visual and language understanding based video-text retrieval is one of the mainstream framework to solve this problem. Among them, MMT (Multi-modal Transformer) is a novel and mainstream model. On the language side, BERT (Bidirectional Encoder Representation for Transformers) is used to encode text, where the pretrained BERT will be fine tuned during training. However, there exists a mismatch in this stage. The pre-training tasks of BERT is based on NSP (Next Sentence Prediction) and MLM(masked language model) which have weak correlation with video retrieval. For text encoder will encode text into semantic embeddings. On the visual side, Transformer is used to aggregate multimodal experts of videos. We find that the output of visual transformer is not fully utilized. In this paper, a sentence- BERT model is introduced to substitute BERT model in MMT to improve sentence embeddings efficiency. In addition, a max-pooling layer is adopted after Transformer to improve the utilization efficiency of the output of the model. Experiment results show that the proposed model outperforms MMT.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124678369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
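A sketch of the two modifications in isolation: Sentence-BERT-style sentence embeddings on the text side, and max-pooling over the visual transformer's output tokens. The checkpoint name, embedding dimension, and token shapes are illustrative assumptions, not the paper's configuration.

```python
# Sketch: Sentence-BERT query embedding + max-pooled visual tokens,
# scored by cosine similarity. Shapes and checkpoint are assumptions.
import torch
from sentence_transformers import SentenceTransformer

text_encoder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed checkpoint
query_emb = torch.tensor(text_encoder.encode(["a dog catches a frisbee"]))

# Visual side: aggregate per-expert token outputs by max-pooling
# instead of leaving them under-utilized.
video_tokens = torch.randn(1, 7, 384)        # (batch, experts, dim) stand-in
video_emb = video_tokens.max(dim=1).values   # (batch, dim)

score = torch.cosine_similarity(query_emb, video_emb)    # retrieval score
```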
Real-time ranging of traffic signs for smart car environment perception
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2643117
Peng Liu, Wen-hui Pei, Zhi-jia Zhang, Jin-zhi Du, Xiu-tian Wang
{"title":"Real-time ranging of traffic signs for smart car environment perception","authors":"Peng Liu, Wen-hui Pei, Zhi-jia Zhang, Jin-zhi Du, Xiu-tian Wang","doi":"10.1117/12.2643117","DOIUrl":"https://doi.org/10.1117/12.2643117","url":null,"abstract":"In order to effectively solve the problem of real-time distance measurement of traffic signs in intelligent driving environment perception, a distance measurement method based on binocular vision is proposed. In order to solve the problem of real-time distance measurement, the paper proposes to build a correction mapping table, through which the correction coordinates corresponding to any distorted coordinates can be read out. The calibrated parameters are used to calculate the correction mapping table. The coordinates of left and right traffic signs can be obtained through pyramid template matching. Then the parallax is obtained and the distance is measured. The error rate of the measurement method is less than 2.33% within 20 meters to 60 meters. The time of one-time measurement is within 20ms in embedded environment.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127030913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
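A sketch of the standard rectified-stereo depth relation such a method rests on once template matching yields the sign's left and right image coordinates: Z = f·B/d, with focal length f in pixels and baseline B in meters. The numbers below are illustrative, not the paper's calibration.

```python
# Sketch: pinhole-stereo depth from disparity, Z = f * B / d.
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    disparity = x_left - x_right   # pixels; positive for rectified pairs
    if disparity <= 0:
        raise ValueError("non-positive disparity: mismatch or point at infinity")
    return focal_px * baseline_m / disparity

# e.g. f = 800 px, B = 0.12 m, disparity of 4 px -> 24 m
print(depth_from_disparity(640.0, 636.0, 800.0, 0.12))
```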
Infrared and visible images fusion method based on unsupervised learning
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644227
Depeng Zhu, Weida Zhan, Yichun Jiang, Xiaoyu Xu, Renzhong Guo, Yu Chen
{"title":"Infrared and visible images fusion method based on unsupervised learning","authors":"Depeng Zhu, Weida Zhan, Yichun Jiang, Xiaoyu Xu, Renzhong Guo, Yu Chen","doi":"10.1117/12.2644227","DOIUrl":"https://doi.org/10.1117/12.2644227","url":null,"abstract":"Aiming at the problem that the current infrared and visible image fusion based on deep learning has no labels, this paper proposes an infrared and visible image fusion algorithm based on unsupervised learning. This method utilizes the characteristics of unsupervised learning, and introduces infrared image information with high gray value into the visible image to obtain the fusion image. The deep learning network proposed in this paper is composed of 6 layers of convolution blocks, and a dual attention module is also designed to make the fusion image pay more attention to the high gray value area in the infrared image. By introducing skip connections, the shallow features are fused with the deep features, so that the details of the entire fused image are richer and the appearance of halos is reduced. A large number of experimental results show that the fusion method proposed in this paper can accurately highlight the target object while maintaining the visible texture details, enhance the visual effect of the human eye, and improve the target recognition. At the same time, the quantitative experimental results show that the fusion algorithm proposed in this paper has obvious advantages in multiple indicators.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130579981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
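A skeleton, in the spirit of the abstract, of a small convolutional fusion network with a shallow-to-deep skip connection and a simple attention gate. Channel widths and the single sigmoid gate are assumptions; the paper's dual attention module and six-block design are more elaborate.

```python
# Sketch: IR + visible fusion net with an attention gate and a skip
# connection. Widths and the gate are placeholder assumptions.
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.shallow = nn.Sequential(nn.Conv2d(2, 32, 3, padding=1), nn.ReLU())
        self.deep = nn.Sequential(
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.attn = nn.Sequential(nn.Conv2d(32, 32, 1), nn.Sigmoid())
        self.out = nn.Conv2d(64, 1, 3, padding=1)

    def forward(self, ir, vis):
        s = self.shallow(torch.cat([ir, vis], dim=1))   # shallow features
        d = self.deep(s) * self.attn(s)                 # attention-weighted deep
        return torch.sigmoid(self.out(torch.cat([s, d], dim=1)))  # skip fusion

fused = FusionNet()(torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64))
```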
An efficient image encryption scheme based on variable row-columns scrambling and selective block diffusion
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644699
W. Huang, Lyu Xin, C. Zhang, Xin Li, Yifei Su, Tao Zeng, Xinyu Wang, Zhennan Xu, Yucong Huang
{"title":"An efficient image encryption scheme based on variable row-columns scrambling and selective block diffusion","authors":"W. Huang, Lyu Xin, C. Zhang, Xin Li, Yifei Su, Tao Zeng, Xinyu Wang, Zhennan Xu, Yucong Huang","doi":"10.1117/12.2644699","DOIUrl":"https://doi.org/10.1117/12.2644699","url":null,"abstract":"With the continuous development of multimedia technology and shooting hardware, more and more high-quality images appear in life and production. At present, there are many image encryption methods to ensure image security, but most of them cannot meet the real-time requirements of large-size image encryption, which causes great obstacles to the application and popularization of image encryption. In this paper, an efficient image encryption method based on variable row-columns scrambling and dynamic threshold selective block diffusion is designed. The pixels are operated in batches row by column and block, and the length of chaotic sequence required is reduced without reducing the security, thus reducing the time consuming of the encryption system. In addition, the modified five-point sampling is adopted to select chaotic sequence, which improves the utilization rate of chaotic sequence and further improves the encryption efficiency. Experimental results show that the proposed method has a higher encryption efficiency than the existing methods with similar security, and has strong practicability.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132422538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
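A sketch of the scrambling half of such a scheme: a logistic-map chaotic sequence drives a row/column permutation. The diffusion stage and the five-point sampling trick are omitted, and the map parameters and seeds are illustrative key material, not the paper's values.

```python
# Sketch: logistic-map chaotic sequence -> row/column scrambling.
# Seeds x0 and parameter r act as the key here (illustrative values).
import numpy as np

def logistic_sequence(n, x0=0.3567, r=3.99):
    xs = np.empty(n)
    for i in range(n):
        x0 = r * x0 * (1 - x0)
        xs[i] = x0
    return xs

def scramble(img):
    h, w = img.shape[:2]
    row_perm = np.argsort(logistic_sequence(h))              # chaotic row order
    col_perm = np.argsort(logistic_sequence(w, x0=0.7123))   # chaotic column order
    return img[row_perm][:, col_perm]
```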
An algorithm for facial mask area repair based upon deep learning
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644299
Haoyu Zhang
{"title":"An algorithm for facial mask area repair based upon deep learning","authors":"Haoyu Zhang","doi":"10.1117/12.2644299","DOIUrl":"https://doi.org/10.1117/12.2644299","url":null,"abstract":"Due to the impact of Corona Virus Disease 2019 (COVID-19), facial mask has become a necessary protective measure for people going out in the last two years. One's mouth and nose are covered to suppress the spread of the virus, which brings a huge challenge for face verification. Whereas some existing image inpainting methods cannot repair the covered area well, which reduces the accuracy of face verification. In this paper, an algorithm is proposed to repair the area covered by facial mask to restore the identity information for face authentication. The proposed algorithm consists of an image inpainting network and a face verification network. Among them, in image inpainting network, to begin with, two discriminators, namely global discriminator and local discriminator. Then Resnet blocks are employed in two discriminators, which is used to retain more feature information. Experimental results show that the proposed method generates fewer artifacts and receives the higher Rank-1 accuracy than other methods in discussion.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"257 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132175531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
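A sketch of the global/local discriminator pairing the abstract names: one discriminator scores the whole inpainted face, the other only the repaired mask region. The architectures, crop coordinates, and score combination are placeholder assumptions (the paper additionally uses ResNet blocks, not shown).

```python
# Sketch: global + local discriminators for mask-region inpainting.
# Architectures and the crop window are illustrative assumptions.
import torch
import torch.nn as nn

def make_disc(in_ch=3):
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 1))

global_disc, local_disc = make_disc(), make_disc()

face = torch.rand(1, 3, 128, 128)        # inpainted full face (stand-in)
mask_crop = face[:, :, 64:128, 32:96]    # crop around the mouth/nose region
adv_score = global_disc(face) + local_disc(mask_crop)  # combined realism score
```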
Fusion of infrared and visible sensor images based on anisotropic diffusion and fast guided filter
International Conference on Digital Image Processing Pub Date: 2022-10-12 DOI: 10.1117/12.2644537
Jingwen Nan, Zongxi Song, Hao Lei, W. Li
{"title":"Fusion of infrared and visible sensor images based on anisotropic diffusion and fast guided filter","authors":"Jingwen Nan, Zongxi Song, Hao Lei, W. Li","doi":"10.1117/12.2644537","DOIUrl":"https://doi.org/10.1117/12.2644537","url":null,"abstract":"Infrared images and visible images can obtain different image information in the same scene, especially in low-light scenes, infrared images can obtain image information that cannot be obtained by visible images. In order to obtain more useful information in the environment such as glimmer, infrared and visible images can be fused. In this paper, an image fusion method based on anisotropic diffusion and fast guided filter is proposed. Firstly, the source images are decomposed into base layers and detail layers by anisotropic dispersion. Secondly, the visible images and the infrared images are passed through the side window Gaussian filter to obtain the saliency map, and then the saliency map is passed through fast guided filter to obtain the fusion weight. Thirdly, the fused base layers and the fused detail layers are reconstructed to obtain the final fusion image. The application of the side window Gaussian filter helps to reduce the artifact information of the fused image. The results of the proposed algorithm are compared with similar algorithms. The fusion results reveal that the proposed method are outstanding in subjective evaluation and objective evaluation, and are better than other algorithms in standard deviation(STD) and entropy(EN), and other quality metrics are close to the optimal comparison algorithm.","PeriodicalId":314555,"journal":{"name":"International Conference on Digital Image Processing","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130838315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
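A sketch of the decomposition stage only: Perona-Malik anisotropic diffusion smooths each source into a base layer, and the residual gives the detail layer. The naive averaging recombination at the end is a stand-in assumption; the paper's saliency-and-guided-filter weighting is not shown.

```python
# Sketch: Perona-Malik anisotropic diffusion -> base/detail split,
# then a naive recombination (the paper's weighting is omitted).
import numpy as np

def diffuse(img, iters=10, kappa=30.0, lam=0.2):
    u = img.astype(np.float64)
    g = lambda d: np.exp(-(d / kappa) ** 2)   # edge-stopping function
    for _ in range(iters):
        # finite differences toward the four neighbours
        n = np.roll(u, -1, 0) - u; s = np.roll(u, 1, 0) - u
        e = np.roll(u, -1, 1) - u; w = np.roll(u, 1, 1) - u
        u = u + lam * (g(n) * n + g(s) * s + g(e) * e + g(w) * w)
    return u

ir, vis = np.random.rand(64, 64), np.random.rand(64, 64)   # stand-in inputs
base_ir, base_vis = diffuse(ir), diffuse(vis)              # base layers
detail_ir, detail_vis = ir - base_ir, vis - base_vis       # detail layers
fused = 0.5 * (base_ir + base_vis) + detail_ir + detail_vis
```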