Eurasip Journal on Image and Video Processing: Latest Articles

Fast CU size decision and intra-prediction mode decision method for H.266/VVC
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2024-03-18 DOI: 10.1186/s13640-024-00622-7
Abstract: H.266/Versatile Video Coding (VVC) is the most recent video coding standard developed by the Joint Video Experts Team (JVET). It introduces the quad-tree with nested multi-type tree (QTMT) partitioning architecture, which improves compression performance. Moreover, H.266/VVC supports more intra-prediction modes than H.265/High Efficiency Video Coding (HEVC), 67 in total. However, these features greatly increase coding computational complexity. To cope with these issues, this paper proposes a fast intra coding unit (CU) size decision method and a fast intra-prediction mode decision method. Specifically, trained Support Vector Machine (SVM) classifier models determine the CU partition mode in the fast CU size decision scheme. Furthermore, the fast intra-prediction mode decision scheme, based on an improved search step, reduces the number of intra-prediction modes added to the RDO mode set. Simulation results show that the proposed overall algorithm reduces encoding runtime by 55.24% with negligible BDBR increase.
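The SVM-based split decision can be illustrated with a toy sketch: a linear SVM (trained here by hinge-loss sub-gradient descent) classifies a CU luma block as split/no-split from simple texture features. The features, training data, and labels below are illustrative assumptions, not the paper's actual feature set or training procedure.

```python
import numpy as np

def block_features(block):
    # Simple texture descriptors for a CU luma block (illustrative only)
    gy, gx = np.gradient(block.astype(np.float64))
    return np.array([block.var(), np.abs(gx).mean(), np.abs(gy).mean()])

def train_linear_svm(X, y, lam=0.01, lr=0.001, epochs=200):
    # Sub-gradient descent on the hinge loss; labels y in {-1, +1}.
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:
                w += lr * (yi * xi - lam * w); b += lr * yi
            else:
                w -= lr * lam * w
    return w, b

rng = np.random.default_rng(0)
smooth = [rng.normal(128, 2, (32, 32)) for _ in range(40)]     # homogeneous -> no split
textured = [rng.normal(128, 40, (32, 32)) for _ in range(40)]  # detailed -> split
X = np.array([block_features(b) for b in smooth + textured])
X = (X - X.mean(0)) / X.std(0)                                 # normalize features
y = np.array([-1] * 40 + [1] * 40)

w, b = train_linear_svm(X, y)
accuracy = (np.sign(X @ w + b) == y).mean()
print(accuracy)
```

At encode time such a classifier would replace the exhaustive rate-distortion search over QTMT partitions: when it predicts "no split", further recursion can be skipped.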
Citations: 0
Assessment framework for deepfake detection in real-world situations
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2024-02-13 DOI: 10.1186/s13640-024-00621-8
Yuhang Lu, Touradj Ebrahimi
Abstract: Detecting digital face manipulation in images and video has attracted extensive attention due to the potential risk to public trust. To counteract the malicious use of such techniques, deep learning-based deepfake detection methods have been employed and have exhibited remarkable performance. However, the performance of such detectors is often assessed on benchmarks that hardly reflect real-world situations. For example, the impact of various image and video processing operations and typical workflow distortions on detection accuracy has not been systematically measured. In this paper, a more reliable assessment framework is proposed to evaluate the performance of learning-based deepfake detectors in more realistic settings. To the best of our knowledge, it is the first systematic assessment approach for deepfake detectors that not only reports general performance under real-world conditions but also quantitatively measures robustness toward different processing operations. To demonstrate the effectiveness and usage of the framework, extensive experiments and detailed analysis of four popular deepfake detection methods are presented. In addition, a stochastic degradation-based data augmentation method driven by realistic processing operations is designed, which significantly improves the robustness of deepfake detectors.
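The stochastic degradation idea can be sketched as a random chain of realistic distortions applied to training images. The specific operations, probabilities, and parameter ranges below are assumptions for illustration; the paper's exact degradation set may differ.

```python
import numpy as np

rng = np.random.default_rng(42)

def gaussian_blur(img, sigma):
    # Separable Gaussian blur implemented with 1-D convolutions.
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2)); k /= k.sum()
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)

def degrade(img):
    """Apply a random chain of realistic distortions to one image (illustrative sketch)."""
    out = img.astype(np.float64)
    if rng.random() < 0.5:                       # blur, as from resizing or recompression
        out = gaussian_blur(out, sigma=rng.uniform(0.5, 2.0))
    if rng.random() < 0.5:                       # sensor / transmission noise
        out = out + rng.normal(0, rng.uniform(2, 10), out.shape)
    if rng.random() < 0.5:                       # down-/up-scaling, nearest neighbour
        f = rng.choice([2, 4])
        out = np.repeat(np.repeat(out[::f, ::f], f, axis=0), f, axis=1)
    return np.clip(out, 0, 255).astype(np.uint8)

face = rng.integers(0, 256, (64, 64), dtype=np.uint8)  # stand-in for a face crop
augmented = degrade(face)
print(augmented.shape, augmented.dtype)
```

Training a detector on such randomly degraded copies exposes it to the distortions that social-media workflows typically introduce, which is the mechanism behind the reported robustness gain.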
Citations: 0
Edge-aware nonlinear diffusion-driven regularization model for despeckling synthetic aperture radar images
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2024-01-11 DOI: 10.1186/s13640-023-00617-w
Anthony Bua, Goodluck Kapyela, Libe Massawe, Baraka Maiseli
Abstract: Speckle noise corrupts synthetic aperture radar (SAR) images and limits their applications in sensitive scientific and engineering fields. This challenge has attracted several scholars because of the wide demand for SAR images in forestry, oceanography, geology, glaciology, and topography. Despite some significant efforts to address the challenge, an open research question remains: how to simultaneously suppress speckle noise and restore semantic features in SAR images. Therefore, this work establishes a diffusion-driven nonlinear method with edge-awareness capabilities to restore corrupted SAR images while protecting critical image features, such as contours and textures. The proposed method incorporates two terms that promote effective noise removal: (1) a high-order diffusion kernel; and (2) a fractional regularization term that is sensitive to speckle noise. These terms have been carefully designed to ensure that the restored SAR images contain stronger edges and well-preserved textures. Empirical results show that the proposed model produces content-rich images with higher subjective and objective values. Furthermore, our model generates images without the noticeable staircase and block artifacts commonly found in the classical Perona–Malik and total variation models.
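As a rough illustration of diffusion-driven despeckling, the sketch below runs the classical Perona–Malik scheme, the baseline this paper improves upon; the paper's high-order kernel and fractional regularization term are not reproduced here, and all parameters are toy choices.

```python
import numpy as np

def perona_malik(img, n_iter=50, kappa=80.0, dt=0.2):
    """Classical Perona-Malik diffusion (baseline only; not the paper's model)."""
    u = img.astype(np.float64)
    for _ in range(n_iter):
        # Finite differences toward the four neighbours (periodic boundary via roll)
        n = np.roll(u, -1, 0) - u
        s = np.roll(u, 1, 0) - u
        e = np.roll(u, -1, 1) - u
        w = np.roll(u, 1, 1) - u
        # Edge-stopping function: diffuse less across strong edges
        g = lambda d: np.exp(-(d / kappa) ** 2)
        u = u + dt * (g(n) * n + g(s) * s + g(e) * e + g(w) * w)
    return u

rng = np.random.default_rng(1)
clean = np.zeros((64, 64)); clean[:, 32:] = 200.0                      # step edge
speckled = clean * rng.gamma(shape=16, scale=1 / 16, size=clean.shape)  # multiplicative speckle
restored = perona_malik(speckled)

# Diffusion smooths the speckle in the flat region while the edge-stopping
# function keeps the step edge at column 32 largely intact.
print(restored[:, 40:].var() < speckled[:, 40:].var())
```

The staircase artifacts mentioned in the abstract are a known weakness of exactly this first-order scheme, which motivates the paper's high-order diffusion term.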
Citations: 0
Multimodal few-shot classification without attribute embedding
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2024-01-10 DOI: 10.1186/s13640-024-00620-9
Jun Qing Chang, Deepu Rajan, Nicholas Vun
Abstract: Multimodal few-shot learning aims to exploit complementary information inherent in multiple modalities for vision tasks in low-data scenarios. Most current research focuses on a suitable embedding space for the various modalities. While embedding-based solutions provide state-of-the-art results, they reduce the interpretability of the model, and separate visualization approaches are needed to make such models more transparent. In this paper, a multimodal few-shot learning framework that is inherently interpretable is presented. This is achieved by using the textual modality in the form of attributes without embedding them, which enables the model to directly explain which attributes caused it to classify an image into a particular class. The model consists of a variational autoencoder that learns the visual latent representation, combined with a semantic latent representation learnt by a standard autoencoder, which computes a semantic loss between the latent representation and a binary attribute vector. A decoder reconstructs the original image from the concatenated latent vectors. The proposed model outperforms other multimodal methods when all test classes are used, e.g., 50 classes in a 50-way 1-shot setting, and is comparable for smaller numbers of ways. Since raw text attributes are used, the datasets for evaluation are CUB, SUN and AWA2. The effectiveness of the interpretability provided by the model is evaluated by analyzing how well it has learnt to identify the attributes.
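The semantic loss between a latent code and a binary attribute vector can be sketched as a binary cross-entropy after squashing the latent to (0, 1); the attribute count and the example values below are hypothetical, chosen only to show the mechanism.

```python
import numpy as np

def semantic_loss(latent, attributes):
    """Binary cross-entropy between a sigmoid-activated latent code and the
    ground-truth binary attribute vector (a sketch of a semantic loss term)."""
    p = 1.0 / (1.0 + np.exp(-latent))        # squash latent to (0, 1)
    eps = 1e-7
    p = np.clip(p, eps, 1 - eps)             # numerical safety for log()
    return -(attributes * np.log(p) + (1 - attributes) * np.log(1 - p)).mean()

# Toy example with 5 binary attributes (names and values hypothetical)
attrs = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
good_latent = np.array([4.0, -4.0, 4.0, 4.0, -4.0])  # agrees with the attributes
bad_latent = -good_latent                            # disagrees with all of them

print(semantic_loss(good_latent, attrs) < semantic_loss(bad_latent, attrs))  # True
```

Because each latent dimension is tied to one named attribute rather than an opaque embedding, inspecting the activated dimensions directly explains the classification, which is the interpretability claim of the paper.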
Citations: 0
Secure image transmission through LTE wireless communications systems
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2024-01-10 DOI: 10.1186/s13640-024-00619-2
Farouk Abduh Kamil Al-Fahaidy, Radwan AL-Bouthigy, Mohammad Yahya H. Al-Shamri, Safwan Abdulkareem
Abstract: Secure transmission of images over wireless communication systems can be achieved using RSA, one of the best-known and most efficient cryptographic algorithms, and OFDMA, a preferred signal processing choice in wireless communications. This paper investigates the performance of OFDMA systems for wireless transmission of RSA-encrypted images. The performance of OFDMA systems based on different signal processing techniques, namely the discrete sine transform (DST), the discrete cosine transform (DCT), and the conventional discrete Fourier transform (DFT), is tested for wireless transmission of gray-scale images with and without RSA encryption. The image is first encrypted with the RSA algorithm. The encrypted image is then modulated with DFT-based, DCT-based, and DST-based OFDMA systems and transmitted over a wireless multipath fading channel. The reverse operations are carried out at the receiver, together with frequency-domain equalization to overcome the channel effect. An exhaustive set of scenarios is evaluated to study the performance of the different OFDMA systems in terms of PSNR and MSE, with different subcarrier mappings and modulation techniques. Results indicate the ability of the different OFDMA systems to transmit images securely over wireless channels; however, the DCT-OFDMA system showed superiority over the DST-OFDMA and the conventional DFT-OFDMA systems.
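The OFDM leg of such a system can be sketched in a few lines: map payload symbols onto subcarriers, IFFT to the time domain, prepend a cyclic prefix, pass through a multipath channel, and equalize per subcarrier. The RSA encryption step and the DCT/DST variants are omitted; the channel taps, subcarrier count, and QPSK payload are toy assumptions, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(7)

N = 64   # subcarriers
CP = 16  # cyclic prefix length (must exceed the channel memory)

def ofdm_tx(symbols):
    # Map symbols onto subcarriers, IFFT to time domain, prepend cyclic prefix.
    x = np.fft.ifft(symbols, N)
    return np.concatenate([x[-CP:], x])

def ofdm_rx(y, h):
    # Strip CP, FFT back, one-tap frequency-domain equalization.
    Y = np.fft.fft(y[CP:], N)
    H = np.fft.fft(h, N)
    return Y / H

# Toy stand-in for one block of the encrypted image bitstream: QPSK symbols.
bits = rng.integers(0, 2, 2 * N)
qpsk = (1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])

h = np.array([1.0, 0.5, 0.2])         # multipath channel impulse response
tx = ofdm_tx(qpsk)
y = np.convolve(tx, h)[: N + CP]      # channel (noise-free for clarity)
eq = ofdm_rx(y, h)

recovered = np.concatenate([(eq.real < 0).astype(int)[:, None],
                            (eq.imag < 0).astype(int)[:, None]], 1).ravel()
print((recovered == bits).all())  # True: CP turns the channel into per-subcarrier gains
```

The cyclic prefix makes the linear channel act as a circular convolution, so equalization reduces to one complex division per subcarrier, which is the "frequency domain equalization" the abstract refers to.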
Citations: 0
An optimized capsule neural networks for tomato leaf disease classification
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2024-01-08 DOI: 10.1186/s13640-023-00618-9
Lobna M. Abouelmagd, Mahmoud Y. Shams, Hanaa Salem Marie, Aboul Ella Hassanien
Abstract: Plant diseases have a significant impact on leaves, with each disease exhibiting specific spots characterized by unique colors and locations. Therefore, it is crucial to develop a method for detecting these diseases based on spot shape, color, and location within the leaves. While convolutional neural networks (CNNs) have been widely used in deep learning applications, they have limitations in capturing relative spatial and orientation relationships. This paper presents a computer vision methodology that utilizes an optimized capsule neural network (CapsNet) to detect and classify ten tomato leaf diseases using standard dataset images. To mitigate overfitting, data augmentation and preprocessing techniques were employed during the training phase. CapsNet was chosen over CNNs due to its superior ability to capture spatial positioning within the image. The proposed CapsNet approach achieved an accuracy of 96.39% with minimal loss, using the Adam optimizer with a learning rate of 0.00001. By comparing the results with existing state-of-the-art approaches, the study demonstrates the effectiveness of CapsNet in accurately identifying and classifying tomato leaf diseases based on spot shape, color, and location. The findings highlight the potential of CapsNet as an alternative to CNNs for improving disease detection and classification in plant pathology research.
Citations: 0
Multi-layer features template update object tracking algorithm based on SiamFC++
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2024-01-04 DOI: 10.1186/s13640-023-00616-x
Xiaofeng Lu, Xuan Wang, Zhengyang Wang, Xinhong Hei
Abstract: SiamFC++ extracts the object feature only from the first frame as a tracking template, and uses only the highest-level feature maps in both the classification branch and the regression branch, so the distinct characteristics of the two branches are not fully utilized. In view of this, this paper proposes an object tracking algorithm based on SiamFC++ that uses multi-layer features of the Siamese network to update the template. First, an FPN is used to extract feature maps from different layers of the backbone for the classification and regression branches. Second, 3D convolution is used to update the tracking template. Next, a template-update judgment condition based on mutual information is proposed. Finally, AlexNet is used as the backbone and GOT-10k as the training set. Compared with SiamFC++, our algorithm obtains improved results on the OTB100, VOT2016, VOT2018 and GOT-10k datasets, and tracking runs in real time.
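A mutual-information-based update condition might look as follows: estimate mutual information between the stored template and the candidate patch from a joint histogram, and update only when similarity stays above a threshold. The histogram binning, patch size, and threshold are hypothetical tuning choices, not values from the paper.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Histogram-based mutual information between two patches / feature maps."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(1, keepdims=True)
    py = pxy.sum(0, keepdims=True)
    nz = pxy > 0
    # MI = sum p(x,y) * log( p(x,y) / (p(x) p(y)) ), over non-empty cells
    return (pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum()

def should_update(template, candidate, threshold=0.5):
    # Update the template only when the candidate still resembles it;
    # a drifted or occluded candidate should be rejected.
    return mutual_information(template, candidate) > threshold

rng = np.random.default_rng(3)
template = rng.random((32, 32))
similar = template + rng.normal(0, 0.05, template.shape)  # same object, small change
unrelated = rng.random((32, 32))                          # drift / occlusion stand-in

print(should_update(template, similar), should_update(template, unrelated))  # True False
```

Gating updates this way prevents the template from being corrupted by occlusions or tracking drift, which is the usual failure mode of unconditional template updating.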
Citations: 0
Subjective performance evaluation of bitrate allocation strategies for MPEG and JPEG Pleno point cloud compression
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2024-01-01 Epub Date: 2024-06-11 DOI: 10.1186/s13640-024-00629-0
Davi Lazzarotto, Michela Testolina, Touradj Ebrahimi
Abstract: The recent rise in interest in point clouds as an imaging modality has motivated standardization groups such as JPEG and MPEG to launch activities aiming at developing compression standards for point clouds. Lossy compression usually introduces visual artifacts that negatively impact the perceived quality of media, which can only be reliably measured through subjective visual quality assessment experiments. While MPEG standards have been subjectively evaluated in previous studies on multiple occasions, no work has yet assessed the performance of the recent JPEG Pleno standard in comparison to them. In this study, a comprehensive performance evaluation of JPEG and MPEG standards for point cloud compression is conducted. The impact of different configuration parameters on the performance of the codecs is first analyzed with the help of objective quality metrics. The results from this analysis are used to define three rate allocation strategies for each codec, which are employed to compress a set of point clouds at four target rates. The set of distorted point clouds is then subjectively evaluated following two subjective quality assessment protocols. Finally, the obtained results are used to compare the performance of these compression standards and draw insights about best coding practices.
Citations: 0
Robust steganography in practical communication: a comparative study
CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2023-10-31 DOI: 10.1186/s13640-023-00615-y
Tong Qiao, Shengwang Xu, Shuai Wang, Xiaoshuai Wu, Bo Liu, Ning Zheng, Ming Xu, Binmin Pan
Abstract: To realize covert communication over a public channel, steganography was proposed. Modern adaptive steganography currently plays a dominant role due to its high undetectability. However, its effectiveness is challenged when applied in practical communication, such as over social networks. Several robust steganographic methods have been proposed, but a comparative study between them has been missing. Thus, we propose a framework to generalize the current typical steganographic methods that resist compression attacks, and empirically analyze their advantages and disadvantages based on four baseline indicators: capacity, imperceptibility, undetectability, and robustness. More importantly, the robustness of the methods is compared in real applications, namely Facebook, Twitter, and WeChat, which has not been comprehensively addressed in this community. In particular, the methods that modify the sign of DCT coefficients perform better on social media applications.
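A minimal sketch of the sign-of-DCT-coefficient family that the study found most robust: force the sign of one mid-frequency coefficient of an 8x8 block to carry a bit, since signs tend to survive recompression better than magnitudes. The block size, coefficient position, and magnitude floor below are illustrative assumptions, not any specific published scheme.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix.
    k = np.arange(n)[:, None]; i = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2 / n)
    m[0] /= np.sqrt(2)
    return m

D = dct_matrix()

def embed_bit(block, bit, pos=(2, 1)):
    """Force the sign of one mid-frequency DCT coefficient to carry the bit."""
    c = D @ block.astype(np.float64) @ D.T
    mag = max(abs(c[pos]), 20.0)     # magnitude floor keeps the sign robust to noise
    c[pos] = mag if bit == 0 else -mag
    return D.T @ c @ D               # back to the pixel domain

def extract_bit(block, pos=(2, 1)):
    c = D @ block.astype(np.float64) @ D.T
    return 0 if c[pos] >= 0 else 1

rng = np.random.default_rng(5)
cover = rng.integers(0, 256, (8, 8)).astype(np.float64)
stego = embed_bit(cover, 1)
noisy = stego + rng.normal(0, 3, stego.shape)  # mild recompression-like noise
print(extract_bit(stego), extract_bit(noisy))  # 1 1
```

The sign survives the added noise because the embedded coefficient magnitude (at least 20) dwarfs the perturbation the noise induces on a single orthonormal-transform coefficient.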
Citations: 0
Multi-attention-based approach for deepfake face and expression swap detection and localization
IF 2.4, CAS Q4, Computer Science
Eurasip Journal on Image and Video Processing Pub Date: 2023-08-18 DOI: 10.1186/s13640-023-00614-z
Saima Waseem, S. Abu-Bakar, Z. Omar, Bilal Ashfaq Ahmed, Saba Baloch, Adel Hafeezallah
(No abstract available.)
Citations: 1