{"title":"Assessment framework for deepfake detection in real-world situations","authors":"Yuhang Lu, Touradj Ebrahimi","doi":"10.1186/s13640-024-00621-8","DOIUrl":"https://doi.org/10.1186/s13640-024-00621-8","url":null,"abstract":"<p>Detecting digital face manipulation in images and video has attracted extensive attention due to the potential risk to public trust. To counteract the malicious usage of such techniques, deep learning-based deepfake detection methods have been employed and have exhibited remarkable performance. However, the performance of such detectors is often assessed on related benchmarks that hardly reflect real-world situations. For example, the impact of various image and video processing operations and typical workflow distortions on detection accuracy has not been systematically measured. In this paper, a more reliable assessment framework is proposed to evaluate the performance of learning-based deepfake detectors in more realistic settings. To the best of our acknowledgment, it is the first systematic assessment approach for deepfake detectors that not only reports the general performance under real-world conditions but also quantitatively measures their robustness toward different processing operations. To demonstrate the effectiveness and usage of the framework, extensive experiments and detailed analysis of four popular deepfake detection methods are further presented in this paper. In addition, a stochastic degradation-based data augmentation method driven by realistic processing operations is designed, which significantly improves the robustness of deepfake detectors.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"46 1","pages":""},"PeriodicalIF":2.4,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139763809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Edge-aware nonlinear diffusion-driven regularization model for despeckling synthetic aperture radar images
Authors: Anthony Bua, Goodluck Kapyela, Libe Massawe, Baraka Maiseli
Journal: EURASIP Journal on Image and Video Processing
DOI: 10.1186/s13640-023-00617-w
Published: 2024-01-11
Abstract: Speckle noise corrupts synthetic aperture radar (SAR) images and limits their applications in sensitive scientific and engineering fields. This challenge has attracted several scholars because of the wide demand for SAR images in forestry, oceanography, geology, glaciology, and topography. Despite significant efforts to address the challenge, it remains an open research question how to simultaneously suppress speckle noise and restore semantic features in SAR images. Therefore, this work establishes a diffusion-driven nonlinear method with edge-awareness capabilities to restore corrupted SAR images while protecting critical image features, such as contours and textures. The proposed method incorporates two terms that promote effective noise removal: (1) a high-order diffusion kernel; and (2) a fractional regularization term that is sensitive to speckle noise. These terms have been carefully designed to ensure that the restored SAR images contain stronger edges and well-preserved textures. Empirical results show that the proposed model produces content-rich images with higher subjective and objective quality scores. Furthermore, our model generates images without the noticeable staircase and block artifacts commonly found in the classical Perona-Malik and total variation models.

{"title":"Multimodal few-shot classification without attribute embedding","authors":"Jun Qing Chang, Deepu Rajan, Nicholas Vun","doi":"10.1186/s13640-024-00620-9","DOIUrl":"https://doi.org/10.1186/s13640-024-00620-9","url":null,"abstract":"<p>Multimodal few-shot learning aims to exploit complementary information inherent in multiple modalities for vision tasks in low data scenarios. Most of the current research focuses on a suitable embedding space for the various modalities. While solutions based on embedding provide state-of-the-art results, they reduce the interpretability of the model. Separate visualization approaches enable the models to become more transparent. In this paper, a multimodal few-shot learning framework that is inherently interpretable is presented. This is achieved by using the textual modality in the form of attributes without embedding them. This enables the model to directly explain which attributes caused it to classify an image into a particular class. The model consists of a variational autoencoder to learn the visual latent representation, which is combined with a semantic latent representation that is learnt from a normal autoencoder, which calculates a semantic loss between the latent representation and a binary attribute vector. A decoder reconstructs the original image from concatenated latent vectors. The proposed model outperforms other multimodal methods when all test classes are used, e.g., 50 classes in a 50-way 1-shot setting, and is comparable for lesser number of ways. Since raw text attributes are used, the datasets for evaluation are CUB, SUN and AWA2. The effectiveness of interpretability provided by the model is evaluated by analyzing how well it has learnt to identify the attributes.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"4 1","pages":""},"PeriodicalIF":2.4,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139422411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Secure image transmission through LTE wireless communications systems
Authors: Farouk Abduh Kamil Al-Fahaidy, Radwan AL-Bouthigy, Mohammad Yahya H. Al-Shamri, Safwan Abdulkareem
Journal: EURASIP Journal on Image and Video Processing
DOI: 10.1186/s13640-024-00619-2
Published: 2024-01-10
Abstract: Secure transmission of images over wireless communication systems can be achieved by combining RSA, one of the best-known and most efficient cryptographic algorithms, with OFDMA, a preferred signal processing choice in wireless communications. This paper investigates the performance of OFDMA systems for the wireless transmission of RSA-encrypted images. Specifically, OFDMA systems based on different signal processing transforms, namely the discrete sine transform (DST), the discrete cosine transform (DCT), and the conventional discrete Fourier transform (DFT), are tested for the wireless transmission of grayscale images with and without RSA encryption. The image is first encrypted with the RSA algorithm, then modulated with DFT-based, DCT-based, or DST-based OFDMA, and transmitted over a wireless multipath fading channel. The reverse operations are carried out at the receiver, together with frequency-domain equalization to mitigate the channel effect. An exhaustive set of scenarios is used to study the performance of the different OFDMA systems in terms of PSNR and MSE for different subcarrier mappings and modulation techniques. The results confirm the ability of the different OFDMA systems to transmit images securely over wireless channels, with the DCT-OFDMA system showing superiority over the DST-OFDMA and the conventional DFT-OFDMA systems.

Title: An optimized capsule neural networks for tomato leaf disease classification
Authors: Lobna M. Abouelmagd, Mahmoud Y. Shams, Hanaa Salem Marie, Aboul Ella Hassanien
Journal: EURASIP Journal on Image and Video Processing
DOI: 10.1186/s13640-023-00618-9
Published: 2024-01-08
Abstract: Plant diseases have a significant impact on leaves, with each disease exhibiting specific spots characterized by unique colors and locations. It is therefore crucial to develop a method for detecting these diseases based on spot shape, color, and location within the leaves. While convolutional neural networks (CNNs) have been widely used in deep learning applications, they have limitations in capturing relative spatial and orientation relationships. This paper presents a computer vision methodology that uses an optimized capsule neural network (CapsNet) to detect and classify ten tomato leaf diseases using standard dataset images. To mitigate overfitting, data augmentation and preprocessing techniques were employed during the training phase. CapsNet was chosen over CNNs due to its superior ability to capture spatial positioning within the image. The proposed CapsNet approach achieved an accuracy of 96.39% with minimal loss, using an Adam optimizer with a 0.00001 learning rate. By comparing the results with existing state-of-the-art approaches, the study demonstrates the effectiveness of CapsNet in accurately identifying and classifying tomato leaf diseases based on spot shape, color, and location. The findings highlight the potential of CapsNet as an alternative to CNNs for improving disease detection and classification in plant pathology research.

Title: Multi-layer features template update object tracking algorithm based on SiamFC++
Authors: Xiaofeng Lu, Xuan Wang, Zhengyang Wang, Xinhong Hei
Journal: EURASIP Journal on Image and Video Processing
DOI: 10.1186/s13640-023-00616-x
Published: 2024-01-04
Abstract: SiamFC++ extracts the object feature of only the first frame as the tracking template, and uses only the highest-level feature maps in both the classification branch and the regression branch, so the respective characteristics of the two branches are not fully exploited. In view of this, this paper proposes an object tracking algorithm based on SiamFC++ that uses multi-layer features of the Siamese network to update the template. First, an FPN is used to extract feature maps from different layers of the backbone for the classification branch and the regression branch. Second, a 3D convolution is used to update the tracking template. Next, a template update judgment condition based on mutual information is proposed. Finally, AlexNet is used as the backbone and GOT-10k as the training set. Compared with SiamFC++, the proposed algorithm obtains improved results on the OTB100, VOT2016, VOT2018, and GOT-10k datasets, while tracking in real time.

{"title":"Subjective performance evaluation of bitrate allocation strategies for MPEG and JPEG Pleno point cloud compression.","authors":"Davi Lazzarotto, Michela Testolina, Touradj Ebrahimi","doi":"10.1186/s13640-024-00629-0","DOIUrl":"10.1186/s13640-024-00629-0","url":null,"abstract":"<p><p>The recent rise in interest in point clouds as an imaging modality has motivated standardization groups such as JPEG and MPEG to launch activities aiming at developing compression standards for point clouds. Lossy compression usually introduces visual artifacts that negatively impact the perceived quality of media, which can only be reliably measured through subjective visual quality assessment experiments. While MPEG standards have been subjectively evaluated in previous studies on multiple occasions, no work has yet assessed the performance of the recent JPEG Pleno standard in comparison to them. In this study, a comprehensive performance evaluation of JPEG and MPEG standards for point cloud compression is conducted. The impact of different configuration parameters on the performance of the codecs is first analyzed with the help of objective quality metrics. The results from this analysis are used to define three rate allocation strategies for each codec, which are employed to compress a set of point clouds at four target rates. The set of distorted point clouds is then subjectively evaluated following two subjective quality assessment protocols. Finally, the obtained results are used to compare the performance of these compression standards and draw insights about best coding practices.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"2024 1","pages":"14"},"PeriodicalIF":2.4,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11166754/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141318743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learned scalable video coding for humans and machines.","authors":"Hadi Hadizadeh, Ivan V Bajić","doi":"10.1186/s13640-024-00657-w","DOIUrl":"10.1186/s13640-024-00657-w","url":null,"abstract":"<p><p>Video coding has traditionally been developed to support services such as video streaming, videoconferencing, digital TV, and so on. The main intent was to enable human viewing of the encoded content. However, with the advances in deep neural networks (DNNs), encoded video is increasingly being used for automatic video analytics performed by machines. In applications such as automatic traffic monitoring, analytics such as vehicle detection, tracking and counting, would run continuously, while human viewing could be required occasionally to review potential incidents. To support such applications, a new paradigm for video coding is needed that will facilitate efficient representation and compression of video for both machine and human use in a scalable manner. In this manuscript, we introduce an end-to-end learnable video codec that supports a machine vision task in its base layer, while its enhancement layer, together with the base layer, supports input reconstruction for human viewing. The proposed system is constructed based on the concept of conditional coding to achieve better compression gains. Comprehensive experimental evaluations conducted on four standard video datasets demonstrate that our framework outperforms both state-of-the-art learned and conventional video codecs in its base layer, while maintaining comparable performance on the human vision task in its enhancement layer.</p>","PeriodicalId":49322,"journal":{"name":"Eurasip Journal on Image and Video Processing","volume":"2024 1","pages":"41"},"PeriodicalIF":2.4,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11564357/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142649470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Robust steganography in practical communication: a comparative study
Authors: Tong Qiao, Shengwang Xu, Shuai Wang, Xiaoshuai Wu, Bo Liu, Ning Zheng, Ming Xu, Binmin Pan
Journal: EURASIP Journal on Image and Video Processing
DOI: 10.1186/s13640-023-00615-y
Published: 2023-10-31
Abstract: Steganography was proposed to realize covert communication over a public channel. Modern adaptive steganography currently plays a dominant role due to its high undetectability. However, its effectiveness is challenged when applied in practical communication, such as over social networks. Several robust steganographic methods have been proposed, but a comparative study between them has been missing. We therefore propose a framework that generalizes the current typical steganographic methods designed to resist compression attacks, and empirically analyze their advantages and disadvantages based on four baseline indicators: capacity, imperceptibility, undetectability, and robustness. More importantly, the robustness of the methods is compared in real applications, such as Facebook, Twitter, and WeChat, which has not been comprehensively addressed in this community. In particular, the methods that modify the sign of DCT coefficients show superior performance on social media applications.

Title: Multi-attention-based approach for deepfake face and expression swap detection and localization
Authors: Saima Waseem, S. Abu-Bakar, Z. Omar, Bilal Ashfaq Ahmed, Saba Baloch, Adel Hafeezallah
Journal: EURASIP Journal on Image and Video Processing
DOI: 10.1186/s13640-023-00614-z
Published: 2023-08-18