A survey of multimodal federated learning: background, applications, and perspectives
Hao Pan, Xiaoli Zhao, Lipeng He, Yicong Shi, Xiaogang Lin
Multimedia Systems, 2024-07-29. DOI: https://doi.org/10.1007/s00530-024-01422-9

Abstract: Multimodal Federated Learning (MMFL) is a machine learning paradigm that extends traditional Federated Learning (FL) to support collaborative training of local models on data from multiple modalities. With vast amounts of multimodal data being generated and stored by the internet, sensors, and mobile devices, and with artificial intelligence models iterating rapidly, the demand for multimodal models is growing quickly. Although FL has been widely studied in recent years, most existing research has been based on unimodal settings. Hoping to inspire more applications and research within the MMFL paradigm, we conduct a comprehensive review of the progress and challenges in various aspects of state-of-the-art MMFL. Specifically, we analyze the research motivation for MMFL, propose a new classification of existing work, discuss the available datasets and application scenarios, and offer perspectives on the opportunities and challenges facing MMFL.
GAN-based image steganography by exploiting transform domain knowledge with deep networks
Xiao Li, Liquan Chen, Jianchang Lai, Zhangjie Fu, Suhui Liu
Multimedia Systems, 2024-07-29. DOI: https://doi.org/10.1007/s00530-024-01427-4

Abstract: Image steganography secures the transmission of secret information by hiding it within routine multimedia transmissions. When images are generated with a Generative Adversarial Network (GAN), the embedding and recovery of secret bits can rely entirely on deep networks, removing much manual design effort. However, existing GAN-based methods typically build their deep networks by adapting generic deep learning architectures to image steganography. These architectures lack feature extraction tailored to steganography, resulting in low imperceptibility. To address this problem, we propose GAN-based image steganography that exploits transform domain knowledge with deep networks, called EStegTGANs. Unlike existing GAN-based methods, we explicitly introduce transform domain knowledge via the Discrete Wavelet Transform (DWT) and its inverse (IDWT) into the deep networks, ensuring that each network operates on DWT features. Specifically, the encoder embeds secrets and generates stego images using explicit DWT and IDWT; the decoder recovers secrets, and the discriminator evaluates feature distributions, using the explicit DWT. Using traditional DWT and IDWT, we first propose EStegTGAN-coe, which embeds and recovers directly in the DWT coefficients of pixels. To create more feature redundancy for secrets, we instead extract DWT features from intermediate network features for embedding and recovery, yielding EStegTGAN-DWT with traditional DWT and IDWT. Finally, to rely entirely on deep networks without traditional filters, we design convolutional DWT and IDWT modules that perform the same feature transformation as the traditional approaches and substitute them into EStegTGAN-DWT. Comprehensive experimental results demonstrate that our proposals significantly improve imperceptibility, and that our convolutional DWT and IDWT modules distinguish the high-frequency characteristics of images more effectively for steganography than traditional DWT and IDWT.
{"title":"Coordinate-aligned multi-camera collaboration for active multi-object tracking","authors":"Zeyu Fang, Jian Zhao, Mingyu Yang, Zhenbo Lu, Wengang Zhou, Houqiang Li","doi":"10.1007/s00530-024-01420-x","DOIUrl":"https://doi.org/10.1007/s00530-024-01420-x","url":null,"abstract":"<p>Active Multi-Object Tracking (AMOT) is a task where cameras are controlled by a centralized system to adjust their poses automatically and collaboratively so as to maximize the coverage of targets in their shared visual field. In AMOT, each camera only receives partial information from its observation, which may mislead cameras to take locally optimal action. Besides, the global goal, i.e., maximum coverage of objects, is hard to be directly optimized. To address the above issues, we propose a coordinate-aligned multi-camera collaboration system for AMOT. In our approach, we regard each camera as an agent and address AMOT with a multi-agent reinforcement learning solution. To represent the observation of each agent, we first identify the targets in the camera view with an image detector and then align the coordinates of the targets via inverse projection transformation. We define the reward of each agent based on both global coverage as well as four individual reward terms. The action policy of the agents is derived from a value-based Q-network. To the best of our knowledge, we are the first to study the AMOT task. To train and evaluate the efficacy of our system, we build a virtual yet credible 3D environment, named “Soccer Court”, to mimic the real-world AMOT scenario. The experimental results show that our system outperforms the baseline and existing methods in various settings, including real-world datasets.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":null,"pages":null},"PeriodicalIF":3.9,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141865992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SAM-guided contrast based self-training for source-free cross-domain semantic segmentation","authors":"Qinghua Ren, Ke Hou, Yongzhao Zhan, Chen Wang","doi":"10.1007/s00530-024-01426-5","DOIUrl":"https://doi.org/10.1007/s00530-024-01426-5","url":null,"abstract":"<p>Traditional domain adaptive semantic segmentation methods typically assume access to source domain data during training, a paradigm known as source-access domain adaptation for semantic segmentation (SASS). To address data privacy concerns in real-world applications, source-free domain adaptation for semantic segmentation (SFSS) has recently been studied, eliminating the need for direct access to source data. Most SFSS methods primarily utilize pseudo-labels to regularize the model in either the label space or the feature space. Inspired by the segment anything model (SAM), we propose SAM-guided contrast based pseudo-label learning for SFSS in this work. Unlike previous methods that heavily rely on noisy pseudo-labels, we leverage the class-agnostic segmentation masks generated by SAM as prior knowledge to construct positive and negative sample pairs. This approach allows us to directly shape the feature space using contrastive learning. This design ensures the reliable construction of contrastive samples and exploits both intra-class and intra-instance diversity. Our framework is built upon a vanilla teacher–student network architecture for online pseudo-label learning. Consequently, the SFSS model can be jointly regularized in both the feature and label spaces in an end-to-end manner. Extensive experiments demonstrate that our method achieves competitive performance in two challenging SFSS tasks.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":null,"pages":null},"PeriodicalIF":3.9,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141784698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RA-RevGAN: region-aware reversible adversarial example generation network for privacy-preserving applications","authors":"Jiacheng Zhao, Xiuming Zhao, Zhihua Gan, Xiuli Chai, Tianfeng Ma, Zhen Chen","doi":"10.1007/s00530-024-01425-6","DOIUrl":"https://doi.org/10.1007/s00530-024-01425-6","url":null,"abstract":"<p>The rise of online sharing platforms has provided people with diverse and convenient ways to share images. However, a substantial amount of sensitive user information is contained within these images, which can be easily captured by malicious neural networks. To ensure the secure utilization of authorized protected data, reversible adversarial attack techniques have emerged. Existing algorithms for generating adversarial examples do not strike a good balance between visibility and attack capability. Additionally, the network oscillations generated during the training process affect the quality of the final examples. To address these shortcomings, we propose a novel reversible adversarial network based on generative adversarial networks (RA-RevGAN). In this paper, the generator is used for noise generation to map features into perturbations of the image, while the region selection module confines these perturbations to specific areas that significantly affect classification. Furthermore, a robust attack mechanism is integrated into the discriminator to stabilize the network’s training by optimizing convergence speed and minimizing time cost. Extensive experiments have demonstrated that the proposed method ensures a high image generation rate, excellent attack capability, and superior visual quality while maintaining high classification accuracy in image restoration.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":null,"pages":null},"PeriodicalIF":3.9,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141784523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Application of CLIP for efficient zero-shot learning
Hairui Yang, Ning Wang, Haojie Li, Lei Wang, Zhihui Wang
Multimedia Systems, 2024-07-26. DOI: https://doi.org/10.1007/s00530-024-01414-9

Abstract: Zero-shot learning (ZSL) addresses the challenging task of recognizing classes absent during training. Existing methodologies focus on transferring knowledge from known to unknown categories by formulating a correlation between the visual and semantic spaces, but they face constraints on the discriminability of visual features and the completeness of semantic representations. To alleviate these limitations, we propose a novel Collaborative learning Framework for Zero-Shot Learning (CFZSL), which integrates the CLIP architecture into a fundamental zero-shot learner. Specifically, the foundational zero-shot learning model extracts visual features through a set of CNNs and maps them to a domain-specific semantic space, while the CLIP image encoder extracts visual features carrying universal semantics. In this way, CFZSL obtains discriminative visual features for both domain-specific and domain-agnostic semantics. Additionally, a more comprehensive semantic space is explored by combining the latent feature space learned by CLIP with the domain-specific semantic space. Notably, we leverage only the pre-trained parameters of the CLIP model, mitigating the high training cost and potential overfitting associated with fine-tuning. Our framework has a simple structure and is trained exclusively with classification and triplet loss functions. Extensive experiments on three widely recognized benchmark datasets (AwA2, CUB, and SUN) conclusively affirm the effectiveness and superiority of our approach.
{"title":"CMLCNet: medical image segmentation network based on convolution capsule encoder and multi-scale local co-occurrence","authors":"Chendong Qin, Yongxiong Wang, Jiapeng Zhang","doi":"10.1007/s00530-024-01430-9","DOIUrl":"https://doi.org/10.1007/s00530-024-01430-9","url":null,"abstract":"<p>Medical images have low contrast and blurred boundaries between different tissues or between tissues and lesions. Because labeling medical images is laborious and requires expert knowledge, the labeled data are expensive or simply unavailable. UNet has achieved great success in the field of medical image segmentation. However, the pooling layer in downsampling tends to discard important information such as location information. It is difficult to learn global and long-range semantic interactive information well due to the locality of convolution operation. The usual solution is increasing the number of datasets or enhancing the training data though augmentation methods. However, to obtain a large number of medical datasets is tough, and the augmentation methods may increase the training burden. In this work, we propose a 2D medical image segmentation network with a convolutional capsule encoder and a multiscale local co-occurrence module. To extract more local detail and contextual information, the capsule encoder is introduced to learn the information about the target location and the relationship between the part and the whole. Multi-scale features can be fused by a new attention mechanism, which can then selectively emphasize salient features useful for a specific task by capturing global information and suppress background noise. The proposed attention mechanism is used to preserve the information that is discarded by pooling layers of the network. In addition, a multi-scale local co-occurrence algorithm is proposed, where the context and dependencies between different regions in an image can be better learned. Experimental results on the dataset of Liver, ISIC and BraTS2019 show that our network is superior to the UNet and other previous medical image segmentation networks under the same experimental conditions.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":null,"pages":null},"PeriodicalIF":3.9,"publicationDate":"2024-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141784670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TrafficTrack: rethinking the motion and appearance cue for multi-vehicle tracking in traffic monitoring","authors":"Hui Cai, Haifeng Lin, Dapeng Liu","doi":"10.1007/s00530-024-01407-8","DOIUrl":"https://doi.org/10.1007/s00530-024-01407-8","url":null,"abstract":"<p>Analyzing traffic flow based on data from traffic monitoring is an essential component of intelligent transportation systems. In most traffic scenarios, vehicles are the primary targets, so multi-object tracking of vehicles in traffic monitoring is a critical subject. In view of the current difficulties, such as complex road conditions, numerous obstructions, and similar vehicle appearances, we propose a detection-based multi-object vehicle tracking algorithm that combines motion and appearance cues. Firstly, to improve the motion prediction accuracy, we propose a Kalman filter that adaptively updates the noise according to the motion matching cost and detection confidence score, combined with exponential transformation and residuals. Then, we propose a combined distance to utilize motion and appearance cues. Finally, we present a trajectory recovery strategy to handle unmatched trajectories and detections. Experimental results on the UA-DETRAC dataset demonstrate that this method achieves excellent tracking performance for vehicle tracking tasks in traffic monitoring perspectives, meeting the practical application demands of complex traffic scenarios.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":null,"pages":null},"PeriodicalIF":3.9,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141784671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fs-yolo: fire-smoke detection based on improved YOLOv7","authors":"Dongmei Wang, Ying Qian, Jingyi Lu, Peng Wang, Zhongrui Hu, Yongkang Chai","doi":"10.1007/s00530-024-01359-z","DOIUrl":"https://doi.org/10.1007/s00530-024-01359-z","url":null,"abstract":"<p>Fire has emerged as a major danger to the Earth’s ecological equilibrium and human well-being. Fire detection and alert systems are essential. There is a scarcity of public fire datasets with examples of fire and smoke in real-world situations. Moreover, techniques for recognizing items in fire smoke are imprecise and unreliable when it comes to identifying small objects. We developed a dual dataset to evaluate the model’s ability to handle these difficulties. Introducing FS-YOLO, a new fire detection model with improved accuracy. Training YOLOv7 may lead to overfitting because of the large number of parameters and the limited fire detection object categories. YOLOv7 struggles to recognize small dense objects during feature extraction, resulting in missed detections. The Swin Transformer module has been enhanced to decrease local feature interdependence, obtain a wider range of parameters, and handle features at several levels. The improvements strengthen the model’s robustness and the network’s ability to recognize dense tiny objects. The efficient channel attention was incorporated to reduce the occurrence of false fire detections. Localizing the region of interest and extracting meaningful information aids the model in identifying pertinent areas and minimizing false detections. The proposal also considers using fire-smoke and real-fire-smoke datasets. The latter dataset simulates real-world conditions with occlusions, lens blur, and motion blur. This dataset tests the model’s robustness and adaptability in complex situations. On both datasets, the mAP of FS-YOLO is improved by 6.4<span>(%)</span> and 5.4<span>(%)</span> compared to YOLOv7. In the robustness check experiments, the mAP of FS-YOLO is 4.1<span>(%)</span> and 3.1<span>(%)</span> higher than that of today’s SOTA models YOLOv8s, DINO.</p>","PeriodicalId":51138,"journal":{"name":"Multimedia Systems","volume":null,"pages":null},"PeriodicalIF":3.9,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141784572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Insulator defect detection based on BaS-YOLOv5
Yu Zhang, Yinke Dou, Kai Yang, Xiaoyang Song, Jin Wang, Liangliang Zhao
Multimedia Systems, 2024-07-23. DOI: https://doi.org/10.1007/s00530-024-01413-w

Abstract: Deep learning is increasingly used to detect transmission line insulator defects in images captured during unmanned aerial vehicle inspection, but current approaches suffer from both insufficient detection accuracy and insufficient speed. This study first introduces the bidirectional feature pyramid network (BiFPN) module into YOLOv5 to achieve high detection speed while combining image features at different scales, enhancing information representation and enabling accurate detection of insulator defects at different scales. The BiFPN module is then combined with the simple parameter-free attention module (SimAM) to improve feature representation and object detection accuracy; SimAM also enables fusion of features at multiple scales, further improving insulator defect detection performance. Finally, multiple controlled experiments verify the effectiveness and efficiency of the proposed model. Experimental results on self-made datasets show that the combined BiFPN and SimAM model (the improved BaS-YOLOv5) outperforms the original YOLOv5: precision, recall, average precision, and F1 score increase by 6.2%, 5%, 5.9%, and 6%, respectively. BaS-YOLOv5 therefore substantially improves detection accuracy while maintaining a high detection speed, meeting the requirements of real-time insulator defect detection.