Journal of Visual Communication and Image Representation: Latest Articles

Mapping-based coverless steganography via generating a face database
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104536 · Pub Date: 2025-07-22 · DOI: 10.1016/j.jvcir.2025.104536
Bo Hu, Yanhui Xiao, Qiyao Deng, Huawei Tian
Abstract: Most existing mapping-based coverless steganography methods face two fundamental limitations: (1) how to find a sufficiently large-scale image dataset capable of mapping all possible secret segments, and (2) how to efficiently construct an index database between secret segments and their corresponding images. To overcome these challenges, this paper proposes a novel mapping method that uses a diffusion model to generate a face database for mapping secret segments. Unlike conventional methods that laboriously search for candidate images from existing datasets, our method generates different face images by controlling the initial noise of the diffusion model to map different secret segments, thereby ensuring the completeness of the mapping. Furthermore, the index database is automatically constructed during image generation, eliminating the need for time-consuming feature computation and matching. Experimental results demonstrate that the proposed method achieves high robustness.
Citations: 0
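A minimal sketch of the mapping idea the abstract describes: each fixed-length secret segment is turned into a deterministic initial-noise tensor via a keyed hash, which would then drive a pretrained face diffusion model, and the segment-to-image index is built during generation. The segment length, noise shape, key handling, and the placeholder `generate_face` call are assumptions for illustration, not the paper's implementation.

```python
import hashlib
import numpy as np

SEGMENT_BITS = 16          # assumed payload per generated image
NOISE_SHAPE = (4, 64, 64)  # assumed latent-noise shape of the generator

def segment_to_noise(segment: str, key: str = "shared-secret") -> np.ndarray:
    """Derive a reproducible Gaussian noise map from one secret bit segment."""
    digest = hashlib.sha256((key + segment).encode()).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = np.random.default_rng(seed)
    return rng.standard_normal(NOISE_SHAPE).astype(np.float32)

def generate_face(noise: np.ndarray) -> np.ndarray:
    """Placeholder for sampling a face image from a pretrained diffusion model."""
    return noise  # a real system would run the reverse diffusion process here

def build_index(secret_bits: str) -> dict:
    """Split the secret into segments and build the segment -> image index on the fly."""
    index = {}
    for i in range(0, len(secret_bits), SEGMENT_BITS):
        segment = secret_bits[i:i + SEGMENT_BITS].ljust(SEGMENT_BITS, "0")
        index[segment] = generate_face(segment_to_noise(segment))
    return index

secret = "101100111000111101010101111100001100110010101010"
index = build_index(secret)
print(len(index), "secret segments mapped to generated face images")
```

Because the noise is derived deterministically from the segment and a shared key, a receiver holding the same key can regenerate the same images and rebuild the index without any feature matching.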
TDENet: Three-branch distillation enhancement network for foggy scene object detection
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104534 · Pub Date: 2025-07-21 · DOI: 10.1016/j.jvcir.2025.104534
Yufei Shi, Jichang Guo, Xiaojian Wang, Yudong Wang
Abstract: Hazy images typically exhibit low contrast, blurred textures, and other forms of quality degradation, leading to diminished performance in object detection tasks. Previous research, relying either on defogging pre-processing or on cascaded enhancement networks, often suffers from domain shift, which limits the network's ability to effectively exploit information from diverse domains. Inspired by domain adaptation, we propose the Three-branch Distillation Enhancement Network (TDENet), a novel framework that synergistically expands the scope of perceptual domains through a joint training strategy. TDENet comprises three branches: the first uses a vanilla detector, the second integrates a cascaded enhancement model, and the third performs feature distillation to guide the optimization process. Unlike traditional methods that apply loss functions directly to the enhancement model, our approach strengthens feature extraction for object detection through distillation from clean images. Meanwhile, the joint training strategy mitigates the loss of original-domain information caused by the enhancement model, enabling TDENet to learn more robust representations. Experiments on the Foggy Cityscapes and real-world Foggy Driving datasets demonstrate the excellent universality and generalization of our approach. The pre-trained models and results will be released.
Citations: 0
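A rough PyTorch sketch of the three-branch joint-training idea from the abstract: one branch runs a backbone on the foggy image, a second prepends an enhancement module, and a third extracts clean-image "teacher" features that are distilled into the other two. The tiny conv modules, the shared backbone, and the 0.5 loss weight are placeholders, not the paper's detector or settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def tiny_backbone():
    return nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())

backbone = tiny_backbone()                                 # detection backbone (branches 1 and 2)
enhancer = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))    # stand-in enhancement model
teacher = tiny_backbone()                                  # clean-image feature extractor (frozen)
for p in teacher.parameters():
    p.requires_grad_(False)

foggy = torch.rand(2, 3, 128, 128)
clean = torch.rand(2, 3, 128, 128)

feat_plain = backbone(foggy)                 # branch 1: foggy image, no enhancement
feat_enh = backbone(enhancer(foggy))         # branch 2: cascaded enhancement + backbone
with torch.no_grad():
    feat_teacher = teacher(clean)            # branch 3: distillation guidance from clean images

distill_loss = F.mse_loss(feat_plain, feat_teacher) + F.mse_loss(feat_enh, feat_teacher)
detection_loss = torch.tensor(0.0)           # would come from the detector heads in practice
total = detection_loss + 0.5 * distill_loss  # joint objective (weight is assumed)
total.backward()
print(float(total))
```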
Efficient global–local feature extraction via the Superior Efficient Transformer
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104532 · Pub Date: 2025-07-18 · DOI: 10.1016/j.jvcir.2025.104532
Wei Xu, Yi Wan, Weina Zhao
Abstract: As the demand for efficient neural networks on mobile devices surges, we introduce the Superior Efficient Transformer (SET), a lightweight model that integrates convolutional and self-attention mechanisms to optimize spatial and channel modeling. The Hybrid Dual Attention Module (HDAM) at the heart of SET employs self-attention for local feature extraction and convolutional attention for global dependency capture, complemented by Window-Integrated Spatial Attention (WISA) and an Adaptive Channel Gate (ACG). By integrating HDAM within a dual-residual Transformer encoder architecture featuring Inverted Residual Blocks (IRB) and SET blocks, we achieve improved training stability and generalization. Experimental results demonstrate that SET outperforms models such as MobileViT and MobileNets in image classification, object detection, and semantic segmentation on the ImageNet, MS COCO, and ADE20k datasets, respectively. With minimal parameters and floating-point operations (FLOPs), SET achieves superior performance, highlighting its advantages in accuracy, efficiency, and adaptability across diverse vision tasks. The source code is available at https://github.com/Xuwei86/SET/tree/main to support reproducibility.
Citations: 0
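A simplified PyTorch sketch of the hybrid idea in the abstract: one path applies self-attention over spatial tokens, the other applies a lightweight channel gate, and the two are fused with a residual. This only illustrates mixing spatial self-attention with channel attention; the actual HDAM, WISA, and ACG designs are not reproduced, and all sizes are assumptions (the released code at the linked repository is the authoritative reference).

```python
import torch
import torch.nn as nn

class HybridAttentionSketch(nn.Module):
    def __init__(self, dim: int = 32, heads: int = 4):
        super().__init__()
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(dim, dim // 4), nn.ReLU(),
            nn.Linear(dim // 4, dim), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)            # (B, H*W, C) spatial tokens
        attn_out, _ = self.spatial_attn(tokens, tokens, tokens)
        spatial = attn_out.transpose(1, 2).reshape(b, c, h, w)
        gate = self.channel_gate(x).view(b, c, 1, 1)     # per-channel importance weights
        return spatial * gate + x                        # fuse both paths with a residual

block = HybridAttentionSketch()
print(block(torch.rand(1, 32, 16, 16)).shape)  # torch.Size([1, 32, 16, 16])
```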
FUT: Frequency-aware U-shaped transformer for image denoising
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104528 · Pub Date: 2025-07-17 · DOI: 10.1016/j.jvcir.2025.104528
Yaozong Zhang, Hao Yu, Yang Yang, Jin Liu, Qian Li, Zhenghua Huang
Abstract: The Transformer has a powerful ability to capture global dependencies among all pixels or regions with the help of the self-attention mechanism, and it achieves remarkable denoising performance. However, such achievements come at the cost of deeply stacked networks and repeated displacements to adjust the feature maps. To address these issues, this paper develops a frequency-aware U-shaped transformer (FUT) for image denoising, consisting of encoding and decoding procedures. The encoding process contains two modules, a down-sampling block and a convolution residual block: the down-sampling block employs a multi-spectral attention mechanism to extract different spectral information across channels and preserve its richness, while the convolution residual block uses a spatial attention mechanism that automatically captures important regional features and remains robust to operations such as cropping, translation, and rotation. In the decoding procedure, a dual-branch transformer is employed to reduce the depth of the network, in which a feature-map pixel-exchange method adjusts the whole feature map without multiple displacements. A dual-branch up-sampling scheme that preserves both global and local information is also presented. Ablation experiments on the selection of different modules in FUT validate their effectiveness, and extensive quantitative and qualitative results demonstrate that FUT achieves competitive denoising performance and even outperforms SOTA denoising methods.
Citations: 0
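A small sketch of a frequency-aware ("multi-spectral") channel attention in the spirit of the down-sampling block described above: channel descriptors are taken against a few fixed 2-D DCT basis functions instead of plain average pooling, then fed to a gating MLP. The chosen frequencies, the fixed 16×16 spatial size, and the layer sizes are assumptions, not the paper's exact design.

```python
import math
import torch
import torch.nn as nn

def dct_basis(h: int, w: int, u: int, v: int) -> torch.Tensor:
    """2-D DCT-II basis function for frequency pair (u, v) on an h x w grid."""
    i = torch.arange(h).float().unsqueeze(1)
    j = torch.arange(w).float().unsqueeze(0)
    return torch.cos(math.pi * u * (i + 0.5) / h) * torch.cos(math.pi * v * (j + 0.5) / w)

class FreqChannelAttention(nn.Module):
    def __init__(self, channels: int = 32, size: int = 16,
                 freqs=((0, 0), (0, 1), (1, 0), (1, 1))):
        super().__init__()
        basis = torch.stack([dct_basis(size, size, u, v) for u, v in freqs])  # (F, H, W)
        self.register_buffer("basis", basis)
        self.gate = nn.Sequential(nn.Linear(channels * len(freqs), channels), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # project every channel onto each frequency basis -> (B, C, F) spectral descriptor
        desc = torch.einsum("bchw,fhw->bcf", x, self.basis).reshape(b, -1)
        weights = self.gate(desc).view(b, c, 1, 1)
        return x * weights

attn = FreqChannelAttention()
print(attn(torch.rand(2, 32, 16, 16)).shape)  # torch.Size([2, 32, 16, 16])
```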
CMIED-Net: A disentangled network for clothing-change person re-identification with clothing mixing and identity enhancement
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104527 · Pub Date: 2025-07-17 · DOI: 10.1016/j.jvcir.2025.104527
Yingzhu Zeng, Jianxun Zhang, Jiawei Zhu, Hongji Chen
Abstract: Clothing-change person re-identification (Re-ID) is a challenging problem because changes in clothing significantly increase intra-class variation and reduce inter-class variation. The latent entanglement between clothing and identity-related features complicates the extraction of reliable identity information across diverse conditions. To address this, we propose CMIED-Net, a disentangled network that combines clothing mixing and identity enhancement strategies. Specifically, we design a Multi-scale Information Fusion Decoupling (MIFD) module that integrates multimodal information and separates it into identity-related body-shape features and identity-irrelevant clothing features. Furthermore, we propose a Median-Driven Channel Attention (MDCA) mechanism to enhance feature representation while suppressing fusion noise. To stabilize identity features and better handle clothing variability, we also design a Human Shape Enhancement Module (HSEM) and a Clothing Mixing Module (CMM). Extensive experiments demonstrate that CMIED-Net outperforms existing methods on several clothing-change person Re-ID benchmarks.
Citations: 0
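An illustrative PyTorch sketch of the disentangle-and-mix idea sketched in the abstract: an encoder output is split into an identity-related code and a clothing code, clothing codes are shuffled within the batch (a crude stand-in for "clothing mixing") while the identity codes stay fixed, and only the identity code feeds the Re-ID classifier. The dimensions, the split point, and the mixing rule are assumptions for illustration, not the MIFD/CMM modules themselves.

```python
import torch
import torch.nn as nn

feat_dim, id_dim, num_ids = 256, 128, 100
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 32, feat_dim), nn.ReLU())
id_head = nn.Linear(id_dim, num_ids)

images = torch.rand(8, 3, 64, 32)           # a batch of pedestrian crops
labels = torch.randint(0, num_ids, (8,))

feats = encoder(images)                      # (8, 256) entangled features
id_code, cloth_code = feats[:, :id_dim], feats[:, id_dim:]

perm = torch.randperm(images.size(0))
mixed = torch.cat([id_code, cloth_code[perm]], dim=1)   # same identity, swapped clothing code

# the identity loss is computed from the identity code only, so the clothing swap
# should leave the prediction unchanged if the disentanglement succeeds
loss = nn.functional.cross_entropy(id_head(mixed[:, :id_dim]), labels)
loss.backward()
print(float(loss))
```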
Advancements in medical image quality assessment for spinal CT and MRI images
IF 3.1 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104526 · Pub Date: 2025-07-17 · DOI: 10.1016/j.jvcir.2025.104526
Ying Li, Hongyu Wang, Yu Fu, Yushuo Ling, Yukun Du, Yongming Xi, Huan Yang
Abstract: Medical image quality assessment (MIQA) is crucial for ensuring diagnostic accuracy and reliability, optimizing imaging parameters to reduce misdiagnosis risks, and conserving medical resources. The design of MIQA faces multiple challenges, such as diverse imaging modalities and variations in clinical content. Because ideal reference images are rarely available in medical imaging, blind image quality assessment (BIQA) methods are recommended for evaluating medical image quality. This paper provides a comprehensive review of the latest advances in BIQA and their application to assessing spinal CT and MRI images, which belong to a broad category of medical imaging. We constructed four spinal imaging datasets (CerS-CT, CerS-MRI, LumS-MRI, and LSpine-MRI) to systematically explore the performance and adaptability of various methods on medical spinal images. Additionally, we highlight several key factors that should be considered when designing IQA methods for medical spinal images and outline potential future research directions. This review offers valuable information for researchers in the field of BIQA for medical spinal imaging, helping to guide intelligent clinical diagnosis in real applications.
Citations: 0
Exploring training data-free video generation from a single image via a stable diffusion model
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104504 · Pub Date: 2025-07-16 · DOI: 10.1016/j.jvcir.2025.104504
Xianjun Han, Huayong Sheng, Can Bai
Abstract: Video generation is typically performed by incorporating frame information into a model and combining it with optical flow or warping operations. However, this approach requires extensive training on multiple frames, making it time-intensive and arduous. Generating a video from only one image as the initial frame would therefore be a meaningful improvement. This paper presents a method that uses textual information to guide video generation from a single image. Specifically, the work relies primarily on the stable diffusion model to invert the input image to a noise map and to generate text-to-image cross-attention maps between the input image and its corresponding text at each time step. Additionally, CLIP is used to calculate the residuals between the target text and the original image text, which guide video generation. Subsequently, under the constraints of the cross-attention maps, we employ the stable diffusion UNet denoiser to obtain progressive latent codes from the textual residuals and the noise map. These latent codes, which correspond to keyframes, are then used to generate additional latent codes through cubic spline interpolation. Finally, all the latent codes are fed into the stable diffusion variational autoencoder (VAE) decoder to generate frames. We use optical flow and warping operations to trim these frames and synthesize the video, avoiding blurring and artifacts. Throughout the video generation process, no training is performed; instead, the stable diffusion model is repeatedly sampled. The experimental results demonstrate the effectiveness of the proposed method, which can quickly generate various videos based on different text inputs.
Citations: 0
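A small sketch of the keyframe-densification step described in the abstract: the latent codes of a few keyframes are interpolated element-wise with a cubic spline to obtain in-between latents, which would then go through the stable-diffusion VAE decoder. The latent shape, frame counts, and the decoder call are placeholders, not values from the paper.

```python
import numpy as np
from scipy.interpolate import CubicSpline

latent_shape = (4, 64, 64)                  # assumed stable-diffusion latent resolution
key_times = np.array([0.0, 0.5, 1.0])       # three keyframe latents from the denoiser
key_latents = np.random.randn(len(key_times), *latent_shape)

# fit one cubic spline per latent element along the time axis
spline = CubicSpline(key_times, key_latents.reshape(len(key_times), -1), axis=0)

frame_times = np.linspace(0.0, 1.0, 16)     # 16 output frames
frames_latent = spline(frame_times).reshape(len(frame_times), *latent_shape)

# each interpolated latent would then be decoded, e.g. vae.decode(latent), to get a frame
print(frames_latent.shape)                  # (16, 4, 64, 64)
```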
SSH-Net: A self-supervised and hybrid network for noisy image watermark removal
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104516 · Pub Date: 2025-07-16 · DOI: 10.1016/j.jvcir.2025.104516
Wenyang Liu, Jianjun Gao, Kim-Hui Yap
Abstract: Visible watermark removal is challenging due to its inherent complexity and the noise carried within images. Existing methods primarily rely on supervised learning approaches that require paired datasets of watermarked and watermark-free images, which are often impractical to obtain in real-world scenarios. To address this challenge, we propose SSH-Net, a Self-Supervised and Hybrid Network specifically designed for noisy image watermark removal. SSH-Net synthesizes reference watermark-free images using the watermark distribution in a self-supervised manner and adopts a dual-network design to address the task. The upper network, focused on the simpler task of noise removal, employs a lightweight CNN-based architecture, while the lower network, designed to handle the more complex task of simultaneously removing watermarks and noise, incorporates Transformer blocks to model long-range dependencies and capture intricate image features. To enhance the model's effectiveness, a shared CNN-based feature encoder is introduced before the dual networks to extract common features that both networks can leverage. Comprehensive experiments show that our proposed method surpasses state-of-the-art approaches in both performance and efficiency, demonstrating its effectiveness in noisy image watermark removal. Our code will be available at https://github.com/wenyang001/SSH-Net.
Citations: 0
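A condensed PyTorch sketch of the dual-network layout described above: a shared CNN encoder feeds an upper lightweight CNN branch (noise removal) and a lower Transformer-based branch (joint watermark and noise removal). Channel counts, depths, and the token size are assumptions, and the self-supervised target synthesis is omitted; the linked repository is the authoritative implementation.

```python
import torch
import torch.nn as nn

class SSHSketch(nn.Module):
    def __init__(self, dim: int = 32):
        super().__init__()
        self.shared = nn.Sequential(nn.Conv2d(3, dim, 3, padding=1), nn.ReLU())
        self.upper = nn.Sequential(nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
                                   nn.Conv2d(dim, 3, 3, padding=1))     # lightweight denoise branch
        self.lower_tf = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                                   batch_first=True)    # long-range modeling
        self.lower_out = nn.Conv2d(dim, 3, 3, padding=1)                # watermark + noise branch

    def forward(self, x: torch.Tensor):
        f = self.shared(x)                              # common features for both branches
        denoised = self.upper(f)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)           # (B, H*W, C) tokens for the Transformer
        g = self.lower_tf(tokens).transpose(1, 2).reshape(b, c, h, w)
        watermark_free = self.lower_out(g)
        return denoised, watermark_free

net = SSHSketch()
noisy_watermarked = torch.rand(1, 3, 64, 64)
out1, out2 = net(noisy_watermarked)
print(out1.shape, out2.shape)   # both torch.Size([1, 3, 64, 64])
```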
ThermalDiff: A diffusion architecture for thermal image synthesis
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104524 · Pub Date: 2025-07-12 · DOI: 10.1016/j.jvcir.2025.104524
Tayeba Qazi, Brejesh Lall, Prerana Mukherjee
Abstract: Thermal images represent the heat signature of a scene. However, thermal imaging sensors are not only expensive but also require exacting evaluation during manufacturing. Leveraging both visible and infrared (IR) images in deep-learning-based computer vision tasks is a recent development, so many applications require datasets containing paired visible and infrared images. In contrast to the widespread availability of large visible-spectrum image datasets, thermal image datasets are not readily available, and this scarcity hinders deep learning algorithms that depend on thermal data. In this study, we introduce an algorithm based on diffusion models for synthesizing thermal images that closely resemble real thermal data. Our proposed architecture, named ThermalDiff, synthesizes thermal images by estimating them from their visible-image counterparts. We evaluate the architecture on three thermal benchmark datasets, compare it with the Pix2Pix, UNet-GAN, ThermalGAN, and InfraGAN architectures using multiple metrics, and report state-of-the-art results across all metrics. We observe up to +14 dB and +7 dB higher PSNR than the baseline on the VEDAI and KAIST datasets, respectively, and comparable results on the FLIR dataset. Additionally, we achieve a reduction of one order of magnitude in the FID score across all datasets. Through a pedestrian detection experiment that shows an average mAP improvement of 8% when the training set is augmented with synthetic thermal data, we validate the effectiveness of the samples generated by ThermalDiff.
Citations: 0
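The abstract reports gains in PSNR between synthesized and ground-truth thermal images; below is a minimal, generic PSNR helper for such a comparison (8-bit images assumed). It is a standard metric implementation, not code from the paper, and the example inputs are synthetic.

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

# synthetic stand-ins: a "real" thermal frame and a slightly perturbed "generated" one
real_thermal = np.random.randint(0, 256, (128, 160), dtype=np.uint8)
fake_thermal = np.clip(real_thermal + np.random.normal(0, 5, real_thermal.shape), 0, 255)
print(f"PSNR: {psnr(real_thermal, fake_thermal):.2f} dB")
```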
A multi-modal multi-scale network based on Transformer for micro-expression recognition
IF 2.6 · CAS Q4 · Computer Science
Journal of Visual Communication and Image Representation, Vol. 111, Article 104537 · Pub Date: 2025-07-11 · DOI: 10.1016/j.jvcir.2025.104537
Fengping Wang, Jie Li, Chun Qi, Lin Wang, Pan Wang
Abstract: Micro-expression recognition is challenging because micro-expressions are transient, subtle in intensity, and spatially variable across categories. Existing methods often fail to capture fine-grained motion across scales and to effectively integrate multi-modal information while preserving contextual relevance. To address this, we propose a Transformer-based framework that jointly models multi-scale and multi-modal representations. Specifically, dynamic images and optical flow are used to extract local motion features at multiple spatial resolutions. Patch-level features from both modalities are processed through multilayer, multi-head attention to capture intra- and inter-scale contextual dependencies. Additionally, a cross-modal contrastive learning strategy is introduced to improve feature alignment and enhance modality-level discriminability. Extensive experiments on three spontaneous micro-expression datasets (SMIC, CASME II, and SAMM) demonstrate the effectiveness of our method, achieving accuracy rates of 0.7805, 0.8008, and 0.75, respectively. On the composite dataset, it achieves state-of-the-art UF1 and UAR scores of 0.8321 and 0.8233.
Citations: 0
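A compact sketch of a cross-modal contrastive objective of the kind mentioned in the abstract: features of the same clip from the two modalities (dynamic image and optical flow) are pulled together and pushed away from other clips via a symmetric InfoNCE loss. The feature dimension and temperature are assumptions; the paper's exact loss formulation may differ.

```python
import torch
import torch.nn.functional as F

def cross_modal_infonce(feat_a: torch.Tensor, feat_b: torch.Tensor,
                        temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE between two batches of aligned modality features (B, D)."""
    a = F.normalize(feat_a, dim=1)
    b = F.normalize(feat_b, dim=1)
    logits = a @ b.t() / temperature                 # (B, B) cosine-similarity logits
    targets = torch.arange(a.size(0), device=a.device)  # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

dyn_feat = torch.randn(8, 128)    # dynamic-image branch features
flow_feat = torch.randn(8, 128)   # optical-flow branch features
print(float(cross_modal_infonce(dyn_feat, flow_feat)))
```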