Latest publications from IEEE Transactions on Image Processing (a publication of the IEEE Signal Processing Society)

PalmDiff: When Palmprint Generation Meets Controllable Diffusion Model
IF 13.7
Long Tang;Tingting Chai;Zheng Zhang;Miao Zhang;Xiangqian Wu
{"title":"PalmDiff: When Palmprint Generation Meets Controllable Diffusion Model","authors":"Long Tang;Tingting Chai;Zheng Zhang;Miao Zhang;Xiangqian Wu","doi":"10.1109/TIP.2025.3593974","DOIUrl":"10.1109/TIP.2025.3593974","url":null,"abstract":"Due to its distinctive texture and intricate details, palmprint has emerged as a critical modality in biometric identity recognition. The absence of large-scale public palmprint datasets has substantially impeded the advancement of palmprint research, resulting in inadequate accuracy in commercial palmprint recognition systems. However, existing generative methods exhibit insufficient generalization, as the images they generate differ in specific ways from the conditional images. This paper proposes a method for generating palmprint images using a controllable diffusion model (PalmDiff), which addresses the issue of insufficient datasets by generating palmprint data, improving the accuracy of palmprint recognition. We introduce a diffusion process that effectively tackles the problems of excessive noise and loss of texture details commonly encountered in diffusion models. A linear attention mechanism is employed to enhance the backbone’s expressive capacity and reduce the computational complexity. To this end, we proposed an ID loss function to enable the diffusion model to generate palmprint images under the same identical space consistently. PalmDiff is compared with other generation methods in terms of both image quality and the enhancement of palmprint recognition performance. Experiments show that PalmDiff performs well in image generation, with an FID score of 13.311 on MPD and 18.434 on Tongji. Besides, PalmDiff has significantly improved various backbones for palmprint recognition compared to other generation methods.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"5228-5240"},"PeriodicalIF":13.7,"publicationDate":"2025-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144791949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
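The abstract describes the ID loss only as keeping generated palmprints "in the same identity space." A minimal sketch of one plausible form, assuming a frozen, pre-trained palmprint-recognition embedder; the embedder, shapes, and cosine formulation are assumptions, not the paper's exact loss:

```python
# Hypothetical identity-consistency ("ID") loss sketch: pull the recognition
# embedding of a generated palmprint toward the embedding of a real palmprint
# of the same identity, using a frozen embedder.
import torch
import torch.nn.functional as F

def id_loss(embedder: torch.nn.Module,
            generated: torch.Tensor,   # (B, C, H, W) generated palmprints
            reference: torch.Tensor    # (B, C, H, W) real images, same identities
            ) -> torch.Tensor:
    z_gen = F.normalize(embedder(generated), dim=-1)
    with torch.no_grad():              # keep the recognition embedder frozen
        z_ref = F.normalize(embedder(reference), dim=-1)
    # 1 - cosine similarity, averaged over the batch
    return (1.0 - (z_gen * z_ref).sum(dim=-1)).mean()
```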
FusionINV: A Diffusion-Based Approach for Multimodal Image Fusion
IF 13.7
Pengwei Liang;Junjun Jiang;Qing Ma;Chenyang Wang;Xianming Liu;Jiayi Ma
{"title":"FusionINV: A Diffusion-Based Approach for Multimodal Image Fusion","authors":"Pengwei Liang;Junjun Jiang;Qing Ma;Chenyang Wang;Xianming Liu;Jiayi Ma","doi":"10.1109/TIP.2025.3593775","DOIUrl":"10.1109/TIP.2025.3593775","url":null,"abstract":"Infrared images exhibit a significantly different appearance compared to visible counterparts. Existing infrared and visible image fusion (IVF) methods fuse features from both infrared and visible images, producing a new “image” appearance not inherently captured by any existing device. From an appearance perspective, infrared, visible, and fused images belong to different data domains. This difference makes it challenging to apply fused images because their domain-specific appearance may be difficult for downstream systems, e.g., pre-trained segmentation models. Therefore, accurately assessing the quality of the fused image is challenging. To address those problem, we propose a novel IVF method, FusionINV, which produces fused images with an appearance similar to visible images. FusionINV employs the pre-trained Stable Diffusion (SD) model to invert infrared images into the noise feature space. To inject visible-style appearance information into the infrared features, we leverage the inverted features from visible images to guide this inversion process. In this way, we can embed all the information of infrared and visible images in the noise feature space, and then use the prior of the pre-trained SD model to generate visually friendly images that align more closely with the RGB distribution. Specially, to generate the fused image, we design a tailored fusion rule within the denoising process that iteratively fuses visible-style infrared and visible features. In this way, the fused image falls into the visible domain and can be directly applied to existing downstream machine systems. Thanks to advancements in image inversion, FusionINV can directly produce fused images in a training-free manner. Extensive experiments demonstrate that FusionINV achieves outstanding performance in both human visual evaluation and machine perception tasks. The code is available at <uri>https://github.com/erfect2020/FusionINV</uri>","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"5355-5368"},"PeriodicalIF":13.7,"publicationDate":"2025-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144786591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
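The "tailored fusion rule within the denoising process" is not spelled out in the abstract. A minimal sketch of the iterative-fusion idea, assuming both images have already been inverted into the same SD latent space; the blend schedule and `denoiser_step` are hypothetical placeholders, not the paper's rule:

```python
# Hypothetical latent-fusion step inside a denoising loop: blend the
# visible-style infrared latent and the visible latent at each timestep.
import torch

def fuse_step(z_ir: torch.Tensor, z_vis: torch.Tensor, t: int, T: int) -> torch.Tensor:
    """Blend inverted infrared and visible latents at denoising step t of T."""
    w = 0.5 + 0.5 * t / T   # assumed schedule: visible latent dominates early steps
    return w * z_vis + (1.0 - w) * z_ir

# Schematic use (denoiser_step is a stand-in for the SD denoiser update):
# for t in reversed(range(T)):
#     z = fuse_step(z_ir_traj[t], z_vis_traj[t], t, T)
#     z = denoiser_step(z, t)
```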
UFPF: A Universal Feature Perception Framework for Microscopic Hyperspectral Images
IF 13.7
Geng Qin;Huan Liu;Wei Li;Xueyu Zhang;Yuxing Guo;Xiang-Gen Xia
{"title":"UFPF: A Universal Feature Perception Framework for Microscopic Hyperspectral Images","authors":"Geng Qin;Huan Liu;Wei Li;Xueyu Zhang;Yuxing Guo;Xiang-Gen Xia","doi":"10.1109/TIP.2025.3594151","DOIUrl":"10.1109/TIP.2025.3594151","url":null,"abstract":"In recent years, deep learning has shown immense promise in advancing medical hyperspectral imaging diagnostics at the microscopic level. Despite this progress, most existing research models remain constrained to single-task or single-scene applications, lacking robust collaborative interpretation of microscopic hyperspectral features and spatial information, thereby failing to fully explore the clinical value of hyperspectral data. In this paper, we propose a microscopic hyperspectral universal feature perception framework (UFPF), which extracts high-quality spatial-spectral features of hyperspectral data, providing a robust feature foundation for downstream tasks. Specifically, this innovative framework captures different sequential spatial nearest-neighbor relationships through a hierarchical corner-to-center mamba structure. It incorporates the concept of “progressive focus towards the center”, starting by emphasizing edge information and gradually refining attention from the edges towards the center. This approach effectively integrates richer spatial-spectral information, boosting the model’s feature extraction capability. On this basis, a dual-path spatial-spectral joint perception module is developed to achieve the complementarity of spatial and spectral information and fully explore the potential patterns in the data. In addition, a Mamba-attention Mix-alignment is designed to enhance the optimized alignment of deep semantic features. The experimental results on multiple datasets have shown that this framework significantly improves classification and segmentation performance, supporting the clinical application of medical hyperspectral data. The code is available at: <uri>https://github.com/Qugeryolo/UFPF</uri>","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"5513-5526"},"PeriodicalIF":13.7,"publicationDate":"2025-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144786589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
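One way to picture the "corner-to-center" scan is as a serialization of grid positions from the border inward, so a sequence model sees edge context before the center. A minimal sketch of that ordering idea; the paper's actual hierarchical scan may differ:

```python
# Hypothetical corner-to-center serialization: order H x W grid positions by
# decreasing distance from the image center (border first, center last).
import numpy as np

def corner_to_center_order(h: int, w: int) -> np.ndarray:
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    # Chebyshev distance to the center: larger means closer to the border
    d = np.maximum(np.abs(ys - cy), np.abs(xs - cx)).ravel()
    return np.argsort(-d, kind="stable")  # border tokens first, center last

# tokens: (H*W, C) flattened spatial features, reordered before a Mamba block:
# seq = tokens[corner_to_center_order(H, W)]
```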
Hard EXIF: Protecting Image Authorship Through Metadata, Hardware, and Content
IF 13.7
Yushu Zhang;Bowen Shi;Shuren Qi;Xiangli Xiao;Ping Wang;Wenying Wen
{"title":"Hard EXIF: Protecting Image Authorship Through Metadata, Hardware, and Content","authors":"Yushu Zhang;Bowen Shi;Shuren Qi;Xiangli Xiao;Ping Wang;Wenying Wen","doi":"10.1109/TIP.2025.3593911","DOIUrl":"10.1109/TIP.2025.3593911","url":null,"abstract":"With the rapid proliferation of digital image content and advancements in image editing technologies, the protection of digital image authorship has become an increasingly important issue. Traditional methods for authorship protection include registering authorship through certification organization, utilizing image metadata such as Exchangeable Image File Format (EXIF) data, and employing watermarking techniques to prove ownership. In recent years, blockchain-based technologies have also been introduced to enhance authorship protection further. However, these approaches face challenges in balancing four key attributes: strong legal validity, high security, low cost, and high usability. Authorship registration is often cumbersome, EXIF metadata can be easily extracted and tampered with, watermarking techniques are vulnerable to various forms of attack, and blockchain technology is complex to implement and requires long-term maintenance. In response to these challenges, this paper introduces a new framework Hard EXIF, designed to balance these multiple attributes while delivering improved performance. The proposed method integrates metadata with physically unclonable functions (PUFs) for the first time, creating unique device fingerprints and embedding them into images using watermarking techniques. By leveraging the security and simplicity of hash functions and PUFs, this method enhances EXIF security while minimizing costs. Experimental results demonstrate that the Hard EXIF framework achieves an average peak signal-to-noise ratio (PSNR) of 42.89 dB, with a similarity of 99.46% between the original and watermarked images, and the extraction error rate is only 0.0017. These results show that the Hard EXIF framework balances legal validity, security, cost, and usability, promising authorship protection with great potential for wider application.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"5023-5037"},"PeriodicalIF":13.7,"publicationDate":"2025-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144786590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
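For reference, the 42.89 dB figure quoted above is the standard PSNR between the original and watermarked images; for 8-bit images it is computed as follows:

```python
# Standard PSNR definition (the metric reported in the abstract), for 8-bit
# images with peak value 255. Identical images give infinite PSNR.
import numpy as np

def psnr(original: np.ndarray, watermarked: np.ndarray, peak: float = 255.0) -> float:
    mse = np.mean((original.astype(np.float64) - watermarked.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)
```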
Asymmetric and Discrete Self-Representation Enhancement Hashing for Cross-Domain Retrieval
IF 13.7
Jiaxing Li;Lin Jiang;Xiaozhao Fang;Shengli Xie;Yong Xu
{"title":"Asymmetric and Discrete Self-Representation Enhancement Hashing for Cross-Domain Retrieval","authors":"Jiaxing Li;Lin Jiang;Xiaozhao Fang;Shengli Xie;Yong Xu","doi":"10.1109/TIP.2025.3594140","DOIUrl":"10.1109/TIP.2025.3594140","url":null,"abstract":"Due to the characteristics of low storage requirement and high retrieval efficiency, hashing-based retrieval has shown its great potential and has been widely applied for information retrieval. However, retrieval tasks in real-world applications are usually required to handle the data from various domains, leading to the unsatisfactory performances of existing hashing-based methods, as most of them assuming that the retrieval pool and the querying set are similar. Most of the existing works overlooked the self-representation that containing the modality-specific semantic information, in the cross-modal data. To cope with the challenges mentioned above, this paper proposes an asymmetric and discrete self-representation enhancement hashing (ADSEH) for cross-domain retrieval. Specifically, ADSEH aligns the mathematical distribution with domain adaptation for cross-domain data, by exploiting the correlation of minimizing the distribution mismatch to reduce the heterogeneous semantic gaps. Then, ADSEH learns the self-representation which is embedded into the generated hash codes, for enhancing the semantic relevance, improving the quality of hash codes, and boosting the generalization ability of ADSEH. Finally, the heterogeneous semantic gaps are further reduced by the log-likelihood similarity preserving for the cross-domain data. Experimental results demonstrate that ADSEH can outperform some SOTA baseline methods on four widely used datasets.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"5158-5171"},"PeriodicalIF":13.7,"publicationDate":"2025-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144786636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
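The retrieval step shared by hashing methods like ADSEH is Hamming-distance ranking: binary codes are compared bitwise and the database is sorted by distance to the query code. A minimal, method-agnostic sketch:

```python
# Hamming-distance ranking over binary hash codes: the common retrieval step
# once codes are learned (the learning itself is the paper's contribution).
import numpy as np

def hamming_rank(query: np.ndarray, database: np.ndarray) -> np.ndarray:
    """query: (K,) bits in {0,1}; database: (N, K) bits in {0,1}.
    Returns database indices sorted from best match to worst."""
    dists = np.count_nonzero(database != query, axis=1)   # per-row Hamming distance
    return np.argsort(dists, kind="stable")
```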
Faces Blind Your Eyes: Unveiling the Content-Irrelevant Synthetic Artifacts for Deepfake Detection
IF 13.7
Xinghe Fu;Benzun Fu;Shen Chen;Taiping Yao;Yiting Wang;Shouhong Ding;Xiubo Liang;Xi Li
{"title":"Faces Blind Your Eyes: Unveiling the Content-Irrelevant Synthetic Artifacts for Deepfake Detection","authors":"Xinghe Fu;Benzun Fu;Shen Chen;Taiping Yao;Yiting Wang;Shouhong Ding;Xiubo Liang;Xi Li","doi":"10.1109/TIP.2025.3592576","DOIUrl":"10.1109/TIP.2025.3592576","url":null,"abstract":"Data synthesis methods have shown promising results in general deepfake detection tasks. This is attributed to the inherent blending process in deepfake creation, which leaves behind distinct synthetic artifacts. However, the existence of content-irrelevant artifacts has not been explicitly explored in the deepfake synthesis. Unveiling content-irrelevant synthetic artifacts helps uncover general deepfake features and enhances the generalization capability of detection models. To capture the content-irrelevant synthetic artifacts, we propose a learning framework incorporating a synthesis process for diverse contents and specially designed learning strategies that encourage using content-irrelevant forgery information across deepfake images. From the data perspective, we disentangle the blending operation from face data and propose a universal synthetic module that generates images from various classes with common synthetic artifacts. From the learning perspective, a domain-adaptive learning head is introduced to filter out forgery-irrelevant features and optimize the decision on deepfake face detection. To efficiently learn the content-irrelevant artifacts for detection with a large sampling space, we propose a batch-wise sample selection strategy that actively mines the hard samples based on their effect on the adaptive decision boundary. Extensive cross-dataset experiments show that our method achieves state-of-the-art performance in general deepfake detection.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"5686-5696"},"PeriodicalIF":13.7,"publicationDate":"2025-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144777821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
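The core idea the abstract builds on is that blending alone, regardless of content, leaves detectable artifacts. A hedged sketch of such a synthesis step: paste a mildly perturbed copy of an image back into itself through a soft mask. All mask and jitter parameters here are illustrative assumptions, not the paper's universal synthetic module:

```python
# Hypothetical blending-artifact synthesis: blend a photometrically jittered
# copy of an image into itself through a soft, feathered mask.
import numpy as np

def blend_with_mask(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """img: (H, W, 3) uint8 image; returns an image with synthetic blend artifacts."""
    h, w = img.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = rng.uniform(0.3, 0.7) * h, rng.uniform(0.3, 0.7) * w
    r = rng.uniform(0.15, 0.35) * min(h, w)
    # soft circular mask with a feathered boundary (assumed shape)
    mask = np.clip(1.0 - np.hypot(ys - cy, xs - cx) / r, 0.0, 1.0)[..., None]
    # mild photometric jitter stands in for the source-image transform
    altered = np.clip(img.astype(np.float64) * rng.uniform(0.9, 1.1)
                      + rng.uniform(-8.0, 8.0), 0, 255)
    return (mask * altered + (1.0 - mask) * img).astype(img.dtype)
```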
Simplifying Scalable Subspace Clustering and Its Multi-View Extension by Anchor-to-Sample Kernel
IF 13.7
Zhoumin Lu;Feiping Nie;Linru Ma;Rong Wang;Xuelong Li
{"title":"Simplifying Scalable Subspace Clustering and Its Multi-View Extension by Anchor-to-Sample Kernel","authors":"Zhoumin Lu;Feiping Nie;Linru Ma;Rong Wang;Xuelong Li","doi":"10.1109/TIP.2025.3593057","DOIUrl":"10.1109/TIP.2025.3593057","url":null,"abstract":"As we all known, sparse subspace learning can provide good input for spectral clustering, thereby producing high-quality cluster partitioning. However, it employs complete samples as the dictionary for representation learning, resulting in non-negligible computational costs. Therefore, replacing the complete samples with representative ones (anchors) as the dictionary has become a more popular choice, giving rise to a series of related works. Unfortunately, although these works are linear with respect to the number of samples, they are often quadratic or even cubic with respect to the number of anchors. In this paper, we derive a simpler problem to replace the original scalable subspace clustering, whose properties are utilized. This new problem is linear with respect to both the number of samples and anchors, further enhancing scalability and providing more efficient operations. Furthermore, thanks to the new problem formulation, we can adopt a separate fusion strategy for multi-view extensions. This strategy can better measure the inter-view difference and avoid alternate optimization, so as to achieve more robust and efficient multi-view clustering. Finally, comprehensive experiments demonstrate that our methods not only significantly reduce time overhead but also exhibit superior performance.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"5084-5098"},"PeriodicalIF":13.7,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144763227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
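To make the anchor idea concrete: instead of an n x n sample-to-sample affinity, each sample is described by similarities to m << n anchors. A minimal sketch of such an anchor-to-sample affinity; the Gaussian kernel and row normalization are illustrative assumptions, not the paper's kernel:

```python
# Hypothetical anchor-to-sample affinity: an (n, m) matrix of kernel
# similarities between n samples and m anchors, m << n.
import numpy as np

def anchor_affinity(X: np.ndarray, anchors: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """X: (n, d) samples; anchors: (m, d). Returns a row-stochastic (n, m) matrix."""
    sq = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=-1)  # (n, m) sq. dists
    Z = np.exp(-sq / (2.0 * sigma ** 2))
    return Z / Z.sum(axis=1, keepdims=True)

# A spectral embedding can then be taken from the SVD of Z, costing O(n m^2)
# rather than the O(n^3) of eigendecomposing a full n x n affinity graph.
```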
Deep Ensemble Model for Quantitative Optical Property and Chromophore Concentration Images of Biological Tissues
IF 13.7
Bingbao Yan;Bowen Song;Chang Ge;Xinman Yin;Wenchao Jia;Gen Mu;Yanyu Zhao
{"title":"Deep Ensemble Model for Quantitative Optical Property and Chromophore Concentration Images of Biological Tissues","authors":"Bingbao Yan;Bowen Song;Chang Ge;Xinman Yin;Wenchao Jia;Gen Mu;Yanyu Zhao","doi":"10.1109/TIP.2025.3593071","DOIUrl":"10.1109/TIP.2025.3593071","url":null,"abstract":"The ability to quantify widefield tissue optical properties (OPs, i.e., absorption and scattering) has major implications on the characterization of various physiological and disease processes. However, conventional image processing methods for tissue optical properties are either limited to qualitative analysis, or have tradeoffs in speed and accuracy. The key to quantification of optical properties is the extraction of amplitude maps from reflectance images under sinusoidal illumination of different spatial frequencies. Conventional three-phase demodulation (TPD) method has been demonstrated for the mapping of OPs, but it requires as many as 14 measurement images for accurate OP extraction, which leads to limited throughput and hinders practical translation. Although single-phase demodulation (SPD) method has been proposed to map OPs with a single measurement image, it is typically subject to image artifacts and decreased measurement accuracy. To tackle those challenges, here we develop a deep ensemble model (DEM) that can map tissue optical properties with high accuracy in a single snapshot, increasing the measurement speed by <inline-formula> <tex-math>$14times $ </tex-math></inline-formula> compared to conventional TPD method. The proposed method was validated with measurements on an array of optical phantoms, ex vivo tissues, and in vivo tissues. The errors for OP extraction were <inline-formula> <tex-math>$0.83~pm ~5.0$ </tex-math></inline-formula>% for absorption and <inline-formula> <tex-math>$0.40~pm ~1.9$ </tex-math></inline-formula>% for reduced scattering, dramatically lower than that of the state-of-the-art SPD method (<inline-formula> <tex-math>$2.5~pm ~15$ </tex-math></inline-formula>% for absorption and -<inline-formula> <tex-math>$1.2~pm ~11$ </tex-math></inline-formula>% for reduced scattering). It was further demonstrated that while trained with data from a single wavelength, the DEM can be directly applied to other wavelengths and effectively obtain optical property and chromophore concentration images of biological tissues. Together, these results highlight the potential of DEM to enable new capabilities for quantitative monitoring of tissue physiological and disease processes.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"4999-5008"},"PeriodicalIF":13.7,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11107267","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144763226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
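For context on the TPD baseline the abstract compares against: in spatial-frequency-domain imaging, the standard three-phase demodulation recovers the per-pixel AC amplitude from three reflectance images taken at illumination phases 0, 2π/3, and 4π/3. A sketch of that standard formula (assuming this is the TPD variant the paper means):

```python
# Standard three-phase demodulation of the AC amplitude map from three
# reflectance images under sinusoidal illumination at phases 0, 2pi/3, 4pi/3.
import numpy as np

def tpd_amplitude(i1: np.ndarray, i2: np.ndarray, i3: np.ndarray) -> np.ndarray:
    """i1, i2, i3: reflectance images at the three phases; returns the AC map."""
    return (np.sqrt(2.0) / 3.0) * np.sqrt(
        (i1 - i2) ** 2 + (i2 - i3) ** 2 + (i3 - i1) ** 2
    )
```

Needing three phases per spatial frequency is what drives the image count up to 14; DEM replaces this with a single snapshot.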
Partial Domain Adaptation via Importance Sampling-Based Shift Correction
IF 13.7
Cheng-Jun Guo;Chuan-Xian Ren;You-Wei Luo;Xiao-Lin Xu;Hong Yan
{"title":"Partial Domain Adaptation via Importance Sampling-Based Shift Correction","authors":"Cheng-Jun Guo;Chuan-Xian Ren;You-Wei Luo;Xiao-Lin Xu;Hong Yan","doi":"10.1109/TIP.2025.3593115","DOIUrl":"10.1109/TIP.2025.3593115","url":null,"abstract":"Partial domain adaptation (PDA) is a challenging task in real-world machine learning scenarios. It aims to transfer knowledge from a labeled source domain to a related unlabeled target domain, where the support set of the source label distribution subsumes the target one. Previous PDA works managed to correct the label distribution shift by weighting samples in the source domain. However, the simple reweighing technique cannot explore the latent structure and sufficiently use the labeled data, and then models are prone to over-fitting on the source domain. In this work, we propose a novel importance sampling-based shift correction (IS2C) method, where new labeled data are sampled from a built sampling domain, whose label distribution is supposed to be the same as the target domain, to characterize the latent structure and enhance the generalization ability of the model. We provide theoretical guarantees for IS2C by proving that the generalization error can be sufficiently dominated by IS2C. In particular, by implementing sampling with the mixture distribution, the extent of shift between source and sampling domains can be connected to generalization error, which provides an interpretable way to build IS2C. To improve knowledge transfer, an optimal transport-based independence criterion is proposed for conditional distribution alignment, where the computation of the criterion can be adjusted to reduce the complexity from <inline-formula> <tex-math>$mathcal {O}(n^{3})$ </tex-math></inline-formula> to <inline-formula> <tex-math>$mathcal {O}(n^{2})$ </tex-math></inline-formula> in realistic PDA scenarios. Extensive experiments on PDA benchmarks validate the theoretical results and demonstrate the effectiveness of our IS2C over existing methods.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"5009-5022"},"PeriodicalIF":13.7,"publicationDate":"2025-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144763228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
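The ingredient IS2C builds on is importance weighting under label shift: source examples are reweighted (or resampled) by the ratio of estimated target to source class priors, so classes absent from the target receive weight near zero. A minimal sketch of that classic step, with the prior estimator left as an assumption:

```python
# Hypothetical label-shift importance weights: per-sample weights equal to the
# ratio of estimated target class prior to empirical source class prior.
import numpy as np

def class_importance_weights(source_labels: np.ndarray,
                             target_prior: np.ndarray) -> np.ndarray:
    """source_labels: (n,) ints in [0, K); target_prior: (K,) estimated target p(y)."""
    counts = np.bincount(source_labels, minlength=target_prior.size)
    source_prior = counts / counts.sum()
    ratio = np.where(source_prior > 0,
                     target_prior / np.maximum(source_prior, 1e-12),
                     0.0)
    return ratio[source_labels]   # can weight losses or drive resampling
```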
Multi-Source Domain Generalization for Learned Lossless Volumetric Biomedical Image Compression
IF 13.7
Dongmei Xue;Siqi Wu;Li Li;Dong Liu;Zhu Li
{"title":"Multi-Source Domain Generalization for Learned Lossless Volumetric Biomedical Image Compression","authors":"Dongmei Xue;Siqi Wu;Li Li;Dong Liu;Zhu Li","doi":"10.1109/TIP.2025.3592549","DOIUrl":"10.1109/TIP.2025.3592549","url":null,"abstract":"Learned lossless compression methods for volumetric biomedical images have achieved significant performance improvements compared with the traditional ones. However, they often perform poorly when applied to unseen domains due to domain gap issues. To address this problem, we propose a multi-source domain generalization method to handle two main sources of domain gap issues: modality and structure differences. To address modality differences, we develop an adaptive modality transfer (AMT) module, which predicts a set of modality-specific parameters from the original image and embeds them into the bit stream. These parameters control the weights of a mixture of experts to create a dynamic convolution, which is then used for entropy coding to facilitate modality transfer. To address structure differences, we design an adaptive structure transfer (AST) module, which decomposes the high dynamic range biomedical images into least significant bits (LSB) and most significant bits (MSB) in the wavelet domain. The MSB information, which is unique to the test image, is then used to predict an additional set of dynamic convolutions to enable structure transfer. Experimental results show that our approach reduces performance degradation caused by the domain gap to within 3% across various volumetric biomedical modalities. This paves the way for the practical end-to-end biomedical image compression.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"34 ","pages":"4896-4907"},"PeriodicalIF":13.7,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144747515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
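The MSB/LSB decomposition the AST module starts from is a plain bit-plane split of high-bit-depth data. A minimal sketch of that split in isolation (the paper applies it to wavelet coefficients; the 8-bit split point here is an assumption):

```python
# Bit-plane split of a high-bit-depth volume into most and least significant
# bits; the original is recovered losslessly as (msb << lsb_bits) | lsb.
import numpy as np

def split_bits(x: np.ndarray, lsb_bits: int = 8) -> tuple[np.ndarray, np.ndarray]:
    """x: unsigned integer array (e.g., 16-bit voxels). Returns (msb, lsb)."""
    lsb = x & ((1 << lsb_bits) - 1)
    msb = x >> lsb_bits
    return msb, lsb
```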