Latest Articles from IEEE Transactions on Image Processing

Pro2Diff: Proposal Propagation for Multi-Object Tracking via the Diffusion Model
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2024-11-14 · DOI: 10.1109/tip.2024.3494600
Hongmin Liu, Canbin Zhang, Bin Fan, Jinglin Xu
Citations: 0

Enhanced Multispectral Band-to-Band Registration using Co-occurrence Scale Space and Spatial Confined RANSAC Guided Segmented Affine Transformation
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2024-11-14 · DOI: 10.1109/tip.2024.3494555
Indranil Misra, Mukesh Kumar Rohil, S. Manthira Moorthi, Debajyoti Dhar
Citations: 0

Nonconvex Robust High-Order Tensor Completion Using Randomized Low-Rank Approximation
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2024-04-10 · DOI: 10.1109/tip.2024.3385284
Wenjin Qin, Hailin Wang, Feng Zhang, Weijun Ma, Jianjun Wang, Tingwen Huang
Citations: 0

Towards Transparent Deep Image Aesthetics Assessment with Tag-based Content Descriptors
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2023-08-30 · DOI: 10.1109/TIP.2023.3308852
Jingwen Hou, Weisi Lin, Yuming Fang, Haoning Wu, Chaofeng Chen, Liang Liao, Weide Liu
Abstract: Deep learning approaches for Image Aesthetics Assessment (IAA) have shown promising results in recent years, but the internal mechanisms of these models remain unclear. Previous studies have demonstrated that image aesthetics can be predicted using semantic features, such as pre-trained object classification features. However, these semantic features are learned implicitly, and therefore, previous works have not elucidated what the semantic features are representing. In this work, we aim to create a more transparent deep learning framework for IAA by introducing explainable semantic features. To achieve this, we propose Tag-based Content Descriptors (TCDs), where each value in a TCD describes the relevance of an image to a human-readable tag that refers to a specific type of image content. This allows us to build IAA models from explicit descriptions of image contents. We first propose the explicit matching process to produce TCDs that adopt predefined tags to describe image contents. We show that a simple MLP-based IAA model with TCDs only based on predefined tags can achieve an SRCC of 0.767, which is comparable to most state-of-the-art methods. However, predefined tags may not be sufficient to describe all possible image contents that the model may encounter. Therefore, we further propose the implicit matching process to describe image contents that cannot be described by predefined tags. By integrating components obtained from the implicit matching process into TCDs, the IAA model further achieves an SRCC of 0.817, which significantly outperforms existing IAA methods. Both the explicit matching process and the implicit matching process are realized by the proposed TCD generator. To evaluate the performance of the proposed TCD generator in matching images with predefined tags, we also labeled 5101 images with photography-related tags to form a validation set. And experimental results show that the proposed TCD generator can meaningfully assign photography-related tags to images.
Citations: 0

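The abstract reports that a simple MLP over TCD vectors already reaches an SRCC of 0.767. Below is a minimal PyTorch sketch of that shape: a tag-relevance vector in, a scalar aesthetics score out. The tag count and layer widths are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of an MLP-based IAA model over a tag-relevance
# vector (TCD). NUM_TAGS and the layer sizes are hypothetical.
import torch
import torch.nn as nn

NUM_TAGS = 512  # assumed number of human-readable content tags

class TCDRegressor(nn.Module):
    def __init__(self, num_tags: int = NUM_TAGS):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_tags, 256), nn.ReLU(),
            nn.Linear(256, 64), nn.ReLU(),
            nn.Linear(64, 1),  # predicted aesthetics score
        )

    def forward(self, tcd: torch.Tensor) -> torch.Tensor:
        # tcd: (batch, num_tags) tag-relevance scores in [0, 1]
        return self.mlp(tcd).squeeze(-1)

model = TCDRegressor()
scores = model(torch.rand(8, NUM_TAGS))  # dummy batch of TCDs
print(scores.shape)  # torch.Size([8])
```
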
Field-of-View IoU for Object Detection in 360° Images
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2023-07-21 · DOI: 10.1109/TIP.2023.3296013
Miao Cao, Satoshi Ikehata, Kiyoharu Aizawa
Abstract: 360° cameras have gained popularity over the last few years. In this paper, we propose two fundamental techniques—Field-of-View IoU (FoV-IoU) and 360Augmentation for object detection in 360° images. Although most object detection neural networks designed for perspective images are applicable to 360° images in equirectangular projection (ERP) format, their performance deteriorates owing to the distortion in ERP images. Our method can be readily integrated with existing perspective object detectors and significantly improves the performance. The FoV-IoU computes the intersection-over-union of two Field-of-View bounding boxes in a spherical image which could be used for training, inference, and evaluation while 360Augmentation is a data augmentation technique specific to 360° object detection task which randomly rotates a spherical image and solves the bias due to the sphere-to-plane projection. We conduct extensive experiments on the 360° indoor dataset with different types of perspective object detectors and show the consistent effectiveness of our method.
Citations: 0

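As a rough illustration of the idea, here is a numpy sketch of an IoU between two field-of-view boxes parameterized by center longitude/latitude and angular width/height, where the longitude difference is scaled by the cosine of the mean latitude to approximate on-sphere distance. This is a simplified approximation, not the paper's exact formulation.

```python
# Simplified FoV-IoU sketch: angular rectangles on the sphere,
# intersected in a locally planar coordinate frame.
import numpy as np

def fov_iou(b1, b2):
    """Each box is (theta, phi, alpha, beta): center longitude/latitude
    and FoV width/height, all in radians."""
    t1, p1, a1, be1 = b1
    t2, p2, a2, be2 = b2
    cos_mid = np.cos((p1 + p2) / 2.0)  # scale longitudes by latitude
    dx = (t2 - t1) * cos_mid
    dy = p2 - p1
    # Planar intersection of the two angular rectangles.
    inter_w = max(0.0, (a1 + a2) / 2.0 - abs(dx))
    inter_h = max(0.0, (be1 + be2) / 2.0 - abs(dy))
    inter = inter_w * inter_h
    union = a1 * be1 + a2 * be2 - inter
    return inter / union if union > 0 else 0.0

box_a = (0.0, 0.0, np.radians(60), np.radians(40))
box_b = (np.radians(10), np.radians(5), np.radians(60), np.radians(40))
print(f"FoV-IoU ~ {fov_iou(box_a, box_b):.3f}")
```
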
TGFuse: An Infrared and Visible Image Fusion Approach Based on Transformer and Generative Adversarial Network
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2023-05-10 · DOI: 10.1109/TIP.2023.3273451
Dongyu Rao, Tianyang Xu, Xiao-Jun Wu
Abstract: The end-to-end image fusion framework has achieved promising performance, with dedicated convolutional networks aggregating the multi-modal local appearance. However, long-range dependencies are directly neglected in existing CNN fusion approaches, impeding balancing the entire image-level perception for complex scenario fusion. In this paper, therefore, we propose an infrared and visible image fusion algorithm based on the transformer module and adversarial learning. Inspired by the global interaction power, we use the transformer technique to learn the effective global fusion relations. In particular, shallow features extracted by CNN are interacted in the proposed transformer fusion module to refine the fusion relationship within the spatial scope and across channels simultaneously. Besides, adversarial learning is designed in the training process to improve the output discrimination via imposing competitive consistency from the inputs, reflecting the specific characteristics in infrared and visible images. The experimental performance demonstrates the effectiveness of the proposed modules, with superior improvement against the state-of-the-art, generalising a novel paradigm via transformer and adversarial learning in the fusion task.
Citations: 0

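A minimal PyTorch sketch of the general pattern the abstract describes: shallow CNN features from the infrared and visible inputs interact through a transformer block before a fused image is reconstructed. The channel counts and single-layer transformer are illustrative assumptions, and the adversarial training loop is omitted.

```python
# Toy transformer-fusion pattern: shallow conv features -> tokens ->
# self-attention for global interaction -> fused image.
import torch
import torch.nn as nn

class ToyTransformerFusion(nn.Module):
    def __init__(self, dim: int = 32):
        super().__init__()
        self.enc_ir = nn.Conv2d(1, dim, 3, padding=1)   # shallow IR features
        self.enc_vis = nn.Conv2d(1, dim, 3, padding=1)  # shallow visible features
        self.attn = nn.TransformerEncoderLayer(
            d_model=dim, nhead=4, batch_first=True)
        self.dec = nn.Conv2d(dim, 1, 3, padding=1)      # fused image

    def forward(self, ir, vis):
        f = self.enc_ir(ir) + self.enc_vis(vis)         # (B, C, H, W)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)           # (B, H*W, C)
        tokens = self.attn(tokens)                      # global interaction
        f = tokens.transpose(1, 2).reshape(b, c, h, w)
        return torch.sigmoid(self.dec(f))

fused = ToyTransformerFusion()(torch.rand(1, 1, 32, 32),
                               torch.rand(1, 1, 32, 32))
print(fused.shape)  # torch.Size([1, 1, 32, 32])
```
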
USOD10K: A New Benchmark Dataset for Underwater Salient Object Detection
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2023-04-14 · DOI: 10.1109/TIP.2023.3266163
Lin Hong, Xin Wang, Gan Zhang, Ming Zhao
Abstract: Underwater salient object detection (USOD) attracts increasing interest for its promising performance in various underwater visual tasks. However, USOD research is still in its early stages due to the lack of large-scale datasets within which salient objects are well-defined and pixel-wise annotated. To address this issue, this paper introduces a new dataset named USOD10K. It consists of 10,255 underwater images, covering 70 categories of salient objects in 12 different underwater scenes. In addition, salient object boundaries and depth maps of all images are provided in this dataset. The USOD10K is the first large-scale dataset in the USOD community, making a significant leap in diversity, complexity, and scalability. Secondly, a simple but strong baseline termed TC-USOD is designed for the USOD10K. The TC-USOD adopts a hybrid architecture based on an encoder-decoder design that leverages transformer and convolution as the basic computational building block of the encoder and decoder, respectively. Thirdly, we make a comprehensive summarization of 35 cutting-edge SOD/USOD methods and benchmark them over the existing USOD dataset and the USOD10K. The results show that our TC-USOD obtained superior performance on all datasets tested. Finally, several other use cases of the USOD10K are discussed, and future directions of USOD research are pointed out. This work will promote the development of the USOD research and facilitate further research on underwater visual tasks and visually-guided underwater robots. To pave the road in this research field, all the dataset, code, and benchmark results are publicly available: https://github.com/LinHong-HIT/USOD10K.
Citations: 0

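To make the hybrid design concrete, here is a minimal PyTorch sketch of the shape the abstract ascribes to TC-USOD: a transformer-based encoder and a convolutional decoder producing a saliency map. All dimensions and the patch embedding are illustrative assumptions; the real TC-USOD is far deeper (see the linked repository).

```python
# Toy hybrid encoder-decoder: transformer encoder over patch tokens,
# convolutional decoder back to a full-resolution saliency map.
import torch
import torch.nn as nn

class ToyTCUSOD(nn.Module):
    def __init__(self, dim: int = 64, patch: int = 8):
        super().__init__()
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                       batch_first=True),
            num_layers=2)
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=patch, mode="bilinear",
                        align_corners=False),
            nn.Conv2d(dim, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        f = self.embed(x)                       # (B, C, H/p, W/p)
        b, c, h, w = f.shape
        t = self.encoder(f.flatten(2).transpose(1, 2))
        f = t.transpose(1, 2).reshape(b, c, h, w)
        return self.decoder(f)                  # (B, 1, H, W)

print(ToyTCUSOD()(torch.rand(1, 3, 64, 64)).shape)  # (1, 1, 64, 64)
```
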
DVMark: A Deep Multiscale Framework for Video Watermarking
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2023-03-28 · DOI: 10.1109/TIP.2023.3251737
Xiyang Luo, Yinxiao Li, Huiwen Chang, Ce Liu, Peyman Milanfar, Feng Yang
Abstract: Video watermarking embeds a message into a cover video in an imperceptible manner, which can be retrieved even if the video undergoes certain modifications or distortions. Traditional watermarking methods are often manually designed for particular types of distortions and thus cannot simultaneously handle a broad spectrum of distortions. To this end, we propose a robust deep learning-based solution for video watermarking that is end-to-end trainable. Our model consists of a novel multiscale design where the watermarks are distributed across multiple spatial-temporal scales. Extensive evaluations on a wide variety of distortions show that our method outperforms traditional video watermarking methods as well as deep image watermarking models by a large margin. We further demonstrate the practicality of our method on a realistic video-editing application.
Citations: 0

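An illustrative numpy sketch of the multiscale embedding idea: the same payload is redundantly written into a frame at several spatial scales, so a distortion that destroys one scale can leave another readable. DVMark learns the embedding and extraction end-to-end with neural networks; the fixed pseudo-random carriers, the embedding strength, and the correlation decoder below are stand-in assumptions.

```python
# Fixed-carrier stand-in for learned multiscale watermarking:
# embed a payload at scales 8/16/32, decode by correlation.
import numpy as np

rng = np.random.default_rng(0)
H, W, BITS, STRENGTH = 64, 64, 8, 0.02
message = rng.integers(0, 2, BITS)                      # 8-bit payload
# One pseudo-random carrier per bit and per spatial scale.
carriers = {s: rng.standard_normal((BITS, s, s)) for s in (8, 16, 32)}

def tile(pat, scale):
    return np.tile(pat, (H // scale, W // scale))

def embed(frame):
    out = frame.copy()
    for scale, pats in carriers.items():
        for bit, pat in zip(message, pats):
            out += STRENGTH * (2 * bit - 1) * tile(pat, scale)
    return np.clip(out, 0.0, 1.0)

def decode(frame, scale=32):
    f = frame - frame.mean()                            # remove DC offset
    bits = [np.sum(f * tile(p, scale)) > 0 for p in carriers[scale]]
    return np.array(bits, dtype=int)

frame = rng.random((H, W))
marked = embed(frame)
# On a clean (undistorted) frame the payload is typically recoverable:
print("mean |residual|:", np.abs(marked - frame).mean().round(4))
print("bit accuracy   :", (decode(marked) == message).mean())
```
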
Radiometric Compensation of Images Projected on Non-White Surfaces by Exploiting Chromatic Adaptation and Perceptual Anchoring
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2017-01-01 (Epub 2016-07-18) · DOI: 10.1109/TIP.2016.2592799
Tai-Hsiang Huang, Ting-Chun Wang, Homer H Chen
Abstract: Flat surfaces in our living environment to be used as replacements of a projection screen are not necessarily white. We propose a perceptual radiometric compensation method to counteract the effect of color projection surfaces on image appearance. It reduces color clipping while preserving the hue and brightness of images based on the anchoring property of human visual system. In addition, it considers the effect of chromatic adaptation on perceptual image quality and fixes the color distortion caused by non-white projection surfaces by properly shifting the color of the image pixels toward the complementary color of the projection surface. User ratings show that our method outperforms the existing methods in 974 out of 1020 subjective tests.
Citations: 7

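For context, here is a minimal numpy sketch of the classic per-channel compensation baseline this paper improves on: to make a surface with reflectance S show target image I, project I / S and clip to the projector's range. The clipping this naive division causes is exactly what the paper's perceptual method (anchoring, chromatic adaptation, complementary-color shifting) is designed to reduce; that method is not reproduced here, and the surface color below is hypothetical.

```python
# Naive radiometric compensation: divide out the surface tint,
# then observe how much is lost to projector-gamut clipping.
import numpy as np

def naive_compensation(target, surface):
    """target: (H, W, 3) desired image in [0, 1];
    surface: (3,) per-channel reflectance of the projection surface."""
    projector = target / np.maximum(surface, 1e-6)   # invert surface tint
    clipped = np.clip(projector, 0.0, 1.0)           # projector gamut limit
    perceived = clipped * surface                    # what the viewer sees
    return clipped, perceived

rng = np.random.default_rng(1)
img = rng.random((4, 4, 3))
beige_wall = np.array([0.9, 0.8, 0.6])               # hypothetical surface
proj, seen = naive_compensation(img, beige_wall)
# Pixels brighter than the surface allows are clipped and lose hue:
print("clipped fraction:", np.mean(proj == 1.0).round(3))
print("max error       :", np.abs(seen - img).max().round(3))
```
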
Incorporating Spatial Information and Endmember Variability Into Unmixing Analyses to Improve Abundance Estimates
IF 10.6 · CAS Tier 1 · Computer Science
IEEE Transactions on Image Processing · Pub Date: 2016-12-01 (Epub 2016-08-18) · DOI: 10.1109/TIP.2016.2601269
Tatsumi Uezato, Richard J Murphy, Arman Melkumyan, Anna Chlingaryan
Abstract: Incorporating endmember variability and spatial information into spectral unmixing analyses is important for producing accurate abundance estimates. However, most methods do not incorporate endmember variability with spatial regularization. This paper proposes a novel 2-step unmixing approach, which incorporates endmember variability and spatial information. In step 1, a probability distribution representing abundances is estimated by spectral unmixing within a multi-task Gaussian process framework (SUGP). In step 2, spatial information is incorporated into the probability distribution derived by SUGP through an a priori distribution derived from a Markov random field (MRF). The proposed method (SUGP-MRF) is different to the existing unmixing methods because it incorporates endmember variability and spatial information at separate steps in the analysis and automatically estimates parameters controlling the balance between the data fit and spatial smoothness. The performance of SUGP-MRF is compared with the existing unmixing methods using synthetic imagery with precisely known abundances and real hyperspectral imagery of rock samples. Results show that SUGP-MRF outperforms the existing methods and improves the accuracy of abundance estimates by incorporating spatial information.
Citations: 20

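An illustrative numpy sketch of what step 2 contributes: spatially regularizing per-pixel abundance estimates with an MRF-style smoothness prior. Step 1 (the Gaussian-process unmixing, SUGP) is replaced here by noisy input abundances, and a simple quadratic prior solved by fixed-point iteration stands in for the paper's MRF; `lam` balances data fit against smoothness, a trade-off the paper estimates automatically.

```python
# MRF-style smoothing of abundance maps: each iteration pulls every
# pixel toward its 4-neighbour mean while staying close to the data.
import numpy as np

def mrf_smooth(abund, lam=1.0, iters=50):
    """abund: (H, W, M) abundance maps, one channel per endmember."""
    a = abund.copy()
    for _ in range(iters):
        pad = np.pad(a, ((1, 1), (1, 1), (0, 0)), mode="edge")
        nbr = (pad[:-2, 1:-1] + pad[2:, 1:-1] +
               pad[1:-1, :-2] + pad[1:-1, 2:]) / 4.0
        a = (abund + lam * nbr) / (1.0 + lam)     # data fit vs smoothness
        a = np.clip(a, 0.0, None)
        a /= a.sum(axis=2, keepdims=True)         # abundances sum to one
    return a

rng = np.random.default_rng(2)
noisy = rng.dirichlet([1.0] * 3, size=(32, 32))   # 3 endmembers
smooth = mrf_smooth(noisy, lam=2.0)
print(smooth.shape, smooth.sum(axis=2).round(3).min())
```
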