IEEE Transactions on Pattern Analysis and Machine Intelligence: Latest Articles

A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2024-11-14. DOI: 10.1109/tpami.2024.3498346
Zhenyi Wang, Enneng Yang, Li Shen, Heng Huang
Citations: 0
DiffI2I: Efficient Diffusion Model for Image-to-Image Translation
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2024-11-14. DOI: 10.1109/tpami.2024.3498003
Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Radu Timofte, Luc Van Gool
Citations: 0
PATNAS: A Path-Based Training-Free Neural Architecture Search
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2024-11-14. DOI: 10.1109/tpami.2024.3498035
Jiechao Yang, Yong Liu, Wei Wang, Haoran Wu, Zhiyuan Chen, Xibo Ma
Citations: 0
Learning to Solve Hard Minimal Problems
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2023-08-23. DOI: 10.1109/TPAMI.2023.3307898
Petr Hruby, Timothy Duff, Anton Leykin, Tomas Pajdla
Abstract: We present an approach to solving hard geometric optimization problems in the RANSAC framework. The hard minimal problems arise from relaxing the original geometric optimization problem into a minimal problem with many spurious solutions. Our approach avoids computing large numbers of spurious solutions. We design a learning strategy for selecting a starting problem-solution pair that can be numerically continued to the problem and the solution of interest. We demonstrate our approach by developing a RANSAC solver for the problem of computing the relative pose of three calibrated cameras, via a minimal relaxation using four points in each view. On average, we can solve a single problem in under 70 μs. We also benchmark and study our engineering choices on the very familiar problem of computing the relative pose of two calibrated cameras, via the minimal case of five points in two views.
Citations: 0
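The abstract above places its solver inside the RANSAC framework. As a rough illustration of that framework only (not the authors' learned homotopy-continuation solver), the following sketch runs RANSAC with about the simplest minimal problem available, a line through two points. The function name `ransac_line`, the threshold, and the iteration count are assumptions chosen for this example.

```python
import random

def ransac_line(points, iterations=200, threshold=0.05, seed=0):
    """Fit y = a*x + b by RANSAC using 2-point minimal samples.

    Illustrative only: the minimal solver here is a closed-form line
    through two points, not a camera relative-pose solver.
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(iterations):
        # Draw a minimal sample: two points determine a line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # degenerate sample: a vertical line has no slope
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Score the hypothesis by counting points with a small vertical residual.
        inliers = sum(1 for x, y in points if abs(y - (a * x + b)) < threshold)
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers
```

Calling `ransac_line(points)` on a list of `(x, y)` tuples returns the best `(slope, intercept)` pair found and its inlier count. Each hypothesis costs one minimal solve, which is why the per-problem solve time quoted in the abstract matters inside such a loop.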
CoIR: Compressive Implicit Radar
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2023-08-10. DOI: 10.1109/TPAMI.2023.3301553
Sean M Farrell, Vivek Boominathan, Nathaniel Raymondi, Ashutosh Sabharwal, Ashok Veeraraghavan
Abstract: Using millimeter wave (mmWave) signals for imaging has an important advantage in that they can penetrate through poor environmental conditions such as fog, dust, and smoke that severely degrade optical-based imaging systems. However, mmWave radars, contrary to cameras and LiDARs, suffer from low angular resolution because of small physical apertures and conventional signal processing techniques. Sparse radar imaging, on the other hand, can increase the aperture size while minimizing the power consumption and readout bandwidth. This paper presents CoIR, an analysis-by-synthesis method that leverages the implicit neural network bias in convolutional decoders and compressed sensing to perform high-accuracy sparse radar imaging. The proposed system is dataset-agnostic and does not require any auxiliary sensors for training or testing. We introduce a sparse array design that allows for a 5.5× reduction in the number of antenna elements needed compared to conventional MIMO array designs. We demonstrate our system's improved imaging performance over standard mmWave radars and other competitive untrained methods on both simulated and experimental mmWave radar data.
Citations: 0
Transformer-Empowered Invariant Grounding for Video Question Answering
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2023-08-09. DOI: 10.1109/TPAMI.2023.3303451
Yicong Li, Xiang Wang, Junbin Xiao, Wei Ji, Tat-Seng Chua
Abstract: Video Question Answering (VideoQA) is the task of answering questions about a video. At its core is the understanding of the alignments between video scenes and question semantics to yield the answer. In leading VideoQA models, the typical learning objective, empirical risk minimization (ERM), tends to over-exploit the spurious correlations between question-irrelevant scenes and answers, instead of inspecting the causal effect of question-critical scenes, which undermines the prediction with unreliable reasoning. In this work, we take a causal look at VideoQA and propose a modal-agnostic learning framework, named Invariant Grounding for VideoQA (IGV), to ground the question-critical scene, whose causal relations with answers are invariant across different interventions on the complement. With IGV, leading VideoQA models are forced to shield the answering from the negative influence of spurious correlations, which significantly improves their reasoning ability. To unleash the potential of this framework, we further provide a Transformer-Empowered Invariant Grounding for VideoQA (TIGV), a substantial instantiation of the IGV framework that naturally integrates the idea of invariant grounding into a transformer-style backbone. Experiments on four benchmark datasets validate our design in terms of accuracy, visual explainability, and generalization ability over the leading baselines. Our code is available at https://github.com/yl3800/TIGV.
Citations: 0
Count-Free Single-Photon 3D Imaging with Race Logic
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2023-08-07. DOI: 10.1109/TPAMI.2023.3302822
Atul Ingle, David Maier
Abstract: Single-photon cameras (SPCs) have emerged as a promising new technology for high-resolution 3D imaging. A single-photon 3D camera determines the round-trip time of a laser pulse by precisely capturing the arrival of individual photons at each camera pixel. Constructing photon-timestamp histograms is a fundamental operation for a single-photon 3D camera. However, in-pixel histogram processing is computationally expensive and requires a large amount of memory per pixel. Digitizing and transferring photon timestamps to an off-sensor histogramming module is bandwidth- and power-hungry. Can we estimate distances without explicitly storing photon counts? Yes: here we present an online approach for distance estimation suitable for resource-constrained settings with limited bandwidth, memory, and compute. The two key ingredients of our approach are (a) processing photon streams using race logic, which maintains photon data in the time-delay domain, and (b) constructing count-free equi-depth histograms as opposed to conventional equi-width histograms. Equi-depth histograms are a more succinct representation for "peaky" distributions, such as those obtained by an SPC pixel from a laser pulse reflected by a surface. Our approach uses a binner element that converges on the median (or, more generally, on another k-quantile) of a distribution. We cascade multiple binners to form an equi-depth histogrammer that produces multi-bin histograms. Our evaluation shows that this method can provide at least an order of magnitude reduction in bandwidth and power consumption while maintaining distance reconstruction accuracy similar to that of conventional histogram-based processing methods.
Citations: 0
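The count-free idea in the abstract above can be illustrated in software. The sketch below is a hypothetical stand-in, not the authors' race-logic circuit: `track_quantile` nudges a running estimate up or down by a fixed step for each timestamp without storing any counts, and `equi_depth_edges` runs one independent tracker per bin boundary (the paper instead cascades binner elements). The function names, the step size, and the simulated Gaussian pulse are all assumptions made for this example.

```python
import random

def track_quantile(timestamps, q=0.5, step=0.05):
    """Estimate the q-quantile of a stream without storing counts.

    Each arrival later than the estimate pushes it up by step*q; each
    earlier arrival pushes it down by step*(1-q). The pushes balance
    when a fraction q of arrivals fall below the estimate.
    """
    stream = iter(timestamps)
    estimate = next(stream)  # start from the first arrival
    for t in stream:
        if t > estimate:
            estimate += step * q
        elif t < estimate:
            estimate -= step * (1.0 - q)
    return estimate

def equi_depth_edges(timestamps, bins=4, step=0.05):
    """Return the bins-1 interior boundaries of an equi-depth histogram."""
    data = list(timestamps)
    return [track_quantile(data, q=k / bins, step=step) for k in range(1, bins)]

if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated photon arrival times: a "peaky" return centered at 50 (arbitrary units).
    arrivals = [rng.gauss(50.0, 2.0) for _ in range(20000)]
    print("median estimate:", track_quantile(arrivals))
    print("quartile edges:", equi_depth_edges(arrivals))
```

With a fixed step the estimate keeps jittering around the true quantile rather than settling exactly; a smaller `step` trades slower convergence for less jitter. The state per boundary is a single number, which is the property that makes the equi-depth representation attractive when memory per pixel is scarce.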
Aberration-Aware Depth-From-Focus
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2023-08-04. DOI: 10.1109/TPAMI.2023.3301931
Xinge Yang, Qiang Fu, Mohamed Elhoseiny, Wolfgang Heidrich
Abstract: Computer vision methods for depth estimation usually use simple camera models with idealized optics. For modern machine learning approaches, this creates an issue when attempting to train deep networks with simulated data, especially for focus-sensitive tasks like Depth-from-Focus. In this work, we investigate the domain gap caused by off-axis aberrations that will affect the decision of the best-focused frame in a focal stack. We then explore bridging this domain gap through aberration-aware training (AAT). Our approach involves a lightweight network that models lens aberrations at different positions and focus distances, which is then integrated into the conventional network training pipeline. We evaluate the generality of network models on both synthetic and real-world data. The experimental results demonstrate that the proposed AAT scheme can improve depth estimation accuracy without fine-tuning the model for different datasets. The code will be available at github.com/vccimaging/Aberration-Aware-Depth-from-Focus.
Citations: 0
Isolating Signals in Passive Non-Line-of-Sight Imaging using Spectral Content
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2023-08-02. DOI: 10.1109/TPAMI.2023.3301336
Connor Hashemi, Rafael Avelar, James Leger
Abstract: In real-life passive non-line-of-sight (NLOS) imaging there is an overwhelming amount of undesired scattered radiance, called clutter, that impedes reconstruction of the desired NLOS scene. This paper explores using the spectral domain of the scattered light field to separate the desired scattered radiance from the clutter. We propose two techniques. The first separates the multispectral scattered radiance into a collection of objects, each with its own uniform color. The objects that correspond to clutter can then be identified and removed based on how well they can be reconstructed using NLOS imaging algorithms. This technique requires very few priors and uses off-the-shelf algorithms. For the second technique, we derive and solve a convex optimization problem assuming we know the desired signal's spectral content. This method is quicker and can be performed with fewer spectral measurements. We demonstrate both techniques using realistic scenarios. In the presence of clutter that is 50 times stronger than the desired signal, the proposed reconstruction of the NLOS scene is 23 times more accurate than typical reconstructions and 5 times more accurate than using the leading clutter rejection method.
Citations: 0
Supervision by Denoising
IF 23.6, Zone 1, Computer Science
IEEE Transactions on Pattern Analysis and Machine Intelligence. Pub Date: 2023-07-28. DOI: 10.1109/TPAMI.2023.3299789
Sean I Young, Adrian V Dalca, Enzo Ferrante, Polina Golland, Christopher A Metzler, Bruce Fischl, Juan Eugenio Iglesias
Abstract: Learning-based image reconstruction models, such as those based on the U-Net, require a large set of labeled images if good generalization is to be guaranteed. In some imaging domains, however, labeled data with pixel- or voxel-level label accuracy are scarce due to the cost of acquiring them. This problem is exacerbated further in domains like medical imaging, where there is no single ground truth label, resulting in large amounts of repeat variability in the labels. Therefore, training reconstruction networks to generalize better by learning from both labeled and unlabeled examples (called semi-supervised learning) is a problem of practical and theoretical interest. However, traditional semi-supervised learning methods for image reconstruction often necessitate handcrafting a differentiable regularizer specific to some given imaging problem, which can be extremely time-consuming. In this work, we propose "supervision by denoising" (SUD), a framework to supervise reconstruction models using their own denoised output as labels. SUD unifies stochastic averaging and spatial denoising techniques under a spatio-temporal denoising framework and alternates denoising and model weight update steps in an optimization framework for semi-supervision. As example applications, we apply SUD to two problems from biomedical imaging, anatomical brain reconstruction (3D) and cortical parcellation (2D), to demonstrate a significant improvement in reconstruction over supervised-only and ensembling baselines. Our code is available at https://github.com/seannz/sud.
Citations: 0