IEEE Transactions on Pattern Analysis and Machine Intelligence: Latest Articles

HOZ++: Versatile Hierarchical Object-to-Zone Graph for Object Navigation
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-19 DOI: 10.1109/TPAMI.2025.3552987
Sixian Zhang;Xinhang Song;Xinyao Yu;Yubing Bai;Xinlong Guo;Weijie Li;Shuqiang Jiang
Abstract: The goal of object navigation task is to reach the expected objects using visual information in unseen environments. Previous works typically implement deep models as agents that are trained to predict actions based on visual observations. Despite extensive training, agents often fail to make wise decisions when navigating in unseen environments toward invisible targets. In contrast, humans demonstrate a remarkable talent to navigate toward targets even in unseen environments. This superior capability is attributed to the cognitive map in the hippocampus, which enables humans to recall past experiences in similar situations and anticipate future occurrences during navigation. It is also dynamically updated with new observations from unseen environments. The cognitive map equips humans with a wealth of prior knowledge, significantly enhancing their navigation capabilities. Inspired by human navigation mechanisms, we propose the Hierarchical Object-to-Zone (HOZ++) graph, which encapsulates the regularities among objects, zones, and scenes. The HOZ++ graph helps the agent to identify the current zone and the target zone, and computes an optimal path between them, then selects the next zone along the path as the guidance for the agent. Moreover, the HOZ++ graph continuously updates based on real-time observations in new environments, thereby enhancing its adaptability to new environments. Our HOZ++ graph is versatile and can be integrated into existing methods, including end-to-end RL and modular methods. Our method is evaluated across four simulators, including AI2-THOR, RoboTHOR, Gibson, and Matterport 3D. Additionally, we build a realistic environment to evaluate our method in the real world. Experimental results demonstrate the effectiveness and efficiency of our proposed method.
Vol. 47, No. 7, pp. 5958-5975
Citations: 0
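The HOZ++ abstract above describes locating the current and target zones, computing a path between them, and handing the next zone on that path to the agent. The sketch below illustrates only that last step under assumptions that are not in the abstract: the zone graph is a plain weighted adjacency dict, the zone names and edge costs are invented for illustration, and Dijkstra's algorithm stands in for whatever path computation the authors actually use.

```python
import heapq


def next_zone(graph, current, target):
    """Return the next zone on a cheapest path from current to target.

    graph is a hypothetical adjacency dict: {zone: {neighbour: cost}}.
    Returns current if it already equals target, or None if unreachable.
    """
    if current == target:
        return current
    dist = {current: 0.0}
    prev = {}
    heap = [(0.0, current)]
    while heap:
        d, zone = heapq.heappop(heap)
        if d > dist.get(zone, float("inf")):
            continue  # stale heap entry
        if zone == target:
            break
        for nbr, cost in graph.get(zone, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = zone
                heapq.heappush(heap, (nd, nbr))
    if target not in prev:
        return None
    # Walk back from the target until we reach the zone adjacent to current.
    node = target
    while prev[node] != current:
        node = prev[node]
    return node


# Illustrative zone graph; names and costs are made up.
zone_graph = {
    "sofa_zone": {"table_zone": 1.0, "door_zone": 4.0},
    "table_zone": {"sofa_zone": 1.0, "sink_zone": 1.5},
    "door_zone": {"sofa_zone": 4.0, "sink_zone": 1.0},
    "sink_zone": {"table_zone": 1.5, "door_zone": 1.0},
}
guidance = next_zone(zone_graph, "sofa_zone", "sink_zone")
```

In this toy graph the cheapest route from the sofa zone to the sink zone passes through the table zone, so that zone is returned as guidance.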
Spectrum-Enhanced Graph Attention Network for Garment Mesh Deformation
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-19 DOI: 10.1109/TPAMI.2025.3570523
Tianxing Li;Rui Shi;Qing Zhu;Liguo Zhang;Takashi Kanai
Abstract: We present a novel solution for mesh-based deformation simulation from a spectral perspective. Unlike existing approaches that demand separate training for each garment or body type and often struggle to produce rich folds and lifelike dynamics, our method achieves the quality of physics-based simulations while maintaining superior efficiency within a unified model. The key to achieve this lies in the development of a spectrum-enhanced deformation network, a result of in-depth theoretical analysis bridging neural networks and garment deformations. This enhancement compels the network to focus on learning spectral information predominantly within the frequency band associated with intricate deformations. Furthermore, building upon standard blend skinning techniques, we introduce target-aware temporal skinning weights. The weights describe how the underlying human skeleton dynamically affects the mesh vertices according to the garment and body shape, as well as the motion state. We validate our method on various garments, bodies, and motions through extensive ablation studies. Finally, we conduct comparisons to confirm its superiority in generalization, deformation quality, and performance over several state-of-the-art methods.
Vol. 47, No. 8, pp. 7153-7170
Citations: 0
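The garment deformation abstract above says the method builds on standard blend skinning. As background, here is a minimal sketch of plain linear blend skinning, where each vertex is moved by a weighted sum of bone transforms. The function name, array shapes, and example values are assumptions for illustration. The weights here are fixed inputs, whereas the paper's target-aware temporal weights are described as depending on garment, body shape, and motion state, which this sketch does not model.

```python
import numpy as np


def linear_blend_skinning(vertices, weights, transforms):
    """Standard linear blend skinning.

    vertices:   (V, 3) rest-pose positions
    weights:    (V, B) per-vertex bone weights, each row summing to 1
    transforms: (B, 4, 4) homogeneous bone transforms
    """
    ones = np.ones((len(vertices), 1))
    homo = np.concatenate([vertices, ones], axis=1)         # (V, 4)
    per_bone = np.einsum("bij,vj->vbi", transforms, homo)   # (V, B, 4)
    blended = np.einsum("vb,vbi->vi", weights, per_bone)    # (V, 4)
    return blended[:, :3]


# Two bones: identity, and a translation of 2 units along x.
shift = np.eye(4)
shift[0, 3] = 2.0
bones = np.stack([np.eye(4), shift])

rest = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
w = np.array([[1.0, 0.0], [0.5, 0.5]])
posed = linear_blend_skinning(rest, w, bones)
```

The first vertex follows only the identity bone and stays in place; the second is split evenly between both bones and moves half of the translation.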
CLRNetV2: A Faster and Stronger Lane Detector
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3551935
Tu Zheng;Yifei Huang;Yang Liu;Binbin Lin;Zheng Yang;Deng Cai;Xiaofei He
Abstract: Lane is critical in the vision navigation system of intelligent vehicles. Naturally, the lane is a traffic sign with high-level semantics, whereas it owns the specific local pattern which needs detailed low-level features to localize accurately. Using different feature levels is of great importance for accurate lane detection, but it is still under-explored. On the other hand, current lane detection methods still struggle to detect complex dense lanes, such as Y-shape or fork-shape. In this work, we present Cross Layer Refinement Network aiming at fully utilizing both high-level and low-level features in lane detection. In particular, it first detects lanes with high-level semantic features and then performs refinement based on low-level features. In this way, we can exploit more contextual information to detect lanes while leveraging local-detailed features to improve localization accuracy. We present Fast-ROIGather to gather global context, which further enhances the representation of lane features. To detect dense lanes accurately, we propose Correlation Discrimination Module (CDM) to discriminate the correlation of dense lanes, enabling nearly cost-free high-quality dense lane prediction. In addition to our novel network design, we introduce LineIoU loss which regresses lanes as a whole unit to improve localization accuracy. Experiments demonstrate our approach significantly outperforms the state-of-the-art lane detection methods.
Vol. 47, No. 6, pp. 4271-4284
Citations: 0
Systematic Bias of Machine Learning Regression Models and Correction
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3552368
Hwiyoung Lee;Shuo Chen
Abstract: Machine learning models for continuous outcomes often yield systematically biased predictions, particularly for values that largely deviate from the mean. Specifically, predictions for large-valued outcomes tend to be negatively biased (underestimating actual values), while those for small-valued outcomes are positively biased (overestimating actual values). We refer to this linear central tendency warped bias as the “systematic bias of machine learning regression”. In this paper, we first demonstrate that this systematic prediction bias persists across various machine learning regression models, and then delve into its theoretical underpinnings. To address this issue, we propose a general constrained optimization approach designed to correct this bias and develop computationally efficient implementation algorithms. Simulation results indicate that our correction method effectively eliminates the bias from the predicted outcomes. We apply the proposed approach to the prediction of brain age using neuroimaging data. In comparison to competing machine learning regression models, our method effectively addresses the longstanding issue of “systematic bias of machine learning regression” in neuroimaging-based brain age calculation, yielding unbiased predictions of brain age.
Vol. 47, No. 6, pp. 4974-4983
Citations: 0
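The bias described in the abstract above (large outcomes underestimated, small outcomes overestimated) can be reproduced with a small simulation. The sketch below only illustrates the phenomenon and does not implement the paper's constrained-optimization correction. The data-generating model, sample sizes, noise level, and the choice of ordinary least squares are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: linear signal plus noise (illustrative settings).
n, p = 2000, 5
X = rng.normal(size=(n, p))
y = X @ np.ones(p) + rng.normal(scale=2.0, size=n)

X_train, y_train = X[:1000], y[:1000]
X_test, y_test = X[1000:], y[1000:]

# Ordinary least squares with an intercept column.
A_train = np.column_stack([np.ones(len(X_train)), X_train])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
pred = np.column_stack([np.ones(len(X_test)), X_test]) @ coef

# Prediction error as a function of the true outcome.
error = pred - y_test
slope = np.polyfit(y_test, error, 1)[0]

# Mean error in the top and bottom deciles of the true outcome.
top = y_test >= np.quantile(y_test, 0.9)
bottom = y_test <= np.quantile(y_test, 0.1)
top_bias = error[top].mean()
bottom_bias = error[bottom].mean()
```

Because the noise in the outcome cannot be predicted, the error is negatively correlated with the true value: `slope` comes out negative, `top_bias` is negative (underestimation), and `bottom_bias` is positive (overestimation), even though the model is correctly specified.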
Hulk: A Universal Knowledge Translator for Human-Centric Tasks
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3552604
Yizhou Wang;Yixuan Wu;Weizhen He;Xun Guo;Feng Zhu;Lei Bai;Rui Zhao;Jian Wu;Tong He;Wanli Ouyang;Shixiang Tang
Abstract: Human-centric perception tasks, e.g., pedestrian detection, skeleton-based action recognition, and pose estimation, have wide industrial applications, such as metaverse and sports analysis. There is a recent surge to develop human-centric foundation models that can benefit a broad range of human-centric perception tasks. While many human-centric foundation models have achieved success, they did not explore 3D and vision-language tasks for human-centric and required task-specific finetuning. These limitations restrict their application to more downstream tasks and situations. To tackle these problems, we present Hulk, the first multimodal human-centric generalist model, capable of addressing 2D vision, 3D vision, skeleton-based, and vision-language tasks without task-specific finetuning. The key to achieving this is condensing various task-specific heads into two general heads, one for discrete representations, e.g., languages, and the other for continuous representations, e.g., location coordinates. The outputs of two heads can be further stacked into four distinct input and output modalities. This uniform representation enables Hulk to treat diverse human-centric tasks as modality translation, integrating knowledge across a wide range of tasks. Comprehensive evaluations of Hulk on 12 benchmarks covering 8 human-centric tasks demonstrate the superiority of our proposed method, achieving state-of-the-art performance in 11 benchmarks.
Vol. 47, No. 7, pp. 5672-5689
Citations: 0
The Conditional Cauchy-Schwarz Divergence With Applications to Time-Series Data and Sequential Decision Making
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3552434
Shujian Yu;Hongming Li;Sigurd Løkse;Robert Jenssen;José C. Príncipe
Abstract: The Cauchy-Schwarz (CS) divergence was developed by Príncipe et al. in 2000. In this paper, we extend the classic CS divergence to quantify the closeness between two conditional distributions and show that the developed conditional CS divergence can be elegantly estimated by a kernel density estimator from given samples. We illustrate the advantages (e.g., rigorous faithfulness guarantee, lower computational complexity, higher statistical power, and much more flexibility in a wide range of applications) of our conditional CS divergence over previous proposals, such as the conditional Kullback-Leibler divergence and the conditional maximum mean discrepancy. We also demonstrate the compelling performance of conditional CS divergence in two machine learning tasks related to time series data and sequential inference, namely time series clustering and uncertainty-guided exploration for sequential decision making.
Vol. 47, No. 7, pp. 5901-5917
Citations: 0
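As background for the entry above, the classic (unconditional) CS divergence between densities $p$ and $q$ is $D_{CS}(p;q) = -\log\big( (\int p q)^2 / (\int p^2 \int q^2) \big)$, and plugging in kernel density estimates reduces it to averages of Gram matrices. The sketch below implements that plug-in estimator for the classic divergence only; it is not the conditional estimator proposed in the paper. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions for illustration.

```python
import numpy as np


def gaussian_gram(a, b, sigma):
    """Gaussian kernel matrix between rows of a (M, d) and b (N, d)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def cs_divergence(x, y, sigma=1.0):
    """Plug-in estimate of the classic CS divergence from samples x and y."""
    kxx = gaussian_gram(x, x, sigma).mean()
    kyy = gaussian_gram(y, y, sigma).mean()
    kxy = gaussian_gram(x, y, sigma).mean()
    return np.log(kxx) + np.log(kyy) - 2.0 * np.log(kxy)


rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 2))
same = cs_divergence(samples, samples)
shifted = cs_divergence(samples, samples + 3.0)
```

The estimate is zero when both sample sets are identical and grows as the two sets move apart; it is non-negative by the Cauchy-Schwarz inequality.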
Hard-Aware Instance Adaptive Self-Training for Unsupervised Cross-Domain Semantic Segmentation
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3552484
Chuang Zhu;Kebin Liu;Wenqi Tang;Ke Mei;Jiaqi Zou;Tiejun Huang
Abstract: The divergence between labeled training data and unlabeled testing data is a significant challenge for recent deep learning models. Unsupervised domain adaptation (UDA) attempts to solve such problem. Recent works show that self-training is a powerful approach to UDA. However, existing methods have difficulty in balancing the scalability and performance. In this paper, we propose a hard-aware instance adaptive self-training framework for UDA on the task of semantic segmentation. To effectively improve the quality and diversity of pseudo-labels, we develop a novel pseudo-label generation strategy with an instance adaptive selector. We further enrich the hard class pseudo-labels with inter-image information through a skillfully designed hard-aware pseudo-label augmentation. Besides, we propose the region-adaptive regularization to smooth the pseudo-label region and sharpen the non-pseudo-label region. For the non-pseudo-label region, consistency constraint is also constructed to introduce stronger supervision signals during model optimization. Our method is so concise and efficient that it is easy to be generalized to other UDA methods. Experiments on GTA5 $\rightarrow$ Cityscapes, SYNTHIA $\rightarrow$ Cityscapes, and Cityscapes $\rightarrow$ Oxford RobotCar demonstrate the superior performance of our approach compared with the state-of-the-art methods.
Vol. 47, No. 7, pp. 5655-5671
Citations: 0
Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3552132
Jian Liu;Wei Sun;Hui Yang;Pengchao Deng;Chongpei Liu;Nicu Sebe;Hossein Rahmani;Ajmal Mian
Abstract: Nine-degrees-of-freedom (9-DoF) object pose and size estimation is crucial for enabling augmented reality and robotic manipulation. Category-level methods have received extensive research attention due to their potential for generalization to intra-class unknown objects. However, these methods require manual collection and labeling of large-scale real-world training data. To address this problem, we introduce a diffusion-based paradigm for domain-generalized category-level 9-DoF object pose estimation. Our motivation is to leverage the latent generalization ability of the diffusion model to address the domain generalization challenge in object pose estimation. This entails training the model exclusively on rendered synthetic data to achieve generalization to real-world scenes. We propose an effective diffusion model to redefine 9-DoF object pose estimation from a generative perspective. Our model does not require any 3D shape priors during training or inference. By employing the Denoising Diffusion Implicit Model, we demonstrate that the reverse diffusion process can be executed in as few as 3 steps, achieving near real-time performance. Finally, we design a robotic grasping system comprising both hardware and software components. Through comprehensive experiments on two benchmark datasets and the real-world robotic system, we show that our method achieves state-of-the-art domain generalization performance.
Vol. 47, No. 7, pp. 5520-5537
Citations: 0
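The Diff9D abstract above mentions running the reverse diffusion process in as few as 3 steps with the Denoising Diffusion Implicit Model (DDIM). The sketch below shows the generic deterministic DDIM update over a 3-step schedule. Several things are assumptions not taken from the abstract: the pose is treated as a flat 9-dimensional vector, the linear noise schedule and the chosen timesteps are illustrative, and an oracle noise predictor built from a known target stands in for the trained network.

```python
import numpy as np


def ddim_sample(x_T, eps_model, alpha_bar, timesteps):
    """Deterministic DDIM sampling (eta = 0) over a decreasing timestep list."""
    x = x_T
    for i, t in enumerate(timesteps):
        a_t = alpha_bar[t]
        # After the last listed timestep we land on the clean sample (alpha_bar = 1).
        a_prev = alpha_bar[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps = eps_model(x, t)
        x0_pred = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps
    return x


# Illustrative linear beta schedule with 1000 training steps.
betas = np.linspace(1e-4, 0.02, 1000)
alpha_bar = np.cumprod(1.0 - betas)

# Hypothetical 9-D target (e.g. rotation, translation, size packed into a vector).
true_pose = np.arange(9, dtype=float) / 10.0


def oracle_eps(x, t):
    """Stand-in for a trained network: the exact noise given the known target."""
    return (x - np.sqrt(alpha_bar[t]) * true_pose) / np.sqrt(1.0 - alpha_bar[t])


rng = np.random.default_rng(0)
recovered = ddim_sample(rng.normal(size=9), oracle_eps, alpha_bar, [999, 666, 333])
```

With a perfect noise predictor the three-step sampler recovers the target; with a learned predictor the quality of the few-step result depends on how accurate the network is at the chosen timesteps.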
Human as Points: Explicit Point-Based 3D Human Reconstruction From Single-View RGB Images
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3552408
Yingzhi Tang;Qijian Zhang;Yebin Liu;Junhui Hou
Abstract: The latest trends in the research field of single-view human reconstruction are devoted to learning deep implicit functions constrained by explicit body shape priors. Despite the remarkable performance improvements compared with traditional processing pipelines, existing learning approaches still exhibit limitations in terms of flexibility, generalizability, robustness, and/or representation capability. To comprehensively address the above issues, in this paper, we investigate an explicit point-based human reconstruction framework named HaP, which utilizes point clouds as the intermediate representation of the target geometric structure. Technically, our approach features fully explicit point cloud estimation (exploiting depth and SMPL), manipulation (SMPL rectification), generation (built upon diffusion), and refinement (displacement learning and depth replacement) in the 3D geometric space, instead of an implicit learning process that can be ambiguous and less controllable. Extensive experiments demonstrate that our framework achieves quantitative performance improvements of 20% to 40% over current state-of-the-art methods, and better qualitative results. Our promising results may indicate a paradigm rollback to the fully-explicit and geometry-centric algorithm design. In addition, we newly contribute a real-scanned 3D human dataset featuring more intricate geometric details.
Vol. 47, No. 7, pp. 5884-5900
Citations: 0
Revisiting Stochastic Multi-Level Compositional Optimization
IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-03-18 DOI: 10.1109/TPAMI.2025.3552197
Wei Jiang;Sifan Yang;Yibo Wang;Tianbao Yang;Lijun Zhang
Abstract: This paper explores stochastic multi-level compositional optimization, where the objective function is a composition of multiple smooth functions. Traditional methods for solving this problem suffer from either sub-optimal sample complexities or require huge batch sizes. To address these limitations, we introduce the Stochastic Multi-level Variance Reduction (SMVR) method. In the expectation case, our SMVR method attains the optimal sample complexity of $\mathcal{O}(1/\epsilon^{3})$ to find an $\epsilon$-stationary point for non-convex objectives. When the function satisfies convexity or the Polyak-Łojasiewicz (PL) condition, we propose a stage-wise SMVR variant. This variant improves the sample complexity to $\mathcal{O}(1/\epsilon^{2})$ for convex functions and $\mathcal{O}(1/(\mu\epsilon))$ for functions meeting the $\mu$-PL condition or $\mu$-strong convexity. These complexities match the lower bounds not only in terms of $\epsilon$ but also in terms of $\mu$ (for PL or strongly convex functions), without relying on large batch sizes in each iteration. Furthermore, in the finite-sum case, we develop the SMVR-FS algorithm, which can achieve a complexity of $\mathcal{O}(\sqrt{n}/\epsilon^{2})$ for non-convex objectives, $\mathcal{O}(\sqrt{n}/\epsilon \log(1/\epsilon))$ for convex functions and $\mathcal{O}(\sqrt{n}/\mu \log(1/\epsilon))$ for objectives satisfying the $\mu$-PL condition, where $n$ denotes the number of functions in each level. To make use of adaptive learning rates, we propose the Adaptive SMVR method, which maintains the same complexities while demonstrating faster convergence in practice.
Vol. 47, No. 7, pp. 5613-5624
Citations: 0