Neurocomputing Latest Articles

ExpTamed: An exponential tamed optimizer based on Langevin SDEs
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-18 · DOI: 10.1016/j.neucom.2025.130949
Utku Erdoğan , Şahin Işık , Yıldıray Anagün , Gabriel Lord
Abstract: This study presents a new method to improve optimization by regularizing the gradients in deep learning methods, based on a novel taming strategy that regulates the growth of numerical solutions of stochastic differential equations. The method, ExpTamed, enhances stability and reduces the mean-square error over a short time horizon compared to existing techniques. The practical effectiveness of ExpTamed is rigorously evaluated on CIFAR-10, Tiny-ImageNet, and Caltech256 across diverse architectures. In direct comparisons with prominent optimizers such as Adam, ExpTamed demonstrates significant performance gains: it achieved increases in best top-1 test accuracy ranging from 0.86 to 2.76 percentage points on CIFAR-10, and up to 4.46 percentage points on Tiny-ImageNet (without a learning-rate schedule). On Caltech256, ExpTamed also yielded superior accuracy, precision, and Kappa metrics. These results quantify ExpTamed's capability to deliver enhanced performance in practical deep learning applications.
Citations: 0
MKE-PLLM: A benchmark for multilingual knowledge editing on pretrained large language model
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-18 · DOI: 10.1016/j.neucom.2025.130979
Ran Song , Shengxiang Gao , Xiaofei Gao , Cunli Mao , Zhengtao Yu
Abstract: Multilingual large language models (mLLMs) have demonstrated remarkable performance across various downstream tasks but are still plagued by factuality errors. Knowledge editing aims to correct these errors by modifying the internal knowledge of pre-trained models. However, current knowledge editing methods primarily focus on monolingual settings, neglecting the complexities and interdependencies of multilingual scenarios. Furthermore, benchmarks specifically designed for multilingual knowledge editing are relatively scarce. To address this gap, this paper constructs a novel multilingual knowledge editing benchmark that comprehensively evaluates methods for mLLMs on accuracy, reliability, generalization, and consistency. To ensure the robustness and usability of the benchmark, we conducted detailed analysis and validation. Concurrently, we propose a baseline method that adapts existing monolingual knowledge editing techniques to the multilingual environment. Extensive experimental results demonstrate the effectiveness of our benchmark in evaluating multilingual knowledge editing.
Citations: 0
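The evaluation criteria named in the abstract (reliability, generalization, consistency) can be made concrete with a small scoring sketch: reliability checks the edit in its source language, generalization checks paraphrases, and consistency checks agreement across languages. The record schema and field names below are invented for illustration; the benchmark's actual format is not specified in the abstract.

```python
def edit_metrics(records):
    """Each record is a dict (hypothetical schema):
      'reliability'     -> bool, edit recalled for the original prompt
      'paraphrase_hits' -> list of bools, one per paraphrased prompt
      'lang_answers'    -> {lang: answer} after editing
      'target'          -> the edited target answer
    """
    n = len(records)
    reliability = sum(r['reliability'] for r in records) / n
    generalization = sum(
        sum(r['paraphrase_hits']) / len(r['paraphrase_hits'])
        for r in records) / n
    # Consistency: fraction of languages whose post-edit answer matches the target.
    consistency = sum(
        sum(a == r['target'] for a in r['lang_answers'].values())
        / len(r['lang_answers'])
        for r in records) / n
    return {'reliability': reliability,
            'generalization': generalization,
            'consistency': consistency}
```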
Joint super-resolution and inverse tone-mapping: A feature decomposition aggregation network and a new benchmark
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-18 · DOI: 10.1016/j.neucom.2025.131050
Gang Xu , Ao Shen , Yuchen Yang , Xiantong Zhen , Wei Chen , Jun Xu
Abstract: Joint Super-Resolution and Inverse Tone-Mapping (joint SR-ITM) aims to increase the resolution and dynamic range of low-resolution, standard-dynamic-range images. Recent networks mainly resort to image decomposition techniques with complex multi-branch architectures, but fixed decomposition techniques largely restrict their power on versatile images. To exploit the potential of the decomposition mechanism, in this paper we generalize it from the image domain to the broader feature domain. To this end, we propose a lightweight Feature Decomposition Aggregation Network (FDAN). In particular, we design a Feature Decomposition Block (FDB) that achieves learnable separation of detail and base feature maps, and develop a Hierarchical Feature Decomposition Group by cascading FDBs for powerful multi-level feature decomposition. Moreover, for better evaluation, we collect a large-scale dataset for joint SR-ITM, i.e., SRITM-4K, which provides versatile scenarios for robust model training and evaluation. Experimental results on two benchmark datasets demonstrate that FDAN is efficient and outperforms state-of-the-art methods on joint SR-ITM. The code of FDAN and the SRITM-4K dataset are available at https://github.com/CS-GangXu/FDAN.
Citations: 0
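FDB learns the detail/base separation end to end; a fixed-filter analogue makes the idea tangible: take a local mean as the "base" component and the residual as the "detail". The sketch below is that simplification, not the paper's learnable block.

```python
import numpy as np

def decompose(feat, k=3):
    """Split a 2D feature map into a smooth 'base' and a residual 'detail'
    via k-by-k mean filtering. By construction, base + detail == feat, so
    the decomposition is lossless -- a fixed-filter stand-in for what an
    FDB would learn with trainable weights."""
    pad = k // 2
    padded = np.pad(feat, pad, mode='edge')
    H, W = feat.shape
    base = np.zeros((H, W), dtype=float)
    for i in range(H):
        for j in range(W):
            base[i, j] = padded[i:i + k, j:j + k].mean()
    return base, feat - base
```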
Emotion agent: Unsupervised deep reinforcement learning with distribution-prototype reward for continuous emotional EEG analysis
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-17 · DOI: 10.1016/j.neucom.2025.130951
Zhihao Zhou , Li Zhang , Qile Liu , Gan Huang , Zhuliang Yu , Zhen Liang
Abstract: Continuous electroencephalography (EEG) signals are widely employed in affective brain-computer interface (aBCI) applications. However, only a subset of the continuously acquired EEG data is truly relevant to emotional processing, while the remainder is often noisy or unrelated. Manual annotation of these key emotional segments is impractical due to their dynamic and individualized nature. To address this challenge, we propose a novel unsupervised deep reinforcement learning framework, termed Emotion Agent, which automatically identifies and extracts the most informative emotional segments from continuous EEG signals. Emotion Agent first uses a heuristic algorithm to perform a global search and generate prototype representations of the EEG signals; these prototypes guide exploration of the signal space and highlight regions of interest. Furthermore, we design a distribution-prototype-based reward function that evaluates the interaction between samples and prototypes, ensuring that the selected segments are both representative and relevant to the underlying emotional states. Finally, the framework is trained with Proximal Policy Optimization (PPO) for stable and efficient convergence. Experimental results on three widely used datasets (covering both discrete and dimensional emotion recognition) show an average improvement of 13.46% with Emotion Agent, demonstrating a significant enhancement in accuracy and robustness for downstream aBCI tasks.
Citations: 0
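The distribution-prototype reward is not given in closed form here; one plausible reading is that a segment embedding earns reward according to its proximity to the nearest learned emotion prototype. The RBF form below is an assumption for illustration, not the paper's function.

```python
import numpy as np

def prototype_reward(sample, prototypes, sigma=1.0):
    # Hypothetical distribution-prototype reward: a segment embedding close
    # to any emotion prototype earns a reward near 1; distant (noisy or
    # emotion-irrelevant) segments earn a reward near 0.
    d2 = np.min(np.sum((prototypes - sample) ** 2, axis=1))
    return float(np.exp(-d2 / (2 * sigma ** 2)))
```

In an RL loop this scalar would be the per-step reward the PPO agent receives for keeping a segment.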
Dynamic T-distributed stochastic neighbor graph convolutional networks for multi-modal contrastive fusion
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-17 · DOI: 10.1016/j.neucom.2025.130950
Bo Xu , Guoxu Li , Jie Wang , Zheng Wang , Jianfu Cao , Rong Wang , Feiping Nie
Abstract: As data acquisition technologies continue to advance, multi-modal data have become a prominent focus across many domains. This paper tackles critical challenges in the multi-modal fusion process (representation learning, modal-consistency invariance learning, and modal-diversity complementarity learning) by employing graph convolutional networks and contrastive learning. Current GCN-based methods generally depend on predefined graphs for representation learning, limiting their capacity to capture local and global information effectively. Furthermore, some current models do not adequately contrast the consistent and diverse representations of different modalities during fusion. To address these challenges, we propose a novel T-distributed Stochastic Neighbor Contrastive Graph Convolutional Network (TSNGCN), consisting of an adaptive static graph learning module, a multi-modal representation learning module, and a multi-modal contrastive fusion module. The adaptive static graph learning module constructs graphs without relying on predefined distance metrics, adaptively creating a pairwise graph that preserves the local structure of the data. Moreover, a loss function based on t-distributed stochastic neighbor embedding is designed to learn the transformation between the embeddings and the original data, facilitating the discovery of more discriminative information within the learned subspace. In addition, the proposed multi-modal contrastive fusion module maximizes the similarity of the same samples across different modalities while keeping dissimilar samples apart, enhancing the model's consistency objective. Extensive experiments on several multi-modal benchmark datasets demonstrate the superiority and effectiveness of TSNGCN over existing methods.
Citations: 0
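The contrastive fusion objective described above (pull the same sample together across modalities, push different samples apart) is commonly realized as an InfoNCE-style loss. The sketch below is that generic form, offered as a stand-in rather than TSNGCN's exact term.

```python
import numpy as np

def cross_modal_nce(za, zb, tau=0.5):
    """InfoNCE-style loss between two modality embeddings (rows = samples):
    the i-th sample of modality A should match the i-th sample of modality B
    and repel all other samples."""
    za = za / np.linalg.norm(za, axis=1, keepdims=True)
    zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)
    logits = za @ zb.T / tau                       # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))     # -log p(positive pair)
```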
CTMEG: A continuous-time medical event generation model for clinical prediction of long-term disease progression
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-17 · DOI: 10.1016/j.neucom.2025.130999
Mengxuan Sun , Xuebing Yang , Jiayi Geng , Jinghao Niu , Chutong Wang , Chang Cui , Xiuyuan Chen , Wen Tang , Wensheng Zhang
Abstract: Long-term health monitoring tracks a patient's disease progression, which is critical for improving quality of life and supporting physicians' decision-making. Predictive models based on Electronic Health Records (EHRs) can offer substantial clinical support by alerting clinicians to subsequent disease-associated adverse events. Effective disease progression modeling involves two subtasks: (1) estimating the occurrence times of disease-associated events, and (2) classifying the types of the events that occur. Recent time-aware disease predictive models, mainly based on recurrent neural networks or attention networks, specialize in future disease-type prediction by accounting for the temporal irregularities in EHRs. This paper focuses on multi-step continuous-time disease prediction, which is more challenging because predictive models can easily fall into conflicts between the two subtasks. We propose a multi-task disentangled Continuous-Time Medical Event Generation (CTMEG) model to tackle both subtasks simultaneously. Unlike conventional continuous-time models, CTMEG encodes multi-view historical medical events and then simultaneously predicts multi-step disease types and occurrence times. First, a discrete Conditional Intensity Function (CIF) is designed to better estimate disease occurrence times with limited available data. Second, to reduce task conflicts, a gated network disentangles the coarse patient representation into task-specific representations. Finally, a tailored CIF attention module reduces error accumulation during prediction. Extensive experiments on the eICU and BFH databases demonstrate that CTMEG outperforms twelve competing models in long-term disease progression prediction. Our code is available on GitHub.
Citations: 0
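A discrete conditional intensity function can be read as a per-step hazard, and chaining the hazards yields a proper distribution over event times. The construction below is the generic discrete-time survival recipe, offered as background for how a discrete CIF supports occurrence-time estimation, not as CTMEG's exact formulation.

```python
import numpy as np

def event_time_pmf(hazard):
    """Convert per-step discrete hazards h_k = P(event at step k | no event
    yet) into the pmf over occurrence times:
        p_k = h_k * prod_{j < k} (1 - h_j)
    If the final hazard is 1, the pmf sums to 1 over the horizon."""
    hazard = np.asarray(hazard, dtype=float)
    survival = np.cumprod(1.0 - hazard)                    # P(no event by step k)
    prev_survival = np.concatenate(([1.0], survival[:-1])) # P(no event before k)
    return hazard * prev_survival
```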
DCB-VIM: An ensemble learning based filter method for feature selection with imbalanced class distribution
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-17 · DOI: 10.1016/j.neucom.2025.130848
Nayiri Galestian Pour , Soudabeh Shemehsavar
Abstract: Feature selection aims to improve predictive performance and interpretability when analyzing datasets with high-dimensional feature spaces. Imbalanced class distributions make feature selection even more difficult, so robust methodologies are essential for this case. We therefore present a filter method based on ensemble learning, in which each classifier is built on a randomly selected subspace of features. A variable importance measure is computed class-wise within each classifier, and a feature weighting procedure is then applied. The performance of the classifiers is taken into account in the combination phase of the ensemble. The effects of the hyperparameters (the subspace size and the number of classification trees) on predictive performance are investigated through simulation studies. The efficiency of the proposed method is evaluated, in terms of predictive performance under different selection strategies, on real data in the presence of class imbalance.
Citations: 0
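The combination phase, where each subspace classifier's variable importances are merged with its performance taken into account, can be sketched as a weighted aggregation. The triple format and the averaging scheme below are assumptions for illustration; the paper's exact weighting is not given in the abstract.

```python
import numpy as np

def aggregate_importance(n_features, ensemble):
    """ensemble: list of (subspace_idx, importances, score) triples, one per
    base classifier built on a random feature subspace. Weighting each
    classifier's importances by its performance `score` mirrors the idea of
    using classifier performance in the combination phase (details assumed)."""
    total = np.zeros(n_features)
    counts = np.zeros(n_features)
    for idx, imp, score in ensemble:
        idx = np.asarray(idx)
        total[idx] += score * np.asarray(imp, dtype=float)
        counts[idx] += 1
    # Average over the classifiers that actually saw each feature.
    return np.divide(total, counts, out=np.zeros(n_features), where=counts > 0)
```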
SCR: A completion-then-reasoning framework for multi-hop question answering over incomplete knowledge graph
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-16 · DOI: 10.1016/j.neucom.2025.131027
Ridong Han , Jia Liu , Haijia Bi , Tao Peng , Lu Liu
Abstract: Reinforcement learning has become a widely adopted technique for the multi-hop knowledge graph question answering task thanks to the excellent interpretability of its reasoning process. However, it is severely affected by the incompleteness of knowledge graphs and by the sparse rewards caused by weak supervision. In this paper, we propose a completion-then-reasoning framework, called SCR, to address these two issues. To handle knowledge graph incompleteness, we first extract a subgraph from the given knowledge graph for a given question and use a knowledge graph embedding model to predict and complete missing triples, followed by reinforcement learning for answer reasoning on the completed subgraph. To alleviate sparse rewards in reinforcement learning, we introduce a semantic reward based on the semantic similarity between the original question and the full relational path, enabling the model to receive partial rewards for partially correct paths instead of zero reward. Detailed experiments on the PQ, PQL, MetaQA, and WebQSP datasets demonstrate that SCR effectively improves multi-hop knowledge graph question answering. In particular, under a sparse-KG setting, SCR outperforms baselines by a large margin, highlighting the effectiveness of the completion-then-reasoning framework in mitigating knowledge graph incompleteness.
Citations: 0
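The semantic reward can be sketched as cosine similarity between the question embedding and an embedding of the traversed relational path, so a partially correct path earns a partial reward rather than zero. Mean-pooling the relation embeddings and clipping negative similarity at zero are illustrative choices, not necessarily SCR's.

```python
import numpy as np

def semantic_reward(q_emb, rel_embs, hit):
    # Terminal reward for one reasoning episode: 1 for a correct answer;
    # otherwise a partial reward from the cosine similarity between the
    # question embedding and the mean embedding of the traversed relations
    # (all embeddings assumed precomputed by an upstream encoder).
    if hit:
        return 1.0
    path = np.mean(rel_embs, axis=0)
    cos = np.dot(q_emb, path) / (np.linalg.norm(q_emb) * np.linalg.norm(path) + 1e-12)
    return float(max(cos, 0.0))
```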
MarIns3D: An open-vocabulary 3D instance segmentation model with mask refinement
IF 5.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-16 · DOI: 10.1016/j.neucom.2025.131018
Haiyang Li, Jinhe Su, Dong Zhou, Mengyun Cao
Abstract: Open-vocabulary 3D instance segmentation has gained significant attention due to its potential role in scene perception. Existing methods typically involve two stages: generating class-agnostic 3D instance masks with segmentation models, followed by semantic classification of these masks. However, poor classification performance often stems from low-quality masks produced in the first stage. This paper proposes two key components to optimize mask generation: a dynamic offset module and a projection consistency loss. By dynamically adjusting sampling-point positions, query points can capture key scene features to generate high-quality masks. The projection consistency loss then compares the 3D instance masks with the ground truth in 2D projections to refine them, improving segmentation accuracy. Experimental results on the ScanNetV2 validation set show that MarIns3D outperforms SOLE on zero-shot segmentation, with improvements of 1.8% in AP25 and 1.7% in AP50, and also demonstrates enhanced open-set segmentation capabilities. These results confirm the model's superior mask quality and segmentation performance. Furthermore, ablation studies verify that the synergy between the dynamic offset module and the projection consistency loss is crucial to these gains.
Citations: 0
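The projection consistency idea, checking a predicted 3D instance mask against 2D ground truth after projection, can be sketched with a pinhole camera model: project the masked 3D points into the image and measure how many land inside the ground-truth 2D mask. The score below is a simplified, non-differentiable stand-in for the paper's loss.

```python
import numpy as np

def projection_score(points, mask2d, K):
    """Project 3D mask points (N, 3) with intrinsic matrix K into the image
    and return the fraction that fall inside the ground-truth 2D mask.
    A simplified stand-in for a projection consistency check."""
    uvw = (K @ points.T).T
    uv = np.round(uvw[:, :2] / uvw[:, 2:3]).astype(int)   # perspective divide
    H, W = mask2d.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    hits = mask2d[uv[valid, 1], uv[valid, 0]].sum()       # rows = y, cols = x
    return hits / max(len(points), 1)
```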
M3DP: Optimizing 2D vision tasks with minimal 3D object information
IF 6.5 · Computer Science (CAS Tier 2)
Neurocomputing · Pub Date: 2025-07-16 · DOI: 10.1016/j.neucom.2025.130905
Ziming Wang , Yanjing Li , Linlin Yang , Xinkai Liang , Xianbin Cao , Qi Wang , Baochang Zhang
Abstract: The availability of 3D data (such as LiDAR, radar, and other point clouds) invites the exploration of 3D priors for 2D downstream tasks. However, existing methods require 3D data annotations aligned with the 2D data, which is time-consuming and labor-intensive. To reduce this reliance, we propose the Minimal 3D object information Prior (M3DP) for 2D vision feature learning using unaligned 3D data as a prior. Specifically, M3DP requires only 3D objects and their classification labels within the same dataset to learn 3D priors, greatly saving time and labor. Moreover, we introduce multiview rotation augmentation (MRA) and two alignments (K-Best-of-N alignment and instance alignment) to encourage 2D representation learning from 3D representations in a unified 2D-3D space, helping the model learn geometric properties. This strategy fully leverages the multi-view geometric information of 3D objects, enabling precise localization and matching in 2D images. Through instance alignment, our method also efficiently facilitates information transfer across different categories of 3D objects, effectively enhancing the performance of 2D tasks by learning the geometric properties of 3D objects. Extensive experiments on three datasets demonstrate the superiority of our approach over prior state-of-the-art methods.
Citations: 0