Neurocomputing, Pub Date: 2025-03-26, DOI: 10.1016/j.neucom.2025.130059
Minghao Mo, Weihai Lu, Qixiao Xie, Zikai Xiao, Xiang Lv, Hong Yang, Yanchun Zhang

Title: One multimodal plugin enhancing all: CLIP-based pre-training framework enhancing multimodal item representations in recommendation systems

Abstract: With advances in multimodal pre-training, growing effort has gone into integrating it into recommendation models. Current methods mainly use multimodal pre-training models to obtain multimodal item representations and design task-specific architectures downstream. However, these methods often neglect whether the resulting representations actually suit recommendation systems: because pre-training is not conducted on recommendation datasets, the directly obtained representations can be suboptimal due to semantic biases from domain discrepancy and noise interference. Furthermore, collaborative information, a key element in recommendation systems, significantly impacts model effectiveness, yet advanced multimodal pre-training models such as CLIP cannot capture the collaborative information of items. To bridge the gap between multimodal pre-training models and recommendation systems, we propose CPMM (CLIP-based Pre-training MultiModal), a novel multimodal pre-training framework for item representations in recommendation. First, the representations of images, text, and IDs are mapped to a new low-dimensional contrastive representation space for alignment and semantic enhancement, ensuring the consistency and robustness of the multimodal contrastive representation (MCR). A contrastive learning approach is designed to regulate inter-modal distances, mitigating the impact of noise on recommendation performance. Finally, first-order item similarities are modeled, integrating the collaborative information of items into the multimodal contrastive representations. Extensive experiments on Amazon benchmark datasets (Beauty, Toys, Tools) validate CPMM's effectiveness across three core recommendation tasks: sequential recommendation, collaborative filtering, and click-through rate prediction.

(Neurocomputing, Vol. 637, Article 130059)
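The contrastive alignment of modality representations described in this abstract can be sketched with a generic InfoNCE-style objective over paired modality embeddings. This is an illustrative formulation only: the embedding dimensions, temperature, and pairing scheme below are assumptions, not the paper's exact loss.

```python
import numpy as np

def info_nce(a, b, tau=0.1):
    """InfoNCE loss between two modality embedding matrices.
    a, b: (n, d) arrays; row i of each describes the same item."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    logits = a @ b.T / tau                           # pairwise cosine similarity / tau
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_p))                  # matched pairs (i, i) are positives

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 16))
txt_aligned = img + 0.01 * rng.normal(size=(8, 16))  # well-aligned modality pairs
txt_random = rng.normal(size=(8, 16))                # unrelated modality pairs

assert info_nce(img, txt_aligned) < info_nce(img, txt_random)
```

Driving this loss down pulls matching cross-modal pairs together in the shared low-dimensional space and pushes mismatched pairs apart, which is the alignment effect the abstract attributes to its contrastive representation space.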
Neurocomputing, Pub Date: 2025-03-25, DOI: 10.1016/j.neucom.2025.129903
Ronghan Li, Dongdong Li, Haowen Yang, Xiaoxi Liu, Haoxiang Jin, RongCheng Pu, Qiguang Miao

Title: RECoT: Relation-enhanced Chains-of-Thoughts for knowledge-intensive multi-hop questions answering

Abstract: Open-domain question answering aims to enable a computer to understand and answer questions on a wide range of topics. The prevalent retrieve-and-read paradigm helps large language models (LLMs) by retrieving relevant text from external knowledge sources using the question. However, multi-hop question answering based on Chains-of-Thoughts (CoT) can perform poorly on complex questions: errors may occur when generating the sentence at each hop, and these errors accumulate, leading to significant deviations in the final result. To address this, this paper first extracts the relational triples of a complex question; the triples are then used to select the most representative sentence at each step of CoT generation as the query for next-hop retrieval. RECoT with GPT-3 yields significant improvements in downstream QA, with F1 up 5.1 points on the 2WikiMultihopQA dataset and up 2.9 points on the HotpotQA dataset. Improvements are obtained even with smaller models such as Flan-T5-large, without additional training. In conclusion, RECoT reduces model hallucination and enables more accurate CoT reasoning to guide retrieval. Code is publicly available at https://github.com/XD-BDIV-NLP/RECoT.

(Neurocomputing, Vol. 637, Article 129903)
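The triple-guided query selection this abstract describes can be illustrated with a toy scoring rule: rank the CoT sentences generated at a hop by how many surface terms they share with the extracted relation triples, and use the best one as the next-hop retrieval query. The overlap scoring below is a hypothetical stand-in for the paper's actual selection criterion.

```python
def select_query(sentences, triples):
    """Pick the candidate CoT sentence sharing the most surface terms with
    the extracted relation triples (hypothetical overlap scoring)."""
    terms = {w.lower() for t in triples for part in t for w in part.split()}
    def score(s):
        return sum(w.strip(".,").lower() in terms for w in s.split())
    return max(sentences, key=score)

triples = [("Inception", "directed by", "Christopher Nolan")]
sentences = [
    "The film was released in 2010.",
    "Inception was directed by Christopher Nolan.",
]
assert select_query(sentences, triples) == sentences[1]
```

Anchoring each hop's query to the question's triples is what limits the per-hop drift that the abstract says otherwise accumulates across the chain.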
Neurocomputing, Pub Date: 2025-03-25, DOI: 10.1016/j.neucom.2025.130068
Xiyao Liu, Qingyu Dang, Huiyi Wang, Xiaoheng Deng, Xunli Fan, Cundian Yang, Zhihong Chen, Hui Fang

Title: An adversarial contrastive learning based cross-modality zero-watermarking scheme for DIBR 3D video copyright protection

Abstract: Copyright protection of depth image-based rendering (DIBR) 3D videos has raised significant concerns due to their increasing popularity. Zero-watermarking, an emerging tool for protecting the copyright of DIBR 3D videos, mainly relies on traditional feature extraction methods and therefore needs improved robustness against complex geometric attacks and a better balance between robustness and distinguishability. This paper presents a novel zero-watermarking scheme based on cross-modality feature fusion within a contrastive learning framework. Our approach integrates complementary information from 2D frames and depth maps using a cross-modality attention feature fusion mechanism to obtain discriminative features. Moreover, the features achieve a better trade-off between robustness and distinguishability by leveraging a designed contrastive learning strategy with an adversarial distortion simulator. Experimental results demonstrate remarkable performance: the false negative rate drops to around 0.2% at a false positive rate of 0.5%, outperforming state-of-the-art zero-watermarking methods.

(Neurocomputing, Vol. 637, Article 130068)
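The reported operating point (false negative rate at a fixed false positive rate) is a standard way to evaluate such detectors. A generic threshold-selection sketch on synthetic similarity scores, not the paper's evaluation code, looks like this:

```python
import numpy as np

def fnr_at_fpr(genuine, impostor, target_fpr=0.005):
    """Pick the threshold that caps the impostor pass rate at target_fpr,
    then report the resulting false negative rate on genuine scores."""
    thr = np.quantile(impostor, 1.0 - target_fpr)   # cut off the top target_fpr
    fpr = np.mean(impostor >= thr)
    fnr = np.mean(genuine < thr)
    return fnr, fpr

rng = np.random.default_rng(1)
genuine = rng.normal(0.9, 0.05, 2000)    # toy scores for watermarked/attacked copies
impostor = rng.normal(0.1, 0.05, 2000)   # toy scores for unrelated videos

fnr, fpr = fnr_at_fpr(genuine, impostor)
assert fpr <= 0.005 + 1e-9 and fnr < 0.01
```

The further apart the two score distributions sit (the distinguishability the abstract emphasizes), the lower the FNR achievable at the same fixed FPR.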
Neurocomputing, Pub Date: 2025-03-25, DOI: 10.1016/j.neucom.2025.130070
Junlei Zhang, Hongliang He, Lizhi Ma, Nirui Song, Shuyuan He, Shuai Zhang, Huachuan Qiu, Zhanchao Zhou, Anqi Li, Yong Dai, Renjun Xu, Zhenzhong Lan

Title: ConceptPsy: A comprehensive benchmark suite for hierarchical psychological concept understanding in LLMs

Abstract: The safe and efficient deployment of Large Language Models (LLMs) in psychology requires rigorous evaluation frameworks, which current benchmarks fail to provide. Existing Massive Multitask Language Understanding (MMLU)-style benchmarks suffer from two critical limitations: (1) coarse-grained evaluation, lacking systematic assessment of conceptual understanding, and (2) low concept coverage, spanning only a small fraction of essential psychological concepts. For example, CMMLU covers only 59% of the necessary chapter-level concepts in psychology, while the C-EVAL benchmark lacks a psychology category altogether. We address these limitations with ConceptPsy, which introduces three technical innovations: (1) fine-grained labeling for conceptual-hierarchy evaluation, rather than only subject-level average scores; (2) a comprehensive benchmark spanning 12 core psychological subjects and 1,383 manually collected concepts; and (3) a novel concept coverage metric that identifies evaluation gaps. The evaluation results indicate that although some models achieve high average scores, they may underperform on specific concepts. For instance, Qwen2.5-72B-Instruct achieved an average of 88% yet scored only around 70% on certain individual concepts. This fine-grained evaluation offers valuable insights for targeted refinement. We publish our data at https://huggingface.co/datasets/ConceptPsy/ConceptPsy.

(Neurocomputing, Vol. 637, Article 130070)
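A concept coverage metric of the kind this abstract mentions can be read, in its simplest form, as the share of required chapter-level concepts that a benchmark's questions touch. The set-based formulation below is an assumption about the metric's shape, not the paper's definition.

```python
def concept_coverage(benchmark_concepts, required_concepts):
    """Fraction of required chapter-level concepts a benchmark covers."""
    required = set(required_concepts)
    return len(required & set(benchmark_concepts)) / len(required)

required = {"memory", "perception", "attention", "emotion"}
assert concept_coverage({"memory", "attention", "language"}, required) == 0.5
```

Under this reading, a figure like "CMMLU covers 59% of chapter-level concepts" is exactly such a ratio computed against a curated list of required concepts.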
Neurocomputing, Pub Date: 2025-03-25, DOI: 10.1016/j.neucom.2025.130079
Jinshan Bian, Hongbing Xia, Jun Yi, Chaoxu Mu, Chenyi Si

Title: Finite-time neuro-optimal control for constrained nonlinear systems through reinforcement learning

Abstract: This paper investigates reinforcement learning-based finite-time neuro-optimal control for constrained nonlinear systems with dynamic asymmetric constraints and input saturation. First, based on the definition of finite time, an optimal backstepping technique is used to build a Sub-Actor-Critic framework under which the tracking error and system variables stay within the bounds imposed by the asymmetric time-varying constraints. Then, identifier neural networks are used to handle the asymmetric saturation constraints and estimate unknown nonlinearities. Furthermore, a policy iteration algorithm is introduced to update each subsystem's control input and value function, simultaneously solving the Hamilton-Jacobi-Bellman equation. In addition, all signals in the closed-loop system are guaranteed to be bounded within finite time via Lyapunov stability analysis. Finally, simulation results validate the proposed method's effectiveness.

(Neurocomputing, Vol. 637, Article 130079)
Neurocomputing, Pub Date: 2025-03-25, DOI: 10.1016/j.neucom.2025.130038
Jose L. Gómez, Manuel Silva, Antonio Seoane, Agnés Borràs, Mario Noriega, German Ros, Jose A. Iglesias-Guitian, Antonio M. López

Title: All for one, and one for all: UrbanSyn Dataset, the third Musketeer of synthetic driving scenes

Abstract: We introduce UrbanSyn, a photorealistic dataset of semi-procedurally generated synthetic urban driving scenarios. Built with high-quality geometry and materials, UrbanSyn provides pixel-level ground truth, including depth, semantic segmentation, and instance segmentation with object bounding boxes and occlusion degree. It complements the GTAV and Synscapes datasets to form what we coin the 'Three Musketeers'. We demonstrate the value of the Three Musketeers in unsupervised domain adaptation for image semantic segmentation: results on the real-world Cityscapes, Mapillary Vistas, and BDD100K datasets establish new benchmarks, largely attributed to UrbanSyn. We make UrbanSyn openly and freely accessible.

(Neurocomputing, Vol. 637, Article 130038)
Neurocomputing, Pub Date: 2025-03-24, DOI: 10.1016/j.neucom.2025.130063
Jiangnan Tang, Huanhuan Gu, Darko B. Vuković, Guandong Xu, Youquan Wang, Haicheng Tao, Jie Cao

Title: Fraud detection in multi-relation graph: Contrastive Learning on Feature and Structural Levels

Abstract: Fraud detection has emerged as a significant area of study, primarily due to its considerable impact on real-world applications. Despite their effectiveness, existing fraud detection methods have not adequately addressed two key challenges: fraudulent camouflage and class imbalance. To tackle these challenges, we propose a novel model, Contrastive Learning on Feature and Structural Levels in Graph Neural Networks (CLFS-GNN). The model incorporates an innovative neighbor-selection module that considers both feature and structural similarity between central nodes and their neighbors, reducing interference from fraudulent nodes by selecting highly similar neighbors. It also employs an intra- and inter-graph message aggregation module with attention mechanisms to enhance the value of aggregated neighbor information, improving fraud detection performance. Furthermore, contrastive learning pulls similar nodes closer and pushes dissimilar nodes apart, mitigating the effects of class imbalance. Extensive experimental results show that the model outperforms state-of-the-art GNN-based fraud detection on the Yelp and Amazon benchmark datasets.

(Neurocomputing, Vol. 637, Article 130063)
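The neighbor-selection idea in this abstract, combining feature similarity with structural similarity to filter out camouflaged fraudulent neighbors, can be sketched as a top-k ranking. The cosine/Jaccard choice and the 50/50 weighting below are illustrative assumptions, not the paper's exact module.

```python
import numpy as np

def select_neighbors(x, adj, center, candidates, k=2, alpha=0.5):
    """Rank candidate neighbors of `center` by a convex mix of feature
    cosine similarity and structural (neighborhood Jaccard) similarity,
    then keep the top-k most similar ones."""
    def cos(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    def jac(i, j):
        ni, nj = set(np.flatnonzero(adj[i])), set(np.flatnonzero(adj[j]))
        return len(ni & nj) / max(len(ni | nj), 1)
    scores = [alpha * cos(x[center], x[c]) + (1 - alpha) * jac(center, c)
              for c in candidates]
    order = np.argsort(scores)[::-1][:k]
    return [candidates[i] for i in order]

x = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [1.0, 0.0]])  # node features
adj = np.array([[0, 1, 0, 1],
                [1, 0, 0, 1],
                [0, 0, 0, 1],
                [1, 1, 1, 0]])                                   # adjacency
assert select_neighbors(x, adj, center=0, candidates=[1, 2, 3]) == [1, 3]
```

Aggregating messages only from neighbors that score high on both levels is what limits the influence of camouflaged fraudsters whose features or connectivity do not match the central node.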
Neurocomputing, Pub Date: 2025-03-24, DOI: 10.1016/j.neucom.2025.130037
Michele Nazareth da Costa, Romis Attux, Andrzej Cichocki, João M.T. Romano

Title: Tensor-Train networks for learning predictive modeling of multidimensional data

Abstract: In this work, we first apply Tensor-Train (TT) networks to construct a compact representation of the classical multilayer perceptron, achieving a reduction of up to 95% of the coefficients. A comparative analysis between the tensor model and standard multilayer neural networks is also carried out on prediction of the noisy chaotic Mackey-Glass time series and the NASDAQ index. We show that the weights of a multidimensional regression model can be learned with a TT network, and that optimizing the TT weights is more robust to coefficient initialization and hyper-parameter settings. Furthermore, an efficient algorithm based on alternating least squares is proposed for approximating the weights in TT format at reduced computational cost, converging much faster than the well-known adaptive learning algorithms widely applied to optimizing neural networks.

(Neurocomputing, Vol. 637, Article 130037)
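The coefficient reduction this abstract reports comes from storing a layer's weight matrix as a chain of small TT cores instead of a dense array. A minimal sketch, with illustrative mode sizes and TT ranks (the paper's actual factorization choices are not given here):

```python
import numpy as np

# A 64x64 dense layer (4096 weights) as a tensor train: factor 64 = 4*4*4
# on both the input and output sides, giving three cores of shape
# (r_{k-1}, 4*4, r_k). The ranks below are illustrative.
in_modes, out_modes, ranks = [4, 4, 4], [4, 4, 4], [1, 3, 3, 1]

rng = np.random.default_rng(0)
cores = [rng.normal(size=(ranks[k], in_modes[k] * out_modes[k], ranks[k + 1]))
         for k in range(3)]

# Contract the train back into a full (4*4*4) x (4*4*4) weight matrix.
full = cores[0]
for core in cores[1:]:
    full = np.tensordot(full, core, axes=([-1], [0]))  # chain over the ranks
full = full.reshape(4, 4, 4, 4, 4, 4)                  # (i1,o1,i2,o2,i3,o3)
W = full.transpose(0, 2, 4, 1, 3, 5).reshape(64, 64)   # group inputs, outputs

tt_params = sum(c.size for c in cores)                 # 240 vs 4096 weights
assert W.shape == (64, 64)
assert tt_params < 0.1 * W.size
```

Here 240 stored coefficients reproduce a 4096-entry matrix, a reduction of about 94%, in line with the "up to 95%" figure; the achievable compression depends on the chosen TT ranks.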
Neurocomputing, Pub Date: 2025-03-24, DOI: 10.1016/j.neucom.2025.130031
Wing W.Y. Ng, Xuyu Liu, Xing Tian, Ting Wang, Jianjun Zhang, C.L. Philip Chen

Title: Broad hashing for image retrieval

Abstract: With the popularity of deep learning, deep hashing has become the mainstream of hashing methods, adopting deep networks to learn better image feature representations while generating compact binary hash codes. However, deep hashing suffers a training-efficiency bottleneck due to the complex structure of deep networks. In this work, we propose a broad hashing (BH) method with high retrieval performance and very short learning time. In BH, uncorrelated and balanced binary codes are assigned to each category through a Hadamard matrix. A broad hashing network is then constructed to learn hash functions that map images to binary hash codes with high efficiency. Our method yields higher retrieval precision while training 200 to 700 times faster than deep hashing methods, and produces more compact hash codes than conventional supervised learning methods. In addition, three incremental algorithms for BH are developed for dynamic environments, enabling the hash network to be remodeled without retraining. Experiments on three benchmark datasets validate the effectiveness and efficiency of BH.

(Neurocomputing, Vol. 636, Article 130031)
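The Hadamard-matrix step in this abstract works because the rows of a Hadamard matrix are mutually orthogonal (uncorrelated) and, apart from the all-ones row, contain equal numbers of +1 and -1 (balanced). A sketch using the Sylvester construction; the code length and the choice to drop the first row are illustrative:

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

K = 16                  # code length = matrix order
H = hadamard(K)
# Skip the all-ones first row; the remaining rows are balanced and mutually
# orthogonal, so each category receives an uncorrelated +-1 target code.
targets = H[1:]

assert np.array_equal(H @ H.T, K * np.eye(K, dtype=int))  # orthogonal rows
assert all(row.sum() == 0 for row in targets)             # balanced rows
```

Assigning one such row per category gives the network fixed, well-separated binary targets, which is what lets BH skip the expensive end-to-end code learning of deep hashing.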
Neurocomputing, Pub Date: 2025-03-24, DOI: 10.1016/j.neucom.2025.130040
Xiongtao Zhang, Lijie Pan, Qing Shen, Zhenfang Liu, Jungang Lou, Yunliang Jiang

Title: Trend-aware spatio-temporal fusion graph convolutional network with self-attention for traffic prediction

Abstract: Current traffic prediction methods often extract insufficient road-network information, struggle to mine long-term temporal dependencies, and suffer performance degradation from uneven data distribution. To address these issues, we propose a novel Spatio-Temporal Fusion Graph Convolutional Network with Trend-Awareness (STFTGCN) for traffic prediction. It consists of spatio-temporal embedding, spatio-temporal synchronous graph convolution, a temporal-attention module, and trend-aware forecasting. By constructing and aggregating a Spatial Distance Graph, a Road Connection Graph, a Geographic Correlation Graph, and the proposed Lagged Correlation Graph, the hidden information in the road network is fully extracted. Multi-layer spatio-temporal synchronous graph convolution then captures local spatio-temporal correlations, while a sandwich structure combining temporal self-attention and temporal trend-aware multi-head self-attention effectively extracts long-term dependencies and responds to local traffic fluctuations. The trend-aware transformation method overcomes uneven data distribution, improving node-relationship matching and capturing dynamic traffic changes. Experimental results on real-world datasets (PEMS03, PEMS04, PEMS07, PEMS08, PEMS-BAY and METR-LA) demonstrate that the proposed STFTGCN outperforms baseline models, validating its effectiveness and practicality.

(Neurocomputing, Vol. 637, Article 130040)
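A Lagged Correlation Graph of the kind this abstract proposes plausibly weights node pairs by their best cross-correlation over a window of time shifts, since traffic at one sensor echoes at downstream sensors with a delay. The scan below is a hypothetical reading of that edge weight, not the paper's definition.

```python
import numpy as np

def lagged_corr(a, b, max_lag=3):
    """Best Pearson correlation between series a and a delayed series b,
    scanning shifts of b up to max_lag steps."""
    best = -1.0
    for lag in range(max_lag + 1):
        x = a[:len(a) - lag] if lag else a    # align a[t] with b[t + lag]
        y = b[lag:]
        x = x - x.mean()
        y = y - y.mean()
        best = max(best, float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y))))
    return best

t = np.linspace(0, 8 * np.pi, 200)
upstream = np.sin(t)            # sensor earlier along the road
downstream = np.sin(t - 0.5)    # same pattern arriving later

# Allowing a lag recovers the delayed dependency that lag-0 correlation misses.
assert lagged_corr(upstream, downstream, max_lag=10) > \
       lagged_corr(upstream, downstream, max_lag=0)
```

Edges built from such lag-tolerant correlations connect sensors that influence each other with a delay, information that distance-based or connectivity-based graphs alone do not capture.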