Neurocomputing, Pub Date: 2025-03-18, DOI: 10.1016/j.neucom.2025.130003
Ziqing Huang, Hao Liu, Zhihao Jia, Shuo Zhang, Yonghua Zhang, Shiguang Liu
{"title":"Texture dominated no-reference quality assessment for high resolution image by multi-scale mechanism","authors":"Ziqing Huang , Hao Liu , Zhihao Jia , Shuo Zhang , Yonghua Zhang , Shiguang Liu","doi":"10.1016/j.neucom.2025.130003","DOIUrl":"10.1016/j.neucom.2025.130003","url":null,"abstract":"<div><div>With the rapid development of new media formats, various high-definition display devices are ubiquitous, and high-resolution (HR) images are essential for high-quality visual experiences. Quality assessment of HR images has therefore become an urgent challenge. However, conventional image quality assessment (IQA) methods with good performance are designed for low-resolution (LR) images; they lack the perceptual characteristics of HR images, making it difficult to achieve satisfactory subjective consistency. Moreover, huge computational costs are incurred when deep neural networks designed for LR-IQA are applied directly to HR images. Inspired by the fact that regions with rich textures are more sensitive to distortion than others, a texture-dominated no-reference image quality assessment method for HR images is proposed in this paper. Specifically, a dual-branch network based on multi-scale technology is designed to extract texture and semantic features separately, and cross-scale and dual-dimensional attention are introduced to ensure the dominance of texture features. Then, a multi-layer perceptron network maps the extracted quality-perception feature vectors to the predicted quality score. Notably, local entropy is calculated and representative blocks are cropped as inputs to the feature extraction network, greatly reducing computational complexity. Overall, the proposed texture-dominated high-resolution IQA network (TD-HRNet) is reference-free, yet it performs excellently on HR datasets of different sizes, image types, and distortion types, accurately predicting the quality of different types of HR images.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 130003"},"PeriodicalIF":5.5,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
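The cost-reduction step described in the abstract, computing local entropy and cropping representative blocks as network inputs, can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the block size, histogram bin count, and top-k selection rule are assumptions.

```python
import numpy as np

def patch_entropy(patch, bins=16):
    """Shannon entropy of a grayscale patch's intensity histogram (values in [0, 1])."""
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())

def select_blocks(image, block=64, k=4):
    """Crop the k non-overlapping blocks with the highest local entropy,
    so only texture-rich regions are fed to the feature extractor."""
    h, w = image.shape
    scored = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            b = image[y:y + block, x:x + block]
            scored.append((patch_entropy(b), y, x))
    scored.sort(reverse=True)  # highest entropy first
    return [image[y:y + block, x:x + block] for _, y, x in scored[:k]]
```

Textured regions get high entropy scores while flat regions score near zero, so the selected blocks concentrate the distortion-sensitive content.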
Neurocomputing, Pub Date: 2025-03-18, DOI: 10.1016/j.neucom.2025.129925
Enhao Chen, Hao Wang, Zhanglei Shi, Wei Zhang
{"title":"Channel pruning for convolutional neural networks using l0-norm constraints","authors":"Enhao Chen , Hao Wang , Zhanglei Shi , Wei Zhang","doi":"10.1016/j.neucom.2025.129925","DOIUrl":"10.1016/j.neucom.2025.129925","url":null,"abstract":"<div><div>Channel pruning can effectively reduce the size and inference time of Convolutional Neural Networks (CNNs). However, existing channel pruning methods still face several issues, including high computational costs, extensive manual intervention, difficulty in hyperparameter tuning, and challenges in directly controlling the sparsity. To address these issues, this paper proposes two channel pruning methods based on <span><math><msub><mrow><mi>l</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span>-norm sparse optimization: the <span><math><msub><mrow><mi>l</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span>-norm Pruner and the Automated <span><math><msub><mrow><mi>l</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span>-norm Pruner. The <span><math><msub><mrow><mi>l</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span>-norm Pruner formulates the channel pruning problem as a sparse optimization problem involving the <span><math><msub><mrow><mi>l</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span>-norm and achieves a fast solution through a series of approximations and transformations. Inspired by this solution process, we devise the Zero-Norm (ZN) module, which can autonomously select output channels for each layer based on a predefined global pruning ratio. This approach incurs low computational cost and allows for precise control over the overall pruning ratio. Furthermore, to further enhance the performance of the pruned model, we have developed the Automated <span><math><msub><mrow><mi>l</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span>-norm Pruner. 
This method utilizes a Bee Colony Optimization algorithm to adjust the pruning ratio, mitigating the negative impact of manually preset pruning ratios on model performance. Our experiments demonstrate that the proposed pruning methods outperform several state-of-the-art techniques. The source code for our proposed methods is available at: <span><span>https://github.com/TCCofWANG/l0_prune</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 129925"},"PeriodicalIF":5.5,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143687916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
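The core of an l0-norm channel-selection constraint (keep at most k channels) has a simple closed form: retain the k largest-magnitude importance scores and zero the rest. The sketch below is illustrative only; the source of the scores (e.g. batch-norm scale factors) and the global, cross-layer granularity are assumptions, and the paper's ZN module learns the selection end-to-end rather than applying a one-shot threshold.

```python
import numpy as np

def l0_channel_mask(scores, keep_ratio):
    """Binary channel mask under an l0 constraint: with ||mask||_0 <= k,
    the best choice is to keep the k channels with the largest |score|."""
    k = max(1, int(round(keep_ratio * len(scores))))
    idx = np.argsort(np.abs(scores))[::-1][:k]  # indices of top-k magnitudes
    mask = np.zeros_like(scores, dtype=float)
    mask[idx] = 1.0
    return mask
```

Because k is derived directly from `keep_ratio`, the overall pruning ratio is controlled exactly, which is the property the abstract emphasizes.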
Neurocomputing, Pub Date: 2025-03-18, DOI: 10.1016/j.neucom.2025.129891
Jingting Li, Su-Jing Wang, Yong Wang, Haoliang Zhou, Xiaolan Fu
{"title":"Parallel Spatiotemporal Network to recognize micro-expression","authors":"Jingting Li , Su-Jing Wang , Yong Wang , Haoliang Zhou , Xiaolan Fu","doi":"10.1016/j.neucom.2025.129891","DOIUrl":"10.1016/j.neucom.2025.129891","url":null,"abstract":"<div><div>Micro-expressions are fleeting spontaneous facial expressions that commonly occur in high-stakes scenarios and reflect humans’ mental states. Thus, they are among the crucial clues for lie detection. Furthermore, due to the brief duration of micro-expressions, temporal information is important for micro-expression recognition. This paper proposes a Parallel Spatiotemporal Network (PSN) to recognize micro-expressions. The proposed PSN includes a spatial sub-network and a temporal sub-network. The spatial sub-network is a shallow network that takes subtle motion information as input. The temporal sub-network contains a novel temporal feature extraction unit that extracts sparse temporal features of micro-expressions. Finally, we propose a fusion model combining element-wise addition with a 1 × 1 convolutional kernel to fuse the spatial and temporal features. The proposed PSN achieves better measurement metrics (such as recognition rate, F1 score, true positive rate, and true negative rate) than other state-of-the-art methods on databases consisting of CASME, CASME II, CAS(ME)<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>, and SAMM.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 129891"},"PeriodicalIF":5.5,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143687915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
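The fusion step, element-wise addition followed by a 1 × 1 convolution, can be sketched in NumPy: a 1 × 1 convolution is simply a channel-mixing matrix multiply applied at every spatial location. The (C, H, W) tensor layout is an assumption.

```python
import numpy as np

def fuse(spatial, temporal, w):
    """Fuse spatial and temporal feature maps of shape (C, H, W) by
    element-wise addition followed by a 1x1 convolution whose weights
    w have shape (C_out, C)."""
    summed = spatial + temporal          # element-wise addition
    c, h, wd = summed.shape
    # 1x1 conv == per-pixel linear map over channels
    return (w @ summed.reshape(c, h * wd)).reshape(w.shape[0], h, wd)
```

With `w = np.eye(C)` the module reduces to plain addition; learned weights let the network reweight how spatial and temporal evidence mix per channel.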
Neurocomputing, Pub Date: 2025-03-18, DOI: 10.1016/j.neucom.2025.129897
Yanzhe Wang, Yizhen Wang, Baoqun Yin
{"title":"DCFT: Dependency-aware continual learning fine-tuning for sparse LLMs","authors":"Yanzhe Wang, Yizhen Wang, Baoqun Yin","doi":"10.1016/j.neucom.2025.129897","DOIUrl":"10.1016/j.neucom.2025.129897","url":null,"abstract":"<div><div>As the size of Large Language Models (LLMs) increases, they exhibit enhanced capabilities in general intelligence but also present greater challenges in deployment. Consequently, compressing LLMs has become critically important. Among the various compression techniques, post-training pruning is highly favored by researchers due to its efficiency. However, this one-shot pruning approach often results in a significant deterioration of model performance. To mitigate this issue, we introduce Dependency-aware Continual learning Fine-Tuning (DCFT) for sparse LLMs. This method facilitates fine-tuning across sequential tasks without compromising the model’s sparsity. Initially, we revisit the inference process in LLMs from a novel perspective, treating two matrices that previously required independent optimization as a unified entity. This strategy introduces merely 0.011‰ additional parameters to achieve efficient fine-tuning. Furthermore, we re-evaluate the parameter fine-tuning process through the lens of matrix space mapping. By constraining the similarity of the mapping matrices, our approach enables the model to retain its performance on prior tasks while learning new ones. We tested our method on models from the LLaMA-V1/V2 families, with parameters ranging from 7B to 70B, and under various sparsity ratios and patterns (unstructured and N:M sparsity). The results consistently demonstrate outstanding performance.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 129897"},"PeriodicalIF":5.5,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
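Two of the stated ingredients, fine-tuning without compromising the sparsity pattern and constraining the similarity of mapping matrices across tasks, can be sketched in isolation. This is an illustrative guess at the mechanics, not the DCFT algorithm: the abstract does not specify how the unified matrix is parameterized, so the masked update and the Frobenius-norm penalty below are assumptions.

```python
import numpy as np

def sparse_update(w, delta, mask):
    """Apply a fine-tuning update without breaking the pruned sparsity
    pattern: the update lands only where the weight is already nonzero."""
    return w + delta * mask

def similarity_penalty(r_new, r_old):
    """Continual-learning regularizer: a squared Frobenius-norm penalty
    keeping the new mapping matrix close to the one from prior tasks."""
    return float(np.linalg.norm(r_new - r_old) ** 2)
```

Adding `similarity_penalty` to the task loss discourages the mapping from drifting, which is one standard way to retain performance on earlier tasks.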
Neurocomputing, Pub Date: 2025-03-18, DOI: 10.1016/j.neucom.2025.130018
Yang Yu, Jiahao Wang, Weide Liu, Ivan Ho Mien, Pavitra Krishnaswamy, Xulei Yang, Jun Cheng
{"title":"Multimodal multitask similarity learning for vision language model on radiological images and reports","authors":"Yang Yu , Jiahao Wang , Weide Liu , Ivan Ho Mien , Pavitra Krishnaswamy , Xulei Yang , Jun Cheng","doi":"10.1016/j.neucom.2025.130018","DOIUrl":"10.1016/j.neucom.2025.130018","url":null,"abstract":"<div><div>In recent years, large-scale Vision-Language Models (VLM) have shown promise in learning general representations for various medical image analysis tasks. However, current medical VLM methods typically employ contrastive learning approaches that have limited ability to capture nuanced yet crucial medical knowledge, particularly within similar medical images, and do not explicitly consider the uneven and complementary semantic information contained in different modalities. To address these challenges, we propose a novel Multimodal Multitask Similarity Learning (M2SL) method that learns joint representations of image–text pairs and captures the relational similarity between different modalities via a coupling network. Our method also notably leverages the rich information in the text inputs to construct a knowledge-driven semantic similarity matrix as the supervision signal. We conduct extensive experiments for cross-modal retrieval and zero-shot classification tasks on radiological images and reports and demonstrate substantial performance gains over existing methods. 
Our method also accommodates low-resource settings with limited training data availability and has significant implications for enhancing VLM development.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 130018"},"PeriodicalIF":5.5,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143687111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2025-03-18, DOI: 10.1016/j.neucom.2025.129977
Ruyue Yang, Ding Wang, Menghua Li, Chengyu Cui, Junfei Qiao
{"title":"Enhancing offline reinforcement learning for wastewater treatment via transition filter and prioritized approximation loss","authors":"Ruyue Yang , Ding Wang , Menghua Li , Chengyu Cui , Junfei Qiao","doi":"10.1016/j.neucom.2025.129977","DOIUrl":"10.1016/j.neucom.2025.129977","url":null,"abstract":"<div><div>Wastewater treatment plays a crucial role in urban society, requiring efficient control strategies to optimize its performance. In this paper, we propose an enhanced offline reinforcement learning (RL) approach for wastewater treatment. Our algorithm improves the learning process: a transition filter discards low-performance transitions, and a prioritized approximation loss achieves the effect of prioritized experience replay while sampling uniformly. Additionally, a variational autoencoder is introduced to address the problem of distribution shift in offline RL. The proposed approach is evaluated on a nonlinear system and a wastewater treatment simulation platform, demonstrating its effectiveness in achieving optimal control. The contributions of this paper include the development of an improved offline RL algorithm for wastewater treatment and the integration of transition filtering and prioritized approximation loss. Evaluation results demonstrate that the proposed algorithm achieves lower tracking error and cost.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 129977"},"PeriodicalIF":5.5,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143687113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
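The two named components, filtering out low-performance transitions and weighting a uniformly sampled loss by priorities, can be sketched as follows. The quantile threshold and the priority exponent alpha are assumptions, not values from the paper.

```python
import numpy as np

def filter_transitions(returns, transitions, quantile=0.2):
    """Transition filter: drop the lowest-return transitions from the
    offline dataset before training."""
    thresh = np.quantile(returns, quantile)
    keep = returns >= thresh
    return [t for t, k in zip(transitions, keep) if k]

def prioritized_approximation_loss(td_errors, alpha=0.6):
    """Approximate prioritized experience replay under uniform sampling:
    weight each squared TD-error by its normalized priority |delta|^alpha
    instead of biasing the sampling distribution."""
    pri = np.abs(td_errors) ** alpha
    w = pri / pri.sum()
    return float(np.sum(w * td_errors ** 2))
```

Reweighting the loss rather than the sampler keeps the data pipeline simple, which matters for fixed offline datasets.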
Neurocomputing, Pub Date: 2025-03-18, DOI: 10.1016/j.neucom.2025.129964
Yang Li, Guoxiang Tong
{"title":"Multi-level feature splicing 3D network based on multi-task joint learning for video anomaly detection","authors":"Yang Li, Guoxiang Tong","doi":"10.1016/j.neucom.2025.129964","DOIUrl":"10.1016/j.neucom.2025.129964","url":null,"abstract":"<div><div>In video anomaly detection research, deep learning methods aim to identify anomalous events accurately and efficiently. However, due to the scarcity and diversity of anomaly samples, previous methods have not adequately taken into account important location and timing information. In addition, the excessive generalization ability of these models means that anomalies can also be reconstructed or predicted well. To address the above challenges, we propose a 3D network based on multi-level feature splicing with joint multi-task learning. The network builds on an autoencoder (AE) backbone. Firstly, we design a normal-sample training task and a Gaussian noise task from the spatial perspective to enhance the reconstruction of positive samples. A frame-skipping task and an inverse-sequence task are designed from the temporal perspective to suppress the reconstruction ability of negative samples. Secondly, we use multi-level feature splicing in the encoding and decoding process to equip the network with the ability to explore sufficient information at full scale. At the same time, we use an attention gating module to filter redundant features. The results show that our network is competitive with state-of-the-art methods. In terms of AUC, it achieves 99.3% on UCSD Ped2, 88.4% on CUHK Avenue, and 74.2% on ShanghaiTech Campus.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 129964"},"PeriodicalIF":5.5,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143686896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
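The two temporal pretext tasks named in the abstract, frame skipping and inverse sequence, are simple clip transformations. A minimal sketch over a clip array laid out as (T, H, W); the layout and stride value are assumptions:

```python
import numpy as np

def frame_skip(clip, stride=2):
    """Frame-skipping task: subsample frames along the time axis, forcing
    the model to cope with accelerated motion."""
    return clip[::stride]

def reverse_sequence(clip):
    """Inverse-sequence task: play the clip backwards, so temporal order
    must be modeled rather than memorized."""
    return clip[::-1]
```

Both transformations are label-free, so they add temporal supervision without requiring any anomaly annotations.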
Neurocomputing, Pub Date: 2025-03-17, DOI: 10.1016/j.neucom.2025.130013
Zihao Lu, Junli Wang, Changjun Jiang
{"title":"Domain-wise knowledge decoupling for personalized federated learning via Radon transform","authors":"Zihao Lu, Junli Wang, Changjun Jiang","doi":"10.1016/j.neucom.2025.130013","DOIUrl":"10.1016/j.neucom.2025.130013","url":null,"abstract":"<div><div>Personalized federated learning (pFL) customizes local models to address heterogeneous data across clients. One prominent research direction in pFL is model decoupling, where the knowledge of a global model is selectively utilized to assist local model personalization. Prior studies primarily use decoupled global-model parameters to convey this selected knowledge. However, due to the task-related knowledge-mixing nature of deep learning models, using these parameters may introduce irrelevant knowledge to specific clients, impeding personalization. To address this, we propose a domain-wise knowledge decoupling approach (pFedDKD), which decouples global-model knowledge into diverse projection segments in the representation space, meeting the specific needs of clients on heterogeneous local domains. A Radon transform-based method is provided to facilitate this decoupling, enabling clients to extract relevant knowledge segments for personalization. Besides, we provide a distillation-based back-projection learning method to fuse local-model knowledge into the global model, ensuring the updated global-model knowledge remains decouplable by projection. A theoretical analysis confirms that our approach improves generalization. 
Extensive experiments on four datasets demonstrate that pFedDKD consistently outperforms eleven state-of-the-art baselines, achieving an average improvement of 1.21% in test accuracy over the best-performing baseline.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"635 ","pages":"Article 130013"},"PeriodicalIF":5.5,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143644076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
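The Radon-transform idea, decoupling a feature map into projection segments along different directions, can be illustrated with a minimal discrete stand-in that projects along 0°, 45°, and 90° only. The real Radon transform integrates along many angles, and this is not the pFedDKD implementation; it only shows how one map splits into direction-indexed segments that clients could select from.

```python
import numpy as np

def radon_segments(feat):
    """Split a 2-D feature map into projection segments along three
    directions. Each segment is a 1-D profile of ray sums; together the
    segments partition the map's 'knowledge' by direction."""
    h, w = feat.shape
    p0 = feat.sum(axis=0)                                # vertical rays
    p90 = feat.sum(axis=1)                               # horizontal rays
    p45 = np.array([np.trace(feat, k) for k in range(-h + 1, w)])  # diagonals
    return {"0": p0, "45": p45, "90": p90}
```

Every projection conserves the map's total mass, which is the invariant that makes back-projection (fusing local knowledge back into the global model) well defined.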
Neurocomputing, Pub Date: 2025-03-17, DOI: 10.1016/j.neucom.2025.129979
Yuze Yang, Jiahang Liu, Yangyu Fu, Yue Ni, Yan Xu
{"title":"Shading- and geometry-aware lighting calibration network for uncalibrated photometric stereo","authors":"Yuze Yang, Jiahang Liu, Yangyu Fu, Yue Ni, Yan Xu","doi":"10.1016/j.neucom.2025.129979","DOIUrl":"10.1016/j.neucom.2025.129979","url":null,"abstract":"<div><div>Three-dimensional measurement provides essential geometric information for fault diagnosis and product optimization in intelligent manufacturing applications. Photometric stereo is a non-destructive 3D measurement technique that estimates the surface normals of objects using shading cues from images under different lighting conditions. However, the generalized bas-relief (GBR) ambiguity caused by unknown or varying lighting will significantly decrease measurement accuracy. To address this issue, we propose a shading- and geometry-aware lighting calibration network (SGLC-Net) to mitigate the inherent ambiguity and enhance surface normal estimation in uncalibrated photometric stereo by generating accurate lighting information. The proposed method iteratively optimizes lighting direction and intensity by leveraging self-generated shading and normal prior features. To further improve lighting estimation accuracy, we introduce collocated light into SGLC-Net to implicitly extract shading features from the images and produce an accurate rough lighting estimate. This rough lighting in turn yields reliable shading and normal prior features, which are used to refine the rough lighting into fine lighting. Experimental results indicate that the proposed method significantly outperforms most uncalibrated photometric stereo methods in lighting estimation on multiple real-world datasets. 
Furthermore, our method can seamlessly integrate with most uncalibrated photometric stereo methods to effectively enhance the accuracy of the surface normal estimation under unknown illumination.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 129979"},"PeriodicalIF":5.5,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143687913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
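Once lighting is calibrated, surface normals follow from the classic Lambertian least-squares step that underlies photometric stereo; the paper's contribution is producing the accurate lighting this step consumes. A standard sketch of that downstream step (not SGLC-Net itself):

```python
import numpy as np

def estimate_normals(lights, intensities):
    """Classic Lambertian photometric stereo: given calibrated light
    directions L of shape (n, 3) and per-pixel intensities I of shape
    (n, p), solve L @ (albedo * normal) = I by least squares, then
    normalize to unit surface normals of shape (3, p)."""
    g, *_ = np.linalg.lstsq(lights, intensities, rcond=None)
    norms = np.linalg.norm(g, axis=0, keepdims=True)
    return g / np.maximum(norms, 1e-12)  # guard against zero-albedo pixels
```

With at least three non-coplanar calibrated lights the system is well posed, which is exactly why errors in lighting calibration propagate directly into the recovered normals.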
Neurocomputing, Pub Date: 2025-03-17, DOI: 10.1016/j.neucom.2025.130025
Zhenjiang Du, Yan Zhang, Zhitao Liu, Guan Wang, Zeyu Ma, Ning Xie, Yang Yang
{"title":"SGCDiff: Sketch-Guided Cross-modal Diffusion Model for 3D shape completion","authors":"Zhenjiang Du , Yan Zhang , Zhitao Liu , Guan Wang , Zeyu Ma , Ning Xie , Yang Yang","doi":"10.1016/j.neucom.2025.130025","DOIUrl":"10.1016/j.neucom.2025.130025","url":null,"abstract":"<div><div>Shape completion aims to generate complete shapes based on partial observations. Most recent methods utilize existing information on 3D shapes for shape completion tasks, such as inputting a partial 3D shape into an encoder–decoder structure to obtain a complete 3D shape. Despite the recent rapid evolution of neural networks greatly improving the completion performance of 3D shapes, they usually generate deterministic results. However, the completed shape is inherently diverse, leading to the concept of multimodal shape completion, in which a single partial shape can correspond to multiple plausible complete shapes. Existing multimodal shape completion methods are typically unpredictable, which results in the generated complete shapes exhibiting randomness. To address the challenge of achieving a guided generation process for multimodal shape completion, we propose a novel sketch-based diffusion model. Our key designs encompass the following. We propose a novel diffusion-based framework that employs sketches as guidance to generate complete 3D shapes. Within the framework, we introduce a dual cross-modal attention module that ensures the generated results retain sufficient geometric detail. 
Experimental results indicate that our approach not only facilitates multimodal shape completion based on sketches but also achieves competitive performance in deterministic shape completion.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"636 ","pages":"Article 130025"},"PeriodicalIF":5.5,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143706082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
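The cross-modal attention at the heart of the guidance mechanism lets shape tokens attend to sketch tokens. A single-head NumPy sketch follows; the single-head form and projection shapes are assumptions (the paper's module is "dual", presumably attending in both directions), so this only illustrates the direction from sketch to shape.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(shape_tokens, sketch_tokens, wq, wk, wv):
    """One cross-modal attention head: shape tokens form the queries and
    sketch tokens the keys/values, injecting sketch guidance into the
    denoising features of the diffusion model."""
    q = shape_tokens @ wq
    k = sketch_tokens @ wk
    v = sketch_tokens @ wv
    att = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # scaled dot-product
    return att @ v
```

Each output row is a sketch-conditioned summary for one shape token, which is how the generated completion stays consistent with the user's sketch.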