Neurocomputing. Pub Date: 2025-07-15. DOI: 10.1016/j.neucom.2025.131009
Hwichang Jeong, Insung Kong, Yongdai Kim
{"title":"Learning deep generative models based on binomial log-likelihood","authors":"Hwichang Jeong , Insung Kong , Yongdai Kim","doi":"10.1016/j.neucom.2025.131009","DOIUrl":"10.1016/j.neucom.2025.131009","url":null,"abstract":"<div><div>Likelihood-based learning algorithms for deep generative models mostly use the Gaussian log-likelihood. One notable exception is the binomial log-likelihood used in the variational autoencoder; however, it is not commonly used in practice because it does not generalize well. In this paper, we reconsider the binomial log-likelihood for learning deep generative models and study its theoretical properties. We propose two modifications to the original binomial log-likelihood and derive the convergence rates of the corresponding maximum likelihood estimators. These theoretical results explain why the original binomial log-likelihood performs poorly. In addition, motivated by the modified binomial log-likelihood, we propose a parametric heterogeneous Gaussian log-likelihood, which is novel in learning deep generative models. By analyzing various benchmark image datasets, we show that the proposed parametric heterogeneous Gaussian log-likelihood outperforms the standard homogeneous Gaussian log-likelihood. Additionally, we provide several pieces of evidence to explain why the proposed heterogeneous Gaussian log-likelihood works better than others.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 131009"},"PeriodicalIF":5.5,"publicationDate":"2025-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144686695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing. Pub Date: 2025-07-14. DOI: 10.1016/j.neucom.2025.130996
Bo Liu, Boxu Zhou, Yanshan Xiao, Zhitong Wang, Baoqing Li, Shengxin He, Chenlong Ye, Fan Cao
{"title":"SMT-DL: A semi-supervised multi-task learning framework based on dictionary learning for robust feature sharing","authors":"Bo Liu , Boxu Zhou , Yanshan Xiao , Zhitong Wang , Baoqing Li , Shengxin He , Chenlong Ye , Fan Cao","doi":"10.1016/j.neucom.2025.130996","DOIUrl":"10.1016/j.neucom.2025.130996","url":null,"abstract":"<div><div>Multi-task learning (MTL) leverages shared representations across related tasks, facilitating the utilization of latent data information and enhancing classification performance. In complex learning scenarios, two major challenges frequently arise: limited labeled data and inefficient cross-task knowledge transfer. To address these issues, we propose a semi-supervised multi-task learning (SMT-DL) method based on dictionary learning. Specifically, we establish a dual dictionary architecture by innovatively combining dictionary learning with a multi-task coordination mechanism: (1) a semi-supervised synthetic dictionary that jointly reconstructs labeled and unlabeled data to capture potential cross-task features, and (2) an analytical dictionary that aligns sparse representations with discriminative decision boundaries. The framework incorporates three technical innovations: a block sparse regularization scheme that enforces feature sharing across tasks, a dual-space reconstruction mechanism that separates task-specific and shared representations, and a cross-task support vector synchronization strategy. In addition, we rigorously demonstrate the convergence of the proposed optimization algorithm. 
Extensive experimental results validate that the proposed SMT-DL approach outperforms existing methods in terms of robustness and classification performance.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 130996"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144657271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review of object tracking based on deep learning","authors":"Guochen Zhao , Fanyong Meng , Chengzhuan Yang , Hui Wei , Dawei Zhang , Zhonglong Zheng","doi":"10.1016/j.neucom.2025.130988","DOIUrl":"10.1016/j.neucom.2025.130988","url":null,"abstract":"<div><div>The rapid advancement of deep learning has led to a surge in the development of object-tracking algorithms. Given the diverse objectives, backbone networks, and application methodologies, this study aims to integrate the prevalent tracking approaches comprehensively. We propose a systematic classification scheme based on application scenarios and primary methods, accompanied by a thorough analysis and concise summaries of each category. This approach provides a broader coverage of tracking techniques, facilitating a quicker understanding of the domain for novice researchers. In addition, we present standardized evaluation metrics and widely used datasets, including cross-method performance comparisons of selected algorithms on identical benchmarks to enhance the reader’s contextual understanding. Finally, we offer a critical assessment of current limitations, practical recommendations, and forward-looking perspectives to guide future research directions.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 130988"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144678901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel pain sentiment detection system utilizing a PainCapsule model and textual facial patterns","authors":"Anay Ghosh , Saiyed Umer , Bibhas Chandra Dhara , Deepak Kumar Jain , Ranjeet Kumar Rout , Amir Hussain","doi":"10.1016/j.neucom.2025.130907","DOIUrl":"10.1016/j.neucom.2025.130907","url":null,"abstract":"<div><div>Patient sentiment analysis establishes an intricate relationship between pain management and sentiment analysis in delivering high-quality medical care. This work presents an efficient pain sentiment recognition system within a smart healthcare framework designed to assess patients’ pain levels by analyzing their facial expressions. The proposed system is implemented in four distinct phases. First, facial regions are detected using efficient face-detection techniques. In the second phase, the extracted facial regions undergo feature computation using advancements in deep learning techniques, including end-to-end and pre-trained convolutional neural networks (CNN) to capture complex and discriminative facial features associated with pain emotions. In the third phase, a novel PainCapsule model is introduced, which evaluates pain intensity by analyzing both macro- and microfacial expressions. This phase also employs attention networks, feature tuning, and transfer learning techniques to optimize the system’s performance. Finally, in the fourth phase, score fusion techniques are applied to the deep pain recognition models to enhance accuracy and robustness further. The system’s effectiveness is rigorously evaluated using two benchmark video datasets: the BioVid Heat Pain Dataset and the Multimodal Intensity Pain (MIntPAIN) database. 
Extensive experiments and comparative analysis with existing state-of-the-art methods reveal that the proposed system achieves an F1-score of 65.51% for BioVid and 58.31% for MIntPAIN datasets, outperforming other pain recognition systems, demonstrating its potential to advance pain sentiment recognition within smart healthcare frameworks.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"652 ","pages":"Article 130907"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144704792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing. Pub Date: 2025-07-14. DOI: 10.1016/j.neucom.2025.130987
Dongnian Jiang, Zhaiwen Wang, Huichao Cao, Dezhi Xu
{"title":"Imbalanced open set domain generalization network for sensor fault diagnosis","authors":"Dongnian Jiang , Zhaiwen Wang , Huichao Cao , Dezhi Xu","doi":"10.1016/j.neucom.2025.130987","DOIUrl":"10.1016/j.neucom.2025.130987","url":null,"abstract":"<div><div>In recent years, the technique of applying domain generalization methods to solve cross-domain fault diagnosis problems has received widespread attention in the industrial community, among which, the open-set domain generalization fault diagnosis method effectively copes with the occurrence of unknown fault states in the target domain. However, issues such as data imbalance where fault data are scarce and normal data are abundant during the long-term operation of industrial sensors, and boundary shifts caused by unknown faults occurring in the target domain, make it difficult for the existing open-set domain generalization techniques to achieve accurate decision-making on sample types. This paper therefore introduces the HSL-ARAN generalization network, which can be generalized to carry out unknown fault diagnosis under imbalanced data conditions. First, a hierarchical style learning network is designed to encourage the generation of samples with relatively rich feature information, to address the issue of class imbalance in the source domain. Then, adversarial training with uncertainty weighting is used to extract reliable domain-invariant representations, and the inter-class relationships are leveraged to determine appropriate class boundaries and rejection thresholds. Finally, a new local clustering method is employed to further enhance the reliability of the class boundaries, which enables the identification of new fault modes. 
The algorithm is tested on sensor data for a nickel flash furnace system, and the effectiveness and superiority of the HSL-ARAN diagnosis method are verified.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 130987"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144657061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing. Pub Date: 2025-07-14. DOI: 10.1016/j.neucom.2025.130985
Guangrui Guo, Jinyong Cheng
{"title":"Un-CNL: An uncertainty-based continual noisy learning framework","authors":"Guangrui Guo , Jinyong Cheng","doi":"10.1016/j.neucom.2025.130985","DOIUrl":"10.1016/j.neucom.2025.130985","url":null,"abstract":"<div><div>The goal of continual learning is to maintain model performance while adapting to new tasks and evolving data environments. This helps address catastrophic forgetting, a common issue in deep learning. However, challenges like human annotation errors and label biases introduce noisy labels into datasets, further intensifying catastrophic forgetting in neural networks. In response to these challenges, the concept of continual noisy learning (CNL) has emerged. While existing methods often rely on sample selection and replay strategies, they tend to focus solely on sample confidence, neglecting representativeness. To improve the reliability and representativeness of replayed samples, we propose a novel method called Un-CNL. This approach uses uncertainty purification techniques based on perturbed samples to separate data streams and select reliable samples for replay. Additionally, we apply CutMix data augmentation to enhance the representativeness of these samples. Subsequently, semi-supervised learning is employed for fine-tuning, combined with contrastive learning to handle the classification challenges posed by noisy data streams. 
We validated the effectiveness of Un-CNL through experiments on CIFAR-10 and CIFAR-100 datasets, demonstrating its superior performance compared to existing methods.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 130985"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144657293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing. Pub Date: 2025-07-14. DOI: 10.1016/j.neucom.2025.131001
Haixin Wang, Jian Yang, Jinjia Zhou
{"title":"Harmony score-guided inpainting: Iterative refinement for seamless image inpainting","authors":"Haixin Wang , Jian Yang , Jinjia Zhou","doi":"10.1016/j.neucom.2025.131001","DOIUrl":"10.1016/j.neucom.2025.131001","url":null,"abstract":"<div><div>Inpainting techniques often demand extensive model fine-tuning or the concatenation of latent vectors, which can be time-intensive and prone to overfitting. Such methods frequently lead to inconsistencies between the inpainted regions and the surrounding background, occasionally producing partially satisfactory results where some areas appear natural while others remain unrealistic. To address these limitations, we demonstrate that existing inpainting methods can sufficiently handle certain scenarios but may struggle with specific problematic patches. We propose an iterative enhancement approach guided by an Inpainting Harmony Score, which evaluates the coherence of the inpainted image. Our method selectively enhances only the poorly reconstructed patches, preserving their masks for subsequent inpainting iterations. The process is repeated, followed by a final blending step to ensure seamless integration between the inpainted region and the background. This approach improves the overall quality and consistency of inpainting results while minimizing the risks of overfitting and inefficiency.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 131001"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144657291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing. Pub Date: 2025-07-14. DOI: 10.1016/j.neucom.2025.130997
Muming Zhao, Guang Li, Piotr Koniusz, Chongyang Zhang, Yongshun Gong
{"title":"Beyond decoders: Learning prompt-aware features for few-shot object counting","authors":"Muming Zhao , Guang Li , Piotr Koniusz , Chongyang Zhang , Yongshun Gong","doi":"10.1016/j.neucom.2025.130997","DOIUrl":"10.1016/j.neucom.2025.130997","url":null,"abstract":"<div><div>Few-shot object counting involves estimating the quantity of objects from an arbitrary category in an image, given a few exemplars as visual prompts. This is typically achieved by matching image and exemplar features to establish a class-agnostic similarity map, which is used to regress a density map for the target class. Prevailing approaches primarily focus on improving the matching phase, designing various intricate decoders to perform sophisticated feature correlation. However, these methods still face challenges when initial features lack discriminative power. In this work, we shift our focus from decoder design to learning discriminative prompt-aware image features, enabling more effective similarity matching and density estimation. Specifically, we first establish a straightforward baseline that leverages a transformer-based backbone to enable direct interactions between images and exemplars. To ensure effective feature learning given limited exemplars, we further introduce a class-relevant prompts guided prediction module, which enhances the backbone’s ability to learn discriminative features by incorporating class-relevant visual cues and auxiliary training objectives. This module is designed to be auxiliary and can be discarded at inference, ensuring no additional computational overhead. 
Extensive experiments on FSC147 and CARPK demonstrate the effectiveness of our method, highlighting the efficacy of learning prompt-aware feature representations for few-shot counting.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 130997"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144662122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing. Pub Date: 2025-07-14. DOI: 10.1016/j.neucom.2025.130994
Chenkun Ge, Xiaojun Yu, Hao Zheng, Zeming Fan, Umair Muhammad, Jinna Chen, Perry Ping Shum
{"title":"ESC-DRKD: Enhanced skip connection-based direct reverse knowledge distillation for medical image anomaly detection","authors":"Chenkun Ge , Xiaojun Yu , Hao Zheng , Zeming Fan , Umair Muhammad , Jinna Chen , Perry Ping Shum","doi":"10.1016/j.neucom.2025.130994","DOIUrl":"10.1016/j.neucom.2025.130994","url":null,"abstract":"<div><div>Image anomaly detection has emerged as a prevalent research area in medical diagnosis, focusing on using models trained exclusively on normal samples to identify and locate anomalous images during testing. While many image anomaly detection (AD) models exhibit remarkable performance on industrial datasets, they frequently struggle to effectively detect anomalies in medical datasets characterized by complex distributions. This is because most methods are prone to overgeneralization, resulting in ineffective restoration of normal image features. To tackle this challenge, we introduce a novel approach called Enhanced Skip Connection-Based Direct Reverse Knowledge Distillation (ESC-DRKD), specifically designed to facilitate anomaly detection and localization in medical images. ESC-DRKD consists of pre-trained teacher encoders, trainable projection layers, and student decoders. First, pseudo-abnormal images are generated using the CutPaste method. By extracting multi-scale features from both normal and pseudo-abnormal images through the pre-trained teacher encoders, projection layers are employed to project the features of pseudo-abnormal images onto important information of normal features. These projected features are then added to the outputs of corresponding student decoders at each level through skip connections to enhance the restoration of normal features. Furthermore, the output of the final layer of the teacher encoders serves as the input to the student decoders, thereby preventing the loss of normal information. Extensive experiments on various public medical datasets are conducted to test the effectiveness of our proposed method. 
The results demonstrate that ESC-DRKD outperforms the state-of-the-art AD models on the five medical datasets, achieving an average improvement of over 3.0% AUROC for anomaly detection. Code is available at https://github.com/GE-123-cpu/ESC_DRKD</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 130994"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144657297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing. Pub Date: 2025-07-14. DOI: 10.1016/j.neucom.2025.130883
Jiali You, Haoran Li, Jiawen Deng, Wei Li, Yuanyuan He, Fuji Ren
{"title":"Hierarchical Reasoning Enhanced Few-Shot Multimodal Sentiment Analysis","authors":"Jiali You , Haoran Li , Jiawen Deng , Wei Li , Yuanyuan He , Fuji Ren","doi":"10.1016/j.neucom.2025.130883","DOIUrl":"10.1016/j.neucom.2025.130883","url":null,"abstract":"<div><div>Few-shot Multimodal Sentiment Analysis (FMSA) aims to predict sentiment with minimal labeled data by integrating multiple modalities, such as text and images. While recent FMSA methods have focused on transforming non-linguistic information (e.g., images) into text and leveraging language models to convert them into few-shot filling tasks, they still struggle to capture the latent sentiment information in image–text pairs. These limitations hinder their effectiveness, particularly in real-world applications where labeled data is scarce. To address these limitations, we propose a novel approach, Hierarchical Reasoning Enhanced Few-shot Multimodal Sentiment Analysis (HRE-FMSA), which consists of three main components: the Hierarchical Reasoning Framework (HRF), the Hierarchical Reasoning Representation Fusion Network (H2RF-Net), and label prediction. Concretely, the HRF module excavates latent sentiment information from image–text pairs at three levels: topic/aspect, opinion, and sentiment. Then, H2RF-Net integrates latent sentiment information with the original image–text pairs to generate a prompt, which is fed into a pre-trained Language Model to obtain the final sentiment type. 
In the experiment, we conducted comprehensive evaluations on three sentence-level datasets and two aspect-level datasets, demonstrating the effectiveness and applicability of HRE-FMSA.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"651 ","pages":"Article 130883"},"PeriodicalIF":5.5,"publicationDate":"2025-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144665754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}