{"title":"Decoupling Neural Networks to Leverage Uniform Representation and Balance Personalization and Collaboration in Federated Learning","authors":"Zhangmin Huang, Pengcheng Wang, Shaojie Tang, Bo Lyu, Lingfang Zeng","doi":"10.1109/tnnls.2025.3586600","DOIUrl":"https://doi.org/10.1109/tnnls.2025.3586600","url":null,"abstract":"","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"5 1","pages":""},"PeriodicalIF":10.4,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144736825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online Adaptable Offline RL With Guidance Model.","authors":"Xun Wang,Jingmian Wang,Zhuzhong Qian,Bolei Zhang","doi":"10.1109/tnnls.2025.3589418","DOIUrl":"https://doi.org/10.1109/tnnls.2025.3589418","url":null,"abstract":"Reinforcement learning (RL) has emerged as a promising approach across various applications, yet its reliance on repeated trial-and-error learning to develop effective policies from scratch poses significant challenges for deployment in scenarios where interaction is costly or constrained. In this work, we investigate the offline-to-online RL paradigm, wherein policies are initially pretrained using offline historical datasets and subsequently fine-tuned with a limited amount of online interaction. Previous research has suggested that efficient offline pretraining is crucial for achieving optimal final performance. However, it is challenging to incorporate appropriate conservatism to prevent the overestimation of out-of-distribution (OOD) data while maintaining adaptability for online fine-tuning. To address these issues, we propose an effective offline RL algorithm that integrates a guidance model to introduce suitable conservatism and ensure seamless adaptability to online fine-tuning. Our rigorous theoretical analysis and extensive experimental evaluations demonstrate better performance of our novel algorithm, underscoring the critical role played by the guidance model in enhancing its efficacy.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"34 1","pages":""},"PeriodicalIF":10.4,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144701044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VB-Adapter: Variational Bayesian Adapter for Cross-Domain Speech Representation Learning.","authors":"Jing Zhao,Qimin Huang,Shanhu Wang,Shiliang Sun","doi":"10.1109/tnnls.2025.3589086","DOIUrl":"https://doi.org/10.1109/tnnls.2025.3589086","url":null,"abstract":"To leverage the abundant speech data available for pretraining, current models excel in generalization across diverse tasks. Nevertheless, real-world challenges emerge when addressing unfamiliar speech scenarios far from the pretrained speech, owing to the domain shift between pretraining (source) and fine-tuning (target) data. To overcome this barrier, we propose a variational Bayesian adapter (VB-Adapter) for cross-domain speech representation learning during fine-tuning. First, we establish a latent variable model to construct a desired posterior distribution after incorporating domain-specific knowledge to bridge the gap between the source and target domains. Then, an adaptive objective is presented to maximize the mutual information of the latent variables with and without domain-specific knowledge to facilitate model adaptation. Finally, we introduce contrastive learning on samples to optimize the lower bound of the above adaptive objective. Our experiments apply the VB-Adapter on transformers for dysarthric speech recognition (DSR) and the integration of Whisper-encoder and Llama for Mandarin speech recognition (MSR). The results reveal the effectiveness of VB-Adapter in modeling the uncertainties arising from domain shift and enhancing the robustness of speech representations.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"12 1","pages":""},"PeriodicalIF":10.4,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144700997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HCG: Streaming DCNN Accelerator With a Hybrid Computational Granularity Scheme on FPGA.","authors":"Wenjin Huang,Conghui Luo,Baoze Zhao,Han Jiao,Yihua Huang","doi":"10.1109/tnnls.2025.3587694","DOIUrl":"https://doi.org/10.1109/tnnls.2025.3587694","url":null,"abstract":"With the growth of field-programmable gate array (FPGA) hardware resources, streaming DCNN accelerators leverage interconvolutional-layer parallelism to enhance throughput. In existing streaming accelerators, convolution nodes typically adopt layer- or column-based tiling methods, where the tiled input feature map (Ifmap) encompasses all input channels. This approach facilitates the comprehensive calculation of the output feature map (Ofmap) and maximizes interlayer parallelism. The computational granularity, defined in this study as the calculated rows or columns of Ofmap based on each tiled Ifmap data, significantly influences on-chip Ifmap storage and off-chip weight bandwidth (BW). The uniform application of computational granularity across all nodes inevitably impacts the memory-BW tradeoff. This article introduces a novel streaming accelerator with a hybrid computational granularity (HCG) scheme. Each node employs an independently optimized computational granularity, enabling a more flexible memory-BW tradeoff and more effective utilization of FPGA resources. However, this hybrid scheme can introduce pipeline bubbles and increase system pipeline complexity and control logic. To address these challenges, this article theoretically analyzes the impact of computational granularity on individual computing nodes and the overall system, aiming to establish a seamless system pipeline without pipeline bubbles and simplify system design. Furthermore, the article develops a hardware overhead model and employs a heuristic algorithm to optimize computational granularity for each computing node, achieving optimal memory-BW tradeoff and higher throughput. Finally, the effectiveness of the proposed design and optimization methodology is validated through the implementation of a 3-TOPS ResNet-18 accelerator on the Alveo U250 development board under BW constraints of 25, 20, and 15 GB/s. Additionally, accelerators for 4-TOPS VGG-16, 4-TOPS ResNet-34, 5-TOPS ResNet-50, 3-TOPS MobileNetV1, 4-TOPS ConvNeXt-T, and 4-TOPS ResNeXt-50 are implemented, surpassing the performance of most existing works.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"65 1","pages":""},"PeriodicalIF":10.4,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144700991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Graph Reconstruction: Uniting Dual-Level Graph Structure With Graph Reinforcement Learning.","authors":"Dazi Li,Yanyang Bao,Xin Xu","doi":"10.1109/tnnls.2025.3585906","DOIUrl":"https://doi.org/10.1109/tnnls.2025.3585906","url":null,"abstract":"A combinatorial optimization problem is typically regarded as a 1-D sorting problem in most existing research. The representation ignores some information about the problem because of dimension compression. When applying reinforcement learning (RL) to this problem, convolutional neural networks (CNNs) used in conventional RL cannot directly extract the connection information between two elements in the feature matrix. A typical class of combinatorial optimization problems, the job shop scheduling problem (JSSP), is used in this article as an example. Considering the limitations in previous research, this article reexamines the task from the perspective of graph reconstruction and proposes a graph RL (GRL) method that combines a double deep Q-network (DDQN) and graph attention network (GAT) to achieve breakthroughs beyond the constraints of CNN performance. Moreover, a dual-level graph representation structure is constructed to comprehensively learn the features of scheduling information and overcome the difficulty of learning dynamic graphs. Experiments show that the quality of the obtained solution and generalization performance are both improved compared with models based on original deep RL (DRL) algorithms.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"34 1","pages":""},"PeriodicalIF":10.4,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144700992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Game-Theoretic Constrained Policy Optimization for Safe Reinforcement Learning.","authors":"Changxin Zhang,Xinglong Zhang,Yixing Lan,Hao Gao,Xin Xu","doi":"10.1109/tnnls.2025.3586603","DOIUrl":"https://doi.org/10.1109/tnnls.2025.3586603","url":null,"abstract":"Safe reinforcement learning (RL) aims to optimize the task performance with safety guarantees. One common modeling scheme to study safe RL problems is the constrained Markov decision process (CMDP). However, current safe RL methods within the CMDP framework face challenges in tradeoffs among various objectives and gradient conflicts of policy updating. To cope with these challenges, this article presents a novel safe RL approach called game-theoretic constrained policy optimization (GCPO). The proposed approach formulates the CMDP problem as a general-sum Markov game with multiple players, where a task player seeks to maximize the reward objective, while constraint players aim to minimize constraint objectives until they are fulfilled. By doing so, GCPO adopts the learning mode with multiple subpolicies, each aligned with a distinct objective, that collectively constitute the overall behavior of the agent. The learning convergence of the GCPO can be ensured with the contraction mapping to the Nash equilibrium. Furthermore, a novel dominant timescale update rule is presented for multiplayer policy learning to guarantee constraint satisfaction. The learning convergence and constraint satisfaction of GCPO are theoretically analyzed. Consequently, GCPO eliminates the necessity of tuning tradeoff parameters and mitigates gradient conflicts during multiobjective policy updating. Experimental results show that GCPO outperforms state-of-the-art safe RL algorithms in a quadrotor trajectory tracking task and various high-dimensional robot locomotion benchmarks. Moreover, GCPO exhibits robustness to diverse scales of task rewards and constraint costs without the need for intricate tradeoffs.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"14 1","pages":""},"PeriodicalIF":10.4,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144701344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OS-RRG: Observation State-Aware Radiology Report Generation With Balanced Diagnosis and Attention Intervention.","authors":"Honglong Yang,Hui Tang,Shanshan Song,Xiaomeng Li","doi":"10.1109/tnnls.2025.3589103","DOIUrl":"https://doi.org/10.1109/tnnls.2025.3589103","url":null,"abstract":"Radiology report generation (RRG) aims to automatically generate detailed textual descriptions and diagnoses for clinical radiography, alleviating radiologists' workloads, aiding inexperienced radiologists, and minimizing errors. RRG is challenging due to the need to generate coherent and clinically accurate multisentence reports that describe various medical conditions. Although previous diagnosis-guided methods achieve impressive diagnostic accuracy by explicitly converting the identified observation states (OSs) (e.g., positive, negative, and uncertain) to descriptions, these methods still struggle in accurate observation-state identification and establishing precise state-to-description alignment. These challenges largely stem from the two aspects of imbalance (interclass and intraclass) inherent in observation states. In this article, we introduce a novel framework, observation state-aware radiology report generator (OS-RRG), designed to improve both the identification of states and their alignment with clinical descriptions. Our approach includes a state-aware balancing diagnosis (SBD) module to address both interclass and intraclass imbalances, an issue that previous methods have overlooked, resulting in suboptimal identification performance. In addition, we propose a novel technique called state-guided attention intervention (SAI), which dynamically adjusts focus on critical diagnostic features through a targeted filtering and enhancement mechanism. Furthermore, we propose a task-specific learning paradigm that decouples the identification and alignment processes into independent pathways, significantly enhancing the overall performance. Experiments on the MIMIC-CXR and IU-Xray benchmarks demonstrate the superior diagnostic accuracy of our method, which outperforms existing state-of-the-art techniques. The code will be made publicly available at https://github.com/xmed-lab/OS_RRG.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"25 1","pages":""},"PeriodicalIF":10.4,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144701345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}