KGMV-net: Knowledge-guided multi-view network for audio-visual dysarthria severity assessment
Xiaokang Liu, Yudong Yang, Guorong Xu, Xiaoxia Du, Rongfeng Su, Nan Yan, Lan Wang
Knowledge-Based Systems, Vol. 330, Article 114609, published 2025-10-10. DOI: 10.1016/j.knosys.2025.114609

Abstract: The automatic assessment of dysarthria faces major challenges owing to the variability in neurological damage and clinical manifestations among patients. To address these issues, we propose KGMV-Net, a Knowledge-Guided Multi-View Network for automated audio-visual dysarthria assessment. KGMV-Net integrates pathological domain knowledge with multi-view deep representations across acoustic and visual modalities. Specifically, we introduce a feature fusion framework that extracts and combines dysarthria-relevant cues from different spatiotemporal perspectives. In the visual modality, the Knowledge-Guided Appearance-Motion (KGAM) module incorporates clinical priors to partition speech into four functional phases, enabling the modeling of articulatory dynamics through Persistence of Appearance (PA) flow and the identification of structural asymmetries in critical facial regions. In the audio modality, the Inter-Layer Multi-Scale Fusion (IMSF) module is built on a ResNet backbone and enhanced with Attentional Feature Fusion (AFF) to capture speech characteristics across multiple temporal and spectral scales. Cross-modal coherence is reinforced through the adaptive layer normalization (AdaLN)-based Cross Fusion (ALCF) module, which employs a dual-branch cross-attention mechanism with AdaLN to project heterogeneous features into a unified semantic space. Evaluations on the MSDM dataset show that KGMV-Net achieves state-of-the-art accuracy in predicting dysarthria severity, significantly surpassing existing benchmarks. These findings support KGMV-Net as a reliable, interpretable, and scalable framework for the objective clinical assessment of dysarthria.
AI-generated content in cross-domain applications: Research trends, challenges and propositions
Jianxin Li, Liang Qu, Taotao Cai, Zhixue Zhao, Nur Al Hasan Haldar, Aneesh Krishna, Xiangjie Kong, Flavio Romero Macau, Tanmoy Chakraborty, Aniket Deroy, Binshan Lin, Karen Blackmore, Nasimul Noman, Jingxian Cheng, Ningning Cui, Jianliang Xu
Knowledge-Based Systems, Vol. 330, Article 114634, published 2025-10-10. DOI: 10.1016/j.knosys.2025.114634

Abstract: Artificial Intelligence Generated Content (AIGC) has rapidly emerged with the capability to generate different forms of content, including text, images, videos, and other modalities, at a quality comparable to human-created content. As a result, AIGC is now widely applied across domains such as digital marketing, education, and public health, and has shown promising results by enhancing content-creation efficiency and improving information delivery. However, few studies explore the latest progress and emerging challenges of AIGC across different domains. To bridge this gap, this paper brings together 16 scholars from multiple disciplines to provide a cross-domain perspective on the trends and challenges of AIGC. Specifically, the contributions of this paper are threefold: (1) It first provides a broad overview of AIGC, spanning the training techniques of Generative AI, detection methods, and both the spread and use of AI-generated content across digital platforms. (2) It then introduces the societal impacts of AIGC across diverse domains, along with a review of existing methods employed in these contexts. (3) Finally, it discusses the key technical challenges and presents research propositions to guide future work. Through these contributions, this vision paper seeks to offer readers a cross-domain perspective on AIGC, providing insights into its current research trends, ongoing challenges, and future directions.
{"title":"A dynamically interactable framework with dual-channel security: GAN-based speech steganography for concealed dialogues","authors":"Xiaoyi Ge , Xiongwei Zhang , Yihao Li , Meng Sun","doi":"10.1016/j.knosys.2025.114618","DOIUrl":"10.1016/j.knosys.2025.114618","url":null,"abstract":"<div><div>Steganography is a technique that conceals secret messages in carriers to conceal communication behaviors. This paper builds a novel framework of spoken dialogue steganography to enhance the dynamic interaction process by hiding the communicator’s dialogues in other dialogues. In this framework, the dialogues of A and B will hide the dialogues of C and D. The cover speech of A and B is resistant to both main-channel and side-channel steganalysis. Furthermore, a speech steganography method using a GAN-based vocoder is proposed to be adaptable to different types of covers, which is called <em>DialogStego</em>. The method embeds the Mel spectrogram of the secret speech in the cover Mel spectrogram and generates high-quality stego speech using the vocoder. A specially designed decoder is utilized to extract the Mel spectrogram of the secret speech and reconstruct the secret speech with a pre-trained vocoder. Experimental results on typical vocoders HiFi-GAN and iSTFTNet show that generated stego speech and extracted secret speech achieve superior performance on quality and intelligibility. Moreover, security on side-channel steganalysis about content logic correctness and speaker consistency of the cover speech has been achieved. Security on main-channel steganalysis, which fails to distinguish stego speech from cover speech, has also been manifested for the proposed method. Additionally, the proposed method has been proven to hold a larger embedding capacity and faster efficiency than recently proposed baseline methods.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"330 ","pages":"Article 114618"},"PeriodicalIF":7.6,"publicationDate":"2025-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145363171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Federated causal structure learning with missing data
Jiaqi Shi, Xiaoling Huang, Xianjie Guo, Kui Yu, Chengxiang Hu, Peng Zhou
Knowledge-Based Systems, Vol. 330, Article 114601, published 2025-10-10. DOI: 10.1016/j.knosys.2025.114601

Abstract: Federated causal structure learning (CSL) is an emerging research direction that aims to discover causal relationships from decentralized data across multiple clients, while preserving data privacy. Existing federated CSL algorithms primarily focus on complete datasets and often overlook data-quality issues, such as missing data, which are common in real-world scenarios. Moreover, client diversity can destabilize federated CSL, and this challenge is further worsened by missing data. To address these issues, we propose FedImpCSL, a novel federated CSL method for effectively handling missing data. Our approach consists of two key components: (1) a local-to-global missing-data imputation strategy that reconstructs imputed and accurate datasets from missing samples, and (2) a dynamic client weighting and weighted aggregation strategy to address inter-client differences, enhancing CSL accuracy without using each client's original data. We demonstrate the effectiveness of FedImpCSL through comprehensive experiments on various types of datasets, showing its superior performance over existing federated CSL methods in handling missing-data scenarios.
{"title":"scDD: scRNA-seq dataset distillation in latent codes with single-step conditional diffusion generator","authors":"Zhen Yu , Jianan Han , Yang Liu , Qingchao Chen","doi":"10.1016/j.knosys.2025.114610","DOIUrl":"10.1016/j.knosys.2025.114610","url":null,"abstract":"<div><div>The single-cell RNA sequencing (scRNA-seq) technology has profiled hundreds of millions of human cells across <em>organs, diseases, developmental stages and perturbations</em>. However, the original scRNA-seq datasets are redundant and have an ever-increasing data scale, which pose significant challenges for cross-platform data sharing and scalable foundation model construction. To address this, we propose novel dataset distillation technology in scRNA-seq analysis tasks to distill/condense the original scRNA-seq dataset into a <em>synthetic, smaller and discriminative</em> dataset. Unfortunately, the synthetic datasets distilled by existing dataset distillation methods have inferior cross-architecture generalization and inter-class discriminability. In light of this, (1) We propose scDD, a scRNA-seq dataset distillation framework in latent codes, which distills the original dataset information into a compact latent space, and generates a synthetic dataset with cross-architecture generalization by avoiding direct disruption to gene expression values. Then, (2) We propose a single-step conditional diffusion generator named SCDG within the scDD framework, through high-fidelity generation and category-condition guidance of the generator, SCDG ensures that the generated synthetic dataset retains scRNA-seq data characteristics and inter-class discriminability. Finally, we propose a comprehensive and robust benchmark to evaluate the performance of scRNA-seq dataset distillation in different data analysis tasks. It is validated that our proposed method can achieve <span><math><mrow><mn>7.61</mn><mspace></mspace><mo>%</mo></mrow></math></span> absolute and <span><math><mrow><mn>15.70</mn><mspace></mspace><mo>%</mo></mrow></math></span> relative improvement over previous state-of-the-art methods on average across task. In particular, our method also achieves an average <span><math><mrow><mn>26.51</mn><mspace></mspace><mo>%</mo></mrow></math></span> absolute improvement in cross-architecture generalization.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"330 ","pages":"Article 114610"},"PeriodicalIF":7.6,"publicationDate":"2025-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145325451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gradient-based federated Bayesian optimization
Lin Yang, Junhua Gu, Qiqi Liu, Zhigang Zhao, Yunhe Wang, Yaochu Jin
Knowledge-Based Systems, Vol. 330, Article 114588, published 2025-10-09. DOI: 10.1016/j.knosys.2025.114588

Abstract: Bayesian optimization (BO) has evolved from traditional single-agent optimization to multi-agent collaborative optimization, known as federated BO, aiming to solve global optimization tasks such as federated hyperparameter tuning. Existing research on federated BO shares weight vectors, sampled from Gaussian processes approximated using random Fourier features, with a server for information aggregation. This line of approach helps protect the privacy of agents but may limit the performance of the algorithm. Unlike existing federated BO approaches, we propose to cluster agents according to their characteristics and to transmit the gradients of acquisition functions between the server and the agents for information aggregation. This allows for a more accurate representation of the overall landscape of the global acquisition function without explicitly constructing it. Moreover, we design a two-stage mechanism to infill the next query input based on the aggregated gradients. Specifically, multiple promising solutions are first suggested based on the aggregated gradients. Then, each agent selects the one with the best local acquisition function value as the newly infilled solution for real function evaluation. The resulting gradient-based federated BO, termed FGBO, has been demonstrated to be very competitive in tackling a set of benchmark functions and real-world problems in a privacy-preserving way.
UNGT: Ultrasound nasogastric tube dataset for medical image analysis
Zhaoshan Liu, Chau Hung Lee, Qiujie Lv, Nicole Kessa Wee, Lei Shen
Knowledge-Based Systems, Vol. 330, Article 114615, published 2025-10-09. DOI: 10.1016/j.knosys.2025.114615

Abstract: We develop a novel ultrasound nasogastric tube (UNGT) dataset to address the lack of public nasogastric tube datasets. The UNGT dataset includes 493 images gathered from 110 patients with an average image resolution of approximately 879 × 583. Four structures, encompassing the liver, stomach, tube, and pancreas, are precisely annotated. Furthermore, we propose a semi-supervised adaptive-weighting aggregation medical segmenter (AAMS) to address data limitation and imbalance concurrently. The introduced adaptive weighting approach tackles the severe class-imbalance challenge by regulating the loss across different categories as training proceeds. The presented multiscale attention aggregation block bolsters the feature representation by integrating local and global contextual information. With these components, the proposed AAMS can emphasize sparse or small structures and offers enhanced representation ability. We perform extensive segmentation experiments on our UNGT dataset, and the results show that AAMS outperforms existing state-of-the-art approaches to varying extents. In addition, we conduct comprehensive classification experiments across various state-of-the-art methods and compare their performance. The dataset and code are available at https://github.com/NUS-Tim/UNGT.
ICCR-Diff: Identity-preserving and controllable craniofacial reconstruction with diffusion models
Mingqin Zhang, Hongjie Wu, Zhengqing Zang, Jian Wang, Chaoqun Niu, Yuan Li, Jiancheng Lv
Knowledge-Based Systems, Vol. 330, Article 114554, published 2025-10-09. DOI: 10.1016/j.knosys.2025.114554

Abstract: Craniofacial reconstruction predicts craniofacial features from skull morphology to reconstruct the craniofacial surface. Although previous methods have achieved promising performance, they face three critical limitations: insufficient image quality, poor identity preservation, and difficulties with conditional control. To overcome these challenges, we propose a novel diffusion-based craniofacial reconstruction method that preserves identity across domain transfer. Our approach incorporates multiple modules to separately manage multimodal information including skull data, landmarks, texture features, and biometric information, yielding high-fidelity results under various constraints. Furthermore, by enabling flexible modification of biometric information through standardized text prompts, our method achieves fine-grained control while maintaining individual identity characteristics. Extensive experimental results demonstrate that our method outperforms existing approaches in image quality and identity retrieval, showcasing exceptional robustness, strong identity preservation, and enhanced editability. Our code is available at: https://github.com/mqzhang2024/ICCR.
Sample-efficient backtrack temporal difference deep reinforcement learning
Qi Liu, Pengbin Chen, Ke Lin, Kaidong Zhao, Jinliang Ding, Yanjie Li
Knowledge-Based Systems, Vol. 330, Article 114613, published 2025-10-09. DOI: 10.1016/j.knosys.2025.114613

Abstract: Deep reinforcement learning algorithms often require large amounts of training data, particularly in robotic control tasks. To address this limitation, we propose a sample-efficient backtrack temporal difference learning method that enhances target state-action (Q) value estimation. The proposed method dynamically prioritizes transitions based on their proximity to terminal states using backtrack sampling weights. This prioritization mechanism yields more accurate target Q-values, thereby improving the overall Q-value estimation precision. Furthermore, our analysis uncovers a novel link between curriculum learning and Bellman equation optimization. The proposed method is versatile, applicable to both discrete and continuous action spaces, and readily integrable with off-policy actor-critic algorithms. Extensive experiments show that the proposed method considerably reduces Q-value approximation errors and outperforms baselines across diverse benchmarks, achieving a 28% performance improvement in four discrete action-space tasks and a 78% gain in four continuous control tasks.
Multi-scale feature fusion network with temporal dynamic graphs for small-sample FW-UAV fault diagnosis
Guanjun Li, Haoyu Gui, Jianguang Lu, Xianghong Tang, Xiaoyu Gao
Knowledge-Based Systems, Vol. 330, Article 114605, published 2025-10-09. DOI: 10.1016/j.knosys.2025.114605

Abstract: With the extensive application of fixed-wing unmanned aerial vehicles (FW-UAVs), accurate fault diagnosis becomes crucial for flight safety and system reliability. Traditional fault diagnosis methods often require large datasets that are difficult to obtain in practice. To address this, we start from the spatio-temporal correlation characteristics and multi-dimensional heterogeneity of UAV flight data and propose a multi-scale feature fusion with temporal dynamic graph network (MFFTD) that enables efficient fault diagnosis using limited UAV flight data. In the spatial dimension, a multi-scale residual convolutional design captures feature representations at various levels. Furthermore, a global temporal dynamic graph models the topological dependencies between the feature representations. In addition, we introduce long short-term memory networks to capture long-term dependencies in the temporal dimension. For cross-domain joint learning in both the temporal and spatial dimensions, we propose a multi-head feature fusion module based on mutual information to address the issue of heterogeneity imbalance between feature representations. Experiments on four public datasets demonstrate that MFFTD improves detection accuracy by eight percentage points compared with the latest models under the 90 small-sample settings of multi-class tasks and significantly enhances generalization capability, offering superior decision support for UAV fault diagnosis. The code and data will be available at https://github.com/17992/MFFTD.