{"title":"Tensor multi-subspace learning for robust tensor-based multi-view clustering","authors":"Bing Cai , Gui-Fu Lu , Guangyan Ji , Yangfan Du","doi":"10.1016/j.knosys.2025.113476","DOIUrl":"10.1016/j.knosys.2025.113476","url":null,"abstract":"<div><div>Tensor-based multi-view clustering (TMVC) has garnered considerable attention for its efficacy in managing data that originate from multiple perspectives. However, the presence of noise in empirical datasets often undermines the reliability and robustness of the affinity matrices generated through these methods. To address this challenge, we introduce an innovative approach termed tensor multi-subspace learning (TMSL). Our methodology commences with the employment of a typical TMVC method to produce self-representation matrices for each view. Nevertheless, the affinity matrix derived from these self-representation matrices frequently falls short of the desired levels of dependability and robustness. To uncover the intrinsic architecture of the data within the tensor subspace, we harness the concept of tensor low-rank representation. This enables us to extract a higher-dimensional representation of multi-view data, thereby yielding a multi-subspace representation tensor that is both reliable and robust. These two stages are then seamlessly integrated into a unified framework and are resolved by employing the augmented Lagrangian algorithm. Notably, the TMSL method also serves as an effective post-processing strategy capable of being applied to various TMVC methods to augment their performance. 
Empirical evidence has established that TMSL outperforms other contemporary methods, and the post-processing strategy has proven to be an effective unified approach that can be extended to other TMVC methods.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"318 ","pages":"Article 113476"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143835321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emotion-Assisted multi-modal Personality Recognition using adversarial Contrastive learning","authors":"Yongtang Bao , Yuzhen Wang , Yutong Qi , Qing Yang , Ruijun Liu , Liping Feng","doi":"10.1016/j.knosys.2025.113504","DOIUrl":"10.1016/j.knosys.2025.113504","url":null,"abstract":"<div><div>Multi-modal personality recognition integrates text, audio, and video information to accurately identify personality traits, offering significant value in fields like human–computer interaction. However, existing methods face challenges in feature extraction, noise removal, and modal alignment, which impact recognition accuracy and model robustness. To address these issues, we propose an <strong>E</strong>motion-<strong>A</strong>ssisted multi-modal <strong>P</strong>ersonality <strong>R</strong>ecognition model using adversarial <strong>C</strong>ontrastive learning (EAPRC). EAPRC leverages text, audio, and image data, incorporating emotional information to enhance recognition accuracy and robustness through adversarial training. The model reduces inter-modal noise using adversarial sample generation and employs joint class propagation contrastive learning to extract discriminative feature representations. For emotion-based assistance, EAPRC uses emotion feature-guided fusion and emotion score decision fusion to fully exploit the correlation between emotions and personality traits. This further improves the accuracy and stability of multi-modal personality recognition.
Experimental results on the ChaLearn First Impressions and ELEA datasets demonstrate that EAPRC performs effectively, validating its capability in multi-modal personality recognition tasks.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"317 ","pages":"Article 113504"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143824364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dp-M3D: Monocular 3D object detection algorithm with depth perception capability","authors":"Peicheng Shi , Xinlong Dong , Runshuai Ge , Zhiqiang Liu , Aixi Yang","doi":"10.1016/j.knosys.2025.113539","DOIUrl":"10.1016/j.knosys.2025.113539","url":null,"abstract":"<div><div>Considering the limitations of monocular 3D object detection in depth information and perception ability, we introduce a novel monocular 3D object detection algorithm, Dp-M3D, equipped with depth perception capabilities. To effectively model long-range feature dependencies during the fusion of depth maps and image features, we introduce a Transformer Feature Fusion Encoder (TFFEn). TFFEn integrates depth and image features, enabling more comprehensive long-range feature modeling. This enhances depth perception, ultimately improving the accuracy of 3D object detection. To enhance the detection ability of truncated objects at the edges of an image, we propose a Feature Enhancement method based on Deformable Convolution (FEDC). FEDC leverages depth confidence guidance to determine the deformation offset of the 3D bounding box, aligning features more effectively and improving depth perception. Furthermore, to address the issue of anchor box ranking, where candidate boxes with accurate depth predictions but low classification confidence are suppressed, we propose a Depth-perception Non-Maximum Suppression (Dp-NMS) algorithm. Dp-NMS refines the selection process by incorporating the product of classification confidence and depth confidence, ensuring that candidate boxes are ranked effectively and the most suitable detection box is retained. Experimental results on the challenging KITTI 3D object detection dataset demonstrate that the proposed method achieves <span><math><mrow><mi>A</mi><msub><mi>P</mi><mrow><mn>3</mn><mi>D</mi></mrow></msub></mrow></math></span> scores of 23.41 %, 13.65 %, and 12.91 % in the easy, moderate, and hard categories, respectively. 
Our approach outperforms state-of-the-art monocular 3D object detection algorithms based on image and image-depth map fusion, with particularly significant improvements in detecting edge-truncated objects.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"318 ","pages":"Article 113539"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143835320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deepnet-based surgical tools detection in laparoscopic videos","authors":"Praveen SR Konduri , G Siva Nageswara Rao","doi":"10.1016/j.knosys.2025.113517","DOIUrl":"10.1016/j.knosys.2025.113517","url":null,"abstract":"<div><div>Recently, deep learning has driven significant advances in image classification, especially in medical image (MI) processing. Surgical Data Science (SDS) has developed as a scientific research field that aims to improve patient health outcomes. Laparoscopic videos are a highly significant information source integral to minimally invasive surgeries. Recognizing surgical tools in these videos has therefore attracted growing interest. Most existing research addresses single-tool detection, while multiple-tool recognition remains underexplored. Multiple-tool recognition poses numerous challenges, including diverse lighting conditions, the appearance of multiple instruments in different representations, blood on tissue, etc. Moreover, the detection speed of existing learning methods is low because of inherent complexity and inefficient handling of large volumes of data. To address these challenges, this research introduces DeepNet-Tool, a novel model for automatic multi-tool classification in laparoscopic videos. This paper focuses on solving the spatial-temporal issues in detecting Surgical Tools (STs). The proposed model is implemented in Python, and the overall accuracy is 97.36 % with the Cholec80 dataset, 98.67 % with the EndoVis dataset, 99.73 % on EndoVis and 98.67 % on LapGyn4, respectively. Experimental outcomes show that DeepNet-Tool is more effective than other deep learning methods on the ST classification task.
Thus, the proposed model has revealed the potential for clinical use in accurate ST classification.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"318 ","pages":"Article 113517"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143874099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge-embedded large language models for emergency triage","authors":"Qingyang Shen , Xiaozhi Zhang , Haomin Ren , Quan Guo , Zhang Yi","doi":"10.1016/j.knosys.2025.113431","DOIUrl":"10.1016/j.knosys.2025.113431","url":null,"abstract":"<div><div>Emergency departments (EDs) are crucial to healthcare but face persistent overcrowding. The Emergency Severity Index (ESI) triage system is vital for prioritizing patients based on acuity and resource needs but relies heavily on the subjective judgment of medical staff, leading to inconsistencies. This study developed a Sequential Domain and Task Adaptation (SDTA) framework for enhancing ED triage accuracy and consistency using large language models (LLMs). By training LLMs on clinical data and ESI-specific tasks, we significantly improved their performance compared to traditional prompt-engineered models, achieving accuracy levels comparable to or exceeding those of experienced emergency physicians. Notably, the fine-tuned models achieved high accuracy and perfect recall for high-risk cases. These findings highlight the potential of adapted LLMs to standardize triage decisions and reduce variability, thus offering a solution to alleviate overcrowding and enhance patient care outcomes.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"318 ","pages":"Article 113431"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143850650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semi-supervised high-uncertainty deep canonical variate analysis for fault diagnosis in blast furnace ironmaking","authors":"Yuelin Yang , Chunjie Yang , Xiongzhuo Zhu , Hanwen Zhang , Haifeng Zhang , Zhiqi Su , Siwei Lou","doi":"10.1016/j.knosys.2025.113454","DOIUrl":"10.1016/j.knosys.2025.113454","url":null,"abstract":"<div><div>The blast furnace ironmaking process (BFIP) is of paramount importance in the steel industry. Reliable fault diagnosis for BFIP is crucial to ensure production safety, improve efficiency and quality, reduce costs, and maximize resource utilization. However, establishing effective fault diagnosis models is hindered by challenges including non-linearity, dynamics, widespread noise, and the scarcity of labeled data alongside an abundance of unlabeled data. To address these issues, this paper proposes a new fault diagnosis method for BFIP called semi-supervised high-uncertainty deep canonical variate analysis (SHDCVA). The proposed algorithm consists of three main parts: (1) high-uncertainty nonlinear dynamic feature capture, (2) robust semi-supervised framework construction, and (3) model solving and parameter optimization. Firstly, a high-uncertainty deep canonical variate representation method is proposed from a probabilistic perspective, which can capture high-uncertainty nonlinear dynamic characteristics. The high-uncertainty property can effectively deal with data noise and enhance the reliability of the downstream fault diagnosis model. Moreover, this paper proposes a robust semi-supervised classification framework that can efficiently utilize limited labeled samples and a large number of unlabeled samples. The supervised part controls the release of labeled samples via the training signal annealing (TSA) method to prevent overfitting, while the unsupervised part enforces model smoothing by applying adversarial perturbations to enhance robustness.
Subsequently, an efficient computational method is devised to generate adversarial perturbations and the overall objective is constructed. Finally, the effectiveness of SHDCVA is confirmed through a practical case study utilizing genuine BFIP data.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"317 ","pages":"Article 113454"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143830130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MESN: A multimodal knowledge graph embedding framework with expert fusion and relational attention","authors":"Ban Tran , Thanh Le","doi":"10.1016/j.knosys.2025.113541","DOIUrl":"10.1016/j.knosys.2025.113541","url":null,"abstract":"<div><div>Knowledge graph embedding is essential for knowledge graph completion and downstream applications. However, in multimodal knowledge graphs, this task is particularly challenging due to incomplete and noisy multimodal data, which often fails to capture semantic relationships between entities. While existing methods attempt to integrate multimodal features, they frequently overlook relational semantics and cross-modal dependencies, leading to suboptimal entity representations. To address these limitations, we propose MESN, a novel multimodal embedding framework that integrates relational and multimodal signals through semantic aggregation and neighbor-aware attention mechanisms. MESN selectively extracts informative multimodal features via adaptive attention and expert-driven learning, ensuring more expressive entity embeddings. Additionally, we introduce an enhanced ComplEx-based scoring function, which effectively combines structured graph interactions with multimodal information, capturing both relational and feature diversity. Extensive experiments on standard multimodal datasets confirm that MESN significantly outperforms baselines across multiple evaluation metrics. 
Our findings highlight the importance of relational guidance in multimodal embedding tasks, paving the way for more robust and semantically-aware knowledge representations.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"318 ","pages":"Article 113541"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143839245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameterized data-free knowledge distillation for heterogeneous federated learning","authors":"Cheng Guo , Qianqian He , Xinyu Tang , Yining Liu , Yingmo Jie","doi":"10.1016/j.knosys.2025.113502","DOIUrl":"10.1016/j.knosys.2025.113502","url":null,"abstract":"<div><div>Knowledge distillation has emerged as a widely adopted and effective method for addressing two challenges of heterogeneous federated learning: data heterogeneity causes client drift, which slows model convergence and degrades model accuracy; and the personalized requirements of heterogeneous clients, which a single global model cannot satisfy, are ignored. However, most existing knowledge distillation-based federated learning schemes are constrained by two fundamental limitations: they rely on public datasets for knowledge distillation, an impractical assumption in real-world scenarios; and the model personalization process employs a unified, redundant teacher model, which conflicts with the diverse data distributions of heterogeneous clients. Therefore, we propose a parameterized data-free knowledge distillation scheme that addresses the impractical dependency on public datasets and the static single-knowledge-transfer bottleneck through global-view knowledge extraction without public datasets and an adaptive personalized teacher model. Specifically, the server learns a conditional distribution to extract knowledge about the global view of ground-truth data distributions and then uses the acquired knowledge as an inductive bias to enhance the generalization performance of local models. Additionally, the server calculates the knowledge contribution of each local model based on the similarity of the average data representation and aggregates a personalized teacher model that contains more positive transfer knowledge for each client.
Experimental validation shows that the proposed scheme improves local test accuracy by up to 69.55%, 47.56%, and 18.76% on the Mnist, EMnist, and CelebA datasets, respectively, while reducing communication rounds across varying degrees of data heterogeneity compared to state-of-the-art schemes.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"317 ","pages":"Article 113502"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143834177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-domain UAV pose estimation: A novel attempt in UAV visual localization","authors":"Wenhao Lin , Tao Liu , Kan Ren, Qian Chen","doi":"10.1016/j.knosys.2025.113449","DOIUrl":"10.1016/j.knosys.2025.113449","url":null,"abstract":"<div><div>With the rapid advancement of depth estimation algorithms and continuous improvements in devices such as LiDAR and depth cameras, the acquisition of high-quality 3D models has become increasingly accessible. This progress opens up new opportunities for leveraging cross-domain matching between images and point clouds to estimate the pose of Unmanned Aerial Vehicles (UAVs) for visual localization. In this context, we propose a novel cross-domain descriptor that facilitates the fusion and matching of features across modalities. Building upon this approach, we designed a dual-branch UAV localization pipeline that incorporates an object detection strategy to extract more reliable feature points from the scene. Additionally, we constructed two new datasets specifically tailored for UAV-based aerial applications. The first dataset is manually annotated and focuses on training and evaluating object detection models from an aerial perspective, while the second dataset contains approximately 1.7 million 2D-3D correspondences from diverse scenarios, offering a rich collection of training and evaluation samples. Extensive experiments on public UAV datasets demonstrate that, compared to existing descriptors, our method not only achieves superior pose estimation accuracy through a coarse-to-fine image matching strategy but also enables robust pose estimation by directly matching images and point clouds to obtain accurate 2D-3D correspondences. Moreover, the incorporation of object detection strategies significantly enhances pose estimation accuracy and demonstrates increased resilience to interference in complex environments. 
Our datasets and code will be publicly available at <span><span>https://github.com/lwhhhh13/Cross-Domain-UAV-Pose-Estimation</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"317 ","pages":"Article 113449"},"PeriodicalIF":7.2,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143830199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Federated feature reconstruction with collaborative star networks","authors":"Yihong Zhang , Yuan Gao , Maoguo Gong, Hao Li, Yuanqiao Zhang, Sijia Zhang","doi":"10.1016/j.knosys.2025.113463","DOIUrl":"10.1016/j.knosys.2025.113463","url":null,"abstract":"<div><div>Federated learning provides a secure platform for sharing sensitive data, yet it imposes stringent requirements on that data. Clients with non-IID data often cannot fully benefit from the convenience it offers. When clients possess divergent feature sets, retaining only the common features is a prevalent yet suboptimal practice. This paper proposes a novel omnidirectional federated learning framework that employs a Star collaboration network designed to leverage independent information from client nodes for feature reconstruction of other clients. It establishes an approximate distribution network, reinforcing feature correlations while overcoming the data isolation seen in traditional federated learning. Additionally, homomorphic encryption is utilized to ensure data security throughout the transmission process. Experimental evaluations on structured datasets demonstrate that the reconstructed prediction results closely approximate those under the condition of complete data, confirming the effectiveness of the Star network in data completion and multi-party prediction scenarios.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"318 ","pages":"Article 113463"},"PeriodicalIF":7.2,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143843917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}