{"title":"Cross-attention fusion and edge-guided fully supervised contrastive learning network for rail surface defect detection","authors":"Jinxin Yang, Wujie Zhou","doi":"10.1007/s10489-025-06314-7","DOIUrl":"10.1007/s10489-025-06314-7","url":null,"abstract":"<div><p>In recent years, there has been significant research focus on efficiently and accurately detecting defects on rail surfaces using computer vision. Utilizing depth information from the rail surface has emerged as an effective approach for detecting visually insignificant types of defects that are unique in nature. However, previous methods have typically overlooked the long-distance dependency between the two modalities when fusing them with conventional convolutional networks. Additionally, these methods have often relied on traditional cross-entropy loss for edge supervision without considering the intra- and inter-pixel relationships associated with edge features. To address these limitations, we propose a novel approach called CECLNet (cross-attention fusion and edge-guided fully supervised contrastive learning network) for rail surface defect detection (RSDD). The proposed CECLNet incorporates a module for inter-modal cross-attention fusion, which effectively explores the complementary information by considering the long-range relationship. Furthermore, we introduce a progressive aggregation-based multiscale feature interaction decoder to promote sufficient information interaction between multiscale features, thus facilitating the generation of final predictions. Finally, we propose a pixel-level fully supervised contrastive learning approach to enhance the efficiency of utilizing edge-assisted information. 
Extensive experiments conducted on the industrial NEU RGB-D RSDDS-AUG dataset demonstrate the superiority of our proposed CECLNet over 17 state-of-the-art methods.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to: Explainable cognitive decline detection in free dialogues with a Machine Learning approach based on pre-trained Large Language Models","authors":"Francisco de Arriba-Pérez, Silvia García-Méndez, Javier Otero-Mosquera, Francisco J. González-Castaño","doi":"10.1007/s10489-024-06169-4","DOIUrl":"10.1007/s10489-024-06169-4","url":null,"abstract":"","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-024-06169-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TADST: reconstruction with spatio-temporal feature fusion for deviation-based time series anomaly detection","authors":"Bin Yang, Tinghuai Ma, Huan Rong, Xuejian Huang, Yubo Wang, Bowen Zhao, Chaoming Wang","doi":"10.1007/s10489-025-06310-x","DOIUrl":"10.1007/s10489-025-06310-x","url":null,"abstract":"<div><p>Anomaly detection is crucial in time series analysis for identifying abnormal events. To address the limitations of traditional methods in integrating spatiotemporal correlations and modeling normal patterns, we propose a Time Series Anomaly Detection Model Based on Spatio-Temporal Feature Fusion (TADST). First, the Spatio-Temporal Feature Fusion Network (STF) combines temporal convolutional networks and graph attention influence networks to capture temporal dynamic dependencies and attribute correlations respectively, facilitating joint spatiotemporal feature modeling. Then, the Time Series Reconstruction Network (TSR) employs a multi-layer encoder-decoder architecture to learn the normal sample distribution and amplify discrepancies between reconstructed and anomalous data. Finally, the Anomaly Detection Mechanism (ADM) identifies anomalies by fitting the tail distribution of reconstruction deviations. When the anomaly score exceeds a predefined threshold, the mechanism updates the parameters of the Generalized Pareto Distribution, keeping the detection criteria adaptive. 
Experiments demonstrate that the proposed TADST achieves state-of-the-art results on five publicly available datasets.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reusability of Bayesian Networks case studies: a survey","authors":"Nikolay Babakov, Adarsa Sivaprasad, Ehud Reiter, Alberto Bugarín-Diz","doi":"10.1007/s10489-025-06289-5","DOIUrl":"10.1007/s10489-025-06289-5","url":null,"abstract":"<div><p>Bayesian Networks (BNs) are probabilistic graphical models used to represent variables and their conditional dependencies, making them highly valuable in a wide range of fields, such as radiology, agriculture, neuroscience, construction management, medicine, and engineering systems, among many others. Despite their widespread application, the reusability of BNs presented in papers that describe their application to real-world tasks has not been thoroughly examined. In this paper, we perform a structured survey on the reusability of BNs using the PRISMA methodology, analyzing 147 papers from various domains. Our results indicate that only 18% of the papers provide sufficient information to enable the reusability of the described BNs. This creates significant challenges for other researchers attempting to reuse these models, especially since many BNs are developed using expert knowledge elicitation. Additionally, direct requests to authors for reusable BNs yielded positive results in only 12% of cases. 
These findings underscore the importance of improving reusability and reproducibility practices within the BN research community, a need that is equally relevant across the broader field of Artificial Intelligence.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-025-06289-5.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-contextual stress prediction: Simple methodology for comparing features and sample domain adaptation techniques in vital sign analysis","authors":"Samson Mihirette, Enrique A. De la Cal, Qing Tan, Javier Sedano","doi":"10.1007/s10489-025-06277-9","DOIUrl":"10.1007/s10489-025-06277-9","url":null,"abstract":"<div><p>Stress significantly impacts individuals, particularly in professions like nursing and driving, leading to severe health risks and accidents. Accurate stress measurement is critical for effective interventions, yet research is hindered by incomplete datasets and inconsistent methodologies, slowing the development of reliable predictive models. This paper introduces a framework for cross-contextual stress prediction, enabling the generation of general stress prediction models adaptable to specific domain challenges. The methodology leverages two general daily life datasets and three domain-specific datasets, employing steps such as dataset selection, feature extraction, significant feature identification, feature preprocessing, fine-tuning, domain adaptation, and application to specific contexts. Through this framework, key vital signs were identified as significant predictors of stress, including electrocardiography (ECG), heart rate (HR), heart rate variability (HRV) - low frequency (LF), electrodermal activity (EDA), body temperature (TEMP), and skin conductance response (SCR). The experiments conducted include: 1) Utilizing HR and HRV-LF through domain adaptation from general to automobile driving datasets; 2) Applying EDA, HR, and TEMP from general to specific nurse activity datasets; and 3) Adapting ECG, HR, and TEMP from general to automobile driving datasets. Results demonstrate the potential of the proposed framework for cross-contextual stress prediction, with HR and HRV-LF identified as pivotal features. 
When applied to target datasets specific to stress scenarios, the model achieved a 62% F1 score, demonstrating the effectiveness of the feature-based Correlation Alignment (CORAL) technique combined with Random Forest models in transferring learned knowledge across domains. These findings highlight the robustness of the approach in adapting general stress prediction models to specific contexts, paving the way for real-world applications such as stress monitoring in driving and nursing during high-stress periods like COVID-19.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-025-06277-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143362123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concept agent network for zero-base generalized few-shot learning","authors":"Xuan Wang, Zhong Ji, Xiyao Liu, Yanwei Pang, Xuelong Li","doi":"10.1007/s10489-025-06331-6","DOIUrl":"10.1007/s10489-025-06331-6","url":null,"abstract":"<div><p>Generalized Few-Shot Learning (GFSL) aims to recognize novel classes with limited training samples without forgetting knowledge of auxiliary data (base classes). Most current approaches re-engage the base classes after initial training to balance the predictive bias between the base and novel classes. However, re-using the auxiliary data might not always be possible due to privacy or ethical constraints. Consequently, the <i>zero-base</i> GFSL paradigm emerges, where models trained on the base classes are directly fine-tuned on the novel classes without revisiting the auxiliary data, avoiding the re-balancing of prediction biases. We believe that solving this paradigm relies on a critical yet often overlooked issue: feature overlap between the base and novel classes in the embedding space. To tackle this issue, we propose the Concept Agent Network, a novel framework that interprets visual features as affinity features, thereby effectively diminishing feature overlap by aggregating feature embeddings of the novel classes according to their similarity with the base classes. Additionally, we present the Concept Catena Generator, which creates multiple concepts per base class, improving understanding of the feature distribution of the base classes and clarifying the relationships between the base and novel concepts. To prevent the catastrophic forgetting of the base classes when adapting to the novel ones, we propose an Active Training Regularization strategy, promoting the preservation of base class knowledge. Extensive experimental results on two benchmarks, <i>mini</i>-ImageNet and <i>tiered</i>-ImageNet, have demonstrated the effectiveness of our framework. 
The potential utility of our framework spans several real-world applications, including autonomous driving, medical image analysis, and real-time surveillance, where the ability to rapidly learn from a few examples without forgetting previously acquired knowledge is critical.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TITD: enhancing optimized temporal position encoding with time intervals and temporal decay in irregular time series forecasting","authors":"Jinquan Ji, Yu Cao, Yukun Ma, Jianzhuo Yan","doi":"10.1007/s10489-025-06293-9","DOIUrl":"10.1007/s10489-025-06293-9","url":null,"abstract":"<div><p>Multivariate Time Series (MTS) acquisition processes often exhibit irregularities, making accurate MTS forecasting challenging. Previous research focused on interpolation approaches to address data completeness in irregular MTS, but these approaches may introduce noise, thereby altering the feature distributions of irregular MTS. Recent research advocates embedding the missing temporal information through position encoding for forecasting irregular MTS. However, these position encodings are typically designed for text sequences and assume fixed time intervals, which leads to the loss or distortion of temporal information when they are applied to irregular MTS. Moreover, they struggle to capture the temporal dynamic information in irregular MTS. To address these challenges, we propose a novel approach called TITD (Time Interval and Temporal Decay), which utilizes time interval and temporal decay information to enhance irregular MTS forecasting. TITD optimizes position encoding to effectively capture both local time interval features and long-term temporal decay patterns, overcoming the limitations of static, fixed-interval position encoding in representing temporal dynamics. Simultaneously, TITD integrates multi-view input information from irregular MTS to enhance the representation learning of the relationships across different views, thereby achieving superior forecasting performance without interpolation. 
Extensive experiments on three real-world time series datasets have demonstrated that TITD provides significant improvements over state-of-the-art methods in irregular MTS forecasting.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361832","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 3D-CNN and multi-loss video prediction architecture","authors":"Ziru Qin, Qun Dai","doi":"10.1007/s10489-025-06328-1","DOIUrl":"10.1007/s10489-025-06328-1","url":null,"abstract":"<div><p>The achievements of deep learning in the sphere of computer vision have elevated video prediction to a prominent research focus. The prevailing trend in current deep learning endeavors is to pursue advanced optimization of model architectures and enhancement of their performance metrics. The task of video prediction is inherently complex, and so are most of the models proposed in the past. In this paper, we propose a novel, simple video prediction network structure based on a Three-Dimensional Convolutional Neural Network (3D-CNN) and multi-loss, abbreviated as ML3DVP. Our network model is based entirely on 3D-CNN. In contrast to approaches built on Convolutional Long Short-Term Memory (ConvLSTM), Recurrent Neural Network (RNN), Generative Adversarial Network (GAN) and its variants, we start from the most basic network structure to reduce complexity, thereby improving the speed of model prediction. In addition, most current models encounter quality problems such as insufficient clarity. To solve this problem, we introduce multiple losses for backpropagation. Using multiple quality evaluation indicators, namely Structural Similarity (SSIM) and Peak Signal-to-Noise Ratio (PSNR), as optimization objectives continuously improves the prediction quality during the training process. 
The evaluation of model complexity, parameter count, and predictive outcomes across four datasets substantiates that our proposed model has successfully attained the objectives of structural refinement and enhanced performance.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143361833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}