VAEneu: a new avenue for VAE application on probabilistic forecasting
Alireza Koochali, Ensiye Tahaei, Andreas Dengel, Sheraz Ahmed
Applied Intelligence 55(6). Published 2025-02-27. DOI: 10.1007/s10489-024-06203-5
Open access PDF: https://link.springer.com/content/pdf/10.1007/s10489-024-06203-5.pdf

Abstract: This paper introduces VAEneu, a novel autoregressive method for multistep-ahead univariate probabilistic time series forecasting, designed to address the challenge of generating sharp and well-calibrated probabilistic forecasts without assuming a specific parametric form for the predictive distribution. VAEneu leverages the conditional VAE framework and optimizes the predictive distribution using the Continuous Ranked Probability Score (CRPS), a strictly proper scoring rule, as the loss function. This approach enables the model to learn flexible, sharp, and well-calibrated predictive distributions without requiring a tractable likelihood function. In a comprehensive empirical study, VAEneu is rigorously benchmarked against 12 baseline models across 12 datasets, demonstrating superior performance in both forecasting accuracy and uncertainty quantification. VAEneu provides a valuable tool for quantifying future uncertainties, and our extensive empirical study lays a foundation for future comparative work on univariate multistep-ahead probabilistic forecasting.

{"title":"STGFP: information enhanced spatio-temporal graph neural network for traffic flow prediction","authors":"Qi Li, Fan Wang, Chen Wang","doi":"10.1007/s10489-025-06377-6","DOIUrl":"10.1007/s10489-025-06377-6","url":null,"abstract":"<div><p>Accurate traffic flow prediction is crucial for the development of intelligent transportation systems aimed at preventing and mitigating traffic issues. We present an information-enhanced spatio-temporal graph neural network model to predict traffic flow, addressing the inefficient utilization of non-Euclidean structured traffic data. Firstly, we employ a multivariate temporal attention mechanism to capture dynamic temporal correlations across different time intervals, while a second-order graph attention network identifies spatial correlations within the network. Secondly, we construct two types of traffic topology graphs that comprehensively describe traffic flow features by integrating non-Euclidean traffic flow data, regional traffic status information, and node features. Finally, a multi-graph convolution neural network is designed to extract long-range spatial features from these traffic topology graphs. The spatio-temporal feature extraction module then combines these long-range spatial features with spatio-temporal features to fuse multiple features and improve prediction accuracy. Experimental results demonstrate that the proposed approach outperforms state-of-the-art baseline methods in predicting traffic flow performance.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixmamba-fewshot: mamba and attention mixer-based method with few-shot learning for bearing fault diagnosis","authors":"Nhu-Linh Than, Van Quang Nguyen, Gia-Bao Truong, Van-Truong Pham, Thi-Thao Tran","doi":"10.1007/s10489-025-06361-0","DOIUrl":"10.1007/s10489-025-06361-0","url":null,"abstract":"<div><p>In recent years, artificial intelligence, particularly machine learning and deep learning has ushered in a new era of technological advancements leading to significant progress across various domains. In the field of computer vision, deep learning has made substantial contributions, impacting everything from daily life to production and industry. When machines, rotating devices, and engines operate, bearing failures are inevitable. Our task is to accurately detect or diagnose these failures. However, a key challenge lies in the lack of sufficient data on bearing faults to train a model capable of delivering highly accurate diagnostic results. To address this issue, in this paper, we propose a new approach named MixMamba-Fewshot, leveraging few-shot learning and using a feature extraction module that integrates an attention mechanism called the Priority Attention Mixer and Mamba - a novel theory that has recently gained considerable attention within the research community. Using Mamba for vision-based feature extraction in classification tasks, particularly in few-shot learning is an innovative approach, and it has shown promising results in improving the accuracy of bearing fault diagnosis. When we tested our model on the datasets provided by Case Western Reserve University (CWRU) and the Paderborn University (PU) Bearing Dataset, we compared it with previously published models. Our proposed approach demonstrated a significant improvement in diagnostic accuracy and clearly outperformed existing approaches. Our code will be available at: https://github.com/linhthan216/MixMamba-Fewshot.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-local modeling of enhancer-promoter interactions, a correspondence on “LOCO-EPI: Leave-one-chromosome-out (LOCO) as a benchmarking paradigm for deep learning based prediction of enhancer-promoter interactions”","authors":"Michael A. Beer","doi":"10.1007/s10489-025-06378-5","DOIUrl":"10.1007/s10489-025-06378-5","url":null,"abstract":"<div><p>A recent paper by Tahir et al. (Appl Intell 55:71, 2024) in Applied Intelligence reported a computational model of enhancer promoter interactions without realizing that many of their conclusions were previously published in 2018. In addition to correcting this record, the authors appear to be unaware of an additional body of previous work on enhancer-promoter interactions, which can explain why their computational model performs poorly. We describe how the weak predictive power of their model is consistent with new insights gained from substantial recent progress in the area of detecting and modeling enhancer promoter interactions constrained by DNA looping, extrusion by cohesin, and CTCF.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Composed image retrieval: a survey on recent research and development","authors":"Yongquan Wan, Guobing Zou, Bofeng Zhang","doi":"10.1007/s10489-025-06372-x","DOIUrl":"10.1007/s10489-025-06372-x","url":null,"abstract":"<div><p>In recent years, composed image retrieval (CIR) has gained significant attention within the research community due to its excellent research value and extensive real-world applications. CIR allows modifying query images based on user-provided text descriptions, producing search results that better match users’ intent. This paper conducts a comprehensive and up-to-date survey of CIR research and its applications. We summarise recent advancements in CIR methodologies from these perspectives by breaking down a CIR system into four key processes-feature extraction, feature alignment, feature fusion, and image retrieval. We examine feature extraction, emphasizing deep learning techniques for images and text. As deep learning evolves, feature alignment increasingly integrates with other processes, encouraging us to categorize related methods into explicit and implicit approaches. From the perspective of feature fusion, we investigate advancements in image-text feature fusion techniques, categorizing them into 6 broad categories and 17 subcategories. We also summarize different architecture types and training loss functions for image retrieval. Additionally, we review standard benchmark datasets and evaluation metrics in CIR, presenting a comparative analysis of the accuracy of crucial CIR approaches. Finally, we put forward several critical yet underexplored issues in the field.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143496976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A local generation-mix cascade network for image translation with limited data
Yusen Zhang, Min Li, Yao Gou, Xianjie Zhang, Yujie He
Applied Intelligence 55(6). Published 2025-02-26. DOI: 10.1007/s10489-025-06379-4

Abstract: Image translation based on deep generative models often overfits when data are limited. Current methods address this problem with mix-based data augmentation. However, if latent features are mixed without considering semantic correspondences, the augmented samples may exhibit visible artifacts and mislead model training. In this paper, we propose the Local Generation-Mix Cascade Network (LogMix), a data augmentation strategy for image translation tasks with limited data. By cascading a local feature generation module and a mixing module, LogMix generates a reference feature bank; each local representation is mixed with its most similar bank entry to form a new intermediate sample. Furthermore, we design a semantic relationship loss, based on the mixed distance of latent features, that keeps the feature distributions of the generated and source domains consistent. LogMix effectively mitigates overfitting by learning to translate intermediate samples instead of memorizing the training data. Experimental results across various tasks demonstrate that, even with limited data, LogMix reduces image ambiguity and offers significant advantages in establishing realistic cross-domain mappings.

NVS-Former: A more efficient medical image segmentation model
Xiangdong Huang, Junxia Huang, Noor Farizah Ibrahim
Applied Intelligence 55(6). Published 2025-02-26. DOI: 10.1007/s10489-025-06387-4

Abstract: In the current field of medical image segmentation research, numerous Transformer-based segmentation models have emerged. However, these models often suffer from limitations in multi-scale feature extraction and struggle to capture local detail features and contextual information, thereby constraining their segmentation performance. This paper introduces a novel model for medical image segmentation, called NVS-Former, which comprises both an encoder and a decoder. The key innovation of NVS-Former lies in its redesigned core module during the encoding phase, which not only enhances feature extraction capabilities but also improves the capture of local detail information. Additionally, the decoder structure has been reengineered to further optimize the model's class prediction abilities. NVS-Former has demonstrated superior performance in tasks involving multi-organ, pulmonary detail, and cell segmentation. In various comparative experiments, it consistently outperformed state-of-the-art methods, highlighting its efficiency and stability in medical image segmentation.

Predicting hemodynamic parameters based on arterial blood pressure waveform using self-supervised learning and fine-tuning
Ke Liao, Armagan Elibol, Ziyan Gao, Lingzhong Meng, Nak Young Chong
Applied Intelligence 55(6). Published 2025-02-26. DOI: 10.1007/s10489-025-06391-8

Abstract: The arterial blood pressure waveform (ABPW) serves as a less invasive technique for evaluating hemodynamic parameters, offering lower risk than the more invasive pulmonary artery catheter (PAC) thermodilution method. Various studies suggest that deep learning models can predict hemodynamic parameters from ABPW. However, the scarcity of ground-truth data restricts the accuracy of these models, preventing them from gaining clinical acceptance. To mitigate this data and domain challenge, this work proposes SSHemo (Self-Supervised Hemodynamic model), a self-supervised generative learning model for hemodynamic parameter prediction. Specifically, SSHemo first leverages large amounts of unlabeled ABPW data to learn a representative embedding and then fine-tunes for the downstream task with a small amount of hemodynamic ground truth. To verify the effectiveness of SSHemo, we train the model on the publicly available VitalDB dataset and evaluate it on two public datasets, VitalDB and MIMIC. The experimental results reveal that SSHemo's regression mean absolute error (MAE) improved significantly from 1.63 L/min to 1.25 L/min when predicting cardiac output (CO). Its ability to track trends in CO changes meets clinical acceptance (radial limit of agreement (LOA) of ±25.56°, below the ±30° threshold). In addition, SSHemo demonstrates robust stability across various conditions and cohorts, as evidenced by subgroup analysis, analysis over varying systemic vascular resistance (SVR) ranges, and rapid-CO analysis, compared with the EV1000, one of the most widely used commercial devices. Computational analysis further underscores the model's value and potential for practical application in various settings.

{"title":"Predicting gas flow rates of wellhead chokes based on a cascade forwards neural network with a historically limited penetrable visibility graph","authors":"Youshi Jiang, Jingkai Hu, Xiyu Chen, Weiren Mo","doi":"10.1007/s10489-025-06365-w","DOIUrl":"10.1007/s10489-025-06365-w","url":null,"abstract":"<div><p>This study presents a novel hybrid model that combines the cascade forward neural network (CFNN) with a historical limited penetrable visibility graph (HLPVG) for accurate prediction of gas flow rates through wellhead chokes in shale gas production. The model addresses the challenges of complex, nonlinear relationships between multiple variables affecting gas flow, including liquid–gas ratio (LGR), upstream pressure, temperature, and choke bean size. Using 11,572 field production samples from shale gas fields in the southern Sichuan Basin, the CFNN-HLPVG model demonstrates superior predictive performance compared to the conventional methods. The HLPVG algorithm transforms time series data into a graph structure, enabling the extraction of rich temporal and topological features, whereas the CFNN captures the complex interactions between variables. The model achieves a mean absolute relative error (MARE) of 0.014, significantly outperforming traditional approaches, including the Gilbert-type correlation, support vector machine, and other neural network architectures. Sobol sensitivity analysis revealed that choke bean size has the greatest impact on gas flow prediction (37.7% first-order sensitivity), followed by upstream pressure (19.3%) and temperature (11.6%), whereas LGR has a minimal influence (0.6%). The model performs particularly well under normal operating conditions but shows decreased accuracy in extreme environments with high temperature and pressure. This research provides a novel approach to gas flow prediction in wellhead chokes, offering valuable insights for optimizing shale gas production operations while highlighting areas for future improvement in handling extreme conditions and multisource data integration.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 6","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143489632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Instructed fine-tuning based on semantic consistency constraint for deep multi-view stereo
Yan Zhang, Hongping Yan, Kun Ding, Tingting Cai, Yueyue Zhou
Applied Intelligence 55(6). Published 2025-02-25. DOI: 10.1007/s10489-025-06382-9

Abstract: Existing depth map-based multi-view stereo (MVS) methods typically assume that texture features remain consistent across viewpoints. However, lighting changes, occlusions, and weakly textured regions can make texture features inconsistent, posing challenges for feature extraction; as a result, relying solely on texture consistency does not always yield high-quality reconstructions. In contrast, the high-level semantic concepts corresponding to the same objects do remain consistent across viewpoints, which we define as semantic consistency. Since designing and training new MVS networks from scratch is costly and labor-intensive, we propose fine-tuning existing depth map-based MVS networks during the testing phase by incorporating semantic consistency constraints, improving reconstruction quality in regions with poor results. Exploiting the robust open-set detection and zero-shot segmentation capabilities of Grounded-SAM, we first use it to generate semantic segmentation masks for arbitrary objects in multi-view images from text instructions. These masks are then used to fine-tune pre-trained MVS networks: masks from different viewpoints are aligned to the reference viewpoint, and the depth maps are optimized with the proposed semantic consistency loss. Our method is a test-time approach adaptable to a wide range of depth map-based MVS networks, requiring adjustment of only a small number of depth-related parameters. Comprehensive experimental evaluation across different MVS networks and large-scale scenes demonstrates that our method effectively enhances reconstruction quality at a lower computational cost.
