Jin Fan , Zhanwen Liu , Yong Fang , Zeyu Huang , Yang Liu , Shan Lin
{"title":"Multi-class Agent Trajectory Prediction with Selective State Spaces for autonomous driving","authors":"Jin Fan , Zhanwen Liu , Yong Fang , Zeyu Huang , Yang Liu , Shan Lin","doi":"10.1016/j.engappai.2025.111027","DOIUrl":"10.1016/j.engappai.2025.111027","url":null,"abstract":"<div><div>Understanding and predicting multi-class agents’ movement has become more critical and challenging in diverse applications such as autonomous driving and urban intelligent monitoring. Current research mainly focuses on the motion trajectory of single-class agents. However, due to real traffic scenarios’ complexity and interactive behaviors’ variability, the motion patterns displayed by various classes of agents show inherent randomness. In this paper, inspired by the linear-time sequence model Mamba, we propose a Multi-class Agent Trajectory Prediction with Selective State Spaces (MTPSS) method to model the interaction between different agents and better predict the trajectory of an individual. Specifically, MTPSS models relationships in both the temporal and spatial dimensions. When encoding the spatial correlation within the trajectory graph, we construct a category-based sorting approach, which places large-size category nodes last to enhance contextual access. Then the sorted nodes are bi-directionally scanned through Mamba blocks, which makes the model more robust to permutations. In the temporal dimension, considering the highly dynamic nature of rapidly moving agents, we exploit Mamba’s strong performance on sequential data to conduct temporal scans that capture long-range temporal dependencies. Finally, to compute physically feasible trajectories, MTPSS employs a Neural Ordinary Differential Equation to smooth the predicted trajectory of the agent. We conducted extensive experiments on two publicly available traffic datasets and compared our method with state-of-the-art methods. 
Quantitative experiments show that our performance metrics are superior to state-of-the-art methods, and qualitative experiments demonstrate that the predicted trajectories have good diversity, which shows its potential in real-world traffic scenarios.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 111027"},"PeriodicalIF":7.5,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fan Zhang , Min Wang , Lin Li , Yepeng Liu , Hua Wang
{"title":"Probabilistic intervals prediction based on adaptive regression with attention residual connections and covariance constraints","authors":"Fan Zhang , Min Wang , Lin Li , Yepeng Liu , Hua Wang","doi":"10.1016/j.engappai.2025.111013","DOIUrl":"10.1016/j.engappai.2025.111013","url":null,"abstract":"<div><div>This paper introduces a novel prediction interval method called Adaptive Regression with Attention Residual Connection and Covariance Constraint (AR-ARCC). By integrating Monte Carlo and Bayesian methods, we leverage the strengths of both to achieve a more flexible and accurate method for generating prediction intervals. Additionally, through the optimization of the loss function, introduction of penalty terms, and improvement of mean squared error calculations, the model’s performance in interval prediction tasks is enhanced. Finally, the integration of an interactive channel heterogeneous self-attention module, combined with residual blocks, enhances the modeling capability of the neural network. The comprehensive application of these methods results in superior performance of the model in handling uncertainty and local variations.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 111013"},"PeriodicalIF":7.5,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
K. Jayasakthi Velmurugan , H. Faheem Nikhat , K. Suresh , S. Hemavathi , V. Kavitha
{"title":"Applying convolutional attention mechanisms and Human Memory Search for effective English-Urdu translation","authors":"K. Jayasakthi Velmurugan , H. Faheem Nikhat , K. Suresh , S. Hemavathi , V. Kavitha","doi":"10.1016/j.engappai.2025.111043","DOIUrl":"10.1016/j.engappai.2025.111043","url":null,"abstract":"<div><div>In recent decades, machine translation has become a prominent area of research, with the primary objective of overcoming language barriers. Early approaches primarily centered on word-for-word translation between source and target languages. However, there has been a shift towards data-driven models, such as neural machine translation and statistical methods, driven by advancements in Artificial Intelligence (AI), communication and computing technologies. This paper introduces a novel Convolutional Attention Mechanism-based Human Memory Search (CAM-HMS) algorithm for translating English into Urdu, to achieve high-quality and effective machine translation. The proposed model consists of several key phases, including pre-processing, sentence padding, word embedding, encoding, decoding, and target text generation. A new spider web-based search strategy is also incorporated to enhance the translation process in neural machine translation. The performance is evaluated using a UMC005 English-Urdu dataset, Parallel corpus for English & Urdu language, and the Roman Urdu to English Translation Dataset. Various automatic evaluation metrics such as the Bilingual Evaluation Understudy (BLEU), National Institute of Standards and Technology (NIST), Word Error Rate (WER), Accuracy, Precision, Recall, and F-Measure are used to assess the model's efficiency, and its output is compared to that of Google Translate. 
The proposed model achieves an average BLEU score of 82.14 %, NIST of 79.51 %, WER of 2.77 %, Accuracy of 98.99 %, Precision of 98.95 %, Recall of 98.90 %, and F-Measure of 98.92 %, demonstrating its effectiveness in producing high-quality machine translation results.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"155 ","pages":"Article 111043"},"PeriodicalIF":7.5,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144083758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huangyuan Wu , Bin Li , Lianfang Tian , Chao Dong , Wenzhi Liao
{"title":"A unified region and concept-level explainable artificial intelligence method for explainability and active learning of defect segmentation model","authors":"Huangyuan Wu , Bin Li , Lianfang Tian , Chao Dong , Wenzhi Liao","doi":"10.1016/j.engappai.2025.111009","DOIUrl":"10.1016/j.engappai.2025.111009","url":null,"abstract":"<div><h3>Objective:</h3><div>Although Artificial Intelligence (AI) methods have achieved great progress in defect segmentation tasks, their explainability remains a challenge because of their black-box nature. To ensure that their predictions can be understood and trusted by users, recent works have attempted to explain the model’s decision process through Explainable Artificial Intelligence (XAI) methods.</div></div><div><h3>Challenges:</h3><div>However, existing XAI methods still have some limitations: (1) they explain model decisions from only a single perspective, which usually introduces biased explanations; (2) few works consider how to leverage the explanation mechanism of XAI methods to guide the model’s active learning process, which limits the application of XAI methods. Methods: To address these issues, a unified region-level and concept-level explainable AI (RC-XAI) framework is proposed for the explainability and active learning of the defect segmentation model.</div></div><div><h3>Novelty:</h3><div>First, RC-XAI incorporates region-level and concept-level explanators in a collaborative manner to provide comprehensive explanations for the model decision, which enhances the reliability and robustness of the explanations. Second, RC-XAI proposes an explainability-driven representative sample selection (ED-RSS) module to guide the model’s active learning process and improve its final performance.</div></div><div><h3>Findings:</h3><div>Experimental results on three challenging datasets demonstrate the effectiveness and generalization of the proposed RC-XAI method. 
Our method provides better and more comprehensive explainability compared with other XAI methods. Additionally, experiments demonstrate the potential of applying the explanation mechanism of the RC-XAI method to the active learning process of defect segmentation models.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 111009"},"PeriodicalIF":7.5,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yizhen Zheng , Yuefeng Li , Xudong Pan , Fanwei Meng , Changyu Chen
{"title":"Planning scheme of artificial assembly posture and arm movement path in narrow space","authors":"Yizhen Zheng , Yuefeng Li , Xudong Pan , Fanwei Meng , Changyu Chen","doi":"10.1016/j.engappai.2025.111152","DOIUrl":"10.1016/j.engappai.2025.111152","url":null,"abstract":"<div><div>Manual assembly in a narrow space involves problems of low efficiency and difficult assembly. In view of the lack of assembly process planning and assisted manual assembly in this kind of scenario, a hybrid modeling simulation method of human posture was proposed. This method combined the characteristics of manual assembly in narrow space. The assembly planning process was divided into two parts: trunk and lower limb posture planning and human arm movement planning, to reduce the complexity of planning and the difficulty of manual assembly. In the posture planning part, this study solved for the human trunk and lower limbs by establishing a multi-objective optimization model and achieved automatic screening of assembly posture according to the weight of each target element. Arm movement planning involved a neural network of assembly spaces to guide the sampling process of the path planner combined with the inverse solution of arm kinematics for environmental collision detection to quickly obtain a feasible collision-free arm movement path from the initial position to the assembly target. 
Finally, the feasibility of the method in a narrow space was verified by building a scene and carrying out the corresponding manual assembly operation experiments.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 111152"},"PeriodicalIF":7.5,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BIGC-Net: A Body Inter-intra-parts Graph Convolutional Network for repetitive action counting","authors":"Jun Li , Jinying Wu , Qiming Li , Bangshu Xiong","doi":"10.1016/j.engappai.2025.110996","DOIUrl":"10.1016/j.engappai.2025.110996","url":null,"abstract":"<div><div>With the continuous development of human pose estimation techniques, researchers have gradually applied them to the field of repetitive action counting, resulting in pose-level methods. However, current research at the pose level is still limited. Therefore, this paper proposes a simple but efficient Body Inter-intra-parts Graph Convolutional Network (BIGC-Net). Specifically, two core modules are developed in BIGC-Net: the Global Inter-Part Feature Learning Module (GIFL-Module) and the Salient Intra-Part Feature Learning Module (SIFL-Module). Unlike previous pose-level methods, which only model human joints globally and ignore local details, we introduce the concept of body parts with Graph Convolutional Networks (GCN) to the repetitive action counting task. Based on the natural topology of the human body, we divide the joints into multiple inter-intra-parts, each of which is regarded as a subgraph to form the overall graph structure. The complete action is then achieved by the collaborative operation between different subgraphs, thus modelling the action execution process more accurately. Accordingly, the GIFL-Module is designed to capture the global collaborative relationships between the subgraphs. However, since the body joints are segmented into multiple parts, this segmentation may ignore the variation of local detail information within the subgraphs. To address this issue, the SIFL-Module is designed to capture the local interdependencies between joints within the subgraphs and to focus on the most salient features of the subgraphs during movement. The collaboration of these two modules further enhances the feature representation capability. 
Finally, extensive experimental results on the challenging benchmark datasets (RepCount-pose, UCFRep-pose, and Countix-Fitness-pose) show that the proposed BIGC-Net achieves excellent performance.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 110996"},"PeriodicalIF":7.5,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jongkeun Lee , Young Su Lee , Joo-Ae Kim , Seulki Jeong
{"title":"Machine learning approaches to predict oxidative potential of fine particulate matter based on chemical constituents","authors":"Jongkeun Lee , Young Su Lee , Joo-Ae Kim , Seulki Jeong","doi":"10.1016/j.engappai.2025.111170","DOIUrl":"10.1016/j.engappai.2025.111170","url":null,"abstract":"<div><div>Exposure to fine particulate matter (PM<sub>2.5</sub>) poses significant health risks, primarily due to its oxidative potential (OP), which induces oxidative stress and related diseases. This study aimed to predict the OP of PM<sub>2.5</sub> based on its chemical constituents using machine learning (ML) models. We collected 119 PM<sub>2.5</sub> samples from Seoul, Korea, between 2019 and 2021, and analyzed their chemical composition and OP using the dithiothreitol (DTT) assay. Three ML models were developed to predict OP: k-Nearest Neighbors (kNN), Random Forest (RF), and Fully Connected Deep Neural Network (FCDNN). Among them, the RF model demonstrated the highest prediction accuracy, with coefficient of determination (R<sup>2</sup>) values ranging from 0.88 to 0.89 for training data and 0.36 to 0.62 for test data, followed by Extreme Gradient Boosting (XGBoost) and FCDNN with test R<sup>2</sup> values up to 0.53 and 0.39, respectively. Explainable Artificial Intelligence (AI) techniques, specifically feature importance and SHapley Additive exPlanations (SHAP), were employed to enhance the interpretability of the model, revealing the significant contributions of various chemical constituents. The study underscores the mixed effects of multiple factors on OP and highlights the potential of AI in providing robust predictive tools for environmental health. 
As OP measurement automation progresses, the availability of large datasets will further improve the accuracy and applicability of AI models, facilitating better health risk assessments and policy-making.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 111170"},"PeriodicalIF":7.5,"publicationDate":"2025-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144072352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Heng-Yu Lin , Yung-Chun Chang , Pei-Ying Yang , Ting-Yun Huang
{"title":"Enhancing the effectiveness of emergency department computed tomography scans using pre-trained language models","authors":"Heng-Yu Lin , Yung-Chun Chang , Pei-Ying Yang , Ting-Yun Huang","doi":"10.1016/j.engappai.2025.111094","DOIUrl":"10.1016/j.engappai.2025.111094","url":null,"abstract":"<div><h3>Objective</h3><div>This study aims to develop a predictive model using a medical clinical assistive large language model to determine the necessity of computed tomography (CT) scans in emergency department settings based solely on data collected at triage. The model seeks to improve patient flow and more efficiently allocate limited medical resources while reducing unnecessary radiation exposure.</div></div><div><h3>Methods</h3><div>The model uses data collected from emergency department triage and includes patient symptoms, chief complaints, vital signs and medical history, without the need for physiological test data.</div></div><div><h3>Results</h3><div>This study analyzed 165,391 emergency department records from Shuang Ho Hospital of Taipei Medical University and used a large language model to develop a model for predicting whether a patient should undergo a CT scan. While initial results indicate that detailed symptom descriptions and severity of pain assessments can enhance prediction accuracy, our approach centers on data preprocessing, the integration of unstructured data, and external features. In our final performance comparison, the model developed using a large language model exhibited the best performance, achieving an area under the receiver operating characteristic curve (AUROC) of 0.88 and an area under the precision-recall curve (AUPRC) of 0.5414. This represents a 0.6 % improvement over existing language models and a 4.8 % improvement over traditional machine learning approaches. It is notable that the model achieved a high negative predictive value of 0.9261, indicating strong reliability in identifying patients who don't require CT scans. 
This model will allow physicians to better understand the overall health status of patients and provide earlier diagnostic and treatment recommendations based on comprehensive model information, which will ultimately lead to better patient care.</div></div><div><h3>Conclusion</h3><div>This research establishes foundational work for future studies that aim at optimizing emergency diagnostic processes and enhancing patient care through improved medical predictions. However, expanding the dataset’s diversity and pursuing external validations are essential to improve the predictive accuracy and applicability of the model in a variety of emergency department settings.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 111094"},"PeriodicalIF":7.5,"publicationDate":"2025-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yunxia Lou , Lei Su , Jiefei Gu , Xinwei Zhao , Ke Li , Michael Pecht
{"title":"Semi-supervised dual-constraint centroid contrastive prototypical network for flip chip defect detection under limited labeled data","authors":"Yunxia Lou , Lei Su , Jiefei Gu , Xinwei Zhao , Ke Li , Michael Pecht","doi":"10.1016/j.engappai.2025.111163","DOIUrl":"10.1016/j.engappai.2025.111163","url":null,"abstract":"<div><div>Flip chips are widely used in electronic systems for defense, aerospace, and other applications where packaging reliability is critical. However, flip chip defect samples present a variety of defect types and few samples with labels in actual industrial applications. The paucity of labeled defect samples indicates that the existing data volume cannot be matched with deep learning detection models. Therefore, flip chip intelligent defect detection faces the problems of poor model adaptability and weak generalization performance. As a solution to these problems, a semi-supervised dual-constraint centroid contrastive prototypical network (SSDCPN) for flip chip defect detection under limited labeled data is proposed in this paper. First, a prototype-based supervised contrastive learning strategy is developed to construct the contrastive prototypical network, which increases the inter-class sparsity and intra-class compactness of features to acquire more discriminative features. Then, to address the susceptibility of the support set prototypes to outliers, dual constraints are imposed on the support set prototypes to calibrate and refine the prototypes. Finally, a pseudo-labeled sample selection mechanism based on epistemic uncertainty and entropy is proposed to obtain rich semi-supervised information to guide the model training. The mechanism can select high-confidence pseudo-labeled samples that can complement the training samples to further strengthen the generalization performance of the model. 
Defect detection experiments on flip chip vibration signals indicate that the present method is superior to other methods in the case of limited labeled samples.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 111163"},"PeriodicalIF":7.5,"publicationDate":"2025-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Localizing state space for visual reinforcement learning in noisy environments","authors":"Jing Cheng , Jingchen Li , Haobin Shi , Tao Zhang","doi":"10.1016/j.engappai.2025.110998","DOIUrl":"10.1016/j.engappai.2025.110998","url":null,"abstract":"<div><div>Learning robust policies is a central goal of the visual reinforcement learning community. In practical applications, noise in the environment leads to a larger variance in the perception of a reinforcement learning agent. This work introduces a non-differential module into deep reinforcement learning to localize the state space for agents, by which the impact of noise can be greatly reduced and the learned policy can be explained implicitly. The proposed model leverages a hard attention module for localization, while an additional reinforcement learning process is built to update the localization module. We analyze the relationship between the non-differential module and the agent, regarding the whole training as a hierarchical multi-agent reinforcement learning model and ensuring the convergence of policies by centralized evaluation. Moreover, to couple the localization policy and the behavior policy, we modify the evaluation processes to obtain more direct coordination between them. The proposed method enables the agent to localize its observation or state in an explainable way, learning more advanced and robust policies by ignoring irrelevant data or changes in noisy environments. That is, it enhances the disturbance rejection ability of reinforcement learning. 
Several experiments on simulation environments and Robot Arm suggest our localization module can be embedded into existing reinforcement learning models to enhance them in many respects.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 110998"},"PeriodicalIF":7.5,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144072291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}