{"title":"Multimodal data fusion-based intelligent fault diagnosis for ship rotating machinery: Status quo and perspectives","authors":"Yaqiong Lv , Jian Hao , Min Tang , Jun Wu","doi":"10.1016/j.engappai.2025.111767","DOIUrl":"10.1016/j.engappai.2025.111767","url":null,"abstract":"<div><div>Ships are indispensable to global transportation and play a critical role in fostering economic growth and cultural exchange. Central to ship operations is the rotating machinery, which, due to the complex and demanding conditions at sea, is susceptible to faults posing a serious threat to the safety of ship navigation. Therefore, timely and effective fault detection and diagnosis are essential for minimizing operational interruptions and maintaining the safety of ship navigation. Fault diagnosis methods based on multimodal data fusion (MDF) strategies can fully utilize the monitoring capabilities of different modal signals for different types of faults, significantly improving diagnostic accuracy, and have attracted widespread attention from researchers in recent years, yielding promising results. Nevertheless, there is currently a notable lack of a comprehensive review of multimodal data fusion approaches. This paper addresses this gap by conducting an exhaustive review of existing literature on multimodal data fusion technology, focusing on fusion methods. It discusses their practical applications in diagnosing faults in ship rotating machinery, analyzes current challenges, and outlines future research directions. 
This comprehensive synthesis aims to serve as a key reference for researchers in the field, guiding future developments in fault diagnosis technologies.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111767"},"PeriodicalIF":8.0,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144723303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring You Only Look Once v8 and v9 for efficient airplane detection in very high resolution remote sensing imagery","authors":"Doğu İlmak , Tolga Bakirman , Elif Sertel","doi":"10.1016/j.engappai.2025.111854","DOIUrl":"10.1016/j.engappai.2025.111854","url":null,"abstract":"<div><div>Automatic airplane detection from satellite images using deep learning methods produces valuable geospatial information for a wide range of applications, including aviation safety, defence, airport and disaster management. You Only Look Once (YOLO) models have been widely used for various geospatial tasks; however, their application to airplane detection in very high-resolution (VHR) remote sensing imagery, particularly YOLOv8 and YOLOv9, remains underexplored. This study aims to assess the performance of YOLOv8 and YOLOv9 architectures in the context of airplane detection using High Resolution Planes (HRPlanes) dataset. First, we examine the impact of various hyperparameters on the performance of YOLOv8 models to propose optimal hyperparameter and model variant combinations. Second, we compare the best-performing YOLOv8 configurations with their YOLOv9 counterparts to evaluate potential improvements. Third, we assess the generalizability and transferability of the top-performing models by testing them across independent airplane detection datasets. Lastly, we perform an operational assessment of inference performance by analyzing trade-offs between network size, input image resolution and processing time. The optimal performance was achieved with the YOLOv8x model using 960x960 network size and data augmentation, resulting in 98.99 % F1-Score, 99.12 % Precision, 98.86 % Recall, 99.35 % Mean Average Precision (mAP)50, and 89.82 % mAP50-95. YOLOv9e achieved comparable performance with fewer parameters (57.3 vs. 68.2 million) and lower computational cost (189.0 vs. 
257.8 giga floating point operations per second (GFLOPS)), offering up to a 27 % reduction in computational cost. These findings highlight the practical potential of both YOLOv8 and YOLOv9 for high-precision airplane detection in VHR remote sensing imagery. The HRPlanes dataset and model weights are publicly available at: https://github.com/RSandAI/Efficient-YOLO-RS-Airplane-Detection.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111854"},"PeriodicalIF":8.0,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144723102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Meta-adversarial transfer learning based on a dual-channel transformer network for aircraft engine remaining useful life prediction","authors":"Zhiwu Shang, Tianchu Pang, Cailu Pan, Leyi Yao, Zifei Wang","doi":"10.1016/j.engappai.2025.111852","DOIUrl":"10.1016/j.engappai.2025.111852","url":null,"abstract":"<div><div>Meta-transfer learning (MTL) is increasingly used for remaining useful life (RUL) prediction of aircraft engines across operating conditions because it enables efficient knowledge transfer under limited data. However, existing MTL-based RUL models still suffer from domain shift because they transfer irrelevant features from pre-training tasks and overlook the joint influence of temporal patterns and multi-sensor interactions on RUL. To address these challenges, we propose a meta-adversarial transfer learning method based on a dual-channel Transformer network (MAT-DCT). Specifically, during meta-training, we insert a gradient reversal layer into the inner loop so the extractor and regressor co-learn while purging pre-training noise; during meta-adaptation, we add a second adversarial mechanism in which the extractor contests a domain discriminator, enabling target feature alignment under few-shot conditions. Furthermore, we construct a dual-channel network: one channel emphasizes inter-sensor interactions while the other captures temporal evolution. In each channel, gated convolutions distill subtle degradation cues and feed them into the Transformer, which weaves them into a coherent global context. 
Experiments on the aircraft turbofan engine dataset across twelve cross domain tasks show that MAT-DCT reduces root mean square error by about 24 % and Score function by 49 % compared to baseline models, confirming its superiority for few-shot RUL prediction under diverse operating conditions.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111852"},"PeriodicalIF":8.0,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144724872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Research on hull form optimization at multiple speeds based on machine learning and ship model experiments","authors":"Jie Liu , Baoji Zhang , Lifen Hu , Junying Bi , Zheng Tian , Yingkai Dong","doi":"10.1016/j.engappai.2025.111882","DOIUrl":"10.1016/j.engappai.2025.111882","url":null,"abstract":"<div><div>In order to improve the scientificity, efficiency and systematicness of ship form optimization, the multi-objective optimization research on the David Taylor Model Basin (DTMB) 5512 ship is carried out. First, the ship model experiment quantified the still water resistance of DTMB 5512 at six speeds at Froude number (Fr) as 0.25–0.40, demonstrating an almost linear resistance velocity relationship. Meanwhile, the DTMB 5512 ship is subjected to numerical simulations using the Computational Fluid Dynamics (CFD) method and the calculated results are compared with the experimental results. Then, Random Forest (RF)-based approximate models were developed for multi-speed resistance prediction, and verified its feasibility using Maximum Absolute Error (MAE). Finally, the parametric modeling method, the CFD method, and the optimization algorithm are integrated to construct a multi-objective optimization design system for ship forms. The resistance performance of the DTMB 5512 ship is optimized using the Multi-Objective Particle Swarm Optimization (MOPSO) algorithm. The results show that under the constructed hull form optimization framework, the optimized hull forms that meet the constraint conditions can be obtained. The total resistance of the obtained optimized ship at six speeds is reduced by 2.95 %, 4.44 %, 3.71 %, 5.22 %, 5.51 % and 4.83 % respectively. The research results indicate that the optimized hull forms with improved resistance performance can be obtained through the proposed methods, significantly enhancing the optimization efficiency. 
It also verifies the effectiveness of the random forest method in addressing the challenges of actual engineering optimization.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111882"},"PeriodicalIF":7.5,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144714320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging quick response codes and vision transformer for distributed denial of service attack detection","authors":"Aman Gautam, Maddassar Jalal, Surinder Singh Khurana, Parvinder Singh","doi":"10.1016/j.engappai.2025.111849","DOIUrl":"10.1016/j.engappai.2025.111849","url":null,"abstract":"<div><div>The exponential growth of internet usage has created a massive volume of data and interconnected devices, establishing an environment for a variety of cyber-attacks, most notably, Distributed Denial of Service (DDoS) attacks. Persistent and increasingly advanced attacks make network security difficult to maintain. In response to this shifting cyber threat landscape, researchers have increasingly turned to machine learning and deep learning techniques to develop more robust and adaptive security mechanisms. This paper introduces an innovative model for detecting DDoS attacks that utilizes the latest deep-learning approaches to tackle these challenges. The research is divided into two main phases: the generation of Quick Response (QR) codes from .csv data and the evaluation of the model's effectiveness. The initial phase involves creating QR codes, which are then used in the training phase to assess the model's performance. The model is trained on image data using several advanced architectures, including the Convolutional Neural Network (CNN), Residual Network (ResNet101v2), Vision Transformer (ViT), Neural Architecture Search Network (NASNetMobile), Inception Residual Network (InceptionResNetV2), Convolutional Neural Network Next Base (ConvNeXtBase), and Efficient Network Large (EfficientNetV2L). The outcomes are promising, with the ViT model demonstrating an accuracy rate of 99.58 %, significantly outperforming the other models. 
These results highlight the potential of the proposed deep learning-based approach in efficiently detecting DDoS attacks.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":""},"PeriodicalIF":7.5,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144713145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical method for developing a hyper-personalization artifact","authors":"Umapathy Sivan G. Murugasu, Anusuyah Subbarao","doi":"10.1016/j.engappai.2025.111875","DOIUrl":"10.1016/j.engappai.2025.111875","url":null,"abstract":"<div><div>This research synthesized an artifact that applied Artificial Intelligence to enable telecommunication businesses to offer hyper-personalized products and services. The study was based on a database of the customers’ digital demography, and a range of telecommunication products and services the customer used. Relevant attributes were included in a Google Form to collect customer usage data. The data collected was screened for bad data and normality checked by conducting multivariable Mahalanobis outlier detection and normality tests. Outlier data was removed and multivariable normality of the data was ensured. Using the preprocessed database, several procedures were conducted to determine the best artificial intelligence algorithm for subsequent analysis, namely, the Logistic Model Tree algorithm. Using this algorithm and the customer digital demography dataset, the telecommunication business offerings for the customers were predicted with a 97.6 % accuracy. The proof of concept was developed using the Waikato Environment for Knowledge Analysis software. The artifact created was named Hypersona. Implemented within telecommunication systems, the model can be integrated into customer relationship management platforms allowing real-time adaptation to user needs. The methodology ensures feasibility by leveraging existing data infrastructures, while scalability is achieved through automated learning mechanisms that adapt to changing user environments. The research contributions lie in the applicability of real-time identification of changing personalized products and services. This research highlights the potential of artificial intelligence driven hyper-personalization in the telecommunications sector. 
Further research can be extended to other contemporary artificial intelligence methods and exploring the scaling of the artifact across diverse businesses.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111875"},"PeriodicalIF":8.0,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144723105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing fundus image diabetic retinopathy classification through modified conformer with sparse attention","authors":"Jian Lian , Jiafu Ji , Yawen Niu , Wanzhen Jiao","doi":"10.1016/j.engappai.2025.111795","DOIUrl":"10.1016/j.engappai.2025.111795","url":null,"abstract":"<div><div>As a prevalent ocular disorder, diabetic retinopathy poses significant challenges in accurate and timely diagnosis through fundus image classification. Current classification approaches frequently encounter obstacles in attaining high accuracy and efficiency. Therefore, the primary purpose of this study is to address these limitations and enhance the classification performance. The methodology encompasses an elaborate design of a modified Conformer architecture. Specifically, within the modified Conformer, meticulous adjustments are made to incorporate the sparse attention layer. This layer is engineered to selectively concentrate on pertinent features within the fundus images, thereby enhancing the model’s discriminatory power. For the proposed model, the training process employs advanced optimization algorithms and judiciously selected hyperparameters. The model is evaluated using a set of well-established public datasets of fundus images, with an extensive array of performance metrics such as accuracy, recall, precision, and F1-score. The experimental outcomes demonstrate that the proposed model surpasses conventional classification methods in terms of various evaluation metrics. 
Accordingly, this research offers a novel and efficacious approach for diabetic retinopathy fundus image classification.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111795"},"PeriodicalIF":8.0,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144723302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kidney stone and tumor segmentation by analyzing medical images using deep learning technique","authors":"Fangfang Ye , Congcong Liu , Jinming Wang , Qingrong Sun , Somia Asklany","doi":"10.1016/j.engappai.2025.111878","DOIUrl":"10.1016/j.engappai.2025.111878","url":null,"abstract":"<div><div>The segmentation of kidney tumors is a critical activity in medical imaging since it aids in effective diagnosis, treatment, and follow-ups of renal disorders. However, the segmentation process faces challenges related to variability in the tumor's size, shape, and intensity, as well as the noise and artifacts in medical images. This study aims to address the challenge of designing an effective and automated Deep Neural Model (DNM) analysis for Computed Tomography (CT) images of kidney stones and tumor segmentation, which is more accurate, faster, and more efficient than current state-of-the-art models. The DNM utilizes the U-Net structure to extract cross-scale features from the CT images. The extracted features are further explored with the aid of a transformer model, which identifies and extracts local and global context features to enhance mask segmentation efficiency. The obtained results reveal a considerable enhancement in segmentation results, achieving an 8 % increase in the Dice similarity coefficient (DSC) compared to standard techniques. This approach primarily focuses on segmenting renal cell carcinoma, a pathology commonly associated with kidney tumors, and demonstrates strong potential to assist clinical diagnosis, surgical planning, and treatment monitoring in nephrology, contributing to improved assessment and management of chronic kidney diseases (CKD). 
The proposed DNM model increases the precision ratio by 98.89 %, the recall ratio by 97.12 %, the accuracy ratio by 98.43 %, the F1-score ratio by 98.5 %, and the IoU by 99.18 % compared to existing models.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111878"},"PeriodicalIF":7.5,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144714318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of artificial neural networks to model macroscopic spray characteristics of alcohol fuels","authors":"Thangaraja Jeyaseelan , Abhijeet Gopalakrishnan , Sundararajan Rajkumar , Santosh Narayan V , Min Son , Lars Zigan","doi":"10.1016/j.engappai.2025.111841","DOIUrl":"10.1016/j.engappai.2025.111841","url":null,"abstract":"<div><div>In pursuit of global efforts to mitigate climate change and achieve net-zero carbon emissions by 2050, alcohol-based fuels are gaining attention as low-carbon alternatives for combustion engines. This study presents a novel application of Artificial Intelligence (AI) to predict the spray characteristics, specifically spray cone angle and penetration length, for a wide range of alcohol fuels and operating conditions. A robust Artificial Neural Network (ANN) model was developed using the Keras Application Programming Interface (API) on the TensorFlow platform, trained on a comprehensive dataset combining in-house experimental data for ethanol, octanol, and published data for methanol and butanol fuels. Spray behaviour was captured using shadowgraph technique under varied injection pressures (20–239 bar) and chamber pressures (9–21 bar). The ANN model demonstrated high predictive accuracy, with mean square error and coefficient of determination (R<sup>2</sup>) of 1.7189 and 0.9841 for spray penetration, and 6.3734 and 0.9163 for spray angles, respectively. The unified modeling framework effectively captures the complex interactions of different alcohol fuels and operating conditions, enabling advanced, accurate simulations of spray behaviour. 
This AI-driven approach could be a valuable tool for predicting the spray behavior of alcohol fuels, facilitating advanced modeling of low-carbon fuel combustion processes.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111841"},"PeriodicalIF":7.5,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144714417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstructing three-dimensional conductivity distribution of in-situ maize ears using frequency-enhanced residual encoder-decoder network","authors":"Hai-Ying Zheng , Bing-Zhou Chen , Yang Xiang , Ke-Lei Xia , Yu-Bin Lin , Jin-Hang Liu , Jia-Jia Zhang , Liu-Deng Zhang , Zhong-Yi Wang , Lan Huang","doi":"10.1016/j.engappai.2025.111858","DOIUrl":"10.1016/j.engappai.2025.111858","url":null,"abstract":"<div><div>Monitoring water content distribution in in-situ maize ears is crucial for agricultural cultivation and crop research, with impedance dynamics correlated to water content changes. However, the combined effects of complex maize ear structure, unstable contact impedance, and diverse environmental noise significantly reduce the quality of measurement signals, posing major challenges for reconstructing three-dimensional (3D) conductivity distribution. Therefore, we propose a 3D conductivity absolute reconstruction model based on a spectrum-enhanced residual encoder-decoder (FRED-Net), enabling stable monitoring of impedance changes in in-situ maize ears in a greenhouse. FRED-Net combines residual blocks and encoder-decoder architectures, utilizing skip connections to preserve low-level features, effectively addressing gradient vanishing and information loss while maintaining low computational complexity. To cope with greenhouse environmental interference, FRED-Net incorporates an information fusion method based on spectral characteristics and volatility features, combined with a noiseless dataset training mode to enrich model input variability. Results from standard water tank simulation experiments indicate that FRED-Net exhibits ideal reconstruction accuracy and noise robustness compared to existing 3D absolute reconstruction algorithms, with a structural similarity index of 0.9970, root mean square error of 0.0019, relative error of 0.0199, and coefficient of determination of 0.9610. 
Imaging experiments on in-situ maize ears in the greenhouse successfully captured the dynamic characteristics of conductivity changes with water content during growth and the significant differences between tissue structures, providing an effective absolute imaging method for non-invasive visualization and measurement of the physiological state of maize ears during the growing period.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"160 ","pages":"Article 111858"},"PeriodicalIF":8.0,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144723104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}