Towards quantum audio steganalysis using synergy of quantum Fourier transform and quantum neural network
Sanaz Norouzi Larki, Mohammad Mosleh, Mohammad Kheyrandish
Engineering Applications of Artificial Intelligence, Volume 159, Article 111595. Published 2025-07-02. DOI: 10.1016/j.engappai.2025.111595
Abstract: This study introduces a comprehensive audio steganalysis scheme that integrates quantum signal processing with machine learning techniques. The method applies the quantum Fourier transform to the Quantum Representation of Digital Signals (QRDS) to extract statistical features from the second-order derivatives of the audio spectrum. These features are derived by analyzing the rate of change in the gradient of the quantum spectrum, providing valuable cues for identifying steganographic content concealed within the audio data. The statistical features comprise the quantum spectral center (QSC), quantum spectral bandwidth (QSB), quantum spectral flatness measurement (QSFM), and quantum spectral crest factor (QSFC). The extracted features are then fed into a multilayer quantum neural network built from simple quantum gates, which reduces the algorithm's complexity and the time required for training and testing. The classification algorithm applied by this neural network distinguishes between clean and stego audio datasets with an accuracy exceeding 96%, outperforming existing methods in both efficiency and accuracy.

Low-light image enhancement using dual cross attention
Yudi Ruan, Hao Ma, Di Ma, Weikai Li, Xiao Wang
Engineering Applications of Artificial Intelligence, Volume 159, Article 111501. Published 2025-07-02. DOI: 10.1016/j.engappai.2025.111501
Abstract: Low-light image enhancement (LLIE) aims to improve the perceptibility and interpretability of images captured in poorly illuminated environments. Existing LLIE methods often fail to capture local self-similarity and long-range dependencies at the same time, losing complementary information between modules or network layers and ultimately losing image details. To alleviate this issue, we design ECAFormer, a hierarchical mutual-enhancement network built on a dual cross-attention Transformer, whose architecture enables concurrent propagation and interaction of multiple disentangled features. To capture local self-similarity, we design a Dual Multi-head Self-Attention (DMSA) block that leverages disentangled visual and semantic features across different scales, allowing them to guide and complement each other. Further, a cross-scale DMSA block is incorporated to capture residual connections, integrating cross-layer information and capturing long-range dependencies. Experimental results show that ECAFormer reaches competitive performance across multiple benchmarks, yielding nearly a 3.7% improvement in Peak Signal-to-Noise Ratio (PSNR) over the second-best method, demonstrating the effectiveness of information interaction in LLIE. To facilitate replication of our results, our implementation is available on GitHub.

Open-pit mine occlusion object detection for unmanned transport vehicles
Chao Zheng, Guoxing Bai, Yu Meng, Lu Wang, Xianyao Jiang, Li Liu
Engineering Applications of Artificial Intelligence, Volume 159, Article 111436. Published 2025-07-02. DOI: 10.1016/j.engappai.2025.111436
Abstract: Accurate object recognition in open-pit mine environments is crucial for the safety of autonomous transport vehicles. Existing autonomous-driving perception work mostly targets urban structured road traffic and adapts poorly to the challenging open-pit mine environment, and the lack of datasets further limits progress on this specific task. In this paper, we propose an object detection dataset for open-pit mine autonomous driving applications. The dataset covers several mines and different periods such as day, dusk, and night; it provides detailed annotations for diverse objects in open-pit mines and adds attributes for evaluating occlusion detection. In addition, to address the multi-scale variation of objects in open-pit mines and the occlusion caused by dust, we propose a novel occluded-mine-object general-distribution detection method that uses soft labels and vehicle-attribute localization to reduce positioning ambiguity against difficult backgrounds and to achieve reliable object detection in harsh open-pit mine environments. Our work establishes a benchmark for open-pit mine object recognition involving occlusion. Comparison with mainstream techniques on this benchmark shows that our approach outperforms existing state-of-the-art methods, achieving 82.2%, 81.7%, and 76.7% average precision in the easy, moderate, and hard modes, respectively.

{"title":"OralTransNet: A novel hybrid model integrating transformer attention and CNN features for accurate diagnosis of mouth and oral diseases","authors":"Sohaib Asif , Vicky Yang Wang , Dong Xu","doi":"10.1016/j.engappai.2025.111609","DOIUrl":"10.1016/j.engappai.2025.111609","url":null,"abstract":"<div><div>The rising prevalence of mouth and oral diseases (MOD), including gum disease and oral cancer, presents a significant global health challenge. Early detection is crucial for effective intervention. However, existing models often rely on complex preprocessing, computationally expensive operations, and specialized resources, leading to inefficiency and limited practicality. This paper presents a novel lightweight hybrid model that combines the local feature extraction strengths of CNNs with the global contextual power of Transformer attention mechanisms, contributing to the advancement of artificial intelligence (AI) in medical image analysis. The proposed architecture integrates the local feature extraction efficiency of convolutional neural networks (CNNs) with the global context modeling strength of Transformers. This combination enables the model to effectively capture both fine-grained details and broader spatial patterns, while maintaining low computational complexity. By leveraging CNNs' weight-sharing properties for efficient feature extraction and Transformers' ability to model global patterns, the proposed model performs well across datasets of varying sizes and complexities. Its lightweight design emphasizes efficiency, with fewer parameters, reduced floating-point operations (FLOPs), and shorter inference times, making it ideal for real-time AI applications, particularly in resource-constrained environments. The proposed model is also well-suited for deployment on mobile devices and in regions with limited medical infrastructure, providing a scalable solution for early diagnosis in diverse healthcare settings. In the context of medical engineering, the proposed model is applied to the automated detection of mouth and oral diseases (MOD) using both clinical and histopathological images. This approach aims to enhance diagnostic capabilities in resource-constrained clinical environments. The model is rigorously evaluated on three datasets: the MOD dataset (5143 images, 7 classes), the Oral Cancer dataset (241 images, 2 classes), and the Histopathological Oral Cancer dataset (5192 images, 2 classes). The proposed model achieves accuracies of 99.03 %, 97.83 %, and 94.23 %, respectively—surpassing several state-of-the-art (SOTA) models. Its strong performance, lightweight design, and enhanced interpretability position it as a practical and scalable solution for early and reliable oral disease detection in diverse clinical settings.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111609"},"PeriodicalIF":7.5,"publicationDate":"2025-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144522959","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on efficient and fast extraction of vineyard navigation path based on key point detection
Quan Chen, Jiqing Chen, Zhiwu Jiang, Lixiang Huang, Jingyao Gai, Peilin Li, Mingchang Zhang
Engineering Applications of Artificial Intelligence, Volume 159, Article 111549. Published 2025-07-02. DOI: 10.1016/j.engappai.2025.111549
Abstract: This study proposes an improved key-point-detection model, YOLOv8-KN (You Only Look Once version 8 - Keypoint Detection Navigation), for autonomous navigation path extraction by agricultural robots in vineyards. The model comprehensively optimizes the original network structure by introducing the FasterNet Block, Efficient Multi-Scale Attention (EMA), the Universal Inverted Bottleneck (UIB) module, and the DySample dynamic upsampler. The improved network accurately locates the key points of grapevine rhizomes directly in the image and applies least-squares line fitting to the key points on the left and right sides to generate a high-precision navigation path. In experiments, the model achieved an average precision of 87.1% and a key point detection precision of 91.2% on the grapevine rhizome detection task, while reducing parameters by 25.8% compared with the original structure and keeping the computational cost within 6.7 Giga Floating-point Operations (GFLOPs). For navigation path extraction, the average yaw angle is only 0.75°, the maximum yaw angle is 1.47°, the average pixel offset error is 7.94 pixels, the maximum offset error is 14.04 pixels, and the average path fitting time is only 1.66 milliseconds (ms). These results verify the efficiency and precision of the proposed model in vineyards and provide a lightweight, high-performance solution for autonomous navigation of agricultural robots in unstructured environments such as vineyards.

Transfer learning-based approach for evaluating residual stiffness in carbon fiber reinforced composites and adhesives
Licai Cao, Tianxiao Zhang, Jin Cui, Anastasios P. Vassilopoulos
Engineering Applications of Artificial Intelligence, Volume 159, Article 111624. Published 2025-07-02. DOI: 10.1016/j.engappai.2025.111624
Abstract: This work proposes a transfer learning-based encoder-decoder framework to predict the relationship between loading conditions and residual stiffness in carbon fiber reinforced composites and adhesives. The encoder, built from a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (Bi-LSTM) network, encodes time-series loading signals into latent variables and captures their dependencies. The decoder employs a Multilayer Perceptron (MLP) to map these latent features to residual stiffness. A transfer learning strategy is used to account for individual variability and further improve accuracy. The model's effectiveness and robustness are validated through random and constant-amplitude fatigue experiments on two different material systems. On random fatigue data, the model demonstrates strong learning capability: compared to classical models such as the Support Vector Machine (SVM) and Random Forest, or simpler deep learning architectures such as standalone CNN and Bi-LSTM networks, the proposed architecture delivers higher prediction accuracy and better regression results, achieving a Root Mean Square Error (RMSE) of 0.154 and a Coefficient of Determination (R²) of 0.931. On constant-amplitude fatigue datasets, the model accurately distinguishes the different materials and remains robust when a reasonable training dataset size is used.

A novel framework for crack segmentation using image augmentation and a CannyNet
Gang Liu, Xuming Li, Jin Di, Rui Sun, Fengjiang Qin, Yi Su, Dewei Liu
Engineering Applications of Artificial Intelligence, Volume 159, Article 111644. Published 2025-07-02. DOI: 10.1016/j.engappai.2025.111644
Abstract: Computer vision-based crack detection depends heavily on the quality of the segmentation process and remains a challenging task due to its complexity. This paper proposes an image segmentation framework that combines an image augmentation method with a CannyNet to improve segmentation results. Style transfer is employed for image augmentation, and a novel deep neural network for crack image segmentation, named CannyNet, is proposed to enhance the recognition of tiny cracks. Moreover, to improve the precision of CannyNet predictions, a Bayesian optimization approach is employed to tune the network hyperparameters. The proposed crack segmentation framework was verified experimentally on four open-source datasets and a newly constructed dataset. A comparison of segmentation models indicates that the style transfer method enhances the model's generalization and that CannyNet delivers superior performance; the Bayesian optimization strategy is able to optimize the CannyNet architecture and thereby improve crack segmentation results.

{"title":"Utilizing neural networks to illustrate the dynamics of viscous fluid flow over curved surface with homogeneous and heterogeneous reactions","authors":"Abhishek Sharma , Ram Prakash Sharma","doi":"10.1016/j.engappai.2025.111629","DOIUrl":"10.1016/j.engappai.2025.111629","url":null,"abstract":"<div><div>This study examines the influence of homogeneous-heterogeneous reactions and viscous dissipation on the magnetohydrodynamic (MHD) boundary layer flow over a curved stretching sheet, incorporating the effects of partial slip and a non-uniform heat source. Understanding these interactions is crucial for optimizing heat and mass transfer in industrial applications where precise thermal and solutal control are required. The governing partial differential equations are transformed into a system of coupled ordinary differential equations using similarity transformations and solved numerically via the Runge-Kutta method with a shooting technique. A comparative analysis with existing studies further validates the accuracy of the present findings, providing strength into flow control mechanisms and heat transfer enhancement strategies relevant to industrial thermal systems. Moreover, results indicate that increasing the magnetic field parameter increases the shear rate by 65.62 %, whereas thermal dissipation reduces the heat transfer rate by 13.63 %. Additionally, an Artificial Neural Network (ANN) model is employed to predict drag force, heat transfer, and mass transfer rates, achieving a validation accuracy exceeding 99 % with a mean squared error (MSE) of approximately 10<sup>−11</sup> and a regression coefficient (<em>R</em>) close to 1 for each case. Moreover, the inputs for predicting drag force are provided to the ANN by varying the values of curvature parameter <span><math><mrow><mo>(</mo><mrow><mi>K</mi><mrow><mo>(</mo><mrow><mn>1</mn><mo>−</mo><mn>5</mn></mrow><mo>)</mo></mrow></mrow><mo>)</mo></mrow></math></span>, magnetic parameter <span><math><mrow><mo>(</mo><mrow><mi>M</mi><mrow><mo>(</mo><mrow><mn>1</mn><mo>−</mo><mn>3</mn></mrow><mo>)</mo></mrow></mrow><mo>)</mo></mrow></math></span> and slip parameter <span><math><mrow><mo>(</mo><mrow><msub><mi>λ</mi><mn>1</mn></msub><mrow><mo>(</mo><mrow><mn>0.1</mn><mo>−</mo><mn>0.3</mn></mrow><mo>)</mo></mrow></mrow><mo>)</mo></mrow></math></span>, the heat transfer rate controlled by appropriately adjusting the parameters <span><math><mrow><mi>K</mi><mrow><mo>(</mo><mrow><mn>1</mn><mo>−</mo><mn>5</mn></mrow><mo>)</mo></mrow><mo>,</mo><mi>M</mi><mrow><mo>(</mo><mrow><mn>1</mn><mo>−</mo><mn>3</mn></mrow><mo>)</mo></mrow><mtext>,</mtext></mrow></math></span> heat source parameters <span><math><mrow><mo>(</mo><mrow><msup><mi>A</mi><mo>∗</mo></msup><mrow><mo>(</mo><mrow><mn>0.5</mn><mo>−</mo><mn>1.5</mn></mrow><mo>)</mo></mrow><mo>,</mo><msup><mi>B</mi><mo>∗</mo></msup><mrow><mo>(</mo><mrow><mn>0.1</mn><mo>−</mo><mn>0.4</mn></mrow><mo>)</mo></mrow></mrow><mo>)</mo></mrow></math></span> and Eckert number <span><math><mrow><mo>(</mo><mrow><mi>E</mi><mi>c</mi><mrow><mo>(</mo><mrow><mn>0.1</mn><mo>−</mo><mn>0.3</mn></mrow><mo>)</mo></mrow></mrow><mo>)</mo></mrow></math></span> and the solutal rate is determined by adjusting the parameters <span><math><mrow><mi>K</mi><mrow><mo>(</mo><mrow><mn>1</mn><","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111629"},"PeriodicalIF":7.5,"publicationDate":"2025-07-02","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144523163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A decomposed fuzzy based fusion of decision-making and metaheuristic algorithm to select best unmanned aerial vehicle in agriculture 4.0 era","authors":"Rishabh Rishabh, Kedar Nath Das","doi":"10.1016/j.engappai.2025.111491","DOIUrl":"10.1016/j.engappai.2025.111491","url":null,"abstract":"<div><div>As the world embraces sustainable and smart solutions, agriculture is evolving through rapid technological advancements. Unmanned Aerial Vehicles (UAVs) are transforming smart farming, particularly for smallholder farmers, by reducing costs, saving time, and improving efficiency of agricultural tasks. This study aims to introduce a comprehensive group decision-making framework for selecting the most suitable UAV for agricultural purposes. Traditional Multi-Criteria Decision-Making (MCDM) methods face challenges with intricacies, non-linearity, limited exploration of solution space and weight distortion during defuzzification. To address these issues, this study introduces a novel Decomposed Fuzzy-based Non-Linear (DFNL) optimization model within Analytical Hierarchy Process (AHP), which directly extracts subjective crisp weights from DF-decisions. A hybrid metaheuristic algorithm is then proposed to solve this model efficiently. Additionally, objective weights are calculated using the CRiteria Importance Through Inter-criteria Correlation (CRITIC) method and qualitative data, enhancing the accuracy of the decision-making process. For ranking the UAV alternatives, the full Multiplicative form of the Multi-Objective Optimization by Ratio Analysis (MULTIMOORA) method is applied. The effectiveness of the proposed methodology is demonstrated through two extensive examples and validated via a case study focusing on the Indian subcontinent. Sensitivity analysis confirms its robustness and stability. The findings and novelties are supported by comparing with other extant models. This fusion of group decision-making methods and metaheuristic algorithms improves weight accuracy, reduces manual complexity, and adapts to uncertainty, offering policymakers actionable insights and a tailored approach for UAV selection.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111491"},"PeriodicalIF":7.5,"publicationDate":"2025-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144523165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lightweight algorithm for wildlife detection in outdoor environments based on you only look once version 8 network","authors":"Pengtao Jia, Yu Zhang","doi":"10.1016/j.engappai.2025.111544","DOIUrl":"10.1016/j.engappai.2025.111544","url":null,"abstract":"<div><div>To address the challenges posed by outdoor environments, such as obstructions, background noise, and the limited computational resources of detection devices that hinder accurate wildlife detection, we propose a lightweight wildlife detection algorithm named Wildlife-You Only Look Once (WL-YOLO). First, the High-level Screening-feature Pyramid Networks (HS-FPN) is introduced to enhance the processing of multi-scale wildlife features. This architecture captures comprehensive feature information while significantly reducing computational complexity. Next, the Depthwise Adaptive Spatial Attention (DASA) module is designed to improve the adaptability of the model to various environmental backgrounds, effectively addressing challenges such as object occlusion and complex backgrounds in wildlife images. Additionally, the Scylla Intersection over Union (SIoU) loss is utilized to optimize detection accuracy. Finally, a pruning method based on layer-adaptive magnitude-based pruning (LAMP) is applied to the model to trim redundant parameters. Experimental results indicate that WL-YOLO achieves comparable detection accuracy to the original YOLO model (version 8) while reducing parameters by 66.7% and computational load by 57.5%. It processes images at a speed of 344 frames per second. Furthermore, detection results indicate that WL-YOLO performs better in handling occlusions and complex environments. Compared to mainstream object detection algorithms, WL-YOLO achieves a favorable balance among detection accuracy, inference speed, and model complexity. The research suggests that WL-YOLO is better applicable in complex outdoor environments, offering a new pathway for research on wildlife diversity. The code will be available on GitHub (<span><span>https://github.com/xust-9527/WL-YOLO</span><svg><path></path></svg></span>).</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"159 ","pages":"Article 111544"},"PeriodicalIF":7.5,"publicationDate":"2025-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144523029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}