Engineering Applications of Artificial Intelligence: Latest Articles

An enhanced generative adversarial network for longer vibration time data generation under variable operating conditions for imbalanced bearing fault diagnosis
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-09 DOI: 10.1016/j.engappai.2025.110760
Teng Wang , Zhi Chao Ong , Shin Yee Khoo , Pei Yi Siow , Jinlai Zhang , Tao Wang
Abstract: As a typical data augmentation method, generative adversarial networks are widely applied to data scarcity problems in imbalanced bearing fault diagnosis. However, these methods still struggle to generate longer data because of the risk of mode collapse and training instability. To address this issue, an enhanced generative adversarial network is proposed for generating longer vibration time data to improve imbalanced bearing fault diagnosis under variable operating conditions. First, a dual cross-frequency attention block is integrated into the discriminator to adaptively extract intra-component and inter-component features across the low- and high-frequency components obtained by wavelet decomposition, which helps the generator produce longer synthetic time data with higher frequency resolution. Furthermore, a sequence information block is introduced to generate synthetic time data under variable operating conditions by incorporating specific operating condition information into the generator. To speed up synthetic data generation under variable operating conditions, healthy data corresponding to these conditions replace random noise as the generator input. Finally, the proposed method is validated through experiments on two bearing datasets for imbalanced bearing fault diagnosis. With 2048-length synthetic data, it achieves the highest accuracies of 98.75% and 96.88% on the two datasets, respectively, outperforming state-of-the-art methods. The proposed method can therefore effectively address the challenge of generating longer vibration time data, improving diagnostic accuracy in imbalanced bearing fault diagnosis under variable operating conditions.
Volume 151, Article 110760
Citations: 0
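The abstract above refers to low- and high-frequency components obtained by wavelet decomposition. As a minimal sketch of what such a split looks like, the following uses a single-level Haar transform. The Haar wavelet and the function names `haar_dwt` and `haar_idwt` are assumptions made for illustration; the abstract does not state which wavelet family or decomposition depth the authors use.

```python
import math


def haar_dwt(signal):
    """One level of the Haar wavelet transform (illustrative choice of wavelet).

    Returns (low, high): the approximation (low-frequency) and detail
    (high-frequency) coefficients, each half the length of the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return low, high


def haar_idwt(low, high):
    """Invert haar_dwt, rebuilding the original samples pair by pair."""
    scale = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / scale)
        out.append((a - d) / scale)
    return out
```

A 2048-sample vibration segment would yield two 1024-coefficient sequences under this sketch, and `haar_idwt` recovers the original samples up to floating-point error.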
A goal-conditioned offline reinforcement learning algorithm and its application to quad-rotors
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-08 DOI: 10.1016/j.engappai.2025.110678
Haojun Zhong , Zhenlei Wang , Yuzhe Hao
Abstract: Goal-conditioned reinforcement learning has garnered significant academic interest due to its ability to accomplish reinforcement learning tasks with different goals. Nevertheless, its requirement for extensive online interaction with the environment makes it exceedingly dangerous in practical implementations. This paper proposes a goal-conditioned offline reinforcement learning algorithm called Goal-Conditional Twin Delayed Deep Deterministic with V-based Behavioral Cloning (GC-TD3VBC) and applies it to the navigation task of a quad-rotor. In this study, the offline reinforcement learning algorithm Twin Delayed Deep Deterministic Behavioral Cloning (TD3BC) is extended to deal with goal-conditioned tasks. Specifically, goal information is explicitly incorporated into TD3BC's value function network and policy network. Furthermore, to address the training inefficiency caused by sparse rewards under goal conditions, this study integrates TD3BC with Hindsight Experience Replay (HER). Additionally, it devises composite advantage action weights that guide the agent toward advantageous actions, based on the advantage function, during the policy network update. The proposed approach significantly improves the overall performance of the algorithm. Finally, the algorithm's practicality and performance were evaluated on an offline reinforcement learning benchmark dataset, and the algorithm was implemented on the target navigation task of a quad-rotor. The experimental results demonstrate that GC-TD3VBC can efficiently address target navigation tasks of the quad-rotor under goal conditions.
Volume 152, Article 110678
Citations: 0
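The abstract above combines TD3BC with Hindsight Experience Replay to cope with sparse goal-conditioned rewards. The core idea of HER can be sketched as relabeling a stored trajectory with a goal it actually reached. The sketch below uses the "final" relabeling strategy and a 0 / -1 sparse reward; both are assumptions for illustration, as are the dictionary keys and the function name `relabel_with_final_goal`. The abstract does not say which strategy or reward convention GC-TD3VBC uses.

```python
def relabel_with_final_goal(trajectory):
    """Relabel a trajectory with the goal it actually reached (HER, 'final' strategy).

    trajectory: list of dicts with keys 'state', 'action', 'achieved'
    (the goal-space outcome after the action) and 'goal' (the original goal).
    Returns new transition dicts; the input list is not modified.
    """
    new_goal = trajectory[-1]["achieved"]  # pretend this was the goal all along
    relabeled = []
    for step in trajectory:
        # Assumed sparse reward: 0 when the relabeled goal is met, -1 otherwise.
        reward = 0.0 if step["achieved"] == new_goal else -1.0
        relabeled.append({**step, "goal": new_goal, "reward": reward})
    return relabeled
```

With an offline dataset, a relabeling step of this kind turns trajectories that missed their original goal into examples that succeed for a different goal, which is what makes the sparse reward informative.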
Semi-supervised deep learning framework based on modified pyramid scene parsing network for multi-label fine-grained classification and diagnosis of apple leaf diseases
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-08 DOI: 10.1016/j.engappai.2025.110743
Ke-Jun Fan, Bo-Yuan Liu, Wen-Hao Su, Yankun Peng
Abstract: Apple leaf diseases have a significant impact on both the yield and quality of apples, and detecting them has traditionally required specialists. However, labor shortages in large orchards can reduce detection efficiency. Current diagnosis models based on semantic segmentation require a large amount of annotated data and have difficulty extracting detailed features. This study introduces a semi-supervised method using a modified Pyramid Scene Parsing Network (PSPNet) for segmenting apple leaves. When only 1/2, 1/4, and 1/8 of the annotated data were used, the Mean Intersection over Union (mIoU) reached 0.975, 0.974, and 0.965, respectively. The enhanced method outperformed the original model in edge segmentation, and the semi-supervised model performed comparably to fully supervised models. Additionally, a transformer-based fine-grained multi-label framework was developed for classifying apple leaf diseases. The F1-scores (the harmonic mean of Precision and Recall) for Alternaria blotch, brown spot, grey spot, mosaic, and rust were 0.855, 0.903, 0.919, 0.921, and 0.895, respectively. This framework reduces computational complexity compared with conventional detection techniques. The combination of semi-supervised segmentation and multi-label classification supports the development of phenotyping platforms for precise pesticide application and disease-resistant variety selection.
Volume 151, Article 110743
Citations: 0
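The two metrics reported in the abstract above have standard definitions that can be written down in a few lines. This is a minimal sketch: the function names `f1_score` and `mean_iou` and the `(tp, fp, fn)` input layout are illustrative assumptions, not the authors' evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_iou(per_class_counts):
    """Mean Intersection over Union from per-class pixel counts.

    per_class_counts: list of (tp, fp, fn) tuples, one per class.
    IoU for a class is tp / (tp + fp + fn).
    """
    ious = []
    for tp, fp, fn in per_class_counts:
        denom = tp + fp + fn
        if denom:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0
```

For a two-class leaf/background segmentation, `mean_iou` would be called with two tuples of pixel counts, one per class.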
RCSD-UAV: An object detection dataset for unmanned aerial vehicles in realistic complex scenarios
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-08 DOI: 10.1016/j.engappai.2025.110748
Wanxuan Geng , Junfan Yi , Ning Li , Chen Ji , Yu Cong , Liang Cheng
Abstract: Unmanned Aerial Vehicle (UAV) detection based on visible light plays an important role in urban low-altitude defense, public safety, and other fields. However, existing datasets are limited by single scenes, large objects, and other factors that deviate from real application scenarios, making it difficult to meet the needs of sample-driven deep learning for UAV detection in optical images. This paper therefore proposes a novel realistic complex scenarios UAV object dataset (RCSD-UAV) to provide training data for UAV detection models based on artificial intelligence technology. All data were captured with ordinary cameras or mobile phones in the real world and cover various commonly used UAV types and natural scenes. The dataset is classified by scene and object size, and several models were evaluated on it to provide benchmarks. The experimental results show that UAV detection is challenging because of small object size and complex backgrounds. The two-stage model detects well but has poor real-time performance, whereas the one-stage model better balances detection quality and real-time performance.
Volume 151, Article 110748
Citations: 0
A unified solution for replacing position embedding in Vision Transformer for object detection
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-08 DOI: 10.1016/j.engappai.2025.110679
XueZhuan Zhao , JiaWei Wang , LingLing Li , XiaoYan Shao , KeXin Zhang
Abstract: The traditional Vision Transformer (ViT) demonstrates outstanding performance in various computer vision tasks. However, this high performance relies on substantial training on large datasets, because position embedding, a key part of ViT, requires extensive data to be effective. Moreover, in diverse vision task scenarios, position embedding may not perform well and may even yield redundant information, which limits the overall model's flexibility and robustness. To tackle this problem, we propose a novel position-free embedding model, HV-SwinViT, which embeds both horizontal and vertical features by using the self-attention mechanism with position-free functions, replacing the positional embedding. To achieve this, we first propose a Horizontal and Vertical orientation Swin Transformer block (HVblock), which generates feature maps containing both horizontal and vertical information by adopting fully connected layers. We then design two hybrid sub-networks, HVblock in Backbone (HV-Swin-B) and Channel Fusion 2 with HVblock (Cf2-HV), which leverage convolution layers and HVblock to handle positional relationship information. We also define a learnable nonlinear activation function to increase sensitivity to nonlinear position features. Experimental results demonstrate that the proposed HV-SwinViT model achieves an improvement of over 0.1% in Average Precision (AP) compared with state-of-the-art methods on the MS-COCO2017 dataset. Additionally, our model outperforms several network architectures designed specifically for aerial photography targets, attaining AP scores of 35.6% and 32.1% on VisDrone2019 and AI-TOD, respectively. These results confirm that the HV-SwinViT model can be a unified solution and highlight the stability and robustness of our approach across diverse scenarios.
Volume 152, Article 110679
Citations: 0
Deep Content and Contrastive Perception learning for automatic fetal nuchal translucency image quality assessment
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-08 DOI: 10.1016/j.engappai.2025.110687
Lili Zhao , Yuanyuan Xu , Jian Xu , Weiping Ding , Jinzhao Yang , Huiyu Zhou , Yiming Du , Bin Hu , Lichi Zhang , Qian Wang
Abstract: Automatic quality assessment of fetal nuchal translucency ultrasound images can assist physicians in obtaining standard planes and improve the reproducibility of nuchal translucency screening. At present, there are no dedicated studies or methods for the quality assessment of fetal nuchal translucency ultrasound images. The main challenges of this task are low image quality, identifying structural integrity and relative positional relationships in the image content, and the time needed for data collection and fine-grained annotation. To address these challenges, we propose a framework based on the DenseNet model that includes a preprocessing module, a content perception module, an attention learning module, and a contrastive regularization module. Experiments show that these modules effectively improve the performance of the quality assessment framework, which also outperforms fourteen other deep learning models. The framework can provide the sonographer with an interpretable reference map. Bland–Altman analysis also verifies the consistency between the results of the automatic quality assessment framework and the manually annotated clinical dataset. The proposed quality assessment framework for fetal nuchal translucency ultrasound images therefore has prospects and value for clinical application.
Volume 152, Article 110687
Citations: 0
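The abstract above uses Bland–Altman analysis to check agreement between automatic and manual quality scores. A minimal sketch of the standard computation follows: the bias is the mean of the paired differences, and the 95% limits of agreement are the bias plus or minus 1.96 sample standard deviations. The function name `bland_altman` is an illustrative assumption, and nothing here reproduces the authors' data.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the differences a - b; the limits of agreement
    are bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods need the same number of measurements")
    if len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

Agreement is usually judged by whether most paired differences fall inside the two limits and whether that interval is narrow enough to be clinically acceptable.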
Orthogonal Diversity Nonnegative Matrix Factorization for multi-view clustering
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-08 DOI: 10.1016/j.engappai.2025.110715
Xinling Zhang , Chengcai Leng , Jinye Peng , Irene Cheng , Anup Basu
Abstract: With the rapid development of artificial intelligence, extracting valuable information from complex multidimensional data has become a core research problem. Multi-view clustering methods based on non-negative matrix factorization (NMF) are widely used in multi-view data analysis but still face many challenges in practical applications. Current multi-view clustering methods usually address diversity among views by orthogonalizing the view representations. However, they fail to fully utilize the rich features of each view, because data from different views may be interrelated. In addition, while emphasizing the diversity of view representations, existing methods do not fully consider the orthogonality between basis matrices. For this reason, this paper proposes a new orthogonal diversity non-negative matrix factorization method (ODNMF). First, ODNMF explores the orthogonality of the representations of sample pairs between different views, which preserves the characteristics of each view and enhances the diversity of data representations. Second, ODNMF orthogonalizes the basis matrix of each view to reduce redundant features and enhance data interpretability and representation. Finally, ODNMF introduces graph regularization for each view to reveal the intrinsic geometric and structural information of the features. Experimental results show that ODNMF significantly outperforms existing state-of-the-art algorithms on seven datasets.
Volume 152, Article 110715
Citations: 0
Multiple aerial/ground vehicles coordinated spraying using reinforcement learning
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-08 DOI: 10.1016/j.engappai.2025.110686
Ali Moltajaei Farid , Jafar Roshanian , Malek Mouhoub
Abstract: Investments in unmanned aerial vehicles (UAVs) have recently surged in precision agriculture. However, multi-UAV missions can face limitations due to weather conditions, highlighting the need for effective spray coverage. To tackle this challenge, a novel system tailored for spraying in windy conditions is proposed. Instead of directly controlling the sprayed drops, the system adjusts the location of the spraying UAVs based on real-time wind data. The proposed methodology consists of three stages. First, on-policy reinforcement learning (RL) with Proximal Policy Optimization (PPO) is used to optimize path planning. In the second stage, another PPO iteration is employed to correct wind drift, leveraging the latest wind data to enhance spray mission efficiency. Lastly, a novel algorithm is introduced to improve efficiency in narrow areas by substituting unmanned ground vehicles for unmanned aerial vehicles. To evaluate the efficiency of the proposed aerial spraying system, we conducted a simulation and report the corresponding results.
Volume 151, Article 110686
Citations: 0
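Both learning stages in the abstract above rely on Proximal Policy Optimization. The heart of PPO is its clipped surrogate objective, which limits how far one update can move the policy. The sketch below evaluates that objective for a single sample. The function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions for illustration; the abstract gives no hyperparameters.

```python
def ppo_clipped_objective(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate objective (to be maximized).

    ratio: new_policy_prob / old_policy_prob for the taken action.
    advantage: estimated advantage of that action.
    clip_eps: clipping range (0.2 is an assumed, commonly used value).
    """
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum makes the objective a pessimistic bound, so a large
    # policy change is never rewarded beyond the clipping range.
    return min(ratio * advantage, clipped_ratio * advantage)
```

In training, this value is averaged over a batch of transitions and maximized with respect to the policy parameters.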
Advanced computational modeling of Darcy-Forchheimer effects and nanoparticle-enhanced blood flow in stenosed arteries
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-08 DOI: 10.1016/j.engappai.2025.110737
Fatima Shafiq Hira , Qammar Rubbab , Irshad Ahmad , Afraz Hussain Majeed
Abstract: Enhancing blood flow in stenosed arteries through the optimization of nanoparticle-based solutions is one potential application of this research, which aims to improve medication delivery systems and therapies for cardiovascular disorders. This research employed artificial neural networks (ANNs) to analyze the Darcy-Forchheimer flow of hybrid nanoparticles embedded in magnetized blood flow across stenosed arteries. The ANNs were specifically used to consider the influence of heat generation and activation energy on this complex flow scenario. The heat transfer analysis accounts for multiple factors, such as thermal radiation, viscous dissipation, heat sources, and Joule heating, which collectively exert a substantial influence on the overall heat transfer process. Similarity transforms are applied to convert the original Partial Differential Equation (PDE) into a more manageable Ordinary Differential Equation (ODE). This ODE is solved numerically with MATLAB's built-in bvp4c scheme. The Levenberg–Marquardt Algorithm (LMA) is considered in multi-layer perceptron models with 10 neurons in the hidden layers. The ANN model is structured with a configuration that includes 9 input layers for data entry, 3 output layers for results, and 10 hidden layers that process information between the input and output stages. Increased porous parameter values indicate that thermal energy is held longer within the flow. Increasing the heat source and thermal radiation raises the Nusselt number, while increasing the flow parameter causes it to drop.
Volume 152, Article 110737
Citations: 0
A lightweight neural network search algorithm based on in-place distillation and performance prediction for hardware-aware optimization
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2025-04-07 DOI: 10.1016/j.engappai.2025.110775
Siyuan Kang , Yinghao Sun , Shuguang Li , Yaozong Xu , Yuke Li , Guangjie Chen , Fei Xue
Abstract: Due to the limited computing resources of edge devices, traditional object detection algorithms struggle to meet the efficiency and accuracy requirements of autonomous driving. Consequently, designing a neural network model that balances hardware resource requirements, operating speed, and accuracy is crucial. To address this, by integrating the algorithm with hardware characteristics, we propose a lightweight neural network architecture search algorithm based on in-place distillation and a performance predictor (LNIP). Initially, we focus on optimizing the operators of the you only look once version 8 nano (YOLOv8n) model and dynamically adjust its network structure. We then train a super-network using a progressive shrinking strategy, the sandwich rule, and in-place distillation. Subsequently, we employ a Gaussian process to model the relationship between network architecture and accuracy, using encoding methods and a custom kernel function to build a high-performance predictor. Finally, during the search process, we introduce a reward function based on Pareto optimality to balance the performance of the model with hardware constraints. Building on this foundation, we design an efficient search algorithm based on the performance predictor to progressively explore the optimal network structure tailored to hardware characteristics. We compared our lightweight network with state-of-the-art methods on the BDD100K, COCO, and PASCAL VOC datasets and deployed it on the Black Sesame A1000 and NVIDIA Xavier for comprehensive evaluation. On the NVIDIA Xavier, the lightweight network achieves a latency of 11.81 ms and an edge precision of 46.1%. These experimental results demonstrate that our method outperforms existing methods in balancing hardware constraints and model performance.
Volume 151, Article 110775
Citations: 0
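The abstract above balances accuracy against hardware cost through a reward based on Pareto optimality. The underlying notion can be sketched as a dominance filter over candidate architectures: a candidate stays on the Pareto front if no other candidate is at least as fast and at least as accurate while being strictly better on one of the two. The function name `pareto_front`, the tuple layout, and the candidate values in the usage note are hypothetical. This shows only the dominance test, not the authors' reward function.

```python
def pareto_front(candidates):
    """Return names of non-dominated candidates.

    candidates: list of (name, latency_ms, accuracy) tuples, where lower
    latency and higher accuracy are both preferred.
    """
    front = []
    for name, lat, acc in candidates:
        dominated = any(
            (o_lat <= lat and o_acc >= acc) and (o_lat < lat or o_acc > acc)
            for _, o_lat, o_acc in candidates
        )
        if not dominated:
            front.append(name)
    return front
```

For example, with the made-up candidates `("a", 10, 0.40)`, `("b", 12, 0.46)`, `("c", 15, 0.45)`, and `("d", 9, 0.30)`, candidate `c` is dominated by `b` (slower and less accurate), so only `a`, `b`, and `d` remain on the front.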