Engineering Applications of Artificial Intelligence: Latest Articles

Compact convolutional transformers- generative adversarial network for compound fault diagnosis of industrial robot
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-14 DOI: 10.1016/j.engappai.2024.109315
Abstract: The safe operation of industrial robots is a major concern in intelligent manufacturing. Accurate compound fault diagnosis is essential to the safe operation of industrial robots, but it is challenging to achieve because compound fault samples are hard to collect. Generative adversarial networks (GANs) are a useful tool for addressing the data imbalance issue. However, the computational efficiency of GANs in addressing the data imbalance issue has not been investigated. Hence, this study proposes a lightweight GAN named compact convolutional Transformers-GAN (CCT-GAN) to alleviate the data imbalance issue in compound fault diagnosis modelling. First, the feedback current signals collected from the industrial robot are transformed into time-frequency images via the continuous wavelet transform (CWT). Second, CCT-GAN is designed to achieve high-quality fake data generation and compound fault diagnosis modelling without large computational costs. Third, the relation between a single fault and the compound fault is considered in the compound fault diagnosis modelling via multi-hot representation to alleviate the data imbalance issue. An experimental study based on a real-world compound fault dataset of industrial robots reveals that the proposed CCT-GAN shows merits in compound fault diagnosis modelling in comparison with the prevailing algorithms. The results indicate that CCT-GAN can perform compound fault diagnosis when only 100 data samples from each compound fault category are available.
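Illustrative note: the multi-hot representation mentioned in the abstract above can be sketched as follows. This is only a minimal sketch, not the authors' implementation. The fault names in `SINGLE_FAULTS` are hypothetical, since the abstract does not list the actual fault categories. The point is that a compound fault is encoded as the union of its single faults, so single-fault samples also carry information about compound classes.

```python
# Hypothetical single-fault names; the abstract does not list the real categories.
SINGLE_FAULTS = ["gear_wear", "bearing_damage", "belt_looseness"]


def multi_hot(active_faults):
    """Return a 0/1 vector with one position per single fault."""
    unknown = set(active_faults) - set(SINGLE_FAULTS)
    if unknown:
        raise ValueError(f"unknown fault(s): {sorted(unknown)}")
    return [1 if name in active_faults else 0 for name in SINGLE_FAULTS]


# A single fault sets one position; a compound fault sets several.
single = multi_hot(["gear_wear"])
compound = multi_hot(["gear_wear", "belt_looseness"])
print(single)    # [1, 0, 0]
print(compound)  # [1, 0, 1]
```

With this encoding the diagnosis task becomes multi-label classification (one binary output per single fault) rather than one class per fault combination.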
Citations: 0
Image captioning by diffusion models: A survey
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-14 DOI: 10.1016/j.engappai.2024.109288
Abstract: Diffusion models are increasingly favored over traditional approaches like generative adversarial networks (GANs) and auto-regressive transformers due to their remarkable generative capabilities. They demonstrate outstanding performance not only in image generation and manipulation but also in text-related tasks. Despite this, existing surveys tend to concentrate on the use of diffusion models solely for image generation, ignoring their potential in image captioning. To address this oversight, our paper provides an exhaustive examination of image-to-text diffusion models within the landscape of artificial intelligence (AI) and generative computing, filling a critical void in the literature. Starting with an overview of basic diffusion model principles, we explore the enhancements brought by conditioning or guidance and the implemented AI. We then present a taxonomy and review of cutting-edge methods in diffusion-based image captioning. Additionally, we explore applications beyond image-to-text generation, such as image-guided creative generation, text editing, and the application of AI. We also cover existing evaluation metrics, software and libraries, as well as challenges and future directions in the field.
Citations: 0
An efficient identity-preserving and fast-converging hybrid generative adversarial network inversion framework
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-13 DOI: 10.1016/j.engappai.2024.109287
Abstract: In this paper, we present a novel Hybrid Generative Adversarial Network (HGAN) inversion framework that enables facial images to be rapidly inverted while preserving identity and personality characteristics. Accurate inversion of facial images requires high precision in computer vision and is critical to the success of future facial manipulations (age progression, regression, accessory, and hair stylization). However, existing methods often fail to preserve the personality characteristics of the real image, negatively affecting the accuracy of manipulations. In this context, our key contribution lies in using a transformer-based strategy to initiate the generator, which effectively models spatial relationships for detailed image processing. This approach is innovative because it leverages transformer structures to enhance image inversion tasks. Additionally, we introduce a novel loss function to enhance convergence speed and reliability, ensuring high accuracy in identity and personality trait preservation. Experimental results show that our method achieves a reconstruction accuracy of 93% and improves inversion time by 86%. This advancement could significantly impact facial manipulation technologies, laying the foundation for a technological breakthrough with potential applications in secure digital authentication systems and personal data protection. Our method may have a significant impact on privacy and security in future studies, contributing to the development of secure digital authentication systems and enhancing the protection of personal data. Therefore, our work is crucial for advancing the field of facial image manipulation and ensuring the privacy and security of personal data.
Citations: 0
Exploring explainable ensemble machine learning methods for long-term performance prediction of industrial gas turbines: A comparative analysis
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-13 DOI: 10.1016/j.engappai.2024.109318
Abstract: In today's modern life, where electricity demand is one of the fundamental necessities, gas turbines play a pivotal role in meeting this demand. As such, it is imperative to address the challenges faced in the field. Current models often rely on simplifying assumptions, neglecting the intricate relationships between variables. This limitation leads to reduced accuracy and reliability, ultimately affecting the overall efficiency of gas turbine systems. Furthermore, the complexity of gas turbine behavior, coupled with the scarcity of comprehensive datasets, exacerbates the problem.
To address these challenges, this research aimed to develop an advanced model capable of accurately forecasting real gas turbine behavior. The proposed approach leveraged ensemble decision trees, robust preprocessing techniques, and rigorous evaluation using an extensive dataset spanning from 2011 to 2015. The training and validation phases were conducted on data from 2011 to 2014, with the 2015 dataset reserved for evaluation.
The results demonstrated that the bagging structure outperformed the boosted structure, exhibiting lower complexity and higher reliability. Remarkably, the bagging approach with only 30 estimators achieved a superior root mean square error of 1.4176, outperforming the boosted trees with 200 learners. The model effectively captured the overall gas turbine performance, though it encountered limitations in certain specific operating ranges.
To further investigate the model's behavior, an evaluation was conducted to assess the effects of the input variables on the output power. While the interpretability of the results posed some challenges, the overall findings were deemed acceptable and provide valuable insights for optimizing gas turbine performance. The significance of this research lies in its potential to inform decision-making and enhance the efficiency of gas turbine systems, ultimately contributing to the reliable and sustainable supply of electricity.
Citations: 0
Unmasking colorectal cancer: A high-performance semantic network for polyp and surgical instrument segmentation
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-13 DOI: 10.1016/j.engappai.2024.109292
Abstract: Colorectal cancer (CRC) remains a significant health concern, with colonoscopy serving as the gold standard for diagnosis. Accurately segmenting polyps from colonoscopy images is crucial for detecting polyps and preventing CRC. However, challenges such as varying polyp sizes, blurred edges, and uneven brightness hinder segmentation accuracy. Leveraging artificial intelligence (AI) and robot-assisted surgery mechanisms can aid surgeons and physicians in detecting and treating polyps. To address these challenges, we propose a Colorectal Network (CR-Net), an AI-based encoder-decoder network for precise polyp and surgical instrument segmentation. CR-Net incorporates a pre-trained Visual Geometry Group model with 16 convolution layers (VGG16), attention mechanisms, redesigned skip connections, and horizontal dense connections within a U-Net architecture. The VGG16 encoder captures robust visual features, while redesigned skip connections accommodate complex data dimensions, leading to enhanced segmentation outcomes. Horizontal dense connections transfer overlooked features from the encoder to subsequent layers, further improving segmentation accuracy. Additionally, a spatial attention block enhances spatial features and ensures compatibility during upsampling. Evaluation on datasets including the Kvasir segmentation (Kvasir-SEG) dataset, Computer Vision Center Clinic Database (CVC-ClinicDB), Kvasir-Instrument dataset, and University of Washington Sinus Surgery Live (UW-Sinus-Surgery-Live) dataset demonstrates CR-Net's superior performance, achieving Dice Similarity Coefficients of 96.21%, 96.54%, 96.32%, and 92.84%, respectively, surpassing previous methods. These results highlight CR-Net's potential in empowering healthcare professionals through advanced AI-driven engineering applications. By bridging AI techniques with engineering innovations, CR-Net represents a significant advancement in CRC diagnosis and treatment.
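Illustrative note: the Dice Similarity Coefficient reported in the abstract above is the standard overlap metric 2|P ∩ T| / (|P| + |T|) between a predicted mask P and a ground-truth mask T. The sketch below computes it for flat binary masks in plain Python. The function name, the toy masks, and the choice to return 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2*|P ∩ T| / (|P| + |T|) for flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Convention assumed here: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 3 predicted pixels, 3 true pixels, 2 of them shared.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(round(dice_coefficient(pred, target), 4))  # 0.6667
```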
Citations: 0
Sustainable management of polyethylene terephthalate waste flow using a fuzzy frank weighted assessment model
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-13 DOI: 10.1016/j.engappai.2024.109254
Abstract: The consequences for the ecosystem of polyethylene terephthalate (PET) waste are becoming increasingly significant and widespread. Companies managing PET waste strive to enhance sustainability in all areas. The development of systematic decision-making approaches and frameworks for PET waste management is strongly needed. This research aims to present a new methodological framework for the categorization of the most efficient PET waste management solutions. The introduced fuzzy Frank weighted sum product assessment (FWESPA) model enables rational and flexible reasoning by nonlinearly processing uncertain information. A nonlinear aggregation function is proposed for the fusion of fuzzy strategic options. It is advantageous in simulating the impact of strategic options on a final decision. An integral part of the introduced fuzzy FWESPA model is a reverse sorting algorithm. This innovative algorithm can improve the performance of traditional normalization techniques. Also, an improved fuzzy Frank ordinal priority approach linear model is formulated to define the significance of evaluation criteria. The comprehensive real-life study demonstrates the proposed decision-analytics-based approach. The results showed the following rankings of considered alternatives: “recycling” (ℤ_A2 = 0.8565) > “energy recovery” (ℤ_A1 = 0.7364) > “remanufacturing” (ℤ_A4 = 0.690) > “incineration” (ℤ_A3 = 0.6592). Based on the results presented, alternatives “recycling” (A_2) and “energy recovery” (A_1) represent dominant alternatives with a slight advantage of recycling. Research findings can be used when deciding the appropriate way to enhance PET waste handling. The findings also describe the benefits and limitations of each treatment option for PET waste, as well as highlight the crucial challenges.
Citations: 0
Enhancing prognostics for sparse labeled data using advanced contrastive self-supervised learning with downstream integration
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-13 DOI: 10.1016/j.engappai.2024.109268
Abstract: Data-driven Prognostics and Health Management (PHM) requires extensive and well-annotated datasets for developing algorithms that can estimate and predict the health state of systems. However, acquiring run-to-failure data is costly, time-consuming, and often lacks comprehensive sampling of failure states, limiting the effectiveness of PHM models. This paper explores the use of Self-Supervised Learning (SSL) in PHM, addressing key limitations and proposing a novel contrastive SSL approach using a nested siamese network structure to enhance degradation feature representation. The model’s performance with sparse data improves by integrating downstream task information, particularly Remaining Useful Life (RUL) prediction, into the siamese structure during SSL pre-training. This approach enforces a consistency condition that failure times for two samples from the same monitoring sequence be identical. The proposed method demonstrates superior performance on the PRONOSTIA bearing dataset, outperforming state-of-the-art methods even with sparse labeling. Furthermore, the study delves into the impact of the upstream–downstream relationship in learning processes, asserting that fine-tuning significantly enhances RUL prediction by leveraging the foundational behaviors established during pre-training. Fine-tuning refines the model’s ability to capture subtle degradation patterns by building on the initial feature representations learned in pre-training, thereby improving accuracy and robustness in RUL predictions. The generalizability of the proposed strategy is confirmed through an end-to-end tool wear prediction in a real industrial environment, illustrating the applicability of the proposed method across various datasets and models, and providing effective solutions for sparse data scenarios in prognostics.
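Illustrative note: the consistency condition described in the abstract above (two samples from the same monitoring sequence should imply the same failure time) can be sketched as a simple penalty. The abstract does not give the actual loss, so the squared-difference form and all names below are assumptions for illustration only. For a sample observed at time t with predicted RUL r, the implied failure time is t + r.

```python
def failure_time_consistency(t_a, rul_a, t_b, rul_b):
    """Squared gap between the failure times implied by two samples.

    Each sample observed at time t with predicted remaining useful life r
    implies a failure time of t + r. The squared-difference penalty is an
    assumed form, not the loss from the paper.
    """
    implied_a = t_a + rul_a
    implied_b = t_b + rul_b
    return (implied_a - implied_b) ** 2


# Consistent pair: both samples imply failure at t = 500.
print(failure_time_consistency(100.0, 400.0, 250.0, 250.0))  # 0.0
# Inconsistent pair: implied failure times are 500 and 550.
print(failure_time_consistency(100.0, 400.0, 250.0, 300.0))  # 2500.0
```

A penalty of this kind needs no RUL labels, which is why it can be applied during self-supervised pre-training.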
Citations: 0
Dictionary domain adaptation transformer for cross-machine fault diagnosis of rolling bearings
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-13 DOI: 10.1016/j.engappai.2024.109261
Abstract: Domain adaptation (DA) techniques have significantly promoted the fault diagnosis of rolling bearings by leveraging diagnostic knowledge from a labeled source domain to recognize faults in an unlabeled target domain. However, dominant DA models often suffer from inaccurate estimation of distribution discrepancies. This stems from the fact that they perform domain alignment on a batch-by-batch basis, where the distribution discrepancies are evaluated solely using mini-batch data. In this paper, a novel dictionary domain adaptation transformer (DDAT) is proposed to boost cross-machine fault diagnosis of rolling bearings. First, a feature dictionary is constructed to represent domain attributes using multi-batch data, enabling more accurate estimation of the domain gap compared to existing batch-based methods. Second, a novel dictionary adaptation framework is designed to direct the model to focus on inter-domain discrepancy instead of intra-domain variations caused by random sampling in data batches. Third, a domain-shared transformer feature extractor is developed to learn domain-invariant representations by leveraging the inherent advantages of multi-head attention in capturing long-range dependencies. The proposed DDAT method conducts domain adaptation at the dictionary level, benefiting from a more accurate estimation of distribution discrepancies by leveraging the abundant and diverse data in the dictionary. Experiments confirm that the proposed DDAT method outperforms the popular deep domain adaptation models in various cross-machine diagnosis tasks of rolling bearings.
Citations: 0
BFFN: A novel balanced feature fusion network for fair facial expression recognition
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-13 DOI: 10.1016/j.engappai.2024.109277
Abstract: Facial expression recognition (FER) technology has become increasingly mature and applicable in recent years. However, it still suffers from expression class bias, which can lead to unfair decisions for certain expression classes in applications. This study aims to mitigate expression class bias through both pre-processing and in-processing approaches. First, we analyze the output of existing models and demonstrate the existence of obvious class bias, particularly for underrepresented expressions. Second, we develop a class-balanced dataset constructed through data generation, mitigating unfairness at the data level. Then, we propose the Balanced Feature Fusion Network (BFFN), a class fairness-enhancing network. The BFFN mitigates the class bias by adding facial action units (AU) to enrich expression-related features and allocating weights in the AU feature fusion process to improve the extraction ability of underrepresented expression features. Finally, extensive experiments on two datasets (RAF-DB and AffectNet) provide evidence that our BFFN outperforms existing FER models, improving fairness by at least 16%.
Citations: 0
An efficient detector for detecting surface defects on cold-rolled steel strips
IF 7.5, Zone 2, Computer Science
Engineering Applications of Artificial Intelligence Pub Date : 2024-09-13 DOI: 10.1016/j.engappai.2024.109325
Abstract: Surface-defect inspection is vital in cold-rolled steel-strip manufacturing, given the complexities of production environments and the high speeds involved. Further, the defects on cold-rolled steel strips are often characterized by their small size, diversity of types, and similarities among different types, posing significant challenges in balancing detection accuracy and efficiency. To address these challenges, we designed a detector based on You Only Look Once version 5 (YOLOv5) to achieve precise detection of surface defects on cold-rolled steel strips. First, a dataset containing seven types of defects was curated, named the Cold-Rolled Steel Defect Dataset (CR7-DET). Next, a feature-extraction network based on residual-like connections within a single residual block (Res2net) was developed to enhance the model’s feature-extraction capability, alongside introducing a multi-head attention module to focus on key information features. To reduce the information loss during feature fusion, we established an adaptive feature-fusion Path Aggregation Network (aff-PAN), which was optimized by designing a lightweight adaptive down-sampling module (LAD) to increase the sensory-field implementation of feature fusion. The ghost convolution effectively reduced the number of parameters and increased the speed without affecting the model’s performance. Finally, experiments were conducted on our CR7-DET and a public dataset (GC10-DET). With a reduced parameter count of 6.85 million, our model achieved a mean average precision (mAP) of 87.6% on CR7-DET and 79.7% on GC10-DET. The experimental results demonstrated that our model achieved a balance between detection accuracy and inference efficiency. The model has the potential to reduce scrap rates caused by defects and improve the overall surface quality of cold-rolled steel strips.
Citations: 0