A novel industrial thermoelectric cooler component defect vision transformer detector based on local and global features fusion

Impact Factor: 3.3 | Zone 3 (Computer Science) | JCR Q2 (Computer Science, Artificial Intelligence)

Pattern Recognition Letters | Pub Date: 2025-06-27 | DOI: 10.1016/j.patrec.2025.06.022

Jie Tu, Mengjie Tang, Yong Han, Daren Wei, Kelvin K.L. Wong

Volume: 196 | Pages: 257-266 | Full text: https://www.sciencedirect.com/science/article/pii/S0167865525002508

Citations: 0

Abstract

Thermoelectric coolers (TECs) are crucial in industries requiring precise temperature control, such as electronics, telecommunications, aerospace, and semiconductor manufacturing. During the manufacturing of TEC components, defects including cracks, pits, and contamination frequently occur, compromising performance and service life. Traditional manual inspection methods are inefficient and error-prone, motivating the need for an automated and accurate defect detection approach. To address the challenges posed by the subtle, diverse, and randomly distributed defects on TEC components, we propose the Local Feature Enhance and Feature Fusion Network (LFEFFN), a hybrid model integrating convolutional neural networks (CNNs) and Transformer architectures to simultaneously capture local details and global contextual information. Specifically, the model enhances the traditional patch embedding module using affine transformations and overlapping convolutional layers, incorporates a Local Feature Extraction Module (LFEM) based on depthwise separable convolutions, and employs a Global-to-Local Feature Fusion Module (GLFM) to effectively merge features. Extensive experiments were conducted on a custom TEC dataset of 4800 images representing seven defect states, employing stratified sampling for training, validation, and testing. Cross-domain validation was also performed using the publicly available DAGM 2007 dataset. The LFEFFN achieved a Top-1 accuracy of 94.73 % and a macro-average F1 score of 0.934, outperforming state-of-the-art CNN-based and Transformer-based models. Robustness evaluations under varied lighting (±50 %), rotation (±30°), and resolution changes (50 % and 150 %) demonstrated minimal performance degradation, confirming the model's resilience in complex industrial environments. Cross-domain testing on the DAGM 2007 dataset yielded a Top-1 accuracy of 85.62 %, highlighting the model's strong generalization ability. Ablation studies further validated the contributions of each module and parameter configuration, and deployment analysis showed an average inference time of 0.05 s per image, satisfying real-time industrial application requirements.
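
The abstract reports both a Top-1 accuracy and a macro-average F1 score. As a minimal sketch of how these two metrics are conventionally defined (this is not the authors' evaluation code; the function names and the toy labels below are assumptions made purely for illustration):

```python
from collections import Counter


def top1_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[t] += 1          # t was missed
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy labels (not taken from the TEC dataset).
y_true = ["ok"] * 8 + ["crack"] * 2
y_pred = ["ok"] * 8 + ["ok", "crack"]
print(f"Top-1 accuracy: {top1_accuracy(y_true, y_pred):.3f}")
print(f"Macro F1:       {macro_f1(y_true, y_pred):.3f}")
```

Because macro-averaging weights every class equally regardless of how many samples it has, the macro F1 can differ noticeably from Top-1 accuracy when a rare class is misclassified, which is one reason to report both.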

Source journal

Pattern Recognition Letters (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 12.40
Self-citation rate: 5.90%
Articles published: 287
Review time: 9.1 months

Journal description: Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association for Pattern Recognition, and other developing themes involving learning and recognition.