CPDD: A Cross-Scenario Photovoltaic Defect Detector Based on Fine-Grained Feature Autoencoding and Pseudo-Box Contrastive Learning
Authors: Zhaoyang Wang; Haiyong Chen; Zhen Cao
Journal: IEEE Transactions on Semiconductor Manufacturing, vol. 38, no. 3, pp. 612-623
Publication date: 2025-03-15
Publication type: Journal Article
DOI: 10.1109/TSM.2025.3570323
URL: https://ieeexplore.ieee.org/document/11005435/
Impact factor: 2.3 (JCR Q2, Engineering, Electrical & Electronic)
Region category: Engineering and Technology (region 3)
Open access: no
Citations: 0
Abstract
Vision foundation models, which rely on large-scale pre-training, offer advanced image comprehension and excel in general scenarios. However, their performance remains suboptimal on specialized tasks such as photovoltaic (PV) cell defect detection. This limitation stems from the models’ lack of domain-specific prior knowledge. To address this, we propose a two-stage pre-training framework comprising fine-grained feature autoencoding (FFA) and pseudo-box contrastive learning (PCL), which leverages extensive unlabeled raw images to inject domain expertise into the model. First, we investigate the fine-grained feature autoencoder, which trains a detail-sensitive vision transformer (ViT) backbone by reconstructing the histogram of oriented gradients (HOG) of masked images. Second, we pre-train the detection head through contrastive learning: using selective search (SS) to generate pseudo-boxes, we treat paired boxes from two augmented views of an image as positive samples. The abundant unsupervised pseudo-boxes improve the detection head’s local representation and localization capabilities. Finally, we fully fine-tune the model with labeled images. Based on this methodology, we build the cross-scenario photovoltaic defect detector (CPDD). Experimental results show that CPDD achieves state-of-the-art (SOTA) mAP50 scores on three benchmarks, outperforming detectors pre-trained on the COCO dataset as well as those designed specifically for PV defect detection.
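The pseudo-box pairing described in the abstract can be illustrated with a small sketch. The abstract does not state which contrastive objective CPDD uses, so the InfoNCE loss below is an assumption chosen for illustration. The names `info_nce`, `l2_normalize`, and `temperature` are hypothetical, and the embeddings stand in for detection-head features of pseudo-boxes; this is not the authors' implementation.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def info_nce(view_a, view_b, temperature=0.1):
    """Mean InfoNCE loss over paired pseudo-box embeddings.

    Assumption: view_a[i] and view_b[i] embed the same pseudo-box seen in two
    augmented views (the positive pair); every other box in view_b serves as
    a negative for anchor i.
    """
    if not view_a or len(view_a) != len(view_b):
        raise ValueError("views must be non-empty and contain the same number of boxes")
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity between the anchor and every box in the other view.
        logits = [sum(x * y for x, y in zip(anchor, other)) / temperature for other in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the paired box as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)


if __name__ == "__main__":
    boxes_view_1 = [[1.0, 0.0], [0.0, 1.0]]
    boxes_view_2 = [[0.9, 0.1], [0.1, 0.9]]
    print(info_nce(boxes_view_1, boxes_view_2))
```

In this sketch the loss shrinks as each box embedding moves closer to its counterpart in the other view and farther from the remaining boxes, which is the behavior the abstract attributes to the pseudo-box pre-training of the detection head.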
About the journal:
The IEEE Transactions on Semiconductor Manufacturing addresses the challenging problems of manufacturing complex microelectronic components, especially very-large-scale integrated (VLSI) circuits. Manufacturing these products requires precision micropatterning, precise control of material properties, ultraclean work environments, and complex interactions of chemical, physical, electrical, and mechanical processes.