Clustering Federated Learning for Wafer Defects Classification on Statistical Heterogeneous Data

Impact factor: 5.6 · Zone 2, Engineering & Technology · JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Instrumentation and Measurement · Pub Date: 2024-09-05 · DOI: 10.1109/TIM.2024.3415785

Guang Yang; Zhijia Yang; Shuping Cui; Chunhe Song; Jizhou Wang; Haodong Wei


Citations: 0

Abstract


Clustering Federated Learning for Wafer Defects Classification on Statistical Heterogeneous Data

Data-driven deep learning techniques for wafer defect image classification give wafer manufacturers a tool for rapidly identifying surface defects. However, the defect data and computational capabilities of a single wafer manufacturer are often insufficient to support the training of deep learning models. In response, we introduce federated learning (FL), a paradigm that leverages the data and computational capabilities of multiple wafer manufacturers while ensuring that each manufacturer’s original data remain unexposed to the others. Due to variations in manufacturing processes and image acquisition equipment, identical wafer defects can exhibit different features in different manufacturing settings, leading to statistically heterogeneous datasets. This heterogeneity can reduce model convergence speed and accuracy. To counteract this issue, we propose a personalized FL approach with clustering. In the personalization phase, we train distinct network layers for each client’s local model, capitalizing on the feature extraction capability of the global model’s shallow network while also achieving strong performance on each client’s unique dataset. In the clustering phase, we provide a theoretical analysis demonstrating that the divergence of weights between two models is bounded above, which lays a theoretical foundation for the clustering operation. We then enhance a density-based clustering method so that clients with similar data features can be clustered without specifying the number of cluster centers, thus mitigating the problem of global model oscillation. We have conducted experiments under various data heterogeneity scenarios. The experiments show that our method achieves an average accuracy improvement of 2.8% over the compared state-of-the-art federated methods, with a faster convergence rate.
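The abstract does not give the algorithm itself, so the following is only a rough sketch of the general idea, not the authors' method. It assumes each client holds a flattened vector of shared shallow-layer weights plus a personalized part that never leaves the client, clusters clients with a plain textbook DBSCAN on the Euclidean distance between the shallow weights, and averages the shallow weights within each cluster. The names `shallow`, `head`, `dbscan`, and `average_shared`, the values of `eps` and `min_pts`, and the toy weights are all hypothetical; the paper's enhanced clustering method and its divergence bound are not reproduced here.

```python
import math
from typing import Dict, List, Optional

Vector = List[float]


def l2(a: Vector, b: Vector) -> float:
    """Euclidean distance between two flattened weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dbscan(points: List[Vector], eps: float, min_pts: int) -> List[int]:
    """Plain DBSCAN: one label per point, -1 for noise.

    The number of clusters is not an input; it follows from eps and min_pts.
    """
    n = len(points)
    labels: List[Optional[int]] = [None] * n  # None means "not visited yet"
    # Each point counts as its own neighbor (distance 0).
    neighbors = [
        [j for j in range(n) if l2(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return [label if label is not None else -1 for label in labels]


def average_shared(
    clients: List[Dict[str, Vector]], labels: List[int], shared_key: str = "shallow"
) -> Dict[int, Vector]:
    """Average only the shared weights inside each cluster (noise is skipped)."""
    models: Dict[int, Vector] = {}
    for c in sorted(set(labels)):
        if c == -1:
            continue
        members = [clients[i][shared_key] for i, lab in enumerate(labels) if lab == c]
        models[c] = [sum(col) / len(members) for col in zip(*members)]
    return models


# Toy data (hypothetical): three clients near one point, two near another.
clients = [
    {"shallow": [0.0, 0.0], "head": [1.0]},
    {"shallow": [0.2, 0.0], "head": [2.0]},
    {"shallow": [0.0, 0.4], "head": [3.0]},
    {"shallow": [5.0, 5.0], "head": [4.0]},
    {"shallow": [5.0, 5.4], "head": [5.0]},
]
labels = dbscan([c["shallow"] for c in clients], eps=1.0, min_pts=2)
cluster_models = average_shared(clients, labels)
print(labels)
print(cluster_models)
```

On this toy input the first three clients fall into one cluster and the last two into another, and each cluster gets its own averaged shallow weights while every `head` stays untouched on its client. Averaging per cluster rather than across all clients is one way to keep dissimilar clients from pulling a single global model back and forth.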


Source journal

IEEE Transactions on Instrumentation and Measurement (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 9.00
Self-citation rate: 23.20%
Articles published: 1294
Review time: 3.9 months

Journal description: Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.