Forecasting the yield of wafer by using improved genetic algorithm, high dimensional alternating feature selection and SVM with uneven distribution and high-dimensional data

Autonomous Intelligent Systems Pub Date: 2022-09-26 DOI: 10.1007/s43684-022-00041-3

Qiuhao Xu, Chuqiao Xu, Junliang Wang



Abstract

Wafer yield prediction, as the basis of quality control, is dedicated to predicting quality indices of the wafer manufacturing process. In recent years, data-driven machine learning methods have received a lot of attention due to their accuracy, robustness, and convenience for the prediction of quality indices. However, existing studies mainly focus on the model level to improve the accuracy of yield prediction and do not consider the impact of data characteristics on yield prediction. To address this issue, a novel wafer yield prediction method is proposed. An improved genetic algorithm (IGA) serves as an under-sampling method to address the uneven distribution of the data, which comprises two problems: data overlap between finished and defective products, caused by the similarity of their manufacturing processes, and data imbalance, caused by too few defective samples. In addition, a high-dimensional alternating feature selection method (HAFS) selects the key influencing processes, that is, the key parameters, to avoid the overfitting that many input parameters cause in the prediction model. Finally, an SVM predicts the yield. Experiments are conducted on a public wafer yield prediction dataset collected from an actual wafer manufacturing system. IGA-HAFS-SVM achieves state-of-the-art results on this dataset, which confirms its effectiveness: the proposed method improves the AUC score, G-Mean and F1-score by 21.6%, 34.6% and 0.6%, respectively, compared with the conventional method. The experimental results also demonstrate the influence of data characteristics on wafer yield prediction.
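The abstract reports AUC, G-Mean and F1-score rather than plain accuracy, which is the usual choice when one class (here, defective wafers) is rare. The following is a minimal sketch of those three metrics using their standard textbook definitions; it is not the authors' evaluation code. The function names, the toy labels and scores, and the choice of the defective class as the positive class (label 1) are all assumptions made for illustration.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, fp, tn), treating `positive` as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fn, fp, tn


def g_mean(tp, fn, fp, tn):
    """Geometric mean of sensitivity (TPR) and specificity (TNR)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def f1_score(tp, fn, fp):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 defective wafers (label 1) among 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.6]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print("accuracy:", accuracy)
print("G-Mean:  ", g_mean(tp, fn, fp, tn))
print("F1:      ", f1_score(tp, fn, fp))
print("AUC:     ", auc(y_true, scores))
```

On this toy data the accuracy is 0.8 even though only one of the two defective wafers is detected, while G-Mean (about 0.66) and F1 (0.5) expose the weak minority-class performance. That gap is why imbalance-aware metrics are a natural fit for the uneven data distribution the abstract describes.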



