Model-Based OPC With Adaptive PID Control Through Reinforcement Learning

IF 2.3 · CAS Zone 3 (Engineering Technology) · JCR Q2 (Engineering, Electrical & Electronic)
Taeyoung Kim;Shilong Zhang;Youngsoo Shin
DOI: 10.1109/TSM.2025.3528735
Journal: IEEE Transactions on Semiconductor Manufacturing, vol. 38, no. 1, pp. 48-56
Published: 2025-01-20 (Journal Article)
Citations: 0

Abstract

Model-based optical proximity correction (MB-OPC) relies on a feedback loop in which the correction result, measured as edge placement error (EPE), is used to decide the next correction. Proportional-integral-derivative (PID) control is a popular mechanism for such a feedback loop, but current MB-OPC usually relies only on P control, because there is no systematic way to customize the P, I, and D coefficients for different layouts across OPC iterations. We apply reinforcement learning (RL) to construct a trained actor that adaptively yields PID coefficients within the correction loop. The RL model consists of an actor and a critic. We perform supervised pre-training to quickly set the initial weights of the RL model, with the actor mimicking standard MB-OPC. The critic is then trained to predict an accurate Q-value, the cumulative reward from OPC correction, and the actor is trained to maximize this Q-value. Experiments are performed with aggressive target maximum EPE values. The proposed OPC completes test layouts in 5 to 7 iterations, while standard MB-OPC (with constant-coefficient control) requires 20 to 28 iterations, reducing OPC runtime to about 1/2.7 on average. In addition, maximum EPE is reduced by about 24%.
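As a minimal sketch of the control law the abstract describes (not the paper's implementation; the function name, signature, and coefficients below are illustrative), a PID update for a single edge fragment could look like:

```python
def pid_correction(epe_history, kp, ki, kd):
    """One PID update for a single edge fragment.

    epe_history: measured edge placement errors (EPE), most recent last.
    Returns the edge displacement to apply in the next OPC iteration.
    Hypothetical helper, illustrating the control law only.
    """
    e = epe_history[-1]                 # proportional term: current EPE
    integral = sum(epe_history)         # integral term: accumulated EPE
    derivative = (epe_history[-1] - epe_history[-2]
                  if len(epe_history) > 1 else 0.0)  # derivative term
    return kp * e + ki * integral + kd * derivative

# Standard MB-OPC corresponds to P-only control (ki = kd = 0):
move = pid_correction([3.2, 1.9, 1.1], kp=0.5, ki=0.0, kd=0.0)
```

The paper's contribution is to let a learned actor choose kp, ki, and kd per layout and per iteration instead of fixing them.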
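The actor-critic structure can likewise be sketched with simple linear models (dimensions, state features, and weight shapes below are assumptions for illustration, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: the state might summarize recent EPE statistics.
STATE_DIM = 4
W_actor = rng.normal(scale=0.1, size=(3, STATE_DIM))     # state -> (kp, ki, kd)
W_critic = rng.normal(scale=0.1, size=(STATE_DIM + 3,))  # (state, action) -> Q

def actor(state):
    """Map an OPC state to PID coefficients; softplus keeps them >= 0."""
    return np.log1p(np.exp(W_actor @ state))

def critic(state, action):
    """Predict the Q-value: the cumulative reward expected from applying
    these PID coefficients in this state."""
    return W_critic @ np.concatenate([state, action])

# Per the abstract, training proceeds in three phases:
# 1) supervised pre-training: fit actor() to mimic standard MB-OPC,
# 2) train critic() to predict accurate Q-values from correction rewards,
# 3) train actor() to output coefficients that maximize the critic's Q.
```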
Source journal
IEEE Transactions on Semiconductor Manufacturing (Engineering Technology; Engineering: Electrical & Electronic)
CiteScore: 5.20
Self-citation rate: 11.10%
Annual articles: 101
Review time: 3.3 months
Journal description: The IEEE Transactions on Semiconductor Manufacturing addresses the challenging problems of manufacturing complex microelectronic components, especially very large scale integrated circuits (VLSI). Manufacturing these products requires precision micropatterning, precise control of materials properties, ultraclean work environments, and complex interactions of chemical, physical, electrical and mechanical processes.