Generative flow induced neural architecture search: Towards discovering optimal architecture in wavelet neural operator

IF 7.2 | CAS Tier 2 (Physics and Astronomy) | Q1 Computer Science, Interdisciplinary Applications
Hartej Soin, Tapas Tripura, Souvik Chakraborty
{"title":"Generative flow induced neural architecture search: Towards discovering optimal architecture in wavelet neural operator","authors":"Hartej Soin ,&nbsp;Tapas Tripura ,&nbsp;Souvik Chakraborty","doi":"10.1016/j.cpc.2025.109755","DOIUrl":null,"url":null,"abstract":"<div><div>We propose a generative flow-induced neural architecture search algorithm. The proposed approach devises simple feed-forward neural networks to learn stochastic policies to generate sequences of architecture hyperparameters such that the generated states are in proportion to the reward from the terminal state. We demonstrate the efficacy of the proposed search algorithm on the wavelet neural operator (WNO), where we learn a policy to generate a sequence of hyperparameters like wavelet basis and activation operators for wavelet integral blocks. While the trajectory of the generated wavelet basis and activation sequence is cast as flow, the policy is learned by minimizing the flow violation between each state in the trajectory and maximizing the reward from the terminal state. In the terminal state, we train WNO simultaneously to guide the search. We propose using the negative exponent of the WNO loss on the validation dataset as the reward function. While the grid search-based neural architecture generation algorithms foresee every combination, the proposed framework generates the most probable sequence based on the positive reward from the terminal state, thereby reducing exploration time. Compared to reinforcement learning schemes, where complete episodic training is required to get the reward, the proposed algorithm generates the hyperparameter trajectory sequentially. Through four fluid mechanics-oriented problems, we illustrate that the learned policies can sample the best-performing architecture of the neural operator, thereby improving the performance of the vanilla wavelet neural operator. We compare the performance of the proposed flow-based search strategy with that of a Monte Carlo Tree Search (MCTS) -based algorithm and observe an improvement of ≥23% in the resulting optimal architecture.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"316 ","pages":"Article 109755"},"PeriodicalIF":7.2000,"publicationDate":"2025-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Physics Communications","FirstCategoryId":"101","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0010465525002577","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Citations: 0

Abstract

We propose a generative flow-induced neural architecture search algorithm. The proposed approach devises simple feed-forward neural networks to learn stochastic policies that generate sequences of architecture hyperparameters such that the generated states are sampled in proportion to the reward from the terminal state. We demonstrate the efficacy of the proposed search algorithm on the wavelet neural operator (WNO), where we learn a policy to generate a sequence of hyperparameters, such as the wavelet basis and activation functions, for the wavelet integral blocks. While the trajectory of the generated wavelet-basis and activation sequence is cast as a flow, the policy is learned by minimizing the flow violation at each state in the trajectory and maximizing the reward from the terminal state. At the terminal state, we simultaneously train the WNO to guide the search. We propose using the negative exponential of the WNO loss on the validation dataset as the reward function. Whereas grid-search-based neural architecture generation algorithms must enumerate every combination, the proposed framework generates the most probable sequence based on the positive reward from the terminal state, thereby reducing exploration time. Compared to reinforcement learning schemes, where complete episodic training is required to obtain the reward, the proposed algorithm generates the hyperparameter trajectory sequentially. Through four fluid mechanics-oriented problems, we illustrate that the learned policies can sample the best-performing architecture of the neural operator, thereby improving the performance of the vanilla wavelet neural operator. We compare the performance of the proposed flow-based search strategy with that of a Monte Carlo Tree Search (MCTS)-based algorithm and observe an improvement of ≥23% in the resulting optimal architecture.
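To make the two ingredients named in the abstract concrete, the following is a minimal, illustrative sketch in PyTorch: a feed-forward policy that outputs edge flows over candidate hyperparameter choices, and a flow-violation term in which the outgoing flow at the terminal state is replaced by the reward exp(-validation loss). The wavelet and activation vocabularies, network sizes, and all function names below are assumptions made for illustration and are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn

# Illustrative hyperparameter vocabularies (assumed, not from the paper's code)
WAVELETS = ["db4", "db6", "sym4", "coif2"]
ACTIVATIONS = ["relu", "gelu", "tanh", "mish"]

class FlowPolicy(nn.Module):
    """Simple feed-forward network mapping an encoded partial-architecture
    state to log edge flows F(s -> s'), one per candidate next action."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        # Returns log flows over the next hyperparameter choice.
        return self.net(state)

def reward_from_validation_loss(val_loss):
    """Reward described in the abstract: negative exponential of the WNO
    loss on the validation set, so a lower loss yields a larger reward."""
    return torch.exp(-val_loss)

def flow_violation(log_inflow, log_outflows, reward=None, eps=1e-8):
    """Squared log-space mismatch between flow into a state and flow out of it.
    At the terminal state the outgoing flow is replaced by the reward, which is
    how reward maximization is folded into the flow-matching objective."""
    if reward is not None:                      # terminal state
        log_out = torch.log(reward + eps)
    else:                                       # intermediate state
        log_out = torch.logsumexp(log_outflows, dim=-1)
    return (log_inflow - log_out) ** 2
```

In practice, one would sum such violations over every state of a sampled hyperparameter trajectory and backpropagate through the policy, while the WNO defined by the terminal state is trained in parallel to supply the validation loss used in the reward.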
Source journal
Computer Physics Communications (Physics – Computer Science: Interdisciplinary Applications)
CiteScore: 12.10
Self-citation rate: 3.20%
Articles published: 287
Review time: 5.3 months
Journal description: The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper.
Computer Programs in Physics (CPiP): These papers describe significant computer programs to be archived in the CPC Program Library, which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged.
Computational Physics Papers (CP): These are research papers in, but not limited to, the following themes across computational physics and related disciplines: mathematical and numerical methods and algorithms; computational models, including those associated with the design, control and analysis of experiments; and algebraic computation. Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository. In addition, research papers on the impact of advanced computer architecture and special-purpose computers on computing in the physical sciences, and software topics related to, and of importance in, the physical sciences, may be considered.