Neural architecture search: two constant shared weights initialisations

Impact factor 13.9 · Zone 2, Computer Science · Q1, Computer Science, Artificial Intelligence
Ekaterina Gracheva
{"title":"Neural architecture search: two constant shared weights initialisations","authors":"Ekaterina Gracheva","doi":"10.1007/s10462-025-11238-2","DOIUrl":null,"url":null,"abstract":"<div><p>In the last decade, zero-cost metrics have gained prominence in neural architecture search (NAS) due to their ability to evaluate architectures without training. These metrics are significantly faster and less computationally expensive than traditional NAS methods and provide insights into neural architectures’ internal workings. This paper introduces <span>epsinas</span>, a novel zero-cost NAS metric that assesses architecture potential using two constant shared weight initialisations and the statistics of their outputs. We show that the dispersion of raw outputs, normalised by their average magnitude, strongly correlates with trained accuracy. This effect holds across image classification and language tasks on NAS-Bench-101, NAS-Bench-201, and NAS-Bench-NLP. Our method requires no data labels, operates on a single minibatch, and eliminates the need for gradient computation, making it independent of training hyperparameters, loss metrics, and human annotations. It evaluates a network in a fraction of a GPU second and integrates seamlessly into existing NAS frameworks. The code supporting this study can be found on GitHub at https://github.com/egracheva/epsinas.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 10","pages":""},"PeriodicalIF":13.9000,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-025-11238-2.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-025-11238-2","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

In the last decade, zero-cost metrics have gained prominence in neural architecture search (NAS) due to their ability to evaluate architectures without training. These metrics are significantly faster and less computationally expensive than traditional NAS methods and provide insights into neural architectures’ internal workings. This paper introduces epsinas, a novel zero-cost NAS metric that assesses architecture potential using two constant shared weight initialisations and the statistics of their outputs. We show that the dispersion of raw outputs, normalised by their average magnitude, strongly correlates with trained accuracy. This effect holds across image classification and language tasks on NAS-Bench-101, NAS-Bench-201, and NAS-Bench-NLP. Our method requires no data labels, operates on a single minibatch, and eliminates the need for gradient computation, making it independent of training hyperparameters, loss metrics, and human annotations. It evaluates a network in a fraction of a GPU second and integrates seamlessly into existing NAS frameworks. The code supporting this study can be found on GitHub at https://github.com/egracheva/epsinas.
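As a rough illustration of the idea described above, the sketch below (assuming a PyTorch model and a single unlabelled minibatch; the constants `w1` and `w2` and the exact normalisation are placeholders chosen for illustration, not the paper's values) sets every parameter to a shared constant twice and scores the architecture by the dispersion of the two sets of raw outputs divided by their average magnitude. The reference implementation is in the linked GitHub repository.

```python
import torch
import torch.nn as nn

def epsinas_score(model: nn.Module, minibatch: torch.Tensor,
                  w1: float = 1e-5, w2: float = 1e-3) -> float:
    """Hedged sketch of a zero-cost score in the spirit of epsinas.

    All weights are set to a shared constant (twice, with two different
    constants w1 and w2, chosen here arbitrarily), an unlabelled minibatch
    is passed through the network, and the dispersion of the raw outputs
    across the two initialisations is normalised by their average magnitude.
    The constants and the exact normalisation used in the paper may differ;
    see the official repository for the reference implementation.
    """
    outputs = []
    with torch.no_grad():
        for const in (w1, w2):
            # Constant shared-weight initialisation: every parameter gets
            # the same value, so no labels or gradients are needed.
            for p in model.parameters():
                p.fill_(const)
            outputs.append(model(minibatch).flatten())
    out1, out2 = outputs
    # Dispersion of raw outputs across the two initialisations,
    # normalised by their average absolute magnitude.
    dispersion = (out1 - out2).abs().mean()
    magnitude = torch.stack([out1.abs().mean(), out2.abs().mean()]).mean()
    return (dispersion / (magnitude + 1e-12)).item()
```

For instance, `epsinas_score(candidate_net, next(iter(loader))[0])` would score a candidate network on one unlabelled minibatch, with no gradient computation or training hyperparameters involved.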

Source journal
Artificial Intelligence Review (Engineering & Technology – Computer Science: Artificial Intelligence)
CiteScore: 22.00
Self-citation rate: 3.30%
Articles per year: 194
Review time: 5.3 months
About the journal: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.