Neural-guided superoptimization in Ethereum

IF 4.3 | Zone 2, Computer Science | Q2 Computer Science, Information Systems

Information and Software Technology Pub Date : 2025-06-16 DOI:10.1016/j.infsof.2025.107800

Matheus Araújo Aguiar , Elvira Albert , Samir Genaim , Pablo Gordillo , Alejandro Hernández-Cerezo , Daniel Kirchner , Albert Rubio

Information and Software Technology, Volume 186, Article 107800. https://www.sciencedirect.com/science/article/pii/S0950584925001399

Citations: 0

Abstract

Context:

Superoptimization is a synthesis technique that, given a loop-free sequence of instructions, searches for an equivalent sequence that is optimal with respect to an objective function. Superoptimization of Ethereum smart contracts aims to minimize the size of their bytecode and the gas consumed by executing the contract’s functions. The search for the optimal solution is computationally demanding, because the search space is exponential in the given size-bound, and this is the main obstacle to scaling superoptimization up to real, industrial software. Even though the underlying problem of finding the optimal solution is decidable, practical tools often prioritize efficiency over completeness: they may return a sub-optimal solution or even time out.
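To make the exponential search space concrete, the following is a minimal sketch of an enumerative superoptimizer for a toy stack machine. Everything in it is an assumption for illustration: the seven-instruction set, the `run`, `equivalent`, and `superoptimize` helpers, and the test inputs are hypothetical and are not the paper's tool or the real EVM semantics. Equivalence is only checked on a few sample inputs, whereas a real superoptimizer would prove it, for example with a constraint solver.

```python
from itertools import product

# Hypothetical toy instruction set; names echo EVM opcodes but semantics are simplified.
OPS = ["PUSH0", "PUSH1", "ADD", "MUL", "DUP", "SWAP", "POP"]

# Sample initial stacks used for testing-based equivalence (not a proof).
TESTS = [[2], [3], [5], [-7]]


def run(program, stack):
    """Execute a loop-free program on a copy of `stack`; return None on underflow."""
    s = list(stack)
    for op in program:
        if op == "PUSH0":
            s.append(0)
        elif op == "PUSH1":
            s.append(1)
        elif op == "DUP":
            if not s:
                return None
            s.append(s[-1])
        elif op == "POP":
            if not s:
                return None
            s.pop()
        else:  # binary instructions need two operands
            if len(s) < 2:
                return None
            a, b = s.pop(), s.pop()
            if op == "ADD":
                s.append(a + b)
            elif op == "MUL":
                s.append(a * b)
            elif op == "SWAP":
                s.extend([a, b])  # old top `a` goes below `b`
    return s


def equivalent(candidate, original):
    """True if both programs produce the same final stack on every sample input."""
    for t in TESTS:
        out = run(candidate, t)
        if out is None or out != run(original, t):
            return False
    return True


def search_space(size_bound):
    """Number of candidate sequences of length 0..size_bound."""
    return sum(len(OPS) ** n for n in range(size_bound + 1))


def superoptimize(program, size_bound):
    """Return (shortest equivalent program, candidates tried), shortest first."""
    tried = 0
    for n in range(size_bound + 1):
        for cand in product(OPS, repeat=n):
            tried += 1
            if equivalent(list(cand), program):
                return list(cand), tried
    return None, tried


if __name__ == "__main__":
    block = ["DUP", "ADD", "PUSH1", "MUL"]  # computes 2*x, then multiplies by 1
    print(superoptimize(block, size_bound=4))
    print(search_space(4), search_space(8))
```

With only seven instructions, the number of candidates already grows by a factor of seven for each extra unit of size-bound, which is why a tighter bound, or skipping the search altogether, pays off.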

Objective:

This work aims to apply superoptimization in a real setting: the Ethereum blockchain. It proposes a neural-guided superoptimization (NGS) approach that incorporates deep neural networks, trained by supervised learning, into superoptimization to improve scalability by predicting (1) whether a sequence is already optimal, so that the search can be skipped; and (2) the size-bound of the optimal solution, so that the search space can be reduced.
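The two predictions can be read as gates placed in front of the search. The sketch below shows one possible control flow under that reading; it is not the authors' implementation. The function `ngs_optimize` and the three stubs are hypothetical: the stubs are simple hand-written rules standing in for trained neural networks and for a real superoptimizer.

```python
from typing import Callable, List, Optional

Block = List[str]  # a loop-free instruction sequence


def ngs_optimize(
    block: Block,
    predict_optimal: Callable[[Block], bool],
    predict_bound: Callable[[Block], int],
    search: Callable[[Block, int], Optional[Block]],
) -> Block:
    """Gate an expensive search with two predictions (hypothetical control flow)."""
    # (1) If the classifier says the block is already optimal, skip the search.
    if predict_optimal(block):
        return block
    # (2) Otherwise search only up to the predicted size-bound,
    #     never allowing more instructions than the original block has.
    bound = min(predict_bound(block), len(block))
    result = search(block, bound)
    # If nothing is found within the bound, keep the original block.
    return result if result is not None else block


# --- Stand-ins for the learned models and the superoptimizer (assumptions) ---

def stub_predict_optimal(block: Block) -> bool:
    # A trained classifier would go here; this rule is purely illustrative.
    return len(block) <= 1


def stub_predict_bound(block: Block) -> int:
    # A trained regressor would go here; this guess is purely illustrative.
    return max(len(block) - 2, 0)


def stub_search(block: Block, bound: int) -> Optional[Block]:
    # Toy "search": drop the no-op pair PUSH0, ADD; fail if still above the bound.
    out: Block = []
    i = 0
    while i < len(block):
        if block[i:i + 2] == ["PUSH0", "ADD"]:
            i += 2
        else:
            out.append(block[i])
            i += 1
    return out if len(out) <= bound else None


if __name__ == "__main__":
    print(ngs_optimize(["DUP", "PUSH0", "ADD", "MUL"],
                       stub_predict_optimal, stub_predict_bound, stub_search))
```

In this sketch a wrong prediction never produces an incorrect block: it only returns the original block unchanged, so a missed optimization is the price paid for the shorter running time.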

Method:

We downloaded over 13,000 smart contracts deployed on the blockchain to train and test the machine learning models, together with a disjoint set of 100 of the smart contracts with the most transactions to demonstrate the scalability gains and the impact for the Ethereum community.

Results:

Incorporating DNNs resulted in a 16x overall speedup (12x for gas) with only 12% optimization loss (14% for gas), or a 3-4x speedup with no optimization loss. For the 100 analyzed contracts, this approach reduced the average compilation time to 3 min per contract and achieved monetary savings of $1.24M.

Conclusions:

The integration of machine learning models mitigates several limitations of traditional superoptimization by drastically reducing execution times while maintaining most of the original optimization gains.


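Source journal

Information and Software Technology (Engineering and Technology, Computer Science: Software Engineering)

CiteScore: 9.10
Self-citation rate: 7.70%
Articles published: 164
Review time: 9.6 weeks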

About the journal: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:
• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development
Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "negative" results and much more. Read the Guide for Authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.