{"title":"Agile Optimization Framework: A framework for tensor operator optimization in neural network","authors":"","doi":"10.1016/j.future.2024.07.019","DOIUrl":null,"url":null,"abstract":"<div><p>In recent years, with the gradual slowing of Moore’s Law and the development of deep learning, the demand for hardware performance of executing deep learning based applications has significantly increased. In this case, deep learning compilers have been proven to maximize hardware performance while keeping computational power constant, especially the end-to-end compiler Tensor Virtual Machine (TVM). TVM optimizes tensors by finding excellent parallel computing schemes, thereby achieving the goal of improving the performance of neural network inference. However, there is still untapped potential in current optimization methods. However, existing optimization methods based on the TVM, such as Genetic Algorithms Tuner (GA-Tuner), have failed to achieve a balance between optimization performance and optimization time. The intolerable duration of optimization detracts from TVM’s usability, rendering it challenging to extend into the scientific community. This paper introduces a novel deep learning compilation optimization framework base on TVM called Agile Optimization Framework (AOF), which incorporates a tuner based on the latest Beluga Whale Optimization Algorithm (BWO). The BWO is adept at tackling complex problems characterized by numerous local optima, making it particularly suitable for hardware compilation optimization scenarios. We further propose an Evolving Epsilon Strategy (EES), a search strategy that adaptively adjusts the balance between exploration and exploitation, thereby enhancing the effectiveness of the algorithm. Additionally, we developed a supervised Tuning Accelerator (TA) aimed at reducing the time required for optimization and enhancing efficiency. Comparative experiments demonstrate that AOF achieves 11.36%–66.20% improvement in performance and 30.30%–54.60% reduction in optimization time, significantly outperforming the control group.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X24003856","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
Abstract
In recent years, with the gradual slowing of Moore's Law and the rapid development of deep learning, the hardware performance demanded by deep learning-based applications has increased significantly. Deep learning compilers, most notably the end-to-end compiler Tensor Virtual Machine (TVM), have been shown to extract more performance from existing hardware without additional computational resources. TVM optimizes tensor operators by searching for efficient parallel computing schemes, thereby improving the performance of neural network inference. However, current optimization methods still leave much of this potential untapped: existing TVM-based tuners, such as the Genetic Algorithm Tuner (GA-Tuner), fail to balance optimization performance against optimization time. The intolerably long optimization process detracts from TVM's usability and makes it difficult to adopt widely in the scientific community. This paper introduces a novel deep learning compilation optimization framework based on TVM, called the Agile Optimization Framework (AOF), which incorporates a tuner built on the recent Beluga Whale Optimization algorithm (BWO). BWO is adept at tackling complex problems characterized by numerous local optima, making it particularly well suited to hardware compilation optimization. We further propose the Evolving Epsilon Strategy (EES), a search strategy that adaptively adjusts the balance between exploration and exploitation, thereby enhancing the effectiveness of the algorithm. Additionally, we develop a supervised Tuning Accelerator (TA) aimed at reducing the time required for optimization and improving efficiency. Comparative experiments demonstrate that AOF achieves an 11.36%–66.20% improvement in performance and a 30.30%–54.60% reduction in optimization time, significantly outperforming the control group.
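To make the exploration/exploitation idea behind EES concrete, below is a minimal, self-contained Python sketch, not the paper's implementation. It runs a population-based search in which an epsilon value decays over time: candidates explore randomly with probability epsilon and otherwise move toward the best schedule found so far. The objective function is a synthetic multi-modal stand-in (many local optima, as the abstract says of tensor-schedule spaces); all names and parameters are illustrative assumptions, not AOF's or TVM's API.

```python
import math
import random

def cost(x):
    # Synthetic multi-modal objective (Rastrigin-like): many local
    # optima, a hypothetical stand-in for measured kernel latency.
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)

def evolving_epsilon_search(dim=4, pop_size=20, iters=200, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=cost)
    for t in range(iters):
        # The "evolving epsilon": decays over time, so the search
        # explores broadly early and exploits the best region late.
        eps = 1.0 - t / iters
        new_pop = []
        for ind in pop:
            if rng.random() < eps:
                # Exploration: random jump within the search bounds.
                cand = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
            else:
                # Exploitation: move toward the best-known candidate,
                # with a small perturbation to escape local optima.
                cand = [xi + rng.uniform(0.0, 1.0) * (bi - xi) + rng.gauss(0.0, 0.1)
                        for xi, bi in zip(ind, best)]
            new_pop.append(min(ind, cand, key=cost))  # keep the better of the two
        pop = new_pop
        best = min(pop + [best], key=cost)
    return best, cost(best)

if __name__ == "__main__":
    _, c = evolving_epsilon_search()
    print(f"best cost found: {c:.4f}")
```

In this toy setting, a fixed epsilon either wastes evaluations on random jumps late in the run or gets stuck in a local minimum early; making epsilon a function of progress is the simplest version of the adaptive balance the abstract describes.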
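The supervised Tuning Accelerator is described only at a high level, but the general idea behind such accelerators is to fit a cheap learned cost model on (schedule features, measured latency) pairs and use it to rank candidates, so that only the most promising ones are actually compiled and measured on hardware. The sketch below illustrates that principle under stated assumptions: the feature vectors, the `measure` function, and the least-squares model are all synthetic placeholders, not the paper's TA or any TVM API.

```python
import numpy as np

rng = np.random.default_rng(0)

def measure(feat):
    # Hypothetical stand-in for an expensive on-device latency measurement.
    w = np.array([3.0, -2.0, 1.5, 0.5])
    return float(feat @ w + rng.normal(0.0, 0.1))

# Collect a small training set from "real" measurements.
X_train = rng.uniform(0.0, 1.0, size=(64, 4))
y_train = np.array([measure(x) for x in X_train])

# Supervised cost model: ordinary least squares, the simplest
# placeholder for whatever learned model a tuning accelerator uses.
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Rank a large batch of untried candidates by predicted latency and
# measure only the top few, saving expensive hardware runs.
candidates = rng.uniform(0.0, 1.0, size=(1000, 4))
predicted = candidates @ coef
top_k = candidates[np.argsort(predicted)[:8]]
measured = [measure(c) for c in top_k]
print(f"best measured latency among top-8 predictions: {min(measured):.3f}")
```

The time savings reported in the abstract come from exactly this kind of filtering: model predictions are orders of magnitude cheaper than compiling and timing a candidate schedule on the target device.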
Journal Introduction
Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications.
Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration.
Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.