{"title":"TinyMo: Graph-Level Memory Optimizer for Tiny Machine Learning","authors":"Byungchul Chae;Seonyeong Heo","doi":"10.1109/LES.2024.3485630","DOIUrl":null,"url":null,"abstract":"Effective memory optimization is essential for tiny machine learning because tiny embedded systems generally have limited memory for model execution. Previous research has proposed various model compression methods to reduce the memory usage of machine learning models. However, the methods often entail accuracy loss resulting from altering model weights. This letter proposes a graph-level memory optimizer for tiny embedded systems, TinyMo, which optimizes the memory usage of an input model by changing the structure of the model graph. TinyMo mainly uses two optimization methods: 1) tensor spilling and 2) tensor splitting, to reduce unnecessary memory usage from long skip connections and large separable convolutions. In the evaluation, this letter shows that the proposed optimizer can successfully reduce the peak memory usage of various neural network models for commercial embedded systems with little runtime overhead.","PeriodicalId":56143,"journal":{"name":"IEEE Embedded Systems Letters","volume":"17 3","pages":"196-199"},"PeriodicalIF":2.0000,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Embedded Systems Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10734316/","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0
Abstract
Effective memory optimization is essential for tiny machine learning because tiny embedded systems generally have limited memory for model execution. Previous research has proposed various model compression methods to reduce the memory usage of machine learning models. However, these methods often entail accuracy loss because they alter model weights. This letter proposes a graph-level memory optimizer for tiny embedded systems, TinyMo, which optimizes the memory usage of an input model by changing the structure of the model graph. TinyMo mainly uses two optimization methods, 1) tensor spilling and 2) tensor splitting, to reduce unnecessary memory usage from long skip connections and large separable convolutions. In the evaluation, this letter shows that the proposed optimizer can successfully reduce the peak memory usage of various neural network models for commercial embedded systems with little runtime overhead.
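To make the graph-level idea concrete, below is a minimal, hypothetical Python/NumPy sketch of tensor splitting, not TinyMo's actual implementation: instead of materializing the full intermediate tensor of a separable convolution, the operator is rescheduled to process row chunks so that only one chunk is live at a time, which lowers the peak of live-buffer memory. Tensor spilling is analogous at the graph level: a long-lived skip-connection tensor would be moved out of scarce on-chip memory between its producer and consumer and restored before use. All shapes, names, and the simple byte accounting here are illustrative assumptions; a 1x1 per-channel scale stands in for the depthwise stage so that chunking needs no halo handling.

import numpy as np

H, W, C_IN, C_OUT = 64, 64, 32, 64                           # hypothetical layer sizes
rng = np.random.default_rng(0)
x  = rng.standard_normal((H, W, C_IN)).astype(np.float32)    # input activation
dw = rng.standard_normal(C_IN).astype(np.float32)            # toy per-channel "depthwise" weights
pw = rng.standard_normal((C_IN, C_OUT)).astype(np.float32)   # pointwise (1x1) weights

def run_unsplit(x):
    """Naive schedule: the whole depthwise output is live at once."""
    mid = x * dw                                   # full H x W x C_IN intermediate
    out = (mid.reshape(-1, C_IN) @ pw).reshape(H, W, C_OUT)
    peak = x.nbytes + mid.nbytes + out.nbytes      # all three buffers coexist
    return out, peak

def run_split(x, n_chunks=8):
    """Split schedule: only one row chunk of the intermediate is live."""
    out = np.empty((H, W, C_OUT), dtype=np.float32)
    mid_peak = 0
    for rows in np.array_split(np.arange(H), n_chunks):
        mid = x[rows] * dw                         # chunk-sized intermediate
        out[rows] = (mid.reshape(-1, C_IN) @ pw).reshape(len(rows), W, C_OUT)
        mid_peak = max(mid_peak, mid.nbytes)
    peak = x.nbytes + mid_peak + out.nbytes
    return out, peak

y_ref, peak_ref = run_unsplit(x)
y_sp,  peak_sp  = run_split(x)
assert np.allclose(y_ref, y_sp, atol=1e-4)         # same result, lower peak
print(f"unsplit peak: {peak_ref} bytes, split peak: {peak_sp} bytes")

With these toy sizes, the unsplit schedule keeps a 512 KiB intermediate live, while the split schedule keeps only a 64 KiB chunk, so the peak drops even though the computed output is identical. The actual savings TinyMo reports depend on the model graph and target device, which this sketch does not model.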
About the Journal:
IEEE Embedded Systems Letters (ESL) provides a forum for the rapid dissemination of the latest technical advances in embedded systems and related areas of embedded software. The emphasis is on models, methods, and tools that ensure secure, correct, efficient, and robust design of embedded systems and their applications.