{"title":"DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models","authors":"Tzu-Quan Lin, Hung-yi Lee, Hao Tang","doi":"arxiv-2406.05464","DOIUrl":null,"url":null,"abstract":"Self-supervised speech models have shown to be useful for various tasks, but\ntheir large size limits the use in devices with low computing power and memory.\nIn this work, we explore early exit, an approach for reducing latency by\nexiting the forward process of a network early. Most approaches of early exit\nneed a separate early exit model for each task, with some even requiring\nfine-tuning of the entire pretrained model. We introduce Data Adaptive\nSelf-Supervised Early Exit (DAISY), an approach that decides when to exit based\non the self-supervised loss, eliminating the need for multiple round of\ntraining and fine-tuning. DAISY matches the performance of HuBERT on the\nMiniSUPERB benchmark, but with much faster inference times. Our analysis on the\nadaptivity of DAISY shows that the model exits early (using fewer layers) on\nclean data while exits late (using more layers) on noisy data, dynamically\nadjusting the computational cost of inference based on the noise level of each\nsample.","PeriodicalId":501178,"journal":{"name":"arXiv - CS - Sound","volume":"38 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Sound","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2406.05464","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Self-supervised speech models have been shown to be useful for various tasks, but their large size limits their use on devices with low computing power and memory. In this work, we explore early exit, an approach for reducing latency by exiting the forward process of a network early. Most early exit approaches need a separate early exit model for each task, and some even require fine-tuning of the entire pretrained model. We introduce Data Adaptive Self-Supervised Early Exit (DAISY), an approach that decides when to exit based on the self-supervised loss, eliminating the need for multiple rounds of training and fine-tuning. DAISY matches the performance of HuBERT on the MiniSUPERB benchmark with much faster inference times. Our analysis of the adaptivity of DAISY shows that the model exits early (using fewer layers) on clean data and exits late (using more layers) on noisy data, dynamically adjusting the computational cost of inference based on the noise level of each sample.
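
The abstract only states that DAISY decides when to exit based on the self-supervised loss; the exact exit rule is not given here. The following is a minimal sketch of that idea in PyTorch, assuming a hypothetical per-layer head that estimates the self-supervised loss and a fixed exit threshold (both illustrative assumptions, not the authors' implementation):

```python
# Minimal illustrative sketch of loss-based early exit.
# Assumptions (not from the paper): each layer has a small head that
# estimates the self-supervised loss, and we exit when that estimate
# falls below a fixed threshold.
import torch
import torch.nn as nn

class EarlyExitEncoder(nn.Module):
    """Encoder that stops the forward pass once an estimated
    self-supervised loss drops below a threshold."""

    def __init__(self, layers: nn.ModuleList, loss_heads: nn.ModuleList,
                 threshold: float):
        super().__init__()
        assert len(layers) == len(loss_heads)
        self.layers = layers          # e.g., HuBERT-style transformer layers
        self.loss_heads = loss_heads  # small heads estimating the SSL loss per layer
        self.threshold = threshold    # exit once the estimated loss is below this

    @torch.no_grad()
    def forward(self, x: torch.Tensor):
        for depth, (layer, head) in enumerate(zip(self.layers, self.loss_heads),
                                              start=1):
            x = layer(x)
            # Estimate the self-supervised loss from the current hidden states.
            est_loss = head(x).mean()
            if est_loss < self.threshold:
                return x, depth       # early exit: clean inputs tend to stop here
        return x, len(self.layers)    # noisy inputs tend to use all layers
```

Under these assumptions, the per-sample behavior described in the abstract falls out naturally: clean inputs reach a low estimated loss after fewer layers and exit early, while noisy inputs keep propagating through more layers before the threshold is met.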