DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models

Tzu-Quan Lin, Hung-yi Lee, Hao Tang
{"title":"DAISY:语音表征模型的数据自适应自监督早期退出","authors":"Tzu-Quan Lin, Hung-yi Lee, Hao Tang","doi":"arxiv-2406.05464","DOIUrl":null,"url":null,"abstract":"Self-supervised speech models have shown to be useful for various tasks, but\ntheir large size limits the use in devices with low computing power and memory.\nIn this work, we explore early exit, an approach for reducing latency by\nexiting the forward process of a network early. Most approaches of early exit\nneed a separate early exit model for each task, with some even requiring\nfine-tuning of the entire pretrained model. We introduce Data Adaptive\nSelf-Supervised Early Exit (DAISY), an approach that decides when to exit based\non the self-supervised loss, eliminating the need for multiple round of\ntraining and fine-tuning. DAISY matches the performance of HuBERT on the\nMiniSUPERB benchmark, but with much faster inference times. Our analysis on the\nadaptivity of DAISY shows that the model exits early (using fewer layers) on\nclean data while exits late (using more layers) on noisy data, dynamically\nadjusting the computational cost of inference based on the noise level of each\nsample.","PeriodicalId":501178,"journal":{"name":"arXiv - CS - Sound","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models\",\"authors\":\"Tzu-Quan Lin, Hung-yi Lee, Hao Tang\",\"doi\":\"arxiv-2406.05464\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Self-supervised speech models have shown to be useful for various tasks, but\\ntheir large size limits the use in devices with low computing power and memory.\\nIn this work, we explore early exit, an approach for reducing latency by\\nexiting the forward process of a network early. Most approaches of early exit\\nneed a separate early exit model for each task, with some even requiring\\nfine-tuning of the entire pretrained model. We introduce Data Adaptive\\nSelf-Supervised Early Exit (DAISY), an approach that decides when to exit based\\non the self-supervised loss, eliminating the need for multiple round of\\ntraining and fine-tuning. DAISY matches the performance of HuBERT on the\\nMiniSUPERB benchmark, but with much faster inference times. 
Our analysis on the\\nadaptivity of DAISY shows that the model exits early (using fewer layers) on\\nclean data while exits late (using more layers) on noisy data, dynamically\\nadjusting the computational cost of inference based on the noise level of each\\nsample.\",\"PeriodicalId\":501178,\"journal\":{\"name\":\"arXiv - CS - Sound\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Sound\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2406.05464\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Sound","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2406.05464","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Self-supervised speech models have been shown to be useful for various tasks, but their large size limits their use on devices with low computing power and memory. In this work, we explore early exit, an approach that reduces latency by exiting the forward pass of a network early. Most early-exit approaches need a separate early-exit model for each task, and some even require fine-tuning the entire pretrained model. We introduce Data Adaptive Self-Supervised Early Exit (DAISY), an approach that decides when to exit based on the self-supervised loss, eliminating the need for multiple rounds of training and fine-tuning. DAISY matches the performance of HuBERT on the MiniSUPERB benchmark, but with much faster inference. Our analysis of the adaptivity of DAISY shows that the model exits early (using fewer layers) on clean data and late (using more layers) on noisy data, dynamically adjusting the computational cost of inference to the noise level of each sample.
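One plausible reading of the exit rule described in the abstract is to threshold a per-layer estimate of the self-supervised loss during the forward pass. Below is a minimal PyTorch sketch of that idea; the class name, `exit_heads`, `threshold`, and the cross-entropy head are illustrative assumptions, not the paper's actual implementation. For HuBERT-style models the prediction targets are cluster pseudo-labels derived from the input itself, so no manual labels are needed at inference.

```python
# Minimal sketch (assumed, not the paper's code): a transformer encoder with a
# lightweight self-supervised prediction head after each layer. The input exits
# at the first layer whose estimated SSL loss falls below a fixed threshold.
import torch
import torch.nn as nn

class EarlyExitEncoder(nn.Module):
    def __init__(self, layers: nn.ModuleList, exit_heads: nn.ModuleList,
                 threshold: float):
        super().__init__()
        self.layers = layers          # pretrained transformer layers
        self.exit_heads = exit_heads  # one small prediction head per layer
        self.threshold = threshold    # exit once estimated SSL loss drops below this

    def forward(self, x: torch.Tensor, targets: torch.Tensor):
        # x: (batch, time, dim); targets: (batch, time) cluster pseudo-labels,
        # e.g. k-means IDs computed from the input, as in HuBERT pretraining.
        for i, (layer, head) in enumerate(zip(self.layers, self.exit_heads)):
            x = layer(x)
            # Per-layer self-supervised loss estimate; a low loss suggests the
            # representation at this depth already explains the input well.
            logits = head(x)  # (batch, time, num_clusters)
            loss = nn.functional.cross_entropy(logits.transpose(1, 2), targets)
            if loss.item() < self.threshold:
                return x, i + 1        # easy (clean) input: use fewer layers
        return x, len(self.layers)     # hard (noisy) input: run the full stack
```

This sketch also makes the adaptivity claim concrete: clean samples tend to reach a low loss at shallow layers and exit early, while noisy samples keep the loss above the threshold longer and use more layers. How the threshold is calibrated (e.g., from loss statistics on held-out data) is a design choice not specified here.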