Meta-Whisper: Speech-Based Meta-ICL for ASR on Low-Resource Languages

arXiv - EE - Audio and Speech Processing Pub Date : 2024-09-16 DOI:arxiv-2409.10429

Ming-Hao Hsu, Kuan Po Huang, Hung-yi Lee

Abstract

This paper presents Meta-Whisper, a novel approach to improve automatic speech recognition (ASR) for low-resource languages using the Whisper model. By leveraging Meta In-Context Learning (Meta-ICL) and a k-Nearest Neighbors (KNN) algorithm for sample selection, Meta-Whisper enhances Whisper's ability to recognize speech in unfamiliar languages without extensive fine-tuning. Experiments on the ML-SUPERB dataset show that Meta-Whisper significantly reduces the Character Error Rate (CER) for low-resource languages compared to the original Whisper model. This method offers a promising solution for developing more adaptable multilingual ASR systems, particularly for languages with limited resources.
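The two technical ingredients named in the abstract, KNN-based selection of in-context examples and the Character Error Rate metric, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which features or distance measure the KNN step uses, so the fixed-size embeddings, the cosine similarity, and the function names `select_knn_examples` and `cer` are all assumptions made for illustration. The CER function follows the standard definition (character-level edit distance divided by reference length) and assumes a non-empty reference.

```python
import numpy as np


def select_knn_examples(query_emb, pool_embs, k=3):
    """Return indices of the k pool items most similar to the query.

    Assumption: utterances are already represented as fixed-size
    embedding vectors and similarity is cosine; the abstract does not
    specify either choice.
    """
    q = query_emb / np.linalg.norm(query_emb)
    p = pool_embs / np.linalg.norm(pool_embs, axis=1, keepdims=True)
    sims = p @ q
    # argsort is ascending, so negate to rank the most similar first.
    return np.argsort(-sims)[:k]


def cer(reference, hypothesis):
    """Character Error Rate: Levenshtein distance / reference length."""
    prev = list(range(len(hypothesis) + 1))
    for i, r in enumerate(reference, start=1):
        curr = [i]
        for j, h in enumerate(hypothesis, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match)
            ))
        prev = curr
    return prev[-1] / len(reference)
```

In a setup like the one the abstract describes, the selected indices would pick the speech-text pairs shown to the model as in-context examples before the test utterance, and `cer` would score the resulting transcript against the reference.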


