An 8.3mW 1.6Msamples/s multi-modal event-driven speech enhancement processor for robust speech recognition in smart glasses

ESSCIRC Conference 2016: 42nd European Solid-State Circuits Conference. Pub Date: 2016-09-01. DOI: 10.1109/ESSCIRC.2016.7598256

Jinmook Lee, Seongwook Park, Injoon Hong, H. Yoo


Abstract

A low-power, high-speed speech enhancement processor for noisy inputs is proposed to realize robust speech recognition in smart glasses. It has three key schemes: multi-modal speech selection, look-up-table-based non-linear approximation circuits, and speech-detection-controlled dynamic clock gating. The multi-modal speech selection scheme uses three parameters to raise the limited accuracy of previous uni-modal user speech selection to 98.1%. The non-linear function approximation circuit increases speech enhancement throughput by 10.7×. The speech-detection-controlled clock gating reduces redundant power consumption by 51% when there is no user voice. The proposed speech enhancement processor achieves 1.6Msamples/s throughput and 8.3mW average power consumption with a 98.1% true positive rate of speech selection in a 65nm CMOS process.
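
The abstract does not say which non-linear functions the look-up-table circuits approximate, how many entries the tables hold, or whether they interpolate. As a generic illustration of the idea only, the sketch below precomputes samples of a function and replaces each evaluation with a table read plus one linear interpolation. The choice of the natural logarithm, the 64-entry table, the [1, 2] input range, and the names `build_lut` and `lut_eval` are all assumptions made for this example, not details from the paper.

```python
import math

def build_lut(fn, lo, hi, entries):
    """Sample fn at entries + 1 evenly spaced points on [lo, hi]."""
    step = (hi - lo) / entries
    return [fn(lo + i * step) for i in range(entries + 1)]

def lut_eval(table, lo, hi, x):
    """Approximate fn(x) by interpolating between two neighbouring entries."""
    entries = len(table) - 1
    x = min(max(x, lo), hi)                 # clamp input to the table range
    pos = (x - lo) / (hi - lo) * entries    # fractional table position
    idx = min(int(pos), entries - 1)        # lower neighbour index
    frac = pos - idx                        # distance past the lower neighbour
    return table[idx] + frac * (table[idx + 1] - table[idx])

# Hypothetical configuration: natural log on [1, 2] with 64 segments.
LOG_LUT = build_lut(math.log, 1.0, 2.0, 64)

if __name__ == "__main__":
    for x in (1.1, 1.3, 1.9):
        approx = lut_eval(LOG_LUT, 1.0, 2.0, x)
        print(x, approx, abs(approx - math.log(x)))
```

In hardware, the appeal of this trade is that a memory read and one multiply-add replace an iterative evaluation, at the cost of table storage and a bounded approximation error that shrinks as the table grows.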

