Speaker verification with TIMIT corpus - some remarks on classical methods

2020 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA). Pub Date: 2020-09-23. DOI: 10.23919/spa50552.2020.9241298

A. Dustor

Citations: 0

Abstract

This paper presents research on a speaker verification system based on the Gaussian Mixture Model-Universal Background Model (GMM-UBM) approach. All tests were carried out on the TIMIT corpus. Performance is reported for standard Mel-Frequency Cepstral Coefficients (MFCC) and dynamic delta features, along with the influence of feature dimensionality and model complexity on the Equal Error Rate (EER). The impact of Voice Activity Detection (VAD) and of normalization techniques such as Cepstral Mean and Variance Normalization (CMVN) and RelAtive SpecTrA (RASTA) filtering is also covered. Every combination of these factors was examined. The results show that careful selection of traditional techniques can yield very satisfactory EER values.
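Two of the quantities named in the abstract, CMVN and EER, can be illustrated with a minimal Python sketch. This is not the paper's implementation, which the abstract does not describe. It assumes NumPy, per-utterance normalization over a (frames, coefficients) feature matrix, and an EER estimated by sweeping the decision threshold over the observed scores. The function names `cmvn` and `equal_error_rate` are hypothetical and chosen only for illustration.

```python
import numpy as np


def cmvn(features, eps=1e-10):
    """Per-utterance cepstral mean and variance normalization (assumed variant).

    `features` has shape (frames, coefficients). Each coefficient track is
    shifted to zero mean and scaled to unit variance across the frames.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    # eps guards against division by zero for a constant coefficient track.
    return (features - mean) / (std + eps)


def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER and the threshold at which it occurs.

    A trial is accepted when its score is >= threshold. The EER is taken at
    the candidate threshold where the false acceptance rate (FAR) and the
    false rejection rate (FRR) are closest, as the mean of the two rates.
    """
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    # Candidate thresholds: every observed score, in ascending order.
    thresholds = np.sort(np.concatenate([target, impostor]))
    # FRR: fraction of genuine trials rejected at each threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # FAR: fraction of impostor trials accepted at each threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]
```

In a GMM-UBM setting, the scores passed to `equal_error_rate` would typically be log-likelihood ratios between a speaker model and the background model. The threshold sweep is quadratic in the number of trials, which is acceptable for a sketch but not for large trial lists.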

