Sparse Classifier Fusion for Speaker Verification

IEEE Transactions on Audio Speech and Language Processing. Pub Date: 2013-08-01. DOI: 10.1109/TASL.2013.2256895

Ville Hautamäki, T. Kinnunen, Filip Sedlak, Kong-Aik Lee, B. Ma, Haizhou Li

Citations: 48

Abstract

State-of-the-art speaker verification systems take advantage of a number of complementary base classifiers by fusing them to arrive at reliable verification decisions. In speaker verification, fusion is typically implemented as a weighted linear combination of the base classifier scores, where the combination weights are estimated using a logistic regression model. An alternative approach to fusion is classifier ensemble selection, which can be seen as sparse regularization applied to logistic regression. Even though score fusion has been extensively studied in speaker verification, classifier ensemble selection has received much less attention. In this study, we extensively evaluate sparse classifier fusion on a collection of twelve I4U spectral subsystems on the NIST 2008 and 2010 speaker recognition evaluation (SRE) corpora.
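The fusion idea described in the abstract can be illustrated with a minimal sketch. It assumes an L1 penalty as the sparse regularizer and a plain proximal-gradient (ISTA) solver; neither is confirmed by the abstract as the authors' choice. The scores are synthetic toy data, and every name here (`make_toy_trials`, `fit_sparse_fusion`, `fuse`, `lam`) is hypothetical.

```python
import math
import random


def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fuse(trial_scores, w, b):
    """Fused score: weighted linear combination of base-classifier scores."""
    return b + sum(wi * si for wi, si in zip(w, trial_scores))


def fit_sparse_fusion(scores, labels, lam, step=0.1, iters=500):
    """Fit fusion weights by L1-regularized logistic regression.

    scores: list of trials, each a list of K base-classifier scores.
    labels: 1 for target trials, 0 for non-target trials.
    lam:    L1 penalty strength (assumption: L1 is the sparse regularizer).
    Minimizes mean logistic loss + lam * ||w||_1; the bias b is not penalized.
    """
    n, k = len(scores), len(scores[0])
    w, b = [0.0] * k, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * k, 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(fuse(s, w, b)) - y
            grad_b += err
            for j in range(k):
                grad_w[j] += err * s[j]
        b -= step * grad_b / n
        # Gradient step on the smooth loss, then shrink each weight.
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_trials(seed=0, n_per_class=100):
    """Synthetic scores: subsystems 0 and 1 are informative, 2 is noise."""
    rng = random.Random(seed)
    scores, labels = [], []
    for y, mean in ((1, 1.0), (0, -1.0)):
        for _ in range(n_per_class):
            scores.append([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0),
                           rng.gauss(0.0, 1.0)])
            labels.append(y)
    return scores, labels


if __name__ == "__main__":
    toy_scores, toy_labels = make_toy_trials()
    for lam in (0.0, 0.05, 100.0):
        w, b = fit_sparse_fusion(toy_scores, toy_labels, lam)
        print(lam, [round(wi, 3) for wi in w], round(b, 3))
```

With `lam=0.0` this reduces to ordinary logistic regression fusion. Raising `lam` drives more weights to exactly zero, which is the sense in which sparse regularization acts as ensemble selection. A suitable `lam` would normally be picked on held-out trials.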



About the journal: The IEEE Transactions on Audio, Speech and Language Processing covers the sciences, technologies and applications relating to the analysis, coding, enhancement, recognition and synthesis of audio, music, speech and language. In particular, audio processing also covers auditory modeling, acoustic modeling and source separation. Speech processing also covers speech production and perception, adaptation, lexical modeling and speaker recognition. Language processing also covers spoken language understanding, translation, summarization, mining, general language modeling, as well as spoken dialog systems.