Joint feature and model training for minimum detection errors applied to speech subword detection

2012 IEEE International Workshop on Machine Learning for Signal Processing | Pub Date: 2012-11-12 | DOI: 10.1109/MLSP.2012.6349729

M. H. Johnsen, Alfonso M. Canterla


Citations: 2

Abstract

This paper presents methods and results for joint optimization of the feature extraction and the model parameters of a detector. We further define a discriminative training criterion called Minimum Detection Error (MDE). The criterion can optimize the F-score or any other detection performance metric. The methods are used to design detectors of subwords in continuous speech, i.e. to spot phones and articulatory features. For each subword detector, the MFCC filterbank matrix and the Gaussian means in the HMM models are jointly optimized. In experiments on TIMIT, the optimized detectors clearly outperform the baseline detectors and also our previous MCE-based detectors. The results indicate that the same performance metric should be used for training and testing, and that accuracy outperforms F-score with respect to relative improvement. Further, the optimized filterbanks usually reflect typical acoustic properties of the corresponding detection classes.
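To make the metrics named in the abstract concrete, here is a minimal Python sketch of the F-score and accuracy computed from detection counts, together with a sigmoid-smoothed version of those counts. The smoothing is only an assumption about how a count-based metric might be made differentiable for discriminative training; the abstract does not state the exact MDE formulation, and the function names, `threshold`, and `slope` are all hypothetical.

```python
import math


def f_score(tp, fp, fn, beta=1.0):
    """F-score from true positives, false positives and false negatives."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def accuracy(tp, fp, fn, tn):
    """Fraction of all decisions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def smoothed_counts(scores, labels, threshold=0.0, slope=1.0):
    """Soft TP/FP/FN counts (illustrative assumption, not the paper's method).

    Each hard decision `score > threshold` is replaced by a sigmoid of the
    margin, so the counts vary smoothly with the detector scores.
    """
    tp = fp = fn = 0.0
    for score, is_target in zip(scores, labels):
        p_detect = _sigmoid(slope * (score - threshold))
        if is_target:
            tp += p_detect
            fn += 1.0 - p_detect
        else:
            fp += p_detect
    return tp, fp, fn


if __name__ == "__main__":
    # Imbalanced toy case: few targets, many non-targets.
    print(f_score(tp=8, fp=2, fn=2), accuracy(tp=8, fp=2, fn=2, tn=88))
    # Soft counts for made-up detector scores; feed them into f_score.
    soft_tp, soft_fp, soft_fn = smoothed_counts(
        scores=[2.0, -1.0, 3.0, -2.0], labels=[1, 1, 0, 0], slope=2.0
    )
    print(f_score(soft_tp, soft_fp, soft_fn))
```

On imbalanced detection data the two metrics can differ considerably, because accuracy counts the many correctly rejected non-targets while the F-score ignores them. That is one reason the choice of training metric matters.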


Source journal

2012 IEEE International Workshop on Machine Learning for Signal Processing

Self-citation rate

0.00%

Number of publications