Adaptive score fusion using Weighted Logistic Linear Regression for spoken language recognition

2010 IEEE International Conference on Acoustics, Speech and Signal Processing. Pub Date: 2010-03-14. DOI: 10.1109/ICASSP.2010.5495069

K. Sim, Kong-Aik Lee


Citations: 5

Abstract


Adaptive score fusion using Weighted Logistic Linear Regression for spoken language recognition

State-of-the-art spoken language recognition systems typically combine several sub-systems. Each sub-system generates language detection scores for every speech segment, and these scores are fused (combined) to yield the overall detection scores. Score fusion is typically achieved with a linear model, whose parameters are commonly estimated using Logistic Linear Regression (LLR). This paper proposes an extension to the LLR model, known as Weighted LLR (WLLR). WLLR is a weighted combination of multiple LLRs in which the weights are a nonlinear function of the speech segment. Although the resulting score is still linear in the scores of the individual sub-systems, the linear function depends on the speech segment, so the overall score fusion model can be regarded as adaptive. Experimental results show that WLLR outperforms LLR by approximately 10% relative for PPRLM system fusion on the NIST 2003 and 2005 language recognition evaluation sets.
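To make the idea of segment-dependent linear fusion concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract only states that the weights are a nonlinear function of the speech segment, so the softmax gating, the `segment_features` vector, and all parameter values below are illustrative assumptions. Parameter estimation is omitted; the sketch covers only how a fused score could be computed once parameters are available.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def linear_fusion(scores, weights, bias):
    """Plain linear fusion: one fixed weight per sub-system plus a bias."""
    return sum(w * s for w, s in zip(weights, scores)) + bias


def wllr_fusion(scores, segment_features, experts, gates):
    """Mix several linear fusers with segment-dependent weights.

    experts: list of (weights, bias) pairs, one per component linear fuser.
    gates:   list of (gate_weights, gate_bias) pairs. These are hypothetical
             gating parameters; a softmax over their outputs is ASSUMED here
             as the nonlinear function of the speech segment.
    """
    gate_logits = [linear_fusion(segment_features, gw, gb) for gw, gb in gates]
    mix = softmax(gate_logits)
    fused = sum(
        m * linear_fusion(scores, w, b) for m, (w, b) in zip(mix, experts)
    )
    return fused, mix


# Illustrative values only (not taken from the paper).
scores = [1.2, -0.4]           # detection scores from two sub-systems
segment_features = [0.5]       # hypothetical per-segment descriptor
experts = [([0.7, 0.3], 0.0), ([0.2, 0.8], -0.1)]
gates = [([2.0], 0.0), ([-2.0], 0.0)]

fused, mix = wllr_fusion(scores, segment_features, experts, gates)
print(fused, mix)
```

For a fixed segment the mixing weights are constants, so the fused score reduces to an ordinary linear combination of the sub-system scores with effective weights `sum(m_k * w_k)`. Those effective weights change from segment to segment, which is the sense in which the abstract calls the model adaptive.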
