Comparing speaker independent and speaker adapted classification for word prominence detection
Andrea Schnall, M. Heckmann
2016 IEEE Spoken Language Technology Workshop (SLT), published 2016-12-01
DOI: 10.1109/SLT.2016.7846271 (https://doi.org/10.1109/SLT.2016.7846271)
Prosodic cues are an important part of human communication. One of these cues is word prominence, which is used, for example, to highlight important information. Because individual speakers express prominence in different ways, it is not easily extracted and incorporated into a dialog system. As a consequence, prominence has so far played only a marginal role in human-machine communication. In this paper we compare deep neural networks (DNNs) and support vector machines (SVMs) trained speaker-independently against SVM classification that uses a speaker adaptation method we recently developed. This adaptation method is based on the radial basis function of the SVM with a Gaussian regularization, which is derived from fMLLR (feature-space maximum likelihood linear regression). With this adaptation, we can notably reduce the problem of speaker variation. We present detailed evaluations of the methods and discuss the advantages and shortcomings of the proposed approaches for word prominence detection.
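
The abstract does not give the adaptation formulas, so the following is only a rough sketch of the general idea, not the authors' method. It assumes an fMLLR-style per-speaker affine transform x' = A x + b applied to a feature vector before it enters a standard RBF kernel, plus a Gaussian (squared-error) penalty that keeps the transform close to the identity. All function names, the `gamma` and `strength` parameters, and the exact form of the penalty are hypothetical and chosen for illustration.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Standard RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def affine_transform(x, A, b):
    """Apply an affine feature transform x' = A x + b (fMLLR-style; assumed form)."""
    return [
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
        for row, b_i in zip(A, b)
    ]


def adapted_kernel(x, support_vector, A, b, gamma=0.5):
    """RBF kernel between a transformed test feature and an untransformed support vector."""
    return rbf_kernel(affine_transform(x, A, b), support_vector, gamma)


def gaussian_penalty(A, b, strength=1.0):
    """Hypothetical squared-error penalty on the distance of (A, b) from the identity transform."""
    deviation = sum(
        (a_ij - (1.0 if i == j else 0.0)) ** 2
        for i, row in enumerate(A)
        for j, a_ij in enumerate(row)
    )
    deviation += sum(b_i ** 2 for b_i in b)
    return strength * deviation
```

With the identity matrix for `A` and a zero vector for `b`, `adapted_kernel` reduces to the plain RBF kernel and the penalty is zero, which corresponds to the speaker-independent case. In an actual adaptation scheme, `A` and `b` would be estimated per speaker by trading off a classification objective against the penalty; that estimation step is not shown here.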