Authors: Shweta Ghai, R. Sinha
Published in: TENCON 2008 - 2008 IEEE Region 10 Conference, 2008-11-01
DOI: 10.1109/TENCON.2008.4766828 (https://doi.org/10.1109/TENCON.2008.4766828)
An investigation into the effect of pitch transformation on children speech recognition
The degradation in automatic speech recognition performance when models trained on adult speech are applied to children's speech is a well-known problem. In this work, motivated by voice conversion approaches for addressing the acoustic mismatch between adult and children's speech, we investigate the effect of pitch transformation of children's speech on a telephone-based connected-digit recognition task. Our preliminary results indicate that the effect of pitch transformation on recognition performance for children's speech varies with the speaker's average pitch. When pitch was reduced, an improvement of 10% in recognition performance was observed for children with pitch values above 300 Hz. We also propose an explanation for this improvement based on a study of filter-bank smoothing in front-end signal processing.
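The filter-bank smoothing idea mentioned in the abstract can be illustrated with a rough sketch. This is one plausible reading, not the authors' actual analysis: a triangular mel filter smooths out pitch harmonics only if it is wide enough to span several of them, so a high average pitch leaves more of the narrow low-frequency filters resolving individual harmonics. The 21-filter bank, the 0 to 4000 Hz range, the mel formula, and the two-harmonic threshold below are all illustrative assumptions and are not taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using a common log-based formula (an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def filter_edges(n_filters, f_low, f_high):
    """Return (left, center, right) edges in Hz for triangular filters
    whose edges are equally spaced on the mel scale."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    pts = [mel_to_hz(m_low + i * step) for i in range(n_filters + 2)]
    return [(pts[i], pts[i + 1], pts[i + 2]) for i in range(n_filters)]


def harmonics_per_filter(edges, f0):
    """Approximate number of pitch harmonics under each filter (width / f0)."""
    return [(right - left) / f0 for (left, _, right) in edges]


def count_under_smoothed(edges, f0, min_harmonics=2.0):
    """Count filters spanning fewer than min_harmonics harmonics.
    The threshold is a hypothetical choice for illustration only."""
    return sum(1 for h in harmonics_per_filter(edges, f0) if h < min_harmonics)


if __name__ == "__main__":
    # Hypothetical telephone-band front end: 21 filters over 0-4000 Hz.
    edges = filter_edges(21, 0.0, 4000.0)
    for f0 in (300.0, 250.0, 200.0):
        n = count_under_smoothed(edges, f0)
        print(f"F0 = {f0:.0f} Hz: {n} of {len(edges)} filters span < 2 harmonics")
```

Under these assumptions, lowering the pitch can only reduce (or leave unchanged) the number of filters too narrow to average over multiple harmonics, which is consistent with the direction of the reported improvement for high-pitched speakers. The sketch says nothing about the size of that improvement.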