A Spectral Variation Function for Variable Time-Scale Modification of Speech
P. Kachare, P. C. Pandey
2021 National Conference on Communications (NCC), published 2021-07-27
DOI: 10.1109/NCC52529.2021.9530088 (https://doi.org/10.1109/NCC52529.2021.9530088)
The spectral variation function is used to detect salient segments, that is, segments with sharp spectral transitions. It is calculated from the cosine of the angle between the averaged feature vectors of adjacent segments. This paper presents a modified version of the function for variable time-scale modification of speech signals. The modified function uses the magnitude spectrum smoothed by auditory critical-band filters and adds a small offset to the normalization used in computing the angle cosine. Test results showed that the modified function detects spectral saliencies without producing spurious peaks. It is applied to variable time-scale modification that leaves the overall duration unchanged. Listening tests showed significantly better speech quality when processing used the modified function.
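The abstract gives no formula, so the following is only a minimal sketch of how a cosine-based spectral variation function with a normalization offset might be computed. Everything specific here is an assumption for illustration: the function name `spectral_variation`, the parameters `half_width` and `offset`, the choice of `1 - cos` as the output, and the placement of the offset in the denominator. The input is assumed to be a frame-by-band array of magnitude spectra that has already been smoothed by critical-band filters; that smoothing step is not shown.

```python
import numpy as np


def spectral_variation(features, half_width=3, offset=1e-3):
    """Hypothetical sketch of a cosine-based spectral variation function.

    features:   (num_frames, num_bands) array of non-negative band magnitudes,
                assumed to be smoothed already (smoothing is not done here).
    half_width: number of frames averaged on each side of the boundary
                (assumed value, not taken from the paper).
    offset:     small constant added to the normalization (assumed placement).
    """
    features = np.asarray(features, dtype=float)
    num_frames = features.shape[0]
    svf = np.zeros(num_frames)

    for n in range(half_width, num_frames - half_width):
        # Averaged feature vectors of the two adjacent segments.
        left = features[n - half_width:n].mean(axis=0)
        right = features[n:n + half_width].mean(axis=0)

        norm_product = np.linalg.norm(left) * np.linalg.norm(right)
        inner = np.dot(left, right)

        # With offset = 0 this equals 1 - cos(angle). The offset pulls the
        # value toward 0 when both segments have very low energy, where the
        # angle between near-zero vectors is poorly defined.
        svf[n] = max(norm_product - inner, 0.0) / (norm_product + offset)

    return svf


# Toy input: an abrupt change of spectral shape at frame 10.
toy = np.vstack([np.tile([1.0, 0.0], (10, 1)), np.tile([0.0, 1.0], (10, 1))])
toy_svf = spectral_variation(toy)
```

In this toy input the function peaks at the frame where the spectral shape changes, while an all-zero (silent) input yields zero everywhere. Whether the authors place the offset this way is not stated in the abstract.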