Feature extraction for automatic speech recognition (ASR)
B. Swartz, N. Magotra
Conference Record of The Thirtieth Asilomar Conference on Signals, Systems and Computers, 1996-11-03
DOI: 10.1109/ACSSC.1996.601153 (https://doi.org/10.1109/ACSSC.1996.601153)
This paper presents a new speech feature extraction technique for use in automatic speech recognition (ASR). The technique is based on a new two-dimensional series expansion that is applied to the spectrogram of a sampled speech signal. The series expansion allows for global analysis in frequency and local multiresolution analysis in time. Multiresolution analysis in time is useful because the duration of vowels is almost an order of magnitude greater than that of consonants.
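The abstract does not specify the series expansion itself, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a discrete cosine transform as a stand-in for the global analysis in frequency and a Haar wavelet decomposition as a stand-in for the local multiresolution analysis in time. All function names and parameters here are hypothetical.

```python
import cmath
import math


def spectrogram(signal, frame_len, hop):
    """Magnitude spectrogram: one row per time frame, one column per frequency bin."""
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Hann window to reduce spectral leakage.
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len))
                    for n, x in enumerate(frame)]
        # Naive DFT, keeping only the non-negative frequencies.
        rows.append([
            abs(sum(w * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, w in enumerate(windowed)))
            for k in range(frame_len // 2 + 1)
        ])
    return rows


def dct2(values):
    """Unnormalised DCT-II. Every coefficient depends on every input (global basis)."""
    n = len(values)
    return [sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                for i, v in enumerate(values))
            for k in range(n)]


def haar(values):
    """Orthonormal Haar decomposition; the length must be a power of two.

    Output layout: [overall approximation, coarsest detail, ..., finest details].
    Coarse coefficients summarise long segments, fine ones short segments.
    """
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = approx + detail
        n = half
    return out


def features(spec, n_freq_coeffs):
    """Assumed pipeline: DCT along frequency per frame, then Haar along time."""
    if not spec:
        raise ValueError("spectrogram has no frames")
    # Keep the largest power-of-two number of frames so haar() applies.
    usable = 1 << (len(spec).bit_length() - 1)
    per_frame = [dct2([math.log(m + 1e-10) for m in row])[:n_freq_coeffs]
                 for row in spec[:usable]]
    # Transpose so each frequency coefficient becomes a trajectory over time.
    return [haar([per_frame[t][k] for t in range(usable)])
            for k in range(n_freq_coeffs)]
```

In this sketch, a long steady segment such as a vowel would be captured mostly by the coarse Haar coefficients, while a short event such as a consonant would show up in the fine-scale detail coefficients. That difference in time scale is the motivation the abstract gives for multiresolution analysis in time.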