Diphone-like units without phonemes - option for very low bit rate speech coding
P. Motlícek, G. Baudoin, J. Černocký
EUROCON'2001. International Conference on Trends in Communications. Technical Program, Proceedings (Cat. No.01EX439)
Published: 2001-07-04
DOI: 10.1109/EURCON.2001.938162 (https://doi.org/10.1109/EURCON.2001.938162)
Citations: 5
Abstract
This work aims to improve the quality of speech produced by a very low bit rate (VLBR) segmental coder. The basic units are found automatically in a training database using temporal decomposition and vector quantization, and are modeled by hidden Markov models (HMMs). Two re-segmentation methods are then used to derive new, longer units. In the first, unit borders are placed at the centers of the previous units. In the second, borders are placed at the centers of the middle HMM states of the previous units. The number of frames in each new unit is required to exceed a fixed constant, so a new unit can span several previous segments. These techniques reduced the transition noise in the resulting speech.
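The re-segmentation rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `unit_centers` and `select_borders`, the frame-index representation of unit boundaries, and the greedy left-to-right selection are all assumptions made for illustration. Handling of the partial units at the start and end of an utterance is left out.

```python
from typing import List


def unit_centers(boundaries: List[int]) -> List[int]:
    """Return the center frame of each original unit.

    Assumption: `boundaries` lists the frame indices that delimit the
    original units, so unit i spans boundaries[i] .. boundaries[i + 1].
    """
    return [(start + end) // 2 for start, end in zip(boundaries, boundaries[1:])]


def select_borders(candidates: List[int], min_frames: int) -> List[int]:
    """Greedily keep candidate borders so every new unit exceeds min_frames.

    A candidate is skipped when the unit it would close is not longer than
    `min_frames` frames, which lets a new unit span several original segments.
    The greedy scan is an illustrative choice, not taken from the paper.
    """
    if not candidates:
        return []
    borders = [candidates[0]]
    for frame in candidates[1:]:
        if frame - borders[-1] > min_frames:
            borders.append(frame)
    return borders


# First method: candidate borders are the centers of the original units.
original_boundaries = [0, 10, 14, 18, 30]  # hypothetical frame indices
centers = unit_centers(original_boundaries)
new_borders = select_borders(centers, min_frames=6)
print(centers, new_borders)
```

For the second method, the same `select_borders` call would receive the centers of the middle HMM states instead, which would have to come from a state-level alignment that this sketch does not compute.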