Ganjun Liu , Tao Zhang , Biyun Ding , Ying Lv , Xiaohui Hou , Haoyang Guo , Yaqin Wu , Dehui Fu
{"title":"GBNF-VAE: A Pathological Voice Enhancement Model Based on Gold Section for Bottleneck Feature With Variational Autoencoder","authors":"Ganjun Liu , Tao Zhang , Biyun Ding , Ying Lv , Xiaohui Hou , Haoyang Guo , Yaqin Wu , Dehui Fu","doi":"10.1016/j.jvoice.2023.03.012","DOIUrl":null,"url":null,"abstract":"<div><h3>Objective</h3><div>Speech enhancement has become a promising technique to accommodate demands of the improvement in quality of a degraded speech signal. The main works now focus on separating normal speech from noise, but have neglected the low quality of impaired speech influenced by anomalous glottis flow. In order to effectively enhance the pathological speech, it is essential to design a separation mechanism for extracting high-dimensional timbre features and speech features separately to suppress low-dimensional noises.</div></div><div><h3>Methods</h3><div>In this paper, we propose an enhancement model GBNF-VAE to extract timbre efficiently by reducing anomalous airflow noise interference, and by combining the semantic features with timbre features to synthesize the enhanced speech. In particular, the bottleneck feature can characterize the timbre by the controlled number of nodes through the Golden Section method, which effectively improves computational efficiency. In addition, variational autoencoder is adopted to extract semantic features which are combined with the previous timbre features to synthesize the enhanced speech.</div></div><div><h3>Results</h3><div>Finally, spectrum observation, objective indicators and subjective evaluation all show the outstanding performance of GBNF-VAE in pathological speech quality enhancement.</div></div>","PeriodicalId":49954,"journal":{"name":"Journal of Voice","volume":"39 5","pages":"Pages 1171-1182"},"PeriodicalIF":2.4000,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Voice","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0892199723001054","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Objective
Speech enhancement has become a promising technique to accommodate demands of the improvement in quality of a degraded speech signal. The main works now focus on separating normal speech from noise, but have neglected the low quality of impaired speech influenced by anomalous glottis flow. In order to effectively enhance the pathological speech, it is essential to design a separation mechanism for extracting high-dimensional timbre features and speech features separately to suppress low-dimensional noises.
Methods
In this paper, we propose an enhancement model GBNF-VAE to extract timbre efficiently by reducing anomalous airflow noise interference, and by combining the semantic features with timbre features to synthesize the enhanced speech. In particular, the bottleneck feature can characterize the timbre by the controlled number of nodes through the Golden Section method, which effectively improves computational efficiency. In addition, variational autoencoder is adopted to extract semantic features which are combined with the previous timbre features to synthesize the enhanced speech.
Results
Finally, spectrum observation, objective indicators and subjective evaluation all show the outstanding performance of GBNF-VAE in pathological speech quality enhancement.
期刊介绍:
The Journal of Voice is widely regarded as the world''s premiere journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists'' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.