Bone-Conducted Speech to Air-Conducted Speech Conversion Based on Cycle-Consistent Adversarial Networks
Authors: Qing Pan, Jian Zhou, Teng Gao, L. Tao
Venue: 2020 IEEE 3rd International Conference on Information Communication and Signal Processing (ICICSP)
Publication date: 2020-09-01
DOI: https://doi.org/10.1109/ICICSP50920.2020.9232121
Citations: 2
Abstract
Compared with conventional air-conducted microphone (ACM) speech, bone-conducted microphone (BCM) speech is largely shielded from background noise, which helps improve communication quality in strongly noisy environments. Based on an analysis of the bandwidth difference between the two, this paper proposes a method that uses Cycle-Consistent Adversarial Networks (CycleGAN) to extend the bandwidth of BCM speech and convert it to ACM speech. The proposed method learns the mapping between BCM speech and ACM speech without relying on parallel data, and it requires no additional data, modules, or alignment process. It also avoids the over-smoothing that commonly appears in many statistical models. The experimental results show that the method better reconstructs the high-frequency components of BCM speech. Compared with the original speech, the converted speech improves both subjective and objective results and yields Mel-spectrum features with higher similarity to the target speech.
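The abstract does not give the loss formulation, so the following is only a minimal sketch of the cycle-consistency term that CycleGAN-style training generally relies on to learn from unpaired data. It assumes an L1 distance, as in the standard CycleGAN formulation, and represents each feature frame as a plain list of floats. The names `cycle_consistency_loss`, `toy_g`, and `toy_f` are hypothetical, and the toy generators stand in for the neural networks a real system would use.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # one feature frame, e.g. Mel-spectrum bins (assumed representation)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean absolute difference between two frames of equal length."""
    if len(a) != len(b):
        raise ValueError("frames must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    g_bcm_to_acm: Callable[[Frame], Frame],
    f_acm_to_bcm: Callable[[Frame], Frame],
    bcm_frames: Sequence[Frame],
    acm_frames: Sequence[Frame],
) -> float:
    """L1 cycle loss: F(G(x)) should return to x, and G(F(y)) to y.

    The two batches are unpaired: they need not come from the same
    utterances, and they may differ in size. Both must be non-empty.
    """
    # BCM -> ACM -> BCM reconstruction error, averaged over the BCM batch.
    forward = sum(
        l1_distance(f_acm_to_bcm(g_bcm_to_acm(x)), x) for x in bcm_frames
    ) / len(bcm_frames)
    # ACM -> BCM -> ACM reconstruction error, averaged over the ACM batch.
    backward = sum(
        l1_distance(g_bcm_to_acm(f_acm_to_bcm(y)), y) for y in acm_frames
    ) / len(acm_frames)
    return forward + backward


# Toy stand-ins for the two generators (hypothetical, for illustration only).
def toy_g(frame: Frame) -> Frame:
    return [2.0 * v for v in frame]


def toy_f(frame: Frame) -> Frame:
    return [0.5 * v for v in frame]
```

Because `toy_f` exactly inverts `toy_g`, the loss for this pair is zero on any input. In full CycleGAN training this term is added to the adversarial losses of the two discriminators, which are omitted here.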