Application of Deep Learning-based Single-channel Speech Enhancement for Frequency-modulation Transmitted Speech
Authors: Yingyi Ma, Xueliang Zhang
Venue: 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Publication date: 2022-11-07
DOI: 10.23919/APSIPAASC55919.2022.9980216
Three main kinds of interference affect FM signal transmission: the multipath effect, the Doppler effect, and white noise. All three significantly degrade the transmitted speech. We propose a method that uses a masking or mapping approach for single-channel speech enhancement in wireless communication. Because the method improves speech quality by addressing all three kinds of interference simultaneously, it is simpler than conventional methods. Experiments are conducted on a dataset that we simulated ourselves. Because PESQ and STOI require reference targets, it is hard to evaluate performance on real-world data, so for real data we give only a spectral comparison of the enhancement results. On the simulated data, the method yields a marked improvement over the unprocessed mixture, and on the actually collected data it significantly improves speech quality. These results support the feasibility of deep learning for this kind of task. Future work will aim to improve real-time performance and reduce the number of network parameters.
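The abstract does not say how the simulated dataset was generated. As a rough illustration only, the sketch below shows one way the three kinds of interference could be applied to an FM signal in complex baseband using NumPy. The function names (`fm_modulate`, `apply_channel`, `fm_demodulate`), the two-path channel model, and every parameter value (frequency deviation, echo delay and gain, Doppler shift, SNR) are assumptions made for this example, not the authors' setup.

```python
import numpy as np


def fm_modulate(message, fs, freq_dev=5_000.0):
    """Complex-baseband FM: the phase is the running sum of the message."""
    phase = 2.0 * np.pi * freq_dev * np.cumsum(message) / fs
    return np.exp(1j * phase)


def apply_channel(iq, fs, delay_s=2e-4, echo_gain=0.5,
                  doppler_hz=50.0, snr_db=20.0, rng=None):
    """Hypothetical channel: one Doppler-shifted echo plus white noise."""
    rng = np.random.default_rng(0) if rng is None else rng
    n = np.arange(iq.size)

    # Multipath: a single delayed, attenuated copy of the direct path.
    d = int(round(delay_s * fs))
    echo = np.zeros_like(iq)
    echo[d:] = iq[:iq.size - d]

    # Doppler: the echo carries a constant frequency offset (assumed model).
    echo = echo * np.exp(2j * np.pi * doppler_hz * n / fs)
    rx = iq + echo_gain * echo

    # White noise: complex Gaussian noise scaled to the requested SNR.
    noise_power = np.mean(np.abs(rx) ** 2) / 10.0 ** (snr_db / 10.0)
    noise = np.sqrt(noise_power / 2.0) * (
        rng.standard_normal(iq.size) + 1j * rng.standard_normal(iq.size)
    )
    return rx + noise


def fm_demodulate(iq, fs, freq_dev=5_000.0):
    """Recover the message from sample-to-sample phase differences."""
    prev = np.concatenate(([1.0 + 0.0j], iq[:-1]))
    dphi = np.angle(iq * np.conj(prev))
    return dphi * fs / (2.0 * np.pi * freq_dev)


# A 440 Hz test tone stands in for a clean speech signal.
fs = 48_000
t = np.arange(fs) / fs
clean = 0.5 * np.sin(2.0 * np.pi * 440.0 * t)

iq = fm_modulate(clean, fs)
recovered = fm_demodulate(iq, fs)                       # undistorted round trip
degraded = fm_demodulate(apply_channel(iq, fs), fs)     # degraded "mixture"
```

Pairs such as (`degraded`, `clean`) are the kind of input and reference target that a masking or mapping network needs for training, and that PESQ and STOI need for scoring. For real recordings no such `clean` reference exists, which is why the abstract falls back on spectral comparison there.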