Cross Conditional Network for Speech Enhancement
Haruki Tanaka, Yosuke Sugiura, N. Yasui, T. Shimamura, Ryoichi Miyazaki
2019 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 1-2, published 2019-12-01
DOI: https://doi.org/10.1109/ISPACS48206.2019.8986375
Interest in speech enhancement is growing in the signal processing field, and many speech enhancement methods based on deep neural networks have recently been proposed. Most of these networks, such as SEGAN and Wave-U-Net, adopt an autoencoder structure. In this paper, we propose the cross conditional network for speech enhancement, based on the SEGAN architecture. The proposed network has two autoencoders whose mutual latent vector is formed by concatenating the outputs of the two encoders. In experiments, we show that the proposed method outperforms SEGAN on the objective evaluation measure PESQ.
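The concatenated latent vector described above can be illustrated with a toy forward pass. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say what each encoder receives, how the decoders consume the latent vector, or which layers are used. Here both encoders are assumed to read the same noisy frame, both decoders are assumed to read the shared latent vector, and single dense tanh layers stand in for SEGAN-style convolutional stacks. All names (`make_linear`, `apply_linear`, `cross_conditional_forward`, `FRAME`, `LATENT`) are hypothetical.

```python
import math
import random


def make_linear(n_in, n_out, rng):
    """Return a random weight matrix with n_out rows and n_in columns."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.uniform(-scale, scale) for _ in range(n_in)]
            for _ in range(n_out)]


def apply_linear(weights, x, activation=math.tanh):
    """Compute activation(W x) for a vector x stored as a plain list."""
    return [activation(sum(w * v for w, v in zip(row, x))) for row in weights]


def cross_conditional_forward(x_a, x_b, params):
    """Toy forward pass through two autoencoders sharing one latent vector."""
    z_a = apply_linear(params["enc_a"], x_a)  # latent from encoder A
    z_b = apply_linear(params["enc_b"], x_b)  # latent from encoder B
    z = z_a + z_b  # list concatenation: the shared ("mutual") latent vector
    # Assumption: each decoder reads the full concatenated latent vector.
    y_a = apply_linear(params["dec_a"], z)
    y_b = apply_linear(params["dec_b"], z)
    return z, y_a, y_b


FRAME, LATENT = 16, 4  # hypothetical frame length and per-encoder latent size
rng = random.Random(0)
params = {
    "enc_a": make_linear(FRAME, LATENT, rng),
    "enc_b": make_linear(FRAME, LATENT, rng),
    "dec_a": make_linear(2 * LATENT, FRAME, rng),
    "dec_b": make_linear(2 * LATENT, FRAME, rng),
}

# Assumption: both encoders see the same noisy input frame.
noisy = [rng.uniform(-1.0, 1.0) for _ in range(FRAME)]
z, y_a, y_b = cross_conditional_forward(noisy, noisy, params)
print(len(z), len(y_a), len(y_b))  # prints: 8 16 16
```

The shared latent vector is twice the size of each encoder output, so each decoder's input layer must be sized for the concatenation rather than for a single encoder.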