Investigating Single Channel Source Separation Using Non-Negative Matrix Factorization and Its Variants for Overlapping Speech Signal
Nandini C Nag, M. Shah
2019 International Conference on Nascent Technologies in Engineering (ICNTE), 2019
DOI: 10.1109/ICNTE44896.2019.8946013 (https://doi.org/10.1109/ICNTE44896.2019.8946013)
As a pre-processor to speech recognition, audio source separation may mitigate the degradation in recognition quality for individual signals in scenarios such as a cocktail-party environment. It may also be used in other applications such as audio forensics, speaker verification, instrument identification, and hearing aids. Various techniques are available for single-channel audio source separation, and those based on Non-negative Matrix Factorization (NMF) are widely used. Several studies have shown considerable improvement in separation performance using NMF on different mixtures of audio signals, such as speech with noise, speech with music, and speech with speech, taken from different audio databases. In this paper, single-channel source separation using NMF and its variants is investigated for two-speaker mixed signals drawn from a single speech database, the GRID speech corpus. The separation performance of phase-aware algorithms is compared with that of phase-unaware approaches based on NMF and its variants. The quality of the separated speech was judged while varying parameters such as the number of bases and the analysis window size.
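To make the NMF-based approach concrete, the following is a minimal sketch of supervised, phase-unaware separation of a two-source mixture. It is not the authors' implementation: the abstract does not state the cost function or the specific variants, so Euclidean multiplicative updates are assumed here. The function names `nmf` and `separate`, the matrix sizes, and the random non-negative matrices standing in for magnitude spectrograms are all hypothetical. In a real pipeline the inputs would be STFT magnitudes, the analysis window size would set the time-frequency resolution, and the mixture phase would be reused for resynthesis.

```python
import numpy as np

def nmf(V, n_bases, n_iter=200, W=None, seed=0, eps=1e-9):
    """Approximate V by W @ H using Euclidean multiplicative updates.

    If W is supplied it is kept fixed (n_bases is then ignored) and
    only the activations H are estimated.
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], n_bases)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W1, W2, n_iter=200, eps=1e-9):
    """Split a mixture magnitude matrix using per-speaker bases W1, W2."""
    W = np.hstack([W1, W2])              # concatenated, fixed dictionary
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W1.shape[1]
    S1 = W1 @ H[:k]                      # speaker 1 model of the mixture
    S2 = W2 @ H[k:]                      # speaker 2 model of the mixture
    mask1 = S1 / (S1 + S2 + eps)         # soft (ratio) mask in [0, 1]
    return mask1 * V_mix, (1.0 - mask1) * V_mix

# Toy data: random low-rank non-negative matrices stand in for the
# magnitude spectrograms of two speakers (assumption, not real speech).
rng = np.random.default_rng(1)
n_freq, n_frames, n_bases = 64, 50, 4
V1 = rng.random((n_freq, n_bases)) @ rng.random((n_bases, n_frames))
V2 = rng.random((n_freq, n_bases)) @ rng.random((n_bases, n_frames))

# Training stage: learn one set of bases per speaker.
W1, _ = nmf(V1, n_bases)
W2, _ = nmf(V2, n_bases, seed=2)

# Separation stage: decompose the mixture over the fixed bases.
V_mix = V1 + V2
E1, E2 = separate(V_mix, W1, W2)
print(E1.shape, E2.shape)
```

In this sketch, `n_bases` corresponds to the number of bases that the abstract names as a tuning parameter. The soft mask guarantees that the two estimates sum back to the mixture magnitude.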