Study on the Selection of Specific Filters for Enhancement of Recorded Speech for Speaker Identification

P. V. Jiju, C. P. Singh, R. Sharma
The Open Forensic Science Journal, vol. 2, no. 1, pp. 29-33. Published 2009-04-23.
DOI: 10.2174/1874402800902010029 (https://doi.org/10.2174/1874402800902010029)
Citations: 1

Abstract

Understanding noise characteristics in order to find appropriate filtering techniques, so as to obtain sufficiently clear speech samples for Speaker Identification, is one of the challenging tasks in Forensic Acoustics. The speaker's idiosyncratic speech should not be affected when noise reduction is carried out; otherwise, Speaker Identification becomes highly error-prone. We collected fifty noisy speech samples, reported to have been recorded in different modes, from actual crime cases received in the laboratory. The samples were analyzed after being subjected to various filtering techniques and compared with clear speech mixed with noise collected from the non-speech portions. Distortion of the speech was studied at various stages of filter application in terms of SNR and Speaker Specific Information. With retention of the Speaker Specific Information as the primary concern of our study, the limitations of the filtering techniques, which depend on the characteristics and intensity level of the noise, were worked out for the noisy speech samples. A statistical study was subsequently conducted. Listening tests were conducted to ensure that the perceptual features of the original noisy speech are preserved when filters are applied. This work demonstrates the efficiency of noise reduction filters in improving SNR, and their controlled application for preserving speaker-dependent features, depending on the noise characteristics embedded in the speech samples.

Audio Forensics has a challenging history of enhancement problems with speech samples received for examination. Of the total speech samples received for Speaker Identification in the Laboratory, a large number of recordings require enhancement. Speech is a non-linear time series represented in terms of complex numbers. Hence, separating noise from noisy speech in the spectral domain admits countless solutions.
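
The abstract measures distortion in terms of SNR, takes its noise from the non-speech portions of a recording, and compares against clear speech mixed with that noise. The text shown here does not give the formulas used, so the following is only a minimal Python sketch of one common way to do both steps. It assumes additive noise that is uncorrelated with the speech; the function names `power`, `estimate_snr_db` and `mix_at_snr` are hypothetical and do not come from the paper.

```python
import numpy as np


def power(x):
    """Mean squared amplitude of a signal."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(x ** 2))


def estimate_snr_db(noisy_speech, noise_only):
    """Estimate the SNR (in dB) of a noisy speech segment.

    Assumption (not from the paper): the noise is additive and
    uncorrelated with the speech, so P(noisy) ~= P(speech) + P(noise).
    `noise_only` is a stretch of the same recording with no speech in it.
    """
    p_noise = power(noise_only)
    # Floor the speech power so the logarithm stays defined.
    p_speech = max(power(noisy_speech) - p_noise, 1e-12)
    return 10.0 * np.log10(p_speech / p_noise)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` and add it to `clean` so the mixture has SNR `snr_db`."""
    clean = np.asarray(clean, dtype=float)
    # Repeat or trim the noise so it matches the length of the clean signal.
    noise = np.resize(np.asarray(noise, dtype=float), clean.shape)
    gain = np.sqrt(power(clean) / (power(noise) * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise
```

Under these assumptions, `mix_at_snr` builds a reference mixture at a known SNR, and `estimate_snr_db` can be applied before and after a filter to see how much the SNR changes. It says nothing about whether speaker-specific information survives the filter, which the abstract treats as the primary concern.
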
The main objective of a Noise Cancellation system is to obtain a clear, higher-quality speech signal. Noise in speech signals can cause a greater degree of mismatch in the performance of speech processing systems used for Speaker Identification as well as Speech Recognition. Inappropriate filtering of noise amounts to extracting features of the noise together with the actual speech signal during feature extraction, so the desired parametric representation carries a high error rate. Broadband noise and a very low SNR degrade the intelligibility of most of the recorded speech samples. The speaker's idiosyncratic speech is affected when noise reduction is carried out. Thus the Speaker Identification