A Robust Pipeline based Deep Learning Approach to Detect Speech Attribution

Shreya Chakravarty, R. Khandelwal
Published in: 2023 IEEE 8th International Conference for Convergence in Technology (I2CT), 2023-04-07
DOI: 10.1109/I2CT57861.2023.10126219 (https://doi.org/10.1109/I2CT57861.2023.10126219)
Citations: 0

Abstract

Today's "thinking machines" bring the benefit of reducing human effort along with the drawback of being easily misused. Automation has a vast range of applications, one of the most popular being speech recognition. Automated systems can now be controlled by voice commands and can provide human-like responses, whether in appearance or in communication media such as speech. The source of the audio will not always be in ideal surroundings. This increases the likelihood that human-system interaction involves audio aberrations and therefore raises serious concern about forensic issues such as the authenticity and the source of a given audio recording, which remain a challenge to resolve. This paper illustrates thorough augmentation of audio data for a robust solution that removes anomalies in audio using a pipeline approach. We propose analysing the spectrogram representation of an audio signal to determine a mask that separates noise from the clean signal, yielding a signal that can be processed for speech recognition, and extend this to building a deep neural network with an accuracy of 95.87%.
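The abstract does not say how the spectrogram mask is computed, so the following is only a minimal sketch of one common way to do it (spectral gating), not the authors' method. It assumes that the first few frames of the recording contain noise only, and the names `FRAME`, `HOP`, `noise_frames`, and `factor` are hypothetical choices made for illustration.

```python
import numpy as np

FRAME = 256        # samples per analysis frame (assumed value)
HOP = FRAME // 2   # 50% overlap between consecutive frames


def stft(x):
    """Hann-windowed short-time Fourier transform, shape (n_frames, FRAME // 2 + 1)."""
    window = np.hanning(FRAME)
    n_frames = 1 + (len(x) - FRAME) // HOP
    frames = np.stack([x[i * HOP : i * HOP + FRAME] for i in range(n_frames)])
    return np.fft.rfft(frames * window, axis=1)


def istft(spec):
    """Weighted overlap-add inverse of stft()."""
    window = np.hanning(FRAME)
    frames = np.fft.irfft(spec, n=FRAME, axis=1) * window
    out = np.zeros(HOP * (len(spec) - 1) + FRAME)
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * HOP : i * HOP + FRAME] += frame
        norm[i * HOP : i * HOP + FRAME] += window ** 2
    return out / np.maximum(norm, 1e-8)


def noise_mask(spec, noise_frames, factor=2.0):
    """Boolean mask: True where a bin exceeds `factor` times the noise floor.

    Assumption: the first `noise_frames` frames contain noise only.
    """
    magnitude = np.abs(spec)
    noise_floor = magnitude[:noise_frames].mean(axis=0)  # one estimate per frequency bin
    return magnitude > factor * noise_floor


def denoise(x, noise_frames=10, factor=2.0):
    """Zero out spectrogram bins that look like noise, then resynthesize."""
    n = len(x)
    # Zero-pad so every original sample is covered by two overlapping frames.
    padded = np.pad(x, (HOP, HOP + (-n) % HOP))
    spec = stft(padded)
    mask = noise_mask(spec, noise_frames, factor)
    return istft(spec * mask)[HOP : HOP + n]
```

Calling `denoise(samples)` on a one-dimensional float array returns an array of the same length. A hard binary mask like this one is simple but can leave isolated residual bins ("musical noise"); a soft mask or a learned mask would be alternatives, and the paper may well use something different.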