Speech Audio Splicing Detection and Localization Exploiting Reverberation Cues

2020 IEEE International Workshop on Information Forensics and Security (WIFS). Publication date: 2020-12-06. DOI: 10.1109/WIFS49906.2020.9360900

Davide Capoferri, Clara Borrelli, Paolo Bestagini, F. Antonacci, A. Sarti, S. Tubaro

Citations: 10

Abstract

Manipulating speech audio recordings through splicing is a task within everyone's reach. Indeed, it is very easy to collect multiple audio recordings of well-known public figures (e.g., actors and politicians) through social media. These can be cut into smaller excerpts and concatenated to generate new audio content. Because a fake speech attributed to a famous person can be used to spread fake news and negatively impact society, the ability to detect whether a speech recording has been manipulated is of great interest to the forensics community. In this work, we focus on speech audio splicing detection and localization. We leverage the idea that distinct recordings may be acquired in different environments, which are typically characterized by distinctive reverberation cues. Exploiting this property, our method estimates inconsistencies in the reverberation time throughout a speech recording. If reverberation inconsistencies are detected, the audio track is tagged as manipulated and the time instant of the splicing point is estimated.
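The decision step described in the abstract (flag a recording when the reverberation time is inconsistent, then estimate where the change occurs) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes that per-window reverberation time estimates (in seconds) are already available from some blind estimator, and it swaps in a generic two-segment least-squares change-point search. The function name `locate_splice`, the 0.15 s threshold, the `min_windows` parameter, and the convention that window `k` starts at `k * hop_seconds` are all hypothetical choices made for illustration.

```python
from statistics import mean


def _sse(values):
    """Sum of squared deviations of the values from their mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def locate_splice(rt_estimates, hop_seconds, threshold=0.15, min_windows=3):
    """Search for a single change in a sequence of reverberation time estimates.

    rt_estimates: per-window reverberation times in seconds (assumed given).
    hop_seconds:  time between the starts of consecutive windows (assumed).
    threshold:    hypothetical minimum jump in mean reverberation time (seconds)
                  required to tag the track as manipulated.
    Returns (is_spliced, splice_time_seconds_or_None, gap_seconds).
    """
    n = len(rt_estimates)
    if n < 2 * min_windows:
        # Too few windows to compare two segments reliably.
        return False, None, 0.0

    # Pick the split that best explains the sequence as two constant segments.
    best_k, best_cost = None, float("inf")
    for k in range(min_windows, n - min_windows + 1):
        cost = _sse(rt_estimates[:k]) + _sse(rt_estimates[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost

    # Inconsistency measure: difference between the two segment means.
    gap = abs(mean(rt_estimates[:best_k]) - mean(rt_estimates[best_k:]))
    if gap < threshold:
        return False, None, gap
    return True, best_k * hop_seconds, gap


if __name__ == "__main__":
    # Synthetic example: five windows near 0.3 s followed by five near 0.7 s.
    rts = [0.30, 0.32, 0.31, 0.29, 0.30, 0.71, 0.69, 0.70, 0.72, 0.70]
    print(locate_splice(rts, hop_seconds=0.5))
```

The least-squares split is used here because it places the change point where the two segments are each most internally consistent; a real system would also need to handle multiple splicing points and noisy estimates, which this sketch does not attempt.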
