Audio Tampering Forensics Based on Representation Learning of ENF Phase Sequence
Chunyan Zeng, Yao Yang, Zhifeng Wang, Shuaifei Kong, Shixiong Feng
International Journal of Digital Crime and Forensics, 2022-01-01. DOI: 10.4018/ijdcf.302894 (https://doi.org/10.4018/ijdcf.302894)
Citations: 16
Abstract
This paper proposes an audio tampering detection method that combines the electric network frequency (ENF) phase with a bidirectional LSTM (BI-LSTM) network, treating the task as temporal feature representation learning. First, the ENF phase is obtained by applying the discrete Fourier transform to the ENF component of the audio. Second, the ENF phase is divided into frames to form an ENF phase sequence, in which each frame represents the change of the ENF phase within a period. Then, the BI-LSTM network is trained on this sequence and outputs a state at each time step, capturing the differences between real and tampered audio. Finally, these difference features are fitted and reduced in dimension by a fully connected network and classified by a Softmax classifier. Experimental results show that the method outperforms state-of-the-art approaches.
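The first two steps of the abstract (DFT-based phase estimation and framing) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, block length, frame length, and hop size are hypothetical, and the input is a synthetic 50 Hz tone standing in for an ENF component that has already been isolated from the audio. The BI-LSTM, fully connected, and Softmax stages are omitted.

```python
import cmath
import math


def enf_phase_sequence(samples, fs, f0, block_len):
    """Estimate the phase of the f0 component in consecutive blocks.

    Each block is projected onto a complex exponential at f0 (a single-bin
    DFT). Absolute sample indices are used in the exponent, so an unedited
    tone at exactly f0 yields the same phase in every block.
    """
    phases = []
    for start in range(0, len(samples) - block_len + 1, block_len):
        acc = 0j
        for n in range(start, start + block_len):
            acc += samples[n] * cmath.exp(-2j * math.pi * f0 * n / fs)
        phases.append(cmath.phase(acc))
    return phases


def frame_phase_changes(phases, frame_len, hop):
    """Split block-to-block phase changes into overlapping frames."""
    # Wrap each difference into [-pi, pi] so 2*pi jumps are not read as edits.
    changes = [
        cmath.phase(cmath.exp(1j * (b - a))) for a, b in zip(phases, phases[1:])
    ]
    return [
        changes[i:i + frame_len]
        for i in range(0, len(changes) - frame_len + 1, hop)
    ]


def synth_tone(n_samples, fs, f0, cut_at=None, cut_len=0):
    """Synthetic stand-in for an ENF tone; optionally delete cut_len samples."""
    tone = [math.cos(2 * math.pi * f0 * n / fs) for n in range(n_samples + cut_len)]
    if cut_at is not None:
        tone = tone[:cut_at] + tone[cut_at + cut_len:]
    return tone[:n_samples]


def peak_change(frames):
    """Largest absolute phase change found in any frame."""
    return max(abs(c) for frame in frames for c in frame)


# Assumed parameters: 1 kHz sampling, 50 Hz nominal ENF, 200-sample blocks.
fs, f0 = 1000, 50
clean = synth_tone(4000, fs, f0)
edited = synth_tone(4000, fs, f0, cut_at=2050, cut_len=5)  # 5 ms deletion

clean_frames = frame_phase_changes(enf_phase_sequence(clean, fs, f0, 200), 5, 2)
edited_frames = frame_phase_changes(enf_phase_sequence(edited, fs, f0, 200), 5, 2)

print("peak phase change, clean :", peak_change(clean_frames))
print("peak phase change, edited:", peak_change(edited_frames))
```

In this toy setting the deletion shows up as a single large phase change, while the unedited tone stays essentially flat. Real recordings have a drifting, noisy ENF, which is presumably why the paper feeds the framed sequence to a learned sequence model rather than thresholding it directly.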