{"title":"音频深度假检测的自适应逆摄动网络","authors":"Xue Ouyang , Chunhui Wang , Bin Zhao , Hao Li","doi":"10.1016/j.neucom.2025.131466","DOIUrl":null,"url":null,"abstract":"<div><div>The growing prevalence of audio deepfakes underscores the urgent need for advanced detection frameworks capable of identifying subtle synthetic artifacts. In response to this challenge, we propose an Adaptive Reverse Perturbation Network, a novel architecture that leverages partial reversal strategies on speech segments and incorporates hierarchical feature discrepancy analysis to enhance deepfake detection. Specifically, the proposed framework employs learnable reversal modules to capture phase discontinuities and spectral anomalies, and utilizes Prime-window reversal to reveal synthetic artifacts that emerge exclusively in reversed speech. Evaluations conducted on five benchmark datasets demonstrate the superior performance of the proposed method, achieving an equal error rate of 1.98 %, representing a 39.6 % improvement over previous systems, as well as a t-DCF of 0.237. Further analysis reveals an inverse correlation between language-specific weight similarity and detection accuracy. These results validate the effectiveness of the trainable differential convolution and reverse perturbation strategies in combating the evolving threat of audio deepfakes, and provide novel insights into phonological artifact patterns associated with synthetic speech.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"658 ","pages":"Article 131466"},"PeriodicalIF":6.5000,"publicationDate":"2025-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Adaptive reverse perturbation network for audio deepfake detection\",\"authors\":\"Xue Ouyang , Chunhui Wang , Bin Zhao , Hao Li\",\"doi\":\"10.1016/j.neucom.2025.131466\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>The growing prevalence of audio deepfakes underscores the urgent need for advanced detection frameworks capable of identifying subtle synthetic artifacts. In response to this challenge, we propose an Adaptive Reverse Perturbation Network, a novel architecture that leverages partial reversal strategies on speech segments and incorporates hierarchical feature discrepancy analysis to enhance deepfake detection. Specifically, the proposed framework employs learnable reversal modules to capture phase discontinuities and spectral anomalies, and utilizes Prime-window reversal to reveal synthetic artifacts that emerge exclusively in reversed speech. Evaluations conducted on five benchmark datasets demonstrate the superior performance of the proposed method, achieving an equal error rate of 1.98 %, representing a 39.6 % improvement over previous systems, as well as a t-DCF of 0.237. Further analysis reveals an inverse correlation between language-specific weight similarity and detection accuracy. 
These results validate the effectiveness of the trainable differential convolution and reverse perturbation strategies in combating the evolving threat of audio deepfakes, and provide novel insights into phonological artifact patterns associated with synthetic speech.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"658 \",\"pages\":\"Article 131466\"},\"PeriodicalIF\":6.5000,\"publicationDate\":\"2025-09-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0925231225021381\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0925231225021381","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Adaptive reverse perturbation network for audio deepfake detection
The growing prevalence of audio deepfakes underscores the urgent need for advanced detection frameworks capable of identifying subtle synthetic artifacts. In response to this challenge, we propose an Adaptive Reverse Perturbation Network, a novel architecture that leverages partial reversal strategies on speech segments and incorporates hierarchical feature discrepancy analysis to enhance deepfake detection. Specifically, the proposed framework employs learnable reversal modules to capture phase discontinuities and spectral anomalies, and utilizes Prime-window reversal to reveal synthetic artifacts that emerge exclusively in reversed speech. Evaluations conducted on five benchmark datasets demonstrate the superior performance of the proposed method, achieving an equal error rate of 1.98%, representing a 39.6% improvement over previous systems, as well as a t-DCF of 0.237. Further analysis reveals an inverse correlation between language-specific weight similarity and detection accuracy. These results validate the effectiveness of the trainable differential convolution and reverse perturbation strategies in combating the evolving threat of audio deepfakes, and provide novel insights into phonological artifact patterns associated with synthetic speech.
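As a minimal illustration of the reversal idea described above (not the authors' implementation, which is not detailed in this abstract), the Python sketch below assumes that "Prime-window reversal" means time-reversing the samples inside prime-indexed, fixed-length windows of the waveform; the function names, the 400-sample window length, and the indexing convention are hypothetical choices made for the example.

```python
# Illustrative sketch only: the abstract names "Prime-window reversal" but does
# not specify it, so the window scheme, the 400-sample window length, and the
# function names below are hypothetical choices made for this example.
import numpy as np


def is_prime(n: int) -> bool:
    """Primality test for small window indices."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


def reverse_prime_windows(wave: np.ndarray, win_len: int = 400) -> np.ndarray:
    """Time-reverse the samples of every prime-indexed window of a waveform.

    Reversing selected segments can expose phase discontinuities and spectral
    anomalies that synthetic speech tends to leave behind; a downstream
    detector can then compare features of the perturbed and original signals.
    """
    out = wave.copy()
    n_windows = len(wave) // win_len
    for i in range(n_windows):
        if is_prime(i):
            start, end = i * win_len, (i + 1) * win_len
            out[start:end] = wave[start:end][::-1]  # reverse this segment
    return out


if __name__ == "__main__":
    # One second of 16 kHz audio (random stand-in for real speech).
    rng = np.random.default_rng(0)
    wave = rng.standard_normal(16000).astype(np.float32)
    perturbed = reverse_prime_windows(wave, win_len=400)  # 25 ms windows at 16 kHz
    print(perturbed.shape)  # (16000,)
```

In a detector of this kind, the perturbed and original signals would both be passed to a feature extractor so that segment-level discrepancies (for example, the phase discontinuities mentioned above) can be compared; this comparison is the role the abstract attributes to hierarchical feature discrepancy analysis.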
Journal introduction:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.