Efficient Detection of QIM-Based VoIP Steganography Using Adjacent Frame Integration and Multi-Codeword Priority Attention
Authors: Cheng Zhang; Yue Yan; Shujuan Jiang; Zhong Chen
Journal: IEEE Signal Processing Letters, vol. 32, pp. 2534-2538 (Impact Factor 3.9; JCR Q2, Engineering, Electrical & Electronic)
Publication date: 2025-06-17
DOI: 10.1109/LSP.2025.3580496
URL: https://ieeexplore.ieee.org/document/11039055/
With the growing volume of VoIP traffic, many steganography algorithms exploit VoIP speech as a carrier, posing a threat to cybersecurity. Among them, quantization index modulation (QIM)-based VoIP steganography has demonstrated excellent stealth, making detection difficult. In recent years, a growing number of studies have focused on developing feasible steganalysis methods for detecting it. However, previous studies have mostly focused on improving detection performance while neglecting efficiency, so lightweight models remain under-researched. In online detection scenarios, detection efficiency is crucial. On the one hand, the long inference time of large models can delay warnings. On the other hand, the high computational requirements of these models make them difficult to deploy on remote devices, which reduces their practical value. In this letter, we propose a simple yet efficient model named EQVS (efficient QIM-based VoIP steganalysis network) for detecting QIM-based VoIP steganography. In EQVS, the fold and unfold operations are redesigned based on the characteristics of VoIP speech samples and the requirements of the QIM-based VoIP steganalysis task, so that correlation features are not disrupted. In addition, a multi-codeword priority attention mechanism, inspired by the multi-query attention and retention mechanisms, redefines the calculation procedure for the query, key, and value matrices, as well as the normalization and softmax operations, to further reduce computational resource consumption in a single attention head. Experimental results demonstrate that EQVS outperforms other state-of-the-art models in both detection performance and efficiency.
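For readers unfamiliar with the technique being detected, the following is a minimal sketch of the general idea behind codebook-based QIM embedding. It is not the EQVS model and not any codec's real implementation. The toy one-dimensional codebook, the parity-based partition, and the function names `nearest_index`, `qim_embed`, and `qim_extract` are all assumptions made for illustration.

```python
# Illustrative only: the toy 1-D codebook and the parity-based partition are
# assumptions, not the codebooks or partitioning used by any real speech codec.

def nearest_index(value, codebook, allowed):
    """Return the index in `allowed` whose codeword is closest to `value`."""
    return min(allowed, key=lambda i: abs(codebook[i] - value))

def qim_embed(value, bit, codebook):
    """Quantize `value` using only the sub-codebook assigned to `bit`."""
    allowed = [i for i in range(len(codebook)) if i % 2 == bit]
    return nearest_index(value, codebook, allowed)

def qim_extract(index):
    """Recover the hidden bit from the sub-codebook the index belongs to."""
    return index % 2

codebook = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
samples = [0.12, 1.40, 2.90, 0.70]   # values the encoder would quantize
bits = [1, 0, 1, 1]                  # secret bits to hide

indices = [qim_embed(v, b, codebook) for v, b in zip(samples, bits)]
recovered = [qim_extract(i) for i in indices]
print(indices, recovered)
```

Because the hidden bit restricts which codeword indices the encoder may choose, the transmitted index sequence can differ statistically from that of an unmodified encoder. That kind of trace is what a steganalysis model tries to pick up.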
About the journal:
The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.