Deceptive Speech Detection Based on Sparse Representation

2016 IEEE 12th International Colloquium on Signal Processing & Its Applications (CSPA) Pub Date : 2016-03-04 DOI:10.1109/CSPA.2016.7515793

Xiaohe Fan, Heming Zhao, Xueqin Chen, Cheng Fan, Shuxi Chen


Citations: 2

Abstract

Features used to distinguish deceptive speech have generally been prosodic, vocal tract, lexical, and glottal waveform features. This paper examines the effectiveness of sparse coefficients for deception detection. We first extract Mel-Frequency Cepstral Coefficients (MFCC) and the Zero Crossing Rate (ZCR) from speech utterances and use them as input to the K-SVD algorithm to learn a mixture dictionary. Sparse coefficients are then obtained with the Orthogonal Matching Pursuit (OMP) algorithm. These coefficients serve as features to train a Support Vector Machine (SVM) model, and classifier accuracy is tested with the trained model. Finally, we present the experimental results of this approach and compare them with conventional features (Short-Time, Pitch, Formant, and Duration) on the Soochow University Speech Processing Researches-Deception Speech Detection Corpus (SUSP-DSD). The results show that sparse coefficients perform better than the conventional features in deception detection.
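The pipeline described in the abstract (frame-level features, K-SVD dictionary learning, OMP sparse coding, SVM classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `zero_crossing_rate`, `omp`, and `ksvd`, the 13-dimensional feature size, the dictionary size, the sparsity level, and the iteration count are all assumptions chosen for illustration, and random data stands in for real MFCC and ZCR features.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(frame)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return float(np.mean(signs[1:] != signs[:-1]))

def omp(D, x, n_nonzero):
    """Orthogonal Matching Pursuit: code x with at most n_nonzero atoms of D."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    if support:
        coef[support] = sol
    return coef

def ksvd(X, n_atoms, n_nonzero, n_iter=10, seed=0):
    """K-SVD on X of shape (n_features, n_samples); returns dictionary D and codes A."""
    rng = np.random.default_rng(seed)
    # Initialise the dictionary from randomly chosen, normalised training columns.
    D = X[:, rng.choice(X.shape[1], n_atoms, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    for _ in range(n_iter):
        # Sparse coding stage: code every sample with the current dictionary.
        A = np.column_stack([omp(D, x, n_nonzero) for x in X.T])
        # Dictionary update stage: refit each atom with a rank-1 SVD.
        for k in range(n_atoms):
            users = np.nonzero(A[k])[0]  # samples that use atom k
            if users.size == 0:
                continue
            A[k, users] = 0
            E = X[:, users] - D @ A[:, users]  # error without atom k
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]
            A[k, users] = s[0] * Vt[0]
    return D, A

# Hypothetical stand-in data: 60 frames of 13 features each
# (for example 12 MFCCs plus 1 ZCR value); real features would replace this.
rng = np.random.default_rng(0)
X = rng.standard_normal((13, 60))
D, _ = ksvd(X, n_atoms=20, n_nonzero=3, n_iter=5)

# Sparse coefficient vectors, one row per frame, to be used as classifier features.
features = np.stack([omp(D, x, 3) for x in X.T])
print(features.shape)
```

The rows of `features` would then be passed, together with truthful/deceptive labels, to an SVM classifier such as scikit-learn's `sklearn.svm.SVC`; that step is left out here to keep the sketch dependent on NumPy only.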
