An improved characterization methodology to efficiently deal with the speech emotion recognition problem

2017 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC) | Pub Date: 2017-11-01 | DOI: 10.1109/ROPEC.2017.8261686

Bryan E. Martínez, J. C. Jacobo


Citations: 5

Abstract


An improved characterization methodology to efficiently deal with the speech emotion recognition problem

Recognizing a speaker's emotional state is expected to become one of the most common tasks in human-computer interaction. This task is known as Speech Emotion Recognition (SER). Previous works have developed characterizations that rely heavily on some form of feature selection to choose the best subset of features. To our knowledge, no effort has been invested in reworking the original features themselves with the aim of improving classification. In this work, a methodology for feature preprocessing is presented. Our characterization method starts from a speech signal, from which different characteristics, as well as statistics, are extracted. These characteristics then go through a preprocessing phase that enhances classification efficiency. After this, a two-stage classification scheme is used: in the first stage, k-Means is used for clustering, and in the second stage, several standard classifiers are used. Across the classifiers, except for SVM, this strategy consistently shows a classification rate (91–100%) superior to those reported in previous works.
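The two-stage scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which preprocessing is applied, how the k-Means output feeds the second stage, or which features are used. Here the preprocessing is assumed to be z-score standardization, each sample is assumed to be routed to a per-cluster classifier, a simple nearest-class-centroid rule stands in for the "standard classifiers", and the two features (mean pitch in Hz, mean energy) and all toy values are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict

def zscore_fit(rows):
    """Per-column mean and std (ASSUMED preprocessing; not specified in the abstract)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def zscore_apply(row, means, stds):
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda i: dist2(point, centers[i]))

def mean_vec(rows):
    return [sum(c) / len(c) for c in zip(*rows)]

def kmeans(rows, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns the k cluster centers."""
    centers = random.Random(seed).sample(rows, k)
    for _ in range(iters):
        groups = defaultdict(list)
        for r in rows:
            groups[nearest(r, centers)].append(r)
        # Keep the old center if a cluster ends up empty.
        new_centers = [mean_vec(groups[i]) if groups[i] else centers[i]
                       for i in range(k)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

class TwoStageClassifier:
    """Stage 1: k-Means routing. Stage 2: per-cluster nearest class centroid
    (a hypothetical stand-in for the paper's standard classifiers)."""

    def __init__(self, k=2, seed=0):
        self.k, self.seed = k, seed

    def fit(self, X, y):
        self.means, self.stds = zscore_fit(X)
        Z = [zscore_apply(r, self.means, self.stds) for r in X]
        self.centers = kmeans(Z, self.k, seed=self.seed)
        per_cluster = defaultdict(lambda: defaultdict(list))
        for z, label in zip(Z, y):
            per_cluster[nearest(z, self.centers)][label].append(z)
        # One centroid per emotion label inside each cluster.
        self.class_centroids = {
            c: {label: mean_vec(rows) for label, rows in by_label.items()}
            for c, by_label in per_cluster.items()
        }
        self.fallback = Counter(y).most_common(1)[0][0]
        return self

    def predict_one(self, x):
        z = zscore_apply(x, self.means, self.stds)
        cents = self.class_centroids.get(nearest(z, self.centers))
        if not cents:  # cluster saw no training samples
            return self.fallback
        return min(cents, key=lambda label: dist2(z, cents[label]))

# Hypothetical toy data: [mean pitch (Hz), mean energy] per utterance.
X = [[100, 0.20], [105, 0.25], [98, 0.22],
     [210, 0.50], [215, 0.52], [212, 0.48],
     [220, 0.80], [230, 0.85], [225, 0.90]]
y = ["sad"] * 3 + ["happy"] * 3 + ["angry"] * 3

clf = TwoStageClassifier(k=2).fit(X, y)
train_predictions = [clf.predict_one(x) for x in X]
print(train_predictions)
```

With well-separated toy groups like these, the sketch reproduces the training labels; it says nothing about the classification rates reported in the paper, which depend on the authors' actual features, preprocessing, and classifiers.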

