Creation of an Afrikaans Speech Corpora for Speech Emotion Recognition

Michael Norval, Zenghui Wang
2022 2nd International Conference on Robotics, Automation and Artificial Intelligence (RAAI), published 2022-12-09. DOI: 10.1109/RAAI56146.2022.10092988 (https://doi.org/10.1109/RAAI56146.2022.10092988)

Abstract

The fields of Artificial Intelligence (AI) and Human-Computer Interaction (HCI) have grown significantly in the last decade. Speech Recognition (SR), and more specifically Speech Emotion Recognition (SER), is still a growing field, with a number of academic institutions and private companies doing research. Currently, SER is not specifically geared toward African languages. This paper shows how to create an Afrikaans speech corpus for training a Neural Network (NN). Speech samples are extracted from streamed broadcasts on a local Afrikaans YouTube channel. Care is taken that the "Creative Commons Attribution license (reuse allowed)" is always adhered to; where the Creative Commons license is not available, authorization has been obtained. The speech clips are saved in .wav format. The emotions captured are Anger, Anticipation, Disgust, Joy, Sadness, Surprise, Fear and Trust. All data is anonymized. Each recorded clip is verified by a second independent party and, if required, verified again by another, to ensure that the categorization is correct. The result is an Afrikaans speech corpus of roughly 800 speech clips. Finally, a Long Short-Term Memory (LSTM) network is applied to the dataset; the new Afrikaans corpus yielded a detection accuracy of 58%, and 74% with transfer learning.
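The verification step described in the abstract (a second independent party checks each clip, and a further party is consulted if required) could be sketched roughly as follows. This is only an illustration under stated assumptions: the two-votes-agree rule, the `resolve_label` helper, and the clip file names are hypothetical and are not taken from the paper. Only the eight emotion names come from the abstract.

```python
from collections import Counter
from typing import Optional, Sequence

# The eight emotion categories named in the abstract.
EMOTIONS = frozenset({
    "anger", "anticipation", "disgust", "joy",
    "sadness", "surprise", "fear", "trust",
})


def resolve_label(labels: Sequence[str]) -> Optional[str]:
    """Return the agreed emotion for one clip, or None if unresolved.

    `labels` holds the original label followed by each verifier's label.
    Assumed rule (not from the paper): a label is accepted once at
    least two parties agree on it.
    """
    cleaned = [label.strip().lower() for label in labels]
    unknown = [label for label in cleaned if label not in EMOTIONS]
    if unknown:
        raise ValueError(f"unknown emotion label(s): {unknown}")
    if not cleaned:
        return None
    label, votes = Counter(cleaned).most_common(1)[0]
    return label if votes >= 2 else None


# Hypothetical clip names and labels, for illustration only.
annotations = {
    "clip_0001.wav": ["Joy", "Joy"],                 # second party agrees
    "clip_0002.wav": ["Anger", "Disgust", "Anger"],  # third party settles it
    "clip_0003.wav": ["Fear", "Surprise"],           # still needs another check
}

corpus = {clip: resolve_label(votes) for clip, votes in annotations.items()}
```

Clips that map to `None` would be sent to another verifier before entering the corpus. Rejecting any label outside the eight categories also catches spelling slips early.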