Multi-Tier Accent Classification For Improved Transcribing

Damiano Nicastro, Frankie Inguanez
Published in: 2020 IEEE 10th International Conference on Consumer Electronics (ICCE-Berlin)
Publication date: 2020-11-09
DOI: https://doi.org/10.1109/ICCE-Berlin50680.2020.9352197
Citations: 1

Abstract

Multi-Tier Accent Classification For Improved Transcribing
Companies are increasingly aware of the value of gathering public sentiment, a task made easier by the widespread use of social networks and media platforms. Because this is a big data problem, automated machine learning systems are deployed. The process requires analysing textual mentions, visual depictions of the brand and/or its location, and audio mentions of the corporate identity and its products. When sentiment is gathered from spoken language, the problem of accent recognition is evident across native and non-native English speakers. In this research, we therefore investigate the key features of accent recognition, calibrate a proposed system based on previous research using the Wildcat Corpus, apply it to a recent dataset, Common Voice, and finally apply it to a custom dataset gathered from an online media platform. We propose a novel hierarchical classifier, trained on the Common Voice dataset and tested on the custom dataset. Our three-tier solution achieved 86% and 89% at the first two levels of accents, and 59% at the final level. We highlight the issues around the datasets considered and propose a number of recommendations for future researchers. We do not improve on or compare against existing work; rather, we offer new insights on the Common Voice dataset and present a hierarchical classifier for the accent classification problem.
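The abstract does not describe how the three tiers are defined or which models sit at each level, so the following is only a minimal sketch of the general idea of a hierarchical classifier, not the authors' implementation. It assumes each training sample carries a path of labels from the coarsest tier to the finest, and it trains one small classifier per internal node of that label tree. The class names, the nearest-centroid model, the 2-D toy features, and the placeholder labels ("A", "A1", "A1x", and so on) are all hypothetical.

```python
from collections import defaultdict
from math import dist


class CentroidClassifier:
    """Nearest-centroid classifier; a stand-in for any real per-node model."""

    def fit(self, features, labels):
        grouped = defaultdict(list)
        for vec, label in zip(features, labels):
            grouped[label].append(vec)
        # Store the mean feature vector of each label.
        self.centroids = {
            label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
            for label, vecs in grouped.items()
        }
        return self

    def predict(self, vec):
        # Pick the label whose centroid is closest to the sample.
        return min(self.centroids, key=lambda lb: dist(vec, self.centroids[lb]))


class HierarchicalClassifier:
    """Trains one classifier per internal node of a label tree."""

    def fit(self, features, paths):
        # paths[i] lists the labels of sample i from the top tier down.
        by_prefix = defaultdict(lambda: ([], []))
        for vec, path in zip(features, paths):
            for depth, label in enumerate(path):
                xs, ys = by_prefix[path[:depth]]
                xs.append(vec)
                ys.append(label)
        self.nodes = {
            prefix: CentroidClassifier().fit(xs, ys)
            for prefix, (xs, ys) in by_prefix.items()
        }
        return self

    def predict(self, vec):
        # Walk down the tree, letting each tier narrow the choice.
        path = ()
        while path in self.nodes:
            path += (self.nodes[path].predict(vec),)
        return path


# Hypothetical toy data: 2-D feature vectors with three-tier label paths.
features = [
    (0.0, 0.0), (0.1, 0.0),
    (0.0, 1.0), (0.1, 1.0),
    (0.0, 3.0), (0.1, 3.0),
    (10.0, 0.0), (10.1, 0.0),
]
paths = [
    ("A", "A1", "A1x"), ("A", "A1", "A1x"),
    ("A", "A1", "A1y"), ("A", "A1", "A1y"),
    ("A", "A2", "A2x"), ("A", "A2", "A2x"),
    ("B", "B1", "B1x"), ("B", "B1", "B1x"),
]

model = HierarchicalClassifier().fit(features, paths)
print(model.predict((0.05, 0.9)))  # ('A', 'A1', 'A1y')
print(model.predict((9.0, 0.5)))   # ('B', 'B1', 'B1x')
```

One consequence of this design is that an error at an upper tier cannot be corrected further down, because each lower classifier only chooses among the children of the label picked above it.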