Semi-autonomous data enrichment based on cross-task labelling of missing targets for holistic speech analysis

Yue Zhang, Yuxiang Zhou, Jie Shen, Björn Schuller
DOI: 10.1109/ICASSP.2016.7472847
Published in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016-03-20
Citations: 13

Abstract

In this work, we propose a novel approach for large-scale data enrichment, with the aim of addressing a major shortcoming of current research in computational paralinguistics, namely, looking at speaker attributes in isolation although strong interdependencies exist between them. The scarcity of multi-target databases, in which instances are labelled for different kinds of speaker characteristics, compounds this problem. The core idea of our work is to join existing data resources into one single holistic database with a multi-dimensional label space by using semi-supervised learning techniques to predict missing labels. In the proposed new Cross-Task Labelling (CTL) method, a model is first trained on the labelled training set of the selected databases for each individual task. Then, the trained classifiers are used for the cross-labelling of the databases among each other. To exemplify the effectiveness of the `CTL' method, we evaluated it for likability, personality, and emotion recognition as representative tasks from the INTERSPEECH Computational Paralinguistics ChallengE (ComParE) series. The results show that `CTL' lays the foundation for holistic speech analysis by semi-autonomously annotating the existing databases and expanding the multi-target label space at the same time, while achieving higher accuracy than the baseline performance of the challenges.
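The two-step procedure described in the abstract (per-task training, then mutual cross-labelling) can be illustrated with a minimal sketch. Note that this is a hypothetical toy example with synthetic features and generic scikit-learn classifiers, not the paper's actual setup (database names, feature sets, and learner choice here are assumptions for illustration only):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-ins for two single-task corpora: each holds acoustic features
# plus gold labels for its OWN task only (shapes and labels are synthetic).
emotion_db = {"X": rng.normal(size=(200, 16)),
              "y_emotion": rng.integers(0, 2, 200)}       # gold emotion labels
likability_db = {"X": rng.normal(size=(150, 16)),
                 "y_likability": rng.integers(0, 2, 150)}  # gold likability labels

# Step 1: train one classifier per task on its labelled source database.
emotion_clf = LogisticRegression(max_iter=1000).fit(
    emotion_db["X"], emotion_db["y_emotion"])
likability_clf = LogisticRegression(max_iter=1000).fit(
    likability_db["X"], likability_db["y_likability"])

# Step 2: cross-labelling — each trained model annotates the OTHER corpus
# for its missing target, yielding a joint multi-target label space.
likability_db["y_emotion_ctl"] = emotion_clf.predict(likability_db["X"])
emotion_db["y_likability_ctl"] = likability_clf.predict(emotion_db["X"])

# Every instance in both corpora now carries both targets
# (one gold, one semi-autonomously predicted).
print(emotion_db["y_likability_ctl"].shape, likability_db["y_emotion_ctl"].shape)
```

With more than two tasks, the same pattern generalizes: each task's classifier sweeps over every database lacking that target, so the joined corpus ends up labelled along all task dimensions.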