Hearttoheart: The Arts of Infant Versus Adult-Directed Speech Classification

ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Pub Date: 2023-06-04
DOI: 10.1109/ICASSP49357.2023.10096728

Najla D. Al Futaisi, Alejandrina Cristia, B. Schuller

Citations: 0

Abstract

Psycholinguistics researchers investigate child language exposure by studying children’s language environment. A key factor is whether, in humanistic heart-to-heart dialogue, speech is directed to the infant (infant-directed speech) or to another adult (adult-directed speech). The former has been found to better predict children’s lexicon, and therefore constitutes a more relevant part of children’s language environment. Listening to, segmenting, and annotating naturalistic long-form recordings collected through infant-worn devices is costly and time-consuming, and can be prone to misclassification errors. We aim to overcome these challenges by automatically classifying speech as infant-directed or adult-directed. In this research, we combine multiple datasets into a larger training corpus. In addition, we employ four methods: multi-task learning, adversarial training, autoencoder multi-task learning, and adversarial multi-task learning, the last of which yielded the best results on all datasets.
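
The abstract names adversarial multi-task learning but does not describe the architecture or the tasks involved. As a rough illustration only, the sketch below shows one common way such an objective is assembled for a shared encoder: the main loss (infant-directed versus adult-directed) is kept, an auxiliary loss is added, and an adversarial loss is subtracted, which is the effect a gradient-reversal layer has on the encoder. The auxiliary task, the choice of corpus identity as the adversary's target, the weights, and the function names are all assumptions made for this example, not details taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label under predicted probabilities."""
    return -math.log(max(probs[label], 1e-12))


def encoder_objective(main_loss, aux_loss, adv_loss,
                      aux_weight=0.5, adv_weight=0.1):
    """Loss seen by a shared encoder (hypothetical weights).

    The auxiliary loss is added, as in multi-task learning. The adversarial
    loss is subtracted, so lowering this objective pushes the encoder to make
    the adversary's task harder.
    """
    return main_loss + aux_weight * aux_loss - adv_weight * adv_loss


# Toy predictions for a single speech segment; all numbers are made up.
main = cross_entropy([0.8, 0.2], 0)      # infant-directed vs adult-directed
aux = cross_entropy([0.6, 0.4], 0)       # hypothetical auxiliary label
adv = cross_entropy([0.5, 0.3, 0.2], 1)  # hypothetical source-corpus label

total = encoder_objective(main, aux, adv)
print(f"encoder objective: {total:.4f}")
```

In a real system the adversary itself would still be trained to minimise its own loss; only the encoder sees the sign flip. When several corpora are merged for training, this kind of setup is often used to discourage the encoder from keying on corpus-specific recording conditions.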
