ACOUSTICALLY-DRIVEN PHONEME REMOVAL THAT PRESERVES VOCAL AFFECT CUES.

Camille Noufi, Jonathan Berger, Michael Frank, Karen Parker, Daniel L Bowling
{"title":"ACOUSTICALLY-DRIVEN PHONEME REMOVAL THAT PRESERVES VOCAL AFFECT CUES.","authors":"Camille Noufi,&nbsp;Jonathan Berger,&nbsp;Michael Frank,&nbsp;Karen Parker,&nbsp;Daniel L Bowling","doi":"10.1109/icassp49357.2023.10095942","DOIUrl":null,"url":null,"abstract":"<p><p>In this paper, we propose a method for removing linguistic information from speech for the purpose of isolating paralinguistic indicators of affect. The immediate utility of this method lies in clinical tests of sensitivity to vocal affect that are not confounded by language, which is impaired in a variety of clinical populations. The method is based on simultaneous recordings of speech audio and electroglotto-graphic (EGG) signals. The speech audio signal is used to estimate the average vocal tract filter response and amplitude envelop. The EGG signal supplies a direct correlate of voice source activity that is mostly independent of phonetic articulation. These signals are used to create a third signal designed to capture as much paralinguistic information from the vocal production system as possible-maximizing the retention of bioacoustic cues to affect-while eliminating phonetic cues to verbal meaning. To evaluate the success of this method, we studied the perception of corresponding speech audio and transformed EGG signals in an affect rating experiment with online listeners. The results show a high degree of similarity in the perceived affect of matched signals, indicating that our method is effective.</p>","PeriodicalId":74518,"journal":{"name":"Proceedings of the ... IEEE International Conference on Acoustics, Speech, and Signal Processing. ICASSP (Conference)","volume":"2023 ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495117/pdf/nihms-1926898.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... IEEE International Conference on Acoustics, Speech, and Signal Processing. ICASSP (Conference)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icassp49357.2023.10095942","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/5/5 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In this paper, we propose a method for removing linguistic information from speech for the purpose of isolating paralinguistic indicators of affect. The immediate utility of this method lies in clinical tests of sensitivity to vocal affect that are not confounded by language, which is impaired in a variety of clinical populations. The method is based on simultaneous recordings of speech audio and electroglotto-graphic (EGG) signals. The speech audio signal is used to estimate the average vocal tract filter response and amplitude envelop. The EGG signal supplies a direct correlate of voice source activity that is mostly independent of phonetic articulation. These signals are used to create a third signal designed to capture as much paralinguistic information from the vocal production system as possible-maximizing the retention of bioacoustic cues to affect-while eliminating phonetic cues to verbal meaning. To evaluate the success of this method, we studied the perception of corresponding speech audio and transformed EGG signals in an affect rating experiment with online listeners. The results show a high degree of similarity in the perceived affect of matched signals, indicating that our method is effective.

声学驱动的音素去除,保留声音影响线索。
在本文中,我们提出了一种从语音中去除语言信息的方法,目的是分离情感的副语言指标。这种方法的直接实用性在于对声音影响的敏感性的临床测试,而语言在各种临床人群中都会受损。该方法基于语音音频和电声图(EGG)信号的同时记录。语音音频信号用于估计平均声道滤波器响应和振幅包络。EGG信号提供语音源活动的直接相关性,该相关性主要独立于语音发音。这些信号用于创建第三信号,该第三信号被设计为从声音产生系统捕获尽可能多的副语言信息,最大限度地保持生物声学提示以产生影响,同时消除语音提示以达到语言意义。为了评估这种方法的成功性,我们在一项针对在线听众的情感评级实验中研究了对相应语音音频的感知和转换的EGG信号。结果表明,匹配信号的感知影响具有高度相似性,表明我们的方法是有效的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信