Important Modulation Frequency Components of Temporal Amplitude Envelope Contributing to Vocal Emotion Perception
Authors: Taiyang Guo, Shunsuke Kidani, Takuto Isoyama, Peter Birkholz, Masato Akagi, Masashi Unoki
Journal: Journal of Speech, Language, and Hearing Research (JSLHR), pp. 4205-4219
Published: 2025-09-10 (Epub 2025-08-01)
DOI: 10.1044/2025_JSLHR-24-00825 (https://doi.org/10.1044/2025_JSLHR-24-00825)
Abstract
Purpose: Previous studies using noise-vocoded speech (NVS) have demonstrated that the temporal amplitude envelope (TAE) of speech signals, and the modulation perception it supports, is significant in vocal emotion perception. Because modulation processing of the TAE is also important in speech perception, researchers have begun to focus on the role of the TAE's modulation components. A previous study suggested that modulation frequency components contribute to vocal emotion perception, but which components are important remains unclear. This study aims to identify the modulation frequency components that are important for vocal emotion perception.
Method: Two experiments on vocal emotion perception using NVS were conducted with 10 native Japanese speakers (two women and eight men). To generate the NVS, a modulation filterbank (MFB) was used to simulate modulation perception in the auditory system. The modulation frequency components of the TAE were bandpass and bandstop filtered with the filterbank, and the contribution of each component was evaluated by comparing the emotion recognition rates of the resulting NVS.
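As an illustration of the processing chain described in the Method, the following is a minimal sketch of noise vocoding with modulation-domain bandpass or bandstop filtering of the TAE. It is not the authors' implementation: the brick-wall FFT filter, the rectify-and-smooth envelope estimate, the 64-Hz smoothing cutoff, and every function and parameter name (`fft_band_filter`, `temporal_envelope`, `noise_vocode`, `band_edges`, `mod_lo`, `mod_hi`) are assumptions chosen to keep the example short.

```python
import numpy as np


def fft_band_filter(x, fs, lo, hi, stop=False):
    """Zero-phase brick-wall filter (an assumption, not the paper's filterbank).

    Keeps the lo..hi Hz band, or removes it when stop=True.
    """
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= lo) & (freqs <= hi)
    mask = ~in_band if stop else in_band
    return np.fft.irfft(spectrum * mask, n=len(x))


def temporal_envelope(band_signal, fs, cutoff=64.0):
    """Estimate the TAE by full-wave rectification and low-pass smoothing.

    The 64-Hz cutoff is a hypothetical value for illustration.
    """
    return fft_band_filter(np.abs(band_signal), fs, 0.0, cutoff)


def noise_vocode(speech, fs, band_edges, mod_lo, mod_hi, stop=False, seed=0):
    """Build NVS whose envelopes keep (or drop) the mod_lo..mod_hi Hz band."""
    rng = np.random.default_rng(seed)
    out = np.zeros(len(speech))
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        # 1. Isolate one acoustic frequency band of the speech signal.
        band = fft_band_filter(speech, fs, lo, hi)
        # 2. Extract its temporal amplitude envelope.
        env = temporal_envelope(band, fs)
        # 3. Bandpass or bandstop filter the envelope in the modulation domain.
        env = fft_band_filter(env, fs, mod_lo, mod_hi, stop=stop)
        # 4. Clip negative values, since an envelope cannot be negative.
        env = np.maximum(env, 0.0)
        # 5. Impose the envelope on band-limited noise and sum across bands.
        noise = rng.standard_normal(len(speech))
        out += env * fft_band_filter(noise, fs, lo, hi)
    return out
```

Under these assumptions, a call such as `noise_vocode(speech, fs, [100, 500, 1500, 3500], 0.0, 16.0)` would keep only the 0- to 16-Hz modulation components in each band, while passing `stop=True` would remove them instead. Comparing recognition rates for the two kinds of stimuli is the logic the Method describes.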
Results: Using an MFB did not affect emotion perception in NVS. Modulation frequency components within the 0- to 16-Hz band were important for each emotion individually and for all emotions collectively. The important components may differ slightly between positive and negative emotions; however, this observation should be interpreted cautiously and requires further verification because of the imbalance in the number of emotional categories in this study.
Conclusion: This study investigated which modulation frequency components of the TAE contribute to vocal emotion perception. The results suggest that components in the 0- to 16-Hz band are important.