{"title":"A Comparison of Rhythm Metrics for L2 Speech","authors":"Kakeru Yazawa, M. Kondo","doi":"10.21437/speechprosody.2022-68","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-68","url":null,"abstract":"A wide range of rhythm metrics (global metrics: %V, Δ, Varco, and segVarco; pairwise metrics: rPVI, nPVI, CCI, and D nCCI) was applied to L1 Japanese speakers’ L2 English speech data. Less proficient Japanese speakers of English are expected to show less durational variability for both vocalic and consonantal intervals (because of insufficient stress realization and transfer of CV syllable structure), although this pattern may be obscured by their slower speech rate (which increases interval durations in general). To test if the metrics can capture the L2 rhythmic characteristics, each metric was applied to read speech samples of “The North Wind and the Sun” by 183 Japanese speakers in the J-AESOP corpus. Only %V, VarcoV, and segVarcoV/C were successful; other metrics yielded inconsistent or implausible results likely due to insufficient rate normalization. The overall results indicate that global metrics can effectively quantify L2 rhythm if speech rate is normalized by the mean duration of segments (which is a good predictor of tempo) rather than the mean interval duration (which is popular but susceptible to syllable complexity).","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125986412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marie-Anne Morand, M. Bruno, Sandra Schwab, Stephan Schmid
{"title":"Syllable rate and speech rhythm in multiethnolectal Zurich German: a comparison of speaking styles","authors":"Marie-Anne Morand, M. Bruno, Sandra Schwab, Stephan Schmid","doi":"10.21437/speechprosody.2022-69","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-69","url":null,"abstract":"Multiethnolectal ways of speaking have been emerging for 30 years in culturally and linguistically diverse neighborhoods of European cities, including Zurich (Switzerland). Among the prosodic features of Germanic multiethnolects, a so-called ‘staccato’ rhythm has been mentioned in several studies. For instance, a comparison between two groups of adolescents (12 speakers each) showed that speakers of multiethnolectal Zurich German displayed slower syllable rates and less vowel duration variability than speakers of a rather traditional dialect. This study compares syllable rate and speech rhythm metrics ( nPVI-V, nPVI-C ) in spontaneous and read speech of 48 Zurich German adolescents. In a regression analysis, rhythmic measures were compared with the perception of how multiethnolectal the speakers sounded ( rating score ). The results showed that syllable rate and nPVI-V were related to rating score independently of speaking style (read, spontaneous speech): Speakers who were perceived as more multiethnolectal had a slower syllable rate and less vowel duration variability. Such findings were not observed for nPVI-C. 
These results suggest that syllable rate and speech rhythm (at least, vowel duration variability) are stable phonetic features of multiethnolectal Zurich German, since the relationship between these features and the perception of multiethnolectal speech was observed in both read and spontaneous speech.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125939024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022 | Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-171
Liang Zhao, Shayne Sloggett, Eleanor Chodroff
{"title":"Top-Down and Bottom-up Processing of Familiar and Unfamiliar Mandarin Dialect Tone Systems","authors":"Liang Zhao, Shayne Sloggett, Eleanor Chodroff","doi":"10.21437/speechprosody.2022-171","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-171","url":null,"abstract":"Speech processing involves active integration of bottom-up and top-down information types. In the present study, we investigated the relative weighting of top-down expectedness and bottom-up lexical tone in the perception of familiar and unfamiliar lexical tone systems. Standard Mandarin and Chengdu Mandarin are mutually intelligible language varieties with comparable segmental and highly distinct tonal realizations. In a spoken semantic-plausibility judgment task, we manipulated whether a word was high-surprisal or low-surprisal given the preceding context and dialect-specific tone. All participants were native Standard Mandarin speakers with minimal Chengdu Mandarin experience. Lower judgment accuracy was observed when the stimulus was Chengdu Mandarin, and suggested that expectedness (i.e., top-down) information overrides tonal (i.e., bottom-up) information in sentence plausibility judgments. However, judgment response times to sentence surprisal were uniform across stimuli from both dialects, suggesting that speakers are aware of the surprisal conveyed by a non-standard tone, even if not used in their final decision. These findings reveal listener sensitivity to both top-down expectedness and bottom-up tone regardless of the initial tone reliability. 
For unfamiliar tone systems, top-down influence overrides bottom-up processing to access utterance meaning, but bottom-up processing is indeed present and may reflect rapid learning of the unfamiliar tone system.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126205707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022 | Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-123
Emilie Marty, R. Bertrand, Caterina Petrone, J. German
{"title":"Prosodic Correlates of Discourse Structure and Emotion in Discourse Markers that Preface Announcements of News","authors":"Emilie Marty, R. Bertrand, Caterina Petrone, J. German","doi":"10.21437/speechprosody.2022-123","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-123","url":null,"abstract":"Discourse markers serve important structuring functions such as concluding a contribution or resuming a topic. We address whether, along with their role in structuring discourse, discourse markers carry prosodic cues to the emotional valence of upcoming news, perhaps to prepare the listener’s emotional reaction. Specifically, we explored the realization of French voilà donc (yeah so) when occurring between an announcement of news and its preface: Je vous appelle au sujet de votre chat qui était malade [preface], voilà donc [discourse marker] il est désormais guéri [announcement] (“I’m calling about your sick cat, yeah so he’s now cured”). We recorded 15 speakers reading voicemail messages announcing negative, positive or neutral (e.g., factual) news. We found that the intonation patterns produced with voilà donc correspond to its discursive functions, in line with existing findings, though the choice of pattern did not depend on the emotional valence of the news. Valence was, however, associated with phonetic variation, in that high f0 targets were higher for positive and neutral valence and pitch range was larger for positive valence. 
This finding suggests that phonetic variation projects the emotional valence of upcoming news even though discourse function primarily determines the choice of intonation pattern.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121038325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022 | Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-154
Christine T. Röhr, Michelina Savino, M. Grice
{"title":"The effect of intonational rises on serial recall in German","authors":"Christine T. Röhr, Michelina Savino, M. Grice","doi":"10.21437/speechprosody.2022-154","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-154","url":null,"abstract":"This paper uses a serial recall task to investigate the role of rising intonation in the allocation of attentional resources in German. It has been shown for Italian that rising intonation at prosodic boundaries enhances recall of digits in auditorily presented lists. Since resources are usually allocated to prominent items, and since pitch accents are primary encoders of prominence in both languages, we investigate whether an accentual rise leads to better recall than a boundary rise. In a serial recall task on nine-digit sequences in German we compare the effect on working memory of sequences grouped by marking the last item of the two non-final triplets with (i) a high/rising accent followed by an equally high boundary, (ii) a low accent followed by a boundary rise, or (iii) a low/falling accent-boundary sequence, as compared to (iv) ungrouped sequences as controls. Results reveal that items with a rise are recalled more accurately than items without a rise, with no evidence for superior recall of items with accent rises over those with boundary rises. However, boundary rises appear to facilitate recall over a larger domain than accentual rises.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126692316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shape matters: Machine classification and listeners’ perceptual discrimination of American English intonational tunes","authors":"J. Cole, Jeremy Steffman, Sam Tilsen","doi":"10.21437/speechprosody.2022-61","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-61","url":null,"abstract":"In Autosegmental-Metrical models of intonational phonology, pitch accents, phrase accents and boundary tones may combine freely to create a predicted set of phonologically distinct phrase-final “nuclear” tunes. In this study we ask if an 8-way distinction in nuclear tune shape in American English, predicted from combinations of 2 (monotonal) pitch accents, 2 phrase accents and 2 boundary tones, is manifest in speech production and in speech perception. F0 trajectories from an imitative speech production experiment were analyzed using (i) neural net classification, and (ii) human listeners’ perceptual discrimination of the model utterances. Pairwise classification accuracy of the imitative productions is highest for tune pairs that differ in holistic shape (high-rising vs. rise-fall), and poorest for tunes with the same shape that differ in (higher vs. lower) final f0. Perception results show a similar pattern, with poor pairwise discrimination for tunes that differ primarily, but by a small degree, in final f0. 
Together the results suggest a hierarchy of distinctiveness among nuclear tunes, with a robust distinction based on holistic tune shape, which only partly aligns with distinctions in tonal specification, and a weak/poorly differentiated distinction between tunes with the same holistic shape but small differences in final f0.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"21 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129701349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022 | Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-175
Tatiana V. Kachkovskaia, Svetlana Zimina, Alena Portnova, D. Kocharov
{"title":"Social variability of peak alignment in Russian rise-fall tunes","authors":"Tatiana V. Kachkovskaia, Svetlana Zimina, Alena Portnova, D. Kocharov","doi":"10.21437/speechprosody.2022-175","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-175","url":null,"abstract":"In Russian, rise-fall tunes (H*L) are very typical in yes-no questions and non-utterance-final clauses. In standard descriptions of Russian intonation, the melodic maximum in this tune is located late in the stressed vowel. However, studies of modern Russian intonation, especially within the younger age group, report on cases of “displaced” melodic peaks, shifted significantly to the right so that the F0 maximum occurs on the post-stressed syllable. In this paper we analyse the frequency of such misplaced peaks in Russian dialogue speech, with respect to the factors of gender, age and social distance between the interlocutors. The research is based on the SibLing speech corpus: 90 dialogues with varying relationship between the interlocutors.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127827689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The coordination of boundary tones with constriction gestures in Seoul Korean, an edge-prominence language","authors":"Jiyoung Jang, A. Katsika","doi":"10.21437/speechprosody.2022-30","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-30","url":null,"abstract":"Boundary tones mark major phrase boundaries and are expected to be coordinated with speech gestures adjacent to boundaries. Research on Greek has indeed shown that the onset of the boundary tone (BT) gestures co-occurs with the gestural target of the phrase-final vowel. Interestingly, this coordination is modulated by lexical stress even in the absence of phrasal pitch accent. The present electromagnetic articulography study examines the coordination between BT and constriction gestures in Seoul Korean, a language with no lexical prosody and an edge-prominence system, and further investigates whether focus-related prominence affects this coordination. To this end, the distance of the prominent linguistic unit to the boundary is manipulated in a variety of ways. Results indicate that the onset of BT gestures in Korean is most proximate to the peak velocity of the phrase-final vowel gesture, but suggest that a c-center account is also viable. Prominence fine-tunes this coordination: BT gestures are initiated earlier in Intonational Phrases (IPs) with non-final focus as opposed to IPs with final focus. Importantly, this pattern is detected in short IP-final Accentual Phrases (APs), but not in relatively long IP-final APs. 
Based on these results, implications on the relationships between lexical and phrasal levels are discussed.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131700483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conversational Correlates of Prosodic Entrainment in Youth with and without Autism Spectrum Disorder","authors":"Heike Lehnert-LeHouillier, Steven Sandoval","doi":"10.21437/speechprosody.2022-9","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-9","url":null,"abstract":"Research on prosodic entrainment has shown correlations between the degree of prosodic entrainment and several dimensions of conversational success. Individuals with autism spectrum disorder (ASD) often encounter difficulties with a variety of skills necessary for conversational success, especially with the social dimensions of conversational behavior. The goal of the current study was to investigate whether children and teens with an autism diagnosis show similar correlations between prosodic entrainment in mean fundamental frequency (f0) on the one hand and conversational effectiveness, duration of conversations, and conversational turn-taking behavior on the other hand when compared to their neurotypical peers. We found significant interaction effects by group between mean f0 entrainment and all three conversational measures. However, we found no significant differences in group means in the three investigated conversational measures of conversational effectiveness, the number of conversational turns, and duration of conversations for speakers in each group. 
These results suggest that even though speakers with ASD may show surface conversational behaviors similar to their neurotypical peers, the prosodic manifestation of conversational speech clearly marks conversation partners with ASD as different from their age and gender matched peers.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113966107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022 | Pub Date: 2022-05-23 | DOI: 10.21437/speechprosody.2022-155
B. Andreeva, S. Dimitrova
{"title":"The influence of L1 prosody on Bulgarian-accented German and English","authors":"B. Andreeva, S. Dimitrova","doi":"10.21437/speechprosody.2022-155","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-155","url":null,"abstract":"The present study investigates L2 prosodic realizations in the readings of two groups of Bulgarian informants: (a) with L2 German, and (b) with L2 English. Each of the two groups consisted of ten female learners, who read the fable “The North Wind and the Sun” in their L1 and in the respective L2. We also recorded two groups of female native speakers of the target languages as controls. The following durational parameters were obtained: mean accented syllable duration, accented vs. unaccented syllable duration ratio, and speaking rate. With respect to F0 parameters, mean, median, minimum, maximum, span in semitones, and standard deviations per IP were measured. Additionally, we calculated the number of accented and unaccented syllables, IPs and pauses in each reading. Statistical analyses show that the two groups differ in their use of F0. Both groups use higher standard deviation and level in their L2, whereas the ‘German group’ use higher pitch span as well. The number of accented syllables, IPs and pauses is also higher in L2. Regarding duration, both groups use slower articulation rate. The ratio between accented and unaccented syllables is lower in L2 for the ‘English group’. 
We also provide original data on speaking rate in Bulgarian from an information theoretical perspective.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"261 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122543067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}