{"title":"Word frequency and prosody bootstrap basic word order in prelexical infants","authors":"J. Gervain","doi":"10.21437/speechprosody.2022-83","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-83","url":null,"abstract":"Languages systematically vary in their basic word order, which infants need to learn as they acquire their native language. Here I present evidence for the prosodic bootstrapping of word order. Specifically, two behavioral experiments and one brain imaging study are reviewed, supporting the hypothesis that word frequency and phrasal prosody serve as powerful cues to help infants bootstrap the basic lexical categories of functors and content words and guide infants about the relative order of these two categories in their native language. The acoustic realization of prosodic prominence in phonological phrases correlates with basic word order, as functor-initial languages typically rely on phrase-final lengthening, while functor-final languages rely on phrase-initial pitch and/or intensity rise (Nespor et al. 2008). The first study shows that 8-month-old infants can use this acoustic cue to determine the word order of an artificial language. A second study shows that infants expect this prosodic information to be aligned with word frequency, i.e. frequent words to be prosodically non-prominent, as are natural language functors. 
A near-infrared spectroscopy imaging study suggests that sensitivity to the acoustic realization of prosodic prominence and the resulting rhythmic (iambic/trochaic) grouping derives from babies’ prenatal experience with speech.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"126 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133953245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022. Pub Date: 2022-05-23. DOI: 10.21437/speechprosody.2022-100
R. R. Passetti, S. Madureira, P. Barbosa
{"title":"Voice perception on a voice messaging app: implications for Forensic Phonetics","authors":"R. R. Passetti, S. Madureira, P. Barbosa","doi":"10.21437/speechprosody.2022-100","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-100","url":null,"abstract":"This study aims to analyze the effects of a voice messaging app transmission on voice quality and voice dynamics perception and to discuss how they can impact forensic analysis. The study design comprised a perceptual experiment on pairs of stimuli from 10 Brazilian speakers recorded directly by a digital recorder and over WhatsApp voice messages. A group of four voice-specialized judges has assessed the stimuli set by means of the Vocal Profile Analysis (VPA), which comprises both voice quality and voice dynamics settings. Data analysis has included reliability measures, and the spatial arrangement of the stimuli pairs through a multidimensional scaling technique (MDS). The settings mainly correlated to MDS dimensions were “Lowered Larynx”, “Tense Larynx” and “Creaky” as well as “Pitch Mean” and “Pitch Extensive Range”. The voice quality settings were less affected by the recording condition, since the perceptual distances between stimuli pairs in this group were lower than those related to voice dynamics. However, the attested perceptual differences are not limited to stimuli acoustic quality only since none of the MDS perceptual dimensions could separate the recording conditions. 
Implications for Forensic Phonetics are considered.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122308916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022. Pub Date: 2022-05-23. DOI: 10.21437/speechprosody.2022-101
H. Mixdorff, Oliver Niebuhr
{"title":"The Effects of Fujisaki Model Parameter Manipulation on Perceived Charisma","authors":"H. Mixdorff, Oliver Niebuhr","doi":"10.21437/speechprosody.2022-101","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-101","url":null,"abstract":"In an earlier exploratory study we examined the prosody of speeches by two IT industry leaders, Steve Jobs and Mark Zuckerberg, whose perceived charisma differs greatly, with Jobs usually regarded as the much more captivating speaker. This previous study focused mainly on fundamental frequency contours as well as on the perceived local speech rate. Instead of analyzing the raw F0 data directly, we modeled the F0 contours using the Fujisaki model and examined the differences in the respective model components. Whereas in our comparison between Jobs and Zuckerberg we were only able to examine distributions of Fujisaki model parameters, in the current study we decided to systematically vary some of the Fujisaki model parameters on a fixed set of utterances and investigate their effects on perceived charisma. We found that in general pitch range extensions are beneficial, especially when connected to accented syllables, but also that effects differ considerably between male and female speakers.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125476492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022. Pub Date: 2022-05-23. DOI: 10.21437/speechprosody.2022-115
Alice Crochiquia, A. Eriksson, P. Barbosa, S. Madureira
{"title":"A perceptual and acoustic study of dubbed voices in an animated film","authors":"Alice Crochiquia, A. Eriksson, P. Barbosa, S. Madureira","doi":"10.21437/speechprosody.2022-115","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-115","url":null,"abstract":"Listeners rely on speech vocal cues to judge speakers’ age, size, personality, and other paralinguistic and extralinguistic features. These judgements are often based on vocal stereotypes which may be universally or culturally determined. This study examines how physical, psychological, social, and vocal features are perceived by listeners and which acoustic features may influence their judgements. An experiment integrating a perceptual test and acoustic measurements was performed. The corpus consisted of speech utterances produced by five animated film characters, dubbed in Brazilian Portuguese. The stimuli were judged by 77 Brazilian Portuguese native speakers, 46 women and 31 men, aged 20 to 50. The acoustic analysis was performed automatically. Acoustic measures included mean f0, f0 baseline, spectral emphasis and H1-H2. For interrater agreement analysis, Cronbach's Alpha was chosen. The results indicated close agreements among judges for all characters. Overall scores obtained for all characters were above .90. In interpreting the results, the influence sound symbolism codes may have on listeners’ judgments and the factors influencing vocal stereotypes have been considered. 
The discussion of the acoustic and perceptual analysis results takes into consideration if voice actors adapt their voices to fit the characters or otherwise are cast because of their natural voice characteristics.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130139914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022. Pub Date: 2022-05-23. DOI: 10.21437/speechprosody.2022-183
M. Yan, S. Calhoun
{"title":"Prosodic prominence and clefting in L2 focus interpretation","authors":"M. Yan, S. Calhoun","doi":"10.21437/speechprosody.2022-183","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-183","url":null,"abstract":"Speech cues available in the utterance can guide the listener to the most important information (focus) and thus facilitate discourse comprehension. It has long been established that prosodic prominence plays an important role in the search for focus in a variety of languages including Mandarin and English; however, it is not yet clear how prosodic prominence interacts with other cues, such as clefting, available in the utterance, especially how L2 learners integrate multiple cues in processing information structure. This paper investigates the relative roles of prosodic and clefting cues in the interpretation of focus position in English utterances by moderate or high proficiency Mandarin listeners of English. It was found that both cues played an important role, with clefting weighting slightly higher. This study contributes significantly to our limited knowledge of how information structure is processed in L2 and has implications for speech perception, language learning and prosodic typology.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128837032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Speech Prosody 2022. Pub Date: 2022-05-23. DOI: 10.21437/speechprosody.2022-150
Michelina Savino, Simona Sbranna, C. Ventura, Aviad Albert, M. Grice
{"title":"Imitating intonation in a non-native variety: the influence of the native repertoire","authors":"Michelina Savino, Simona Sbranna, C. Ventura, Aviad Albert, M. Grice","doi":"10.21437/speechprosody.2022-150","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-150","url":null,"abstract":"Previous investigations have shown that speakers of one language variety are able to imitate the intonation contour of another variety, if the two contours share the same phonological function and overall F0 shape but differ in their phonetic implementation. In the current study, we asked speakers of Bari Italian, with a rise-fall(-rise) contour for questions, to imitate the question contour of Lecce Italian, which has a (level-)rise contour. Since the Bari Italian repertoire also has a rise (used to convey non-finality) that has a different phonetic implementation, we tested whether this native rise has an influence on the imitation of the Lecce Italian question rise. Results show that Bari Italian speakers can produce a question rise when asked to imitate the Lecce contour, although they are not able to imitate the whole F0 shape, possibly because of the interference of their native non-final rise. 
Imitators are more successful in reproducing the contour on the final syllable than on the preceding (level) part, possibly because of the perceptual salience of the rise, combined with its final position triggering a recency effect.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124109775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contribution of voice quality to prediction of turn-taking events","authors":"M. Wlodarczak, Mattias Heldne","doi":"10.21437/speechprosody.2022-99","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-99","url":null,"abstract":"This paper evaluates the contribution of acoustic voice quality measures to prediction of upcoming floor change and retention. In order to minimize the influence of vocal tract resonances, the measures were calculated from miniature accelerometers attached to the tracheal wall. Overall, speaker changes accompanied by silence were characterized by lower periodicity and steeper spectral slope than turn-holds and speaker changes involving overlapping speech. When used on their own, voice quality features contributed to prediction of turn-taking category; this was particularly true of smoothed cepstral peak prominence (CPPS). At the same time, their importance was limited when used in combination with fundamental frequency and intensity, especially compared to the joint effect of these two predictors.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114061347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can Prosody Transfer Embeddings be Used for Prosody Assessment?","authors":"Mariana Julião, A. Abad, Helena Moniz","doi":"10.21437/speechprosody.2022-60","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-60","url":null,"abstract":"In voice conversion, it is possible to transfer some characteristic components of a (target) speech utterance, such as the content, pitch, or speaker identity, from the corresponding component from another (source) utterance. This has recently been achieved by characterizing these components through neural-based vector embeddings which encode the specific information to be transferred. In the particular case of neural prosody embeddings, to the best of our knowledge, no work has explored the informativeness of these embeddings for other purposes, such as prosody assessment or comparison of prosodic patterns. In this work, we use an intonation data set and a voice conversion corpus to explore how these neural prosody embeddings group for utterances of different intonation, content, and speaker identity. We compare these neural prosody embeddings to hand-crafted acoustic-prosodic features and to content embeddings. We found that neural prosody embeddings can achieve a geometrical separability index as high as 0.956 for highly contrastive intonations, and 0.706 for different sentence types.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114177429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Georgian syllables uncentered","authors":"C. Crouch, A. Katsika, I. Chitoran","doi":"10.21437/speechprosody.2022-44","DOIUrl":"https://doi.org/10.21437/speechprosody.2022-44","url":null,"abstract":"Both sonority, via the Sonority Sequencing Principle (SSP), and timing, via the coupled oscillator model advanced within Articulatory Phonology (AP), have been invoked to define the syllable as a unit. Georgian presents challenges for both definitions. The irrelevance of the SSP for Georgian phonotactics is well documented, while it is unclear whether Georgian displays the AP-predicted timing pattern of syllable onsets, i.e., the c-center effect. We investigate the relationship between sonority shape and global timing in complex onsets in Georgian by means of a series of Electromagnetic Articulography (EMA) experiments. We use two measures of global timing, i.e., rightward shift of prenuclear consonant gesture and c-center stability, both relative to an anchor point in the vowel. Contrary to predictions, neither measure supports a c-center effect for Georgian syllables. Coordination is not affected by sonority shape, although sonority is reflected in patterns of overlap. We discuss these results in relationship to the phonological and morphological profile of Georgian and suggest that the absence of the c-center effect is possible given Georgian’s permissive phonotactics, and aids in the formation of morphologically complex words. 
Typological extensions of this account are made.","PeriodicalId":442842,"journal":{"name":"Speech Prosody 2022","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115286498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}