Analyzing speech in both time and space: Generalized additive mixed models can uncover systematic patterns of variation in vocal tract shape in real-time MRI


Laboratory Phonology, vol. 11(1). Pub Date: 2020-01-01. DOI: 10.5334/labphon.214

C. Carignan, P. Hoole, E. Kunay, M. Pouplier, Arun A. Joseph, Dirk Voit, J. Frahm, J. Harrington



Abstract

We present a method of using generalized additive mixed models (GAMMs) to analyze midsagittal vocal tract data obtained from real-time magnetic resonance imaging (rt-MRI) video of speech production. Applied to rt-MRI data, GAMMs allow for observation of factor effects on vocal tract shape throughout two key dimensions: time (vocal tract change over the temporal course of a speech segment) and space (location of change within the vocal tract). Examples of this method are provided for rt-MRI data collected at a temporal resolution of 20 ms and a spatial resolution of 1.41 mm, for 36 native speakers of German. The rt-MRI data were quantified as 28-point semi-polar-grid aperture functions. Three test cases are provided as a way of observing vocal tract differences between: (1) /aː/ and /iː/, (2) /aː/ and /aɪ/, and (3) accentuated and unstressed /aː/. The results for each GAMM are independently validated using functional linear mixed models (FLMMs) constructed from data obtained at 20% and 80% of the vowel interval. In each case, the two methods yield similar results. In light of the method similarities, we propose that GAMMs are a robust, powerful, and interpretable method of simultaneously analyzing both temporal and spatial effects in rt-MRI video of speech.
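The idea of estimating a factor effect jointly over time and space can be illustrated with a deliberately simplified sketch. The code below is not the authors' model: it fits a ridge-penalized tensor-product surface (Gaussian radial bases over normalized time and over gridline position) separately for two conditions and then subtracts the fitted surfaces. It omits the random effects and the inference that a real GAMM provides. All data are synthetic, and the function names, basis choice, penalty value, and the simulated effect near gridline 20 are assumptions made only for illustration.

```python
import numpy as np

def basis(x, n_knots=8):
    """Intercept plus Gaussian radial basis functions on [0, 1]."""
    centers = np.linspace(0.0, 1.0, n_knots)
    width = 1.0 / (n_knots - 1)
    rbf = np.exp(-0.5 * ((x[:, None] - centers[None, :]) / width) ** 2)
    return np.hstack([np.ones((x.size, 1)), rbf])

def tensor_design(time, space):
    """Row-wise Kronecker product of the time and space bases."""
    bt, bs = basis(time), basis(space)
    return (bt[:, :, None] * bs[:, None, :]).reshape(time.size, -1)

def fit_surface(time, space, y, lam=1e-2):
    """Ridge-penalized least squares for the surface coefficients."""
    X = tensor_design(time, space)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def predict_surface(coef, time, space):
    return tensor_design(time, space) @ coef

# Synthetic data: 11 normalized time points x 28 gridlines, 5 tokens each.
rng = np.random.default_rng(0)
n_frames, n_grid, n_tokens = 11, 28, 5
t = np.linspace(0.0, 1.0, n_frames)   # normalized time within the vowel
s = np.linspace(0.0, 1.0, n_grid)     # gridline position rescaled to [0, 1]
T, S = np.meshgrid(t, s, indexing="ij")

def simulate(effect):
    """Aperture values; `effect` scales a bump that grows over time."""
    base = 10.0 + 3.0 * np.sin(np.pi * S)
    bump = effect * T * np.exp(-0.5 * ((S - 0.7) / 0.1) ** 2)
    tokens = [base + bump + rng.normal(0.0, 0.3, T.shape)
              for _ in range(n_tokens)]
    return np.concatenate([tok.ravel() for tok in tokens])

time_long = np.tile(T.ravel(), n_tokens)
space_long = np.tile(S.ravel(), n_tokens)

coef_a = fit_surface(time_long, space_long, simulate(0.0))  # condition A
coef_b = fit_surface(time_long, space_long, simulate(4.0))  # condition B

# Difference surface (B minus A) over the time x gridline plane.
diff = (predict_surface(coef_b, T.ravel(), S.ravel())
        - predict_surface(coef_a, T.ravel(), S.ravel())).reshape(T.shape)

i, j = np.unravel_index(np.argmax(np.abs(diff)), diff.shape)
peak_time, peak_gridline = t[i], j + 1   # gridlines numbered 1..28
print(f"largest difference at time={peak_time:.1f}, gridline={peak_gridline}")
```

The difference surface plays the role of the time-by-space effect described in the abstract: each cell shows where in the vocal tract, and when during the segment, the two conditions diverge. A full analysis would add speaker-level random effects and uncertainty estimates, which this sketch leaves out.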


