Title: Reaching beneath the tip of the iceberg: A guide to the Freiburg Multimodal Interaction Corpus
Authors: Christoph Rühlemann, Alexander Ptak
DOI: 10.1515/opli-2022-0245
Published: 2023-11-17 (Journal Article)
Abstract
Most corpora tacitly subscribe to a speech-only view, filtering out anything that is not a ‘word’ and transcribing the spoken language merely orthographically, despite the fact that the “speech-only view on language is fundamentally incomplete” (Kok 2017, 2) due to the deep intertwining of the verbal, vocal, and kinesic modalities (Levinson and Holler 2014). This article introduces the Freiburg Multimodal Interaction Corpus (FreMIC), a multimodal and interactional corpus of unscripted conversation in English currently under construction. At the time of writing, FreMIC comprises (i) c. 29 h of video-recordings transcribed and annotated in detail and (ii) automatically (and manually) generated multimodal data. All conversations are transcribed in ELAN both orthographically and using Jeffersonian conventions to render verbal content and interactionally relevant details of sequencing (e.g. overlap, latching), temporal aspects (pauses, acceleration/deceleration), phonological aspects (e.g. intensity, pitch, stretching, truncation, voice quality), and laughter. Moreover, the orthographic transcriptions are exhaustively PoS-tagged using the CLAWS web tagger (Garside and Smith 1997). ELAN-based transcriptions also provide exhaustive annotations of re-enactments (also referred to as (free) direct speech, constructed dialogue, etc.) as well as silent gestures (meaningful gestures that occur without accompanying speech). The multimodal data are derived from psychophysiological measurements and eye tracking. The psychophysiological measurements include, inter alia, electrodermal activity (galvanic skin response, GSR), which is indicative of emotional arousal (e.g. Peräkylä et al. 2015). Eye tracking produces data of two kinds: gaze direction and pupil size. In FreMIC, gazes are automatically recorded using area-of-interest technology. Gaze direction is interactionally key, for example, in turn-taking (e.g. Auer 2021) and re-enactments (e.g. Pfeiffer and Weiss 2022), while changes in pupil size provide a window onto cognitive intensity (e.g. Barthel and Sauppe 2019). To demonstrate what opportunities FreMIC’s (combination of) transcriptions, annotations, and multimodal data open up for research in Interactional (Corpus) Linguistics, this article reports on interim results derived from work-in-progress.
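The area-of-interest technique mentioned above can be sketched in a few lines: each gaze sample, given as screen coordinates, is assigned to whichever predefined rectangular AOI contains it. The AOI names, coordinates, and sample data below are invented for illustration and do not reflect FreMIC's actual recording pipeline.

```python
# Hypothetical sketch of area-of-interest (AOI) gaze classification.
# All names and coordinates are illustrative, not FreMIC's actual setup.

from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """A rectangular area of interest on the recording frame."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def classify_gaze(samples, aois):
    """Map each (x, y) gaze sample to the name of the first AOI
    containing it, or 'off-target' if no AOI matches."""
    return [
        next((a.name for a in aois if a.contains(x, y)), "off-target")
        for x, y in samples
    ]


# Illustrative AOIs: a co-participant's face and hands.
aois = [
    AOI("speaker_face", 100, 50, 300, 250),
    AOI("speaker_hands", 80, 300, 340, 480),
]
samples = [(150, 120), (200, 400), (900, 900)]
print(classify_gaze(samples, aois))
# ['speaker_face', 'speaker_hands', 'off-target']
```

In practice an eye tracker would emit such samples at a fixed rate, and runs of identical labels could then be collapsed into gaze episodes aligned with the ELAN transcription tiers.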