Reaching beneath the tip of the iceberg: A guide to the Freiburg Multimodal Interaction Corpus

Pub Date: 2023-11-17 | DOI: 10.1515/opli-2022-0245
Christoph Rühlemann, Alexander Ptak
{"title":"到达冰山一角:弗莱堡多模态交互语料库指南","authors":"Christoph Rühlemann, Alexander Ptak","doi":"10.1515/opli-2022-0245","DOIUrl":null,"url":null,"abstract":"Most corpora tacitly subscribe to a speech-only view filtering out anything that is not a ‘word’ and transcribing the spoken language merely orthographically despite the fact that the “speech-only view on language is fundamentally incomplete” (Kok 2017, 2) due to the deep intertwining of the verbal, vocal, and kinesic modalities (Levinson and Holler 2014). This article introduces the Freiburg Multimodal Interaction Corpus (FreMIC), a multimodal and interactional corpus of unscripted conversation in English currently under construction. At the time of writing, FreMIC comprises (i) c. 29 h of video-recordings transcribed and annotated in detail and (ii) automatically (and manually) generated multimodal data. All conversations are transcribed in ELAN both orthographically and using Jeffersonian conventions to render verbal content and interactionally relevant details of sequencing (e.g. overlap, latching), temporal aspects (pauses, acceleration/deceleration), phonological aspects (e.g. intensity, pitch, stretching, truncation, voice quality), and laughter. Moreover, the orthographic transcriptions are exhaustively PoS-tagged using the CLAWS web tagger (Garside and Smith 1997). ELAN-based transcriptions also provide exhaustive annotations of re-enactments (also referred to as (free) direct speech, constructed dialogue, etc.) as well as silent gestures (meaningful gestures that occur without accompanying speech). The multimodal data are derived from psychophysiological measurements and eye tracking. The psychophysiological measurements include, inter alia, electrodermal activity or GSR, which is indicative of emotional arousal (e.g. Peräkylä et al. 2015). Eye tracking produces data of two kinds: gaze direction and pupil size. In FreMIC, gazes are automatically recorded using the area-of-interest technology. Gaze direction is interactionally key, for example, in turn-taking (e.g. Auer 2021) and re-enactments (e.g. Pfeiffer and Weiss 2022), while changes in pupil size provide a window onto cognitive intensity (e.g. Barthel and Sauppe 2019). To demonstrate what opportunities FreMIC’s (combination of) transcriptions, annotations, and multimodal data open up for research in Interactional (Corpus) Linguistics, this article reports on interim results derived from work-in-progress.","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reaching beneath the tip of the iceberg: A guide to the Freiburg Multimodal Interaction Corpus\",\"authors\":\"Christoph Rühlemann, Alexander Ptak\",\"doi\":\"10.1515/opli-2022-0245\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most corpora tacitly subscribe to a speech-only view filtering out anything that is not a ‘word’ and transcribing the spoken language merely orthographically despite the fact that the “speech-only view on language is fundamentally incomplete” (Kok 2017, 2) due to the deep intertwining of the verbal, vocal, and kinesic modalities (Levinson and Holler 2014). This article introduces the Freiburg Multimodal Interaction Corpus (FreMIC), a multimodal and interactional corpus of unscripted conversation in English currently under construction. At the time of writing, FreMIC comprises (i) c. 
29 h of video-recordings transcribed and annotated in detail and (ii) automatically (and manually) generated multimodal data. All conversations are transcribed in ELAN both orthographically and using Jeffersonian conventions to render verbal content and interactionally relevant details of sequencing (e.g. overlap, latching), temporal aspects (pauses, acceleration/deceleration), phonological aspects (e.g. intensity, pitch, stretching, truncation, voice quality), and laughter. Moreover, the orthographic transcriptions are exhaustively PoS-tagged using the CLAWS web tagger (Garside and Smith 1997). ELAN-based transcriptions also provide exhaustive annotations of re-enactments (also referred to as (free) direct speech, constructed dialogue, etc.) as well as silent gestures (meaningful gestures that occur without accompanying speech). The multimodal data are derived from psychophysiological measurements and eye tracking. The psychophysiological measurements include, inter alia, electrodermal activity or GSR, which is indicative of emotional arousal (e.g. Peräkylä et al. 2015). Eye tracking produces data of two kinds: gaze direction and pupil size. In FreMIC, gazes are automatically recorded using the area-of-interest technology. Gaze direction is interactionally key, for example, in turn-taking (e.g. Auer 2021) and re-enactments (e.g. Pfeiffer and Weiss 2022), while changes in pupil size provide a window onto cognitive intensity (e.g. Barthel and Sauppe 2019). To demonstrate what opportunities FreMIC’s (combination of) transcriptions, annotations, and multimodal data open up for research in Interactional (Corpus) Linguistics, this article reports on interim results derived from work-in-progress.\",\"PeriodicalId\":0,\"journal\":{\"name\":\"\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0,\"publicationDate\":\"2023-11-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1515/opli-2022-0245\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1515/opli-2022-0245","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Most corpora tacitly subscribe to a speech-only view filtering out anything that is not a ‘word’ and transcribing the spoken language merely orthographically despite the fact that the “speech-only view on language is fundamentally incomplete” (Kok 2017, 2) due to the deep intertwining of the verbal, vocal, and kinesic modalities (Levinson and Holler 2014). This article introduces the Freiburg Multimodal Interaction Corpus (FreMIC), a multimodal and interactional corpus of unscripted conversation in English currently under construction. At the time of writing, FreMIC comprises (i) c. 29 h of video-recordings transcribed and annotated in detail and (ii) automatically (and manually) generated multimodal data. All conversations are transcribed in ELAN both orthographically and using Jeffersonian conventions to render verbal content and interactionally relevant details of sequencing (e.g. overlap, latching), temporal aspects (pauses, acceleration/deceleration), phonological aspects (e.g. intensity, pitch, stretching, truncation, voice quality), and laughter. Moreover, the orthographic transcriptions are exhaustively PoS-tagged using the CLAWS web tagger (Garside and Smith 1997). ELAN-based transcriptions also provide exhaustive annotations of re-enactments (also referred to as (free) direct speech, constructed dialogue, etc.) as well as silent gestures (meaningful gestures that occur without accompanying speech). The multimodal data are derived from psychophysiological measurements and eye tracking. The psychophysiological measurements include, inter alia, electrodermal activity or GSR, which is indicative of emotional arousal (e.g. Peräkylä et al. 2015). Eye tracking produces data of two kinds: gaze direction and pupil size. In FreMIC, gazes are automatically recorded using the area-of-interest technology. Gaze direction is interactionally key, for example, in turn-taking (e.g. Auer 2021) and re-enactments (e.g. Pfeiffer and Weiss 2022), while changes in pupil size provide a window onto cognitive intensity (e.g. Barthel and Sauppe 2019). To demonstrate what opportunities FreMIC’s (combination of) transcriptions, annotations, and multimodal data open up for research in Interactional (Corpus) Linguistics, this article reports on interim results derived from work-in-progress.
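
For readers who want a concrete sense of how ELAN-based transcriptions of this kind can be processed, the following minimal sketch shows one way to read an .eaf file and extract time-aligned annotations from a tier. It is not part of the corpus documentation: it assumes the third-party pympi-ling package, and the file name and tier name ("orthographic") are hypothetical placeholders, since FreMIC's actual tier layout is not specified in the abstract.

```python
# Minimal sketch: reading an ELAN (.eaf) transcription with pympi-ling.
# Assumptions (not from the source): the file name "conversation_01.eaf"
# and the tier name "orthographic" are illustrative placeholders.
import pympi

# Load the ELAN annotation file.
eaf = pympi.Elan.Eaf("conversation_01.eaf")

# Inspect which annotation tiers the file contains.
print(eaf.get_tier_names())

# Extract (start_ms, end_ms, value) triples from one tier and print
# each annotation with its duration in seconds.
annotations = eaf.get_annotation_data_for_tier("orthographic")
for start_ms, end_ms, text in annotations[:10]:
    duration_s = (end_ms - start_ms) / 1000
    print(f"{start_ms:>8} ms  ({duration_s:5.2f} s)  {text}")
```

The same time stamps could then be used to align the transcription tiers with externally recorded streams such as eye-tracking or GSR samples, though the exact alignment procedure used for FreMIC is not described here.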