A database of North American double modals and self-repairs from YouTube

JCR: Q2, Arts and Humanities
Steven Coats
DOI: 10.2478/plc-2022-13 (https://doi.org/10.2478/plc-2022-13)
Publication date: 2022-01-01
Publication type: Journal Article
Citations: 0

A database of North American double modals and self-repairs from YouTube
Abstract: Sequences of two modal verbs in spoken English can represent use of a nonstandard syntactic feature (double modal) or a corrected utterance in which a speaker begins with one modal auxiliary, but switches to another (self-repair). This article presents the Double Modals and Self-Repairs (DMSR) database, a table of naturalistic double modals and self-repairs in videos from local government entities in North America, created from the Corpus of North American Spoken English (CoNASE). The paper describes the procedures used for the database’s creation, discusses potential uses, and presents an exploratory analysis in which a logistic regression classifier is trained with CoNASE data to distinguish authentic double modals from self-repair sequences on the basis of local discourse context. The analysis demonstrates how large corpora of speech can be used to investigate the links between syntactic and pragmatic phenomena and shows specifically that double modals are an interactive device, while two-modal sequences as self-repairs may be the result of high cognitive load. The paper concludes with a discussion of multimodal corpus creation from YouTube for the study of lexical, syntactic, and interactional phenomena in speech as well as for the analysis of complex, multilevel computer-mediated communication (CMC) phenomena.
Source journal
Psychology of Language and Communication
Subject area: Arts and Humanities, Language and Linguistics
CiteScore: 0.80
Self-citation rate: 0.00%
Articles published: 11
Review time: 14 weeks