视频游戏对话语料库

IF 0.8 Q3 LINGUISTICS
Corpora Pub Date : 2024-04-01 DOI:10.3366/cor.2024.0299
Stephanie Rennick, Seán Roberts
{"title":"视频游戏对话语料库","authors":"Stephanie Rennick, Seán Roberts","doi":"10.3366/cor.2024.0299","DOIUrl":null,"url":null,"abstract":"This paper presents the Video Game Dialogue Corpus, the first large-scale, consistently coded, open source corpus of dialogue from video games. It contains over 6.2 million words of English dialogue from fifty games in the Role Playing Game (rpg) genre. This includes games produced between 1985 and 2020, rated for children, teenagers and adults, and in both ‘Western’ and ‘Japanese’ sub-genres. The corpus design is described, including custom data formats for representing branching dialogue. We demonstrate the use of the corpus by comparing the dialogue of female and male characters, where we find reflections of gendered language in other media as well as patterns that seem specific to video games. We provide the source code for a ‘self-inflating corpus’ – a pipeline that obtains the data then processes and parses it into a standard format. This makes the corpus available for teaching and research purposes, providing the first such resource for empirical analysis of video game dialogue.","PeriodicalId":44933,"journal":{"name":"Corpora","volume":null,"pages":null},"PeriodicalIF":0.8000,"publicationDate":"2024-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"The Video Game Dialogue Corpus\",\"authors\":\"Stephanie Rennick, Seán Roberts\",\"doi\":\"10.3366/cor.2024.0299\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents the Video Game Dialogue Corpus, the first large-scale, consistently coded, open source corpus of dialogue from video games. It contains over 6.2 million words of English dialogue from fifty games in the Role Playing Game (rpg) genre. This includes games produced between 1985 and 2020, rated for children, teenagers and adults, and in both ‘Western’ and ‘Japanese’ sub-genres. The corpus design is described, including custom data formats for representing branching dialogue. We demonstrate the use of the corpus by comparing the dialogue of female and male characters, where we find reflections of gendered language in other media as well as patterns that seem specific to video games. We provide the source code for a ‘self-inflating corpus’ – a pipeline that obtains the data then processes and parses it into a standard format. This makes the corpus available for teaching and research purposes, providing the first such resource for empirical analysis of video game dialogue.\",\"PeriodicalId\":44933,\"journal\":{\"name\":\"Corpora\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.8000,\"publicationDate\":\"2024-04-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Corpora\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3366/cor.2024.0299\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"LINGUISTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Corpora","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3366/cor.2024.0299","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"LINGUISTICS","Score":null,"Total":0}
引用次数: 2

摘要

本文介绍了视频游戏对话语料库(Video Game Dialogue Corpus),这是第一个大规模、持续编码、开源的视频游戏对话语料库。该语料库包含 50 款角色扮演游戏(rpg)类型游戏中超过 620 万字的英语对话。其中包括 1985 年至 2020 年间制作的游戏,分级为儿童、青少年和成人,有 "西方 "和 "日本 "两种子类型。我们介绍了语料库的设计,包括用于表示分支对话的自定义数据格式。我们通过比较女性和男性角色的对话来演示语料库的使用,我们发现了其他媒体中性别语言的反映,以及似乎是电子游戏特有的模式。我们提供了 "自充气语料库 "的源代码--这是一个获取数据、处理数据并将其解析为标准格式的管道。这使得该语料库可用于教学和研究目的,为视频游戏对话的实证分析提供了首个此类资源。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
The Video Game Dialogue Corpus
This paper presents the Video Game Dialogue Corpus, the first large-scale, consistently coded, open source corpus of dialogue from video games. It contains over 6.2 million words of English dialogue from fifty games in the Role Playing Game (rpg) genre. This includes games produced between 1985 and 2020, rated for children, teenagers and adults, and in both ‘Western’ and ‘Japanese’ sub-genres. The corpus design is described, including custom data formats for representing branching dialogue. We demonstrate the use of the corpus by comparing the dialogue of female and male characters, where we find reflections of gendered language in other media as well as patterns that seem specific to video games. We provide the source code for a ‘self-inflating corpus’ – a pipeline that obtains the data then processes and parses it into a standard format. This makes the corpus available for teaching and research purposes, providing the first such resource for empirical analysis of video game dialogue.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Corpora
Corpora LINGUISTICS-
CiteScore
1.70
自引率
0.00%
发文量
20
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信