Building an Annotated L1 Arabic/L2 English Bilingual Writer Corpus: The Qatari Corpus of Argumentative Writing (QCAW)

Abdelhamid M. Ahmed, Xiao Zhang, L. Rezk, W. Zaghouani
Journal: Corpus-based Studies across Humanities
Publication date: 2024-01-12
DOI: https://doi.org/10.1515/csh-2023-0012

Abstract

This study presents the creation of the Qatari Corpus of Argumentative Writing (QCAW), an annotated bilingual writer corpus of L1 Arabic and L2 English. The corpus comprises 200,000 tokens of argumentative writing by Qatari university students in L1 Arabic and L2 English. It includes 195 essays written by 195 students (159 female, 36 male), all native Arabic speakers proficient in English as a second language. The corpus is divided into Arabic and English sections, accompanied by part-of-speech annotated files in UTF-8 encoded text format. Metadata in CSV format records information about the students (gender, major, first and second languages) and the essays (text serial numbers, word limits, genre, writing date, time spent, and location). The study outlines the steps for collecting and analysing the corpus, including details on the essay writers, topic selection, pre-analysis text modifications, proficiency level, gender, and major ratings. Statistical analyses were applied to examine the corpus. The QCAW offers a valuable bilingual data source authored by the same students in Arabic and English, with implications for further research.
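As a rough illustration of how the CSV metadata described above might be used, the following Python sketch reads metadata rows and tallies writers by one field. The column names (`text_id`, `gender`, `major`, `l1`, `l2`) and the three sample rows are invented placeholders, not taken from the corpus; the actual QCAW header and values may differ.

```python
import csv
import io
from collections import Counter

# Placeholder rows with hypothetical column names; NOT real QCAW data.
SAMPLE_METADATA = """text_id,gender,major,l1,l2
001,F,Engineering,Arabic,English
002,M,Education,Arabic,English
003,F,Business,Arabic,English
"""


def load_metadata(handle):
    """Read metadata rows from an open CSV file handle into a list of dicts."""
    return list(csv.DictReader(handle))


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    return Counter(row[column] for row in rows)


rows = load_metadata(io.StringIO(SAMPLE_METADATA))
gender_counts = count_by(rows, "gender")
print(dict(gender_counts))
```

With a real metadata file, the in-memory string would be replaced by a file opened with `encoding="utf-8"` and `newline=""`, and the column name passed to `count_by` would need to match the file's actual header.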