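Identifying errors in a learner corpus – the two stages of error location vs. error description and consequences for measuring and reporting inter-annotator agreement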

Applied Corpus Linguistics, Volume 3, Issue 1, Article 100039. Pub Date: 2023-04-01. DOI: 10.1016/j.acorp.2022.100039

Nikola Dobrić


Citations: 1

Abstract



Marking errors in L2 learner performance, though useful in both a didactic and academic sense, is a challenging process, one usually performed manually when involving learner corpora. This is because errors are largely latent phenomena whose manual identification and description involve a significant degree of judgment on the side of human annotators. The purpose of the paper is to discuss and demonstrate the implications of the two stages of the decision-making process that is manual error coding, error location and error description, for measuring inter-annotator agreement as a marker of quality of annotation. The crux of the study is in the proposal that inter-annotator agreement on error location and on error description should be considered and reported separately rather than, as is common, together as a single measurement. The case study, grounded in a high-stakes exam context and typified using an established error taxonomy, demonstrates the method behind the proposal and showcases its usefulness in real-world settings.
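To make the proposal concrete, the following is a minimal sketch of how agreement on error location and agreement on error description might be computed as two separate figures. It is an illustration only, not the procedure used in the paper: the abstract does not say which agreement coefficient or unit of analysis the study uses, so the choice of Cohen's kappa, the per-token tagging, the function names `cohen_kappa` and `two_stage_agreement`, and the category labels are all assumptions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def two_stage_agreement(tags_a, tags_b):
    """Return (location_kappa, description_kappa) for per-token tags.

    Each tag is None (no error marked) or a category string.
    Assumption: one tag per token, with both annotators tagging the same tokens.
    """
    # Stage 1, error location: did the annotator mark an error here at all?
    located_a = [t is not None for t in tags_a]
    located_b = [t is not None for t in tags_b]
    location_kappa = cohen_kappa(located_a, located_b)

    # Stage 2, error description: compare categories only where both
    # annotators located an error, so location disagreements do not leak in.
    shared = [(a, b) for a, b in zip(tags_a, tags_b)
              if a is not None and b is not None]
    if not shared:
        return location_kappa, None
    description_kappa = cohen_kappa([a for a, _ in shared],
                                    [b for _, b in shared])
    return location_kappa, description_kappa


# Hypothetical annotations of an eight-token learner sentence.
annotator_a = [None, "GRAMMAR", None, "LEXIS", None, "SPELLING", None, None]
annotator_b = [None, "GRAMMAR", None, "GRAMMAR", None, None, "SPELLING", None]

location, description = two_stage_agreement(annotator_a, annotator_b)
print(f"location kappa:    {location:.3f}")
print(f"description kappa: {description:.3f}")
```

Restricting the second coefficient to tokens that both annotators flagged is one possible design choice. It keeps the description figure from being lowered by disagreements that belong to the location stage, which is the kind of conflation a single combined measurement would hide.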


Source journal: Applied Corpus Linguistics (Linguistics and Language)

CiteScore: 1.30

Self-citation rate: 0.00%

Review time: 70 days