Annotating Entailment Relations for Shortanswer Questions
Simon Ostermann, Andrea Horbach, Manfred Pinkal
NLP-TEA@ACL/IJCNLP, published 2015-07-01
DOI: 10.18653/v1/W15-4408 (https://doi.org/10.18653/v1/W15-4408)
Citation count: 2
Abstract
This paper presents an annotation project that explores the relationship between textual entailment and short answer scoring (SAS). We annotate entailment relations between learner answers and target answers in the Corpus of Reading Comprehension Exercises for German (CREG) using a fine-grained label inventory, and we compare these annotations in various ways to the correctness scores assigned by teachers. Our main finding is that, although the two tasks are clearly related, not all of our entailment tags can be mapped directly to SAS scores. Partial entailment in particular covers instances that are problematic for automatic scoring and need further investigation.