Workshop on Innovative Use of NLP for Building Educational Applications: Latest Publications

ACTA: Short-Answer Grading in High-Stakes Medical Exams
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.36
King Yiu Suen, Victoria Yaneva, L. Ha, Janet Mee, Yiyun Zhou, Polina Harik
{"title":"ACTA: Short-Answer Grading in High-Stakes Medical Exams","authors":"King Yiu Suen, Victoria Yaneva, L. Ha, Janet Mee, Yiyun Zhou, Polina Harik","doi":"10.18653/v1/2023.bea-1.36","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.36","url":null,"abstract":"This paper presents the ACTA system, which performs automated short-answer grading in the domain of high-stakes medical exams. The system builds upon previous work on neural similarity-based grading approaches by applying these to the medical domain and utilizing contrastive learning as a means to optimize the similarity metric. ACTA is evaluated against three strong baselines and is developed in alignment with operational needs, where low-confidence responses are flagged for human review. Learning curves are explored to understand the effects of training data on performance. The results demonstrate that ACTA leads to substantially lower number of responses being flagged for human review, while maintaining high classification accuracy.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121107503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
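The paper itself ships no code; the following is a minimal sketch of the similarity-plus-threshold pattern it describes, using an off-the-shelf sentence-transformers encoder as a stand-in for ACTA's contrastively fine-tuned model. The encoder name, exemplar answers, and flagging margin are all illustrative assumptions.

```python
# Minimal sketch of similarity-based grading with low-confidence
# flagging. Encoder choice, exemplars, and margin are illustrative
# assumptions, not the ACTA implementation (whose encoder is
# contrastively fine-tuned on medical exam responses).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in encoder

# Toy exemplar answers previously graded by experts.
keyed_correct = ["elevated blood glucose", "hyperglycemia"]
keyed_incorrect = ["low blood pressure", "normal glucose levels"]

def grade(response: str, flag_margin: float = 0.05):
    """Label by nearest exemplar; flag for human review when the
    correct/incorrect similarities are too close to call."""
    texts = [response] + keyed_correct + keyed_incorrect
    emb = encoder.encode(texts, convert_to_tensor=True)
    resp = emb[0]
    sim_pos = util.cos_sim(resp, emb[1:1 + len(keyed_correct)]).max().item()
    sim_neg = util.cos_sim(resp, emb[1 + len(keyed_correct):]).max().item()
    label = "correct" if sim_pos > sim_neg else "incorrect"
    flagged = abs(sim_pos - sim_neg) < flag_margin
    return label, flagged

print(grade("the patient's blood sugar is high"))
```

In the operational setup the abstract describes, only responses returned with flagged=True would be routed to human graders.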
A dynamic model of lexical experience for tracking of oral reading fluency
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.48
Beata Beigman Klebanov, Mike Suhan, Zuowei Wang, T. O’Reilly
{"title":"A dynamic model of lexical experience for tracking of oral reading fluency","authors":"Beata Beigman Klebanov, Mike Suhan, Zuowei Wang, T. O’Reilly","doi":"10.18653/v1/2023.bea-1.48","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.48","url":null,"abstract":"We present research aimed at solving a problem in assessment of oral reading fluency using children’s oral reading data from our online book reading app. It is known that properties of the passage being read aloud impact fluency estimates; therefore, passage-based measures are used to remove passage-related variance when estimating growth in oral reading fluency. However, passage-based measures reported in the literature tend to treat passages as independent events, without explicitly modeling accumulation of lexical experience as one reads through a book. We propose such a model and show that it helps explain additional variance in the measurements of children’s fluency as they read through a book, improving over a strong baseline. These results have implications for measuring growth in oral reading fluency.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121679672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
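As a rough illustration of accumulating lexical experience across a book (the paper's exact model is not reproduced here), one can persist per-word exposure counts across passages and derive a familiarity feature for each passage; the tokenizer and feature definition below are assumptions.

```python
# Illustrative sketch of accumulating lexical experience passage by
# passage; the familiarity feature is an assumption, not the paper's
# exact formulation.
import re
from collections import Counter

def tokenize(text: str):
    return re.findall(r"[a-z']+", text.lower())

def passage_familiarity(passages):
    """Yield, for each passage, the share of its tokens the reader
    has already met earlier in the book."""
    seen = Counter()
    for text in passages:
        tokens = tokenize(text)
        familiar = sum(1 for t in tokens if seen[t] > 0)
        yield familiar / max(len(tokens), 1)
        seen.update(tokens)  # lexical experience accumulates

book = ["The fox ran.", "The fox ran fast.", "A dog chased the fox."]
print([round(f, 2) for f in passage_familiarity(book)])  # [0.0, 0.75, 0.4]
```

Unlike treating passages as independent events, the feature for each passage depends on everything read before it, which is the modeling shift the abstract argues for.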
Reconciling Adaptivity and Task Orientation in the Student Dashboard of an Intelligent Language Tutoring System
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.25
Leona Colling, Tanja Heck, Walt Detmar Meurers
{"title":"Reconciling Adaptivity and Task Orientation in the Student Dashboard of an Intelligent Language Tutoring System","authors":"Leona Colling, Tanja Heck, Walt Detmar Meurers","doi":"10.18653/v1/2023.bea-1.25","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.25","url":null,"abstract":"In intelligent language tutoring systems, student dashboards should display the learning progress and performance and support the navigation through the learning content. Designing an interface that transparently offers information on students’ learning in relation to specific learning targets while linking to the overarching functional goal, that motivates and organizes the practice in current foreign language teaching, is challenging.This becomes even more difficult in systems that adaptively expose students to different learning material and individualize system interactions. If such a system is used in an ecologically valid setting of blended learning, this generates additional requirements to incorporate the needs of students and teachers for control and customizability.We present the conceptual design of a student dashboard for a task-based, user-adaptive intelligent language tutoring system intended for use in real-life English classes in secondary schools. We highlight the key challenges and spell out open questions for future research.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129308751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Transformer-based Hebrew NLP models for Short Answer Scoring in Biology
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.46
Abigail Gurin Schleifer, Beata Beigman Klebanov, Moriah Ariely, Giora Alexandron
{"title":"Transformer-based Hebrew NLP models for Short Answer Scoring in Biology","authors":"Abigail Gurin Schleifer, Beata Beigman Klebanov, Moriah Ariely, Giora Alexandron","doi":"10.18653/v1/2023.bea-1.46","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.46","url":null,"abstract":"Pre-trained large language models (PLMs) are adaptable to a wide range of downstream tasks by fine-tuning their rich contextual embeddings to the task, often without requiring much task-specific data. In this paper, we explore the use of a recently developed Hebrew PLM aleph-BERT for automated short answer grading of high school biology items. We show that the alephBERT-based system outperforms a strong CNN-based baseline, and that it general-izes unexpectedly well in a zero-shot paradigm to items on an unseen topic that address the same underlying biological concepts, opening up the possibility of automatically assessing new items without item-specific fine-tuning.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121124498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
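A hedged sketch of how such a classifier might be applied at inference time with the Hugging Face transformers API; the checkpoint id onlplab/alephbert-base and the two-label setup are assumptions, and the classification head below is randomly initialized, so it would first need fine-tuning on graded responses as the abstract describes.

```python
# Inference-time sketch for short-answer scoring with a Hebrew PLM.
# Checkpoint id and binary label set are assumptions; the head must
# be fine-tuned before the probabilities mean anything.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "onlplab/alephbert-base"  # assumed AlephBERT checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2)

def p_correct(item_text: str, answer: str) -> float:
    """Probability that the answer is correct, given the item text."""
    inputs = tokenizer(item_text, answer, return_tensors="pt",
                       truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()
```

Pairing the item text with the answer in one encoded sequence is what lets a model of this kind transfer zero-shot to unseen items on related concepts.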
Grammatical Error Correction for Sentence-level Assessment in Language Learning
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.41
Anisia Katinskaia, R. Yangarber
{"title":"Grammatical Error Correction for Sentence-level Assessment in Language Learning","authors":"Anisia Katinskaia, R. Yangarber","doi":"10.18653/v1/2023.bea-1.41","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.41","url":null,"abstract":"The paper presents experiments on using a Grammatical Error Correction (GEC) model to assess the correctness of answers that language learners give to grammar exercises. We explored whether a GEC model can be applied in the language learning context for a language with complex morphology. We empirically check a hypothesis that a GEC model corrects only errors and leaves correct answers unchanged. We perform a test on assessing learner answers in a real but constrained language-learning setup: the learners answer only fill-in-the-blank and multiple-choice exercises. For this purpose, we use ReLCo, a publicly available manually annotated learner dataset in Russian (Katinskaia et al., 2022). In this experiment, we fine-tune a large-scale T5 language model for the GEC task and estimate its performance on the RULEC-GEC dataset (Rozovskaya and Roth, 2019) to compare with top-performing models. We also release an updated version of the RULEC-GEC test set, manually checked by native speakers. Our analysis shows that the GEC model performs reasonably well in detecting erroneous answers to grammar exercises and potentially can be used for best-performing error types in a real learning setup. However, it struggles to assess answers which were tagged by human annotators as alternative-correct using the aforementioned hypothesis. This is in large part due to a still low recall in correcting errors, and the fact that the GEC model may modify even correct words—it may generate plausible alternatives, which are hard to evaluate against the gold-standard reference.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130765091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
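The hypothesis being tested lends itself to a compact check: run the GEC model on the learner's answer and treat an unchanged output as correct. The sketch below uses a generic t5-small checkpoint purely as a stand-in; the authors fine-tune a large-scale T5 on Russian GEC data, which is not reproduced here.

```python
# Sketch of the paper's working hypothesis: a GEC model corrects
# errors and leaves correct answers unchanged, so "output == input"
# can serve as a correctness signal. t5-small is a generic stand-in;
# it is NOT a GEC model until fine-tuned as the authors describe.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def judged_correct(learner_answer: str) -> bool:
    inputs = tokenizer(learner_answer, return_tensors="pt")
    out_ids = model.generate(**inputs, max_new_tokens=64)
    correction = tokenizer.decode(out_ids[0], skip_special_tokens=True)
    return correction.strip() == learner_answer.strip()
```

The abstract's caveat surfaces exactly here: an alternative-correct answer that the model rewrites into a different plausible form is judged incorrect by this test.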
Socratic Questioning of Novice Debuggers: A Benchmark Dataset and Preliminary Evaluations
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.57
Erfan Al-Hossami, Razvan C. Bunescu, Ryan Teehan, Laurel Powell, Khyati Mahajan, Mohsen Dorodchi
{"title":"Socratic Questioning of Novice Debuggers: A Benchmark Dataset and Preliminary Evaluations","authors":"Erfan Al-Hossami, Razvan C. Bunescu, Ryan Teehan, Laurel Powell, Khyati Mahajan, Mohsen Dorodchi","doi":"10.18653/v1/2023.bea-1.57","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.57","url":null,"abstract":"Socratic questioning is a teaching strategy where the student is guided towards solving a problem on their own, instead of being given the solution directly. In this paper, we introduce a dataset of Socratic conversations where an instructor helps a novice programmer fix buggy solutions to simple computational problems. The dataset is then used for benchmarking the Socratic debugging abilities of GPT-based language models. While GPT-4 is observed to perform much better than GPT-3.5, its precision, and recall still fall short of human expert abilities, motivating further work in this area.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128847849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
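As a toy illustration of the precision/recall evaluation mentioned above (the paper's exact matching criterion is not shown here), generated Socratic questions can be scored against the reference questions for a turn, with exact match as a placeholder criterion:

```python
# Toy turn-level precision/recall for Socratic question generation.
# Exact string match is an assumption; the paper's matching criterion
# may be looser.
def precision_recall(predicted: set[str], reference: set[str]):
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall

print(precision_recall(
    {"What does the loop index start at?", "Why use a while loop?"},
    {"What does the loop index start at?"}))  # (0.5, 1.0)
```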
Scalable and Explainable Automated Scoring for Open-Ended Constructed Response Math Word Problems
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.12
Scott Hellman, Alejandro Andrade, Kyle Habermehl
{"title":"Scalable and Explainable Automated Scoring for Open-Ended Constructed Response Math Word Problems","authors":"Scott Hellman, Alejandro Andrade, Kyle Habermehl","doi":"10.18653/v1/2023.bea-1.12","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.12","url":null,"abstract":"Open-ended constructed response math word problems (“math plus text”, or MPT) are a powerful tool in the assessment of students’ abilities to engage in mathematical reasoning and creative thinking. Such problems ask the student to compute a value or construct an expression and then explain, potentially in prose, what steps they took and why they took them. MPT items can be scored against highly structured rubrics, and we develop a novel technique for the automated scoring of MPT items that leverages these rubrics to provide explainable scoring. We show that our approach can be trained automatically and performs well on a large dataset of 34,417 responses across 14 MPT items.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120900867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
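A toy sketch of why rubric-aligned scoring is explainable: checking each rubric criterion independently means every awarded point carries its own rationale. The regex matcher and rubric below are placeholders, not the trained scoring models or actual rubrics from the paper.

```python
# Toy rubric-aligned explainable scoring: each criterion is checked
# independently, so every point comes with a rationale. The keyword
# matcher stands in for trained per-criterion classifiers.
import re

rubric = [  # hypothetical rubric for "compute 3 * 4 and explain"
    ("correct value", 1, r"\b12\b"),
    ("names the operation", 1, r"multipl\w*"),
    ("explains reasoning", 1, r"because|since|so that"),
]

def score(response: str):
    total, rationale = 0, []
    for name, points, pattern in rubric:
        if re.search(pattern, response, re.IGNORECASE):
            total += points
            rationale.append(f"+{points}: {name}")
    return total, rationale

print(score("I multiplied 3 by 4 to get 12 because there are 4 groups of 3."))
```

Because the score decomposes over criteria, a student or teacher can see which rubric elements earned credit, rather than receiving a single opaque number.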
Comparing Neural Question Generation Architectures for Reading Comprehension
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.47
E. M. Perkoff, A. Bhattacharyya, Jon Z. Cai, Jie Cao
{"title":"Comparing Neural Question Generation Architectures for Reading Comprehension","authors":"E. M. Perkoff, A. Bhattacharyya, Jon Z. Cai, Jie Cao","doi":"10.18653/v1/2023.bea-1.47","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.47","url":null,"abstract":"In recent decades, there has been a significant push to leverage technology to aid both teachers and students in the classroom. Language processing advancements have been harnessed to provide better tutoring services, automated feedback to teachers, improved peer-to-peer feedback mechanisms, and measures of student comprehension for reading. Automated question generation systems have the potential to significantly reduce teachers’ workload in the latter. In this paper, we compare three differ- ent neural architectures for question generation across two types of reading material: narratives and textbooks. For each architecture, we explore the benefits of including question attributes in the input representation. Our models show that a T5 architecture has the best overall performance, with a RougeL score of 0.536 on a narrative corpus and 0.316 on a textbook corpus. We break down the results by attribute and discover that the attribute can improve the quality of some types of generated questions, including Action and Character, but this is not true for all models.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122146549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
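One plausible reading of "including question attributes in the input representation" is to prepend the attribute to the source text, and RougeL can be computed with the rouge-score package; the prompt format and the untrained t5-small checkpoint below are assumptions, not the paper's setup.

```python
# Sketch of attribute-conditioned question generation plus RougeL
# scoring. Prompt format is an assumption; t5-small is a stand-in
# for models fine-tuned on narrative and textbook corpora.
from rouge_score import rouge_scorer
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

def generate_question(passage: str, attribute: str) -> str:
    prompt = f"generate question: attribute: {attribute} text: {passage}"
    ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids
    out = model.generate(ids, max_new_tokens=32, num_beams=4)
    return tokenizer.decode(out[0], skip_special_tokens=True)

# RougeL against a reference question, as in the paper's evaluation.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
hyp = generate_question("The dog chased the fox across the field.", "Action")
print(scorer.score("What did the dog chase?", hyp)["rougeL"].fmeasure)
```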
Automated evaluation of written discourse coherence using GPT-4
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.32
Ben Naismith, Phoebe Mulcaire, J. Burstein
{"title":"Automated evaluation of written discourse coherence using GPT-4","authors":"Ben Naismith, Phoebe Mulcaire, J. Burstein","doi":"10.18653/v1/2023.bea-1.32","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.32","url":null,"abstract":"The popularization of large language models (LLMs) such as OpenAI’s GPT-3 and GPT-4 have led to numerous innovations in the field of AI in education. With respect to automated writing evaluation (AWE), LLMs have reduced challenges associated with assessing writing quality characteristics that are difficult to identify automatically, such as discourse coherence. In addition, LLMs can provide rationales for their evaluations (ratings) which increases score interpretability and transparency. This paper investigates one approach to producing ratings by training GPT-4 to assess discourse coherence in a manner consistent with expert human raters. The findings of the study suggest that GPT-4 has strong potential to produce discourse coherence ratings that are comparable to human ratings, accompanied by clear rationales. Furthermore, the GPT-4 ratings outperform traditional NLP coherence metrics with respect to agreement with human ratings. These results have implications for advancing AWE technology for learning and assessment.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128709967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
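A minimal sketch of eliciting a coherence rating plus rationale through the OpenAI chat API; the prompt wording and 1-5 scale are illustrative assumptions, and the paper's alignment of GPT-4 with expert human raters is not reproduced here.

```python
# Sketch of a coherence rating plus rationale via the OpenAI API.
# Prompt wording and scale are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rate_coherence(essay: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {"role": "system",
             "content": "You rate the discourse coherence of essays on "
                        "a 1-5 scale and briefly justify the rating."},
            {"role": "user",
             "content": f"Essay:\n{essay}\n\n"
                        "Reply as 'score: <1-5>' plus a short rationale."},
        ],
    )
    return response.choices[0].message.content
```

Requesting the rationale alongside the score is what gives this approach its interpretability advantage over traditional coherence metrics, per the abstract.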
NAISTeacher: A Prompt and Rerank Approach to Generating Teacher Utterances in Educational Dialogues
Workshop on Innovative Use of NLP for Building Educational Applications. DOI: 10.18653/v1/2023.bea-1.63
Justin Vasselli, Christopher Vasselli, Adam Nohejl, Taro Watanabe
{"title":"NAISTeacher: A Prompt and Rerank Approach to Generating Teacher Utterances in Educational Dialogues","authors":"Justin Vasselli, Christopher Vasselli, Adam Nohejl, Taro Watanabe","doi":"10.18653/v1/2023.bea-1.63","DOIUrl":"https://doi.org/10.18653/v1/2023.bea-1.63","url":null,"abstract":"This paper presents our approach to the BEA 2023 shared task of generating teacher responses in educational dialogues, using the Teacher-Student Chatroom Corpus. Our system prompts GPT-3.5-turbo to generate initial suggestions, which are then subjected to reranking. We explore multiple strategies for candidate generation, including prompting for multiple candidates and employing iterative few-shot prompts with negative examples. We aggregate all candidate responses and rerank them based on DialogRPT scores. To handle consecutive turns in the dialogue data, we divide the task of generating teacher utterances into two components: teacher replies to the student and teacher continuations of previously sent messages. Through our proposed methodology, our system achieved the top score on both automated metrics and human evaluation, surpassing the reference human teachers on the latter.","PeriodicalId":363390,"journal":{"name":"Workshop on Innovative Use of NLP for Building Educational Applications","volume":"202 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124287535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
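The rerank half of the pipeline can be sketched against the public DialogRPT checkpoint; the candidate replies are hard-coded here, whereas the system prompts GPT-3.5-turbo for them, and the context-response scoring call follows the DialogRPT model card.

```python
# Sketch of the prompt-and-rerank pattern: candidate teacher replies
# (hard-coded here; the paper generates them with GPT-3.5-turbo) are
# reranked with DialogRPT, using the separator from its model card.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

ranker_name = "microsoft/DialogRPT-updown"
tokenizer = AutoTokenizer.from_pretrained(ranker_name)
ranker = AutoModelForSequenceClassification.from_pretrained(ranker_name)

def rerank(context: str, candidates: list[str]) -> list[str]:
    def score(reply: str) -> float:
        ids = tokenizer.encode(context + "<|endoftext|>" + reply,
                               return_tensors="pt")
        with torch.no_grad():
            return torch.sigmoid(ranker(ids).logits).item()
    return sorted(candidates, key=score, reverse=True)

context = "Student: I dont know how to use past tense."
candidates = ["Good question! Can you try changing 'walk' to past tense?",
              "OK.", "Let's look at an example together."]
print(rerank(context, candidates)[0])
```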