Uncertainty-Aware Personalized Readability Assessments for Second Language Learners

Author: Yo Ehara
Venue: 2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA)
Publication date: December 2019
DOI: 10.1109/ICMLA.2019.00307 (https://doi.org/10.1109/ICMLA.2019.00307)
Citations: 4
Abstract
Quickly assessing whether an ungraded second language learner can read a given text is important for further instructing and supporting the learner, particularly when evaluating numerous ungraded learners from diverse backgrounds. Second language acquisition (SLA) studies have tackled such assessment tasks in which only a single short vocabulary test result is available for assessing a learner; these studies have shown that text coverage, i.e., the percentage of words in the text that the learner knows, is the key assessment measure. Currently, count-based percentages are used: each word in the given text is classified as either known or unknown to the learner, and the words classified as known are then simply counted. When each word is classified, we can also obtain an uncertainty value indicating how likely it is that the learner knows the word. Although such values can be informative for a readability assessment, it remains unclear how to leverage them while guaranteeing that the resulting measure is comparable to the established count-based one. We propose a novel framework that allows assessment methods to be uncertainty-aware while guaranteeing comparability to the text-coverage threshold. Such methods involve a computationally complex problem, for which we also propose a practical algorithm. In addition, we propose a neural-network-based classifier from which we can obtain better uncertainty values. For evaluation, we created a crowdsourcing-based dataset in which each learner takes both vocabulary and readability tests. The best method under our framework outperformed conventional methods.
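The contrast between the count-based measure and an uncertainty-aware one can be sketched as follows. This is a minimal illustration, not the paper's actual method: the per-word probabilities are made up, and the uncertainty-aware variant shown here simply averages the probabilities (an expected coverage), whereas the paper develops a framework with comparability guarantees.

```python
def count_based_coverage(p_known, threshold=0.5):
    """Count-based text coverage: hard-classify each word as known or
    unknown, then report the fraction classified as known."""
    return sum(p >= threshold for p in p_known) / len(p_known)


def expected_coverage(p_known):
    """A simple uncertainty-aware alternative: average the per-word
    probabilities of being known instead of thresholding them."""
    return sum(p_known) / len(p_known)


# Hypothetical classifier outputs: P(learner knows word) for each
# word in a six-word text.
probs = [0.99, 0.95, 0.80, 0.55, 0.45, 0.10]

print(count_based_coverage(probs))  # 4 of 6 words pass the 0.5 cutoff
print(expected_coverage(probs))     # averages to 0.64
```

Note how two borderline words (0.55 and 0.45) contribute a full count and nothing, respectively, under the count-based measure, while the probability-averaging variant treats them nearly identically; this sensitivity to the threshold is one motivation for using the uncertainty values directly.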