Hybrid Models for Sentence Readability Assessment

Workshop on Innovative Use of NLP for Building Educational Applications. Pub Date: 1900-01-01. DOI: 10.18653/v1/2023.bea-1.37

Feng Liu, John Lee


Abstract

Automatic readability assessment (ARA) predicts how difficult a text is for a reader to understand. While ARA has traditionally been performed at the passage level, there has been increasing interest in ARA at the sentence level, given its applications in downstream tasks such as text simplification and language exercise generation. Recent research has suggested that hybrid approaches are effective for ARA, but they have yet to be applied at the sentence level. We present the first study that compares neural and hybrid models for sentence-level ARA. We conducted experiments on graded sentences from the Wall Street Journal (WSJ) and on a dataset derived from the OneStopEnglish corpus. Experimental results show that both neural and hybrid models outperform traditional classifiers trained on linguistic features. Hybrid models obtained the best accuracy on both datasets, surpassing the previous best result reported on the WSJ dataset by almost 13% absolute.
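The abstract does not say how the hybrid models combine their two sources of information. As a rough illustration only, one common pattern is to concatenate a neural sentence representation with handcrafted linguistic features before classification. The sketch below assumes that pattern; the feature set, the function names `linguistic_features` and `hybrid_vector`, and the placeholder embedding are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def linguistic_features(sentence: str) -> List[float]:
    """Hypothetical handcrafted features (not the paper's feature set):
    token count, mean token length, and type-token ratio."""
    tokens = sentence.lower().split()
    if not tokens:
        return [0.0, 0.0, 0.0]
    n = len(tokens)
    mean_len = sum(len(t) for t in tokens) / n
    type_token_ratio = len(set(tokens)) / n
    return [float(n), mean_len, type_token_ratio]


def hybrid_vector(neural_embedding: Sequence[float], sentence: str) -> List[float]:
    """Concatenate a neural representation with the linguistic features.
    The embedding is assumed to come from some sentence encoder."""
    return list(neural_embedding) + linguistic_features(sentence)


# Placeholder values standing in for a real encoder's output.
embedding = [0.12, -0.40, 0.88]
features = hybrid_vector(embedding, "The cat sat on the mat")
```

In a setup like this, the combined vector would be passed to a downstream classifier that predicts the sentence's difficulty level; which classifier and which features the authors used is described in the paper itself, not here.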
