Improving Automatic Quotation Attribution in Literary Novels

Annual Meeting of the Association for Computational Linguistics. Pub Date: 2023-07-07. DOI: 10.48550/arXiv.2307.03734

Krishnapriya Vishnubhotla, Frank Rudzicz, Graeme Hirst, Adam Hammond


Abstract

Current models for quotation attribution in literary novels assume varying levels of available information in their training and test data, which poses a challenge for in-the-wild inference. Here, we approach quotation attribution as a set of four interconnected sub-tasks: character identification, coreference resolution, quotation identification, and speaker attribution. We benchmark state-of-the-art models on each of these sub-tasks independently, using a large dataset of annotated coreferences and quotations in literary novels (the Project Dialogism Novel Corpus). We also train and evaluate models for the speaker attribution task in particular, showing that a simple sequential prediction model achieves accuracy scores on par with state-of-the-art models.
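The four sub-tasks named above can be pictured as stages of one pipeline. The following is a minimal sketch only, not the authors' models: every function name, the regex-based quotation detection, and the nearest-mention attribution rule are hypothetical stand-ins chosen for illustration, and the coreference stage is reduced to matching explicit names.

```python
import re
from dataclasses import dataclass

# Toy stand-ins for the four sub-tasks. All names and heuristics here are
# illustrative assumptions, not the systems benchmarked in the paper.


@dataclass
class Quotation:
    text: str   # content between the quote marks
    start: int  # offset of the opening quote mark
    end: int    # offset just past the closing quote mark


def identify_characters(text, known_names):
    """Character identification (toy): keep known names that occur in the text."""
    return [name for name in known_names if name in text]


def resolve_mentions(text, characters):
    """Coreference resolution (toy): locate explicit name mentions only.

    Pronouns and aliases are ignored, which a real resolver would handle.
    """
    mentions = []
    for name in characters:
        for match in re.finditer(re.escape(name), text):
            mentions.append((match.start(), name))
    return sorted(mentions)


def identify_quotations(text):
    """Quotation identification (toy): spans between straight double quotes."""
    return [Quotation(m.group(1), m.start(), m.end())
            for m in re.finditer(r'"([^"]*)"', text)]


def attribute_speakers(quotations, mentions):
    """Speaker attribution (toy): choose the nearest mention outside the quote."""
    result = []
    for quote in quotations:
        best_name, best_dist = None, None
        for pos, name in mentions:
            if quote.start <= pos < quote.end:
                continue  # skip names that appear inside the quotation itself
            dist = quote.start - pos if pos < quote.start else pos - quote.end
            if best_dist is None or dist < best_dist:
                best_name, best_dist = name, dist
        result.append((quote.text, best_name))
    return result


def attribute(text, known_names):
    """Run the four stages in sequence and return (quotation, speaker) pairs."""
    characters = identify_characters(text, known_names)
    mentions = resolve_mentions(text, characters)
    quotations = identify_quotations(text)
    return attribute_speakers(quotations, mentions)
```

Calling `attribute(passage, ["Elizabeth", "Darcy"])` on a short passage returns one `(quotation, speaker)` pair per quoted span. Because this sketch never links a pronoun such as "he" to a character, it depends entirely on a name appearing near the quotation, which illustrates why the quality of the coreference stage affects speaker attribution.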


