SWING: Balancing Coverage and Faithfulness for Dialogue Summarization

Findings (Sydney (N.S.W.)) | Pub Date: 2023-01-25 | DOI: 10.48550/arXiv.2301.10483

Kung-Hsiang Huang, Siffi Singh, Xiaofei Ma, Wei Xiao, Nicholas Dingwall, William Yang Wang, K. McKeown

Citations: 4

Abstract

Missing information is a common issue in dialogue summarization: some information in the reference summaries is not covered by the generated summaries. To address this issue, we propose using natural language inference (NLI) models to improve coverage while avoiding the introduction of factual inconsistencies. Specifically, we use NLI to compute fine-grained training signals that encourage the model to generate content in the reference summaries that has not been covered, and to distinguish between factually consistent and inconsistent generated sentences. Experiments on the DialogSum and SAMSum datasets confirm the effectiveness of the proposed approach in balancing coverage and faithfulness, as validated by automatic metrics and human evaluations. Additionally, we compute the correlation between commonly used automatic metrics and human judgments along three dimensions of coverage and factual consistency, to provide insight into the most suitable metric for evaluating dialogue summaries.
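The sentence-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the choice of the reference summary as the premise for the consistency check, and the `toy_entailment` scorer (a token-overlap stand-in for a trained NLI model) are all assumptions made for illustration. The sketch only shows how entailment scores could flag reference sentences that the generated summary fails to cover and generated sentences that the reference does not support.

```python
from typing import Callable, List

# Hypothetical interface: returns an entailment score in [0, 1]
# for a (premise, hypothesis) pair.
EntailFn = Callable[[str, str], float]


def toy_entailment(premise: str, hypothesis: str) -> float:
    """Placeholder scorer (assumption): fraction of hypothesis tokens
    that also appear in the premise. A real setup would call a trained
    NLI model here instead."""
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    hits = sum(tok in premise_tokens for tok in hypothesis_tokens)
    return hits / len(hypothesis_tokens)


def uncovered_reference_sentences(
    reference: List[str],
    generated: List[str],
    entail: EntailFn = toy_entailment,
    threshold: float = 0.5,
) -> List[str]:
    """Reference sentences that no generated sentence entails (missing content)."""
    missing = []
    for ref in reference:
        best = max((entail(gen, ref) for gen in generated), default=0.0)
        if best < threshold:
            missing.append(ref)
    return missing


def unsupported_generated_sentences(
    reference: List[str],
    generated: List[str],
    entail: EntailFn = toy_entailment,
    threshold: float = 0.5,
) -> List[str]:
    """Generated sentences that no reference sentence entails
    (candidates for factual inconsistency, under the stated assumption)."""
    unsupported = []
    for gen in generated:
        best = max((entail(ref, gen) for ref in reference), default=0.0)
        if best < threshold:
            unsupported.append(gen)
    return unsupported


# Made-up example data, not taken from DialogSum or SAMSum.
reference = ["Amanda baked cookies.", "Jerry will get some tomorrow."]
generated = ["Amanda baked cookies.", "Jerry hates cookies."]

print(uncovered_reference_sentences(reference, generated))
print(unsupported_generated_sentences(reference, generated))
```

In a training setting, the two lists would feed separate signals: the first pushes the model toward content it left out, the second marks sentences to treat as negative examples. Swapping `toy_entailment` for a real NLI model only requires a function with the same `(premise, hypothesis) -> float` shape.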

