narasum:用于抽象叙述摘要的大规模数据集

Chao Zhao, Faeze Brahman, Kaiqiang Song, Wenlin Yao, Dian Yu, Snigdha Chaturvedi
{"title":"narasum:用于抽象叙述摘要的大规模数据集","authors":"Chao Zhao, Faeze Brahman, Kaiqiang Song, Wenlin Yao, Dian Yu, Snigdha Chaturvedi","doi":"10.48550/arXiv.2212.01476","DOIUrl":null,"url":null,"abstract":"Narrative summarization aims to produce a distilled version of a narrative to describe its most salient events and characters. Summarizing a narrative is challenging as it requires an understanding of event causality and character behaviors. To encourage research in this direction, we propose NarraSum, a large-scale narrative summarization dataset. It contains 122K narrative documents, which are collected from plot descriptions of movies and TV episodes with diverse genres, and their corresponding abstractive summaries. Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum. We hope that this dataset will promote future research in summarization, as well as broader studies of natural language understanding and generation. The dataset is available at https://github.com/zhaochaocs/narrasum.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"66 1","pages":"182-197"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization\",\"authors\":\"Chao Zhao, Faeze Brahman, Kaiqiang Song, Wenlin Yao, Dian Yu, Snigdha Chaturvedi\",\"doi\":\"10.48550/arXiv.2212.01476\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Narrative summarization aims to produce a distilled version of a narrative to describe its most salient events and characters. Summarizing a narrative is challenging as it requires an understanding of event causality and character behaviors. To encourage research in this direction, we propose NarraSum, a large-scale narrative summarization dataset. It contains 122K narrative documents, which are collected from plot descriptions of movies and TV episodes with diverse genres, and their corresponding abstractive summaries. Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum. We hope that this dataset will promote future research in summarization, as well as broader studies of natural language understanding and generation. The dataset is available at https://github.com/zhaochaocs/narrasum.\",\"PeriodicalId\":74540,\"journal\":{\"name\":\"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing\",\"volume\":\"66 1\",\"pages\":\"182-197\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2212.01476\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2212.01476","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

叙述摘要的目的是提炼出一篇叙述的精华,以描述其中最突出的事件和人物。总结一个故事是很有挑战性的,因为它需要理解事件的因果关系和角色的行为。为了鼓励这方面的研究,我们提出了一个大型叙事摘要数据集narasum。它包含122K个叙事文件,这些文件收集了不同类型的电影和电视剧集的情节描述及其相应的抽象摘要。实验表明,在narasum上,人类和最先进的总结模型之间存在很大的性能差距。我们希望这个数据集能够促进未来的总结研究,以及更广泛的自然语言理解和生成研究。该数据集可在https://github.com/zhaochaocs/narrasum上获得。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization
Narrative summarization aims to produce a distilled version of a narrative to describe its most salient events and characters. Summarizing a narrative is challenging as it requires an understanding of event causality and character behaviors. To encourage research in this direction, we propose NarraSum, a large-scale narrative summarization dataset. It contains 122K narrative documents, which are collected from plot descriptions of movies and TV episodes with diverse genres, and their corresponding abstractive summaries. Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum. We hope that this dataset will promote future research in summarization, as well as broader studies of natural language understanding and generation. The dataset is available at https://github.com/zhaochaocs/narrasum.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信