Automated Storytelling Evaluation and Story Chain Generation

J. Rigsby, Daniel Barbará
{"title":"自动讲故事评估和故事链生成","authors":"J. Rigsby, Daniel Barbará","doi":"10.1109/ICDMW.2017.15","DOIUrl":null,"url":null,"abstract":"Given a beginning and ending document, automated storytelling attempts to fill in intermediary documents to form a coherent story. This is a common problem for analysts; they often have two snippets of information and want to find the other pieces that relate them. Evaluation of the quality of the created stories is difficult and has routinely involved human judgment. This work extends the state of the art by providing quantitative methods of story quality evaluation which are shown to have good agreement with human judgment. Two methods of automated storytelling evaluation, dispersion and coherence are developed. Dispersion, a measure of story flow, ascertains how well the generated story flows away from the beginning document and towards the ending document. Coherence measures how well the articles in the middle of the story provide information about the relationship of the beginning and ending document pair. Kullback-Leibler divergence (KLD) is used to measure the ability to encode the vocabulary of the beginning and ending story documents using the set of middle documents in the story. The dispersion and coherence methodologies developed here have the added benefit that they do not require parametrization or user inputs and are also easily automated. An automated storytelling algorithm is proposed as a multicriteria optimization problem that maximizes dispersion and coherence simultaneously. The developed storytelling methodologies will allow for the automated identification of information which associates disparate documents in support of literaturebased discovery and link analysis tasking. In addition, the methods provide quantitative measures of the strength of these associations.","PeriodicalId":389183,"journal":{"name":"2017 IEEE International Conference on Data Mining Workshops (ICDMW)","volume":"189 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Automated Storytelling Evaluation and Story Chain Generation\",\"authors\":\"J. Rigsby, Daniel Barbará\",\"doi\":\"10.1109/ICDMW.2017.15\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Given a beginning and ending document, automated storytelling attempts to fill in intermediary documents to form a coherent story. This is a common problem for analysts; they often have two snippets of information and want to find the other pieces that relate them. Evaluation of the quality of the created stories is difficult and has routinely involved human judgment. This work extends the state of the art by providing quantitative methods of story quality evaluation which are shown to have good agreement with human judgment. Two methods of automated storytelling evaluation, dispersion and coherence are developed. Dispersion, a measure of story flow, ascertains how well the generated story flows away from the beginning document and towards the ending document. Coherence measures how well the articles in the middle of the story provide information about the relationship of the beginning and ending document pair. Kullback-Leibler divergence (KLD) is used to measure the ability to encode the vocabulary of the beginning and ending story documents using the set of middle documents in the story. 
The dispersion and coherence methodologies developed here have the added benefit that they do not require parametrization or user inputs and are also easily automated. An automated storytelling algorithm is proposed as a multicriteria optimization problem that maximizes dispersion and coherence simultaneously. The developed storytelling methodologies will allow for the automated identification of information which associates disparate documents in support of literaturebased discovery and link analysis tasking. In addition, the methods provide quantitative measures of the strength of these associations.\",\"PeriodicalId\":389183,\"journal\":{\"name\":\"2017 IEEE International Conference on Data Mining Workshops (ICDMW)\",\"volume\":\"189 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE International Conference on Data Mining Workshops (ICDMW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDMW.2017.15\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Conference on Data Mining Workshops (ICDMW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDMW.2017.15","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2

Abstract

Given a beginning and ending document, automated storytelling attempts to fill in intermediary documents to form a coherent story. This is a common problem for analysts; they often have two snippets of information and want to find the other pieces that relate them. Evaluation of the quality of the created stories is difficult and has routinely involved human judgment. This work extends the state of the art by providing quantitative methods of story quality evaluation which are shown to have good agreement with human judgment. Two methods of automated storytelling evaluation, dispersion and coherence, are developed. Dispersion, a measure of story flow, ascertains how well the generated story flows away from the beginning document and towards the ending document. Coherence measures how well the articles in the middle of the story provide information about the relationship of the beginning and ending document pair. Kullback-Leibler divergence (KLD) is used to measure the ability to encode the vocabulary of the beginning and ending story documents using the set of middle documents in the story. The dispersion and coherence methodologies developed here have the added benefit that they do not require parametrization or user inputs and are also easily automated. An automated storytelling algorithm is proposed as a multicriteria optimization problem that maximizes dispersion and coherence simultaneously. The developed storytelling methodologies will allow for the automated identification of information which associates disparate documents in support of literature-based discovery and link analysis tasking. In addition, the methods provide quantitative measures of the strength of these associations.
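
The abstract does not spell out how the KLD-based coherence score is computed, so the following is only a minimal sketch under stated assumptions: documents are treated as smoothed unigram bag-of-words distributions over a shared vocabulary, and coherence is scored as the negative KL divergence from the combined beginning/ending distribution to the distribution of the middle documents, so that a lower divergence (i.e., the middle documents encode the endpoint vocabulary well) yields a higher score. The function names and toy documents are illustrative and not taken from the paper.

# Sketch of a KLD-based coherence score for a story chain (assumptions noted above).
from collections import Counter
import math

def unigram_distribution(docs, vocab):
    """Add-one-smoothed unigram distribution over a fixed vocabulary (assumed model)."""
    counts = Counter(token for doc in docs for token in doc.lower().split())
    total = sum(counts[w] + 1 for w in vocab)
    return {w: (counts[w] + 1) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) in nats; p and q share the same vocabulary and are strictly positive after smoothing."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

def coherence(begin_doc, end_doc, middle_docs):
    """Higher is better: the middle documents describe the begin/end vocabulary well."""
    vocab = set((begin_doc + " " + end_doc + " " + " ".join(middle_docs)).lower().split())
    endpoints = unigram_distribution([begin_doc, end_doc], vocab)
    middle = unigram_distribution(middle_docs, vocab)
    return -kl_divergence(endpoints, middle)

if __name__ == "__main__":
    begin = "bank announces merger with regional lender"
    end = "regulators approve merger after antitrust review"
    chain = ["shareholders vote on the proposed bank merger",
             "antitrust regulators open review of the deal"]
    print(f"coherence = {coherence(begin, end, chain):.4f}")

In the paper's framing, a score of this kind would serve as one of the two objectives, alongside dispersion, in the multicriteria optimization over candidate sets of middle documents.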