Query-Biased Summaries for Tabular Data

Proceedings of the 21st Australasian Document Computing Symposium. Pub Date: 2016-12-05. DOI: 10.1145/3015022.3015027

Vincent Au, Paul Thomas, Gaya K. Jayasinghe

Citations: 2

Abstract

Government, research, and academic data portals publish a large amount of public data, but present tools make discovery difficult. In particular, search results do not support a user's decision whether or not to commit to a download of what might be a large data set. We describe a method for producing query-biased summaries of tabular data, which aims to support a user's download decision, or even to answer the question on the spot, with no further interaction. The method infers simple types in the data and query; automatically refines queries, where that makes sense; extracts relevant subsets of the complete table; and generates both graphical and tabular summaries of what remains. A small-scale user study suggests this both helps users identify useful results (fewer false negatives) and reduces wasted downloads (fewer false positives).
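The abstract names the steps of the method but gives no detail, so the following is only a minimal illustrative sketch of two of them: inferring simple types and extracting a query-relevant subset of rows. It is not the authors' implementation. The three-way type scheme (`year`, `number`, `text`), the majority vote per column, the exact-match rule, and every function name here are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately crude type scheme; the paper's actual types are not given here.
YEAR = re.compile(r"(1[0-9]|20)\d\d")
NUMBER = re.compile(r"-?\d+(\.\d+)?")


def infer_type(value):
    """Classify one raw string as 'year', 'number', or 'text'."""
    text = value.strip()
    if YEAR.fullmatch(text):
        return "year"
    if NUMBER.fullmatch(text):
        return "number"
    return "text"


def column_types(rows):
    """Assign each column the most common type among its cells (assumed rule)."""
    if not rows:
        return {}
    types = {}
    for name in rows[0]:
        votes = Counter(infer_type(row[name]) for row in rows)
        types[name] = votes.most_common(1)[0][0]
    return types


def matches(row, term, types):
    """True if the term equals a cell in a column whose type matches the term's type."""
    term_type = infer_type(term)
    return any(
        types[col] == term_type and cell.strip().lower() == term
        for col, cell in row.items()
    )


def query_biased_subset(rows, query):
    """Keep rows that satisfy every query term that matches some cell in the table.

    Terms that match nothing (for example a topic word) are ignored rather than
    filtering everything out; this is an assumption, not the paper's rule.
    """
    types = column_types(rows)
    terms = query.lower().split()
    constraining = [t for t in terms if any(matches(r, t, types) for r in rows)]
    return [r for r in rows if all(matches(r, t, types) for t in constraining)]


# Made-up example data, for illustration only.
example_rows = [
    {"state": "NSW", "year": "2014", "rainfall_mm": "512.3"},
    {"state": "NSW", "year": "2015", "rainfall_mm": "498.0"},
    {"state": "VIC", "year": "2015", "rainfall_mm": "601.7"},
]
print(query_biased_subset(example_rows, "rainfall nsw 2015"))
```

With the made-up rows above, the query "rainfall nsw 2015" keeps only the NSW row for 2015: "nsw" is matched against the text column, "2015" against the year column, and "rainfall" matches no cell and is ignored. A summary built from that subset would be far smaller than the full table, which is the property the abstract relies on to support a download decision.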

