Query-Biased Summaries for Tabular Data

Proceedings of the 21st Australasian Document Computing Symposium. Pub Date: 2016-12-05. DOI: 10.1145/3015022.3015027

Vincent Au, Paul Thomas, Gaya K. Jayasinghe


Citations: 2

Abstract

Government, research, and academic data portals publish a large amount of public data, but present tools make discovery difficult. In particular, search results do not support a user's decision whether or not to commit to a download of what might be a large data set. We describe a method for producing query-biased summaries of tabular data, which aims to support a user's download decision, or even to answer the question on the spot, with no further interaction. The method infers simple types in the data and query; automatically refines queries, where that makes sense; extracts relevant subsets of the complete table; and generates both graphical and tabular summaries of what remains. A small-scale user study suggests this both helps users identify useful results (fewer false negatives) and reduces wasted downloads (fewer false positives).
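The abstract gives no implementation detail, so the following is only a minimal sketch of what "infer simple types in the data and query, then extract a relevant subset" could look like. Everything in it is an assumption made for illustration: the three-way type set (`year`, `number`, `text`), the function names, the exact-match rule, and the toy table. It does not reproduce the authors' method, and it omits query refinement and the graphical summaries entirely.

```python
import re

def infer_type(value):
    """Return 'year', 'number', or 'text' for one string (a made-up type set)."""
    v = value.strip()
    if re.fullmatch(r"(19|20)\d{2}", v):
        return "year"
    try:
        float(v.replace(",", ""))
        return "number"
    except ValueError:
        return "text"

def column_types(header, rows):
    """Type a column only if all its non-empty cells agree; otherwise 'text'."""
    types = {}
    for i, name in enumerate(header):
        seen = {infer_type(row[i]) for row in rows if row[i].strip()}
        types[name] = seen.pop() if len(seen) == 1 else "text"
    return types

def query_biased_subset(header, rows, query):
    """Return the rows and columns that the query appears to ask about."""
    types = column_types(header, rows)
    names = [h.lower() for h in header]
    terms = query.lower().split()

    # Columns named in the query are shown in the summary.
    wanted = {i for i, n in enumerate(names) if n in terms}

    # Every other term becomes a filter on the columns sharing its type.
    filters = []
    for term in terms:
        if term in names:
            continue
        cols = [i for i, h in enumerate(header) if types[h] == infer_type(term)]
        filters.append((term, cols))
        wanted.update(cols)

    def matches(row):
        # Hypothetical rule: each filter term must equal some cell of its type.
        # A term that matches nothing (e.g. a stopword) therefore excludes every row.
        return all(
            any(row[i].strip().lower() == term for i in cols)
            for term, cols in filters
        )

    shown = sorted(wanted) or list(range(len(header)))
    return [{header[i]: row[i] for i in shown} for row in rows if matches(row)]

# Toy table with invented values, used only to exercise the sketch.
HEADER = ["state", "year", "population"]
ROWS = [
    ["Victoria", "2014", "100"],
    ["Victoria", "2015", "110"],
    ["Queensland", "2015", "90"],
]

if __name__ == "__main__":
    print(query_biased_subset(HEADER, ROWS, "population victoria 2015"))
```

In this sketch the query term `2015` is typed as a year and compared only against the year column, while `victoria` is compared only against text columns, so the single matching row is returned instead of the whole table.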

