面向现实世界数据的理论

W. Martens
{"title":"面向现实世界数据的理论","authors":"W. Martens","doi":"10.1145/3517804.3526066","DOIUrl":null,"url":null,"abstract":"Fundamental research on data manipulation languages is often motivated by the search for balance between desirable properties, such as expressiveness, robustness, compositionality, the existence of efficient algorithms, etc. Real-world data can be helpful for this search in many different respects. Data sets may exhibit common structures that efficient algorithms can exploit. Query logs and schemas can give us an idea of single features that are used very often, or groups of features that are frequently used together. In this sense, they can guide us towards features or fragments of data manipulation languages that are common in practice and may therefore be worthy of deeper study. In other cases, we may even get a glimpse on features that are not well-understood by users, which may inspire us to redesign them or develop tools that increase their ease-of-use. This tutorial aims to provide, first of all, an overview on several practical studies that have been conducted in the areas of tree-structured and graph-structured data, with a focus on cases with strong interaction between analysis of the data and fundamental research. Second, it aims to provide a set of lessons learned after the investigation of some large-scale logs consisting of more than 850 million queries.","PeriodicalId":230606,"journal":{"name":"Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","volume":"172 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Towards Theory for Real-World Data\",\"authors\":\"W. Martens\",\"doi\":\"10.1145/3517804.3526066\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Fundamental research on data manipulation languages is often motivated by the search for balance between desirable properties, such as expressiveness, robustness, compositionality, the existence of efficient algorithms, etc. Real-world data can be helpful for this search in many different respects. Data sets may exhibit common structures that efficient algorithms can exploit. Query logs and schemas can give us an idea of single features that are used very often, or groups of features that are frequently used together. In this sense, they can guide us towards features or fragments of data manipulation languages that are common in practice and may therefore be worthy of deeper study. In other cases, we may even get a glimpse on features that are not well-understood by users, which may inspire us to redesign them or develop tools that increase their ease-of-use. This tutorial aims to provide, first of all, an overview on several practical studies that have been conducted in the areas of tree-structured and graph-structured data, with a focus on cases with strong interaction between analysis of the data and fundamental research. Second, it aims to provide a set of lessons learned after the investigation of some large-scale logs consisting of more than 850 million queries.\",\"PeriodicalId\":230606,\"journal\":{\"name\":\"Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems\",\"volume\":\"172 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3517804.3526066\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3517804.3526066","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

对数据操作语言进行基础研究的动机往往是寻找理想属性之间的平衡,如表达性、鲁棒性、组合性、有效算法的存在等。真实世界的数据可以在许多不同方面对这种搜索有所帮助。数据集可能显示出有效算法可以利用的共同结构。查询日志和模式可以让我们了解经常使用的单个特性,或者经常一起使用的一组特性。从这个意义上说,它们可以引导我们找到在实践中常见的数据操作语言的特征或片段,因此可能值得深入研究。在其他情况下,我们甚至可以看到用户不太理解的特性,这可能会激励我们重新设计它们或开发增加其易用性的工具。本教程的目的是提供,首先,几个已经在树结构和图结构数据领域进行的实际研究的概述,重点是在数据分析和基础研究之间有很强的相互作用的案例。其次,它旨在提供一组在调查了一些包含超过8.5亿个查询的大型日志后得到的经验教训。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Towards Theory for Real-World Data
Fundamental research on data manipulation languages is often motivated by the search for balance between desirable properties, such as expressiveness, robustness, compositionality, the existence of efficient algorithms, etc. Real-world data can be helpful for this search in many different respects. Data sets may exhibit common structures that efficient algorithms can exploit. Query logs and schemas can give us an idea of single features that are used very often, or groups of features that are frequently used together. In this sense, they can guide us towards features or fragments of data manipulation languages that are common in practice and may therefore be worthy of deeper study. In other cases, we may even get a glimpse on features that are not well-understood by users, which may inspire us to redesign them or develop tools that increase their ease-of-use. This tutorial aims to provide, first of all, an overview on several practical studies that have been conducted in the areas of tree-structured and graph-structured data, with a focus on cases with strong interaction between analysis of the data and fundamental research. Second, it aims to provide a set of lessons learned after the investigation of some large-scale logs consisting of more than 850 million queries.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信