Demonstrating NaturalMiner: Searching Large Data Sets for Abstract Patterns Described in Natural Language

Immanuel Trummer
Published in: Companion of the 2023 International Conference on Management of Data
Publication date: 2023-06-04
DOI: 10.1145/3555041.3589694 (https://doi.org/10.1145/3555041.3589694)
Citation count: 1

Abstract

The NaturalMiner system extracts facts from large relational data sets that match abstract patterns defined in natural language. For instance, users can search, with regard to a specific airline, for evidence that "the airline underperforms" or "the airline outperforms" within a data set containing flight statistics, hinting at areas for improvement or strengths to advertise. Internally, NaturalMiner iteratively generates statistical facts from data by processing SQL queries, using a reinforcement learning approach to select which facts to generate. It uses pre-trained language models to score candidate facts with regard to user-specified search patterns, returning the fact combination with the maximal score once a user-specified time budget has elapsed. To deal with large data sets, NaturalMiner features customized caching and sampling strategies. The proposed demonstration will showcase searches for different patterns described in natural language, covering different data sets and scenarios.
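
The loop described in the abstract (generate a fact with a SQL query, score it against the pattern, choose the next fact by learning from earlier scores, stop when the time budget runs out) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not NaturalMiner's implementation: the `flights` table and its numbers are made up, `score_fact` is a keyword-based stand-in for a pre-trained language model, an epsilon-greedy bandit stands in for the reinforcement learning approach, a plain dictionary stands in for the customized caching, sampling is omitted, and the sketch returns a single best fact rather than a combination of facts.

```python
import random
import sqlite3
import time


def build_demo_db():
    """Create a tiny in-memory flights table (hypothetical, made-up numbers)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE flights (airline TEXT, season TEXT, delay REAL)")
    rows = [
        ("A", "winter", 30.0), ("A", "summer", 20.0),
        ("B", "winter", 10.0), ("B", "summer", 12.0),
    ]
    conn.executemany("INSERT INTO flights VALUES (?, ?, ?)", rows)
    return conn


def generate_fact(conn, airline, season, cache):
    """Run one SQL aggregate and phrase the result as a sentence.

    Assumes the airline has at least one flight in the given season.
    """
    key = (airline, season)
    if key not in cache:  # simple result cache: each query runs at most once
        cache[key] = conn.execute(
            "SELECT AVG(CASE WHEN airline = ? THEN delay END), AVG(delay) "
            "FROM flights WHERE season = ?",
            (airline, season),
        ).fetchone()
    own, overall = cache[key]
    relation = "above" if own > overall else "at or below"
    return (f"In {season}, the average delay of airline {airline} "
            f"is {relation} the overall average.")


def score_fact(fact, pattern):
    """Stand-in scorer (assumption): a real system would ask a pre-trained
    language model how well the fact supports the pattern."""
    if "underperforms" in pattern:
        return 1.0 if " above " in fact else 0.0
    return 1.0 if "at or below" in fact else 0.0


def search(conn, airline, pattern, seasons, budget_s=0.05, epsilon=0.2, seed=0):
    """Epsilon-greedy search over fact candidates within a time budget."""
    rng = random.Random(seed)
    cache = {}
    totals = {s: 0.0 for s in seasons}  # summed reward per candidate
    counts = {s: 0 for s in seasons}    # number of times each was tried
    best_fact, best_score = None, float("-inf")
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        untried = [s for s in seasons if counts[s] == 0]
        if untried:
            season = untried[0]          # try every candidate once
        elif rng.random() < epsilon:
            season = rng.choice(seasons)  # explore
        else:
            season = max(seasons, key=lambda s: totals[s] / counts[s])  # exploit
        fact = generate_fact(conn, airline, season, cache)
        reward = score_fact(fact, pattern)
        counts[season] += 1
        totals[season] += reward
        if reward > best_score:
            best_fact, best_score = fact, reward
    return best_fact, best_score


if __name__ == "__main__":
    db = build_demo_db()
    print(search(db, "A", "the airline underperforms", ["winter", "summer"]))
```

In this toy setting the candidate space is just the list of seasons, so the bandit has little to learn; the point is only to show how query generation, scoring, and the time budget fit together in one loop.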