Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V (Technical Report)

Roee Shraga, Renée J. Miller
{"title":"Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-V (Technical Report)","authors":"Roee Shraga, Renée J. Miller","doi":"10.48550/arXiv.2301.13095","DOIUrl":null,"url":null,"abstract":"\n In multi-user environments in which data science and analysis is collaborative, multiple versions of the same datasets are generated. While managing and storing data versions has received some attention in the research literature, the semantic nature of such changes has remained under-explored. In this work, we introduce Explain-Da-V, a framework aiming to explain changes between two given dataset versions. Explain-Da-V generates\n explanations\n that use\n data transformations\n to explain changes. We further introduce a set of measures that evaluate the validity, generalizability, and explainability of these explanations. We empirically show, using an adapted existing benchmark and a newly created benchmark, that Explain-Da-V generates better explanations than existing data transformation synthesis methods.\n","PeriodicalId":20467,"journal":{"name":"Proc. VLDB Endow.","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proc. VLDB Endow.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2301.13095","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

In multi-user environments in which data science and analysis is collaborative, multiple versions of the same datasets are generated. While managing and storing data versions has received some attention in the research literature, the semantic nature of such changes has remained under-explored. In this work, we introduce Explain-Da-V, a framework aiming to explain changes between two given dataset versions. Explain-Da-V generates explanations that use data transformations to explain changes. We further introduce a set of measures that evaluate the validity, generalizability, and explainability of these explanations. We empirically show, using an adapted existing benchmark and a newly created benchmark, that Explain-Da-V generates better explanations than existing data transformation synthesis methods.
使用Explain-Da-V解释语义数据版本控制的数据集更改(技术报告)
在数据科学和分析是协作的多用户环境中,会生成相同数据集的多个版本。虽然管理和存储数据版本在研究文献中受到了一些关注,但这种变化的语义性质仍未得到充分探索。在这项工作中,我们引入了explain - da - v框架,旨在解释两个给定数据集版本之间的变化。explain - da - v生成使用数据转换来解释更改的解释。我们进一步介绍了一套评估这些解释的有效性、概括性和可解释性的措施。我们通过经验证明,使用一个改编的现有基准和一个新创建的基准,Explain-Da-V生成的解释比现有的数据转换综合方法更好。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
文献相关原料
公司名称 产品信息 采购帮参考价格
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信