An on-the-fly provenance tracking mechanism for stream processing systems

Watsawee Sansrimahachai, L. Moreau, M. Weal
Published in: 2013 IEEE/ACIS 12th International Conference on Computer and Information Science (ICIS)
Publication date: 2013-06-16
DOI: 10.1109/ICIS.2013.6607885 (https://doi.org/10.1109/ICIS.2013.6607885)
Citations: 7

Abstract

Applications that operate over streaming data with high-volume and real-time processing requirements are becoming increasingly important. These applications process streaming data in real time and deliver instantaneous responses to support precise and timely decisions. In such systems, real-time traceability (the ability to verify and investigate the source of a particular output) is extremely important. It allows raw streaming data to be checked and processing steps to be verified and validated in a timely manner. It is therefore crucial that stream systems have a mechanism for dynamically tracking provenance (the process that produced the result data) at execution time, which we refer to as on-the-fly stream provenance tracking. In this paper, we propose a novel on-the-fly provenance tracking mechanism that enables provenance queries to be performed dynamically without requiring provenance assertions to be stored persistently. We demonstrate how the mechanism works by means of an on-the-fly provenance tracking algorithm. The experimental evaluation shows that our provenance solution does not significantly affect the normal processing of stream systems, adding a 7% overhead. Moreover, it offers low-latency processing (0.3 ms per additional component) with reasonable memory consumption.
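
The idea of answering provenance queries at execution time, without a persistent provenance store, can be illustrated with a small Python sketch. This is not the algorithm from the paper, whose details the abstract does not give; it is a simplified illustration that assumes each stream element carries the identifiers of the source events it was derived from, and that every operator propagates those identifiers to its outputs. The names `Element`, `source`, `map_op`, `window_avg`, and `run_demo` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Element:
    """A stream element annotated with the IDs of the source events it derives from."""
    value: float
    sources: FrozenSet[str]


def source(readings: Iterable[Tuple[str, float]]) -> Iterator[Element]:
    """Wrap raw (event_id, value) pairs; each raw event is its own provenance."""
    for event_id, value in readings:
        yield Element(value, frozenset({event_id}))


def map_op(fn: Callable[[float], float], stream: Iterable[Element]) -> Iterator[Element]:
    """One-to-one operator: the output inherits the input's provenance unchanged."""
    for element in stream:
        yield Element(fn(element.value), element.sources)


def window_avg(size: int, stream: Iterable[Element]) -> Iterator[Element]:
    """Sliding-window mean: output provenance is the union over the window."""
    window: deque = deque(maxlen=size)
    for element in stream:
        window.append(element)
        if len(window) == size:
            merged = frozenset().union(*(w.sources for w in window))
            yield Element(sum(w.value for w in window) / size, merged)


def run_demo() -> List[Tuple[float, List[str]]]:
    """Run a two-stage pipeline and return each result with its source event IDs."""
    raw = [("e1", 10.0), ("e2", 20.0), ("e3", 30.0), ("e4", 40.0)]
    pipeline = window_avg(2, map_op(lambda v: v * 2, source(raw)))
    return [(e.value, sorted(e.sources)) for e in pipeline]


if __name__ == "__main__":
    for value, sources in run_demo():
        print(value, sources)
```

In this sketch, a provenance query on an output is just a read of its `sources` field, so nothing is written to storage. The cost is that every element carries extra metadata through each component, which is the kind of per-component overhead the evaluation in the abstract measures.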