JetStream: Cluster-Scale Parallelization of Information Flow Queries

Andrew Quinn, David Devecsery, Peter M. Chen, J. Flinn
{"title":"JetStream: Cluster-Scale Parallelization of Information Flow Queries","authors":"Andrew Quinn, David Devecsery, Peter M. Chen, J. Flinn","doi":"10.5555/3026877.3026912","DOIUrl":null,"url":null,"abstract":"Dynamic information flow tracking (DIFT) is an important tool in many domains, such as security, debugging, forensics, provenance, configuration troubleshooting, and privacy tracking. However, the usability of DIFT is currently limited by its high overhead; complex information flow queries can take up to two orders of magnitude longer to execute than the original execution of the program. This precludes interactive uses in which users iteratively refine queries to narrow down bugs, leaks of private data, or performance anomalies.JetStream applies cluster computing to parallelize and accelerate information flow queries over past executions. It uses deterministic record and replay to time slice executions into distinct contiguous chunks of execution called epochs, and it tracks information flow for each epoch on a separate core in the cluster. It structures the aggregation of information flow data from each epoch as a streaming computation. Epochs are arranged in a sequential chain from the beginning to the end of program execution; relationships to program inputs (sources) are streamed forward along the chain, and relationships to program outputs (sinks) are streamed backward. Jet-Stream is the first system to parallelize DIFT across a cluster. Our results show that JetStream queries scale to at least 128 cores over a wide range of applications. JetStream accelerates DIFT queries to run 12-48 times faster than sequential queries; in most cases, queries run faster than the original execution of the program.","PeriodicalId":90294,"journal":{"name":"Proceedings of the -- USENIX Symposium on Operating Systems Design and Implementation (OSDI). USENIX Symposium on Operating Systems Design and Implementation","volume":"1 1","pages":"451-466"},"PeriodicalIF":0.0000,"publicationDate":"2016-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"25","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the -- USENIX Symposium on Operating Systems Design and Implementation (OSDI). USENIX Symposium on Operating Systems Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5555/3026877.3026912","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 25

Abstract

Dynamic information flow tracking (DIFT) is an important tool in many domains, such as security, debugging, forensics, provenance, configuration troubleshooting, and privacy tracking. However, the usability of DIFT is currently limited by its high overhead; complex information flow queries can take up to two orders of magnitude longer to execute than the original execution of the program. This precludes interactive uses in which users iteratively refine queries to narrow down bugs, leaks of private data, or performance anomalies.JetStream applies cluster computing to parallelize and accelerate information flow queries over past executions. It uses deterministic record and replay to time slice executions into distinct contiguous chunks of execution called epochs, and it tracks information flow for each epoch on a separate core in the cluster. It structures the aggregation of information flow data from each epoch as a streaming computation. Epochs are arranged in a sequential chain from the beginning to the end of program execution; relationships to program inputs (sources) are streamed forward along the chain, and relationships to program outputs (sinks) are streamed backward. Jet-Stream is the first system to parallelize DIFT across a cluster. Our results show that JetStream queries scale to at least 128 cores over a wide range of applications. JetStream accelerates DIFT queries to run 12-48 times faster than sequential queries; in most cases, queries run faster than the original execution of the program.
JetStream:信息流查询的集群级并行化
动态信息流跟踪(DIFT)是许多领域中的重要工具,例如安全、调试、取证、来源、配置故障排除和隐私跟踪。然而,DIFT的可用性目前受到其高开销的限制;复杂信息流查询的执行时间可能比程序的原始执行时间长两个数量级。这就排除了交互使用,在交互使用中,用户迭代地改进查询以缩小错误、私有数据泄漏或性能异常。JetStream应用集群计算来并行化和加速过去执行的信息流查询。它使用确定性记录和重放将执行时间切片为不同的连续执行块(称为epoch),并在集群中单独的核心上跟踪每个epoch的信息流。它将来自每个epoch的信息流数据聚合为流计算。epoch从程序执行的开始到结束排列在一个顺序链中;与程序输入(源)的关系沿着链向前流,而与程序输出(接收)的关系向后流。Jet-Stream是第一个跨集群并行DIFT的系统。我们的结果表明,JetStream查询在广泛的应用中至少可以扩展到128核。JetStream加速DIFT查询,运行速度比顺序查询快12-48倍;在大多数情况下,查询的运行速度比程序的原始执行要快。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信