A compression-based framework for the efficient analysis of business process logs

Bettina Fazzinga, S. Flesca, F. Furfaro, E. Masciari, L. Pontieri
Published in: Proceedings of the 27th International Conference on Scientific and Statistical Database Management, 2015-06-29
DOI: 10.1145/2791347.2791351 (https://doi.org/10.1145/2791347.2791351)
Citations: 4

Abstract

The increasing availability of large process log repositories calls for efficient solutions for their analysis. To this end, a novel specialized compression technique for process logs is proposed, which builds a synopsis supporting fast estimation of aggregate queries; such queries are of crucial importance in exploratory and high-level analysis tasks. The synopsis is constructed by progressively merging the original log tuples, each representing a single activity execution within a process instance, into aggregate tuples that summarize sets of activity executions. The compression strategy is guided by a heuristic that aims to limit the loss of information caused by summarization, while guaranteeing that no information is lost on the set of activities performed within the process instances or on the order among their executions. The selection conditions in an aggregate query are specified as a graph pattern, which allows precedence relationships over activity executions to be expressed, along with conditions on their starting times, durations, and executors. The efficacy of the compression technique, in terms of both its capability of reducing the size of the log and the accuracy of the estimates retrieved from the synopsis, has been experimentally validated.
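To make the idea of merging log tuples into aggregate tuples more concrete, here is a minimal toy sketch in Python. It is not the authors' algorithm: the merge rule (collapse consecutive executions of the same activity within one process instance), the `AggTuple` fields, the uniform-spread estimator, and all names are assumptions chosen for illustration. It also handles only a single activity and a start-time range, not the graph patterns described in the abstract.

```python
from dataclasses import dataclass


@dataclass
class AggTuple:
    """Hypothetical aggregate tuple summarizing one or more activity executions."""
    instance: str          # process instance identifier
    activity: str          # activity name
    count: int             # number of executions summarized
    start_min: float       # earliest start time among them
    start_max: float       # latest start time among them
    total_duration: float  # sum of their durations


def merge_adjacent(tuples):
    """Merge consecutive tuples with the same instance and activity.

    Assumed merge rule: because only runs of the same activity are collapsed,
    the set of activities per instance and the order among distinct
    activities stay intact.
    """
    out = []
    for t in tuples:
        last = out[-1] if out else None
        if last and last.instance == t.instance and last.activity == t.activity:
            out[-1] = AggTuple(
                last.instance, last.activity, last.count + t.count,
                min(last.start_min, t.start_min),
                max(last.start_max, t.start_max),
                last.total_duration + t.total_duration,
            )
        else:
            out.append(t)
    return out


def estimate_count(synopsis, activity, t_lo, t_hi):
    """Estimate how many executions of `activity` start within [t_lo, t_hi].

    Assumes start times are spread uniformly inside each aggregate tuple.
    """
    est = 0.0
    for t in synopsis:
        if t.activity != activity:
            continue
        if t.start_min == t.start_max:
            # All summarized executions share one start time.
            if t_lo <= t.start_min <= t_hi:
                est += t.count
            continue
        overlap = min(t_hi, t.start_max) - max(t_lo, t.start_min)
        if overlap > 0:
            est += t.count * overlap / (t.start_max - t.start_min)
    return est


# Made-up log: (instance, activity, start time, duration), sorted by instance and start.
raw_log = [
    ("p1", "A", 0.0, 2.0),
    ("p1", "A", 4.0, 2.0),
    ("p1", "B", 10.0, 1.0),
    ("p2", "A", 1.0, 1.0),
    ("p2", "B", 3.0, 2.0),
]
log = [AggTuple(i, a, 1, s, s, d) for (i, a, s, d) in raw_log]
synopsis = merge_adjacent(log)
estimated = estimate_count(synopsis, "A", 0.0, 2.0)
```

In this toy log the five original tuples shrink to four, and the estimate for executions of `A` starting in the range 0 to 2 can be compared with the exact count obtained from `raw_log`. A real heuristic would also have to decide which merges to perform under a size budget, which this sketch does not attempt.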