A ground truth approach for assessing process mining techniques.

Process science. Pub Date: 2025-01-01. Epub Date: 2025-03-20. DOI: 10.1007/s44311-025-00006-8
Dominique Sommers, Natalia Sidorova, Boudewijn van Dongen
Process science, vol. 2, no. 1, p. 1 (2025). Journal Article. DOI link: https://doi.org/10.1007/s44311-025-00006-8. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11934509/pdf/

Abstract

The assessment of process mining techniques using real-life data is often compromised by the lack of ground truth knowledge, the presence of non-essential outliers in system behavior and recording errors in event logs. Using synthetically generated data could leverage ground truth for better evaluation. Existing log generation tools inject noise directly into the logs, which does not capture many typical behavioral deviations. Furthermore, the link between the model and the log, which is needed for later assessment, becomes lost. We propose a ground-truth approach for generating process data from existing or synthetic initial process models, whether automatically generated or hand-made. This approach incorporates patterns of behavioral deviations and recording errors to produce a synthetic yet realistic deviating model and imperfect event log. These, together with the initial model, are required to assess process mining techniques based on ground truth knowledge. We demonstrate this approach to create datasets of synthetic process data for three processes, one of which we used in a conformance checking use case, focusing on the assessment of (relaxed) systemic alignments to expose and explain deviations in modeled and recorded behavior. Our results show that this approach, unlike traditional methods, provides detailed insights into the strengths and weaknesses of process mining techniques, both quantitatively and qualitatively.
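The contrast drawn in the abstract, between injecting noise directly into a log and deriving a deviating model while keeping the link to the initial model, can be illustrated with a toy sketch. This is not the authors' implementation: the sequential "model" (a plain list of activities), the single "skip" deviation pattern, the single "drop event" recording error, and all function names and probabilities below are assumptions made only for illustration. What the sketch tries to show is that every difference between the initial model and the recorded trace is labeled at generation time, so a technique under evaluation can later be scored against those labels.

```python
import random


def apply_skip_deviation(model, skipped):
    """Behavioral deviation (hypothetical pattern): the deviating model omits one activity."""
    return [activity for activity in model if activity != skipped]


def generate_case(initial_model, rng, p_deviate=0.3, p_drop=0.1):
    """Generate one recorded trace plus ground-truth labels for every discrepancy.

    Assumes initial_model is a list of distinct activity names executed in order.
    """
    ground_truth = []
    model = initial_model

    # Step 1: deviation in system behavior, introduced at the model level.
    if rng.random() < p_deviate:
        skipped = rng.choice(initial_model)
        model = apply_skip_deviation(initial_model, skipped)
        ground_truth.append(("skipped_in_behavior", skipped))

    # Step 2: play out the (possibly deviating) model, then apply recording errors.
    recorded = []
    for activity in model:
        if rng.random() < p_drop:
            # The activity happened but never reached the event log.
            ground_truth.append(("not_recorded", activity))
        else:
            recorded.append(activity)

    return recorded, ground_truth


if __name__ == "__main__":
    rng = random.Random(42)
    initial_model = ["register", "check", "decide", "notify"]
    for case_id in range(5):
        trace, truth = generate_case(initial_model, rng)
        print(case_id, trace, truth)
```

Because the labels distinguish an activity that was never executed from one that was executed but not recorded, an evaluation can check whether a conformance checking technique attributes each discrepancy to the right cause, which log-level noise injection alone would not allow.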
