Rule-guided process discovery
Authors: Ali Norouzifar, Marcus Dees, Wil van der Aalst
Journal: Data & Knowledge Engineering, volume 161, Article 102508 (Impact Factor 2.7; JCR Q3, Computer Science, Artificial Intelligence)
Published: 2025-09-02
DOI: 10.1016/j.datak.2025.102508
URL: https://www.sciencedirect.com/science/article/pii/S0169023X2500103X
Citations: 0
Abstract
Event data extracted from information systems serves as the foundation for process mining, enabling the extraction of insights and identification of improvements. Process discovery focuses on deriving descriptive process models from event logs, which form the basis for conformance checking, performance analysis, and other applications. Traditional process discovery techniques predominantly rely on event logs, often overlooking supplementary information such as domain knowledge and process rules. These rules, which define relationships between activities, can be obtained through automated techniques like declarative process discovery or provided by domain experts based on process specifications. When used as an additional input alongside event logs, such rules have significant potential to guide process discovery. However, leveraging rules to discover high-quality imperative process models, such as BPMN models and Petri nets, remains an underexplored area in the literature. To address this gap, we propose an enhanced framework, IMr, which integrates discovered or user-defined rules into the process discovery workflow via a novel recursive approach. The IMr framework employs a divide-and-conquer strategy, using rules to guide the selection of process structures at each recursion step in combination with the input event log. We evaluate our approach on several real-world event logs and demonstrate that the discovered models better align with the provided rules without compromising their conformance to the event log. Additionally, we show that high-quality rules can improve model quality across well-known conformance metrics. This work highlights the importance of integrating domain knowledge into process discovery, enhancing the quality, interpretability, and applicability of the resulting process models.
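The idea of letting rules guide the choice of process structure at a recursion step can be illustrated with a toy sketch. This is not the authors' IMr implementation: the log, the single `precedence` rule, the restriction to sequence cuts, the cost function, and every function name below are assumptions made for illustration. The sketch enumerates candidate sequence cuts, discards those that contradict a rule, and keeps the remaining cut with the least deviating behaviour in the log.

```python
from itertools import combinations

# Hypothetical toy log: each trace is a sequence of activity labels.
LOG = [("b", "c", "a")] * 3 + [("a", "c", "b")] * 2

# Hypothetical Declare-style rule: precedence(x, y) means y may only occur after x.
RULES = [("precedence", "a", "b")]


def directly_follows(log):
    """Count how often activity x is directly followed by activity y."""
    counts = {}
    for trace in log:
        for x, y in zip(trace, trace[1:]):
            counts[(x, y)] = counts.get((x, y), 0) + 1
    return counts


def candidate_sequence_cuts(activities):
    """Enumerate every split of the activities into non-empty (left, right) parts."""
    acts = sorted(activities)
    for size in range(1, len(acts)):
        for left in combinations(acts, size):
            yield frozenset(left), frozenset(acts) - frozenset(left)


def log_cost(cut, dfg):
    """Deviating behaviour: directly-follows pairs running from right back to left."""
    left, right = cut
    return sum(n for (x, y), n in dfg.items() if x in right and y in left)


def violates(cut, rule):
    """A sequence cut (left, right) contradicts precedence(x, y) if y precedes x."""
    kind, x, y = rule
    left, right = cut
    return kind == "precedence" and y in left and x in right


def select_cut(log, rules):
    """Pick the cheapest cut among those allowed by the rules (assumes one exists)."""
    dfg = directly_follows(log)
    activities = {a for trace in log for a in trace}
    allowed = [
        cut
        for cut in candidate_sequence_cuts(activities)
        if not any(violates(cut, rule) for rule in rules)
    ]
    # Ties are broken by the sorted left part so the result is deterministic.
    return min(allowed, key=lambda cut: (log_cost(cut, dfg), sorted(cut[0])))


if __name__ == "__main__":
    print("log only:  ", select_cut(LOG, []))
    print("with rules:", select_cut(LOG, RULES))
```

On this toy log the log-only choice places `b` first because that ordering is more frequent, while the rule-guided choice places `a` first at a slightly higher log cost. A full recursive discovery would then split the log along the selected cut and repeat the selection on each sublog.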
About the journal:
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.