Andreas Egger, Tobias Fehrer, Wolfgang Kratsch, Niklas Wördehoff, Fabian König, Maximilian Röglinger
Refining the process picture: Unstructured data in object-centric process mining
Information Systems, Volume 134, Article 102582. Published 2025-07-08. DOI: 10.1016/j.is.2025.102582
https://www.sciencedirect.com/science/article/pii/S0306437925000663
Impact factor: 3.4. JCR: Q2 (Computer Science, Information Systems).
Citations: 0
Abstract
Process mining aims to discover, monitor, and improve processes. To this end, process mining techniques use event data, typically extracted from information systems and organized along process instances. The inherent complexity of real-world processes has driven the recent introduction of object-centric process mining, allowing for a more comprehensive view of processes. Another avenue of research contributing to more complete process analyses is integrating unstructured data, which can enhance traditional event logs by extracting hitherto unidentified process information. Although combining the object-centric perspective with event log enrichment from unstructured data sources holds promising potential, such investigation remains in its infancy. Against this background, this study presents the OCRAUD, a reference architecture that provides guidance on using unstructured data sources and traditional event logs for object-centric process mining. A design science research process was employed to design and evaluate the OCRAUD. This involved conducting a total of 20 expert interviews over two rounds, comparing the OCRAUD to competing artifacts, instantiating the artifact for the use of video and sensor data, developing a software prototype, and applying the prototype to real-world data. This work contributes to process mining by guiding the combination of unstructured data with traditional event logs, incorporating an object-centric representation of event data. The instantiation targets video and sensor data, thereby demonstrating the use of the artifact. This enables researchers and practitioners to instantiate the artifact for other data types or specific use cases. The published code of the software prototype allows for further development of the implemented algorithms.
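To make the combination described above concrete, the following is a minimal sketch of how events from a traditional, case-based event log might be merged with events derived from video or sensor data into one object-centric log. It is not the OCRAUD or the published prototype: the `Event` class, the helper functions, and the sample data are all hypothetical, and the detection step that turns raw video or sensor data into events is assumed to have already run.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    """One event that may relate to several objects (object-centric view)."""
    activity: str
    timestamp: datetime
    objects: dict  # object type -> list of object ids (hypothetical layout)
    source: str    # where the event came from, e.g. "event_log" or "video"


def from_case_log(rows, object_type="order"):
    """Convert case-based rows (case_id, activity, ISO timestamp) to events.

    Assumption: each case id is treated as one object of `object_type`.
    """
    return [
        Event(activity, datetime.fromisoformat(ts), {object_type: [case_id]}, "event_log")
        for case_id, activity, ts in rows
    ]


def from_detections(detections):
    """Convert already-extracted detections (dicts) to events.

    Assumption: an upstream step has turned raw video or sensor data
    into activity labels with timestamps and related objects.
    """
    return [
        Event(d["activity"], datetime.fromisoformat(d["timestamp"]), d["objects"], d["source"])
        for d in detections
    ]


def merge(*event_lists):
    """Combine events from all sources into one log ordered by timestamp."""
    merged = [event for events in event_lists for event in events]
    return sorted(merged, key=lambda event: event.timestamp)


def activities_for(log, object_type, object_id):
    """Return the ordered activities that involve one specific object."""
    return [e.activity for e in log if object_id in e.objects.get(object_type, [])]


# Made-up sample data for illustration only.
case_rows = [
    ("o1", "Create order", "2025-01-01T09:00:00"),
    ("o1", "Ship order", "2025-01-01T12:00:00"),
]
detections = [
    {"activity": "Pick item", "timestamp": "2025-01-01T10:30:00",
     "objects": {"order": ["o1"], "worker": ["w7"]}, "source": "video"},
    {"activity": "Start machine", "timestamp": "2025-01-01T11:00:00",
     "objects": {"machine": ["m2"]}, "source": "sensor"},
]

log = merge(from_case_log(case_rows), from_detections(detections))
```

In this sketch, `activities_for(log, "order", "o1")` includes the video-derived "Pick item" step between the two logged activities, which is the kind of previously unrecorded process information the abstract refers to. Keeping a `source` field on every event is one possible design choice for tracing enriched events back to their origin.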
About the journal:
Information systems are the software and hardware systems that support data-intensive applications. The journal Information Systems publishes articles concerning the design and implementation of languages, data models, process models, algorithms, software and hardware for information systems.
Subject areas include data management issues as presented in the principal international database conferences (e.g., ACM SIGMOD/PODS, VLDB, ICDE and ICDT/EDBT) as well as data-related issues from the fields of data mining/machine learning, information retrieval coordinated with structured data, internet and cloud data management, business process management, web semantics, visual and audio information systems, scientific computing, and data science. Implementation papers on massively parallel data management, fault tolerance in practice, and special-purpose hardware for data-intensive systems are also welcome, as are manuscripts from application domains such as urban informatics, social and natural science, and the Internet of Things. All papers should highlight innovative solutions to data management problems, such as new data models or performance enhancements, and show how those innovations contribute to the goals of the application.