A. T. Ter Hofstede, A. Koschmider, Andrea Marrella, R. Andrews, D. Fischer, Sareh Sadeghianasl, M. Wynn, M. Comuzzi, Jochen De Weerdt, Kanika Goel, Niels Martin, P. Soffer
Title: Process-Data Quality: The True Frontier of Process Mining
Journal: ACM Journal of Data and Information Quality, 19(1), pp. 1-21
Published: 2023-08-23
DOI: 10.1145/3613247 (https://doi.org/10.1145/3613247)
Abstract
Since its emergence over two decades ago, process mining has flourished as a discipline, with numerous contributions to its theory, widespread practical applications, and mature support from commercial tooling environments. However, its potential for significant organisational impact is hampered by poor-quality event data. Process mining starts with the acquisition and preparation of event data from different data sources. These data are then transformed into event logs, which consist of process execution traces, each comprising multiple events. In real-life scenarios, event logs suffer from significant data quality problems, which must be recognised and effectively resolved to obtain meaningful insights from process mining analysis. Despite its importance, the topic of data quality in process mining has received limited attention. In this paper, we discuss the emerging challenges related to process-data quality from both a research and a practical point of view. Additionally, we present a corresponding research agenda with key research directions.