Runtime integration of machine learning and simulation for business processes: Time and decision mining predictions

IF 3 | Zone 2 (Computer Science) | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Systems Pub Date : 2024-10-09 DOI:10.1016/j.is.2024.102472

Francesca Meneghello, Chiara Di Francescomarino, Chiara Ghidini, Massimiliano Ronzani

Information Systems, Volume 128, Article 102472. https://www.sciencedirect.com/science/article/pii/S0306437924001303

Citations: 0

Abstract

Recent research in Computer Science has investigated the use of Deep Learning (DL) techniques to complement outcomes or decisions within a Discrete Event Simulation (DES) model. The main idea of this combination is to maintain a white-box simulation model and complement it with information provided by DL models, in order to overcome the unrealistic or oversimplified assumptions of traditional DESs. State-of-the-art techniques in Business Process Management (BPM) combine Deep Learning and Discrete Event Simulation in a post-integration fashion: first an entire simulation is performed, and then a DL model is used to add waiting times and processing times to the events produced by the simulation model.

In this paper, we take a step further by introducing Rims (Runtime Integration of Machine Learning and Simulation). Instead of complementing the outcome of a complete simulation with the results of predictions a posteriori, Rims tightly integrates the predictions of the DL model at runtime, during the simulation. This runtime integration enables us to fully exploit the specific predictions while respecting simulation execution, thus enhancing the performance of the overall system compared both with each technique used on its own (Business Process Simulation and DL) and with the post-integration approach. In particular, the runtime integration ensures the accuracy of intercase features for time prediction, such as the number of ongoing traces at a given time, by calculating them directly during the simulation, where all traces are executed in parallel. Additionally, it allows online queue information to be incorporated in the DL model and enables the integration of other predictive models into the simulator to enhance decision point management within the process model. These enhancements improve the performance of Rims in accurately simulating the real process in terms of control flow, as well as in terms of time and congestion dimensions. Especially in process scenarios with significant congestion – when a limited availability of resources leads to significant event queues for their allocation – the ability of Rims to use queue features to predict waiting times allows it to surpass the state of the art. We evaluated our approach with real-world and synthetic event logs, using various metrics to assess the simulation model’s quality in terms of control-flow, time, and congestion dimensions.
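To make the runtime idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single activity, one FIFO resource pool, and a hypothetical `predict_processing_time` function standing in for a trained DL model. The names `simulate`, `predict_processing_time`, and the feature weights are illustrative assumptions only. The point it illustrates is that the intercase features (number of ongoing traces, queue length) are read from the live simulation state at the moment each event starts, rather than reconstructed after a complete run. For brevity only the processing time is predicted here, and the waiting time emerges from the simulated queue.

```python
import heapq

ARRIVAL, COMPLETION = 0, 1


def predict_processing_time(ongoing_traces: int, queue_length: int) -> float:
    """Hypothetical stand-in for a trained DL model (assumption, not from the paper).

    Returns a longer processing time when the system is more congested.
    """
    return 1.0 + 0.5 * ongoing_traces + 0.25 * queue_length


def simulate(arrivals, capacity=1, predictor=predict_processing_time):
    """Tiny discrete event simulation that queries the predictor at runtime."""
    # Event tuples are (time, kind, trace_id); the heap orders them by time.
    events = [(t, ARRIVAL, i) for i, t in enumerate(arrivals)]
    heapq.heapify(events)
    waiting = []       # FIFO queue of (trace_id, arrival_time)
    free = capacity    # number of idle resources
    ongoing = 0        # traces that have arrived and not yet completed
    log = {}
    while events:
        now, kind, trace = heapq.heappop(events)
        if kind == ARRIVAL:
            ongoing += 1
            waiting.append((trace, now))
        else:
            ongoing -= 1
            free += 1
            log[trace]["end"] = now
        # Allocate idle resources to queued traces.
        while free > 0 and waiting:
            nxt, arrived = waiting.pop(0)
            free -= 1
            # Intercase features come from the current simulation state.
            queue_length = len(waiting)
            duration = predictor(ongoing, queue_length)
            log[nxt] = {"arrival": arrived, "start": now,
                        "ongoing": ongoing, "queue": queue_length}
            heapq.heappush(events, (now + duration, COMPLETION, nxt))
    return log


if __name__ == "__main__":
    for trace_id, record in sorted(simulate([0.0, 0.5, 1.0]).items()):
        print(trace_id, record)
```

In a post-integration pipeline, by contrast, the predictor would be applied only after the simulator had produced all events, so features such as `ongoing` would have to be recomputed from timestamps that the predictions themselves later change.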




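Source journal

Information Systems (Engineering & Technology, Computer Science: Information Systems)

CiteScore: 9.40

Self-citation rate: 2.70%

Articles published: 112

Review time: 53 days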

Journal description: Information systems are the software and hardware systems that support data-intensive applications. The journal Information Systems publishes articles concerning the design and implementation of languages, data models, process models, algorithms, software and hardware for information systems. Subject areas include data management issues as presented in the principal international database conferences (e.g., ACM SIGMOD/PODS, VLDB, ICDE and ICDT/EDBT) as well as data-related issues from the fields of data mining/machine learning, information retrieval coordinated with structured data, internet and cloud data management, business process management, web semantics, visual and audio information systems, scientific computing, and data science. Implementation papers having to do with massively parallel data management, fault tolerance in practice, and special purpose hardware for data-intensive systems are also welcome. Manuscripts from application domains, such as urban informatics, social and natural science, and Internet of Things, are also welcome. All papers should highlight innovative solutions to data management problems, such as new data models or performance enhancements, and show how those innovations contribute to the goals of the application.