Efficient Online Computation of Business Process State From Trace Prefixes via N-Gram Indexing
David Chapela-Campa, Marlon Dumas
arXiv - CS - Software Engineering, 2024-09-09
DOI: arxiv-2409.05658 (https://doi.org/arxiv-2409.05658)
This paper addresses the following problem: Given a process model and an
event log containing trace prefixes of ongoing cases of a process, map each
case to its corresponding state (i.e., marking) in the model. This state
computation operation is a building block of other process mining operations,
such as log animation and short-term simulation. One approach to this state
computation problem is to perform a token-based replay of each trace prefix
against the model. However, when a trace prefix does not strictly follow the
behavior of the process model, token replay may produce a state that is not
reachable from the initial state of the process. An alternative approach is to
first compute an alignment between the trace prefix of each ongoing case and
the model, and then replay the aligned trace prefix. However,
(prefix-)alignment is computationally expensive. This paper proposes a method
that, given a trace prefix of an ongoing case, computes its state in constant
time using an index that represents states as n-grams. An empirical evaluation
shows that the proposed approach has an accuracy comparable to that of the
prefix-alignment approach, while achieving a throughput of hundreds of
thousands of traces per second.
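
The abstract does not spell out how the n-gram index is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the model is available as a small reachability graph (the hypothetical `TRANSITIONS` dictionary), indexes every activity sequence of length 1 to n by the states it can end in, and looks up a trace prefix using only its last n activities. The function names and the back-off to shorter n-grams on a miss are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical toy model (an assumption for illustration): a reachability
# graph where A is followed by B and C in either order, and then by D.
TRANSITIONS = {
    "s0": {"A": "s1"},
    "s1": {"B": "s2", "C": "s3"},
    "s2": {"C": "s4"},
    "s3": {"B": "s4"},
    "s4": {"D": "s5"},
    "s5": {},
}


def build_index(transitions, n):
    """Map every activity sequence of length 1..n that the model can
    execute, starting from any state, to the set of states it ends in."""
    index = defaultdict(set)

    def walk(state, ngram):
        if ngram:
            index[ngram].add(state)
        if len(ngram) == n:
            return  # depth bound, so cyclic models also terminate
        for activity, target in transitions[state].items():
            walk(target, ngram + (activity,))

    for start in transitions:
        walk(start, ())
    return dict(index)


def lookup_state(index, prefix, n):
    """Return the candidate states for a trace prefix from its last n
    activities, backing off to shorter n-grams when the longer one is
    unknown. An empty or fully unknown prefix yields an empty set."""
    for size in range(min(n, len(prefix)), 0, -1):
        states = index.get(tuple(prefix[-size:]))
        if states:
            return states
    return set()


if __name__ == "__main__":
    index = build_index(TRANSITIONS, n=2)
    print(lookup_state(index, ["A", "B", "C"], n=2))       # conforming prefix
    print(sorted(lookup_state(index, ["A", "X", "B"], n=2)))  # deviating prefix
```

Each lookup inspects at most n suffixes of the prefix and performs one hash lookup per suffix, so its cost depends on n and not on the prefix length. In this sketch, a prefix with an unknown activity such as `X` falls back to a shorter n-gram and may return several candidate states, all of which are reachable in the model by construction.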