Information Systems: Latest Articles

Class Representatives Selection in non-metric spaces for nearest prototype classification
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102564. Pub Date: 2025-05-10. DOI: 10.1016/j.is.2025.102564
Jaroslav Hlaváč, Martin Kopp, Tomáš Skopal
Abstract: The nearest prototype classification is a less computationally intensive replacement for the k-NN method, especially when large datasets are considered. Centroids are often used as prototypes to represent whole classes in metric spaces. Selection of class prototypes in non-metric spaces is more challenging, as the idea of computing centroids is not directly applicable. Instead, a set of representative objects can be used as the class prototype.
This paper presents the Class Representatives Selection (CRS) method, a novel memory- and computationally efficient method that finds a small yet representative set of objects from each class to be used as a prototype. CRS leverages the similarity graph representation of each class created by the NN-Descent algorithm to pick a low number of representatives that ensure sufficient class coverage. Thanks to the graph-based approach, CRS can be applied to any space where at least a pairwise similarity can be defined. In the experimental evaluation, we demonstrate that our method outperforms the state-of-the-art techniques on multiple datasets from different domains.
Citations: 0
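The idea of using a small set of representative objects per class as its prototype, given only a pairwise similarity, can be sketched as follows. This is a minimal illustration and not the CRS method itself: it replaces the NN-Descent similarity graph with a brute-force greedy cover, and the names `jaccard`, `select_representatives`, `classify`, and the `threshold` parameter are all hypothetical choices made for this sketch.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


def jaccard(a: str, b: str) -> float:
    """Toy similarity on strings: Jaccard overlap of their character sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if (sa | sb) else 1.0


def select_representatives(
    objects: Sequence[T], sim: Callable[[T, T], float], threshold: float
) -> List[T]:
    """Greedy cover (assumption, not CRS): keep picking the object that is
    similar (>= threshold) to the most still-uncovered objects."""
    uncovered = set(range(len(objects)))

    def covered_by(i: int) -> set:
        return {j for j in uncovered if sim(objects[i], objects[j]) >= threshold}

    reps: List[T] = []
    while uncovered:
        best = max(range(len(objects)), key=lambda i: len(covered_by(i)))
        newly = covered_by(best)
        if not newly:
            # Nothing covers anything (e.g. sim(x, x) < threshold):
            # fall back to keeping every remaining object.
            reps.extend(objects[j] for j in sorted(uncovered))
            break
        reps.append(objects[best])
        uncovered -= newly
    return reps


def classify(query: T, prototypes: Dict[str, List[T]], sim: Callable[[T, T], float]) -> str:
    """Nearest prototype rule: the class whose closest representative is most similar."""
    return max(prototypes, key=lambda label: max(sim(query, r) for r in prototypes[label]))


classes = {
    "fruit": ["apple", "apples", "grape", "grapes"],
    "tool": ["hammer", "hammers", "saw", "saws"],
}
prototypes = {
    label: select_representatives(objs, jaccard, threshold=0.6)
    for label, objs in classes.items()
}
```

At query time only the representatives are compared against the query, which is where the saving over k-NN on the full dataset comes from. The brute-force cover above is quadratic or worse in the class size, which is the cost an approximate similarity graph is meant to avoid.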
Back to the Order: Partial orders in streaming conformance checking
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102566. Pub Date: 2025-05-10. DOI: 10.1016/j.is.2025.102566
Kristo Raun, Riccardo Tommasini, Ahmed Awad
Abstract: Most organizations are built around their business processes. Commonly, these processes follow a predefined path. Deviations from the expected path can lead to lower quality products and services, reduced efficiencies, and compliance liabilities. Rapid identification of deviations helps mitigate such risks. To identify deviations, the conformance checker needs to know the sequence in which events occurred. In this paper, we tackle two challenges associated with knowing the right sequence of events. First, we look at out-of-order event arrival, a common occurrence in modern information systems. Second, we extend the previous work by incorporating partial order handling. Partially ordered events are a well-studied problem in process mining, but to the best of our knowledge it has not been researched in terms of fast-paced streaming conformance checking. Real-life and semi-synthetic datasets are used for validating the proposed methods.
Citations: 0
Timeline-based process discovery
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102568. Pub Date: 2025-05-10. DOI: 10.1016/j.is.2025.102568
Christoffer Rubensson, Harleen Kaur, Timotheus Kampik, Jan Mendling
Abstract: A key concern of automatic process discovery is providing insights into business process performance. Process analysts are specifically interested in waiting times and delays for identifying opportunities to speed up processes. Against this backdrop, it is surprising that current techniques for automatic process discovery generate directly-follows graphs and comparable process models without representing the time axis explicitly. This paper presents four layout strategies for automatically constructing process models that explicitly align with a time axis. We exemplify our approaches for directly-follows graphs. We evaluate their effectiveness by applying them to real-world event logs with varying complexities. Our specific focus is on their ability to handle the trade-off between high control-flow abstraction and high consistency of temporal activity order. Our results show that timeline-based layouts provide benefits in terms of an explicit representation of temporal distances. They face challenges for logs with many repeating and concurrent activities.
Citations: 0
Stochastic conformance checking based on variable-length Markov chains
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102561. Pub Date: 2025-05-09. DOI: 10.1016/j.is.2025.102561
Emilio Incerto, Andrea Vandin, Sima Sarv Ahrabi
Abstract: Conformance checking is central in process mining (PM). It studies deviations of logs from reference processes. Originally, the proposed approaches did not focus on stochastic aspects of the underlying process, and gave qualitative models as output. Recently, these have been extended in approaches for stochastic conformance checking (SCC), giving quantitative models as output. A different community, namely the software performance engineering (PE) one, which has been interested in the synthesis of stochastic processes for decades, has independently developed techniques to synthesize Markov Chains (MC) that describe the stochastic process underlying program runs. However, these were never applied to SCC problems. We propose a novel approach to SCC based on PE results for the synthesis of stochastic processes. Thanks to a rich experimental evaluation, we show that it outperforms the state-of-the-art. In doing so, we further bridge PE and PM, fostering cross-fertilization. We use techniques for the synthesis of Variable-length MC (VLMC), higher-order MC able to compactly encode complex path dependencies in the control-flow. VLMCs are equipped with a notion of likelihood that a trace belongs to a model. We use it to perform SCC of a log against a model. We establish the degree of conformance by equipping VLMCs with uEMSC, a standard conformance measure in the SCC literature. We compare with 18 SCC techniques from the PM literature, using 11 benchmark datasets from the PM community. We outperform all approaches in 10 out of 11 datasets, i.e., we get uEMSC values closer to 1 for logs conforming to a model. Furthermore, we show that VLMCs are efficient, as they handled all considered datasets in a few seconds.
Citations: 0
Towards Multi-Faceted Visual Process Analytics
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102560. Pub Date: 2025-05-06. DOI: 10.1016/j.is.2025.102560
Stef van den Elzen, Mieke Jans, Niels Martin, Femke Pieters, Christian Tominski, Maria-Cruz Villa-Uriol, Sebastiaan J. van Zelst
Abstract: Both the fields of Process Mining (PM) and Visual Analytics (VA) aim to make complex phenomena understandable. In PM, the goal is to gain insights into the execution of complex processes by analyzing the event data that is captured in event logs. This data is inherently multi-faceted, meaning that it covers various data facets, including spatial and temporal dependencies, relations between data entities (such as cases/events), and multivariate data attributes per entity. However, the multi-faceted nature of the data has not received much attention in PM. Conversely, VA research has investigated interactive visual methods for making multi-faceted data understandable for about two decades. In this study, we bring together PM and VA with the goal of advancing towards Visual Process Analytics (VPA) of multi-faceted processes. To this end, we present a systematic view of relevant (VA) data facets in the context of PM and assess to what extent existing PM visualizations address the data facets' characteristics, making use of VA guidelines. In addition to visualizations, we look at how PM can benefit from analytical abstraction and interaction techniques known in the VA realm. Based on this, we discuss open challenges and opportunities for future research towards multi-faceted VPA.
Citations: 0
Support estimation in frequent itemsets mining on Enriched Two Level Tree
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102559. Pub Date: 2025-05-06. DOI: 10.1016/j.is.2025.102559
Clémentin Tayou Djamegni, William Kery Branston Ndemaze, Edith Belise Kenmogne, Hervé Maradona Nana Kouassi, Arnauld Nzegha Fountsop, Idriss Tetakouchom, Laurent Cabrel Tabueu Fotso
Abstract: Efficiently counting the support of candidate itemsets is a crucial aspect of extracting frequent itemsets because it directly impacts the overall performance of the mining process. Researchers have developed various techniques and data structures to overcome this challenge, but the problem is still open. In this paper, we investigate the two-level tree enrichment technique as a potential solution without adding significant computational overhead. In addition, we introduce ETL_Miner, a novel algorithm that provides an estimated bound for the support value of all candidate itemsets within the search space. The method presented in this article is flexible and can be used with various algorithms. To demonstrate this point, we introduce a modified version of Apriori that integrates ETL_Miner as an extra pruning phase. Preliminary experimental results on both real and synthetic datasets confirm the accuracy of the proposed method and show that it reduces the total extraction time.
Citations: 0
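To make the notion of bounding a candidate's support before counting it more concrete, the sketch below shows the classic Apriori bound: an itemset can never be more frequent than any of its subsets. This is the textbook pruning rule, not the ETL_Miner estimate described in the abstract, and the function names `support`, `upper_bound`, and `can_prune` are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Set


def support(itemset: Iterable[str], transactions: List[Set[str]]) -> int:
    """Exact support: number of transactions that contain every item."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def upper_bound(candidate: Iterable[str], frequent: Dict[FrozenSet[str], int]) -> int:
    """Upper bound on the candidate's support from its (k-1)-subsets.

    Returns 0 if some subset is missing from `frequent`, i.e. was already
    found infrequent (a deliberately coarse simplification for this sketch).
    """
    items = sorted(candidate)
    bound = None
    for sub in combinations(items, len(items) - 1):
        s = frequent.get(frozenset(sub))
        if s is None:
            return 0
        bound = s if bound is None else min(bound, s)
    return bound if bound is not None else 0


def can_prune(candidate: Iterable[str], frequent: Dict[FrozenSet[str], int], min_support: int) -> bool:
    """Skip the expensive exact count when the bound is already too low."""
    return upper_bound(candidate, frequent) < min_support


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
min_support = 2
pairs = [frozenset(p) for p in combinations("abc", 2)]
frequent_pairs = {
    p: support(p, transactions) for p in pairs if support(p, transactions) >= min_support
}
```

Here the pair `{b, c}` occurs only once, so it is absent from `frequent_pairs`, and the candidate `{a, b, c}` can be discarded without scanning the transactions again.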
Substring compression variations and LZ78-Derivates
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102553. Pub Date: 2025-04-25. DOI: 10.1016/j.is.2025.102553
Dominik Köppl
Abstract: We propose algorithms computing the semi-greedy Lempel–Ziv 78 (LZ78), the Lempel–Ziv Double (LZD), and the Lempel–Ziv–Miller–Wegman (LZMW) factorizations in linear time for integer alphabets. For LZD and LZMW, we additionally propose data structures that can be constructed in linear time, which can solve the substring compression problems for these factorizations in time linear in the output size. For substring compression, we give the first results for lexparse and closed factorizations.
Citations: 0
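As background for the factorizations named in this abstract, here is a minimal sketch of the plain greedy LZ78 factorization, in which each factor is the longest previously seen phrase followed by one new character. It is not the semi-greedy, LZD, or LZMW variant from the paper, and it makes no attempt at linear running time (the string concatenation alone rules that out); the function names are illustrative.

```python
from typing import List, Tuple


def lz78_factorize(text: str) -> List[Tuple[int, str]]:
    """Return LZ78 factors as (index of referenced phrase, next character).

    Index 0 denotes the empty phrase. A trailing factor with an empty
    character is emitted when the text ends inside a known phrase.
    """
    dictionary = {"": 0}  # phrase -> index
    factors: List[Tuple[int, str]] = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the current match
        else:
            factors.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        factors.append((dictionary[phrase], ""))
    return factors


def lz78_decode(factors: List[Tuple[int, str]]) -> str:
    """Rebuild the text by replaying the phrase dictionary."""
    phrases = [""]
    out = []
    for index, ch in factors:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)


factors = lz78_factorize("ababcbababaa")
```

For the input `ababcbababaa` the phrases are `a`, `b`, `ab`, `c`, `ba`, `bab`, `aa`, and decoding the factor list reproduces the original string.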
Learning to resolve inconsistencies in qualitative constraint networks
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102557. Pub Date: 2025-04-18. DOI: 10.1016/j.is.2025.102557
Anastasia Paparrizou, Michael Sioutis
Abstract: In this paper, we present a reinforcement learning approach for resolving inconsistencies in qualitative constraint networks (QCNs). QCNs are typically used in constraint programming to represent and reason about intuitive spatial or temporal relations like x {is inside of ∨ overlaps} y. Naturally, QCNs are not immune to uncertainty, noise, or imperfect data that may be present in information, and thus, more often than not, they are hampered by inconsistencies. We propose a multi-armed bandit approach that defines a well-suited ordering of constraints for finding a maximal satisfiable subset of them. Specifically, our learning approach interacts with a solver, and after each trial a reward is returned to measure the performance of the selected action (constraint addition). The reward function is based on the reduction of the solution space of a consistent reconstruction of the input QCN. Experimental results with different bandit policies and various rewards that are obtained by our algorithm suggest that we can do better than the state of the art in terms of both effectiveness, viz., lower number of repairs obtained for an inconsistent QCN, and efficiency, viz., faster runtime.
Citations: 0
Incremental checking of SQL assertions in an RDBMS
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102550. Pub Date: 2025-04-16. DOI: 10.1016/j.is.2025.102550
Xavier Oriol, Ernest Teniente
Abstract: The notion of SQL assertion was introduced in the SQL-92 standard to define general constraints over a relational database. Assertions can be used, for instance, to specify cross-row constraints or multitable check constraints. However, up to now, none of the current relational database management systems (RDBMSs) support SQL assertions due to the difficulty of providing an efficient solution.
To implement SQL assertions efficiently, the RDBMS requires an incremental checking mechanism. That is, given an assertion, the RDBMS should revalidate it only when a transaction changes data in a manner that could violate it, and only for the affected data. Some years ago, the deductive database community provided several incremental checking methods; however, their results have not made it into practice in RDBMSs.
In this paper, we propose an approach to efficiently implement SQL assertions in an RDBMS through an incremental revalidation technique. Such an approach is compatible with any RDBMS since it is fully based on standard SQL concepts (tables, triggers, and procedures). Our proposal uses and extends the Event Rules, an existing proposal for incremental checking in deductive databases. This extension is required to handle distributive aggregates, which pushes the expressiveness of the handled SQL assertions beyond first-order constraints. Moreover, we exploit this extension to improve the treatment of constraints involving existential variables, which are a very common kind of constraint that is difficult and expensive to handle. Finally, we show the efficiency of our approach through some experiments, and we formally prove its soundness and completeness.
Citations: 0
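The principle of revalidating an assertion only for the data a change can affect can be illustrated with a hand-written trigger. The sketch below is an assumption-laden toy using Python's `sqlite3` module, not the Event Rules technique from the paper: the `dept`/`emp` schema, the trigger name, and the helper `try_insert` are all invented for illustration. The intended assertion is that the salaries in each department never exceed its budget, and the trigger rechecks only the department of the newly inserted employee.

```python
import sqlite3

SCHEMA = """
CREATE TABLE dept (id INTEGER PRIMARY KEY, budget INTEGER NOT NULL);
CREATE TABLE emp (
    id INTEGER PRIMARY KEY,
    dept_id INTEGER NOT NULL REFERENCES dept(id),
    salary INTEGER NOT NULL
);

-- Incremental check: only the department touched by the insert is revalidated.
CREATE TRIGGER emp_insert_budget_check AFTER INSERT ON emp
BEGIN
    SELECT RAISE(ABORT, 'assertion violated: salaries exceed budget')
    WHERE (SELECT SUM(salary) FROM emp WHERE dept_id = NEW.dept_id)
        > (SELECT budget FROM dept WHERE id = NEW.dept_id);
END;
"""


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the toy schema and trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn: sqlite3.Connection, emp_id: int, dept_id: int, salary: int) -> bool:
    """Insert an employee; return False if the trigger rejects the row."""
    try:
        conn.execute(
            "INSERT INTO emp (id, dept_id, salary) VALUES (?, ?, ?)",
            (emp_id, dept_id, salary),
        )
        return True
    except sqlite3.DatabaseError:
        return False


conn = make_db()
conn.execute("INSERT INTO dept (id, budget) VALUES (1, 100)")
results = [
    try_insert(conn, 1, 1, 60),  # total 60, within budget
    try_insert(conn, 2, 1, 50),  # total would be 110, rejected
    try_insert(conn, 3, 1, 40),  # total 100, still within budget
]
```

A complete treatment would also need triggers for salary updates, department changes, and budget decreases, while deletions from `emp` cannot violate this particular assertion. Deriving that set of relevant events automatically is the kind of work the abstract attributes to the incremental checking method.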
A high-accuracy unsupervised statistical learning method for joint dangling entity detection and entity alignment
IF 3.0, Zone 2, Computer Science
Information Systems, vol. 133, Article 102554. Pub Date: 2025-04-11. DOI: 10.1016/j.is.2025.102554
Cong Xu, Mengxin Shi, Xiang Gao, Zhongkang Yin, Xiujuan Yao, Wei Li, Jiasen Yang
Abstract: Dangling entities are common in knowledge graphs, but there is a lack of research on entity alignment involving them. Most existing studies leverage neural network methods through supervised learning. However, these data-driven methods suffer from poor interpretability and high computation overhead. In this paper, we propose a Simple Unsupervised Dangling entity detection and entity Alignment method (SUDA) without employing neural networks. Our method consists of three modules: entity embedding, dangling entity detection, and entity alignment. While the state-of-the-art Simple but Effective Unsupervised entity alignment method (SEU) is incapable of dealing with dangling entities, SUDA further extends it and addresses the bilateral dangling entities problem. Theoretical proof of our method is given. We also design a new adjacency matrix for incorporating richer entity relations. Then we construct entity similarity outlier intervals to detect dangling entities and, after removing them, align entities by solving an assignment problem. Extensive experiments demonstrate that our method outperforms both supervised and unsupervised methods. Additionally, in the entity alignment tasks, SUDA consumes less runtime compared to neural network methods, while maintaining high efficiency, interpretability, and stability. Code is available at https://github.com/skyccong/SUDA.git.
Citations: 0