{"title":"Scalable and accurate online multivariate anomaly detection","authors":"Rebecca Salles , Benoit Lange , Reza Akbarinia , Florent Masseglia , Eduardo Ogasawara , Esther Pacitti","doi":"10.1016/j.is.2025.102524","DOIUrl":"10.1016/j.is.2025.102524","url":null,"abstract":"<div><div>The continuous monitoring of dynamic processes generates vast amounts of streaming multivariate time series data. Detecting anomalies within them is crucial for real-time identification of significant events, such as environmental phenomena, security breaches, or system failures, which can critically impact sensitive applications. Despite significant advances in univariate time series anomaly detection, scalable and efficient solutions for online detection in multivariate streams remain underexplored. This challenge becomes increasingly prominent with the growing volume and complexity of multivariate time series data in streaming scenarios. In this paper, we provide the first structured survey primarily focused on scalable and online anomaly detection techniques for multivariate time series, offering a comprehensive taxonomy. Additionally, we introduce the Online Distributed Outlier Detection (2OD) methodology, a novel well-defined and repeatable process designed to benchmark the online and distributed execution of anomaly detection methods. Experimental results with both synthetic and real-world datasets, covering up to hundreds of millions of observations, demonstrate that a distributed approach can enable centralized algorithms to achieve significant computational efficiency gains, averaging tens and reaching up to hundreds in speedup, without compromising detection accuracy.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"131 ","pages":"Article 102524"},"PeriodicalIF":3.0,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143196973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the use of trajectory data for tackling data scarcity","authors":"Gerard Pons , Besim Bilalli , Alberto Abelló , Santiago Blanco Sánchez","doi":"10.1016/j.is.2025.102523","DOIUrl":"10.1016/j.is.2025.102523","url":null,"abstract":"<div><div>In recent years, the availability of GPS-equipped mobile devices and other inexpensive location-tracking technologies have enabled the ubiquitous capturing of the location of moving objects. As a result, trajectory data are abundantly available and there is an increasing trend in analyzing them in the context of mobility data science. However, the abundant availability of trajectory data makes them compelling for other tasks too. In this paper, we propose the use of these data to tackle the data scarcity problem in data analysis by appropriately transforming them to extract relevant knowledge. The challenge lies not just in leveraging these abundant trajectory data, but in accurately deriving information from them that closely approximates the target variable of interest. Such knowledge can be used to generate or supplement the scarcely available datasets in a data analytics problem, thereby enhancing model learning. We showcase the feasibility of our approach in the domain of fishing where there is an abundance of trajectory data but a scarcity of detailed catch information. By using environmental data as explanatory variables, we build and compare models to predict fishing productivity using the actual catches from fishing reports and/or the inferred knowledge from the vessel’s trajectories. The results show that, mainly due to trajectory data being larger in volume than fishing data, models trained with the former obtain a precision 7.9% higher, despite the simplicity of the applied transformations.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"130 ","pages":"Article 102523"},"PeriodicalIF":3.0,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143311830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting explicit recommendation with DC-GCN: Divide-and-Conquer Graph Convolution Network","authors":"Furong Peng , Fujin Liao , Xuan Lu , Jianxing Zheng , Ru Li","doi":"10.1016/j.is.2024.102513","DOIUrl":"10.1016/j.is.2024.102513","url":null,"abstract":"<div><div>In recent years, Graph Convolutional Networks (GCNs) have primarily been applied to implicit feedback recommendation, with limited exploration in explicit scenarios. Although explicit recommendations can yield promising results, the conflict between the sparsity of data and the data starvation of deep learning hinders its development. Unlike implicit scenarios, explicit recommendation provides less evidence for predictions and requires distinguishing weights of edges (ratings) in the user-item graph.</div><div>To exploit high-order relations by GCN in explicit scenarios, we propose dividing the explicit rating graph into sub-graphs, each containing only one type of rating. We then employ GCN to capture user and item representations within each sub-graph, allowing the model to focus on rating-related user-item relations, and aggregate the representations of all subgraphs by MLP for the final recommendation. This approach, named Divide-and-Conquer Graph Convolution Network (DC-GCN), simplifies each model’s mission and highlights the strengths of individual modules. Considering that creating GCNs for each sub-graph may result in over-fitting and faces more serious data sparsity, we propose to share node embeddings for all GCNs to reduce the number of parameters, and create rating-aware embedding for each sub-graph to model rating-related relations. Moreover, to alleviate over-smoothing, we utilize random column mask to randomly select columns of node features to update in GCN layers. This technique can prevent node representations from becoming homogeneous in deep GCN networks. DC-GCN is evaluated on four public datasets and achieves the SOTA experimentally. Furthermore, DC-GCN is analyzed in cold-start and popularity bias scenarios, exhibiting competitive performance in various scenarios.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"130 ","pages":"Article 102513"},"PeriodicalIF":3.0,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143311864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inductive link prediction via global relational semantic learning","authors":"Chong Mu , Lizong Zhang , Junsong Li , Zhiguo Wang , Ling Tian , Ming Jia","doi":"10.1016/j.is.2024.102514","DOIUrl":"10.1016/j.is.2024.102514","url":null,"abstract":"<div><div>Knowledge graphs (KGs) play a crucial role in storing and utilizing real-world facts, but they often suffer from sparse and missing relations. To overcome these challenges, researchers have proposed relation prediction models, including embedding-based methods. However, these methods are restricted to the transductive setting and require retraining when new entities emerge. Thus, recent research has focused on the inductive setting, allowing for different entities in the test set. Subgraph-based models utilizing graph neural networks (GNNs) for local structural information aggregation have shown promising performance. However, existing approaches focus only on local structural information, ignoring the semantic correlation among relations in the global perspective, resulting in sub-optimal performance. Thus, we propose an inductive relation prediction model GRelGT that incorporates the <strong>g</strong>lobal <strong>rel</strong>ation <strong>g</strong>raph with <strong>t</strong>opological information and the enclosing subgraph. GRelGT consists of two core components: a global relation graph module and a subgraph module. The global relation graph module converts the original knowledge graph into a relation graph, with nodes representing edges (triples) in KGs. Furthermore, we introduce four topological structural features as edge types in the global relation graph to facilitating the learning of the semantic correlations between relations. By leveraging the topological features of the relations, the model’s ability to capture the hidden patterns in the KG is enhanced. Meanwhile, the subgraph module is dedicated to exploring the local structural and semantic information within the enclosing subgraph around the target triple. For a more precise understanding of semantic correlations, we further introduce global relation-aware attention and local query-aware attention mechanisms in the subgraph GNN. This allows GRelGT to dynamically weigh the importance of different relations, effectively leveraging both global and local information for inference. Experimental results on three KG datasets demonstrate the superiority of our model compared to state-of-the-art approaches.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"130 ","pages":"Article 102514"},"PeriodicalIF":3.0,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143311863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantifying and relating the completeness and diversity of process representations using species estimation","authors":"Martin Kabierski, Markus Richter, Matthias Weidlich","doi":"10.1016/j.is.2024.102512","DOIUrl":"10.1016/j.is.2024.102512","url":null,"abstract":"<div><div>The analysis of process representations, such as event logs or process models, has become a staple in the context of business process management. Insights gained from such an analysis serve to monitor and improve the business processes that is captured. Yet, any process representation is merely a sample of the past and possible behaviour of a business process, which raises the question of its representativeness: To which extent does the process representation capture the process characteristics that are relevant for the analysis? In this paper, we propose to answer this question using estimators from biodiversity research. Specifically, we propose to infer a completeness profile based on the estimated number of distinct relevant characteristics of the process representation and a diversity profile, that captures the heterogeneity of relevant distinct characteristics using asymptotic Hill numbers. We validate the applicability of the proposed estimators for process analysis in a series of controlled experiments. Applying the estimators to real-world event logs, we highlight potential issues in terms of trustworthiness of analysis that is based on them, and show how the profiles can be leveraged to compare different process representations concerning their similarity and completeness.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"130 ","pages":"Article 102512"},"PeriodicalIF":3.0,"publicationDate":"2024-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143311862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive sliding window normalization","authors":"George Papageorgiou, Christos Tjortjis","doi":"10.1016/j.is.2024.102515","DOIUrl":"10.1016/j.is.2024.102515","url":null,"abstract":"<div><div>Time series data, frequent in various domains such as finance, healthcare, environmental monitoring, and energy management, often exhibit nonstationary behaviors and anomalies that challenge traditional normalization techniques. This research proposes an innovative methodology termed Adaptive Sliding Window Normalization (ASWN) to address these limitations. ASWN dynamically adjusts normalization window sizes based on detected anomalies with multiple methods, applied Density-Based Spatial Clustering of Applications with Noise (DBSCAN) for the finalization of those, and utilizes the Akaike Information Criterion (AIC) with AutoRegressive Integrated Moving Average (ARIMA) models to determine optimal window sizes in the absence of anomalies. This approach integrates multiple anomaly detection methods to ensure responsiveness to changes in data patterns and effective management of outliers. ASWN is applied to diverse time series datasets, including energy consumption, and financial data, demonstrating significant improvements in predictive accuracy. Extensive experiments show that ASWN outperforms traditional normalization methods, providing empirical evidence of its benefits in handling nonstationary and anomalous data. This research enhances the robustness and reliability of time series forecasting and contributes to the broader field by thoroughly documenting the methodology, experimental setup, and results. The findings are intended to foster further advancements in time series normalization and forecasting.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"129 ","pages":"Article 102515"},"PeriodicalIF":3.0,"publicationDate":"2024-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proactive event matching with predictive analysis in content-based publish/subscribe systems","authors":"Yongpeng Dong, Shiyou Qian, Tianchen Ding, Jian Cao, Guangtao Xue, Minglu Li","doi":"10.1016/j.is.2024.102508","DOIUrl":"10.1016/j.is.2024.102508","url":null,"abstract":"<div><div>The real-time efficacy of content-based publish/subscribe systems is largely dependent on the efficiency of matching algorithms. Current methodologies mainly focus on overall matching performance, often ignoring the dynamic nature and evolving trends of hot events. This paper introduces a novel, learning-driven approach – the proactive adjustment framework (PAF) – tailored to dynamically adapt to hot event trends. By strategically prioritizing subscriptions in alignment with the changing dynamics of hot events, PAF enhances the efficiency of matching algorithms and optimize the system real-time performance. One challenge of PAF is the trade-off that needs to be made between the gains of improving real-time performance by identifying matching subscriptions earlier and the cost of increasing matching time due to subscription classification and adjustment. We design a concise scheme to classify subscriptions, establish a lightweight adjustment mechanism to handle dynamics, and propose an efficient greedy algorithm to compute adjustment plans. This approach helps to mitigate the impact of PAF on matching performance. The experiment results show that the 95th percentile of the determining time of matching subscriptions is improved by about 50.5% and the throughput is also increased by 13%, compared to the baseline SCSL. Furthermore, we integrate PAF into Apache Kafka to augment it as a content-based publish/subscribe system. We test the effectiveness of PAF using two real-world datasets. Compared with two baselines, SCSL and REIN, PAF achieves an improvement of 22.5% and 51.8% respectively in average event transfer latency.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"129 ","pages":"Article 102508"},"PeriodicalIF":3.0,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Capturing end-to-end provenance for machine learning pipelines","authors":"Marius Schlegel, Kai-Uwe Sattler","doi":"10.1016/j.is.2024.102495","DOIUrl":"10.1016/j.is.2024.102495","url":null,"abstract":"<div><div>Modern workflows for developing ML pipelines utilize ML artifact management systems (ML AMSs) such as MLflow in addition to traditional version control systems such as Git. ML AMSs collect data, model, metadata and software artifacts used and produced in pipeline development workflows. While ensuring repeatability and reproducibility, the provenance capabilities are still rudimentary, mainly due to incomplete traces, coarse granularity, and limited query capabilities. In this paper, we introduce a comprehensive PROV-compliant provenance model that captures end-to-end provenance traces of ML pipelines, their artifacts, and their relationships based on MLflow and Git activities. Moreover, we present the tool MLflow2PROV for continuously extracting provenance graphs according to our model, enabling querying, analyzing, and processing of the collected provenance information.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"132 ","pages":"Article 102495"},"PeriodicalIF":3.0,"publicationDate":"2024-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143686876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the discovery of seasonal gradual patterns through periodic patterns mining","authors":"Jerry Lonlac , Arnaud Doniec , Marin Lujak , Stéphane Lecoeuche","doi":"10.1016/j.is.2024.102511","DOIUrl":"10.1016/j.is.2024.102511","url":null,"abstract":"<div><div>Gradual patterns, capturing intricate attribute co-variations expressed as “when X increases/decreases, Y increases/decreases” in numerical data, play a vital role in managing vast volumes of complex numerical data in real-world applications. Recently, the data science community has focused on efficient extraction methods for gradual patterns from temporal data. However, there is a notable gap in approaches addressing the extraction of gradual patterns that capture seasonality from the graduality point of view in the temporal data sequences, despite their potential to yield valuable insights in applications such as e-commerce. This paper proposes a new method for extracting co-variations of periodically repeating attributes termed as seasonal gradual patterns. To achieve this, we formulate the task of mining seasonal gradual patterns as the problem of mining periodic patterns in multiple sequences and then, leverage periodic pattern mining algorithms to extract seasonal gradual patterns. Additionally, we propose a new antimonotonic support definition associated with these seasonal gradual patterns. Illustrative results from real-world datasets demonstrate the efficiency of the proposed approach and its ability to sift through numerous non-seasonal patterns to identify the seasonal ones.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"129 ","pages":"Article 102511"},"PeriodicalIF":3.0,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approximate conformance checking: Fast computation of multi-perspective, probabilistic alignments","authors":"Alessandro Gianola , Jonghyeon Ko , Fabrizio Maria Maggi , Marco Montali , Sarah Winkler","doi":"10.1016/j.is.2024.102510","DOIUrl":"10.1016/j.is.2024.102510","url":null,"abstract":"<div><div>In the context of process mining, alignments are increasingly being adopted for conformance checking, due to their ability in providing sophisticated diagnostics on the nature and extent of deviations between observed traces and a reference process model. On the downside, deriving alignments is challenging from the computational point of view, even more so when dealing with multiple perspectives in the process, such as, in particular, data. In fact, every observed trace must in principle be compared with infinitely many model traces. In this work, we tackle this computational bottleneck by borrowing the classical idea of <em>encoding</em> from machine learning. Instead of computing alignments directly and exactly, we do so in an approximate way after applying a lossy trace encoding that maps each trace into a corresponding compact, vectorial representation that retains only certain information of the original trace. We study trace encoding-based approximate alignments for processes equipped with event data attributes, from three different angles. First, we indeed show that computing approximate alignments in this way is much more efficient than in the exact setting. Second, we evaluate how accurate such approximate alignments are, considering different encoding strategies that focus on different features of the trace. Our findings suggest that sufficiently rich encodings actually yield good accuracy. Third, we consider the impact of frequency and density of model variants, comparing the effectiveness of using standard approximate multi-perspective alignments as opposed to a variant that incorporates probabilities. As a by-product of this analysis, we also obtain insights on how these two approaches perform in the presence of noise.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"129 ","pages":"Article 102510"},"PeriodicalIF":3.0,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}