Dominique Sommers, Natalia Sidorova, Boudewijn van Dongen
{"title":"In system alignments we trust! Explainable alignments via projections","authors":"Dominique Sommers, Natalia Sidorova, Boudewijn van Dongen","doi":"10.1016/j.is.2025.102631","DOIUrl":"10.1016/j.is.2025.102631","url":null,"abstract":"<div><div>Alignments are a well-known process mining technique for reconciling system logs and normative process models. Evidence of certain behaviors in a real system may only be present in one representation – either a log or a model – but not in the other. Since processes involve multiple entities, such as objects and resources performing different tasks with objects, the interaction of these entities must be taken into account in the alignments. Additionally, both logged and modeled representations of reality may be imprecise and only partially represent some of these entities, but not all. In this paper, we introduce the concept of “relaxations” through projections for alignments to deal with partially correct models and logs. Relaxed alignments help to distinguish between trustworthy and untrustworthy content of the two representations (the log and the model) to achieve a better understanding of the underlying process and expose quality issues.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102631"},"PeriodicalIF":3.4,"publicationDate":"2025-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating the lifecycle economics of AI: The levelized cost of artificial intelligence (LCOAI)","authors":"Eliseo Curcio","doi":"10.1016/j.is.2025.102634","DOIUrl":"10.1016/j.is.2025.102634","url":null,"abstract":"<div><div>As artificial intelligence (AI) becomes foundational to enterprise infrastructure, organizations face growing challenges in accurately assessing the full economic implications of AI deployment. Existing metrics such as API token costs, GPU-hour billing, or Total Cost of Ownership (TCO) fail to capture the complete lifecycle costs of AI systems and provide limited comparability across deployment models. This paper introduces the Levelized Cost of Artificial Intelligence (LCOAI), a standardized economic metric designed to quantify the total capital (CAPEX) and operational (OPEX) expenditures per unit of productive AI output, normalized by valid inference volume. Analogous to established metrics like the Levelized Cost of Electricity (LCOE) and the Levelized Cost of Hydrogen (LCOH) in the energy sector, LCOAI provides a rigorous, transparent framework for evaluating and comparing AI deployment strategies. We define the LCOAI methodology in detail and apply it to four representative scenarios (OpenAI GPT-4.1 API, Anthropic Claude Haiku API, a self-hosted LLaMA-2–13B deployment, and a cloud-hosted LLaMA-2–13B deployment), demonstrating how LCOAI captures critical trade-offs in scalability, investment planning, and cost optimization. Extensive sensitivity analyses further explore the impact of inference volume, CAPEX, and OPEX variability on lifecycle economics. The results illustrate the practical utility of LCOAI in procurement, infrastructure planning, and automation strategy, and establish it as a foundational benchmark for AI economic analysis. 
Policy implications and directions for future refinement, including integration of environmental and performance-adjusted cost metrics, are also discussed.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102634"},"PeriodicalIF":3.4,"publicationDate":"2025-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145220536","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SaaMS: The synopses-as-a-microservice paradigm for scalable adaptive streaming analytics across the cloud to edge continuum","authors":"Georgios Panagiotis Kalfakis, Nikos Giatrakos","doi":"10.1016/j.is.2025.102629","DOIUrl":"10.1016/j.is.2025.102629","url":null,"abstract":"<div><div>The use of data synopses in Big streaming Data analytics can offer 3 types of scalability: (i) horizontal scalability, for scaling with the volume and velocity of Big streaming Data, (ii) vertical scalability, for scaling with the number of processed streams, and (iii) federated scalability, i.e. reducing the communication cost for performing global analytics across a number of geo-distributed data centers or devices in IoT settings. Despite the aforementioned virtues of synopses, no state-of-the-art Big Data framework or IoT platform provides a native API for stream synopses supporting all three types of required scalability. In this work, we fill this gap by introducing a novel system and architectural paradigm, namely Synopses-as-a-MicroService (SaaMS), for both parallel and geo-distributed stream summarization at scale. SaaMS is developed on Apache Kafka and Kafka Streams and can provide all the required types of scalability together with (i) the ability to seamlessly perform adaptive resource allocation with zero downtime for the running analytics and (ii) the ability to run both across powerful computer clusters and Java-enabled IoT devices. 
Therefore, SaaMS is directly deployable from applications that either operate on powerful clouds or across the cloud to edge continuum.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102629"},"PeriodicalIF":3.4,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145220535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jianxin Li , Taotao Cai , Ke Deng , Timos Sellis , Feng Xia
{"title":"Reflection on community-diversified influence maximization in social networks","authors":"Jianxin Li , Taotao Cai , Ke Deng , Timos Sellis , Feng Xia","doi":"10.1016/j.is.2025.102630","DOIUrl":"10.1016/j.is.2025.102630","url":null,"abstract":"<div><div>To celebrate the 50th Anniversary of the Information Systems Journal, we are delighted to share our research reflections on the article “Community-diversified influence maximization in social networks” published at Information Systems in 2020. Our reflections will highlight the impact of this article on the authors’ research trajectories, its influence on the broader research community, and its contributions to industry practice.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102630"},"PeriodicalIF":3.4,"publicationDate":"2025-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthesizing goal models from declarative data-centric process models","authors":"Rik Eshuis , Aditya Ghose","doi":"10.1016/j.is.2025.102626","DOIUrl":"10.1016/j.is.2025.102626","url":null,"abstract":"<div><div>Knowledge-intensive processes progress towards the achievement of operational goals. These processes typically rely on data to enable data-driven decision making, but also require substantial flexibility to deal with the complex and dynamic environments in which they operate. Consequently, declarative data-centric process modeling languages such as the Case Management Model and Notation (CMMN) have been proposed to model knowledge-intensive processes. However, while such process models allow goals to be expressed, they specify dependencies between the goals only implicitly. This makes the goal-oriented behavior of declarative data-centric process models hard to understand, and therefore obfuscates the goal-oriented behavior of knowledge-intensive processes. This paper defines a structural, semi-automated approach to explicate the goal-oriented aspects of declarative data-centric process models. The approach first derives goal relations from a declarative data-centric process model and next synthesizes these goal relations into a goal model using an algorithm. The approach is supported by a tool and has been evaluated in case studies. Using the approach, implicit goal dependencies in declarative data-centric process models are expressed in goal models. 
This supports the understanding of goal-oriented aspects of declarative data-centric process models.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102626"},"PeriodicalIF":3.4,"publicationDate":"2025-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GPT-5 and open-weight large language models: Advances in reasoning, transparency, and control","authors":"Maikel Leon","doi":"10.1016/j.is.2025.102620","DOIUrl":"10.1016/j.is.2025.102620","url":null,"abstract":"<div><div>The rapid evolution of Generative Pre-trained Transformers (GPTs) has revolutionized natural language processing, enabling models to generate coherent text, solve mathematical problems, write code, and even reason about complex tasks. This paper presents a scientific review of GPT-5, OpenAI’s latest flagship model, and examines its innovations in comparison to previous generations of GPT. We summarize the model’s architecture and features, including hierarchical routing, expanded context windows, and enhanced tool-use capabilities, and survey empirical evidence of improved performance on academic benchmarks. A dedicated section discusses the release of open-weight mixture-of-experts models (GPT-OSS), describing their technical design, licensing, and comparative performance. Our analysis synthesizes findings from recent literature on long-context evaluation, cognitive biases, medical summarization, and hallucination vulnerability, highlighting where GPT-5 advances the state of the art and where challenges remain. 
We conclude by discussing the implications of open-weight models for transparency and reproducibility and propose directions for future research on evaluation, safety, and agentic behavior.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102620"},"PeriodicalIF":3.4,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145106042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Process-driven visual analysis of cybersecurity capture the flag exercises","authors":"Radek Ošlejšek, Radoslav Chudovský, Martin Macak","doi":"10.1016/j.is.2025.102627","DOIUrl":"10.1016/j.is.2025.102627","url":null,"abstract":"<div><div>Hands-on training sessions have become a standard way to develop and increase knowledge in cybersecurity. As practical cybersecurity exercises are strongly process-oriented with knowledge-intensive processes, process mining techniques and models can help enhance learning analytics tools. The design of our open-source analytical dashboard is backed by guidelines for visualizing multivariate networks complemented with temporal views and clustering. The design aligns with the requirements for post-training analysis of a special subset of cybersecurity exercises — supervised Capture the Flag games. Usability is demonstrated in a case study using trainees’ engagement measurement to reveal potential flaws in training design or organization.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102627"},"PeriodicalIF":3.4,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145118714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Matteo Francia , Stefano Rizzi , Matteo Golfarelli , Patrick Marcel
{"title":"Predicting multidimensional cubes through intentional analytics","authors":"Matteo Francia , Stefano Rizzi , Matteo Golfarelli , Patrick Marcel","doi":"10.1016/j.is.2025.102628","DOIUrl":"10.1016/j.is.2025.102628","url":null,"abstract":"<div><div>In an attempt to streamline exploratory data analysis of multidimensional cubes, the Intentional Analytics Model has been proposed as a way to unite OLAP and analytics by allowing users to indicate their analysis intentions and returning cubes enhanced with models. Five intention operators were envisioned to this end; in this work we focus on the <span>predict</span> operator, whose goal is to estimate the missing values of a cube measure starting from known values of the same measure or other measures using different regression models. Although prediction tasks such as forecasting and imputation are routine for analysts, the added value of our approach is (i) to encapsulate them in a declarative, concise, natural language-like syntax; (ii) to automate the selection of the best measures to be used and the computation of the models, and (iii) to automate the evaluation of the interest of the models computed. First we propose a syntax and a semantics for <span>predict</span> and discuss how enhanced cubes are built by (i) predicting the missing values for a measure based on the available information via one or more models and (ii) highlighting the most interesting prediction. 
Then we test the operator implementation, proving that its performance is in line with the interactivity requirements of OLAP sessions and that accurate predictions can be returned.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102628"},"PeriodicalIF":3.4,"publicationDate":"2025-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145106045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yongqing Li , Qimeng Yang , Long Yu , ShengWei Tian , Xin Fan
{"title":"Robust Graph Contrastive Learning for recommender systems: Addressing data sparsity and noise","authors":"Yongqing Li , Qimeng Yang , Long Yu , ShengWei Tian , Xin Fan","doi":"10.1016/j.is.2025.102625","DOIUrl":"10.1016/j.is.2025.102625","url":null,"abstract":"<div><div>Graph Contrastive Learning (GCL) enhances recommender systems by leveraging Graph Neural Networks (GNNs) and self-supervised learning (SSL). However, existing methods struggle with data sparsity and noise. We propose Robust Graph Contrastive Learning (RoGCL), a novel framework that generates high-quality contrastive views through dual-perspective generators. The local generator employs Variational Graph Autoencoders (VGAE) to capture micro-level collaborative patterns by sampling from user–item interaction distributions. The global generator utilizes Singular Value Decomposition (SVD) to reconstruct macro-level structures while filtering noise through low-rank approximation. By incorporating Information Bottleneck (InfoBN) to minimize redundancy between views, RoGCL learns robust representations combining local and global collaborative signals. Extensive experiments on Last.FM, Yelp, and BeerAdvocate datasets demonstrate that RoGCL significantly outperforms state-of-the-art methods including Self-supervised Graph Learning (SGL), Neural Collaborative Learning (NCL), and Adaptive Graph Contrastive Learning (AdaGCL). Results show improved Recall@20 by up to 8.7% and NDCG@20 by 5.8% compared to best baselines. 
Notably, RoGCL exhibits exceptional robustness, maintaining over 90% relative performance with 25% noise injection and showing 37.7% improvement for sparse user groups, making it particularly suitable for real-world applications with imperfect data.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102625"},"PeriodicalIF":3.4,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145106041","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Martita Muñoz , José Fuentes-Sepúlveda , Cecilia Hernández , Diego Seco
{"title":"Estimating the compressibility of raster data","authors":"Martita Muñoz , José Fuentes-Sepúlveda , Cecilia Hernández , Diego Seco","doi":"10.1016/j.is.2025.102624","DOIUrl":"10.1016/j.is.2025.102624","url":null,"abstract":"<div><div>The raster data model is widely used in Geographic Information Systems and image processing. The continuous growth of raster data volume poses significant challenges for storage and management. Compact representations of rasters have emerged as a critical solution to address this issue, leveraging data locality to achieve efficient compression. In this context, the research community has proposed compressibility measures aiming to estimate the compressibility of data. Some measures, initially proposed for sequences, have been extended to two- and three-dimensional matrices. This work conducts an experimental analysis of measures applied to raster data compressibility estimation. The first approach applies a linearization function on raster data with matrix representation and then uses existing one-dimensional compressibility measures. The evaluation of the approach compares 1D compressibility measures with 2D measures, data compressors, Compact Data Structures (CDSs), and spatial locality estimation techniques. The results show that spatial locality, alphabet size, and noise directly influence raster compressibility, having more impact over measures like <span><math><mi>z</mi></math></span>, <span><math><mi>v</mi></math></span>, and <span><math><mi>g</mi></math></span>, compressors (bzip, gzip) and a CDS called <span><math><msup><mrow><mi>k</mi></mrow><mrow><mn>2</mn></mrow></msup></math></span>-raster. The second approach introduces <span><math><msub><mrow><mi>δ</mi></mrow><mrow><mi>Δ</mi></mrow></msub></math></span>, a 2D compressibility measure sensitive to differences within the alphabet values. Its purpose is to refine the estimation of raster compressibility. 
Results indicate that <span><math><msub><mrow><mi>δ</mi></mrow><mrow><mi>Δ</mi></mrow></msub></math></span> is affected by the actual values and their frequencies, aligning with the outcomes of some specific compressors. This alignment underscores the suitability of <span><math><msub><mrow><mi>δ</mi></mrow><mrow><mi>Δ</mi></mrow></msub></math></span> for compressibility estimation tasks closely related to those performed by such compressors.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"136 ","pages":"Article 102624"},"PeriodicalIF":3.4,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145118713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}