Matteo Zavatteri, Davide Bresolin, Massimiliano de Leoni, Aurelo Makaj
{"title":"Data-aware process models: From soundness checking to repair","authors":"Matteo Zavatteri, Davide Bresolin, Massimiliano de Leoni, Aurelo Makaj","doi":"10.1016/j.datak.2024.102377","DOIUrl":"10.1016/j.datak.2024.102377","url":null,"abstract":"<div><div>Process-aware Information Systems support the enactment of business processes and rely on a model that prescribes which executions are allowed. As a result, the model needs to be sound for the process to be carried out. Traditionally, soundness has been defined and studied by only focusing on the control-flow. Some works proposed techniques to repair the process model to ensure soundness, ignoring data and decision perspectives. This paper puts forward a technique to repair the data perspective of process models, keeping intact the control flow structure. Processes are modeled by Data Petri nets. Our approach repairs the Constraint Graph, a finite symbolic abstraction of the infinite state–space of the underlying Data Petri net. The changes in the Constraint Graph are then projected back onto the Data Petri net.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"155 ","pages":"Article 102377"},"PeriodicalIF":2.7,"publicationDate":"2024-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142705367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context normalization: A new approach for the stability and improvement of neural network performance","authors":"Bilal Faye, Hanane Azzag, Mustapha Lebbah, Fangchen Feng","doi":"10.1016/j.datak.2024.102371","DOIUrl":"10.1016/j.datak.2024.102371","url":null,"abstract":"<div><div>Deep neural networks face challenges with distribution shifts across layers, affecting model convergence and performance. While Batch Normalization (BN) addresses these issues, its reliance on a single Gaussian distribution assumption limits adaptability. To overcome this, alternatives like Layer Normalization, Group Normalization, and Mixture Normalization (MN) emerged, yet struggle with dynamic activation distributions. We propose “Context Normalization” (CN), introducing contexts constructed from domain knowledge. CN normalizes data within the same context, enabling local representation. During backpropagation, CN learns normalized parameters and model weights for each context, ensuring efficient convergence and superior performance compared to BN and MN. This approach emphasizes context utilization, offering a fresh perspective on activation normalization in neural networks. We release our code at https://github.com/b-faye/Context-Normalization.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"155 ","pages":"Article 102371"},"PeriodicalIF":2.7,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142705364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jamila Oukharijane, Mohamed Amine Chaâbane, Imen Ben Said, Eric Andonoff, Rafik Bouaziz
{"title":"An assessment taxonomy for self-adaptation business process solutions","authors":"Jamila Oukharijane, Mohamed Amine Chaâbane, Imen Ben Said, Eric Andonoff, Rafik Bouaziz","doi":"10.1016/j.datak.2024.102374","DOIUrl":"10.1016/j.datak.2024.102374","url":null,"abstract":"<div><div>Self-adaptation of business processes has become the focus of several research studies aiming at avoiding a manual adaptation of processes at run-time, which is error-prone and time-consuming. In fact, several contributions addressing the self-adaptation of processes have been proposed in the literature, but none of them has comprehensively studied and analyzed the literature to assess the current state of progress in the self-adaptation of processes. To address this gap, we propose in this paper a comprehensive taxonomy that identifies a set of characteristics to serve as support for the comparative analysis of solutions addressing self-adaptation of processes. Our taxonomy includes 25 characteristics that address relevant questions regarding self-adaptation of processes. While creating our taxonomy, we built on existing literature and involved academic experts from different universities. These experts not only validated our taxonomy regarding completeness and understandability, but also rectified and enriched it with their knowledge. Finally, we report the application of this taxonomy to evaluate some existing contributions on self-adaptation of processes.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"155 ","pages":"Article 102374"},"PeriodicalIF":2.7,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142705365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anouck Chan, Anthony Fernandes Pires, Thomas Polacsek, Stéphanie Roussel, François Bouissière, Claude Cuiller, Pierre-Eric Dereux
{"title":"Goal modelling in aeronautics: Practical applications for aircraft and manufacturing designs","authors":"Anouck Chan, Anthony Fernandes Pires, Thomas Polacsek, Stéphanie Roussel, François Bouissière, Claude Cuiller, Pierre-Eric Dereux","doi":"10.1016/j.datak.2024.102375","DOIUrl":"10.1016/j.datak.2024.102375","url":null,"abstract":"<div><div>Traditional aircraft development follows a sequential approach: the aircraft is designed first, followed by the industrial system. This approach limits the industrial system’s performance due to constraints imposed by the pre-defined aircraft design. Collaborative approaches, however, advocate for simultaneous design of different products to create new opportunities. Within a project focused on co-designing aircraft and their industrial systems, we put goal modelling into practice to gain a comprehensive understanding of the objectives driving each system’s design and their interdependencies. The intention was to develop an approach for actively involving domain experts, even those lacking prior knowledge of Goal-Oriented Requirements Engineering (GORE).</div><div>This paper provides a detailed account of the iterative process employed to develop and refine our approach. For each iteration, we describe the organisation of modelling sessions with experts, the resulting models, and the collected feedback. We also report on the overall approach’s reception from both industry experts and academic participants. Furthermore, we highlight recommendations and research challenges that emerged from the encountered difficulties during the iterative process, suggesting avenues for further investigation and improvement.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"155 ","pages":"Article 102375"},"PeriodicalIF":2.7,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142661002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sergio España, Chris van der Maaten, Jens Gulden, Óscar Pastor
{"title":"Ethical reasoning methods for ICT: What they are and when to use them","authors":"Sergio España, Chris van der Maaten, Jens Gulden, Óscar Pastor","doi":"10.1016/j.datak.2024.102373","DOIUrl":"10.1016/j.datak.2024.102373","url":null,"abstract":"<div><div>Information and communication technology (ICT) brings about numerous advantages across various domains of our lives. However, alongside these benefits, there is a growing awareness of its potential negative ethical, social, and environmental impacts. Consequently, stakeholders ranging from conceptual modellers to policy makers often find themselves grappling with ethical considerations stemming from ICT engineering and usage. This paper presents a review of 10 ethical reasoning methods suitable for the ICT domain. We have employed a method engineering technique to author metamodels for the methods, which were subsequently subjected to validation by experts proficient in the respective methods. Following a situational method engineering approach, we have also characterised each ethical reasoning method and validated the characterisation with the experts. This has allowed us to develop a tool that helps select the method that is most suitable for a given ethical reasoning situation. Furthermore, we deliberate on the practical application of ethical reasoning methods within conceptual modelling contexts. We are confident that we have laid the groundwork for further research into ethical reasoning of ICT, with a specific emphasis on its role during conceptual modelling.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"155 ","pages":"Article 102373"},"PeriodicalIF":2.7,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142661004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SSQTKG: A Subgraph-based Semantic Query Approach for Temporal Knowledge Graph","authors":"Lin Zhu, Xinyi Duan, Luyi Bai","doi":"10.1016/j.datak.2024.102372","DOIUrl":"10.1016/j.datak.2024.102372","url":null,"abstract":"<div><div>Real-world knowledge graphs are growing in size with the explosion of data and rapid expansion of knowledge. There are some studies on knowledge graph query, but temporal knowledge graph (TKG) query is still a relatively unexplored field. A temporal knowledge graph is a knowledge graph that contains temporal information and contains knowledge that is likely to change over time. It introduces a temporal dimension that can characterize the changes and evolution of entities and relationships at different points in time. However, in the existing temporal knowledge graph query, the entity labels are one-sided, which cannot accurately reflect the semantic relationships of temporal knowledge graphs, resulting in incomplete query results. For the processing of temporal information in temporal knowledge graphs, we propose a temporal frame filtering approach and measure the acceptability of temporal frames by the new definition <em>sim</em><sub><em>time</em></sub> based on the proposed three temporal frames and nine rules. For measuring the semantic relationship of predicates between entities, we vectorize the semantic similarity between predicates, i.e., edges, using the knowledge embedding model, and propose the new definition <em>sim</em><sub><em>pre</em></sub> to measure the semantic similarity of predicates. Based on these, we propose a new semantic temporal knowledge graph query method <span><math><msub><mrow><mi>SSQ</mi></mrow><mrow><mi>TKG</mi></mrow></msub></math></span>, and perform pruning operations to optimize the query efficiency of the algorithm based on connectivity. Extensive experiments show that <span><math><msub><mrow><mi>SSQ</mi></mrow><mrow><mi>TKG</mi></mrow></msub></math></span> can return more accurate and complete results that meet the query conditions in the semantic query and can improve the performance of the querying on the temporal knowledge graph.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"155 ","pages":"Article 102372"},"PeriodicalIF":2.7,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142661003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mireia Costa, Alberto García S., Ana León, Anna Bernasconi, Oscar Pastor
{"title":"VarClaMM: A reference meta-model to understand DNA variant classification","authors":"Mireia Costa, Alberto García S., Ana León, Anna Bernasconi, Oscar Pastor","doi":"10.1016/j.datak.2024.102370","DOIUrl":"10.1016/j.datak.2024.102370","url":null,"abstract":"<div><div>Determining the significance of a DNA variant in patients’ health status – a complex process known as <em>variant classification</em> – is highly critical for precision medicine applications. However, there is still debate on how to combine and weigh diverse available evidence to achieve proper and consistent conclusions. Indeed, currently, there are more than 200 different variant classification guidelines available to the scientific community, aiming to establish a framework for standardizing the classification process. Yet, these guidelines are qualitative and vague by nature, hindering their practical application and potential automation. Consequently, more precise definitions are needed.</div><div>In this work, we discuss our efforts to create VarClaMM, a UML meta-model that aims to provide a clear specification of the key concepts involved in variant classification, serving as a common framework for the process. Through this accurate characterization of the domain, we were able to find contradictions or inconsistencies that might have an effect on the classification results. VarClaMM’s conceptualization efforts will lay the ground for the operationalization of variant classification, enabling any potential automation to be based on precise definitions.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"154 ","pages":"Article 102370"},"PeriodicalIF":2.7,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142573531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NoSQL document data migration strategy in the context of schema evolution","authors":"Solomiia Fedushko, Roman Malyi, Yuriy Syerov, Pavlo Serdyuk","doi":"10.1016/j.datak.2024.102369","DOIUrl":"10.1016/j.datak.2024.102369","url":null,"abstract":"<div><div>In Agile development, one approach cannot be chosen and used all the time. Constant updates and strategy changes are necessary. We want to show that combining several migration strategies is better than choosing only one. Also, we emphasize the need to consider the type of schema change. This paper introduces a novel approach designed to optimize the migration process for NoSQL databases. The approach represents a significant advancement in migration strategy planning, providing a quantitative framework to guide decision-making. By incorporating critical factors such as schema changes, database size, the necessity of data in search functionalities, and potential latency issues, the approach comprehensively evaluates the migration feasibility and identifies the optimal migration path. Unlike existing methodologies, this approach adapts to the dynamic nature of NoSQL databases, offering a scalable and flexible approach to migration planning.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"154 ","pages":"Article 102369"},"PeriodicalIF":2.7,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142554233","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Change pattern relationships in event logs","authors":"Jonas Cremerius, Hendrik Patzlaff, Mathias Weske","doi":"10.1016/j.datak.2024.102368","DOIUrl":"10.1016/j.datak.2024.102368","url":null,"abstract":"<div><div>Process mining utilises process execution data to discover and analyse business processes. Event logs represent process executions, providing information about the activities executed. In addition to generic event attributes like activity name and timestamp, events might contain domain-specific attributes, such as a blood sugar measurement in a healthcare environment. Many of these values change during a typical process quite frequently. We refer to those as dynamic event attributes. Change patterns can be derived from dynamic event attributes, describing if the attribute values change from one activity to another. So far, change patterns can only be identified in an isolated manner, neglecting the chance of finding co-occurring change patterns. This paper provides an approach to identifying relationships between change patterns by utilising correlation methods from statistics. We applied the proposed technique on two event logs derived from the MIMIC-IV real-world dataset on hospitalisations in the US and evaluated the results with a medical expert. It turns out that relationships between change patterns can be detected within the same directly or eventually follows relation and even beyond that. Further, we identify unexpected relationships that are occurring only at certain parts of the process. Thus, the process perspective reveals novel insights on how dynamic event attributes change together during process execution. The approach is implemented in Python using the PM4Py framework.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"154 ","pages":"Article 102368"},"PeriodicalIF":2.7,"publicationDate":"2024-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142533491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}