Data & Knowledge Engineering最新文献_第8页

How well can a large language model explain business processes as perceived by users? 大型语言模型如何很好地解释用户感知到的业务流程？

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-02-15 DOI: 10.1016/j.datak.2025.102416

Dirk Fahland , Fabiana Fournier , Lior Limonad , Inna Skarbovsky , Ava J.E. Swevels

{"title":"How well can a large language model explain business processes as perceived by users?","authors":"Dirk Fahland , Fabiana Fournier , Lior Limonad , Inna Skarbovsky , Ava J.E. Swevels","doi":"10.1016/j.datak.2025.102416","DOIUrl":"10.1016/j.datak.2025.102416","url":null,"abstract":"<div><div>Large Language Models (LLMs) are trained on a vast amount of text to interpret and generate human-like textual content. They are becoming a vital vehicle in realizing the vision of the autonomous enterprise, with organizations today actively adopting LLMs to automate many aspects of their operations. LLMs are likely to play a prominent role in future AI-augmented business process management systems (ABPMSs) catering functionalities across all system lifecycle stages. One such system’s functionality is Situation-Aware eXplainability (SAX), which relates to generating causally sound and yet human-interpretable explanations that take into account the process context in which the explained condition occurred.</div><div>In this paper, we present the SAX4BPM framework developed to generate SAX explanations. The SAX4BPM suite consists of a set of services and a central knowledge repository. The functionality of these services is to elicit the various knowledge ingredients that underlie SAX explanations. A key innovative component among these ingredients is the causal process execution view. In this work, we integrate the framework with an LLM to leverage its power to synthesize the various input ingredients for the sake of improved SAX explanations.</div><div>Since the use of LLMs for SAX is also accompanied by a certain degree of doubt related to its capacity to adequately fulfill SAX along with its tendency for hallucination and lack of inherent capacity to reason, we pursued a methodological evaluation of the perceived quality of the generated explanations. To this aim, we developed a designated scale and conducted a rigorous user study. Our findings show that the input presented to the LLMs aided with the guard-railing of its performance, yielding SAX explanations having better-perceived fidelity. This improvement is moderated by the perception of trust and curiosity. More so, this improvement comes at the cost of the perceived interpretability of the explanation.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"157 ","pages":"Article 102416"},"PeriodicalIF":2.7,"publicationDate":"2025-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143445243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Evaluating diabetes dataset for knowledge graph embedding based link prediction 评估基于知识图嵌入的糖尿病数据集链接预测

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-02-15 DOI: 10.1016/j.datak.2025.102414

Sushmita Singh, Manvi Siwach

{"title":"Evaluating diabetes dataset for knowledge graph embedding based link prediction","authors":"Sushmita Singh, Manvi Siwach","doi":"10.1016/j.datak.2025.102414","DOIUrl":"10.1016/j.datak.2025.102414","url":null,"abstract":"<div><div>For doing any accurate analysis or prediction on data, a complete and well-populated dataset is required. Medical based data for any disease like diabetes is highly coupled and heterogeneous in nature, with numerous interconnections. This inherently complex data cannot be analysed by simple relational databases making knowledge graphs an ideal tool for its representation which can efficiently handle intricate relationships. Thus, knowledge graphs can be leveraged to analyse diabetes data, enhancing both the accuracy and efficiency of data-driven decision-making processes. Although substantial data exists on diabetes in various formats, the availability of organized and complete datasets is limited, highlighting the critical need for creation of a well-populated knowledge graph. Moreover while developing the knowledge graph, an inevitable problem of incompleteness is present due to missing links or relationships, necessitating the use of knowledge graph completion tasks to fill in this absent information which involves predicting missing data with various Link Prediction (LP) techniques. Among various link prediction methods, approaches based on knowledge graph embeddings have demonstrated superior performance and effectiveness. These knowledge graphs can support in-depth analysis and enhance the prediction of diabetes-associated risks in this field. This paper introduces a dataset specifically designed for performing link prediction on a diabetes knowledge graph, so that it can be used to fill the information gaps further contributing in the domain of risk analysis in diabetes. The accuracy of the dataset is assessed through validation with state-of-the-art embedding-based link prediction methods.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"157 ","pages":"Article 102414"},"PeriodicalIF":2.7,"publicationDate":"2025-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143419097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Modelling and solving industrial production tasks as planning-scheduling tasks 将工业生产任务建模并求解为计划调度任务

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-02-14 DOI: 10.1016/j.datak.2025.102415

Andrii Nyporko , Lukáš Chrpa

{"title":"Modelling and solving industrial production tasks as planning-scheduling tasks","authors":"Andrii Nyporko , Lukáš Chrpa","doi":"10.1016/j.datak.2025.102415","DOIUrl":"10.1016/j.datak.2025.102415","url":null,"abstract":"<div><div>Industrial production planning or manufacturing concerns the selection of activities that can produce a desired product and scheduling them on resources that perform these activities. To deal with such problems techniques in the fields of <em>Automated Planning</em> and <em>Scheduling</em> might be leveraged, which are usually pursued separately even though they are (very) complementary. In manufacturing, the activities represent elementary steps in the production and each activity requires a specific input in order to produce a desired output. From that perspective, activities resemble actions in planning as they can capture such a requirement. Selecting proper activities including their (partial) ordering can be understood as a planning task while allocating the activities to the resources can be understood as a scheduling task.</div><div>This paper formalises the concept of “combined” planning and scheduling tasks by defining <em>planning-scheduling tasks</em> that are suitable for problems concerning industrial production or manufacturing. In particular, we define two types of activities – <em>production</em> and <em>maintenance</em> activities – where the former describes elementary production tasks while the latter modifies attributes of the resources (e.g. changing the configuration of reconfigurable machines). We introduce an extension of Planning Domain Definition Language (PDDL), a well-known language for describing planning tasks, to support modelling of planning-scheduling tasks. To tackle planning-scheduling tasks we propose two compilation schemes, one into temporal planning (in PDDL 2.1) and one into classical planning. We evaluated our approaches in three use cases of industrial production planning — Reconfigurable Machines, Woodworking, and Tube Factory domains. The results showed that solving planning-scheduling tasks by compiling them into planning tasks in order to use off-the-shelf planning engines is suitable as it scales reasonably well with the size of the actual tasks (although the resulting solutions are suboptimal).</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"157 ","pages":"Article 102415"},"PeriodicalIF":2.7,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143437007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Static and dynamic techniques for iterative test-driven modelling of Dynamic Condition Response Graphs 动态条件响应图的迭代测试驱动建模的静态和动态技术

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-02-04 DOI: 10.1016/j.datak.2025.102413

Axel K.F. Christfort , Vlad Paul Cosma , Søren Debois , Thomas T. Hildebrandt , Tijs Slaats

{"title":"Static and dynamic techniques for iterative test-driven modelling of Dynamic Condition Response Graphs","authors":"Axel K.F. Christfort , Vlad Paul Cosma , Søren Debois , Thomas T. Hildebrandt , Tijs Slaats","doi":"10.1016/j.datak.2025.102413","DOIUrl":"10.1016/j.datak.2025.102413","url":null,"abstract":"<div><div>Test-driven declarative process modelling combines process models with test traces and has been introduced as a means to achieve both the flexibility provided by the declarative approach and the comprehensibility of the imperative approach. Open test-driven modelling adds a notion of context to tests, specifying the activities of concern in the model, and has been introduced as a means to support both iterative test-driven modelling, where the model can be extended without having to change all tests, and unit testing, where tests can define desired properties of parts of the process without needing to reason about the details of the whole process. The openness however makes checking a test more demanding, since actions outside the context are allowed at any point in the test execution and therefore many different traces may validate or invalidate an open test. In this paper we combine previously developed static techniques for effective open test-driven modelling for Dynamic Condition Response Graphs with a novel efficient implementation of dynamic checking of open tests based on alignment checking. We illustrate the static techniques on an example based on a real-life cross-organizational case management system and benchmark the dynamic checking on models and tests of varying size.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"157 ","pages":"Article 102413"},"PeriodicalIF":2.7,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143377553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Reinforcement learning for optimizing responses in care processes 用于优化护理过程反应的强化学习

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-02-03 DOI: 10.1016/j.datak.2025.102412

Olusanmi A. Hundogan , Bart J. Verhoef , Patrick Theeven , Hajo A. Reijers , Xixi Lu

{"title":"Reinforcement learning for optimizing responses in care processes","authors":"Olusanmi A. Hundogan , Bart J. Verhoef , Patrick Theeven , Hajo A. Reijers , Xixi Lu","doi":"10.1016/j.datak.2025.102412","DOIUrl":"10.1016/j.datak.2025.102412","url":null,"abstract":"<div><div>Prescriptive process monitoring aims to derive recommendations for optimizing complex processes. While previous studies have successfully used reinforcement learning techniques to derive actionable policies in business processes, care processes present unique challenges due to their dynamic and multifaceted nature. For example, at any stage of a care process, a multitude of actions is possible. In this study, we follow the Reinforcement Learning (RL) approach and present a general approach that uses event data to build and train Markov decision processes. We proposed three algorithms including one that takes the elapsed time into account when transforming an event log into a semi-Markov decision process. We evaluated the RL approach using an aggression incident data set. Specifically, the goal is to optimize staff member actions when clients are displaying different types of aggressive behavior. The Q-learning and SARSA are used to find optimal policies. Our results showed that the derived policies align closely with current practices while offering alternative options in specific situations. By employing RL in the context of care processes, we contribute to the ongoing efforts to enhance decision-making and efficiency in dynamic and complex environments.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"157 ","pages":"Article 102412"},"PeriodicalIF":2.7,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143372710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Symmetric non negative matrices factorization applied to the detection of communities in graphs and forensic image analysis 对称非负矩阵分解应用于图形群体检测和法医图像分析

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-01-31 DOI: 10.1016/j.datak.2025.102411

Gaël Marec , Nédra Mellouli

引用次数: 0

REDIRE: Extreme REduction DImension for extRactivE Summarization rerere：提取摘要的极限降维

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-01-26 DOI: 10.1016/j.datak.2025.102407

Christophe Rodrigues , Marius Ortega , Aurélien Bossard , Nédra Mellouli

引用次数: 0

Logic-infused knowledge graph QA: Enhancing large language models for specialized domains through Prolog integration 逻辑注入的知识图QA：通过Prolog集成增强专门领域的大型语言模型

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-01-24 DOI: 10.1016/j.datak.2025.102406

Aneesa Bashir, Rong Peng, Yongchang Ding

{"title":"Logic-infused knowledge graph QA: Enhancing large language models for specialized domains through Prolog integration","authors":"Aneesa Bashir, Rong Peng, Yongchang Ding","doi":"10.1016/j.datak.2025.102406","DOIUrl":"10.1016/j.datak.2025.102406","url":null,"abstract":"<div><div>Efficiently answering questions over complex, domain-specific knowledge graphs remain a substantial challenge, as large language models (LLMs) often lack the logical reasoning abilities and particular knowledge required for such tasks. This paper presents a novel framework integrating LLMs with logical programming languages like Prolog for Logic-Infused Knowledge Graph Question Answering (KGQA) in specialized domains. The proposed methodology uses a transformer-based encoder–decoder architecture. An encoder reads the question, and a named entity recognition (NER) module connects entities to the knowledge graph. The extracted entities are fed into a grammar-guided decoder, producing a logical form (Prolog query) that captures the semantic constraints and relationships. The Prolog query is executed over the knowledge graph to perform symbolic reasoning and retrieve relevant answer entities. Comprehensive experiments on the MetaQA benchmark dataset demonstrate the superior performance of this logic-infused method in accurately identifying correct answer entities from the knowledge graph. Even when trained on a limited subset of annotated data, it outperforms state-of-the-art baselines, achieving 89.60 % and F1-scores of up to 89.61 %, showcasing its effectiveness in enhancing large language models with symbolic reasoning capabilities for specialized question-answering tasks. The seamless integration of LLMs and logical programming enables the proposed framework to reason effectively over complex, domain-specific knowledge graphs, overcoming a key limitation of existing KGQA systems. In specialized domains, the interpretability provided by representing questions such as Prologue queries is a valuable asset.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"157 ","pages":"Article 102406"},"PeriodicalIF":2.7,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143135551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A methodology for the systematic design of storytelling dashboards applied to Industry 4.0 一种应用于工业4.0的叙事仪表板系统设计方法

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-01-22 DOI: 10.1016/j.datak.2025.102410

Ana Lavalle , Alejandro Maté , Maribel Yasmina Santos , Pedro Guimarães , Juan Trujillo , Antonina Santos

{"title":"A methodology for the systematic design of storytelling dashboards applied to Industry 4.0","authors":"Ana Lavalle , Alejandro Maté , Maribel Yasmina Santos , Pedro Guimarães , Juan Trujillo , Antonina Santos","doi":"10.1016/j.datak.2025.102410","DOIUrl":"10.1016/j.datak.2025.102410","url":null,"abstract":"<div><div>Dashboards are popular tools for presenting key insights to decision-makers by translating large volumes of data into clear information. However, while individual visualizations may effectively answer specific questions, they often fail to connect in a way that conveys the overall narrative, leaving decision-makers without a cohesive understanding of the area under analysis.</div><div>This paper presents a novel methodology for the systematic design of holistic dashboards, moving from analytical requirements to storytelling dashboards. Our approach ensures that all visualizations are aligned with the analytical goals of decision-makers. It includes several key steps: capturing analytical requirements through the i* framework; structuring and refining these requirements into a tree model to reflect the decision-maker’s mental analysis; identifying and preparing relevant data; capturing the key concepts and relationships for the composition of the cohesive storytelling dashboard through a novel storytelling conceptual model; finally, implementing and integrating the visualizations into the dashboard, ensuring coherence and alignment with the decision-maker’s needs. Our methodology has been applied in real-world industrial environments. We evaluated its impact through a controlled experiment. The findings show that storytelling dashboards significantly improve data interpretation, reduce misinterpretations, and enhance the overall user experience compared to traditional dashboards.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"156 ","pages":"Article 102410"},"PeriodicalIF":2.7,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143133530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Ensuring safety in digital spaces: Detecting code-mixed hate speech in social media posts 确保数字空间的安全：在社交媒体帖子中检测混合代码的仇恨言论

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-01-18 DOI: 10.1016/j.datak.2025.102409

Pradeep Kumar Roy , Abhinav Kumar

{"title":"Ensuring safety in digital spaces: Detecting code-mixed hate speech in social media posts","authors":"Pradeep Kumar Roy , Abhinav Kumar","doi":"10.1016/j.datak.2025.102409","DOIUrl":"10.1016/j.datak.2025.102409","url":null,"abstract":"<div><div>Social networks strive to offer positive content to users, yet a considerable amount of inappropriate material, such as rumors, fake news, and hate speech, persists. Despite significant efforts to detect and prevent hate speech early, it remains widespread due to issues like misspellings and mixed language in posts. To address these challenges, this research utilizes advanced algorithms like CNN, LSTM, and BERT to develop an automated system for detecting hate speech in Telugu-English code-mixed posts. Additionally, evaluating the effectiveness of data translation and transliteration approaches for detecting hate in mixed language. Results indicate that the transliteration approach achieves the highest accuracy, with a performance of 75% accuracy, surpassing raw and translated data by 1% and 3%, respectively. The proposed system may effectively minimizes hate speech and offensive content on social media platforms, resulting in an enhanced user experience. From a managerial perspective, this research presents numerous benefits, such as improved content moderation, optimized resource allocation, data-driven decision-making, enhanced user satisfaction, strengthened reputation management, and greater scalability. These advancements underscore the potential of utilizing advanced technologies to address complex challenges in social media management.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"156 ","pages":"Article 102409"},"PeriodicalIF":2.7,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143133540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0