Data & Knowledge Engineering: Latest Articles, Page 6

ECS-KG: An event-centric semantic knowledge graph for event-related news articles

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-04-08 DOI: 10.1016/j.datak.2025.102451

MVPT Lakshika, HA Caldera, TNK De Zoysa

Abstract: Recent advances in deep learning techniques and contextual understanding render Knowledge Graphs (KGs) valuable tools for enhancing accessibility and news comprehension. Conventional and news-specific KGs frequently lack the specificity for efficient news-related tasks, leading to limited relevance and static data representation. To fill the gap, this study proposes an Event-Centric Semantic Knowledge Graph (ECS-KG) model that combines deep learning approaches with contextual embeddings to improve the procedural and dynamic knowledge representation observed in news articles. The ECS-KG incorporates several information extraction techniques, a temporal Graph Neural Network (GNN), and a Graph Attention Network (GAT), yielding significant improvements in news representation. Several gold-standard datasets, comprising CNN/Daily Mail, TB-Dense, and ACE 2005, revealed that the proposed model outperformed the most advanced models. By integrating temporal reasoning and semantic insights, ECS-KG not only enhances user understanding of news significance but also meets the evolving demands of news consumers. This model advances the field of event-centric semantic KGs and provides valuable resources for applications in news information processing.

Data & Knowledge Engineering, Volume 159, Article 102451.
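For readers unfamiliar with the Graph Attention Network mentioned in the abstract, the block below restates the standard GAT layer as it is usually written in the literature. It is a reference sketch only: the abstract does not say which GAT variant ECS-KG uses, and the symbols ($\mathbf{W}$, $\mathbf{a}$, $\mathcal{N}(i)$, $\sigma$) are the conventional ones, not notation taken from the paper.

```latex
% Standard single-head graph attention layer (assumed textbook form, not the paper's notation).
% h_i: feature vector of node i;  W: shared weight matrix;  a: attention vector;
% N(i): neighbourhood of node i;  \Vert: vector concatenation;  \sigma: a nonlinearity.
\begin{align}
  e_{ij}      &= \mathrm{LeakyReLU}\!\left(\mathbf{a}^{\top}\left[\mathbf{W}\mathbf{h}_i \,\Vert\, \mathbf{W}\mathbf{h}_j\right]\right), \\
  \alpha_{ij} &= \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})}, \\
  \mathbf{h}_i' &= \sigma\!\left(\sum_{j \in \mathcal{N}(i)} \alpha_{ij}\,\mathbf{W}\mathbf{h}_j\right).
\end{align}
```

In an event-centric graph, nodes would plausibly be events or entities, so $\alpha_{ij}$ can be read as how much neighbour $j$ contributes to the updated representation of node $i$.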

Citations: 0

Overcoming the hurdle of legal expertise: A reusable model for smartwatch privacy policies

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-04-01 DOI: 10.1016/j.datak.2025.102443

Constantin Buschhaus , Arvid Butting , Judith Michael , Verena Nitsch , Sebastian Pütz , Bernhard Rumpe , Carolin Stellmacher , Sabine Theis

Abstract: Regulations for privacy protection aim to protect individuals from the unauthorized storage, processing, and transfer of their personal data but oftentimes fail in providing helpful support for understanding these regulations. To better communicate privacy policies for smartwatches, we need an in-depth understanding of their concepts and provide better ways to enable developers to integrate them when engineering systems. Up to now, no conceptual model exists covering privacy statements from different smartwatch manufacturers that is reusable for developers. This paper introduces such a conceptual model for privacy policies of smartwatches and shows its use in a model-driven software engineering approach to create a platform for data visualization of wearable privacy policies from different smartwatch manufacturers. We have analyzed the privacy policies of various manufacturers and extracted the relevant concepts. Moreover, we have checked the model with lawyers for its correctness, instantiated it with concrete data, and used it in a model-driven software engineering approach to create a platform for data visualization. This reusable privacy policy model can enable developers to easily represent privacy policies in their systems. This provides a foundation for more structured and understandable privacy policies which, in the long run, can increase the data sovereignty of application users.

Data & Knowledge Engineering, Volume 159, Article 102443.

Citations: 0

Editorial preface to the special issue on research challenges in information science (RCIS’2023)

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-03-31 DOI: 10.1016/j.datak.2025.102446

Selmin Nurcan, Andreas L. Opdahl

Citations: 0

Customized long short-term memory architecture for multi-document summarization with improved text feature set

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-03-25 DOI: 10.1016/j.datak.2025.102440

Satya Deo , Debajyoty Banik , Prasant Kumar Pattnaik

Abstract: One of the most crucial concerns in Natural Language Processing (NLP) is Multi-Document Summarization (MDS), and in recent decades the focus on this problem has risen sharply. Hence, it is vital for the NLP community to provide effective and reliable MDS methods. Current deep learning-based MDS techniques rely on the capacity of neural networks to extract distinctive features. Motivated by this fact, we introduce a novel MDS technique, named Customized Long Short-Term Memory-based Multi-Document Summarization using IBi-GRU (CLSTM-MDS+IBi-GRU), which includes the following working processes. Firstly, the input data is converted into tokens by the BERT (Bidirectional Encoder Representations from Transformers) tokenizer. Features such as Term Frequency-Inverse Document Frequency (TF-IDF), Bag of Words (BoW), thematic features, and an improved aspect term-based feature are then extracted. Finally, the summarization process takes place by utilizing the concatenation of Customized Long Short-Term Memory (CLSTM) with a pre-eminent layer. An accurate and high-quality summary is provided by introducing this layer in the LSTM module and the Bi-GRU-based Inception module (IBi-GRU), which can capture long-range dependencies through parallel convolution. The outcomes of this work prove the superiority of our CLSTM-MDS in the Multi-Document Summarization task.

Data & Knowledge Engineering, Volume 159, Article 102440.
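The two classical features named in the abstract, Bag of Words and TF-IDF, can be illustrated with a minimal standard-library sketch. It assumes the plain weighting `tf * log(N / df)` over pre-tokenized documents. The abstract does not specify the paper's exact weighting, tokenization, or its thematic and aspect-term features, so the function names and the toy corpus here are illustrative only.

```python
import math
from collections import Counter


def bag_of_words(tokens):
    """Bag of Words: map each token to its raw count in one document."""
    return Counter(tokens)


def tf_idf(docs):
    """Return one {token: weight} dict per document.

    Assumed weighting: tf = count / document length, idf = log(N / df),
    where df is the number of documents containing the token.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each token once per document

    weights = []
    for tokens in docs:
        counts = bag_of_words(tokens)
        length = len(tokens)
        weights.append({
            tok: (cnt / length) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return weights


# Hypothetical toy corpus of two tokenized documents.
corpus = [["news", "summary", "news"], ["news", "graph"]]
features = tf_idf(corpus)
```

Under this weighting a token that occurs in every document (here `"news"`) receives weight zero, which is why TF-IDF is often paired with other features rather than used alone.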

Citations: 0

Application of digital shadows on different levels in the automation pyramid

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-03-24 DOI: 10.1016/j.datak.2025.102442

Malte Heithoff , Christian Hopmann , Thilo Köbel , Judith Michael , Bernhard Rumpe , Patrick Sapel

Citations: 0

Fake news detection algorithms – A systematic literature review

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-03-17 DOI: 10.1016/j.datak.2025.102441

Ana Julia Dal Forno , Graziela Piccoli Richetti , Vinícius Heinz Knaesel

Abstract: Social media and news platforms make available to their users, in real-time and simultaneously, access to a significant amount of content that may be true or false. It is remarkable that, with the evolution of Industry 4.0 technologies, the production and dissemination of fake news also increased in recent years. Some content quickly reaches considerable popularity because it is accessed and shared on a large scale, especially in social networks, thus having a potential for going viral. Thus, this study aimed to identify the algorithms and software used for fake news detection. The choice for this combination is justified because in Brazil this process is carried out manually by verification agencies and thus, based on the mapping of the algorithms identified in the literature, an architecture proposal will be developed using artificial intelligence. As a methodology, a systematic literature review (SLR) was conducted in the Science Direct and Scopus databases using the keywords "fake news" and "machine learning" to locate reviews and research articles published in Engineering fields from 2018 to 2023. A total of 24 articles were analyzed, and the results pointed out that Facebook and X were the social networks most used to disseminate fake news. Moreover, the main topics addressed were the COVID-19 pandemic and the United States presidential elections of 2016 and 2020. As for the most used algorithms, a predominance of neural networks was observed. The contribution of this study is in mapping the most used algorithms and their degree of assertiveness, as well as identifying the themes, countries and related researchers that help in the evolution of the fake news theme.

Data & Knowledge Engineering, Volume 158, Article 102441.

Citations: 0

Modelling process durations with gamma mixtures for right-censored data: Applications in customer clustering, pattern recognition, drift detection, and rationalisation

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-03-14 DOI: 10.1016/j.datak.2025.102430

Lingkai Yang , Sally McClean , Kevin Burke , Mark Donnelly , Kashaf Khan

Abstract: Customer modelling, particularly concerning length of stay or process duration, is vital for identifying customer patterns and optimising business processes. Recent advancements in computing and database technologies have revolutionised statistics and business process analytics by producing heterogeneous data that reflects diverse customer behaviours. Different models should be employed for distinct customer categories, culminating in an overall mixture model. Furthermore, some customers may remain “alive” at the conclusion of the observation period, meaning their journeys are incomplete, resulting in right-censored (RC) duration data. This combination of heterogeneous and right-censored data introduces complexity to process duration modelling and analysis. This paper presents a general approach to modelling process duration data using a gamma mixture model, where each gamma distribution represents a specific customer pattern. The model is adapted to account for RC data by modifying the likelihood function during model fitting. The paper explores three key application scenarios: (1) offline pattern clustering, which categorises customers who have completed their journeys; (2) online pattern tracking, which monitors and predicts customer behaviours in real-time; and (3) concept drift detection and rationalisation, which identifies shifts in customer patterns and explains their underlying causes. The proposed method has been validated using synthetically generated data and real-world data from a hospital billing process. In all instances, the fitted models effectively represented the data and demonstrated strong performance across the three application scenarios.

Data & Knowledge Engineering, Volume 158, Article 102430.
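The abstract says the likelihood is modified to handle right-censored durations. The usual textbook way to do this for a mixture model is written out below as a worked equation. This is the standard survival-analysis form, assumed here for illustration. The paper's own parameterisation and fitting procedure are not given in the abstract, and the symbols $\pi_k$, $\alpha_k$, $\beta_k$, $\delta_i$ are introduced only for this sketch.

```latex
% Gamma mixture density with K components (shape \alpha_k, rate \beta_k, weight \pi_k):
\begin{align}
  f(t) &= \sum_{k=1}^{K} \pi_k \, \frac{\beta_k^{\alpha_k}}{\Gamma(\alpha_k)} \, t^{\alpha_k - 1} e^{-\beta_k t},
  \qquad \sum_{k=1}^{K} \pi_k = 1, \\
% Mixture survival function, with F_k the CDF of component k:
  S(t) &= \sum_{k=1}^{K} \pi_k \bigl(1 - F_k(t)\bigr), \\
% Log-likelihood over n durations t_i, where \delta_i = 1 if the journey completed
% and \delta_i = 0 if it was still running when observation ended (right-censored):
  \log L &= \sum_{i=1}^{n} \Bigl[ \delta_i \log f(t_i) + (1 - \delta_i) \log S(t_i) \Bigr].
\end{align}
```

A completed journey contributes its density $f(t_i)$, while a censored one contributes only the probability $S(t_i)$ of lasting at least as long as it was observed, so incomplete journeys still inform the fit without being treated as finished.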

Citations: 0

Accessibility in conceptual modeling—A systematic literature review, a keyboard-only UML modeling tool, and a research roadmap

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-03-06 DOI: 10.1016/j.datak.2025.102423

Aylin Sarioğlu, Haydar Metin, Dominik Bork

Abstract: The reports on Disability by the World Health Organization show that the number of people with disabilities is increasing. Consequently, accessibility should play an essential role in information systems engineering research. While there is an increasingly rich set of available web accessibility guidelines, testing frameworks, and generally accessibility features in modern web-based software systems, software development frameworks, and Integrated Development Environments, this paper shows, based on a systematic review of the literature and current modeling tools, that accessibility is, so far, only scarcely focused in conceptual modeling research. With this paper, we assess the state of the art of accessibility in conceptual modeling, we identify current research gaps, and we delineate a vision toward more accessible conceptual modeling methods and tools. As a concrete step forward toward this vision, we present a generic concept of a keyboard-only modeling tool interaction that is implemented as a new module for the Graphical Language Server Platform (GLSP) framework. We show, using a currently developed UML modeling tool, how efficiently this module allows GLSP-based tool developers to introduce accessibility features into their modeling tools, thereby engaging physically disabled users in conceptual modeling.

Data & Knowledge Engineering, Volume 158, Article 102423.

Citations: 0

Privacy-preserving cross-network service recommendation via federated learning of unified user representations

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-03-04 DOI: 10.1016/j.datak.2025.102422

Mohamed Gaith Ayadi , Haithem Mezni , Hela Elmannai , Reem Ibrahim Alkanhel

Abstract: With the emergence of cloud computing, the Internet of Things, and other large-scale environments, recommender systems have been faced with several issues, mainly (i) the distribution of user–item data across multiple information networks, (ii) privacy restrictions and the partial profiling of users and items caused by this distribution, (iii) the heterogeneity of user–item knowledge in different information networks. Furthermore, most approaches perform recommendations based on a single source of information, and do not handle the partial representation of users’ and items’ information in a federated way. Such isolated and non-collaborative behavior, in multi-source and cross-network information settings, often results in inaccurate and low-quality recommendations. To address these issues, we exploit the strengths of network representation learning (NRL) and federated learning to propose a service recommendation approach in smart service networks. While NRL is employed to learn rich representations of entities (e.g., users, services, IoT objects), federated learning helps collaboratively infer a unified profile of users and items, based on the concept of anchor users, which are bridge entities connecting multiple information networks. These unified profiles are, finally, fed into a federated recommendation algorithm to select the top-rated services. Using a scenario from the smart healthcare context, the proposed approach was developed and validated on a multiplex information network built from real-world electronic medical records (157 diseases, 491 symptoms, 273 174 patients, treatments and anchors data). Experimental results under varied federated settings demonstrated the utility of cross-client knowledge (i.e. anchor links) and the collaborative reconstruction of composite embeddings (i.e. user representations) for improving recommendation accuracy. In terms of RMSE@K and MAE@K, our approach achieved an improvement of 54.41% compared to traditional single-network recommendation, as long as the federation and communication scale increased. Moreover, the gap with four federated approaches has reached 19.83%, highlighting our approach’s ability to map local embeddings (i.e. user’s partial representations) into a complete view.

Data & Knowledge Engineering, Volume 158, Article 102422.

Citations: 0

A graph theoretic approach to assess quality of data for classification task

IF 2.7, Zone 3 (Computer Science)

Data & Knowledge Engineering Pub Date : 2025-03-03 DOI: 10.1016/j.datak.2025.102421

Payel Sadhukhan , Samrat Gupta

Abstract: The correctness of predictions rendered by an AI/ML model is key to its acceptability. To foster researchers’ and practitioners’ confidence in the model, it is necessary to render an intuitive understanding of the workings of a model. In this work, we attempt to explain a model’s working by providing some insights into the quality of data. While doing this, it is essential to consider that revealing the training data to the users is not feasible for logistical and security reasons. However, sharing some interpretable parameters of the training data and correlating them with the model’s performance can be helpful in this regard. To this end, we propose a new measure based on the Euclidean Minimum Spanning Tree (EMST) for quantifying the intrinsic separation (or overlaps) between the data classes. For experiments, we use datasets from diverse domains such as finance, medical, and marketing. We use a state-of-the-art measure, the Davies Bouldin Index (DBI), to validate our approach on four different datasets from the aforementioned domains. The experimental results of this study establish the viability of the proposed approach in explaining the working and efficiency of a classifier. Firstly, the proposed measure of class-overlap quantification has shown a better correlation with the classification performance as compared to DBI scores. Secondly, the results on multi-class datasets demonstrate that the proposed measure can be used to determine the feature importance so as to learn a better classification model.

Data & Knowledge Engineering, Volume 158, Article 102421.
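To make the EMST idea concrete, the sketch below builds a Euclidean minimum spanning tree with Prim's algorithm and reports the fraction of tree edges that join points of different classes. This statistic is one plausible EMST-based overlap indicator chosen for illustration. The abstract does not define the paper's actual measure, so the function names and the cross-class-edge statistic are assumptions, not the authors' method.

```python
import math


def emst_edges(points):
    """Return the (i, j) index pairs of a Euclidean MST, via O(n^2) Prim's algorithm."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n  # cheapest known distance from the tree to each node
    best_from = [-1] * n        # tree node that achieves that distance
    in_tree[0] = True
    for j in range(1, n):
        best_dist[j] = math.dist(points[0], points[j])
        best_from[j] = 0

    edges = []
    for _ in range(n - 1):
        # Attach the closest node that is not yet in the tree.
        nxt = min((j for j in range(n) if not in_tree[j]), key=lambda j: best_dist[j])
        edges.append((best_from[nxt], nxt))
        in_tree[nxt] = True
        # Relax distances of the remaining nodes through the new tree node.
        for j in range(n):
            if not in_tree[j]:
                d = math.dist(points[nxt], points[j])
                if d < best_dist[j]:
                    best_dist[j] = d
                    best_from[j] = nxt
    return edges


def cross_class_edge_fraction(points, labels):
    """Fraction of EMST edges whose endpoints carry different class labels (assumed measure)."""
    edges = emst_edges(points)
    if not edges:
        return 0.0
    cross = sum(1 for i, j in edges if labels[i] != labels[j])
    return cross / len(edges)


# Hypothetical toy data: two well-separated classes of three points each.
separated = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
separated_labels = [0, 0, 0, 1, 1, 1]
overlap_score = cross_class_edge_fraction(separated, separated_labels)
```

Well-separated classes need only a single bridging edge, so the fraction stays low, while interleaved classes push it toward 1. The quadratic Prim loop keeps the sketch dependency-free. Larger datasets would normally call for a spatial index or a library MST routine.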

Citations: 0