Data & Knowledge Engineering最新文献

Knowledge graph question generation based on crucial semantic information

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-10-17 DOI: 10.1016/j.datak.2025.102529

Mingtao Zhou , Juxiang Zhou , Jianhou Gan , Jun Wang , Jiatian Mei

{"title":"Knowledge graph question generation based on crucial semantic information","authors":"Mingtao Zhou , Juxiang Zhou , Jianhou Gan , Jun Wang , Jiatian Mei","doi":"10.1016/j.datak.2025.102529","DOIUrl":"10.1016/j.datak.2025.102529","url":null,"abstract":"<div><div>The aim of the knowledge graph-based question generation (KGQG) task is to generate an answerable, fluent question from a ternary knowledge graph and a target answer. Existing KGQG that study knowledge graph subgraphs and the question of target answer generation do not effectively capture the critical semantic information between tokens within nodes/edges in subgraphs and fail to make full use of target answers and answer markers. This has led to the generation of disfluent and unanswerable questions. To address these problems, we propose a model called knowledge graph question generation based on crucial semantic information (KGQG-CSI). Our proposed model utilizes the critical semantic information encoding module to dynamically learn the degree of significance of tokens within the edges and nodes of fused answers, capturing critical semantic information that would remedy disfluency. In addition, the target answers and answer markers are sufficiently integrated with the nodes to make the generated questions answerable. First, the attention mechanism is used to allow the nodes to interact with the target answers, thereby expressing the semantic information related to the answers more accurately. The nodes that have been processed through the critical semantic information encoding module are then spliced with the answer markers to reduce the ambiguous information. The experimental results on two public datasets show that the results of the proposed model outperform the existing methods.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"161 ","pages":"Article 102529"},"PeriodicalIF":2.7,"publicationDate":"2025-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145361577","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A framework for purpose-guided event logs generation 用于生成目的导向的事件日志的框架

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-10-15 DOI: 10.1016/j.datak.2025.102526

Andrea Burattin , Barbara Re , Lorenzo Rossi , Francesco Tiezzi

引用次数: 0

Temporal knowledge graph recommendation with sequence-aware and path reasoning 时序感知和路径推理的时序知识图推荐

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-10-03 DOI: 10.1016/j.datak.2025.102522

Yuanming Zhang, Ziyou He, Yongbiao Lou, Haixia Long, Fei Gao

{"title":"Temporal knowledge graph recommendation with sequence-aware and path reasoning","authors":"Yuanming Zhang, Ziyou He, Yongbiao Lou, Haixia Long, Fei Gao","doi":"10.1016/j.datak.2025.102522","DOIUrl":"10.1016/j.datak.2025.102522","url":null,"abstract":"<div><div>Knowledge graph recommendation (KGRec) models not only alleviate the issues of data sparsity and the cold start problem encountered by traditional models but also enhance interpretability and credibility through the provision of explicit recommendation rationales. Nonetheless, existing KGRec models predominantly concentrate on extracting static structural features of user preferences from KG, often neglecting the dynamic temporal features, such as purchase time and click time. This oversight results in considerable limitations in recommendation performance. In response to this challenge, this paper introduces a novel temporal knowledge graph recommendation model (TKGRec), which fully utilizes both dynamic temporal feature and static structure feature for better recommendation. We specifically construct a temporal KG that encapsulates both static and dynamic user–item interactions. Based on the new environment, we propose a sequence-aware and path reasoning framework, in which the sequence-aware module employs a dual-attention mechanism to distill temporal features from interactions, whereas the path reasoning module utilizes reinforcement learning to extract path features. These two modules are seamlessly fused and iteratively refined to capture a more holistic understanding of user preferences. Experimental results on three real-world datasets demonstrate that the proposed model significantly outperforms existing state-of-the-art baseline models in terms of performance.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"161 ","pages":"Article 102522"},"PeriodicalIF":2.7,"publicationDate":"2025-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145266450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

An optimization enabled Hierarchical Attention -Deep LSTM model for sentiment analysis on cloth products from customer rating 一个优化的分层关注深度LSTM模型，用于从客户评级中对布料产品进行情感分析

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-09-29 DOI: 10.1016/j.datak.2025.102523

Zhijun Chen , Tsungshun Hsieh , Ze Chen

{"title":"An optimization enabled Hierarchical Attention -Deep LSTM model for sentiment analysis on cloth products from customer rating","authors":"Zhijun Chen , Tsungshun Hsieh , Ze Chen","doi":"10.1016/j.datak.2025.102523","DOIUrl":"10.1016/j.datak.2025.102523","url":null,"abstract":"<div><div>The primary aim of the study endeavour is to introduce a deep learning approaches augmented with optimization techniques to conduct sentiment analysis on apparel products, utilize customer reviews and ratings as foundational data. Consequently, a review of a clothing item is utilized as input, which undergoes pre-processing involving the elimination of stop words and stemming to eradicate superfluous information. In parallel, critical features are extracted from the pre-processed data to facilitate effective categorization. Thereafter, feature extraction is executed through execution of Term frequency-inverse document frequency (TF-IDF), SentiWordNet features, positive sentiment scores, negative sentiment scores, the count of capitalized words, and hashtags. Subsequently, feature fusion is conducted utilizing the proposed Trend factor smoothing-Siberian Tiger Optimization (TS-STO), which is innovatively premeditated by integrating trend factor smoothing within the update process of Siberian Tiger Optimization (STO). Ultimately, sentiment analysis is conducted through the implementation of HA-Deep LSTM, which is conceived by merging Hierarchical Attention Network with Deep LSTM. Experimental analysis portrayed that presented approach conquered an accuracy of 95.9 %, a sensitivity of 96.1 % and specificity of 94.2 %.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"161 ","pages":"Article 102523"},"PeriodicalIF":2.7,"publicationDate":"2025-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145266451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A graph-based model for semantic textual similarity measurement 基于图的语义文本相似度度量模型

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-09-12 DOI: 10.1016/j.datak.2025.102509

Van-Tan Bui , Quang-Minh Nguyen , Van-Vinh Nguyen , Duc-Toan Nguyen

{"title":"A graph-based model for semantic textual similarity measurement","authors":"Van-Tan Bui , Quang-Minh Nguyen , Van-Vinh Nguyen , Duc-Toan Nguyen","doi":"10.1016/j.datak.2025.102509","DOIUrl":"10.1016/j.datak.2025.102509","url":null,"abstract":"<div><div>Measuring semantic similarity between sentence pairs is a fundamental problem in Natural Language Processing with applications in various domains, including machine translation, speech recognition, automatic question answering, and text summarization. Despite its significance, accurately assessing semantic similarity remains a challenging task, particularly for underrepresented languages such as Vietnamese. Existing methods have yet to fully leverage the unique linguistic characteristics of Vietnamese for semantic similarity measurement. To address this limitation, we propose GBNet-STS (Graph-Based Network for Semantic Textual Similarity), a novel framework for measuring the semantic similarity of Vietnamese sentence pairs. GBNet-STS integrates lexical-grammatical similarity scores and distributional semantic similarity scores within a multi-layered graph-based model. By capturing different semantic perspectives through multiple interconnected layers, our approach provides a more comprehensive and robust similarity estimation. Experimental results demonstrate that GBNet-STS outperforms traditional methods, achieving state-of-the-art performance in Vietnamese semantic similarity tasks.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"161 ","pages":"Article 102509"},"PeriodicalIF":2.7,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145060462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Rule-guided process discovery 规则引导的流程发现

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-09-02 DOI: 10.1016/j.datak.2025.102508

Ali Norouzifar , Marcus Dees , Wil van der Aalst

{"title":"Rule-guided process discovery","authors":"Ali Norouzifar , Marcus Dees , Wil van der Aalst","doi":"10.1016/j.datak.2025.102508","DOIUrl":"10.1016/j.datak.2025.102508","url":null,"abstract":"<div><div>Event data extracted from information systems serves as the foundation for process mining, enabling the extraction of insights and identification of improvements. Process discovery focuses on deriving descriptive process models from event logs, which form the basis for conformance checking, performance analysis, and other applications. Traditional process discovery techniques predominantly rely on event logs, often overlooking supplementary information such as domain knowledge and process rules. These rules, which define relationships between activities, can be obtained through automated techniques like declarative process discovery or provided by domain experts based on process specifications. When used as an additional input alongside event logs, such rules have significant potential to guide process discovery. However, leveraging rules to discover high-quality imperative process models, such as BPMN models and Petri nets, remains an underexplored area in the literature. To address this gap, we propose an enhanced framework, IMr, which integrates discovered or user-defined rules into the process discovery workflow via a novel recursive approach. The IMr framework employs a divide-and-conquer strategy, using rules to guide the selection of process structures at each recursion step in combination with the input event log. We evaluate our approach on several real-world event logs and demonstrate that the discovered models better align with the provided rules without compromising their conformance to the event log. Additionally, we show that high-quality rules can improve model quality across well-known conformance metrics. This work highlights the importance of integrating domain knowledge into process discovery, enhancing the quality, interpretability, and applicability of the resulting process models.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"161 ","pages":"Article 102508"},"PeriodicalIF":2.7,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145048987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

LQ-FJS: A logical query-digging fake-news judgment system with structured video-summarization engine using LLM LQ-FJS：基于LLM的结构化视频摘要引擎逻辑查询挖掘假新闻判断系统

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-09-02 DOI: 10.1016/j.datak.2025.102507

Jhing-Fa Wang , Din-Yuen Chan , Hsin-Chun Tsai , Bo-Xuan Fang

{"title":"LQ-FJS: A logical query-digging fake-news judgment system with structured video-summarization engine using LLM","authors":"Jhing-Fa Wang , Din-Yuen Chan , Hsin-Chun Tsai , Bo-Xuan Fang","doi":"10.1016/j.datak.2025.102507","DOIUrl":"10.1016/j.datak.2025.102507","url":null,"abstract":"<div><div>The proliferation of online social platforms can greatly benefit people by fostering remote relationships, but it also inevitably amplifies the impact of multimodal fake news on societal trust and ethics. Existing fake-news detection AI systems are still vulnerable to the inconspicuous and indiscernible multimodal misinformation, and often lacking interpretability and accuracy in cross-platform settings. Hence, we propose a new innovative logical query-digging fake-news judgment system (LQ-FJS) to tackle the above problem based on multimodal approach. The LQ-FJS verifies the truthfulness of claims made within multimedia news by converting video content into structured textual summaries. It then acts as an interpretable agent, explaining the reasons for identified fake news by the structured video-summarization engine (SVSE) to act as an interpretable detection intermediary agent. The SVSE generates condensed captions for raw video content, converting it into structured textual narratives. Then, LQ-FJS exploits these condensed captions to retrieve reliable information related to the video content from LLM. Thus, LQ-FJS cross-verifies external knowledge sources and internal LLM responses to determine whether contradictions exist with factual information through a multimodal inconsistency verification procedure. Our experiments demonstrate that the subtle summarization produced by SVSE can facilitate the generation of explanatory reports that mitigate large-scale trust deficits caused by opaque “black-box” models. Our experiments show that LQ-FJS improves F1 scores by 4.5% and 7.2% compared to state-of-the-art models (FactLLaMA 2023 and HiSS 2023), and increases 14% user trusts through interpretable conclusions.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"161 ","pages":"Article 102507"},"PeriodicalIF":2.7,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145019198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

ABBA: Index structure for sequential pattern-based aggregate queries ABBA：基于顺序模式的聚合查询的索引结构

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-08-27 DOI: 10.1016/j.datak.2025.102506

Witold Andrzejewski, Tadeusz Morzy, Maciej Zakrzewicz

引用次数: 0

Source-Free Domain Adaptation with complex distribution considerations for time series data 考虑复杂分布的无源域自适应时间序列数据

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-08-21 DOI: 10.1016/j.datak.2025.102501

Jing Shang, Zunming Chen, Zhiwen Xiao, Zhihui Wu, Yifei Zhang, Jibing Wang

{"title":"Source-Free Domain Adaptation with complex distribution considerations for time series data","authors":"Jing Shang, Zunming Chen, Zhiwen Xiao, Zhihui Wu, Yifei Zhang, Jibing Wang","doi":"10.1016/j.datak.2025.102501","DOIUrl":"10.1016/j.datak.2025.102501","url":null,"abstract":"<div><div>Source-Free Domain Adaptation (SFDA) aims to adapt a pre-trained model from a labeled source domain to an unlabeled target domain without accessing source domain data, thereby protecting source domain privacy. Although SFDA has recently been applied to time series data, the inherent complex distribution characteristics including temporal variability and distributional diversity of such data remain underexplored. Time series data exhibit significant dynamic variability influenced by collection environments, leading to discrepancies between sequences. Additionally, multidimensional time series data face distributional diversity across dimensions. These complex characteristics increase the learning difficulty for source models and widen the adaptation gap between the source and target domains. To address these challenges, this paper proposes a novel SFDA method for time series data, named Adaptive Latent Subdomain feature extraction and joint Prediction (ALSP). The method divides the source domain, which has a complex distribution, into multiple latent subdomains with relatively simple distributions, thereby effectively capturing the features of different subdistributions. It extracts latent domain-specific and domain-invariant features to identify subdomain-specific characteristics. Furthermore, it combines domain-specific classifiers and a domain-invariant classifier to enhance model performance through multi-classifier joint prediction. During target domain adaptation, ALSP reduces domain dependence by extracting invariant features, thereby narrowing the distributional gap between the source and target domains. Simultaneously, it leverages prior knowledge from the source domain distribution to support the hypothesis space and dynamically adapt to the target domain. Experiments on three real-world datasets demonstrate that ALSP achieves superior performance in cross-domain time series classification tasks, significantly outperforming existing methods.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"161 ","pages":"Article 102501"},"PeriodicalIF":2.7,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Elevating human-machine collaboration in NLP for enhanced content creation and decision support 提升NLP中的人机协作，以增强内容创建和决策支持

IF 2.7 3区计算机科学

Data & Knowledge Engineering Pub Date : 2025-08-21 DOI: 10.1016/j.datak.2025.102505

Priyanka V. Deshmukh, Aniket K. Shahade

{"title":"Elevating human-machine collaboration in NLP for enhanced content creation and decision support","authors":"Priyanka V. Deshmukh, Aniket K. Shahade","doi":"10.1016/j.datak.2025.102505","DOIUrl":"10.1016/j.datak.2025.102505","url":null,"abstract":"<div><div>Human-machine collaboration in Natural Language Processing (NLP) is revolutionizing content creation and decision support by seamlessly combining the strengths of both entities for enhanced efficiency and quality. The lack of seamless integration between human creativity and machine efficiency in NLP hinders optimal content creation and decision support. The objective of this study is to explore and promote the integration of human-machine collaboration in NLP to enhance both content creation and decision support processes. Data Acquisition for NLP requests involves defining the task and target audience, identifying relevant data sources like text documents and web data, and incorporating human expertise for data curation through validation and annotation. Machine processing techniques like tokenization, stemming/lemmatization, and removal of stop words, as well as human input for tasks like data annotation and error correction, to improve data quality and relevance for NLP applications. The combination of automated processing and human feedback leads to more precise and dependable effects. Techniques such as sentiment analysis, topic modelling, and entity recognition are utilized to excerpt valued perceptions from the data and enhance collaboration between humans and machines. These techniques help to streamline the NLP process and ensure that the system is providing accurate and relevant information to users. The analysis of NLP models in machine processing involves training the models to perform specific tasks, such as summarization, sentiment analysis, information extraction, trend identification, and creative content generation. The results show that social media leads with 90% usage, pivotal for audience engagement, while blogs at 78% highlight their depth in content creation implementation using Python software. These trained models are then used to improve decision-making processes, generate creative content, and enhance the accuracy of search results. The future scope involves leveraging advanced NLP techniques to deepen the collaboration between humans and machines for more effective content creation and decision support.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"161 ","pages":"Article 102505"},"PeriodicalIF":2.7,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144908780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0