Fernando Pastor Ricós , Beatriz Marín , I.S.W.B. Prasetya , Tanja E.J. Vos , Joseph Davidson , Karel Hovorka
{"title":"Behavior Driven Development for 3D games","authors":"Fernando Pastor Ricós , Beatriz Marín , I.S.W.B. Prasetya , Tanja E.J. Vos , Joseph Davidson , Karel Hovorka","doi":"10.1016/j.datak.2025.102486","DOIUrl":"10.1016/j.datak.2025.102486","url":null,"abstract":"<div><div>Computer 3D games are complex software environments that require novel testing processes to ensure high-quality standards. The Intelligent Verification/Validation for Extended Reality Based Systems (<span>iv4XR</span>) framework addresses this need by enabling the implementation of autonomous agents to automate game testing scenarios. This framework facilitates the automation of regression test cases for complex 3D games like Space Engineers. Nevertheless, the technical expertise required to define test scripts using <span>iv4XR</span> can constrain seamless collaboration between developers and testers. This paper reports how integrating a Behavior-Driven Development (BDD) approach with the <span>iv4XR</span> framework allows the industrial company behind Space Engineers to automate regression testing. The success of this industrial collaboration has inspired the <span>iv4XR</span> team to integrate the BDD approach to improve the automation of play-testing for the experimental 3D game LabRecruits. Furthermore, the <span>iv4XR</span> framework has been extended with tactical programming to enable the automation of long-play test scenarios in Space Engineers. 
These results underscore the versatility of the <span>iv4XR</span> framework in supporting diverse testing approaches while showcasing how BDD empowers users to create, manage, and execute automated game tests using comprehensive and human-readable statements.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102486"},"PeriodicalIF":2.7,"publicationDate":"2025-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144588942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
John Mylopoulos , Giancarlo Guizzardi , Nicola Guarino
{"title":"Conceptual modeling: Foundations, a historical perspective, and a vision for the future","authors":"John Mylopoulos , Giancarlo Guizzardi , Nicola Guarino","doi":"10.1016/j.datak.2025.102483","DOIUrl":"10.1016/j.datak.2025.102483","url":null,"abstract":"<div><div>We recount the foundations of Conceptual Modeling in Computer Science, Philosophy and Cognitive Science and their implications on what are concepts, conceptualizations, and conceptual models. We then review the history of the field, considering earlier work by the three co-authors, and highlight some of the contributions that made it what it is. Finally, we propose three research directions whose solutions could advance the field and will hopefully be addressed in the future. Our study is intended to help to circumscribe and characterize the field. It draws ideas from Philosophy, Cognitive Science, Engineering and the Social Sciences, as well as several areas within Computer Science, including Programming languages, Artificial Intelligence, Databases, Software Engineering, and Information Systems Engineering.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102483"},"PeriodicalIF":2.7,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144631123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IDL-BiGRU: Integrated deep learning assisted smart scheduling of big data over cloud environment","authors":"Rama Satish K V , Vibha M B , Lovely Sasidharan","doi":"10.1016/j.datak.2025.102489","DOIUrl":"10.1016/j.datak.2025.102489","url":null,"abstract":"<div><div>The rapid expansion of Internet of Things (IoT) applications generates a continuous and massive flow of data, creating significant challenges in both data processing and storage management. Cloud computing offers scalable infrastructure to handle such data-intensive workloads, but optimal task scheduling remains critical to ensure performance and resource efficiency. Traditional scheduling algorithms often fall short due to limited adaptability and consideration of only a few system parameters. In this paper, a novel integrated deep learning-assisted framework is proposed for scheduling big data over a cloud environment. The proposed framework integrates deep reinforcement learning with a bidirectional gated recurrent unit model (IDL-BiGRU) to schedule tasks intelligently based on real-time system states. The IDL-BiGRU model leverages the advantage of deep Q-learning for decision making and BiGRU's ability to capture bidirectional temporal dependencies in task and resource usage patterns. In this work, RAM, CPU, network bandwidth utilization, and disk storage are considered for scheduling purposes. The proposed method aims to shorten the makespan and increase resource utilization. A Java tool is used for the experimental verification. The performance of the proposed deep learning framework is analyzed and compared with that of current methods. For 1000 tasks, the proposed method attains a degree of imbalance of 0.90, a downtime of 291.17 ms, a throughput of 1050 ms, and a makespan of 721.58. 
The performance analysis demonstrates that the suggested strategy outperforms previous methods.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102489"},"PeriodicalIF":2.7,"publicationDate":"2025-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144711633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dirigo: A method to extract event logs for object-centric processes","authors":"Jia Wei , Chun Ouyang , Ying Wang , Lei Huang","doi":"10.1016/j.datak.2025.102485","DOIUrl":"10.1016/j.datak.2025.102485","url":null,"abstract":"<div><div>Real-world processes involve multiple object types with intricate interrelationships. Traditional event logs (in XES format), which record process execution centred around the case notion, are restricted to a single-object perspective, making it difficult to capture the behaviour of multiple objects and their interactions. To address this limitation, object-centric event logs (OCEL) have been introduced to capture both the objects involved in a process and their interactions with events. The object-centric event data (OCED) metamodel extends the OCEL format by further capturing dynamic object attributes and object-to-object relations. Recently, OCEL 2.0 has been proposed based on the OCED metamodel. Current research on generating OCEL logs requires specific input data sources, and the resulting log data often fails to fully conform to OCEL 2.0. Moreover, the generated OCEL logs vary across different representational formats and their quality remains unevaluated. To address these challenges, a set of quality criteria for evaluating OCEL log representations is established. Guided by these criteria, <em>Dirigo</em> is proposed, a method for extracting event logs that not only conforms to OCEL 2.0 but also extends it by capturing the temporal aspect of dynamic object-to-object relations. Object-Role Modelling (ORM), a conceptual data modelling technique, is employed to describe the artifact produced at each step of <em>Dirigo</em>. To validate the applicability of <em>Dirigo</em>, it is applied to a real-life use case. 
The quality of the log representation of the extracted event log is compared to those of existing OCEL logs using the established quality criteria.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102485"},"PeriodicalIF":2.7,"publicationDate":"2025-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144614765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
João Paulo A Almeida , José Borbinha , Giancarlo Guizzardi , Sebastian Link , Jelena Zdravkovic
{"title":"Editorial introduction for special issue on research challenges and practices in conceptual modeling – ER 2023","authors":"João Paulo A Almeida , José Borbinha , Giancarlo Guizzardi , Sebastian Link , Jelena Zdravkovic","doi":"10.1016/j.datak.2025.102487","DOIUrl":"10.1016/j.datak.2025.102487","url":null,"abstract":"","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102487"},"PeriodicalIF":2.7,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145120291","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PSA-GAT: Integrating position-syntax and cross-aspect graph attention networks for aspect-based sentiment analysis","authors":"Ning Zhou, Linfu Sun, Min Han, Songlin He","doi":"10.1016/j.datak.2025.102477","DOIUrl":"10.1016/j.datak.2025.102477","url":null,"abstract":"<div><div>Aspect-based sentiment analysis (ABSA) is widely applied in analyzing user review data on web platforms to identify sentiment polarity toward specific aspects of web reviews. However, individual reviews often contain multiple conditions and coordinating and conflicting elements or relationships, which significantly increases the complexity of this task. In recent years, exploiting semantic–syntactic information with graph neural networks has been widely used to address such tasks. However, such methods overlook the features of the location influence factor of words and may provide irrelevant or even interfering noisy signals for ABSA because of the word association relationships mined by the syntax tree and semantic composition tree. To alleviate the effect of noise information and fully strengthen the context for multiple-aspect representation in ABSA, we propose a new framework, PSA-GAT, that mines information on position importance, syntactic–semantic dependencies and cross-aspect correlations. Overall, the structural features of the multi-aspect sentiment set are learned by using various variations of graph neural networks. Moreover, the experimental results on four real-world datasets demonstrate the effectiveness of PSA-GAT compared to state-of-the-art methods. 
The code is available at <span><span>https://github.com/zhouning6000/PSA_GAT</span></span>.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102477"},"PeriodicalIF":2.7,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144534819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Veda C. Storey , Jeffrey Parsons , Arturo Castellanos Bueso , Monica Chiarini Tremblay , Roman Lukyanenko , Alfred Castillo , Wolfgang Maaß
{"title":"Domain knowledge in artificial intelligence: Using conceptual modeling to increase machine learning accuracy and explainability","authors":"Veda C. Storey , Jeffrey Parsons , Arturo Castellanos Bueso , Monica Chiarini Tremblay , Roman Lukyanenko , Alfred Castillo , Wolfgang Maaß","doi":"10.1016/j.datak.2025.102482","DOIUrl":"10.1016/j.datak.2025.102482","url":null,"abstract":"<div><div>Machine learning enables the extraction of useful information from large, diverse datasets. However, despite many successful applications, machine learning continues to suffer from performance and transparency issues. These challenges can be partially attributed to the limited use of domain knowledge by machine learning models. This research proposes using the domain knowledge represented in conceptual models to improve the preparation of the data used to train machine learning models. We develop and demonstrate a method, called <em>Conceptual Modeling for Machine Learning (CMML)</em>, which comprises guidelines for data preparation in machine learning and is based on conceptual modeling constructs and principles. To assess CMML, we first applied it to two real-world problems to evaluate its impact on model performance. We then solicited an assessment by data scientists on the applicability of the method. 
These results demonstrate the value of CMML for improving machine learning outcomes.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102482"},"PeriodicalIF":2.7,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144534882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Veda C. Storey , Oscar Pastor , Giancarlo Guizzardi , Stephen W. Liddle , Wolfgang Maaß , Jeffrey Parsons , Jolita Ralyté , Maribel Yasmina Santos
{"title":"Large language models for conceptual modeling: Assessment and application potential","authors":"Veda C. Storey , Oscar Pastor , Giancarlo Guizzardi , Stephen W. Liddle , Wolfgang Maaß , Jeffrey Parsons , Jolita Ralyté , Maribel Yasmina Santos","doi":"10.1016/j.datak.2025.102480","DOIUrl":"10.1016/j.datak.2025.102480","url":null,"abstract":"<div><div>Large Language Models (LLMs) are being rapidly adopted for many activities in organizations, business, and education. Included in their applications are capabilities to generate text, code, and models. This leads to questions about their potential role in the conceptual modeling part of information systems development. This paper reports on a panel presented at the <em>43rd International Conference on Conceptual Modeling</em> where researchers discussed the current and potential role of LLMs in conceptual modeling. The panelists discussed applications and interest levels and expressed both optimism and caution in the adoption of LLMs. Suggested is a need for much continued research by the conceptual modeling community on LLM development and their role in research and teaching.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102480"},"PeriodicalIF":2.7,"publicationDate":"2025-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144517377","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Md. Mehedi Hassan , Anindya Nag , Riya Biswas , Md Shahin Ali , Sadika Zaman , Anupam Kumar Bairagi , Chetna Kaushal
{"title":"Explainable artificial intelligence for natural language processing: A survey","authors":"Md. Mehedi Hassan , Anindya Nag , Riya Biswas , Md Shahin Ali , Sadika Zaman , Anupam Kumar Bairagi , Chetna Kaushal","doi":"10.1016/j.datak.2025.102470","DOIUrl":"10.1016/j.datak.2025.102470","url":null,"abstract":"<div><div>Recently, artificial intelligence has gained a lot of momentum and is predicted to surpass expectations across a range of industries. However, explainability is a major challenge due to sub-symbolic techniques like Deep Neural Networks and Ensembles, which were absent during the boom of AI. The practical application of AI in numerous application areas is greatly undermined by this lack of explainability. In order to counter the lack of perception of AI-based systems, Explainable AI (XAI) aims to increase transparency and human comprehension of black-box AI models. The explainability problem has been approached using a variety of XAI strategies; however, given the complexity of the search space, it may be tricky for ML developers and data scientists to construct XAI applications and choose the optimal XAI algorithms. This paper provides different frameworks, surveys, operations, and explainability methodologies that are currently available for producing reasoning for predictions from Natural Language Processing models in order to aid developers. Additionally, a thorough analysis of current work in explainable NLP and AI is undertaken, providing researchers worldwide with exploration, insight, and idea development opportunities. 
Finally, the authors highlight gaps in the literature and offer ideas for future research in this area.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102470"},"PeriodicalIF":2.7,"publicationDate":"2025-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144297314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unveiling cancellation dynamics: A two-stage model for predictive analytics","authors":"Soumyadeep Kundu , Soumya Roy , Archit Shukla , Arqum Mateen","doi":"10.1016/j.datak.2025.102467","DOIUrl":"10.1016/j.datak.2025.102467","url":null,"abstract":"<div><div>Booking cancellations have an adverse impact on the performance of firms in the hospitality industry. Most of the studies in this domain have considered the question of whether a booking would be cancelled (if). While useful, given the nature of the industry, it would be important to understand the timing of cancellation as well (when). Addressing the inter-temporal nature of the question would help hotels devise appropriate strategies to accommodate this change. In our study, we propose a novel two-stage model, which predicts both the likelihood (if) and the timing (when) of cancellation, using various statistical and machine learning techniques. We find that significant predictors include the average daily rate (which is an indicator of average rental revenue earned for an occupied room per day), month of arrival, day of arrival, and the lead time. Our insights can help hotels design bespoke cancellation policies and offer personalised services and interventions for guests.</div></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"160 ","pages":"Article 102467"},"PeriodicalIF":2.7,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144279321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}