Yanlin Zhang, Yuchen Shi, Deqing Yang, Xiaodong Gu
{"title":"Exploiting explicit item–item correlations from knowledge graphs for enhanced sequential recommendation","authors":"Yanlin Zhang , Yuchen Shi , Deqing Yang , Xiaodong Gu","doi":"10.1016/j.is.2024.102470","DOIUrl":"10.1016/j.is.2024.102470","url":null,"abstract":"<div><div>In recent years, the research of employing knowledge graphs (KGs) in sequential recommendation (SR) has received a lot of attention, since the side information extracted from KGs, especially the information of the correlations between items, indeed helps the SR models achieve better performance. However, many previous KG-based SR models tend to introduce some noise information when learning item embeddings, or insufficiently fuse item–item correlations into their sequential modeling, thus limiting their performance improvements. In this paper, we propose a <strong>D</strong>istance-<strong>A</strong>ware <strong>K</strong>nowledge-based <strong>S</strong>equential <strong>R</strong>ecommendation model (<strong>DAKSR</strong>), which exploits the explicit item–item correlations from KGs to achieve enhanced SR. Specifically, as one critical component in our DAKSR, the <em>distance score matrix</em> (DSM) is first obtained to indicate the correlations between items, and then leveraged in the following three major modules of DAKSR. First, in the Item-Set Embedding layer (ISE) all item embeddings are learned based on DSM, in which the noise information is eliminated effectively. Meanwhile, the Knowledge-Infused Transformer (KIT) incorporates DSM into its attention mechanism to improve the feature extraction. Furthermore, the Knowledge Contrastive Learning module (KCL) also leverages the item–item correlations presented in DSM to generate two credible sequence views, which are used to refine sample representations through a contrastive learning strategy, and thus improve the model’s robustness. 
Our extensive experiments on three SR benchmarks obviously demonstrate our DAKSR’s superior performance over the state-of-the-art (SOTA) KG-based recommendation models. The implementation of our DAKSR is available at https://github.com/Easonsi/DAKSR for reproducing our experiment results conveniently.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"128 ","pages":"Article 102470"},"PeriodicalIF":3.0,"publicationDate":"2024-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142529775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nelly Barret, Antoine Gauquier, Jia-Jean Law, Ioana Manolescu
{"title":"Finding meaningful paths in heterogeneous graphs with PathWays","authors":"Nelly Barret , Antoine Gauquier , Jia-Jean Law , Ioana Manolescu","doi":"10.1016/j.is.2024.102463","DOIUrl":"10.1016/j.is.2024.102463","url":null,"abstract":"<div><div>Graphs, and notably RDF graphs, are a prominent way of sharing data. As data usage democratizes, users need help figuring out the useful content of a graph dataset. In particular, journalists with whom we collaborate are interested in identifying, in a graph, the <em>connections between entities</em>, e.g., people, organizations, emails, etc. We present a novel method for exploring data graphs through <em>their data paths connecting Named Entities</em> (NEs, in short); each data path leads to a tabular-looking set of results. NEs are extracted from the data through dedicated Information Extraction modules. Our method builds upon the pre-existing ConnectionLens platform and follow-up work in the Abstra project, which builds simple, visual ER-style summaries of semi-structured data. The contribution of the present work, and its novelty, is twofold. First, we propose a novel analysis of entity-to-entity paths contained in datasets of any nature, and propose a new method for ranking paths, leveraging a novel Information Extraction (IE) module we built on top of ChatGPT. Second, we present an efficient approach to enumerate and compute NE paths, based on an algorithm which automatically recommends sub-paths to materialize, and rewrites the path queries using these subpaths. 
Our experiments demonstrate the interest of NE paths and the efficiency of our method for computing and ranking them.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"127 ","pages":"Article 102463"},"PeriodicalIF":3.0,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142420464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Francesco Prinzi, Pietro Barbiero, Claudia Greco, Terry Amorese, Gennaro Cordasco, Pietro Liò, Salvatore Vitabile, Anna Esposito
{"title":"Using AI explainable models and handwriting/drawing tasks for psychological well-being","authors":"Francesco Prinzi , Pietro Barbiero , Claudia Greco , Terry Amorese , Gennaro Cordasco , Pietro Liò , Salvatore Vitabile , Anna Esposito","doi":"10.1016/j.is.2024.102465","DOIUrl":"10.1016/j.is.2024.102465","url":null,"abstract":"<div><div>This study addresses the increasing threat to Psychological Well-Being (PWB) posed by Depression, Anxiety, and Stress conditions. Machine learning methods have shown promising results for several psychological conditions. However, the lack of transparency in existing models impedes practical application. The study aims to develop explainable machine learning models for depression, anxiety and stress prediction, focusing on features extracted from tasks involving handwriting and drawing.</div><div>Two hundred patients completed the Depression, Anxiety, and Stress Scale (DASS-21) and performed seven tasks related to handwriting and drawing. Extracted features, encompassing pressure, stroke pattern, time, space, and pen inclination, were used to train the explainable-by-design Entropy-based Logic Explained Network (e-LEN) model, employing first-order logic rules for explanation. Performance comparison was performed with XGBoost, enhanced by the SHAP explanation method.</div><div>The trained models achieved notable accuracy in predicting depression (0.749 ± 0.089), anxiety (0.721 ± 0.088), and stress (0.761 ± 0.086) through 10-fold cross-validation (repeated 20 times). The e-LEN model’s logic rules facilitated clinical validation, uncovering correlations with existing clinical literature. 
While performance remained consistent for depression and anxiety on an independent test dataset, a slight degradation was observed for stress prediction in the test task.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"127 ","pages":"Article 102465"},"PeriodicalIF":3.0,"publicationDate":"2024-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142420465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elodie Escriva, Tom Lefrere, Manon Martin, Julien Aligon, Alexandre Chanson, Jean-Baptiste Excoffier, Nicolas Labroche, Chantal Soulé-Dupuy, Paul Monsarrat
{"title":"Effective data exploration through clustering of local attributive explanations","authors":"Elodie Escriva , Tom Lefrere , Manon Martin , Julien Aligon , Alexandre Chanson , Jean-Baptiste Excoffier , Nicolas Labroche , Chantal Soulé-Dupuy , Paul Monsarrat","doi":"10.1016/j.is.2024.102464","DOIUrl":"10.1016/j.is.2024.102464","url":null,"abstract":"<div><div>Machine Learning (ML) has become an essential tool for modeling complex phenomena, offering robust predictions and comprehensive data analysis. Nevertheless, the lack of interpretability in these predictions often results in a closed-box effect, which the field of eXplainable Machine Learning (XML) aims to address. Local attributive XML methods, in particular, provide explanations by quantifying the contribution of each attribute to individual predictions, referred to as influences. This type of explanation is the most acute as it focuses on each instance of the dataset and allows the detection of individual differences. Additionally, aggregating local explanations allows for a deeper analysis of the underlying data. In this context, influences can be considered as a new data space to reveal and understand complex data patterns. We hypothesize that these influences, derived from ML explanations, are more informative than the original raw data, especially for identifying homogeneous groups within the data. To identify such groups effectively, we utilize a clustering approach. We compare clusters formed using raw data against those formed using influences computed by various local attributive XML methods. 
Our findings reveal that clusters based on influences consistently outperform those based on raw data, even when using models with low accuracy.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"127 ","pages":"Article 102464"},"PeriodicalIF":3.0,"publicationDate":"2024-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142356795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Lakehouse: A survey and experimental study","authors":"Ahmed A. Harby , Farhana Zulkernine","doi":"10.1016/j.is.2024.102460","DOIUrl":"10.1016/j.is.2024.102460","url":null,"abstract":"<div><div>Efficient big data management is a dire necessity to manage the exponential growth in data generated by digital information systems to produce usable knowledge. Structured databases, data lakes, and warehouses have each provided a solution with varying degrees of success. However, a new and superior solution, the data Lakehouse, has emerged to extract actionable insights from unstructured data ingested from distributed sources. By combining the strengths of data warehouses and data lakes, the data Lakehouse can process and merge data quickly while ingesting and storing high-speed unstructured data with post-storage transformation and analytics capabilities. The Lakehouse architecture offers the necessary features for optimal functionality and has gained significant attention in the big data management research community. In this paper, we compare data lake, warehouse, and lakehouse systems, highlight their strengths and shortcomings, identify the desired features to handle the evolving challenges in big data management and analysis and propose an advanced data Lakehouse architecture. 
We also demonstrate the performance of three state-of-the-art data management systems namely HDFS data lake, Hive data warehouse, and Delta lakehouse in managing data for analytical query responses through an experimental study.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"127 ","pages":"Article 102460"},"PeriodicalIF":3.0,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142356794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Michael Grohs, Peter Pfeiffer, Jana-Rebecca Rehse
{"title":"Proactive conformance checking: An approach for predicting deviations in business processes","authors":"Michael Grohs , Peter Pfeiffer , Jana-Rebecca Rehse","doi":"10.1016/j.is.2024.102461","DOIUrl":"10.1016/j.is.2024.102461","url":null,"abstract":"<div><div>Modern business processes are subject to an increasing number of external and internal regulations. Compliance with these regulations is crucial for the success of organizations. To ensure this compliance, process managers can identify and mitigate deviations between the predefined process behavior and the executed process instances by means of conformance checking techniques. However, these techniques are inherently reactive, meaning that they can only detect deviations after they have occurred. It would be desirable to detect and mitigate deviations before they occur, enabling managers to proactively ensure compliance of running process instances. In this paper, we propose Business Process Deviation Prediction (BPDP), a novel predictive approach that relies on a supervised machine learning model to predict which deviations can be expected in the future of running process instances. BPDP is able to predict individual deviations as well as deviation patterns. Further, it provides the user with a list of potential reasons for predicted deviations. Our evaluation shows that BPDP outperforms existing methods for deviation prediction. 
Following the idea of action-oriented process mining, BPDP thus enables process managers to prevent deviations in early stages of running process instances.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"127 ","pages":"Article 102461"},"PeriodicalIF":3.0,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142420466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexander Baumstark, Muhammad Attahir Jibril, Kai-Uwe Sattler
{"title":"Temporal graph processing in modern memory hierarchies","authors":"Alexander Baumstark, Muhammad Attahir Jibril, Kai-Uwe Sattler","doi":"10.1016/j.is.2024.102462","DOIUrl":"10.1016/j.is.2024.102462","url":null,"abstract":"<div><div>Updates in graph DBMS lead to structural changes in the graph over time with different intermediate states. Capturing these changes and their time is one of the main purposes of temporal DBMS. Most DBMSs built their temporal features based on their non-temporal processing and storage without considering the memory hierarchy of the underlying system. This leads to slower temporal processing and poor storage utilization. In this paper, we propose a storage and processing strategy for (bi-) temporal graphs using temporal materialized views (TMV) while exploiting the memory hierarchy of a modern system. Further, we show a solution to the query containment problem for certain types of temporal graph queries. Finally, we evaluate the overhead and performance of the presented approach. The results show that using TMV reduces the runtime of temporal graph queries while using less memory.</div></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"127 ","pages":"Article 102462"},"PeriodicalIF":3.0,"publicationDate":"2024-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142319396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bridging reading and mapping: The role of reading annotations in facilitating feedback while concept mapping","authors":"Oscar Díaz, Xabier Garmendia","doi":"10.1016/j.is.2024.102458","DOIUrl":"10.1016/j.is.2024.102458","url":null,"abstract":"<div><p>Concept maps are visual tools for organizing knowledge, commonly used in education and design. The process often involves reading and developing conceptual models, where feedback is crucial. Learners (e.g., students, designers) often refer to reading materials, and receive feedback from instructors (e.g., teachers, stakeholders) based on the maps they create. However, annotations made by learners, like highlights, are usually not visible to instructors, limiting tailored feedback. We propose incorporating annotation practices into concept mapping. Learners could highlight text and link these highlights to existing or newly created concepts in their concept map. This way, instructors can access both the concept map and the relevant readings for better feedback. This vision is realized through <em>Concept&Go</em>, a plug-in for the editor <em>CmapCloud</em>. This extension aims at the interplay between mapping, reading, and feedback during concept mapping. The effectiveness of this approach is demonstrated through a focus group (n=5) and a UTAUT evaluation (n=12). 
<em>Concept&Go</em> is publicly available.</p></div>","PeriodicalId":50363,"journal":{"name":"Information Systems","volume":"127 ","pages":"Article 102458"},"PeriodicalIF":3.0,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0306437924001169/pdfft?md5=f1df1b7c90dae26d25484ea7d7b77c25&pid=1-s2.0-S0306437924001169-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142147687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}