{"title":"A novel technique using graph neural networks and relevance scoring to improve the performance of knowledge graph-based question answering systems","authors":"Sincy V. Thambi, P. C. Reghu Raj","doi":"10.1007/s10844-023-00839-4","DOIUrl":"https://doi.org/10.1007/s10844-023-00839-4","url":null,"abstract":"<p>A Knowledge Graph-based Question Answering (KGQA) system attempts to answer a given natural language question using a knowledge graph (KG) rather than from text data. The current KGQA methods attempt to determine whether there is an explicit relationship between the entities in the question and a well-structured relationship between them in the KG. However, such strategies are difficult to build and train, limiting their consistency and versatility. The use of language models such as BERT has aided in the advancement of natural language question answering. In this paper, we present a novel Graph Neural Network (GNN)-based approach with relevance scoring for improving KGQA. GNNs use the weight of nodes and edges to influence the information propagation while updating the node features in the network. The suggested method comprises subgraph construction, weighting of nodes and edges, and pruning processes to obtain meaningful answers. A BERT-based GNN is used to build subgraph node embeddings. We tested the influence of weighting for both nodes and edges and observed that the system performs better for weighted graphs than unweighted graphs. Additionally, we experimented with several GNN convolutional layers and obtained improved results by combining GENeralised Graph Convolution (GENConv) with node weights for simple questions. 
Extensive testing on benchmark datasets confirmed the effectiveness of the proposed model in comparison to state-of-the-art KGQA systems.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"17 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139551672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sentiment analysis of twitter data to detect and predict political leniency using natural language processing","authors":"","doi":"10.1007/s10844-024-00842-3","DOIUrl":"https://doi.org/10.1007/s10844-024-00842-3","url":null,"abstract":"<p>This paper analyses Twitter data to detect the political lean of a profile by extracting and classifying sentiments expressed through tweets. The work utilizes natural language processing, augmented with sentiment analysis algorithms and machine learning techniques, to classify specific keywords. The proposed methodology initially performs data pre-processing, followed by multi-aspect sentiment analysis for computing the sentiment score of the extracted keywords, for precisely classifying users into various clusters based on similarity score with respect to a sample user in each cluster. The proposed technique also predicts the sentiment of a profile towards unknown keywords and gauges the bias of an unidentified user towards political events or social issues. The proposed technique was tested on a Twitter dataset with 1.72 million tweets taken from over 10,000 profiles and was able to successfully identify the political leniency of the user profiles with a 99% confidence level, and also on a synthetic dataset with 2500 tweets, where the predicted accuracy and F1 score were 0.99 and 0.985 respectively, and 0.97 and 0.975 when neutral users were also considered for classification. 
The paper could also identify the impact of political decisions on various clusters, by analyzing the shift in the number of users belonging to the different clusters.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"28 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139509067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A qualitative analysis of knowledge graphs in recommendation scenarios through semantics-aware autoencoders","authors":"Vito Bellini, Eugenio Di Sciascio, Francesco Maria Donini, Claudio Pomo, Azzurra Ragone, Angelo Schiavone","doi":"10.1007/s10844-023-00830-z","DOIUrl":"https://doi.org/10.1007/s10844-023-00830-z","url":null,"abstract":"<p>Knowledge Graphs (KGs) have already proven their strength as a source of high-quality information for different tasks such as data integration, search, text summarization, and personalization. Another prominent research field that has been benefiting from the adoption of KGs is that of Recommender Systems (RSs). Feeding a RS with data coming from a KG improves recommendation accuracy, diversity, and novelty, and paves the way to the creation of interpretable models that can be used for explanations. This possibility of combining a KG with a RS raises the question whether such an addition can be performed in a plug-and-play fashion – also with respect to the recommendation domain – or whether each combination needs a careful evaluation. To investigate such a question, we consider all possible combinations of <i>(i)</i> three recommendation tasks (books, music, movies); <i>(ii)</i> three recommendation models fed with data from a KG (and in particular, a semantics-aware deep learning model, that we discuss in detail), compared with three baseline models without KG addition; <i>(iii)</i> two main encyclopedic KGs freely available on the Web: DBpedia and Wikidata. Supported by an extensive experimental evaluation, we show the final results in terms of accuracy and diversity of the various combinations, highlighting that the injection of knowledge does not always pay off. 
Moreover, we show how the choice of the KG, and the form of data in it, affect the results, depending on the recommendation domain and the learning model.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"14 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139509261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing the fairness of offensive memes detection models by mitigating unintended political bias","authors":"","doi":"10.1007/s10844-023-00834-9","DOIUrl":"https://doi.org/10.1007/s10844-023-00834-9","url":null,"abstract":"<p>This paper tackles the critical challenge of detecting and mitigating unintended political bias in offensive meme detection. Political memes are a powerful tool that can be used to influence public opinion and disrupt voters’ mindsets. However, current visual-linguistic models for offensive meme detection exhibit unintended bias and struggle to accurately classify non-offensive and offensive memes. This can harm the fairness of the democratic process either by targeting minority groups or promoting harmful political ideologies. With Hindi being the fifth most spoken language globally and having a significant number of native speakers, it is essential to detect and remove Hindi-based offensive memes to foster a fair and equitable democratic process. To address these concerns, we propose three debiasing techniques to mitigate the overrepresentation of majority group perspectives while addressing the suppression of minority opinions in political discourse. To support our approach, we curate a comprehensive dataset called Pol_Off_Meme, designed especially for the Hindi language. Empirical analysis of this dataset demonstrates the efficacy of our proposed debiasing techniques in reducing political bias in internet memes, promoting a fair and equitable democratic environment. Our debiased model, named <i>DRTIM</i><sup>Adv</sup><sub>Att</sub>, exhibited superior performance compared to the CLIP-based baseline model. It achieved a significant improvement of +9.72% in the F1-score while reducing the False Positive Rate Difference (FPRD) by 16% and the False Negative Rate Difference (FNRD) by 14.01%. 
Our efforts strive to cultivate a more informed and inclusive political discourse, ensuring that all opinions, irrespective of their majority or minority status, receive adequate attention and representation.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"20 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139375936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Movie tag prediction: An extreme multi-label multi-modal transformer-based solution with explanation","authors":"Massimo Guarascio, Marco Minici, Francesco Sergio Pisani, Erika De Francesco, Pasquale Lambardi","doi":"10.1007/s10844-023-00836-7","DOIUrl":"https://doi.org/10.1007/s10844-023-00836-7","url":null,"abstract":"<p>Providing rich and accurate metadata for indexing media content is a crucial problem for all the companies offering streaming entertainment services. These metadata are commonly employed to enhance search engine results and feed recommendation algorithms to improve the matching with user interests. However, the problem of labeling multimedia content with informative tags is challenging as the labeling procedure, manually performed by domain experts, is time-consuming and prone to error. Recently, the adoption of AI-based methods has been demonstrated to be an effective approach for automating this complex process. However, developing an effective solution requires coping with different challenging issues, such as data noise and the scarcity of labeled examples during the training phase. In this work, we address these challenges by introducing a Transformer-based framework for multi-modal multi-label classification enriched with model prediction explanation capabilities. These explanations can help the domain expert to understand the system’s predictions. 
Experimentation conducted on two real test cases demonstrates its effectiveness.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"4 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139375935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TSUNAMI - an explainable PPM approach for customer churn prediction in evolving retail data environments","authors":"Vincenzo Pasquadibisceglie, Annalisa Appice, Giuseppe Ieva, Donato Malerba","doi":"10.1007/s10844-023-00838-5","DOIUrl":"https://doi.org/10.1007/s10844-023-00838-5","url":null,"abstract":"<p>Retail companies are greatly interested in performing continuous monitoring of purchase traces of customers, to identify weak customers and take the necessary actions to improve customer satisfaction and ensure their revenues remain unaffected. In this paper, we formulate the customer churn prediction problem as a Predictive Process Monitoring (PPM) problem to be addressed under possible dynamic conditions of evolving retail data environments. To this aim, we propose <span>TSUNAMI</span> as a PPM approach to monitor customer loyalty in the retail sector. It processes online the sale receipt stream produced by customers of a retail company and learns a deep neural model for the early detection of customer purchase traces that are likely to result in future churners. In addition, the proposed approach integrates a mechanism to detect concept drifts in customer purchase traces and adapts the deep neural model to concept drifts. Finally, to make decisions of customer purchase monitoring explainable to potential stakeholders, we analyse Shapley values of decisions, to explain which characteristics of the customer purchase traces are the most relevant for disentangling churners from non-churners and how these characteristics have possibly changed over time. 
Experiments with two benchmark retail data sets explore the effectiveness of the proposed approach.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"37 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139065395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A bayesian-neural-networks framework for scaling posterior distributions over different-curation datasets","authors":"Alfredo Cuzzocrea, Alessandro Baldo, Edoardo Fadda","doi":"10.1007/s10844-023-00837-6","DOIUrl":"https://doi.org/10.1007/s10844-023-00837-6","url":null,"abstract":"<p>In this paper, we propose and experimentally assess <i>an innovative framework for scaling posterior distributions over different-curation datasets, based on Bayesian-Neural-Networks (BNN)</i>. Another innovation of our proposed study consists in enhancing the accuracy of the Bayesian classifier via intelligent sampling algorithms. The proposed methodology is relevant in emerging applicative settings, such as <i>provenance detection and analysis</i> and <i>cybercrime</i>. Our contributions are complemented by a comprehensive experimental evaluation and analysis over both static and dynamic image datasets. Derived results confirm the successful application of our proposed methodology to emerging <i>big data analytics</i> settings.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"44 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2023-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139051948","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tell me what you Like: introducing natural language preference elicitation strategies in a virtual assistant for the movie domain","authors":"Cataldo Musto, Alessandro Francesco Maria Martina, Andrea Iovine, Fedelucio Narducci, Marco de Gemmis, Giovanni Semeraro","doi":"10.1007/s10844-023-00835-8","DOIUrl":"https://doi.org/10.1007/s10844-023-00835-8","url":null,"abstract":"<p>Preference elicitation is a crucial step for every recommendation algorithm. In this paper, we present a strategy that allows users to express their preferences and needs through natural language statements. In particular, our natural language preference elicitation pipeline allows users to express preferences on <i>objective</i> movie features (e.g., actors, directors, etc.) as well as on <i>subjective</i> features that are collected by mining user-written movie reviews. To validate our claims, we carried out a user study in the movie domain (<i>N</i> = 114). The main finding of our experiment is that users tend to express their preferences by using <i>objective</i> features, whose usage largely exceeds that of <i>subjective</i> features, which are harder to express. However, when the users are able to express their preferences also in terms of <i>subjective</i> features, they obtain better recommendations in fewer conversation turns. 
We have also identified the main challenges that arise when users talk to the virtual assistant by using subjective features, and this paves the way for future developments of our methodology.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"75 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138629518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Audio super-resolution via vision transformer","authors":"Simona Nisticò, Luigi Palopoli, Adele Pia Romano","doi":"10.1007/s10844-023-00833-w","DOIUrl":"https://doi.org/10.1007/s10844-023-00833-w","url":null,"abstract":"<p>Audio super-resolution refers to techniques that improve audio signal quality, usually by exploiting bandwidth extension methods, whereby audio enhancement is obtained by expanding the phase and the spectrogram of the input audio traces. These techniques are therefore particularly significant for all those cases where audio traces miss relevant parts of the audible spectrum. In several cases, the given input signal contains the low-band frequencies (the easiest to capture with low-quality recording instruments) whereas the high-band must be generated. In this paper, we illustrate techniques implemented into a system for bandwidth extension that works on musical tracks and generates the high-band frequencies starting from the low-band ones. The system, called <i>ViT Super-resolution</i> (<i>ViT-SR</i>), features an architecture based on a Generative Adversarial Network and Vision Transformer model. In particular, two versions of the architecture, which work on different input frequency ranges, are presented in this paper. Experiments, which are accounted for in the paper, prove the effectiveness of our approach. 
In particular, we demonstrate that it is possible to faithfully reconstruct the high-band signal of an audio file from its low-band spectrum alone, including the harmonics occurring in the audio tracks, which are usually difficult to generate synthetically and contribute significantly to the final perceived sound quality.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"90 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138630225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How can text mining improve the explainability of Food security situations?","authors":"Hugo Deléglise, Agnès Bégué, Roberto Interdonato, Elodie Maître d’Hôtel, Mathieu Roche, Maguelonne Teisseire","doi":"10.1007/s10844-023-00832-x","DOIUrl":"https://doi.org/10.1007/s10844-023-00832-x","url":null,"abstract":"<p>Food Security (FS) is a major concern in West Africa, particularly in Burkina Faso, which has been the epicenter of a humanitarian crisis since the beginning of this century. Early warning systems for FS and famines rely mainly on numerical data for their analyses, whereas textual data, which are more complex to process, are rarely used. However, these data are easy to access and represent a source of relevant information that is complementary to commonly used data sources. This study explores methods for obtaining the explanatory context associated with FS from textual data. Based on a corpus of local newspaper articles, we analyze FS over the last ten years in Burkina Faso. We propose an original and dedicated pipeline that combines different textual analysis approaches to obtain an explanatory model evaluated on real-world and large-scale data. Our analyses show that the approach yields significant results that offer distinct and complementary qualitative information on food security and its spatial and temporal characteristics.</p>","PeriodicalId":56119,"journal":{"name":"Journal of Intelligent Information Systems","volume":"10 1","pages":""},"PeriodicalIF":3.4,"publicationDate":"2023-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138577089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}