Federated Learning of XAI Models in Healthcare: A Case Study on Parkinson’s Disease
Pietro Ducange, Francesco Marcelloni, Alessandro Renda, Fabrizio Ruffini
Cognitive Computation (2024-08-28). DOI: 10.1007/s12559-024-10332-x

Abstract: Artificial intelligence (AI) systems are increasingly used in healthcare applications, although some challenges have not yet been overcome to make them fully trustworthy and compliant with modern regulations and societal needs. First, the sensitive health data essential for training AI systems are typically stored and managed in several separate medical centers and cannot be shared due to privacy constraints, which hinders the use of all available information when learning models. Further, transparency and explainability of such systems are becoming increasingly urgent, especially at a time when “opaque” or “black-box” models are commonly used. Technological and algorithmic solutions to these challenges have recently been investigated: on the one hand, federated learning (FL) has been proposed as a paradigm for collaborative model training among multiple parties without any disclosure of private raw data; on the other hand, research on eXplainable AI (XAI) aims to enhance the explainability of AI systems, either through interpretable-by-design approaches or post-hoc explanation techniques. In this paper, we focus on a healthcare case study, namely predicting the progression of Parkinson’s disease, and assume that raw data originate from different medical centers and that data collection for centralized training is precluded by privacy limitations. We investigate how FL of XAI models can achieve a good level of accuracy and trustworthiness. Cognitive and biologically inspired approaches are adopted in our analysis: FL of an interpretable-by-design fuzzy rule-based system, and FL of a neural network explained using a federated version of the SHAP post-hoc explanation technique. We analyze the accuracy, interpretability, and explainability of the two approaches, also varying the degree of heterogeneity across several data distribution scenarios. Although the neural network is generally more accurate, the results show that the fuzzy rule-based system achieves competitive performance in the federated setting and presents desirable properties in terms of interpretability and transparency.
Enhanced Dynamic Key-Value Memory Networks for Personalized Student Modeling and Learning Ability Classification
Huanhuan Zhang, Lei Wang, Yuxian Qu, Wei Li, Qiaoyong Jiang
Cognitive Computation (2024-08-27). DOI: 10.1007/s12559-024-10341-w

Abstract: Knowledge tracing (KT) is a technique for predicting students’ current skill mastery levels and future academic performance from previous question-answering data. A good KT model more accurately reflects a student’s cognitive processes and provides a more realistic assessment of skill mastery. Currently, most KT models treat all students as a homogeneous group and ignore their personal differences; a few KT models attempt to personalize student modeling from the perspective of learning ability, a typical example being Deep Knowledge Tracing with Dynamic Student Classification (DKT-DSC). However, these models take a relatively coarse-grained approach to modeling students’ learning abilities and cannot accurately capture the nonlinear relationship between students’ learning abilities and the questions they answer. To solve these problems, we propose a novel KT model, Enhanced Dynamic Key-Value Memory Networks for Dynamic Student Classification (EnDKVMN-DSC), designed for personalized student modeling and learning ability classification. First, we propose a novel Enhanced Dynamic Key-Value Memory Network (EnDKVMN) and use it to model each student’s learning ability. Second, students are classified according to their learning abilities using the K-means algorithm. Finally, enriched input features are constructed and passed through Gated Recurrent Unit (GRU) networks to obtain prediction results. Experiments on four real-world datasets show that EnDKVMN-DSC outperforms four state-of-the-art KT models based on DKT or DKVMN in predicting student performance.
{"title":"Image Retrieval Using Multilayer Feature Aggregation Histogram","authors":"Fen Lu, Guang-Hai Liu, Xiao-Zhi Gao","doi":"10.1007/s12559-024-10334-9","DOIUrl":"https://doi.org/10.1007/s12559-024-10334-9","url":null,"abstract":"<p>Aggregating the diverse features into a compact representation is a hot issue in image retrieval. However, aggregating the differential feature of multilayer into a discriminative representation remains challenging. Inspired by the value-guided neural mechanisms, a novel representation method, namely, the <i>multilayer feature aggregation histogram</i> was proposed to image retrieval. It can aggregate multilayer features, such as low-, mid-, and high-layer features, into a discriminative yet compact representation via simulating the neural mechanisms that mediate the ability to make value-guided decisions. The highlights of the proposed method have the following: (1) A <i>detail-attentive map</i> was proposed to represent the aggregation of low- and mid-layer features. It can be well used to evaluate the distinguishable detail feature. (2) A simple yet straightforward aggregation method is proposed to re-evaluate the distinguishable high-layer feature. It can provide aggregated features including detail, object, and semantic by using <i>semantic-attentive map</i>. (3) A novel whitening method, namely <i>difference whitening</i>, is introduced to reduce dimensionality. It did not need to seek a training dataset of semantical similarity and can provide a compact yet discriminative representation. Experiments on the popular benchmark datasets demonstrate the proposed method can obviously increase retrieval performance in terms of mAP metric. The proposed method using 128-dimensionality representation can provide significantly higher mAPs than the DSFH, DWDF, and OSAH methods by 0.083, 0.043, and 0.022 on the Oxford5k dataset and by 0.195, 0.036, and 0.071 on the Paris6k dataset. The difference whitening method can conveniently transfer the deep learning model to a new task. Our method provided competitive performance compared with the existing aggregation methods and can retrieve scene images with similar colors, objects, and semantics.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":"2 1","pages":""},"PeriodicalIF":5.4,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142181640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Barrier Function to Skin Elasticity in Talking Head
Iti Chaturvedi, Vlad Pandelea, Erik Cambria, Roy Welsch, Bithin Datta
Cognitive Computation (2024-08-24). DOI: 10.1007/s12559-024-10344-7

Abstract: In this paper, we target the problem of generating facial expressions from a piece of audio. This is challenging since audio and video have inherent characteristics that are distinct from each other. Some words may have identical lip movements, and speech impediments may prevent lip-reading in some individuals. Previous approaches to generating such a talking head suffered from stiff expressions, because they focused only on lip movements and the facial landmarks did not carry the information flow from the audio. Hence, in this work, we employ spatio-temporal independent component analysis to accurately sync the audio with the corresponding face video. Proper word formation also requires control over the face muscles, which can be captured using a barrier function. We first validated the approach on the diffusion of salt water in coastal areas using a synthetic finite element simulation. Next, we applied it to 3D facial expressions in toddlers, for which training data are difficult to capture. Prior knowledge in the form of rules is specified using fuzzy logic, and multi-objective optimization is used to collectively learn a set of rules. We observed significantly higher F-measure on three real-world problems.
{"title":"Scrutinizing Label: Contrastive Learning on Label Semantics and Enriched Representation for Relation Extraction","authors":"Zhenyu Zhou, Qinghua Zhang, Fan Zhao","doi":"10.1007/s12559-024-10338-5","DOIUrl":"https://doi.org/10.1007/s12559-024-10338-5","url":null,"abstract":"<p>Sentence-level relation extraction is a technique for extracting factual information about relationships between entities from a sentence. However, the customary method overlooks the semantic information conveyed by the label itself, thereby compromising the efficacy of rare types. Furthermore, there is a growing interest in exploring the use of textual information as a crucial resource to enhance RE models for more effectiveness. To address these two issues, CLERE (<i>C</i>ontrastive <i>L</i>earning and <i>E</i>nriched Representation for <i>R</i>elation <i>E</i>xtraction) based on contrastive learning and enriched representation of context is proposed. Firstly, by contrastive learning to incorporate semantic information of labels, CLERE is able to effectively convey and exploit the underlying semantics of various sample categories. Thereby enhancing its semantics understanding and classification capabilities, the issue of misclassification due to data imbalance is alleviated. Secondly, both semantics of context and positional information of tagged entities are enhanced by employing weighted layer pooling on pre-trained language models, which improves the representation of context and entity mentions. Experiments are conducted on three public dataset to authenticate the effectiveness of CLERE. The results demonstrate that the proposed model outperforms existing mainstream baseline methods significantly.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":"50 1","pages":""},"PeriodicalIF":5.4,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142181644","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shift-Reduce Task-Oriented Semantic Parsing with Stack-Transformers","authors":"Daniel Fernández-González","doi":"10.1007/s12559-024-10339-4","DOIUrl":"https://doi.org/10.1007/s12559-024-10339-4","url":null,"abstract":"<p>Intelligent voice assistants, such as Apple Siri and Amazon Alexa, are widely used nowadays. These task-oriented dialogue systems require a semantic parsing module in order to process user utterances and understand the action to be performed. This semantic parsing component was initially implemented by rule-based or statistical slot-filling approaches for processing simple queries; however, the appearance of more complex utterances demanded the application of shift-reduce parsers or sequence-to-sequence models. Although shift-reduce approaches were initially considered the most promising option, the emergence of sequence-to-sequence neural systems has propelled them to the forefront as the highest-performing method for this particular task. In this article, we advance the research on shift-reduce semantic parsing for task-oriented dialogue. We implement novel shift-reduce parsers that rely on Stack-Transformers. This framework allows to adequately model transition systems on the transformer neural architecture, notably boosting shift-reduce parsing performance. Furthermore, our approach goes beyond the conventional top-down algorithm: we incorporate alternative bottom-up and in-order transition systems derived from constituency parsing into the realm of task-oriented parsing. We extensively test our approach on multiple domains from the Facebook TOP benchmark, improving over existing shift-reduce parsers and state-of-the-art sequence-to-sequence models in both high-resource and low-resource settings. We also empirically prove that the in-order algorithm substantially outperforms the commonly used top-down strategy. Through the creation of innovative transition systems and harnessing the capabilities of a robust neural architecture, our study showcases the superiority of shift-reduce parsers over leading sequence-to-sequence methods on the main benchmark.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":"65 1","pages":""},"PeriodicalIF":5.4,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142181645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diagnostic Potential of Eye Movements in Alzheimer’s Disease via a Multiclass Machine Learning Model","authors":"Jiaqi Song, Haodong Huang, Jiarui Liu, Jiani Wu, Yingxi Chen, Lisong Wang, Fuxin Zhong, Xiaoqin Wang, Zihan Lin, Mengyu Yan, Wenbo Zhang, Xintong Liu, Xinyi Tang, Yang Lü, Weihua Yu","doi":"10.1007/s12559-024-10346-5","DOIUrl":"https://doi.org/10.1007/s12559-024-10346-5","url":null,"abstract":"<p>Early diagnosis plays a crucial role in controlling Alzheimer’s disease (AD) progression and delaying cognitive decline. Traditional diagnostic tools present great challenges to clinical practice due to their invasiveness, high cost, and time-consuming administration. This study was designed to construct a non-invasive and cost-effective classification model based on eye movement parameters to distinguish dementia due to AD (ADD), mild cognitive impairment (MCI), and normal cognition. Eye movement data were collected from 258 subjects, comprising 111 patients with ADD, 81 patients with MCI, and 66 individuals with normal cognition. The fixation, smooth pursuit, prosaccade, and anti-saccade tasks were performed. Machine learning methods were used to screen eye movement parameters and build diagnostic models. Pearson’s correlation analysis was used to assess the correlations between the five most important eye movement indicators in the optimal model and neuropsychological scales. The gradient boosting classifier model demonstrated the best classification performance, achieving 68.2% of accuracy and 66.32% of F1-score in multiclass classification of AD. Moreover, the correlation analysis indicated that the eye movement parameters were associated with various cognitive functions, including general cognitive status, attention, visuospatial ability, episodic memory, short-term memory, and language and instrumental activities of daily life. Eye movement parameters in conjunction with machine learning methods achieve satisfactory overall accuracy, making it an effective and less time-consuming method to assist clinical diagnosis of AD.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":"22 1","pages":""},"PeriodicalIF":5.4,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142181643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
From Pixels to Prepositions: Linking Visual Perception with Spatial Prepositions Far and Near
Krishna Raj S R, Srinivasa Chakravarthy V, Anindita Sahoo
Cognitive Computation (2024-08-20). DOI: 10.1007/s12559-024-10329-6

Abstract: Human language is influenced by sensory-motor experiences, which are gathered in a spatiotemporal world and serve as raw material for more abstract concepts. In language, one way to encode spatial relationships is through spatial prepositions. Spatial prepositions that specify the proximity of objects in space, like far and near or their variants, are found in most languages. The ability to determine the proximity of another entity to oneself is a useful evolutionary trait, found in almost all organisms, from taxic behavior in unicellular organisms like bacteria to tropism in the plant kingdom. In humans, vision plays a critical role in spatial localization and navigation. This computational study analyzes the relationship between vision and spatial prepositions using an artificial neural network. A synthetic image dataset was created, with each image featuring a 2D projection of an object placed in 3D space; the objects vary in shape, size, and color. A convolutional neural network is trained to classify the object in each image as far or near based on a set threshold. The study mainly explores two visual scenarios, objects confined to a plane (grounded) and objects not confined to a plane (ungrounded), while also analyzing the influence of camera placement. Classification performance is high in the grounded case, demonstrating that the far/near classification problem is well defined for grounded objects and that depth can be determined from monocular cues alone with high accuracy, provided the camera is at a sufficient height. The difference in the network’s performance between the grounded and ungrounded cases can be explained by the physical properties of the retinal imaging system. Determining the distance of an object from individual images in the dataset is challenging because they lack background cues; still, the network’s performance shows the influence of the spatial constraints placed on the image generation process in determining depth. The results show that monocular cues contribute significantly to depth perception when all objects are confined to a single plane. A set of sensory inputs (images) and a specific task (far/near classification) allowed us to obtain these results; the visual task, along with reaching and motion, may enable humans to carve space into spatial prepositional categories like far and near. The network’s performance, and how it learns to classify between far and near, provides insights into certain visual illusions that involve size constancy.
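A minimal sketch of the kind of convolutional far/near classifier described above follows; the architecture, input size, and two-class output are illustrative assumptions rather than the paper’s exact network, and the input here is random rather than rendered scenes.

```python
# Small CNN for binary far/near classification of rendered object images (a sketch).
import torch
import torch.nn as nn

class FarNearCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),
            nn.Linear(64, 2),               # two classes: far vs. near
        )

    def forward(self, x):                   # x: (batch, 3, 64, 64) rendered scene
        return self.classifier(self.features(x))

model = FarNearCNN()
logits = model(torch.randn(4, 3, 64, 64))
print(logits.shape)                         # torch.Size([4, 2])
```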
{"title":"Cognitive Analysis of Medical Decision-Making: An Extended MULTIMOORA-Based Multigranulation Probabilistic Model with Evidential Reasoning","authors":"Wenhui Bai, Chao Zhang, Yanhui Zhai, Arun Kumar Sangaiah, Baoli Wang, Wentao Li","doi":"10.1007/s12559-024-10340-x","DOIUrl":"https://doi.org/10.1007/s12559-024-10340-x","url":null,"abstract":"<p>Cognitive computation has leveraged the capabilities of computer algorithms, rendering it an exceptionally efficient approach for addressing multi-attribute group decision-making (MAGDM) problems. Due to the stability of MULTIMOORA (Multi-Objective Optimization by Ratio Analysis plus the full MULTIplicative form) and the capability of evidential reasoning (ER) to combine information from multiple sources, the technique of multigranulation probabilistic rough sets (MG PRSs) holds great promise for solving MAGDM problems. Thus, a new and stable method for MAGDM is proposed. Initially, three forms of multigranulation Pythagorean fuzzy probabilistic rough sets (MG PF PRSs) are constructed using MULTIMOORA approaches. Next, the hierarchical clustering method is employed to cluster similar decision information and consolidate the decision-makers’ preferences. Representatives are chosen from each category to simplify information fusion calculations and reduce complexity by reducing the model’s dimensionality. Following that, the rankings obtained from the three methods are fused using ER. Ultimately, the validity of our method is revealed via a case analysis on chickenpox cases from the UCI data set by employing cognitive analysis. The paper outlines a method for MAGDM that provides significant advantages. Specifically, the use of MULTIMOORA improves the stability of decision results, while the incorporation of ER reduces the overall uncertainty of entire decision processes.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":"20 1","pages":""},"PeriodicalIF":5.4,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142223810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}