{"title":"An Enhanced Cross-Attention Based Multimodal Model for Depression Detection","authors":"Yifan Kou, Fangzhen Ge, Debao Chen, Longfeng Shen, Huaiyu Liu","doi":"10.1111/coin.70019","DOIUrl":"https://doi.org/10.1111/coin.70019","url":null,"abstract":"<div>\u0000 \u0000 <p>Depression, a prevalent mental disorder in modern society, significantly impacts people's daily lives. Recently, there have been advancements in developing automated diagnosis models for detecting depression. However, data scarcity, primarily due to privacy concerns, has posed a challenge. Traditional speech features have limitations in representing knowledge for depression diagnosis, and the complexity of deep learning algorithms necessitates substantial data support. Furthermore, existing multimodal methods based on neural networks overlook the heterogeneity gap between different modalities, potentially resulting in redundant information. To address these issues, we propose a multimodal depression detection model based on the Enhanced Cross-Attention (ECA) Mechanism. This model effectively explores text-speech interactions while considering modality heterogeneity. Data scarcity has been mitigated by fine-tuning pre-trained models. Additionally, we design a modal fusion module based on ECA, which emphasizes similarity responses and updates the weight of each modal feature based on the similarity information between modal features. Furthermore, for speech feature extraction, we have reduced the computational complexity of the model by integrating a multi-window self-attention mechanism with the Fourier transform. The proposed model is evaluated on the public dataset, DAIC-WOZ, achieving an accuracy of 80.0% and an average <i>F</i>1 value improvement of 4.3% compared with relevant methods.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"41 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143114732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Heuristic Strategy Assisted Deep Learning Models for Brain Tumor Classification and Abnormality Segmentation","authors":"Veesam Pavan Kumar, Satya Ranjan Pattanaik, V. V. Sunil Kumar","doi":"10.1111/coin.70018","DOIUrl":"https://doi.org/10.1111/coin.70018","url":null,"abstract":"<div>\u0000 \u0000 <p>Brain tumors are prevalent forms of malignant neoplasms that, depending on their type, location, and grade, can significantly reduce life expectancy due to their invasive nature and potential for rapid progression. Accordingly, brain tumors classification is an essential step that allows doctors to perform appropriate treatment. Many studies have been done in the sector of medical image processing by employing computational methods to effectively segment and classify tumors. However, the larger amount of information collected by healthcare images prohibits the manual segmentation process in a reasonable time frame, reducing error measures in healthcare settings. Therefore, automated and efficient techniques for segmentation are crucial. In addition, various visual information, noisy images, occlusion, uneven image textures, confused objects, and other features may impact the process. Therefore, the implementation of deep learning provides remarkable results in medicinal image processing, particularly in the segmentation and classification process. However, conventional deep learning-assisted methods struggle with complex structures and dimensional issues. Thus, this paper develops an effective technique for diagnosing brain tumors. The main aspect of the proposed system is to classify the brain tumor types by segmenting the affected regions of the raw images. This novel approach can be applied for various applications like diagnostic centers, decision-making tools, clinical trials, medical research institutes, disease prognosis, and so on. Initially, the requisite images are collected from standard datasets and further, it is subjected to the segmentation period. In this stage, the Multi-scale and Dilated TransUNet++ (MDTUNet++) model is employed to segment the abnormalities. Further, the segmented images are given into an Adaptive Dilated Dense Residual Attention Network (ADDRAN) to classify the brain tumor types. Here, to optimize the ADDRAN technique's parameters, an Improved Hermit Crab Optimizer (IHCO) is supported, which increases the accuracy rates of the overall network. Finally, the numerical examination is conducted to guarantee the robustness and usefulness of the designed model by contrasting it with other related techniques. For Dataset 1, the accuracy value attains 93.71 for the proposed work compared to 87.86 for CNN, 90.18 for DenseNet, and 89.56 and 90.96 for RAN and DRAN, respectively. Thus, supremacy has been achieved for the recommended system while detecting the brain tumor types.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"41 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143114731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-Time Nail-Biting Detection on a Smartwatch Using Three CNN Models Pipeline","authors":"Abdullah Alesmaeil, Eftal Şehirli","doi":"10.1111/coin.70020","DOIUrl":"https://doi.org/10.1111/coin.70020","url":null,"abstract":"<div>\u0000 \u0000 <p>Nail-biting (NB) or onychophagia is a compulsive disorder that affects millions of people in both children and adults. It has several health complications and negative social effects. Treatments include surgical interventions, pharmacological medications, or additionally, it can be treated using behavioral modification therapies that utilize positive reinforcement and periodical reminders. Although it is the least invasive, such therapies still depend on manual monitoring and tracking which limits their success. In this work, we propose a novel approach for automatic real-time NB detection and alert on a smartwatch that does not require surgical intervention, medications, or manual habit monitoring. It addresses two key challenges: First, NB actions generate subtle motion patterns at the wrist that lead to a high false-positives (FP) rate even when the hand is not on the face. Second, is the challenge to run power-intensive applications on a power-constrained edge device like a smartwatch. To overcome these challenges, our proposed approach implements a pipeline of three convolutional neural networks (CNN) models instead of a single model. The first two models are small and efficient, designed to detect face-touch (FT) actions and hand movement away (MA) from the face. The third model is a larger and deeper CNN model dedicated to classifying hand actions on the face and detecting NB actions. This separation of tasks addresses the key challenges: decreasing FPs by ensuring NB model is activated only when the hand on the face, and optimizing power usage by ensuring the larger NB model runs only for short periods while the efficient FT model runs most of the time. In addition, this separation of tasks gives more freedom to design, configure, and optimize the three models based on each model task. Lastly, for training the main NB model, this work presents further optimizations including developing NB dataset from start through a dedicated data collection application, applying data augmentation, and utilizing several CNN optimization techniques during training. Results show that the model pipeline approach minimizes FPs significantly compared with the single model for NB detection while improving the overall efficiency.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"41 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143114730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Deep Learning Based Dual Watermarking System for Securing Healthcare Data","authors":"Kumari Suniti Singh, Harsh Vikram Singh","doi":"10.1111/coin.70011","DOIUrl":"https://doi.org/10.1111/coin.70011","url":null,"abstract":"<div>\u0000 \u0000 <p>The sharing of patient information on an open network has drawn attention to the healthcare system. Security is the primary issue while sharing documents online. Thus, a dual watermarking technique has been developed to improve the security of shared data. The classical watermarking schemes are resilient to many attacks. Protecting the authenticity and copyrights of medical images is essential to prevent duplication, modification, or unauthorized distribution. This paper proposes a robust, novel dual watermarking system for securing healthcare data. Initially, watermarking is performed based on redundant lifting wavelet transform (LWT) and turbo code decomposition for COVID-19 patient images and patient text data. To achieve a high level of authenticity, watermarks in the form of encoded text data and decomposed watermark images are inserted together, and an inverse LWT is used to generate an initial watermarked image. Improve imperceptibility and robustness by incorporating the cover image into the watermarked image. Cross-guided bilateral filtering (CG_BF) improves cover image quality, while the integrated Walsh–Hadamard transform (IWHT) extracts features. The novel adaptive coati optimization (ACO) technique is used to identify the ideal location for the watermarked image in the cover image. To improve security, the watermarked image is dissected using discrete wavelet transform (DWT) and encrypted with a chaotic extended logistic system. Finally, the encrypted watermarked image is implanted in the desired place using a novel deep-learning model based on the Hybrid Convolutional Cascaded Capsule Network (HC<sup>3</sup>Net). Thus, the secured watermarked image is obtained, and the watermark and text data are extracted using the decryption and inverse DWT procedure. The performance of the proposed method is evaluated using accuracy, peak signal-to-noise ratio (PSNR), NC, and other metrics. The proposed method achieved an accuracy of 99.26%, which is greater than the existing methods.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"41 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143112640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Automated Recommendation System for Crowdsourcing Data Using Improved Heuristic-Aided Residual Long Short-Term Memory","authors":"K. Dhinakaran, R. Nedunchelian","doi":"10.1111/coin.70017","DOIUrl":"https://doi.org/10.1111/coin.70017","url":null,"abstract":"<div>\u0000 \u0000 <p>In recent years, crowdsourcing has developed into a business production paradigm and a distributed problem-solving platform. However, the conventional machine learning models failed to assist both requesters and workers in finding the proper jobs that affect better quality outputs. The traditional large-scale crowdsourcing systems typically involve a lot of microtasks, and it requires more time for a crowdworker to search a work on this platform. Thus, task suggestion methods are more useful. Yet, the traditional approaches do not consider the cold-start issue. To tackle these issues, in this paper, a new recommendation system for crowdsourcing data is implemented utilizing deep learning. Initially, from the standard online sources, the crowdsourced data are accumulated. The novelty of the model is to propose an adaptive residual long short-term memory (ARes-LSTM) that learns the task's latent factor via the task features rather than the task ID. Here, this network's parameters are optimized by the fitness-based drawer algorithm (F-DA) to improve the efficacy rates. Further, the suggested ARes-LSTM is adopted to detect the user's preference score based on the user's historical behaviors. According to the historical behavior records of the users and task features, the ARes-LSTM provides personalized task recommendations and rectifies the issue of cold-start. From the outcomes, the better accuracy rate of the implemented model is 91.42857. Consequently, the accuracy rate of the traditional techniques such as AOA, TSA, BBRO, and DA is attained as 84.07, 85.42, 87.07, and 90.07. Finally, the simulation of the implemented recommendation system is conducted with various traditional techniques with standard efficiency metrics to show the supremacy of the designed recommendation system. Thus, it is proved that the developed recommendation system for the crowdsourcing data model chooses intended tasks based on individual preferences that can help to enlarge the number of chances to engage in crowdsourcing efforts across a broad range of platforms.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"41 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143112656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-Time Single Channel Speech Enhancement Using Triple Attention and Stacked Squeeze-TCN","authors":"Chaitanya Jannu, Manaswini Burra, Sunny Dayal Vanambathina, Veeraswamy Parisae","doi":"10.1111/coin.70016","DOIUrl":"https://doi.org/10.1111/coin.70016","url":null,"abstract":"<div>\u0000 \u0000 <p>Speech enhancement is crucial in many speech processing applications. Recently, researchers have been exploring ways to improve performance by effectively capturing the long-term contextual relationships within speech signals. Using multiple stages of learning, where several deep learning modules are activated one after the other, has been shown to be an effective approach. Recently, the attention mechanism has been explored for improving speech quality, showing significant improvements. The attention modules have been developed to improve CNNs backbone network performance. However, these attention modules often use fully connected (FC) and convolution layers, which increase the model's parameter count and computational requirements. The present study employs multi-stage learning within the framework of speech enhancement. The proposed study uses a multi-stage structure in which a sequence of Squeeze temporal convolutional modules (STCM) with twice dilation rates comes after a Triple attention block (TAB) at each stage. An estimate is generated at each phase and refined in the subsequent phase. To reintroduce the original information, a feature fusion module (FFM) is inserted at the beginning of each following phase. In the proposed model, the intermediate output can go through several phases of step-by-step improvement by continually unfolding STCMs, which eventually leads to the precise estimation of the spectrum. A TAB is crafted to enhance the model performance, allowing it to concurrently concentrate on areas of interest in the channel, spatial, and time-frequency dimensions. To be more specific, the CSA has two parallel regions combining channel with spatial attention, enabling both the channel dimension and the spatial dimension to be captured simultaneously. Next, the signal can be emphasized as a function of time and frequency by aggregating the feature maps along these dimensions. This improves its capability to model the temporal dependencies of speech signals. Using the VCTK and Librispeech datasets, the proposed speech enhancement system is assessed against state-of-the-art deep learning techniques and yielded better results in terms of PESQ, STOI, CSIG, CBAK, and COVL.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"41 1","pages":""},"PeriodicalIF":1.8,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143112659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal Integration of Mel Spectrograms and Text Transcripts for Enhanced Automatic Speech Recognition: Leveraging Extractive Transformer-Based Approaches and Late Fusion Strategies","authors":"Sunakshi Mehra, Virender Ranga, Ritu Agarwal","doi":"10.1111/coin.70012","DOIUrl":"https://doi.org/10.1111/coin.70012","url":null,"abstract":"<div>\u0000 \u0000 <p>This research endeavor aims to advance the field of Automatic Speech Recognition (ASR) by innovatively integrating multimodal data, specifically textual transcripts and Mel Spectrograms (2D images) obtained from raw audio. This study explores the less-explored potential of spectrograms and linguistic information in enhancing spoken word recognition accuracy. To elevate ASR performance, we propose two distinct transformer-based approaches: First, for the audio-centric approach, we leverage RegNet and ConvNeXt architectures, initially trained on a massive dataset of 14 million annotated images from ImageNet, to process Mel Spectrograms as image inputs. Second, we harness the Speech2Text transformer to decouple text transcript acquisition from raw audio. We pre-process Mel Spectrogram images, resizing them to 224 × 224 pixels to create two-dimensional audio representations. ImageNet, RegNet, and ConvNeXt individually categorize these images. The first channel generates the embeddings for visual modalities (RegNet and ConvNeXt) on 2D Mel Spectrograms. Additionally, we employ Sentence-BERT embeddings via Siamese BERT networks to transform Speech2Text transcripts into vectors. These image embeddings, along with Sentence-BERT embeddings from speech transcription, are subsequently fine-tuned within a deep dense model with five layers and batch normalization for spoken word classification. Our experiments focus on the Google Speech Command Dataset (GSCD) version 2, encompassing 35-word categories. To gauge the impact of spectrograms and linguistic features, we conducted an ablation analysis. Our novel late fusion strategy unites word embeddings and image embeddings, resulting in remarkable test accuracy rates of 95.87% for ConvNeXt, 99.95% for RegNet, and 85.93% for text transcripts across the 35-word categories, as processed by the deep dense layered model with Batch Normalization. We obtained a test accuracy of 99.96% for 35-word categories after using the late fusion of ConvNeXt + RegNet + SBERT, demonstrating superior results compared to other state-of-the-art methods.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 6","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142861814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SHREA: A Systematic Hybrid Resampling Ensemble Approach Using One Class Classifier","authors":"Pranita Baro, Malaya Dutta Borah","doi":"10.1111/coin.70004","DOIUrl":"https://doi.org/10.1111/coin.70004","url":null,"abstract":"<div>\u0000 \u0000 <p>Imbalanced classification and data incompleteness are two critical issues in machine learning that, despite significant research, are difficult to solve. This paper presents the Systematic Hybrid Resampling Ensemble Approach that deals with the class imbalance and incompleteness of data at a given dataset and improves classification performance. We use an oscillator-guided Factor Based Multiple Imputation Oversampling technique to balance out the minority and majority data samples, while substituting missing values in the dataset. The improved dataset is an oversampled dataset and it goes through random undersample to create majority and minority class subsets. These subsets are then trained with the classifiers using one of the One Class Classifier-based methods, that is, One Class Support Vector Machine or Local Outlier Factor. Lastly, bootstrap aggregation ensemble setups are done using majority and minority class classifiers and combining them to come up with a score-based prediction. To mimic real-life scenarios where data could be missing, we introduce random missing values on each of these imbalance datasets to create <span></span><math>\u0000 <semantics>\u0000 <mrow>\u0000 <mn>3</mn>\u0000 </mrow>\u0000 <annotation>$$ 3 $$</annotation>\u0000 </semantics></math> new sets from each dataset with different missing values, that is, (10%, 20%, and 30%). The proposed method is experimented with using datasets taken from the KEEL website, and the results are compared against RBG, SBG, SBT, DTE, and EUS. Experimental analysis shows that the proposed approach gives better results revealing the efficiency and significance compared to the existing methods. The proposed method Local Outlier Factor Systematic Hybrid Resampling Ensemble Approach improves by 3.46%, 5.30%, 10.51% and 9.26% in terms of Recall, AUC, f-measure and g-mean and One Class Support Vector Machine Systematic Hybrid Resampling Ensemble Approach by 4.82%, 5.95%, 11.03% and 8.80% respectively.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 6","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142861811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Automated Histopathological Colorectal Cancer Multi-Class Classification System Based on Optimal Image Processing and Prominent Features","authors":"Tasnim Jahan Tonni, Shakil Rana, Kaniz Fatema, Asif Karim, Md. Awlad Hossen Rony, Md. Zahid Hasan, Md. Saddam Hossain Mukta, Sami Azam","doi":"10.1111/coin.70007","DOIUrl":"https://doi.org/10.1111/coin.70007","url":null,"abstract":"<div>\u0000 \u0000 <p>Colorectal cancer (CRC) is characterized by the uncontrollable growth of cancerous cells within the rectal mucosa. In contrast, colon polyps, precancerous growths, can develop into colon cancer, causing symptoms like rectal bleeding, abdominal pain, diarrhea, weight loss, and constipation. It is the leading cause of death worldwide, and this potentially fatal cancer severely afflicts the elderly. Furthermore, early diagnosis is crucial for effective treatment, as it is often more time-consuming and laborious for experts. This study improved the accuracy of CRC multi-class classification compared to previous research utilizing diverse datasets, such as NCT-CRC-HE-100 K (100,000 images) and CRC-VAL-HE-7 K (7,180 images). Initially, we utilized various image processing techniques on the NCT-CRC-HE-100 K dataset to improve image quality and noise-freeness, followed by multiple feature extraction and selection methods to identify prominent features from a large data hub and experimenting with different approaches to select the best classifiers for these critical features. The third ensemble model (XGB-LightGBM-RF) achieved an optimum accuracy of 99.63% with 40 prominent features using univariate feature selection methods. Moreover, the third ensemble model also achieved 99.73% accuracy from the CRC-VAL-HE-7 K dataset. After combining two datasets, the third ensemble model achieved 99.27% accuracy. In addition, we trained and tested our model with two different datasets. We used 80% data from NCT-CRC-HE-100 K and 20% data from CRC-VAL-HE-7 K, respectively, for training and testing purposes, while the third ensemble model obtained 98.43% accuracy in multi-class classification. The results show that this new framework, which was created using the third ensemble model, can help experts figure out what kinds of CRC diseases people are dealing with at the very beginning of an investigation.</p>\u0000 </div>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 6","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142861764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining User Study Data to Judge the Merit of a Model for Supporting User-Specific Explanations of AI Systems","authors":"Owen Chambers, Robin Cohen, Maura R. Grossman, Liam Hebert, Elias Awad","doi":"10.1111/coin.70015","DOIUrl":"https://doi.org/10.1111/coin.70015","url":null,"abstract":"<p>In this paper, we present a model for supporting user-specific explanations of AI systems. We then discuss a user study that was conducted to gauge whether the decisions for adjusting output to users with certain characteristics was confirmed to be of value to participants. We focus on the merit of having explanations attuned to particular psychological profiles of users, and the value of having different options for the level of explanation that is offered (including allowing for no explanation, as one possibility). Following the description of the study, we present an approach for mining data from user participant responses in order to determine whether the model that was developed for varying the output to users was well-founded. While our results in this respect are preliminary, we explain how using varied machine learning methods is of value as a concrete step toward validation of specific approaches for AI explanation. We conclude with a discussion of related work and some ideas for new directions with the research, in the future.</p>","PeriodicalId":55228,"journal":{"name":"Computational Intelligence","volume":"40 6","pages":""},"PeriodicalIF":1.8,"publicationDate":"2024-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/coin.70015","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142861637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}